
What is Web Scraping? It is a powerful way to extract data

Web scraping is a highly effective method of extracting data from websites, and it is especially useful for collecting data for data science projects. With Python and a few lines of code, we can do it ourselves.

What is Web Scraping?

What are the popular tools and libraries for Web Scraping?

BeautifulSoup

BeautifulSoup is a parsing library that lets us extract data from HTML and XML documents. It automatically detects encodings and gracefully handles HTML documents, even ones that contain special characters.
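As a minimal sketch of that behaviour, the snippet below hands BeautifulSoup raw bytes instead of a decoded string and lets it work out the encoding. It assumes the `beautifulsoup4` package is installed; the HTML fragment and the variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, deliberately encoded as ISO-8859-1 rather than UTF-8.
raw_html = (
    '<html><head><meta charset="iso-8859-1"><title>Menu</title></head>'
    '<body><h1>Café</h1><p class="item">Crème brûlée</p></body></html>'
).encode("iso-8859-1")

# "html.parser" is Python's built-in parser, so no extra parser package is needed.
soup = BeautifulSoup(raw_html, "html.parser")

# Navigate the parsed tree and pull out the text of two elements.
heading = soup.h1.get_text()
item = soup.find("p", class_="item").get_text()

print(heading)
print(item)
print(soup.original_encoding)  # the encoding BeautifulSoup settled on
```

Because the input is bytes, the accented characters only come out correctly if the encoding was detected, which is the point of the example.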

Scrapy

Scrapy is a Python framework for large-scale web scraping. It gives you all the tools to extract data from websites, process it as you want, and store it in your preferred format.

Selenium

Selenium is a tool for automating browsers. It is primarily used for testing, but because it drives a real browser it is also useful for scraping pages that render their content with JavaScript.

Steps involved in web scraping:

1. Crawl 

The first step is to reach the target website by making an HTTP request and to download the response.
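A rough sketch of this step using only Python's standard library is shown here. The `build_request` and `download` helpers, the user-agent string, and the URL are assumptions made for illustration; many projects use the third-party `requests` package for the same job.

```python
import urllib.request

# Hypothetical values: replace with a site you are allowed to scrape.
TARGET_URL = "https://example.com/"
USER_AGENT = "my-scraper/0.1 (learning project)"


def build_request(url: str) -> urllib.request.Request:
    """Prepare a GET request that identifies the scraper to the server."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def download(url: str, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        # urlopen raises HTTPError for 4xx/5xx answers, so reaching this
        # line means the server returned a usable response.
        return response.read()
```

Calling `download(TARGET_URL)` returns the page as bytes, ready to be handed to a parser.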

2. Parse and Transform

Once you get a successful response, pass the downloaded data to an HTML parser such as BeautifulSoup and extract the data you need.
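The sketch below shows one way the parse and transform step might look. It assumes `beautifulsoup4` is installed; the markup, the `extract_posts` helper, and the field names are invented for the example. Parsing finds the elements, and transforming cleans them into records: whitespace is stripped, relative links become absolute, and view counts become integers.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical downloaded page; in practice this is the response body from step 1.
BASE_URL = "https://example.com/blog/"
html = """
<ul id="posts">
  <li><a href="/blog/intro">  Intro to scraping </a><span class="views">1,204</span></li>
  <li><a href="tools.html">Tools overview</a><span class="views">87</span></li>
</ul>
"""


def extract_posts(markup: str, base_url: str) -> list[dict]:
    """Parse the markup and turn each list item into a clean record."""
    soup = BeautifulSoup(markup, "html.parser")
    records = []
    for item in soup.find(id="posts").find_all("li"):
        link = item.find("a")
        views = item.find("span", class_="views")
        records.append(
            {
                "title": link.get_text(strip=True),
                "url": urljoin(base_url, link["href"]),
                "views": int(views.get_text().replace(",", "")),
            }
        )
    return records


posts = extract_posts(html, BASE_URL)
print(posts)
```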

3. Store the Data

The final step is to store the data you have extracted. You can save it to a JSON or CSV file, or load it into a database.
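For the file-based options, a minimal sketch with the standard `json` and `csv` modules could look like this. The records, the helper names, and the output file names are placeholders standing in for whatever the parsing step produced.

```python
import csv
import json
from pathlib import Path

# Hypothetical records, standing in for the output of the parsing step.
records = [
    {"title": "Intro to scraping", "url": "https://example.com/blog/intro"},
    {"title": "Tools overview", "url": "https://example.com/blog/tools.html"},
]


def save_json(rows: list[dict], path: Path) -> None:
    """Write all records to a single JSON file."""
    path.write_text(json.dumps(rows, indent=2, ensure_ascii=False), encoding="utf-8")


def save_csv(rows: list[dict], path: Path) -> None:
    """Write records to CSV, using the keys of the first record as the header."""
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


save_json(records, Path("posts.json"))
save_csv(records, Path("posts.csv"))
```

JSON keeps nested structures and data types, while CSV is flat but opens directly in spreadsheet tools, so the choice depends on how the data will be used next.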
