Manisha7530/Data_Collection

📚 Data Collection using Web Scraping

📖 Overview

This project demonstrates data collection through web scraping with Python. Pages from the Books to Scrape practice website are downloaded with Requests and parsed with BeautifulSoup.

The project covers the complete workflow of:

  • Sending HTTP requests
  • Downloading web pages
  • Parsing HTML content
  • Extracting structured information
  • Saving data into CSV format

🎯 Objectives

  • Learn web scraping fundamentals
  • Understand HTML page structure
  • Extract book information automatically
  • Store collected data in CSV format
  • Build a reusable scraping workflow

🛠️ Technologies Used

  • Python
  • Requests
  • BeautifulSoup4
  • Jupyter Notebook
  • CSV

📂 Repository Structure

Data_Collection/
│
├── Web_Scrape/
│   ├── requests.ipynb
│   ├── Beautifulsoup.ipynb
│   ├── page1.html
│   ├── page2.html
│   ├── page3.html
│   ├── page4.html
│   ├── page5.html
│   └── HTML-Books.csv
│
└── README.md

🚀 Workflow

Step 1: Fetch Web Pages

Use the Requests library to download HTML pages.
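As a rough illustration of this step, the sketch below downloads a range of listing pages and saves them as `page<N>.html` files, matching the file names in the repository structure. The URL pattern in `BASE_URL` and the helper names `fetch_page` and `save_pages` are assumptions made for this example; they are not taken from the notebooks.

```python
from pathlib import Path

import requests

# Assumed URL pattern for the Books to Scrape listing pages; check it
# against the live site before relying on it.
BASE_URL = "https://books.toscrape.com/catalogue/page-{}.html"


def fetch_page(url, session=None, timeout=10):
    """Return the raw bytes of `url`, raising on HTTP error status codes."""
    http = session or requests
    response = http.get(url, timeout=timeout)
    response.raise_for_status()
    # Raw bytes avoid guessing the text encoding at download time.
    return response.content


def save_pages(first, last, out_dir=".", session=None):
    """Download pages first..last and write each one as page<N>.html."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for number in range(first, last + 1):
        html = fetch_page(BASE_URL.format(number), session=session)
        path = out / f"page{number}.html"
        path.write_bytes(html)
        saved.append(path)
    return saved
```

Calling `save_pages(1, 5, out_dir="Web_Scrape")` would produce five files named `page1.html` to `page5.html`. Passing a `requests.Session()` as `session` reuses one connection across the downloads.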

Step 2: Parse HTML

Use BeautifulSoup to parse page content and locate required elements.
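A minimal sketch of the parsing step follows. The HTML in `SAMPLE_HTML` is a hypothetical fragment modelled on a book listing page, and the `article.product_pod` selector is an assumption about the site's markup; inspect the saved pages to confirm the real class names.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on one book card from a listing page.
SAMPLE_HTML = """
<html><body>
<article class="product_pod">
  <p class="star-rating Three"></p>
  <h3><a href="#" title="Example Book">Example ...</a></h3>
  <p class="price_color">£10.00</p>
  <p class="instock availability">In stock</p>
</article>
</body></html>
"""


def find_book_cards(html):
    """Parse `html` and return the elements that wrap each book."""
    # "html.parser" is the parser bundled with Python, so no extra install.
    soup = BeautifulSoup(html, "html.parser")
    # select() takes a CSS selector and returns a list of matching tags.
    return soup.select("article.product_pod")


cards = find_book_cards(SAMPLE_HTML)
first_title = cards[0].select_one("h3 a")["title"]
```

To parse a saved file instead of the sample string, read it with `open("page1.html", "rb").read()` and pass the bytes to `find_book_cards`.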

Step 3: Extract Data

Collect information such as:

  • Book Title
  • Price
  • Availability
  • Rating
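The four fields above could be pulled out of each book card roughly as follows. This is a sketch under assumptions: the selectors (`p.price_color`, `p.availability`, `p.star-rating`) and the sample markup are illustrative, and `extract_books` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on a listing page with two books.
SAMPLE_HTML = """
<article class="product_pod">
  <p class="star-rating Three"></p>
  <h3><a href="#" title="Example Book One">Example Book ...</a></h3>
  <p class="price_color">£10.00</p>
  <p class="instock availability">
    In stock
  </p>
</article>
<article class="product_pod">
  <p class="star-rating Five"></p>
  <h3><a href="#" title="Example Book Two">Example Book ...</a></h3>
  <p class="price_color">£25.50</p>
  <p class="instock availability">
    In stock
  </p>
</article>
"""


def text_of(tag):
    """Return stripped text for a tag, or an empty string if it is missing."""
    return tag.get_text(strip=True) if tag is not None else ""


def extract_books(html):
    """Return one dict per book with title, price, availability and rating."""
    soup = BeautifulSoup(html, "html.parser")
    books = []
    for card in soup.select("article.product_pod"):
        link = card.select_one("h3 a")
        rating_tag = card.select_one("p.star-rating")
        # The rating word (One..Five) is stored as a second CSS class.
        classes = rating_tag.get("class", []) if rating_tag else []
        rating_words = [name for name in classes if name != "star-rating"]
        books.append({
            # The visible link text is truncated, so prefer the title attribute.
            "title": link.get("title", text_of(link)) if link else "",
            "price": text_of(card.select_one("p.price_color")),
            "availability": text_of(card.select_one("p.availability")),
            "rating": rating_words[0] if rating_words else "",
        })
    return books


books = extract_books(SAMPLE_HTML)
```

Missing elements produce empty strings rather than exceptions, so one malformed card does not stop the whole run.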

Step 4: Store Data

Save extracted information into CSV format for further analysis.
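One way to write the records is with `csv.DictWriter` from the standard library. The column names in `FIELDNAMES` are assumptions chosen to match the fields listed in Step 3; the actual header of `HTML-Books.csv` may differ.

```python
import csv

# Assumed column order; adjust to match the fields you extract.
FIELDNAMES = ["title", "price", "availability", "rating"]


def write_books_csv(books, path):
    """Write a list of book dicts to `path` and return the row count."""
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(books)
    return len(books)
```

Writing with `encoding="utf-8"` keeps the pound sign in the price column intact when the file is reopened.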


📊 Output

The extracted data is stored in:

HTML-Books.csv

This dataset can be used for:

  • Data Analysis
  • Machine Learning Practice
  • Data Cleaning Exercises
  • Visualization Projects
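As a small example of a data cleaning and analysis exercise, the sketch below reads the CSV back and computes the mean price. It assumes a `price` column holding strings such as `£10.00`; the column name and the helpers `parse_price` and `average_price` are hypothetical.

```python
import csv


def parse_price(text):
    """Convert a price string such as '£10.00' to a float."""
    # Keep only digits and the decimal point, dropping the currency symbol.
    cleaned = "".join(ch for ch in text if ch.isdigit() or ch == ".")
    return float(cleaned)


def average_price(path, column="price"):
    """Return the mean of `column` in the CSV at `path`, or 0.0 if empty."""
    with open(path, newline="", encoding="utf-8") as handle:
        prices = [parse_price(row[column]) for row in csv.DictReader(handle)]
    return sum(prices) / len(prices) if prices else 0.0
```

A call such as `average_price("HTML-Books.csv")` would work only if the file uses that column name, so check the header row first.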

💡 Learning Outcomes

After completing this project, you will understand:

  • HTTP Requests
  • HTML Parsing
  • CSS Selectors
  • Data Extraction
  • CSV Handling
  • Basic Data Collection Pipeline

👩‍💻 Author

Manisha Kumari

Aspiring AI/ML Engineer | Open Source Contributor | Python Enthusiast
