Best Buy Review Scraper & Sentiment Analyzer

This project is a web scraping and data analysis tool designed to extract customer reviews from Best Buy product pages, analyze their sentiment using a hybrid approach, and visualize key customer insights.

Features

Advanced Web Scraping: Uses undetected-chromedriver and Selenium to bypass anti-bot measures and handle dynamic content loading (infinite scrolling/pagination).
Data Extraction: Parses HTML using BeautifulSoup to extract:
- Review Title & Body
- Star Rating (1-5)
- Date of Review
- Reviewer Name & "Verified Buyer" Status
- "Recommendation" Status (Yes/No)
Hybrid Sentiment Analysis: Calculates sentiment scores using a custom weighted algorithm:
- NLTK VADER: Base sentiment scoring.
- TextBlob: Noun phrase extraction for topic modeling.
- Contextual Weighting: Adjusts sentiment scores based on the Star Rating and the user's "Recommended" flag.
Visualizations: Generates insightful charts using Matplotlib and Seaborn:
- Overall Sentiment Distribution (Pie Chart).
- Top Drivers of Sentiment (Bar Chart of key topics).
- Average Rating comparison (Verified vs. Unverified Buyers).

Prerequisites

Python 3.11
Google Chrome Browser (Must be installed on the system for the webdriver to work).

Installation

Clone the repository:

git clone https://github.com/SidoJain/Web-Scraping-Sentiment-Analysis.git

Install the required Python packages (the command below uses uv):
```
uv pip install -r requirements.txt
```
NLTK Data: The script automatically downloads the necessary NLTK lexicon (vader_lexicon) upon first run.

Usage

Open the Jupyter Notebook (main.ipynb).
Locate the main() function in the Driver Code cell.
Update the target_url variable with the link to the Reviews Page of the Best Buy product you wish to analyze.
- Note: Ensure the URL ends with /review or points specifically to the review section.

Set the Chrome version number, replacing {version_num} with the major version of your installed Chrome browser:

driver = uc.Chrome(options=options, version_main={version_num})

Run all cells in the notebook.

def main():
    # Example URL
    target_url = "https://www.bestbuy.ca/en-ca/product/apple-macbook-air-13-6-w-touch-id-2025-midnight-apple-m4-16gb-ram-256gb-ssd-english/19205139/review"
    # ... rest of the code
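
Two small checks can catch common setup mistakes before the scraper starts. The sketch below is not part of the notebook; the helper names is_review_url and chrome_major_version are hypothetical. The first only tests the "/review" suffix mentioned in the note above, and the second assumes the browser reports its version as text such as "Google Chrome 131.0.6778.86" (an illustrative string).

```python
import re
from urllib.parse import urlparse


def is_review_url(url: str) -> bool:
    """Return True if the URL path ends with /review (a trailing slash is allowed)."""
    path = urlparse(url).path.rstrip("/")
    return path.endswith("/review")


def chrome_major_version(version_output: str) -> int:
    """Extract the major version from a four-part version string in the text."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no Chrome version found in {version_output!r}")
    return int(match.group(1))
```

The integer returned by chrome_major_version is the kind of value that version_main expects in the uc.Chrome call shown above.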

Methodology

The Scraper: The script launches a visible (non-headless) Chrome instance, since headless browsers are easier for sites to detect. It:
- Loads the page and removes cookie/privacy banners.
- Applies the "Relevancy" filter.
- Repeatedly clicks the "Load More" button with random time delays to mimic human behavior until all reviews are loaded.
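
The "Load More" loop above can be sketched independently of Selenium. This is a minimal illustration, not the notebook's code: click_until_exhausted, find_button, and the delay defaults are all assumptions. find_button is a caller-supplied function that returns a click callable while the button exists and None once it is gone.

```python
import random
import time
from typing import Callable, Optional


def click_until_exhausted(
    find_button: Callable[[], Optional[Callable[[], None]]],
    min_delay: float = 1.0,
    max_delay: float = 3.0,
    max_clicks: int = 500,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Click "Load More" until it disappears; return the number of clicks."""
    clicks = 0
    while clicks < max_clicks:  # hard cap so a stuck page cannot loop forever
        click = find_button()
        if click is None:  # button gone: all reviews are assumed to be loaded
            break
        click()
        clicks += 1
        # Random pause so the clicks are not evenly spaced
        sleep(random.uniform(min_delay, max_delay))
    return clicks
```

With Selenium, find_button would typically wrap a driver.find_elements lookup and return the first element's click method. The button selector is not given in this README, so it is left to the caller.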
Sentiment Logic: The analyze_sentiment function combines the text score with review metadata instead of relying on VADER alone. It calculates a compound score based on:
- Text Analysis: VADER polarity score.
- Rating Bias: If the rating is >= 4, the score gets a bonus. If <= 2, it gets a penalty.
- Recommendation Bias: If the user clicked "No" on "Would you recommend this?", the score is heavily penalized.
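
The weighting described above can be sketched as follows. The function name adjust_sentiment and the three weight constants are hypothetical; this README does not state the values used in the notebook. The input compound is assumed to be a VADER compound score in the range [-1, 1].

```python
from typing import Optional

# Hypothetical weights; the notebook's actual values may differ.
RATING_BONUS = 0.25
RATING_PENALTY = 0.25
NOT_RECOMMENDED_PENALTY = 0.5


def adjust_sentiment(compound: float, rating: int, recommended: Optional[bool]) -> float:
    """Combine a text sentiment score with the star rating and recommendation flag."""
    score = compound
    if rating >= 4:
        score += RATING_BONUS
    elif rating <= 2:
        score -= RATING_PENALTY
    if recommended is False:  # explicit "No"; None means the flag was missing
        score -= NOT_RECOMMENDED_PENALTY
    # Clamp so the result stays in the same [-1, 1] range as the input
    return max(-1.0, min(1.0, score))
```

Making the "No" penalty larger than the rating adjustment mirrors the "heavily penalized" wording above. A 3-star review with no recommendation flag keeps its text-only score.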
Topic Extraction: TextBlob extracts noun phrases (e.g., "battery life", "screen quality") to identify what each review is about, and the review's sentiment score is assigned to each extracted topic.
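
One way to turn per-review topics into the "Top Drivers of Sentiment" data is to average the score for each phrase. The sketch below assumes the noun phrases have already been extracted (for example with TextBlob's noun_phrases property) and paired with each review's score. The name sentiment_by_topic and the min_mentions cutoff are hypothetical, not taken from the notebook.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def sentiment_by_topic(
    reviews: Iterable[Tuple[List[str], float]],
    min_mentions: int = 2,
) -> Dict[str, float]:
    """Average the review score for each topic mentioned in at least min_mentions reviews."""
    scores = defaultdict(list)
    for phrases, score in reviews:
        # Normalize case and count each topic at most once per review
        for phrase in {p.lower().strip() for p in phrases}:
            scores[phrase].append(score)
    return {
        topic: mean(values)
        for topic, values in scores.items()
        if len(values) >= min_mentions
    }
```

The min_mentions cutoff keeps phrases that appear in a single review from dominating the chart.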

Disclaimer

This tool is for educational and research purposes only. Web scraping may violate the Terms of Service of specific websites. Please respect robots.txt files and scrape responsibly. Do not use this tool to overwhelm servers.
