Compare TextBlob, VADER, and Transformer-based models on the same dataset to see how different sentiment analysis methods interpret product reviews over time.
Given a CSV with datetimes, product identifiers, and English-language text, this project runs three independent sentiment analysis methods against every record and produces side-by-side comparisons. It generates per-product summaries, time-series trend lines, word clouds, and an interactive HTML dashboard so you can evaluate which method best fits your data.
TextBlob -- A lexicon-based approach that returns a polarity score from -1.0 (negative) to 1.0 (positive) and a subjectivity score from 0.0 (objective) to 1.0 (subjective). Fast and simple, but it applies the same general-purpose lexicon to every domain.
VADER -- A lexicon and rule-based tool built specifically for short, informal text. It handles punctuation emphasis, capitalization, and degree modifiers (e.g., "very good" vs. "good"). Often a good fit for social media and review data.
Transformers (BERT/RoBERTa) -- Pre-trained deep learning models fine-tuned on sentiment tasks. Slower and more resource-intensive, but they capture context and nuance that lexicon methods miss.
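The three approaches above can be called side by side roughly as follows. This is a minimal sketch, not the project's actual sentiment_analyzer.py: the function names (score_textblob, score_vader, score_transformer, signed_score) are hypothetical, and it assumes the textblob, nltk (with the vader_lexicon data), and transformers packages are installed. The transformer call relies on whatever default model pipeline("sentiment-analysis") selects and assumes it emits POSITIVE/NEGATIVE labels; folding the label and confidence into one signed number is an illustrative choice that puts all three methods on a comparable -1 to 1 scale.

```python
def score_textblob(text: str) -> float:
    """Polarity in [-1.0, 1.0] from TextBlob's default analyzer."""
    from textblob import TextBlob  # third-party; imported lazily
    return TextBlob(text).sentiment.polarity


def score_vader(text: str) -> float:
    """VADER compound score, already normalized to [-1.0, 1.0]."""
    from nltk.sentiment.vader import SentimentIntensityAnalyzer  # needs vader_lexicon
    return SentimentIntensityAnalyzer().polarity_scores(text)["compound"]


def signed_score(prediction: dict) -> float:
    """Fold a {'label': ..., 'score': ...} prediction into one signed value.

    Assumption: the model uses 'POSITIVE' / 'NEGATIVE' label names.
    Anything else is rejected rather than silently misread.
    """
    label = prediction["label"].upper()
    if label == "POSITIVE":
        return prediction["score"]
    if label == "NEGATIVE":
        return -prediction["score"]
    raise ValueError(f"unexpected label: {prediction['label']!r}")


def score_transformer(text: str) -> float:
    """Signed confidence from a Hugging Face sentiment pipeline."""
    from transformers import pipeline  # third-party; downloads a model on first use
    classifier = pipeline("sentiment-analysis")
    return signed_score(classifier(text)[0])
```

In practice the VADER analyzer and the transformer pipeline would be created once and reused across records, since building them for every call is slow.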
All results are written to the results/ directory:
interactive_dashboard_vader.html -- Plotly dashboard with filterable charts
sentiment_distribution_vader.png -- Histogram of sentiment score distribution
sentiment_by_product_vader.png -- Per-product sentiment comparison
sentiment_timeline_vader.png -- Sentiment trends over time
wordcloud_positive.png / wordcloud_negative.png -- Most frequent terms by polarity
sentiment_analysis_report_*.txt -- Full text report
sentiment_analysis_results_*.csv -- Raw scored data
*_summary.json -- Aggregated statistics
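The per-product and timeline outputs are both averages of the scored records, grouped by a different key. The sketch below shows that idea using only the standard library. The records are made-up illustration data, the helper name mean_by is hypothetical, and the project's own aggregation may bucket or weight scores differently.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical scored records: (timestamp, product, sentiment score in [-1, 1]).
records = [
    (datetime(2024, 1, 15, 10, 30), "Widget A", 0.75),
    (datetime(2024, 1, 20, 9, 0), "Widget A", 0.25),
    (datetime(2024, 2, 3, 14, 15), "Widget B", -0.5),
    (datetime(2024, 2, 10, 8, 45), "Widget A", -0.25),
]


def mean_by(rows, key):
    """Average the scores of rows that share the same key(timestamp, product)."""
    groups = defaultdict(list)
    for when, product, score in rows:
        groups[key(when, product)].append(score)
    return {name: mean(scores) for name, scores in sorted(groups.items())}


# One average per product, and one per calendar month for the timeline.
by_product = mean_by(records, lambda when, product: product)
by_month = mean_by(records, lambda when, product: when.strftime("%Y-%m"))

print(by_product)  # {'Widget A': 0.25, 'Widget B': -0.5}
print(by_month)    # {'2024-01': 0.5, '2024-02': -0.375}
```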
git clone https://github.com/stone-ericm/sentiment_analysis.git
cd sentiment_analysis
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
Download NLTK data (once):
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords'); nltk.download('vader_lexicon')"
Run the analysis:
python src/main.py data/sample_data.csv
python src/main.py data/sample_data.csv --methods vader textblob
python src/main.py data/sample_data.csv --methods vader --viz-method vader
Input CSV format:
datetime,product,quote
2024-01-15 10:30:00,Widget A,"This product is amazing and works perfectly!"
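Before running the full analysis, it can help to check that a file matches this layout. The following is a minimal sketch using only the standard library. The read_reviews helper is hypothetical, and the datetime format string is an assumption inferred from the sample row. The project's data_processor.py may accept other formats.

```python
import csv
import io
from datetime import datetime

REQUIRED_COLUMNS = ("datetime", "product", "quote")  # header names from the sample
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the sample row


def read_reviews(handle):
    """Read review rows, failing early on missing columns or bad timestamps."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        # strptime raises ValueError if the timestamp does not match the format.
        row["datetime"] = datetime.strptime(row["datetime"], DATETIME_FORMAT)
        rows.append(row)
    return rows


# The sample from above, held in memory in place of a file on disk.
sample = io.StringIO(
    "datetime,product,quote\n"
    '2024-01-15 10:30:00,Widget A,"This product is amazing and works perfectly!"\n'
)
rows = read_reviews(sample)
```

For a real file, pass an open handle instead, for example open("data/sample_data.csv", newline="", encoding="utf-8").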
To explore interactively instead, start jupyter lab and open notebooks/sentiment_analysis_exploration.ipynb.
sentiment_analysis/
  src/
    main.py                 Entry point and CLI
    data_processor.py       Loading, cleaning, datetime parsing
    sentiment_analyzer.py   TextBlob, VADER, and Transformer wrappers
    visualizer.py           Charts, dashboards, word clouds
  notebooks/
    sentiment_analysis_exploration.ipynb
  data/
    sample_data.csv         Example dataset
  results/                  Generated reports and visualizations
  requirements.txt
  setup.py                  Automated environment setup script