This project demonstrates a full machine learning workflow for sentiment analysis on Amazon product reviews, including:
- Cloud-based training on AWS SageMaker with HuggingFace Transformers
- IAM and resource management for real-world deployment
- Model artifact management and local inference (batch and single predictions)
- Visualization of sentiment predictions
sentiment-sagemaker/
├── input/ # Input dataset (CSV)
├── sagemaker_trained/ # Trained model artifact from SageMaker
├── src/ # Source scripts
├── Scripts/ # Utility scripts
├── local_inference.py # Script for extracting and running local predictions
├── batch_prediction.py # Batch inference & visualization script
├── .gitignore
└── README.md
- Clone the repo:
    git clone https://github.com/<your-username>/Sentiment_analysis_2025.git
    cd Sentiment_analysis_2025
- Install requirements (use virtualenv!):
    pip install -r requirements.txt
  (Add your requirements.txt using pip freeze > requirements.txt if needed)
- Run local inference:
    python local_inference.py
  - Extracts the SageMaker model and predicts sample reviews.
- Batch prediction & visualization:
    python batch_prediction.py
  - Processes the whole dataset and shows result charts.
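
The local steps above (extract the SageMaker artifact, predict, summarize, chart) could be sketched roughly as follows. This is an illustration and not the contents of local_inference.py or batch_prediction.py. The archive name `model.tar.gz`, the `model/` output folder, the chart file name, and all function names are assumptions. It also assumes the archive contains both the model and tokenizer files in a layout that the Transformers `pipeline` can load.

```python
import tarfile
from collections import Counter
from pathlib import Path


def extract_model(archive: Path, target_dir: Path) -> Path:
    """Unpack a model archive into target_dir and return that directory.

    Only use this on archives you trust, such as your own training output.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target_dir)
    return target_dir


def predict_labels(model_dir: Path, texts: list[str]) -> list[str]:
    """Return one predicted label per input text."""
    # Imported lazily so the other helpers work without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "text-classification", model=str(model_dir), tokenizer=str(model_dir)
    )
    return [result["label"] for result in classifier(texts, truncation=True)]


def label_counts(labels: list[str]) -> dict[str, int]:
    """Count how often each predicted label occurs."""
    return dict(Counter(labels))


def plot_counts(counts: dict[str, int], out_path: Path) -> None:
    """Save a bar chart of label counts to out_path."""
    import matplotlib

    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(list(counts.keys()), list(counts.values()))
    ax.set_xlabel("Predicted sentiment")
    ax.set_ylabel("Number of reviews")
    fig.savefig(out_path)
    plt.close(fig)


if __name__ == "__main__":
    # Hypothetical paths: adjust to wherever your artifact actually lives.
    model_dir = extract_model(Path("sagemaker_trained/model.tar.gz"), Path("model"))
    reviews = ["Great product, works as described.", "Broke after two days."]
    labels = predict_labels(model_dir, reviews)
    plot_counts(label_counts(labels), Path("sentiment_counts.png"))
```

The label names that come back (for example `LABEL_0` or `POSITIVE`) depend on how the model was configured during training, so the chart's categories will vary from model to model.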
- Real-world AWS resource, quota, and IAM troubleshooting
- How to train and deploy Transformer models at scale
- Automating ML pipelines for reproducibility
- Visualizing and interpreting results
- Python 3.11+
- HuggingFace Transformers & PyTorch
- Pandas, Matplotlib
- AWS SageMaker (training), S3 (storage)
- IAM roles and permissions
- Souritra Banerjee
For questions, open an issue or contact me on LinkedIn.





