An end-to-end time-series forecasting pipeline for accurate product demand predictions in retail and supply chain scenarios.
This repository presents a comprehensive AI-based demand forecasting solution designed to predict product demand with high accuracy. It encapsulates an entire time-series forecasting pipeline, from exploratory data analysis (EDA) and feature engineering to model training, evaluation, and deployment. The system is particularly tailored for retail and supply chain contexts, enabling businesses to optimize inventory, streamline operations, and make informed strategic decisions based on precise demand predictions.
The core of the project is a deployed machine learning model that, given relevant input features, can provide future demand estimates, crucial for proactive planning and resource allocation.
- End-to-End Forecasting Pipeline: A complete workflow covering data ingestion, preprocessing, modeling, and deployment.
- Automated Feature Engineering: Utilizes various time-series features (e.g., lagged variables, rolling statistics, calendar features) to enhance model accuracy.
- Advanced Machine Learning Model: Employs a robust XGBoost model for high-performance demand prediction.
- Model Persistence: Pre-trained models, scalers, and label encoders are serialized (`.pkl` files) for efficient loading and deployment.
- API for Predictions: The `app.py` script serves as an API endpoint to receive new data and return demand forecasts.
- Performance Tracking: Includes `model_performance_metrics.csv` to track and evaluate model performance.
- Scalable Data Preprocessing: Leverages `scikit-learn` for data scaling and categorical encoding, ensuring consistency in feature transformation.
- Real-world Data Integration: Designed to work with real-world inventory data, as exemplified by `inventory_data.json`.
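The lagged variables, rolling statistics, and calendar features mentioned above can be sketched with pandas. The column names below are illustrative stand-ins, not the repository's actual schema:

```python
import pandas as pd

# Demo frame standing in for the real inventory data.
df = pd.DataFrame({
    "date": pd.date_range("2026-03-01", periods=6, freq="D"),
    "demand": [100, 120, 110, 130, 125, 140],
})

df["lag_1"] = df["demand"].shift(1)                    # yesterday's demand
df["rolling_mean_3"] = df["demand"].rolling(3).mean()  # 3-day rolling average
df["day_of_week"] = df["date"].dt.dayofweek            # calendar feature

print(df[["lag_1", "rolling_mean_3", "day_of_week"]].tail(1))
```

Features like these give a tree-based model such as XGBoost explicit access to recent history and seasonality it cannot otherwise see from a single row.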
Machine Learning & Data Science Libraries: Flask, pandas, NumPy, scikit-learn, and XGBoost.
Follow these steps to set up and run the AI-based demand forecasting application locally.
- Python 3.8+
- It's recommended to use a virtual environment.
1. Clone the repository

   ```bash
   git clone https://github.com/abz-mhd/ai-based-demand-forcasting.git
   cd ai-based-demand-forcasting
   ```

2. Create and activate a virtual environment

   ```bash
   python -m venv venv
   # On macOS/Linux
   source venv/bin/activate
   # On Windows
   .\venv\Scripts\activate
   ```
3. Install dependencies

   Since no `requirements.txt` is provided, you will need to install the required libraries manually. It is highly recommended to create a `requirements.txt` file first.

   Option A: Create `requirements.txt` (recommended). Create a file named `requirements.txt` in the project root with the following content:

   ```
   Flask
   pandas
   numpy
   scikit-learn
   xgboost
   ```

   Then install:

   ```bash
   pip install -r requirements.txt
   ```

   Option B: Install directly

   ```bash
   pip install Flask pandas numpy scikit-learn xgboost
   ```
The project relies on several data files and pre-trained model artifacts, which are already included in the repository:
- `Dataset/`: Directory for input data.
- `inventory_data.json`: The primary dataset used for forecasting.
- `feature_names.pkl`: Pickled list of feature names used by the model.
- `label_encoders.pkl`: Pickled dictionary of label encoders for categorical features.
- `scaler.pkl`: Pickled data scaler (e.g., `StandardScaler`) for numerical features.
- `xgboost_demand_forecasting_model.pkl`: The serialized XGBoost model.
- `model_performance_metrics.csv`: Stores evaluation metrics of the trained model.
Ensure these files are present in the root directory or their respective locations as per the Project Structure.
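The persistence pattern behind these `.pkl` artifacts is a plain pickle round-trip. Here is a minimal, self-contained sketch; the dictionary and the demo file name are illustrative stand-ins, not the repository's real encoders:

```python
import pickle

# Demo stand-in for the kind of mapping label_encoders.pkl might hold.
label_encoders = {"store": {"StoreA": 0, "StoreB": 1}}

# Serialize to disk the same way the repo ships its .pkl artifacts.
with open("demo_label_encoders.pkl", "wb") as f:
    pickle.dump(label_encoders, f)

# Load it back; the restored object is identical to the original.
with open("demo_label_encoders.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored["store"]["StoreA"])  # -> 0
```

The same `pickle.load` call works for the fitted scaler and the XGBoost model, which is why the API can serve predictions without retraining at startup.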
The `App Run Commend.txt` file suggests running `app.py` directly.

- Run the Flask application:

  ```bash
  python app.py
  ```

The application will typically start on `http://localhost:5000` (or another port if configured within `app.py`).
```
ai-based-demand-forcasting/
├── .idea/                                # IDE-specific configurations (e.g., PyCharm)
├── App Run Commend.txt                   # Instructions to run the application
├── Dataset/                              # Directory intended for datasets
├── app.py                                # Main application file, likely hosting the prediction API
├── feature_names.pkl                     # Pickled list of feature names used by the model
├── inventory_data.json                   # Primary dataset containing inventory information
├── label_encoders.pkl                    # Pickled label encoders for categorical features
├── model_performance_metrics.csv         # Performance metrics of the trained model
├── scaler.pkl                            # Pickled data scaler (e.g., StandardScaler)
├── xgboost_demand_forecasting_model.pkl  # The pre-trained XGBoost demand forecasting model
└── README.md                             # This README file
```
The app.py file likely exposes a prediction endpoint. Below is an inferred structure for how to interact with the deployed model.
Makes a demand prediction based on provided input features.
- URL: `/predict`
- Method: `POST`
- Content-Type: `application/json`
Request Body Example:
```json
{
  "date": "2026-03-01",
  "store": "StoreA",
  "item": "Item1",
  "price": 10.50,
  "promotion": 0,
  "holiday": 0
}
```

Note: The actual required features depend on `feature_names.pkl` and the model's training data; the fields above are illustrative and may not be exhaustive. Ensure all necessary features are provided in the correct format.
Response Body Example (Success):
```json
{
  "predicted_demand": 125.75,
  "unit": "units"
}
```

Response Body Example (Error):

```json
{
  "error": "Missing input feature: 'promotion'",
  "status_code": 400
}
```

`python app.py`: Starts the Flask application and makes the prediction API available.
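For orientation, here is a minimal sketch of what such a `/predict` endpoint could look like in Flask. This is an assumption-based stub, not the actual contents of `app.py`: the real handler would load the pickled model, scaler, and encoders and call `model.predict()` instead of returning a hard-coded value, and the required feature list would come from `feature_names.pkl`.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Illustrative subset; the real app derives this from feature_names.pkl.
REQUIRED_FEATURES = ["date", "price"]

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True) or {}
    # Reject requests that are missing any required feature.
    for feature in REQUIRED_FEATURES:
        if feature not in payload:
            return jsonify({"error": f"Missing input feature: '{feature}'",
                            "status_code": 400}), 400
    # A real handler would encode, scale, and pass the features to the model.
    return jsonify({"predicted_demand": 125.75, "unit": "units"})
```

Returning the HTTP status both in the body and as the response code, as the error example above does, keeps the API easy to consume from clients that only inspect the JSON payload.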
- Ensure all prerequisites are installed and the virtual environment is activated.
- Make changes to `app.py` or any related scripts.
- Restart the application with `python app.py` to apply changes.
- Test the API endpoint using tools like `curl`, Postman, or a custom script.
This project includes a model_performance_metrics.csv file which records the evaluation metrics of the trained model. For real-time testing of the deployed prediction API:
- Ensure the application is running (`python app.py`).
- Send `POST` requests to the `/predict` endpoint with various input data.
- Verify that the API returns predictions in the expected format.
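One way to send such a `POST` request is with Python's standard library. This is a hypothetical smoke-test client: the payload fields are illustrative and must match the features the deployed model actually expects (see `feature_names.pkl`).

```python
import json
from urllib.request import Request, urlopen

# Illustrative payload; adjust fields to the model's real feature set.
payload = {"date": "2026-03-01", "store": "StoreA", "item": "Item1",
           "price": 10.50, "promotion": 0, "holiday": 0}

req = Request("http://localhost:5000/predict",
              data=json.dumps(payload).encode("utf-8"),
              headers={"Content-Type": "application/json"})

# Uncomment once the Flask app is running locally:
# with urlopen(req) as resp:
#     print(json.load(resp))
```

The same request can of course be issued with `curl -X POST -H "Content-Type: application/json" -d '<payload>' http://localhost:5000/predict`.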
Note: There are no explicit unit/integration test files (tests/) detected in the repository structure. It is recommended to add these for robust development.
The app.py script serves as a standalone Flask application that can be deployed as a web service.
- For local development, simply run `python app.py`.
- For production deployment, it is recommended to use a production-ready WSGI server like Gunicorn or uWSGI, typically behind a reverse proxy (e.g., Nginx), or deploy to cloud platforms such as AWS Elastic Beanstalk, Google Cloud Run, or Azure App Service.
- Consider containerization with Docker for consistent environments across development and production. (No Dockerfile detected in current structure).
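As a starting point for the containerization suggested above, a minimal Dockerfile might look like the following. This is hypothetical: no Dockerfile exists in the repo yet, and it assumes the Flask application object in `app.py` is named `app` and that a `requirements.txt` has been created as described in the setup steps.

```dockerfile
# Hypothetical Dockerfile sketch (none exists in the repo yet).
FROM python:3.8-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt gunicorn

# Copy the application code and pickled artifacts.
COPY . .

EXPOSE 5000
# Serve via Gunicorn rather than Flask's development server.
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]
```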
We welcome contributions to enhance this AI-based demand forecasting pipeline! Please consider contributing by:
- Improving the demand forecasting model (e.g., trying new algorithms, hyperparameter tuning).
- Adding more sophisticated feature engineering techniques.
- Enhancing the API with additional endpoints or error handling.
- Adding a proper `requirements.txt` file for easier setup.
- Implementing unit and integration tests.
- Creating a Dockerfile for containerization.
- Improving documentation or providing examples.
This project is open-source. Please see the repository for specific licensing information. If no explicit LICENSE file is present, standard open-source practices apply.
- This project leverages powerful open-source libraries like `pandas`, `NumPy`, `scikit-learn`, `XGBoost`, and `Flask`.
- Inspired by best practices in time-series forecasting and MLOps.
- 🐛 Issues: Feel free to open an issue on the GitHub Issues page for bug reports or feature requests.
⭐ Star this repo if you find it helpful!
Made with ❤️ by abz-mhd