This project aims to predict whether a customer will make a specific transaction using a binary classification model. The work includes data understanding, exploratory analysis, model development, and performance evaluation.
You can find the project summary in 02_Non-Technical-Summary.md
All the product requirements are in 03_PRD.md
- Python 3.11.3
- All dependencies are listed in requirements.txt
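The two requirements above can be checked from inside the environment once it is set up. The sketch below is a minimal illustration, not part of the project: the helper names `python_ok` and `missing_packages` are hypothetical, and the parser only handles plain `name==version` style lines (it skips comments and option lines, and does not understand URL or VCS requirements).

```python
import re
import sys
from importlib import metadata
from pathlib import Path

REQUIRED_PYTHON = (3, 11)  # major/minor taken from "Python 3.11.3" above


def python_ok(version_info=sys.version_info):
    """Return True if the interpreter's major.minor matches REQUIRED_PYTHON."""
    return tuple(version_info[:2]) == REQUIRED_PYTHON


def missing_packages(requirement_lines):
    """Return the names from requirement lines that are not installed."""
    missing = []
    for line in requirement_lines:
        # Blank lines, comments and options such as "-r other.txt" do not
        # start with a package name, so they fail to match and are skipped.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line.strip())
        if not match:
            continue
        name = match.group(0)
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    req_file = Path("requirements.txt")
    if req_file.exists():
        print("Missing packages:", missing_packages(req_file.read_text().splitlines()))
```

Run it from the repository root with the virtual environment activated; an empty "Missing packages" list suggests the installation completed.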
Before running the notebooks, fork the repository and set up a fresh virtual environment.
Use the commands below for your operating system.
If installation errors occur (especially on Apple Silicon), removing strict version pins in requirements.txt may help.
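Removing pins by hand is error-prone when several packages fail, so the step can be scripted. The following is a rough sketch under stated assumptions: `unpin` and `unpin_lines` are hypothetical helpers, the package names and versions in the comments are made-up examples, and only simple specifiers (`==`, `>=`, `~=` and similar) are handled.

```python
import re

# A package name (optionally with extras) followed by a version specifier.
PIN = re.compile(
    r"^(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*(?:\[[^\]]*\])?)"
    r"\s*(?:===|==|~=|!=|<=|>=|<|>)[^;#]*"
)


def unpin(line, failing):
    """Drop the version specifier from `line` if its package is in `failing`."""
    stripped = line.strip()
    match = PIN.match(stripped)
    if not match:
        return line  # comment, blank line, option, or already unpinned
    name = match.group("name")
    base = name.split("[", 1)[0].lower()
    if base not in {pkg.lower() for pkg in failing}:
        return line  # leave pins of packages that installed fine
    rest = stripped[match.end():].strip()  # environment marker or comment
    return f"{name} {rest}" if rest else name


def unpin_lines(lines, failing):
    """Apply `unpin` to every line of a requirements file."""
    return [unpin(line, failing) for line in lines]


# Example with made-up versions:
# unpin_lines(["lightgbm==4.1.0", "numpy==1.26.0"], {"lightgbm"})
# leaves numpy pinned and returns "lightgbm" without a version.
```

Only the failing packages are touched, so the rest of the environment stays reproducible. Write the result to a copy of the requirements file rather than overwriting the original.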
Note:
- If there are errors during environment setup, try removing the version pins for the failing packages in the requirements file.
- In some cases it is necessary to install the graphviz compiler for the transformers library.
- Make sure hdf5 is installed; install it if you haven't done so already.
Check if graphviz is already installed by running:
dot -V
If you haven't installed it yet, follow the instructions below for your operating system.
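The same check can be run from Python, which is convenient inside a notebook or a setup script. This is a small sketch, not project code: `tool_version` is a hypothetical helper that looks the executable up on `PATH` and returns whatever the version flag prints, or `None` if the tool is not found.

```python
import shutil
import subprocess


def tool_version(executable="dot", flag="-V"):
    """Return the version text printed by `executable flag`, or None if absent."""
    path = shutil.which(executable)
    if path is None:
        return None  # not on PATH, so it is probably not installed
    result = subprocess.run([path, flag], capture_output=True, text=True, check=False)
    # Some tools print their version to stderr, so both streams are combined.
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    version = tool_version()
    print(version if version else "graphviz (dot) not found on PATH")
```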
On macOS, update Homebrew and install graphviz and hdf5:
brew update
brew install graphviz
brew install hdf5
Restart your terminal and verify the installation:
dot -V
On Windows, update Chocolatey and install graphviz:
choco upgrade chocolatey
choco install graphviz
Press Y for the standard installation.
For hdf5, visit this website to install hdf5.
Restart your terminal and verify the installation:
dot -V
On Linux (Debian/Ubuntu):
sudo apt-get update
sudo apt-get install -y build-essential cmake libomp-dev graphviz libhdf5-dev
On Linux (Fedora/RHEL):
sudo dnf install -y gcc-c++ cmake libomp graphviz hdf5-devel
For macOS with Intel chips:
pyenv local 3.11.3
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
For macOS with Apple Silicon chips (M1/M2/M3):
pyenv local 3.11.3
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements_silicon.txt
If LightGBM fails to install:
brew install cmake libomp
pip install lightgbm
pip install -r requirements.txt
PowerShell:
pyenv local 3.11.3
python -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install -r requirements.txt
Git Bash:
pyenv local 3.11.3
python -m venv .venv
source .venv/Scripts/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
If LightGBM fails due to CMake or compiler errors:
pip install cmake
pip install lightgbm
For Linux:
pyenv local 3.11.3
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
If LightGBM fails to install:
pip install lightgbm
Models:
- Logistic Regression: baseline linear model.
- LightGBM: gradient-boosted tree model built for fast training and low memory use on large, high-dimensional datasets.
- XGBoost: a second gradient-boosted tree model, used to benchmark performance and check that results hold across model families.
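To make the comparison concrete, the sketch below shows one way the three model families could be instantiated and scored side by side. It is an illustration only and does not reproduce the notebooks: the function names are hypothetical, the data is synthetic, hyperparameters are library defaults, and cross-validated ROC AUC is an assumed metric rather than one stated by the project.

```python
def candidate_models(seed=42):
    """Return the classifiers that can be imported in this environment."""
    models = {}
    try:
        from sklearn.linear_model import LogisticRegression
    except ImportError:
        # The LightGBM and XGBoost classifiers used here follow the
        # scikit-learn API, so nothing is returned without scikit-learn.
        return models
    models["logistic_regression"] = LogisticRegression(max_iter=1000)
    try:
        from lightgbm import LGBMClassifier
        models["lightgbm"] = LGBMClassifier(random_state=seed)
    except (ImportError, OSError):
        pass  # e.g. missing libomp; see the installation notes above
    try:
        from xgboost import XGBClassifier
        models["xgboost"] = XGBClassifier(random_state=seed)
    except (ImportError, OSError):
        pass
    return models


def compare(models, X, y, folds=5):
    """Mean cross-validated ROC AUC per model (metric is an assumption)."""
    from sklearn.model_selection import cross_val_score
    return {
        name: float(cross_val_score(model, X, y, cv=folds, scoring="roc_auc").mean())
        for name, model in models.items()
    }


def demo():
    """Score the available models on a synthetic binary classification task."""
    from sklearn.datasets import make_classification
    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    for name, auc in compare(candidate_models(), X, y).items():
        print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Calling `demo()` in the activated environment also works as a quick smoke test that all three libraries installed correctly.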