Telco Customer Churn Prediction

Project Overview

This project develops a machine learning pipeline that identifies customers at risk of churn for a telecommunications provider. Models are tuned for recall on the churn class rather than overall accuracy: the best model (optimized XGBoost) reaches a churn recall of 0.82, so approximately 82% of actual churners are flagged for proactive retention efforts.

Project Structure

The repository is organized into a modular architecture to ensure scalability and maintainability:

data/: Contains raw and processed datasets.
notebook/: Jupyter notebooks for EDA, experimental modeling, and extracting actionable business insights to reduce customer attrition.
src/: Core logic including data preprocessing, visualization, and model training scripts.
requirements.txt: List of dependencies required to run the environment.

Methodology

1. Exploratory Data Analysis (EDA)

Identified significant class imbalance (approx. 26% churn rate).
Analyzed key correlations between tenure, Contract, InternetService, and churn probability.
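
The two EDA checks above (overall churn rate and churn rate per category) can be sketched with the standard library alone. The rows below are made-up illustrative records, not the project dataset. The column name Contract follows the feature named above, while the Churn column name and its "Yes"/"No" encoding are assumptions.

```python
from collections import defaultdict

# Illustrative rows only (hypothetical values, not taken from the project data).
rows = [
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "No"},
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "One year", "Churn": "No"},
    {"Contract": "One year", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
    {"Contract": "Two year", "Churn": "Yes"},
]


def churn_rate(records):
    """Fraction of records labelled as churned ("Yes" is an assumed encoding)."""
    if not records:
        raise ValueError("no records")
    return sum(r["Churn"] == "Yes" for r in records) / len(records)


def churn_rate_by(records, column):
    """Churn rate for each distinct value of a categorical column."""
    groups = defaultdict(list)
    for r in records:
        groups[r[column]].append(r)
    return {value: churn_rate(group) for value, group in groups.items()}


overall = churn_rate(rows)
by_contract = churn_rate_by(rows, "Contract")
```

On the real data, an overall rate near 0.26 would confirm the imbalance noted above, and the per-contract breakdown is one way to see how churn varies across contract types.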

2. Modeling & Strategy

Baseline: Logistic Regression used to establish a linear performance floor.
Candidate Models: Random Forest and XGBoost selected for their ability to handle non-linear relationships.
Optimization: Tuned hyperparameters with an exhaustive grid search (GridSearchCV).
Imbalance Handling: Used scale_pos_weight (XGBoost) and class_weight (scikit-learn models) to weight the minority churn class more heavily, prioritizing recall over overall accuracy.
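
The README does not state which weight values were used. As a minimal sketch, the two helpers below compute common starting points: the negative-to-positive ratio often used for scale_pos_weight, and the n_samples / (n_classes * class_count) rule that scikit-learn applies for class_weight="balanced". The function names and the label list are hypothetical. The labels only mimic the approximately 26% churn rate reported in the EDA section.

```python
from collections import Counter


def scale_pos_weight(labels):
    """Ratio of negative to positive examples (1 = churn, 0 = no churn)."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive examples")
    return counts[0] / counts[1]


def balanced_class_weights(labels):
    """Per-class weights computed as n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    return {cls: n_samples / (len(counts) * n) for cls, n in counts.items()}


# Hypothetical labels: 26 churners out of 100 customers.
y = [1] * 26 + [0] * 74
spw = scale_pos_weight(y)
weights = balanced_class_weights(y)
```

With these labels, spw is 74 / 26 (about 2.85), and the balanced weights give the churn class the same relative emphasis. Such values would typically be passed to the scale_pos_weight parameter of the XGBoost classifier or the class_weight parameter of the scikit-learn estimators, and could also be included in the tuning grid.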

3. Model Interpretation (SHAP)

Utilized SHapley Additive exPlanations to provide transparency into model decisions. Key churn drivers include:

Contract Type: Month-to-month contracts are the highest risk factor.
Tenure: Lower tenure strongly correlates with increased churn risk.
Service Tier: Fiber Optic users exhibit higher attrition rates.
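
To illustrate what a Shapley attribution means, the sketch below computes exact Shapley values by enumerating feature subsets for a toy risk score. This is not the project's model and does not use the shap library, which estimates such attributions efficiently for real models. The scoring function, its coefficients, the feature names, and the single all-False baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial


def risk_score(features):
    """Hypothetical churn risk score with one interaction term (toy model)."""
    score = 0.10
    if features["month_to_month"]:
        score += 0.30
    if features["short_tenure"]:
        score += 0.20
    if features["fiber_optic"]:
        score += 0.10
    if features["month_to_month"] and features["short_tenure"]:
        score += 0.10  # interaction between contract type and tenure
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values relative to a single baseline input."""
    names = list(instance)
    n = len(names)

    def value(subset):
        # Features in `subset` take the instance's values; the rest stay at baseline.
        mixed = {k: (instance[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


customer = {"month_to_month": True, "short_tenure": True, "fiber_optic": True}
baseline = {"month_to_month": False, "short_tenure": False, "fiber_optic": False}
phi = shapley_values(risk_score, customer, baseline)
```

The attributions sum to the difference between the customer's score and the baseline score, and the interaction term is split evenly between the two features involved. Exhaustive enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one.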

Results

Model	Recall (Churn)	F1-Score	AUC-ROC
Optimized XGBoost	0.82	0.62	0.844
Optimized Random Forest	0.76	0.63	0.844
Logistic Regression	0.55	0.59	0.839
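
The table reports recall and F1 but not precision. Assuming the F1-Score column refers to the churn class, precision can be recovered by solving F1 = 2PR / (P + R) for P. The sketch below does this for the values in the table. Because the inputs are rounded, the results are approximate.

```python
def implied_precision(recall, f1):
    """Solve F1 = 2 * P * R / (P + R) for precision P."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 cannot be 2 * recall or more")
    return f1 * recall / denom


# (recall, F1) pairs copied from the results table.
results = {
    "Optimized XGBoost": (0.82, 0.62),
    "Optimized Random Forest": (0.76, 0.63),
    "Logistic Regression": (0.55, 0.59),
}

for model, (recall, f1) in results.items():
    print(f"{model}: implied precision ~ {implied_precision(recall, f1):.2f}")
```

Under that assumption the implied precision is about 0.50 for XGBoost, 0.54 for Random Forest, and 0.64 for Logistic Regression. This makes the trade-off explicit: the higher-recall models flag more churners but also more customers who would not have churned.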

Installation & Usage

Clone the repository.
Install dependencies: pip install -r requirements.txt.
Run the analysis: Execute the main notebook or training scripts in src/.

