📊 Customer Churn Analysis

📈 Project Overview

The Customer Churn Analysis project aims to predict whether a customer will leave a telecommunications company (churn) based on various features such as usage patterns, demographics, and service details. By accurately predicting churn, the company can proactively address customer concerns, improve retention strategies, and enhance overall customer satisfaction.

🔍 Features

View Consolidated Dataset:
- Explore the entire dataset with easy-to-understand metrics and visualizations.
Geospatial Insights:
- See where customers are located and how churn patterns look on an interactive map.
New Customer Prediction:
- Enter details about new customers to predict if they might leave, with helpful visual feedback.
Comprehensive Data Processing:
- A pipeline for loading, cleaning, and transforming the data to produce high-quality inputs for modeling.
Model Training & Evaluation:
- Train and evaluate an XGBoost classifier that predicts customer churn.

🛠️ Technologies Used

Programming Languages: Python
Web Framework: Streamlit
Data Processing: Pandas, Joblib
Machine Learning: Scikit-learn, XGBoost
Visualization: Kepler.gl

🚀 Installation

Clone the Repository

git clone https://github.com/yourusername/CustomerChurnModel.git
cd CustomerChurnModel

Install Dependencies

pip install -r requirements.txt

If requirements.txt is not present, install the necessary packages manually:

pip install streamlit pandas numpy scikit-learn xgboost plotly keplergl streamlit-keplergl joblib

Prepare the Data

Ensure that the raw data files are placed in the ./data/raw/ directory as follows:
- services.xlsx
- demographics.xlsx
- location.xlsx
- status.xlsx
Note: Replace the placeholder data with your actual datasets.
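
Before running the processing step, it can help to confirm that all four files are in place. The following is a minimal sketch, not part of the project's scripts: the `missing_raw_files` helper is a hypothetical name, and the default directory simply mirrors the ./data/raw/ path given above.

```python
from pathlib import Path

# File names are taken from the list above.
REQUIRED_FILES = ("services.xlsx", "demographics.xlsx", "location.xlsx", "status.xlsx")


def missing_raw_files(raw_dir="data/raw"):
    """Return the names of required raw files that are not present in raw_dir."""
    raw_path = Path(raw_dir)
    return [name for name in REQUIRED_FILES if not (raw_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files()
    if missing:
        print("Missing raw files:", ", ".join(missing))
    else:
        print("All raw files found.")
```

An empty list means the directory is ready for the processing script.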

Process the Data

Run the data processing script to merge, clean, and save the processed data.
```
python scripts/data_processing.py
```

Train the Model

Execute the model training script to build and save the churn prediction model.
```
python scripts/model_training.py
```

Run the Streamlit Application

Launch the web application to interact with the churn prediction system.
```
streamlit run app/streamlit_app.py
```
The app will be accessible at http://localhost:8501.

💡 Usage

1. Predicting New Customer Churn

Go to the New Customer Prediction section.
Input relevant customer details such as tenure, monthly charges, services subscribed, and demographics.
Click on Predict Churn to receive a probability score and risk assessment.
Visual indicators and key risk factors will help interpret the prediction.
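
To illustrate how a probability score could be turned into a risk assessment, here is a minimal sketch. The `risk_level` function and its cut-offs (0.4 and 0.7) are assumptions made for illustration; the app's actual thresholds are not documented here.

```python
def risk_level(probability, medium_at=0.4, high_at=0.7):
    """Map a churn probability in [0, 1] to a coarse risk label.

    The cut-offs are illustrative assumptions, not values from this project.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= high_at:
        return "High"
    if probability >= medium_at:
        return "Medium"
    return "Low"
```

In practice the probability would typically come from the trained classifier's `predict_proba` output for the positive (churn) class.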

2. Geospatial Insights

Access the Geospatial Insights section to visualize customer locations and churn patterns on an interactive map.
Understand regional trends and identify hotspots of customer churn.
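
Outside the map, one way to look for churn hotspots is to rank cities by churn rate. This is a minimal sketch that assumes `Churn Value` is coded as 0/1, which this README does not confirm. The `churn_rate_by_city` helper, the `min_customers` cut-off, and the toy rows are all illustrative.

```python
import pandas as pd


def churn_rate_by_city(df, min_customers=2):
    """Return per-city customer counts and churn rates, highest churn first."""
    summary = (
        df.groupby("City")["Churn Value"]
        .agg(customers="count", churn_rate="mean")
        .reset_index()
    )
    # Skip cities with too few customers for the rate to be meaningful.
    summary = summary[summary["customers"] >= min_customers]
    return summary.sort_values("churn_rate", ascending=False).reset_index(drop=True)


# Toy rows for illustration only.
customers = pd.DataFrame(
    {
        "City": ["Austin", "Austin", "Austin", "Dallas", "Dallas", "Waco"],
        "Churn Value": [1, 1, 0, 0, 0, 1],
    }
)
hotspots = churn_rate_by_city(customers)
print(hotspots)
```

The minimum-count filter keeps a single churned customer in a small city from showing up as a 100% hotspot.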

3. Viewing the Dataset

Navigate to the View Dataset section.
Explore key metrics like total customers, average tenure, and monthly charges.
Utilize the tabs to delve into churn analysis, demographic insights, or view the raw data.
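
The key metrics above can be computed directly from the merged dataset. The sketch below uses the column names from the Data Description section; the `key_metrics` helper and the sample rows are hypothetical, and the app may compute these figures differently.

```python
import pandas as pd


def key_metrics(df):
    """Compute headline metrics for a customer DataFrame."""
    return {
        "total_customers": int(df["Customer ID"].nunique()),
        "avg_tenure_months": float(df["Tenure in Months"].mean()),
        "avg_monthly_charge": float(df["Monthly Charge"].mean()),
    }


# Sample rows for illustration only.
sample = pd.DataFrame(
    {
        "Customer ID": ["A1", "B2", "C3"],
        "Tenure in Months": [2, 10, 24],
        "Monthly Charge": [50.0, 70.0, 90.0],
    }
)
metrics = key_metrics(sample)
print(metrics)
```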

🗃️ Data Description

Raw Datasets

services.xlsx
- Columns: Customer ID, Tenure in Months, Phone Service, Internet Service, Streaming, Monthly Charge, Total Charges
demographics.xlsx
- Columns: Customer ID, Age, Gender
location.xlsx
- Columns: Customer ID, City, State, Zip Code, Latitude, Longitude
status.xlsx
- Columns: Customer ID, Churn Value, Churn Category, Churn Reason

Processed Data

The raw datasets are merged on Customer ID to form a consolidated dataset.
Non-essential columns are dropped, and data types are appropriately set.
Missing values are handled, and features are scaled for modeling.
The final processed data is saved as merged.parquet in the ./data/processed/ directory.
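
The merge step described above can be sketched with pandas as follows. This is an illustration under stated assumptions rather than the contents of scripts/data_processing.py: the inner join, the median fill for missing values, and the `merge_on_customer_id` helper are all choices made for the example.

```python
from functools import reduce

import pandas as pd


def merge_on_customer_id(frames):
    """Inner-join a list of DataFrames on the shared "Customer ID" column."""
    return reduce(
        lambda left, right: left.merge(right, on="Customer ID", how="inner"),
        frames,
    )


# Toy stand-ins for the four raw files; real data would come from pd.read_excel.
services = pd.DataFrame({"Customer ID": ["A1", "B2"], "Monthly Charge": [70.5, None]})
demographics = pd.DataFrame({"Customer ID": ["A1", "B2"], "Age": [34, 51]})
location = pd.DataFrame({"Customer ID": ["A1", "B2"], "City": ["Austin", "Dallas"]})
status = pd.DataFrame({"Customer ID": ["A1", "B2"], "Churn Value": [0, 1]})

merged = merge_on_customer_id([services, demographics, location, status])

# One simple way to handle a missing numeric value: fill with the column median.
merged["Monthly Charge"] = merged["Monthly Charge"].fillna(
    merged["Monthly Charge"].median()
)

# Saving needs a parquet engine such as pyarrow, so it is left commented out here.
# merged.to_parquet("data/processed/merged.parquet", index=False)
```

An inner join drops any customer missing from one of the files; a left join from the services table would keep them at the cost of more missing values.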

🤖 Model Training

Algorithm

XGBoost Classifier: A gradient-boosted decision tree classifier, chosen for its strong performance on tabular data.

Training Pipeline

Load Data:
- We start by loading the cleaned data from a file called merged.parquet.
Prepare Data:
- Convert categories (like gender or service type) into numbers so the model can understand them.
- Scale numerical values (like charges) to ensure they are on a similar range.
Split Data:
- Divide the data into two parts: one for training the model and one for testing how well it works.
Tune Model Settings:
- Tune hyperparameters (such as the maximum tree depth) to find the version of the model that predicts churn most accurately.
Evaluate Model:
- Check how well the model performs using various metrics (like accuracy) to see if it’s making good predictions.
Save Model:
- Save the best version of the model and its settings so we can use it later without retraining.
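
The steps above can be sketched with scikit-learn building blocks. This is a hedged illustration, not the project's training script: the function names, the 75/25 split, 3-fold cross-validation, and accuracy scoring are assumptions. The classifier is passed in as a parameter, so any scikit-learn-compatible estimator (including `XGBClassifier`) can be used.

```python
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_search(classifier, categorical, numeric, param_grid, cv=3):
    """Wrap encoding, scaling, and a classifier in a cross-validated grid search."""
    preprocess = ColumnTransformer(
        [
            ("categories", OneHotEncoder(handle_unknown="ignore"), categorical),
            ("numbers", StandardScaler(), numeric),
        ]
    )
    pipeline = Pipeline([("preprocess", preprocess), ("model", classifier)])
    return GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")


def train_and_evaluate(df, target, classifier, categorical, numeric, param_grid):
    """Split the data, tune on the training part, and score on the held-out part."""
    X = df[categorical + numeric]
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=42
    )
    search = build_search(classifier, categorical, numeric, param_grid)
    search.fit(X_train, y_train)
    return search.best_estimator_, search.score(X_test, y_test)


# Hypothetical usage with XGBoost (column choices and output path are assumptions):
# import joblib
# import pandas as pd
# from xgboost import XGBClassifier
# model, accuracy = train_and_evaluate(
#     pd.read_parquet("data/processed/merged.parquet"),
#     target="Churn Value",
#     classifier=XGBClassifier(),
#     categorical=["Gender", "Internet Service"],
#     numeric=["Tenure in Months", "Monthly Charge"],
#     param_grid={"model__max_depth": [3, 5, 7]},
# )
# joblib.dump(model, "model/churn_model.joblib")
```

Keeping the encoder and scaler inside the pipeline means they are fitted only on the training folds, and the saved model can preprocess new customer inputs on its own.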

Training Script

Located at scripts/model_training.py
Execute using:
```
python scripts/model_training.py
```

🖥️ Application

Streamlit Web App

File: app/streamlit_app.py
Launch Command:
```
streamlit run app/streamlit_app.py
```

Features:

Dataset:
- Displays key metrics and interactive visualizations.
- Tabs for churn analysis, demographics, and raw data exploration.
Geospatial Insights:
- Interactive map showcasing customer locations and churn density.
New Customer Prediction:
- Input form for new customer details.
- Predicts churn probability with visual indicators and risk factors.
