Suji2007hub/Credit-Card-Predictor

Credit Card Approval Predictor

A machine learning system that automates credit card approval decisions using historical application data.

The Problem

Credit card approvals typically take days to process. Decisions vary from one approver to the next, and manual review is expensive and error-prone.

The Solution

This system evaluates applicants instantly and consistently while explaining its reasoning. It learns patterns from historical applications to predict default risk, so qualified applicants can be approved immediately.

How It Works

Six-Step Pipeline

  1. Data Collection - Historical credit applications with outcomes
  2. Preprocessing - Clean missing values, standardize features for ML
  3. Feature Engineering - Compute meaningful metrics (debt-to-income ratio)
  4. Model Training - Test Logistic Regression, Random Forest, XGBoost (67/33 split)
  5. Evaluation - Compare accuracy, precision, recall across models
  6. Deployment - Web interface for real-time predictions
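
As an illustration of step 2, the sketch below fills missing values with the median and standardizes a feature to zero mean and unit variance. It uses only the standard library; the function names and the sample credit scores are hypothetical and are not taken from preprocessing.py.

```python
from statistics import mean, median, pstdev


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical credit scores with one missing entry (not from data/credit.csv).
credit_scores = [620, 710, None, 680, 750]
cleaned = impute_median(credit_scores)
scaled = standardize(cleaned)
print(cleaned)
print([round(s, 2) for s in scaled])
```

In a real pipeline the median, mean, and standard deviation would normally be computed on the training split only and then reused on the test split, so that no test information leaks into training.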

Key Metrics

  • Accuracy: 70% on the test set (Logistic Regression)
  • Recall: 100% on the test set (identifies all defaults)
  • Speed: < 100ms per decision
  • Availability: 24/7 uptime

Running the System

1. Install Dependencies

pip install -r requirements.txt

2. Run Full Pipeline

python six_step_algorithm.py

This executes all 6 steps and shows model performance.

3. Start Web Application

streamlit run website.py

4. View Analytics (Optional)

In another terminal:

streamlit run analytics_dashboard.py --server.port 8502

Project Files

| File | Purpose |
|---|---|
| preprocessing.py | Data cleaning, scaling, feature engineering |
| train_model.py | Initial model training script |
| six_step_algorithm.py | Complete workflow (start here) |
| model_comparison.py | Compare 3 algorithms with cross-validation |
| hyperparameter_tuning.py | GridSearch/RandomSearch optimization |
| website.py | Streamlit prediction interface |
| analytics_dashboard.py | Business metrics and compliance dashboard |
| data/credit.csv | Training dataset (30 applications) |

Feature Engineering

The system creates a Debt-to-Income (DTI) Ratio feature:

DTI = Total Debt / Annual Income

This ratio relates what an applicant owes to their ability to repay, which raw debt or income alone cannot show. In the training data the mean DTI is 0.22 and the maximum is 0.70.
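
The formula can be sketched in a few lines of Python. The function name, the dictionary keys, and the two applicants are hypothetical; they were chosen so the results match the mean (0.22) and maximum (0.70) quoted above, and they are not rows from data/credit.csv.

```python
def debt_to_income(total_debt, annual_income):
    """Return total outstanding debt divided by annual income."""
    if annual_income <= 0:
        raise ValueError("annual_income must be positive")
    return total_debt / annual_income


# Hypothetical applicants for illustration only.
applicants = [
    {"annual_income": 50_000, "total_debt": 11_000},
    {"annual_income": 40_000, "total_debt": 28_000},
]

for applicant in applicants:
    applicant["dti"] = debt_to_income(
        applicant["total_debt"], applicant["annual_income"]
    )
    print(applicant)
```

Guarding against a zero or negative income avoids a division error on malformed rows; how the repository handles such rows is not stated.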

Model Performance

Tested three algorithms on the same data:

| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Logistic Regression | 70% | 40% | 100% | 57% |
| Random Forest | 60% | 33% | 100% | 50% |
| XGBoost | 60% | 33% | 100% | 50% |

Best Model: Logistic Regression (highest accuracy, interpretable)
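
To show how the four reported metrics relate to each other, the sketch below computes them from confusion-matrix counts. The counts are an assumption: they are one set that reproduces the reported percentages for a 10-application test set (33% of 30), since the README reports percentages only.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Assumed counts, chosen to be consistent with the reported percentages.
logistic = classification_metrics(tp=2, fp=3, fn=0, tn=5)
forest = classification_metrics(tp=2, fp=4, fn=0, tn=4)

for name, metrics in [("Logistic Regression", logistic), ("Random Forest", forest)]:
    print(name, {k: f"{v:.0%}" for k, v in metrics.items()})
```

Under these assumed counts the test set contains only two positive cases, so one additional error would move recall from 100% to 50%. The comparison between models should be read with that granularity in mind.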

Data

The system uses 30 simulated credit applications with:

  • Age (18-100)
  • Annual Income ($20K-$200K)
  • Credit Score (300-850)
  • Credit Utilization (0-100%)
  • Payment History Score (0-1)
  • Total Outstanding Debt ($2K-$28K)
  • Target: Approval (1) or Rejection (0)

Split: 67% training, 33% testing (stratified by outcome)
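
A stratified split keeps the share of approvals and rejections roughly equal in both parts. The sketch below shows the idea with the standard library only; the function name, the seed, and the 6-to-24 class balance are assumptions, not values taken from the repository.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.33, seed=42):
    """Return (train_idx, test_idx), sampling the test part per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the test part.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 6 approvals (1) and 24 rejections (0).
labels = [1] * 6 + [0] * 24
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
print(sum(labels[i] for i in test_idx), "approvals in the test part")
```

scikit-learn's train_test_split provides the same behavior through its stratify parameter; whether the repository's scripts call it that way is not shown here.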

Compliance & Fairness

  • Fair lending: Model evaluated for disparate impact
  • Explainable: Every decision shows contributing factors
  • Auditable: All predictions logged with input features
  • GDPR ready: No sensitive personal data stored
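
One common way to check for disparate impact is to compare approval rates between groups. The sketch below computes the ratio of the lowest to the highest approval rate and flags it against the "four-fifths" rule of thumb (0.8). The group names, the decisions, and the threshold are assumptions for illustration; the README does not say which test or grouping the project uses.

```python
def approval_rate(decisions):
    """Share of approvals (1) in a list of 0/1 decisions."""
    return sum(decisions) / len(decisions)


def disparate_impact_ratio(decisions_by_group):
    """Return (rates per group, lowest rate divided by highest rate)."""
    rates = {g: approval_rate(d) for g, d in decisions_by_group.items()}
    highest = max(rates.values())
    if highest == 0:
        # Nobody was approved in any group, so there is no disparity to measure.
        return rates, 1.0
    return rates, min(rates.values()) / highest


# Hypothetical model decisions for two hypothetical applicant groups.
decisions_by_group = {
    "group_a": [1, 1, 0, 1, 0],
    "group_b": [1, 0, 0, 0, 1],
}

FOUR_FIFTHS_THRESHOLD = 0.8  # assumed rule-of-thumb threshold

rates, ratio = disparate_impact_ratio(decisions_by_group)
flagged = ratio < FOUR_FIFTHS_THRESHOLD
print(rates, round(ratio, 2), "flagged" if flagged else "ok")
```

With only 30 applications, group-level rates rest on a handful of decisions each, so a ratio like this is a prompt for closer review and not a conclusion.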

Future Improvements

  1. Increase training data (currently 30 samples)
  2. Add more features (employment history, credit age, etc.)
  3. Implement model monitoring for performance degradation
  4. Set up automated retraining pipeline
  5. A/B test against manual approval process
  6. Deploy to AWS/GCP with API endpoints

Questions?

Check the documentation files:

  • six_step_algorithm_guide.md - Detailed technical explanation
  • crisp_dm_methodology.md - ML methodology and best practices
