Credit-Risk-Assessment

The objective of this study is to develop a Probability of Default (PD) model for retail loan applicants based on historical credit data. The model aims to estimate the likelihood that a borrower will default on a loan, thereby supporting credit risk assessment and decision-making processes.

The analysis is based on a dataset provided by Lending Club, which contains detailed information on consumer loans issued between 2007 and 2015. The dataset comprises more than 460,000 loan observations and includes borrower characteristics, loan attributes, financial indicators, and loan performance outcomes.

Target Variable Definition

The target variable is derived from the loan status information. A binary default indicator is constructed as follows:

Bad loans (default = 0):
- Loans with status Charged Off
- Loans with status Default
- Loans with payment delays of 31–120 days

Good loans (default = 1):
- All remaining loan statuses

This formulation turns the problem into a binary classification task: the goal is to predict the probability that a loan belongs to the bad (default) class, which is coded as 0 under the definition above.
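The mapping from loan status to the binary indicator can be sketched as below. This is a minimal illustration, not the notebook code: the function name `encode_loan_status` and the exact status strings (in particular `"Late (31-120 days)"`) are assumptions and may differ from the labels in the raw data.

```python
# Statuses treated as bad loans. The exact strings are assumed for
# illustration and should be checked against the raw loan status column.
BAD_STATUSES = {"Charged Off", "Default", "Late (31-120 days)"}


def encode_loan_status(status: str) -> int:
    """Return 0 for a bad loan and 1 for a good loan, matching the coding above."""
    return 0 if status.strip() in BAD_STATUSES else 1


# Small hand-made sample of statuses (not taken from the dataset).
statuses = ["Fully Paid", "Charged Off", "Current", "Late (31-120 days)", "Default"]
labels = [encode_loan_status(s) for s in statuses]

# Share of bad loans in the sample, i.e. the share of labels equal to 0.
bad_rate = labels.count(0) / len(labels)
```

Because bad loans are coded as 0, the bad rate is the share of zeros rather than the mean of the indicator.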

Modeling Approach

The PD model is developed using a traditional credit risk modeling framework that emphasizes interpretability and regulatory compliance. The approach includes:

- Variable binning to capture non-linear relationships and reduce noise
- Transformation of predictors using Weight of Evidence (WoE)
- Feature selection based on Information Value (IV) to identify variables with strong discriminatory power

This methodology is widely used in credit scoring applications and enables transparent interpretation of model outputs.
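As a rough sketch of the WoE and IV calculations for a single binned variable, the example below uses the common definitions WoE = ln(share of goods / share of bads) per bin and IV = sum over bins of (share of goods − share of bads) × WoE. The function name `woe_iv` and the `smoothing` constant (added to the counts to avoid taking the log of zero) are assumptions for illustration; the notebooks may handle empty bins differently.

```python
import math
from collections import defaultdict


def woe_iv(bins, labels, smoothing=0.5):
    """Compute WoE per bin and the total IV for one binned variable.

    bins:   bin label of each observation
    labels: 1 = good loan, 0 = bad loan (the coding used in this README)
    smoothing: count added to every cell so that empty cells do not
               produce log(0); an assumed convention, not the author's.
    """
    good = defaultdict(float)
    bad = defaultdict(float)
    for b, y in zip(bins, labels):
        if y == 1:
            good[b] += 1
        else:
            bad[b] += 1

    keys = sorted(set(good) | set(bad))
    total_good = sum(good[k] + smoothing for k in keys)
    total_bad = sum(bad[k] + smoothing for k in keys)

    woe = {}
    iv = 0.0
    for k in keys:
        share_good = (good[k] + smoothing) / total_good
        share_bad = (bad[k] + smoothing) / total_bad
        woe[k] = math.log(share_good / share_bad)
        iv += (share_good - share_bad) * woe[k]
    return woe, iv


# Toy data: bin "A" is mostly good loans, bin "B" is evenly mixed.
bins = ["A"] * 10 + ["B"] * 10
labels = [1] * 9 + [0] * 1 + [1] * 5 + [0] * 5
woe, iv = woe_iv(bins, labels)
```

With this sign convention, a positive WoE marks a bin with a higher share of good loans than the overall portfolio, and a variable whose bins all have WoE near zero contributes an IV near zero.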

Temporal Data Split

To reflect real-world credit risk modeling conditions, the dataset is divided into two time-based samples:

- a development sample, consisting of loan applications available at the time of model construction;
- a validation sample, containing applications observed after the model was developed.

This temporal split allows for the assessment of model stability over time and the evaluation of potential shifts in borrower characteristics.
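A time-based split and a simple drift check could look like the sketch below. The cutoff date, the field names `issue_date` and `grade`, and the use of the Population Stability Index (PSI) are all assumptions made for illustration; the README does not state which cutoff or stability measure the notebooks use.

```python
import math
from datetime import date


def temporal_split(records, cutoff, date_key="issue_date"):
    """Split records into (development, validation) around a cutoff date."""
    development = [r for r in records if r[date_key] < cutoff]
    validation = [r for r in records if r[date_key] >= cutoff]
    return development, validation


def category_shares(records, key, categories):
    """Share of records in each category, in the order given."""
    n = len(records)
    return [sum(r[key] == c for r in records) / n for c in categories]


def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two lists of category shares."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # floor avoids log(0) for empty bins
        total += (a - e) * math.log(a / e)
    return total


# Hand-made loans; dates and grades are illustrative only.
loans = [
    {"issue_date": date(2013, 5, 1), "grade": "A"},
    {"issue_date": date(2014, 2, 1), "grade": "B"},
    {"issue_date": date(2014, 11, 1), "grade": "A"},
    {"issue_date": date(2014, 12, 1), "grade": "B"},
    {"issue_date": date(2015, 3, 1), "grade": "B"},
    {"issue_date": date(2015, 7, 1), "grade": "B"},
    {"issue_date": date(2015, 9, 1), "grade": "A"},
    {"issue_date": date(2015, 10, 1), "grade": "B"},
]

# Hypothetical cutoff: loans issued before 2015 form the development sample.
dev, val = temporal_split(loans, cutoff=date(2015, 1, 1))
grade_psi = psi(
    category_shares(dev, "grade", ["A", "B"]),
    category_shares(val, "grade", ["A", "B"]),
)
```

A PSI of zero means the two samples have identical category shares; larger values indicate a stronger shift in the borrower mix between development and validation.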

Practical Implications

The resulting PD model can be applied to:

- assess the creditworthiness of new loan applicants;
- support loan approval and pricing decisions;
- serve as an input to expected loss and broader credit risk management frameworks.
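To illustrate the last point, expected loss is commonly computed as PD × LGD × EAD. The sketch below assumes that standard formula; the function name `expected_loss` and the LGD and EAD figures are hypothetical, since this project models PD only.

```python
def expected_loss(prob_default: float, lgd: float, ead: float) -> float:
    """Expected loss as PD x LGD x EAD.

    prob_default: probability of default, between 0 and 1
    lgd:          loss given default, as a fraction of exposure (0 to 1)
    ead:          exposure at default, in currency units
    """
    if not 0.0 <= prob_default <= 1.0:
        raise ValueError("prob_default must be between 0 and 1")
    if not 0.0 <= lgd <= 1.0:
        raise ValueError("lgd must be between 0 and 1")
    if ead < 0:
        raise ValueError("ead must be non-negative")
    return prob_default * lgd * ead


# Hypothetical inputs: 5% PD, 45% LGD, 10,000 outstanding exposure.
el = expected_loss(prob_default=0.05, lgd=0.45, ead=10_000)
```

Note that under the coding used above (bad = 0), a classifier's predicted probability for class 1 is the probability of a good loan, so the PD passed in here would be one minus that value.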

