Predictive Credit Risk Model

This repository contains a machine learning project for credit risk prediction using the UCI Default of Credit Card Clients dataset. The model predicts whether a client will default on their credit card payment based on their demographic, payment history, and bill statement data.

Overview

Credit risk assessment helps financial institutions minimize losses. This project uses a Random Forest Classifier to predict the likelihood of a client defaulting, and evaluates it with accuracy, ROC AUC, and a per-class classification report.

Dataset

The dataset used is sourced from the UCI Machine Learning Repository and contains:

30,000 samples of credit card clients.
23 features:
- Credit limit: LIMIT_BAL
- Demographic information: SEX, AGE, EDUCATION, MARRIAGE
- Payment history: PAY_0, PAY_2 to PAY_6 (the UCI file has no PAY_1 column)
- Bill statements: BILL_AMT1 to BILL_AMT6
- Payment amounts: PAY_AMT1 to PAY_AMT6
Target variable: default (1 = Default, 0 = No Default)
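
The column groups above can be written down as a small schema check. This is only a sketch: it assumes the UCI column naming, in which the payment-status columns are PAY_0 and PAY_2 through PAY_6 (there is no PAY_1), and it assumes the target column is named `default` as described here. The helper `missing_columns` is a hypothetical name, not part of the notebook.

```python
# Expected feature columns, grouped as in the Dataset section.
DEMOGRAPHIC = ["SEX", "EDUCATION", "MARRIAGE", "AGE"]
# Assumption: UCI naming skips PAY_1, so the six columns are PAY_0, PAY_2..PAY_6.
PAYMENT_STATUS = ["PAY_0"] + [f"PAY_{i}" for i in range(2, 7)]
BILL_AMOUNTS = [f"BILL_AMT{i}" for i in range(1, 7)]
PAYMENT_AMOUNTS = [f"PAY_AMT{i}" for i in range(1, 7)]

FEATURES = ["LIMIT_BAL"] + DEMOGRAPHIC + PAYMENT_STATUS + BILL_AMOUNTS + PAYMENT_AMOUNTS
TARGET = "default"  # assumed column name: 1 = Default, 0 = No Default


def missing_columns(columns):
    """Return the expected columns (features plus target) absent from `columns`."""
    present = set(columns)
    return [name for name in FEATURES + [TARGET] if name not in present]


print(len(FEATURES), "features expected")
print("Missing from a partial frame:", missing_columns(["LIMIT_BAL", "SEX", "default"])[:3])
```

Calling `missing_columns(df.columns)` on the loaded DataFrame is a quick way to catch a renamed or dropped column before preprocessing.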

Workflow

Data Preprocessing:
- Filling missing values with column means.
- Standardizing numeric features using StandardScaler.
- Encoding categorical variables using LabelEncoder.
Class Balance Check:
- The evaluation data has an equal number of Default and No Default samples (4,673 each, matching the row totals of the confusion matrix in Results), so no resampling is applied.
Model Training:
- A Random Forest Classifier is trained.
- Hyperparameters are tuned using GridSearchCV.
Model Evaluation:
- Accuracy: 85.4%
- ROC AUC Score: 0.924
- Detailed Classification Report and Confusion Matrix are generated.
Feature Importance:
- The top predictors of credit default are identified, including LIMIT_BAL, PAY_0, and BILL_AMT features.
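
The workflow steps above can be sketched end to end as follows. This is a minimal illustration, not the notebook's code: it runs on a small synthetic table (made-up values under a few of the real column names), and the parameter grid, the `roc_auc` scoring choice, the 80/20 split, and the helper names `make_toy_data` and `run_workflow` are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler


def make_toy_data(n=400, seed=0):
    """Synthetic stand-in for the credit data (hypothetical values)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "LIMIT_BAL": rng.uniform(10_000, 500_000, n),
        "AGE": rng.integers(21, 70, n).astype(float),
        "PAY_0": rng.integers(-1, 4, n).astype(float),
        "BILL_AMT1": rng.uniform(0, 100_000, n),
        "SEX": rng.choice(["male", "female"], n),
    })
    # A late latest payment makes default more likely, so there is signal to learn.
    df["default"] = (df["PAY_0"] + rng.normal(0, 1, n) > 1).astype(int)
    # Inject a few gaps so the mean-imputation step has something to do.
    df.loc[rng.choice(n, 20, replace=False), "BILL_AMT1"] = np.nan
    return df


def run_workflow(df, target="default", seed=0):
    X, y = df.drop(columns=[target]).copy(), df[target]
    numeric = X.select_dtypes(include="number").columns
    categorical = X.columns.difference(numeric)

    # Class balance check.
    class_counts = y.value_counts().to_dict()

    # Encode categorical variables, one LabelEncoder per column.
    for col in categorical:
        X[col] = LabelEncoder().fit_transform(X[col])

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    X_train, X_test = X_train.copy(), X_test.copy()

    # Fill missing values with training-set column means, then standardize.
    means = X_train[numeric].mean()
    X_train[numeric] = X_train[numeric].fillna(means)
    X_test[numeric] = X_test[numeric].fillna(means)
    scaler = StandardScaler().fit(X_train[numeric])
    X_train[numeric] = scaler.transform(X_train[numeric])
    X_test[numeric] = scaler.transform(X_test[numeric])

    # Random Forest with a small, illustrative hyperparameter grid.
    grid = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
        scoring="roc_auc", cv=3)
    grid.fit(X_train, y_train)
    model = grid.best_estimator_

    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]
    importances = pd.Series(model.feature_importances_, index=X_train.columns)
    return {
        "class_counts": class_counts,
        "best_params": grid.best_params_,
        "accuracy": accuracy_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_prob),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(y_test, y_pred),
        "importances": importances.sort_values(ascending=False),
    }


results = run_workflow(make_toy_data())
print(results["report"])
print(results["importances"])
```

The imputation means and the scaler are fitted on the training split only and then applied to the test split, which keeps test-set statistics out of training. Whether the notebook does the same is not stated here.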

Results

Key Metrics:

| Metric | Value |
| --- | --- |
| Accuracy | 85.4% |
| ROC AUC | 0.924 |
| Precision | 0.85–0.86 |
| Recall | 0.85–0.86 |

Confusion Matrix:

The confusion matrix highlights the prediction performance for both classes:

| Actual \ Predicted | No Default | Default |
| --- | --- | --- |
| No Default | 4024 | 649 |
| Default | 711 | 3962 |
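
The accuracy, precision, and recall figures under Key Metrics can be re-derived from the four confusion matrix counts. The short check below uses only the numbers reported above and treats Default as the positive class; the variable names are illustrative.

```python
# Counts from the confusion matrix (rows = actual, columns = predicted).
tn, fp = 4024, 649   # actual No Default
fn, tp = 711, 3962   # actual Default

total = tn + fp + fn + tp
accuracy = (tn + tp) / total

# Per-class precision and recall.
precision_default = tp / (tp + fp)
recall_default = tp / (tp + fn)
precision_no_default = tn / (tn + fn)
recall_no_default = tn / (tn + fp)

print(f"accuracy             = {accuracy:.3f}")
print(f"precision (Default)  = {precision_default:.3f}")
print(f"recall (Default)     = {recall_default:.3f}")
print(f"precision (No Def.)  = {precision_no_default:.3f}")
print(f"recall (No Def.)     = {recall_no_default:.3f}")
```

All four per-class values fall in the 0.85 to 0.86 range, and the accuracy rounds to 85.4%. ROC AUC cannot be recovered this way because it depends on the predicted probabilities, not on the thresholded counts.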

Installation

To run this project locally, follow these steps:

Clone the Repository:

git clone https://github.com/<YourUsername>/<RepoName>.git
cd <RepoName>

Install Dependencies: Install the required Python libraries using pip:
```
pip install -r requirements.txt
```
Run the Jupyter Notebook: Launch Jupyter and open PredictiveCreditRisk.ipynb to explore the code:
```
jupyter notebook
```

Requirements

Python 3.8+
Libraries:
- pandas
- numpy
- matplotlib
- seaborn
- scikit-learn
- imbalanced-learn (if SMOTE is applied in future versions)
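
Before opening the notebook, it can help to confirm that the libraries listed above are importable. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical, and the mapping reflects the fact that scikit-learn and imbalanced-learn are imported as `sklearn` and `imblearn`.

```python
from importlib.util import find_spec

# pip package name -> import name
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}
OPTIONAL = {"imbalanced-learn": "imblearn"}  # only needed if SMOTE is added later


def missing_packages(packages):
    """Return the pip names whose import module cannot be found."""
    return [pip_name for pip_name, module in packages.items()
            if find_spec(module) is None]


print("Missing required packages:", missing_packages(REQUIRED))
print("Missing optional packages:", missing_packages(OPTIONAL))
```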

Visualizations

Confusion Matrix: see ConfusionMatrix.png
Feature Importance: see FeatureImportance.png
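
A confusion matrix figure of this kind can be drawn directly from the reported counts. The following is a sketch using scikit-learn's `ConfusionMatrixDisplay`, not necessarily how the repository's image was produced; the colormap, figure size, and output file name `confusion_matrix_sketch.png` are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file without needing a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import ConfusionMatrixDisplay

# Counts reported in the Results section (rows = actual, columns = predicted).
cm = np.array([[4024, 649],
               [711, 3962]])

display = ConfusionMatrixDisplay(confusion_matrix=cm,
                                 display_labels=["No Default", "Default"])
fig, ax = plt.subplots(figsize=(5, 4))
display.plot(ax=ax, cmap="Blues", values_format="d", colorbar=False)
ax.set_title("Confusion Matrix")
fig.tight_layout()
fig.savefig("confusion_matrix_sketch.png", dpi=150)  # hypothetical output name
```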

Next Steps

Compare performance with other models like XGBoost and LightGBM.
Deploy the model as an API for real-time predictions.
Add visualization dashboards for better insights.

Contributions

Contributions are welcome! Feel free to fork the repository, create a new branch, and submit a pull request.

License

This project is licensed under the MIT License.

Author

Jebin Larosh Jervis
Connect with me: LinkedIn
