Credit Card Approval Classification

By: Tahsin Jahin Khalid

This machine learning project analyzes a Kaggle credit card approval dataset and trains classification models to predict whether a credit card application is approved or not.

Technology Used:

Python (for Data Preprocessing)
Orange Data Mining

Dataset

Kaggle - Credit Card Approval
GPT4 (test data generation)

About Dataset

Commercial banks receive a lot of applications for credit cards. Many are rejected for reasons such as high loan balances, low income levels, or too many inquiries on an individual's credit report. Manually analyzing these applications is mundane, error-prone, and time-consuming.

About Project

Orange Data Mining Workflow

Project Summary

The preprocessed/cleaned data CSV file is loaded via the CSV File Import widget.
A Select Columns widget is used to designate the "ApprovalStatus" variable as the target variable. The "Unnamed: 0" column is added to the ignore columns field.
We use a Distributions widget to visualise the target variable to check if there is any imbalance.
The next stage is training and validating various models on the dataset for classification. For this project we have used logistic regression, decision tree, random forest, neural network and a stacked model of these for comparison.
The metrics of the model (Test and Score widget) are shown below:
- From these metrics, it can be seen that:
  - Out of the five models used, the stacked model and the neural network model have performed better than the baseline logistic regression model.
  - Comparing the stacked model and the neural network, the neural network (NN) model has performed marginally better in classification.
  - The confusion matrix of the NN model on training/validation data is shown below:
GPT4 is used to generate synthetic testing data to test the performance of the model on unseen data.
The NN model is used to make predictions on the unseen test data.
- The AUC on the test data is 0.566, and the Classification Accuracy is 52%.
- The confusion matrix confirms that the model performs poorly on the test data; an AUC of 0.566 is only slightly above the 0.5 expected from random guessing.
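
For readers who want to see what the figures in the summary measure, the following is a minimal sketch in plain Python of a class balance check, Classification Accuracy, confusion matrix counts and AUC. It is not Orange's implementation and it is not part of this repository; the function names and the toy labels and scores are made up for illustration and are not the project's data.

```python
from collections import Counter


def class_balance(labels):
    """Share of each class in the target, the quantity one inspects for imbalance."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def classification_accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def confusion_counts(y_true, y_pred, positive=1):
    """Counts of true/false positives and negatives for a binary target."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["tp" if t == positive else "fp"] += 1
        else:
            counts["fn" if t == positive else "tn"] += 1
    return counts


def auc(y_true, scores, positive=1):
    """AUC as the probability that a randomly chosen positive example gets a
    higher score than a randomly chosen negative one (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (hypothetical, not taken from the project).
y_true = [1, 0, 1, 1, 0, 0]               # 1 = approved, 0 = rejected
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]  # predicted probability of approval
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(class_balance(y_true))
print(classification_accuracy(y_true, y_pred))
print(confusion_counts(y_true, y_pred))
print(auc(y_true, scores))
```

A model that ranks approved and rejected applications no better than chance gets an AUC of 0.5 under this definition, which is the reference point for the test AUC reported above.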
