A Comparative Analysis of Deep Learning Sequential Models for Temporal Feature Extraction in Credit Card Default Prediction

Developed as an MSc dissertation for the program "Business Analytics: Operational Research and Risk Analysis" at the Alliance Manchester Business School.

Abstract

Three popular Deep Learning (DL) models (RNN, LSTM, CNN) were developed under various experimental setups, consisting of different final classification layers and input features. The purpose of this study was to identify the best-performing DL model for extracting the dynamic characteristics of raw transactional data and predicting credit card default without the need for feature engineering. The LSTM model outperformed the other two, showing a strong ability to identify both defaulters and non-defaulters. Such an AI-driven approach has the potential to save financial institutions significant amounts of money by improving their credit risk assessment processes.

Keywords: Machine Learning, Deep Learning, Temporal Feature Extraction, Credit Card Default

Installation and Setup

Code and Resources Used

  • IDE Used: VSCode
  • Python Version: 3.10.9

Python Packages Used

  • General Purpose: copy, os, random
  • Data Manipulation: pandas, numpy
  • Data Visualization: seaborn, matplotlib
  • Machine Learning Models - Preprocessing: scikit-learn, xgboost
  • Deep Learning Models: tensorflow, keras

Code structure

The python code folder in this GitHub repository contains three files:

  • EDA.ipynb: Jupyter notebook with all the exploratory data analysis (EDA) plots of the project
  • hyperparameter_tuning.py: Python script covering the full tuning process for all 24 models developed in this study (architecture tuning, followed by optimizer, learning rate, and batch size tuning; see the elbow plots - tuning folder below)
  • results_on_test_set_optimal_conf.py: Python script containing the optimal configurations derived from tuning and the final results on the test set

Additional folders/files in this repository

  • data: folder containing the dataset
  • elbow plots - tuning: Excel files containing the F1 scores for each architecture tested in the first tuning phase (architecture tuning). These files are read in hyperparameter_tuning.py to plot elbow curves and select the architecture that yields the highest F1 score at the lowest possible complexity (a minimal plotting sketch follows this list). The selected optimal architectures were then tuned further in a second phase over different optimizers, learning rates, and batch sizes.
  • technical_graphs: PDF file with visualizations of the modelling pipeline and the different experimental setups used in this study, as well as diagrams illustrating how the three sequential DL models work.
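For illustration, here is a minimal sketch of how such an elbow curve could be drawn from one of these Excel files. The file name and column names (hidden_units, mean_f1) are assumptions for illustration; the actual layout used by hyperparameter_tuning.py may differ.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical file and column names; the real Excel layout in
# "elbow plots - tuning" may differ.
scores = pd.read_excel("elbow plots - tuning/lstm_architecture_f1.xlsx")

plt.plot(scores["hidden_units"], scores["mean_f1"], marker="o")
plt.xlabel("Hidden units (model complexity)")
plt.ylabel("Mean F1 score (5-fold CV)")
plt.title("Elbow curve: pick the smallest architecture near the F1 plateau")
plt.show()
```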

Data

default_of_credit_card_clients.xls: dataset obtained from the UCI Machine Learning Repository (Yeh, 2016), containing credit card payment history for customers of a Taiwanese bank. Available at: https://archive.ics.uci.edu/dataset/350/default+of+credit+card+clients
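As a rough illustration of how the raw file can be split into static and temporal inputs for the sequential models, the sketch below loads the .xls with pandas and stacks the six monthly repayment-status, bill-amount, and payment-amount columns into a (samples, 6 timesteps, 3 features) array. The column names follow the UCI documentation; the exact preprocessing used in this study is in the repository's Python scripts and may differ.

```python
import numpy as np
import pandas as pd

# Column names as documented by the UCI repository (note there is no PAY_1).
df = pd.read_excel("data/default_of_credit_card_clients.xls", header=1)

static_cols = ["LIMIT_BAL", "SEX", "EDUCATION", "MARRIAGE", "AGE"]
pay_cols = ["PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6"]   # repayment status
bill_cols = [f"BILL_AMT{i}" for i in range(1, 7)]                    # bill statements
amt_cols = [f"PAY_AMT{i}" for i in range(1, 7)]                      # previous payments

X_static = df[static_cols].to_numpy()
# Stack the three monthly variables into (samples, 6 timesteps, 3 features);
# the columns run from the most recent month to the oldest, so reverse the
# timestep axis if chronological order is required.
X_temporal = np.stack(
    [df[pay_cols].to_numpy(), df[bill_cols].to_numpy(), df[amt_cols].to_numpy()],
    axis=-1,
)
y = df["default payment next month"].to_numpy()
print(X_static.shape, X_temporal.shape, y.shape)  # (30000, 5) (30000, 6, 3) (30000,)
```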


Research Design

The graphs below illustrate the 8 stages in which the DL models were compared. Each stage combines a final classification layer (FCL), namely a dense layer, logistic regression, random forest, or XGBoost, with a choice of input features (IF): temporal features alone or temporal plus static features. The optimal Temporal Feature Extractor (TFE) is the one that consistently demonstrates superior performance across the majority of stages. Before the final assessment, each model underwent rigorous hyperparameter tuning over architecture, optimizer, learning rate, and batch size.

Image: experimental setups (IF + FCL stages) and modelling pipeline
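To make the IF + FCL stages concrete, the sketch below shows an LSTM used as the TFE with two of the possible final classification layers: a dense (sigmoid) layer trained end-to-end, and an external logistic regression fitted on the extracted temporal features. Layer sizes and training settings are illustrative placeholders, not the tuned configurations reported later.

```python
import numpy as np
from tensorflow import keras
from sklearn.linear_model import LogisticRegression

n_timesteps, n_features = 6, 3  # six monthly observations, three variables each

# Stage with a dense final classification layer: LSTM TFE + sigmoid output.
inputs = keras.Input(shape=(n_timesteps, n_features))
features = keras.layers.LSTM(32)(inputs)              # illustrative size
outputs = keras.layers.Dense(1, activation="sigmoid")(features)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(X_temporal, y, epochs=..., batch_size=...)

# Stage with an external final classifier: reuse the trained LSTM as a
# feature extractor and fit e.g. logistic regression on its outputs.
extractor = keras.Model(inputs, features)
# Z = extractor.predict(X_temporal)
# clf = LogisticRegression(max_iter=1000).fit(Z, y)
```

For the stages that include static features, the same idea applies with the static columns concatenated to the extracted temporal representation before the final classifier.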

Results

Hyperparameter tuning - optimal configurations

The table below presents the final configurations of the 24 models spanning the 8 IF + FCL stages, together with the mean F1 scores and standard deviations obtained from the 5-fold cross-validation procedure. Notably, irrespective of the chosen input features, LSTM models, whether paired with a dense layer or logistic regression (LR) as the final classifier, consistently achieved the highest mean F1 scores (the appendix referenced in the table is not included in this repository).

Image: optimal configurations and cross-validated mean F1 scores for all 24 models
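The mean and standard deviation of the F1 score over the 5 folds can be computed along the following lines. This is a sketch with a hypothetical helper (evaluate_configuration) and placeholder epochs/batch size, not the repository's exact tuning loop.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import f1_score

def evaluate_configuration(build_model, X, y, epochs=30, batch_size=64):
    """Mean and std of the F1 score over stratified 5-fold cross-validation.

    build_model is any callable returning a fresh, compiled Keras model;
    epochs and batch_size stand in for the tuned values.
    """
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    scores = []
    for train_idx, val_idx in skf.split(X, y):
        model = build_model()
        model.fit(X[train_idx], y[train_idx], epochs=epochs,
                  batch_size=batch_size, verbose=0)
        preds = (model.predict(X[val_idx]) > 0.5).astype(int).ravel()
        scores.append(f1_score(y[val_idx], preds))
    return float(np.mean(scores)), float(np.std(scores))
```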

A summary of the optimal configurations obtained after tuning each DL model is shown below:

RNN

Image: RNN optimal configuration

LSTM

Image: LSTM optimal configuration

CNN

Image: CNN optimal configuration

Final Classification Results for all TFEs in each IF + FCL stage

As observed below, the RNN achieved the best performance in 3 of the 8 stages (stages 3, 4, and 5), the LSTM in 6 of 8 (stages 1, 2, 4, 5, 6, and 7), and the CNN in 2 of 8 (stages 6 and 8). Notably, the highest F1 scores achieved (0.481, 0.482, and 0.488) were all associated with LSTM as the TFE.

Additionally, models employing RF and XGBoost as the final classification layer (stages 3, 4, 7, and 8) tended to show signs of overfitting. However, this pattern was consistent across all three TFEs in those stages, so it did not adversely affect the comparative analysis between them.

In conclusion, irrespective of the input features or the final classification layer chosen, LSTM consistently emerged as a top-performing model in the majority of cases, demonstrating its robustness in the final comparison.

Image: final classification results on the test set for all TFEs in each IF + FCL stage
