In this report, we perform an initial examination of the dataset obtained from UC Irvine in 2016 on credit card clients in Taiwan and whether they defaulted on their payments. Defaulting on a credit card is defined as failing to make the minimum payment for at least 180 days. Building a predictive model to assess the likelihood of a customer defaulting requires fairness safeguards to prevent discrimination based on sensitive features. We plan to explore whether we can find a default-risk prediction model that is fair across different subgroups (e.g., gender, education level) while still remaining accurate.
The original source for the data can be found here: https://archive.ics.uci.edu/dataset/350/default+of+credit+card+clients
Our main research question is as follows: Is it possible to develop an accurate and fair algorithmic decision-making model for predicting credit card default without relying on sensitive features? Sub-questions stemming from this include: Are models not trained on sensitive features still accurate at predicting credit card default? And how does predictive performance differ across genders?
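To illustrate the core idea, the minimal sketch below drops the sensitive columns before training a simple classifier and then compares test accuracy across gender subgroups. This is not our actual pipeline (see phase5.ipynb for that); the CSV filename and the choice of logistic regression are assumptions made for the example, while the column names (SEX, EDUCATION, "default payment next month") follow the original UCI data.

```python
# Hypothetical sketch: train a default-risk classifier WITHOUT sensitive
# features, then evaluate accuracy separately by gender. Assumes the UCI
# data has been exported to a local CSV with its original column names.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("default_of_credit_card_clients.csv")  # assumed filename
y = df["default payment next month"]
sensitive = df["SEX"]  # kept only for evaluation, never used in training

# Drop the label and the sensitive features before fitting the model.
X = df.drop(columns=["default payment next month", "SEX", "EDUCATION"])

X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
    X, y, sensitive, test_size=0.3, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
preds = model.predict(X_te)

print("Overall accuracy:", accuracy_score(y_te, preds))
for group in sorted(s_te.unique()):  # 1 = male, 2 = female in the UCI coding
    mask = s_te == group
    print(f"Accuracy for SEX={group}:", accuracy_score(y_te[mask], preds[mask]))
```

A gap between the per-group accuracies in a sketch like this is what motivates the fairness analysis in the notebook, where we examine subgroup performance in more depth.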
See the notebook phase5.ipynb for how we approached this research question. This project was completed for INFO 4390: Designing Fair Algorithms at Cornell University.
Tanvi Namjoshi, Dylan Van Bramer, Madeline Demers, Ella White