This project was part of the African Credit Scoring Challenge, aiming to predict loan defaults in Africa’s dynamic financial markets. The objective was to build a robust machine learning model and a scalable credit scoring function to assist financial institutions in mitigating risk and optimising lending decisions.
- F1-score: Achieved a competitive score of 0.71 early in the competition.
- Imbalanced Data: Addressed class imbalance using SMOTEENN for hybrid resampling.
- Feature Engineering: Incorporated demographic and economic factors specific to African markets (economic-dataset.csv).
- Model Optimisation: Fine-tuned for robustness and generalisability.
- Credit Scoring: Developed a scalable function to classify probabilities into actionable risk categories.
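
As a rough illustration of the feature-engineering step, the sketch below joins country-level economic indicators onto loan records. The layout of economic-dataset.csv is not described here, so every column name (`country`, `year`, `inflation_rate`, `disbursement_date`) and every value is a made-up assumption, and small in-memory frames stand in for the real files.

```python
import pandas as pd

# Hypothetical loan records; column names and values are illustrative only.
loans = pd.DataFrame({
    "loan_id": [1, 2, 3],
    "country": ["Kenya", "Kenya", "Ghana"],
    "disbursement_date": pd.to_datetime(["2022-03-01", "2023-07-15", "2023-01-10"]),
    "amount": [5000.0, 12000.0, 8000.0],
})

# Hypothetical stand-in for economic-dataset.csv (one row per country and year).
economic = pd.DataFrame({
    "country": ["Kenya", "Kenya", "Ghana"],
    "year": [2022, 2023, 2023],
    "inflation_rate": [5.0, 6.0, 10.0],  # placeholder numbers, not real statistics
})

# Derive the join key, then attach the indicators to each loan.
loans["year"] = loans["disbursement_date"].dt.year
features = loans.merge(
    economic,
    on=["country", "year"],
    how="left",                # keep every loan even if no indicator matches
    validate="many_to_one",    # fail loudly if the economic table has duplicate keys
)

print(features[["loan_id", "country", "year", "inflation_rate"]])
```

A left join keeps loans from countries or years missing in the economic table, leaving the indicator as NaN for later imputation.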
Data Challenges:
- Managed significant class imbalances with advanced resampling techniques.
- Ensured generalisability across diverse customer demographics: the training dataset contains only Kenyan data, whereas the test dataset on Zindi can include data from other countries, such as Ghana.
Machine Learning Techniques:
- Used XGBoost for high-performance classification.
- Applied SMOTEENN for handling imbalanced data.
Credit Scoring Function:
- Designed a scalable system to categorise risk levels based on model predictions.
- Provided actionable insights for financial decision-making.
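
A function of this kind could be sketched as follows. The thresholds, label names, and the function name `credit_risk_category` are hypothetical choices for illustration; the cut-offs actually used in this project are not stated here.

```python
import numpy as np

# Hypothetical cut-offs and labels; the project's real thresholds are not stated.
RISK_THRESHOLDS = (0.2, 0.5, 0.8)
RISK_LABELS = ("Low", "Medium", "High", "Very High")


def credit_risk_category(default_probabilities, thresholds=RISK_THRESHOLDS, labels=RISK_LABELS):
    """Map predicted default probabilities to risk labels in one vectorised pass."""
    probs = np.asarray(default_probabilities, dtype=float)
    if np.any((probs < 0) | (probs > 1)):
        raise ValueError("probabilities must lie in [0, 1]")
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    # np.digitize returns the index of the bin each probability falls into;
    # a value equal to a threshold goes to the higher-risk bin.
    bins = np.digitize(probs, thresholds)
    return np.asarray(labels)[bins]


categories = credit_risk_category([0.05, 0.35, 0.5, 0.93])
print(categories.tolist())  # ['Low', 'Medium', 'High', 'Very High']
```

Because the mapping is vectorised, the same call works on a single probability or on the full output of a model's `predict_proba` column, and the thresholds can be adjusted per lender without touching the model.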
This project was both challenging and rewarding, demanding a mix of technical expertise and strategic thinking. It highlights my ability to handle real-world data challenges and propose scalable solutions.
- Languages: Python
- Libraries: Pandas, NumPy, Scikit-learn, XGBoost, imbalanced-learn (for SMOTEENN), Matplotlib, Seaborn
- Techniques: Ensemble learning
Feel free to reach out if you'd like to discuss this project or my approach! 😊