
Journal of Software Engineering and Applications 2025

Credit Score Classification Using Advanced Machine Learning: A Comprehensive Approach

DOI: 10.4236/jsea.2025.183007, pp. 98-112

Chaoya Yan, Xinyu Zhang, Jiaqing Shen

Keywords: Credit Scoring, Machine Learning, CatBoost, Feature Engineering, Class Imbalance, Financial Risk Assessment


Abstract:

This paper presents a comprehensive machine learning approach for credit score classification, addressing key challenges in financial risk assessment. We propose an optimized CatBoost-based framework that integrates advanced feature engineering, systematic class imbalance handling, and robust evaluation metrics. Our methodology achieves strong classification performance, with AUC scores of 0.944, 0.858, and 0.928 for the Poor, Standard, and Good credit score classes, respectively. The system particularly excels in distinguishing high-risk (Poor) and low-risk (Good) credit profiles, while the Standard class remains the most challenging due to its overlapping characteristics. Through extensive experimentation and analysis, we provide valuable insights into feature importance and model behavior, offering practical implications for financial institutions and credit scoring systems.
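The per-class AUC scores quoted above are most naturally read as one-vs-rest values: each class in turn is treated as the positive label and the other two as negative. The abstract does not state this, so the sketch below is an assumption about the evaluation scheme, not the authors' code. The helper names (`binary_auc`, `one_vs_rest_auc`) and the toy labels and probabilities are hypothetical and only illustrate the calculation; they are not data from the paper.

```python
from typing import Dict, List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    y_true: Sequence[str],
    proba: Sequence[Sequence[float]],
    classes: List[str],
) -> Dict[str, float]:
    """Compute one AUC per class, treating that class as positive and the rest as negative."""
    result = {}
    for k, name in enumerate(classes):
        labels = [1 if y == name else 0 for y in y_true]
        scores = [row[k] for row in proba]  # predicted probability of class k
        result[name] = binary_auc(labels, scores)
    return result


# Hypothetical toy data: six samples, columns ordered as in CLASSES.
CLASSES = ["Poor", "Standard", "Good"]
y_true = ["Poor", "Poor", "Standard", "Standard", "Good", "Good"]
proba = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.4, 0.4],
    [0.1, 0.3, 0.6],
    [0.1, 0.5, 0.4],
]

per_class_auc = one_vs_rest_auc(y_true, proba, CLASSES)
for name in CLASSES:
    print(f"{name}: AUC = {per_class_auc[name]:.4f}")
```

In this toy example the middle class scores lowest because its predicted probabilities overlap with both neighbours, which mirrors the pattern the abstract reports for the Standard class. The pairwise loop is quadratic in the number of samples; a rank-based formulation would be preferable on a real dataset.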

