Research on Internet Credit Risk Prediction Based on Model Fusion

DOI: 10.12677/SA.2019.85093, PP. 823-834

Keywords: Logistic Regression, Credit Risk, Random Forest, XGBoost, LightGBM


Abstract:

Predicting credit risk in Internet lending is a key factor in the sustainable development of Internet finance: accurately estimating a borrower's credit risk before a loan is issued can effectively reduce a lender's potential losses. As machine learning has developed, its algorithms and models have been applied more and more widely to Internet credit risk. To examine how well a fusion of tree models and a linear model predicts Internet credit risk, this paper designs a credit risk prediction model using the Stacking model fusion method, in which the first-layer models are random forest, XGBoost, and LightGBM, and the second-layer model is logistic regression. Experiments are conducted on real data from PPDai (拍拍贷), comparing the fused model with the single models on AUC, accuracy, and running time. The results show that the fused model takes somewhat longer to run but outperforms the single models on both AUC and accuracy, offering a new approach to building credit risk prediction models for Internet finance.

