Finance  2020 

P2P Lending Default Warning Model Based on Local Sampling Sample Equilibrium

DOI: 10.12677/FIN.2020.105027, PP. 455-464

Keywords: P2P Lending, Default Warning, Random Forest, Sample Equilibrium


Abstract:

With the continued development of Internet finance, identifying borrower default risk in peer-to-peer (P2P) lending has drawn close attention from financial institutions. As Internet finance rectification measures have taken effect, the number of loan defaults has been decreasing, so historical default data for P2P lending keeps shrinking and default warning analysis based on unbalanced data becomes particularly important. Building on the BSL sampling algorithm for unbalanced samples, this paper uses K-means clustering to reduce the time complexity of sampling, compares random forest against other machine learning classification algorithms in experiments, and adds text analysis of the loan description and loan title. The result is a random forest based P2P lending default warning model that identifies default risk on unbalanced P2P lending data. The model achieves high efficiency and a high recognition rate while also meeting the practical need for incremental learning, and it offers some regulatory guidance for P2P lending platforms.
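The idea of balancing classes by sampling within clusters can be illustrated with a minimal sketch. This is not the paper's BSL algorithm, whose details the abstract does not give; it is a generic cluster-wise undersampling of the majority class, written with the standard library only. The function names `kmeans` and `balance_by_cluster`, the two-feature toy data, and the choice of `k=3` are all assumptions made for illustration.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def balance_by_cluster(majority, minority, k=3, seed=0):
    """Undersample the majority class cluster by cluster to roughly match the minority size."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, seed=seed)
    clusters = [[p for p, l in zip(majority, labels) if l == c] for c in range(k)]
    target = len(minority)
    kept = []
    for members in clusters:
        # Each cluster contributes in proportion to its share of the majority class.
        quota = round(target * len(members) / len(majority))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept, minority

# Toy data (hypothetical): many repaid loans, few defaulted loans, two numeric features each.
data_rng = random.Random(42)
repaid = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
defaulted = [(data_rng.gauss(2, 1), data_rng.gauss(2, 1)) for _ in range(20)]

kept, minority = balance_by_cluster(repaid, defaulted, k=3)
print(len(kept), len(minority))
```

Sampling inside each cluster, instead of over the whole majority class at once, keeps the reduced set spread across the regions the majority class occupies. The balanced sets could then be passed to any classifier, such as a random forest.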

