%0 Journal Article
%T Classification of Credit Imbalance Data Based on Stacking Algorithm
%A 郑利沙
%A 黄浩
%J Hans Journal of Data Mining
%P 254-260
%@ 2163-1468
%D 2020
%I Hans Publishing
%R 10.12677/HJDM.2020.104027
%X As big data technology becomes more widely adopted in practice, machine learning and deep learning algorithms are being actively explored in financial risk control. Using an open-source credit card dataset with an extremely imbalanced class ratio, this paper compares how different sampling methods affect the results of different classification algorithms on this imbalanced binary classification problem. It then applies the Stacking ensemble learning algorithm to combine multiple base classifiers, yielding a more robust classification model that effectively avoids overfitting.
%K Imbalanced Sample Data
%K Ensemble Learning
%K Stacking
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=37908