E-Commerce Letters 2024
Prediction of Financial Fraud in Listed Companies Based on Model Fusion
Abstract:
Financial reporting fraud among listed companies in China has persisted throughout the development of the market. To address this problem, this paper builds a classification-based approach to predicting financial fraud in listed companies, combining data preprocessing, machine learning models, and statistical methods into a complete analysis and prediction pipeline. Companies are first grouped into broad categories by industry. In the annual data, indicators with a missing rate of 50% or more are removed, and the remaining missing values are filled with 0. From the daily data, per-share indicators are extracted and averaged by year; the averaged daily data are then merged into the annual data, and feature factors are extracted. Dimensionality reduction is then used to screen for the factors that most strongly affect financial fraud in listed companies. The selected factors are preprocessed and standardized, and three families of feature selection methods are applied to further refine the indicator set, followed by principal component dimensionality reduction and regularization-based feature extraction. Finally, four classification models are used for prediction: a decision tree classifier, linear discriminant analysis, a gradient boosting classifier, and a support vector machine.
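The preprocessing steps named in the abstract (removing indicators with a missing rate of 50% or more, filling the remaining gaps with 0, averaging daily values by year, and standardizing) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary-based row layout, the indicator names, and the use of population standard deviation for the z-score are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

MISSING_THRESHOLD = 0.5  # the abstract's cutoff: drop indicators missing in 50% or more of rows


def drop_sparse_indicators(rows, threshold=MISSING_THRESHOLD):
    """Drop indicators whose missing rate is >= threshold, then fill remaining gaps with 0.

    `rows` is a list of dicts mapping indicator name -> value, with None for missing
    (a hypothetical layout chosen for this sketch).
    """
    indicators = sorted({name for row in rows for name in row})
    n = len(rows)
    kept = [
        name for name in indicators
        if sum(row.get(name) is None for row in rows) / n < threshold
    ]
    return [
        {name: (row.get(name) if row.get(name) is not None else 0.0) for name in kept}
        for row in rows
    ]


def daily_to_annual(daily):
    """Average daily per-share values by (company, year).

    `daily` is a list of (company, year, value) tuples (assumed layout).
    """
    buckets = defaultdict(list)
    for company, year, value in daily:
        buckets[(company, year)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def standardize(values):
    """Z-score a list of values; a constant column maps to all zeros."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Toy annual data with hypothetical indicator names.
    annual = [
        {"roe": 0.10, "debt_ratio": 0.6, "rare": None},
        {"roe": None, "debt_ratio": 0.4, "rare": None},
        {"roe": 0.30, "debt_ratio": 0.5, "rare": 1.0},
        {"roe": 0.20, "debt_ratio": 0.7, "rare": None},
    ]
    cleaned = drop_sparse_indicators(annual)
    print(cleaned)  # "rare" is missing in 3 of 4 rows, so it is removed
    print(standardize([row["debt_ratio"] for row in cleaned]))
    print(daily_to_annual([("A", 2019, 1.0), ("A", 2019, 3.0), ("B", 2019, 4.0)]))
```

In a full pipeline the cleaned, standardized features would then go through feature selection, principal component reduction, and the four classifiers listed in the abstract. Those stages are omitted here because the abstract does not specify their settings.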