Application of Cross-Validation in Model Comparison
Abstract:
Many of the criteria used to compare models, such as p-values, depend on distributional assumptions about the data. An alternative that requires no such assumptions is to evaluate each model by cross-validation and compare the resulting normalized mean squared error (NMSE); this is currently among the most popular criteria for model evaluation. This paper describes the cross-validation method in detail and illustrates its application. Six different models are built for a real-world problem, their NMSE values are compared under 10-fold cross-validation, and the model with the highest prediction accuracy (lowest NMSE) is selected.
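The procedure summarized in the abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the paper's own implementation. It assumes NMSE is defined as the sum of squared prediction errors on a test fold divided by the sum of squared deviations of the test responses from their mean. The function names (`nmse`, `kfold_nmse`), the two candidate models, and the synthetic data are all hypothetical and stand in for the six models and the real data set, which are not given here.

```python
import numpy as np

def nmse(y_true, y_pred):
    """Assumed definition: sum (y - yhat)^2 / sum (y - mean(y))^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

def kfold_nmse(X, y, fit, predict, k=10, seed=0):
    """Average test-fold NMSE over k folds of a random partition."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])          # fit on the other k-1 folds
        scores.append(nmse(y[test], predict(model, X[test])))
    return float(np.mean(scores))

# Hypothetical candidate 1: always predict the training mean.
def fit_mean(X, y):
    return y.mean()

def predict_mean(model, X):
    return np.full(len(X), model)

# Hypothetical candidate 2: ordinary least squares with an intercept.
def fit_ols(X, y):
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Synthetic data for illustration only (not the paper's data).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=200)

results = {
    "mean": kfold_nmse(X, y, fit_mean, predict_mean, k=10),
    "ols": kfold_nmse(X, y, fit_ols, predict_ols, k=10),
}
best = min(results, key=results.get)  # lowest average NMSE is preferred
print(results, "best:", best)
```

Under this definition, a model that only predicts a constant scores an NMSE of about 1, so values well below 1 indicate that a model explains a substantial share of the variation in the held-out data. Using the same seed for every candidate keeps the fold partition identical across models, which makes the comparison fair.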