%0 Journal Article
%T Study on Heavy Metal Pollution in European Soil Based on Random Forest Algorithms
%A 宋申辉
%A 谢淑云
%A 杨瑞琰
%J Statistics and Applications
%P 218-226
%@ 2325-226X
%D 2019
%I Hans Publishing
%R 10.12677/SA.2019.82024
%X
Against the backdrop of big data, the random forest algorithm from machine learning is introduced to improve the efficiency of evaluating heavy metal pollution in soil. Taking European topsoil as an example, a Random forest model is established to classify the pollution degree of seven heavy metals: As, Co, Cr, Cu, Ni, Pb and Zn. The model is then improved by adding kernel principal component analysis, giving a KPCA-Random forest model, and the two models are compared in terms of classification accuracy and running time. The results show that the improved model raises classification accuracy from 93.41% to 94.67% and reduces running time from 12.530601 s to 9.437811 s. Finally, the advantages and disadvantages of the Random forest model are evaluated, and directions for future research are proposed.
%K Random Forest
%K Node Splitting Algorithm
%K Kernel Principal Component Analysis
%K Heavy Metal Pollution
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=29533