A Comparative Study of Two Machine Learning Prediction Models Based on Heart Disease Data
Abstract:
China currently faces two major pressures: accelerating population aging and the growing prevalence of metabolic risk factors. The incidence and prevalence of cardiovascular disease have continued to rise, and cardiovascular disease has become the leading cause of death among Chinese residents. Combining medicine with statistics makes it possible to build models with useful predictive performance, which can support more effective treatment and disease control and makes heart disease risk prediction models an important public health tool. This paper first preprocesses a heart disease data set and then applies a mixed (hybrid) sampling method to obtain class-balanced data. A random forest model and a fully connected neural network model are then built in turn and compared, and the comparison shows that the random forest algorithm has a significant advantage in predicting heart disease status. With an appropriate model in place, heart disease can be predicted quickly and conveniently, which can improve the accuracy of clinical diagnosis and help patients receive medical intervention as early as possible.
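The abstract does not say which mixed sampling procedure is used, so the following is only a minimal sketch of one common interpretation: randomly undersampling the majority class and randomly oversampling the minority class (with replacement) until both reach the same size. The function name `hybrid_resample`, the `target_per_class` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def hybrid_resample(rows, labels, target_per_class, seed=0):
    """Balance classes by random under- and oversampling (illustrative only).

    Classes with at least `target_per_class` rows are undersampled without
    replacement; smaller classes keep all rows and are topped up with
    randomly repeated rows.
    """
    rng = random.Random(seed)

    # Group rows by class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            # Majority class: keep a random subset.
            chosen = rng.sample(members, target_per_class)
        else:
            # Minority class: keep everything, then add random duplicates.
            missing = target_per_class - len(members)
            chosen = members + [rng.choice(members) for _ in range(missing)]
        out_rows.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


# Toy imbalanced data (an assumption): 90 negative and 10 positive records.
rows = [[i] for i in range(100)]
labels = [0] * 90 + [1] * 10

balanced_rows, balanced_labels = hybrid_resample(rows, labels, target_per_class=50)
print(Counter(balanced_labels))
```

In practice a resampling step like this would be applied to the training split only, so that the random forest and the neural network are both evaluated on data with the original class distribution.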