Decision Tree Model Based on Linear Regression for Heart Disease Diagnosis

DOI: 10.12677/CSA.2021.118216, PP. 2108-2116

Keywords: Variance Analysis, Linear Regression, Decision Tree, Smart Healthcare

Abstract:

Heart disease is a common, high-incidence disease and has become one of the main causes of human death. Improving the accuracy of its diagnosis, so that intervention and treatment can begin earlier, is therefore an important problem. In this paper, data preprocessing and the early stage of model building were implemented in Python; this analysis showed that the proportion of patients with heart disease is related to sex and age. The data were then analyzed with SPSS. The R value was 0.719, which falls in the 0.5~1 range treated as a large effect, so the model fit is considered good. The analysis of variance gave a significance value of 0, which lies within the 0~0.05 range, indicating that the linear regression model built from the parameters is highly statistically significant, that is, the linear relationship is significant. In the later stage of model building, several prediction models were used, chiefly decision trees. Prediction accuracy was 85.6% for the decision tree based on information entropy, 84.2% for the decision tree based on the Gini index, and 86.6% for the pre-pruned decision tree based on the Gini index. All three models reach an accuracy of about 85%, and the pre-pruned decision tree based on the Gini index is the most accurate.
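The abstract compares decision trees that split on information entropy, on the Gini index, and on the Gini index with pre-pruning. As a minimal sketch of what separates these three variants, the Python below computes both impurity measures and applies a simple pre-pruning rule to one candidate split. It is not the authors' code: the toy labels, the function names, and the `min_decrease` and `min_samples_leaf` thresholds are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Information entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children.

    With `entropy` this is the information gain; with `gini` it is the
    Gini decrease.
    """
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


def should_split(parent, left, right, impurity,
                 min_decrease=0.01, min_samples_leaf=2):
    """Hypothetical pre-pruning rule: refuse splits that create tiny leaves
    or that reduce impurity by less than `min_decrease`."""
    if min(len(left), len(right)) < min_samples_leaf:
        return False
    return impurity_decrease(parent, left, right, impurity) >= min_decrease


# Toy labels (1 = disease, 0 = no disease); invented, not taken from the paper.
parent = [1, 1, 1, 1, 0, 0, 0, 0]
left = [1, 1, 1, 0]
right = [1, 0, 0, 0]

gain_entropy = impurity_decrease(parent, left, right, entropy)
gain_gini = impurity_decrease(parent, left, right, gini)
keep_split = should_split(parent, left, right, gini, min_decrease=0.2)
```

On these toy labels the same split scores differently under the two criteria, and the pre-pruning rule rejects it once the required Gini decrease is set above what the split achieves. Stopping rules of this kind keep the tree smaller, which is one plausible reason a pre-pruned tree can generalize slightly better, although the abstract itself does not explain the difference in accuracy.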
