%0 Journal Article
%T Decision Tree Model Based on Linear Regression for Heart Disease Diagnosis
%A 闵杰青
%A 李昕洁
%A 谭强
%A 赵娜
%A 李向娟
%A 王剑
%A 曾敬勋
%A 刘学承
%J Computer Science and Application
%P 2108-2116
%@ 2161-881X
%D 2021
%I Hans Publishing
%R 10.12677/CSA.2021.118216
%X
Heart disease is a common, high-incidence condition and one of the leading causes of death. Improving the accuracy of heart disease diagnosis, so that intervention and treatment can begin earlier, is therefore an important problem. In this paper, data preprocessing and the early stage of model building were implemented in Python; this stage showed that the proportion of patients with the disease is related to gender and age. The data were then analyzed in SPSS, which gave an R value of 0.719. This falls in the 0.5~1 range regarded as a large effect, so the model fits well. In addition, the analysis of variance reported a significance value of 0, which lies within the 0~0.05 range, indicating that the linear regression model built from the parameters is highly statistically significant, that is, the linear relationship is significant. The later stage of model building used several prediction models, chiefly decision trees, with the following prediction accuracies: 85.6% for the decision tree based on information entropy, 84.2% for the decision tree based on the Gini index, and 86.6% for the pre-pruned decision tree based on the Gini index. All three models reach an accuracy of about 85%, and the pre-pruned decision tree based on the Gini index is the most accurate.
%K Variance Analysis
%K Linear Regression
%K Decision Tree
%K Smart Healthcare
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=44659