|
- 2018
Prediction of Human Essential Genes Based on Heterogeneous Network Topology Data
DOI: 10.14081/j.cnki.hgdxb.2018.03.006
Keywords: human essential genes, heterogeneous networks, oversampling, random walk with restart algorithm, support vector machine
Abstract: Studying essential genes helps not only in understanding the minimum requirements for survival and reproduction, but also in finding new human disease genes and drug targets. Although experimental methods for identifying human essential genes are effective, they are expensive and time-consuming. Developing efficient algorithms to predict human essential genes is therefore a necessary and effective complement to experimental methods. This paper proposes an algorithm that predicts essential genes by fusing topology data from multiple heterogeneous networks. The random walk with restart algorithm is used to integrate the heterogeneous networks into unified network features for genes, and the SMOTE oversampling algorithm is used to balance the positive and negative samples when training the support vector machine (SVM). Experimental results show that integrating heterogeneous network topology data predicts human essential genes more effectively than models based on a single network.
|
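The feature-construction step named in the abstract, random walk with restart (RWR) over several networks, can be illustrated with a minimal sketch. The abstract does not give the restart probability, the normalization, or how the per-network profiles are combined, so everything below is an assumption: the function names (`column_normalize`, `random_walk_with_restart`, `rwr_features`), the restart value of 0.5, the toy adjacency matrices, and the choice to concatenate per-network profiles are all hypothetical and are not taken from the paper.

```python
def column_normalize(adj):
    """Turn an adjacency matrix into a column-stochastic transition matrix."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change falls below tol.

    `restart` (r) is an assumed value; the abstract does not specify one.
    """
    n = len(adj)
    w = column_normalize(adj)
    p0 = [0.0] * n
    p0[seed] = 1.0  # the walker starts at, and restarts from, the seed gene
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


def rwr_features(networks, restart=0.5):
    """Build one feature row per gene by concatenating its RWR profile
    from every network (concatenation is an assumption of this sketch)."""
    n = len(networks[0])
    features = []
    for gene in range(n):
        row = []
        for adj in networks:
            row.extend(random_walk_with_restart(adj, gene, restart))
        features.append(row)
    return features


# Two hypothetical 4-gene networks over the same gene set (toy data only).
ppi = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
coexpression = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]

features = rwr_features([ppi, coexpression])
```

Each row of `features` describes one gene by its steady-state visiting probabilities in every network, which gives a single feature matrix regardless of how many networks are supplied. In a full pipeline these rows would then be oversampled and classified, for example with `SMOTE` from imbalanced-learn and `SVC` from scikit-learn; that step is omitted here because the abstract gives no parameters for it.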