Predicting Potential Cancer Genes by Integrating Network Properties, Sequence Features, and Functional Annotations

PP. 589-595

Keywords: cancer gene, logistic regression, network features, sequence features, functional annotation

Abstract:

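Discovering new cancer genes is one of the main goals of cancer research. Bioinformatics methods can help accelerate the discovery of cancer genes, the understanding of the mechanisms of carcinogenesis, and the identification of drug targets. By integrating network properties, sequence features, and functional annotation information, we built a classifier for predicting potential cancer genes. Testing showed that 55 features differed significantly between cancer genes and non-cancer genes. Fourteen cancer-related features were used to train the classifier. Four machine learning methods were explored for distinguishing cancer genes from non-cancer genes: logistic regression, support vector machine, Bayesian network, and decision tree. The models were evaluated by 5-fold cross-validation, and the areas under the ROC curve for the four methods were 0.834, 0.740, 0.800, and 0.782, respectively. Finally, the logistic regression classifier based on multiple biological features was applied to the genes in the Entrez database, identifying 1976 potential cancer genes. The study found that the integrated prediction approach outperforms models based on a single type of evidence, and that network features and functional annotations have stronger predictive power than sequence features.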
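
The evaluation procedure named in the abstract (a logistic regression classifier scored by 5-fold cross-validation and area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic two-feature points rather than the 14 biological features, the folds are stratified by class (the abstract does not say whether the authors stratified), the logistic regression is a plain gradient-descent fit written with the Python standard library, and all function names (`auc`, `train_logistic`, `stratified_folds`, `cross_validated_auc`, `make_synthetic`) are hypothetical names chosen for illustration.

```python
import math
import random


def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def predict(weights, x):
    """Logistic model: weights[0] is the intercept, the rest match x."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent on the log loss."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = predict(weights, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights


def stratified_folds(y, k, rng):
    """Split indices into k folds, dealing each class out round-robin so
    every fold contains both cancer (1) and non-cancer (0) examples."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, label in enumerate(y) if label == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def cross_validated_auc(X, y, k=5, seed=0):
    """Mean held-out AUC over k folds."""
    rng = random.Random(seed)
    fold_aucs = []
    for test_idx in stratified_folds(y, k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        w = train_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores = [predict(w, X[i]) for i in test_idx]
        fold_aucs.append(auc(scores, [y[i] for i in test_idx]))
    return sum(fold_aucs) / k


def make_synthetic(n_per_class=100, seed=0):
    """Synthetic stand-in data (an assumption, not the paper's features)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    print(f"mean 5-fold AUC: {cross_validated_auc(X, y):.3f}")
```

Comparing several classifiers, as the abstract describes, would amount to swapping `train_logistic` and `predict` for another model while keeping the same folds, so that the resulting AUC values are measured on identical held-out genes.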
