- 2017
A Survey of Disease Gene Prediction Methods Based on Molecular Networks
Abstract:
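Disease gene prediction is a key step in uncovering disease mechanisms and in the systematic study of complex diseases. The maturation of high-throughput experimental technologies has spurred the development of disease gene prediction methods based on molecular networks. Building on the biological hypothesis of "guilt by association", these algorithms measure the proximity or similarity between candidate genes and known disease genes in a biological network in order to predict potential disease-causing genes. This paper groups disease gene prediction methods into three categories: methods based on known disease gene information, methods that integrate phenotype similarity information, and methods that integrate multiple results. It reviews the state of research on each category and points out the shortcomings of existing work and directions for future research.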
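As a rough illustration of the "guilt by association" idea, the sketch below ranks candidate genes by their network proximity to known disease genes using a random walk with restart. This is only one possible proximity measure and is not claimed to be the method of any specific work surveyed here. The toy network, the gene names (`D1`, `D2`, `A`, `B`, `C`), the function name `rank_candidates`, and the restart probability of 0.3 are all assumptions made for the example.

```python
from collections import defaultdict


def rank_candidates(edges, seeds, restart=0.3, iterations=100):
    """Rank non-seed genes by random walk with restart from the seed genes.

    edges: iterable of (gene_a, gene_b) pairs in an undirected network.
    seeds: set of known disease genes; each must appear in `edges`.
    """
    # Build an undirected adjacency list.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    # Initial distribution: uniform over the known disease genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)

    for _ in range(iterations):
        # With probability `restart` the walker jumps back to a seed gene.
        nxt = {n: restart * p0[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for n in nodes:
            share = (1.0 - restart) * p[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        p = nxt

    # Candidates are all genes not already known to be disease genes.
    candidates = [n for n in nodes if n not in seeds]
    return sorted(candidates, key=lambda n: p[n], reverse=True), p


# Hypothetical toy network: D1 and D2 are known disease genes.
edges = [("D1", "D2"), ("D1", "A"), ("D2", "A"), ("A", "B"), ("B", "C")]
seeds = {"D1", "D2"}

ranking, scores = rank_candidates(edges, seeds)
print(ranking)  # candidates ordered from most to least proximal to the seeds
```

In this toy network, gene `A` interacts directly with both known disease genes and is therefore ranked above `B` and `C`, which are farther away. A real application would replace the toy edge list with a protein-protein interaction network and the seed set with curated disease gene annotations.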