A gene regulatory network (GRN) describes the regulatory relationships among genes and can provide a systematic understanding of the molecular mechanisms underlying biological processes. The importance of computer simulations in understanding cellular processes is now widely accepted, and a variety of algorithms have been developed to study these biological networks. The goal of this study is to provide a comprehensive evaluation and a practical guide to aid in choosing statistical methods for constructing large-scale GRNs. Using both simulation studies and a real application to E. coli data, we compare the methods in terms of sensitivity and specificity in identifying the true connections and the hub genes, ease of use, and computational speed. Our results show that these algorithms performed reasonably well, and each method has its own advantages: (1) GeneNet, WGCNA (Weighted Correlation Network Analysis), and ARACNE (Algorithm for the Reconstruction of Accurate Cellular Networks) performed well in constructing the global network structure; (2) GeneNet and SPACE (Sparse PArtial Correlation Estimation) performed well in identifying a few connections with high specificity.
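As an illustration of the evaluation criteria named above, the following minimal sketch computes sensitivity and specificity for a set of predicted undirected edges against a known network. It is not the code used in the study: the function names, the toy gene labels, and the treatment of every unordered gene pair as a candidate edge are assumptions made for this example.

```python
from itertools import combinations


def edge_set(pairs):
    """Normalize undirected edges so (a, b) and (b, a) compare equal."""
    return {frozenset(p) for p in pairs}


def sensitivity_specificity(genes, true_edges, predicted_edges):
    """Score predicted edges against a reference network.

    Assumption: every unordered pair of distinct genes is a candidate edge,
    so true negatives are pairs absent from both networks.
    """
    true_set = edge_set(true_edges)
    pred_set = edge_set(predicted_edges)
    all_pairs = {frozenset(p) for p in combinations(genes, 2)}

    tp = len(true_set & pred_set)              # true edges that were recovered
    fn = len(true_set - pred_set)              # true edges that were missed
    fp = len(pred_set - true_set)              # predicted edges that are not real
    tn = len(all_pairs - true_set - pred_set)  # non-edges correctly left out

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical four-gene example (labels are illustrative only).
genes = ["g1", "g2", "g3", "g4"]
true_edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
predicted_edges = [("g2", "g1"), ("g2", "g3"), ("g1", "g4")]

sens, spec = sensitivity_specificity(genes, true_edges, predicted_edges)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In large sparse networks the number of non-edges far exceeds the number of edges, so specificity can stay close to 1 even when many predicted edges are false. This is one reason to report both measures together.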