Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has a potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation which allows keeping the overview over complex graphs with dense subgraphs. 1. Introduction The development in systems biology has brought a strong interest in considering an organism as a large and complex network of interacting parts. Many subsystems of living organisms can be modeled as complex networks. One important example is a network of biochemical reactions which constitutes a complex system responsible for homeostasis in the living cell. An abstract network model of the biochemical processes within the cell can be constructed such that reactions are represented as nodes and metabolites (and enzymes) as edges. In the past, this system was studied mainly on a subsystem level through metabolic pathways. Recently, it has become important to consider the metabolic system as one complex network to understand deeper phenomena involving interactions across multiple pathways. The need to study the whole network consisting of thousands of reactions, metabolites, and enzymes requires a visualization system allowing biologists to study the overall structure of the system. Such a visualization should allow navigation and comprehension of the global system structures. In the present paper, we propose a visualization algorithm for very large networks arising in systems biology and we illustrate its usage on two complex biological networks. The first case study is a metabolic network
References
[1]
T. Kamada and S. Kawai, “An algorithm for drawing general undirected graphs,” Information Processing Letters, vol. 31, no. 1, pp. 7–15, 1989.
[2]
T. M. J. Ffuchterman and E. M. Reingold, “Graph drawing by force-directed place-ment,” Software, vol. 21, no. 11, pp. 1129–1164, 1991.
[3]
A. T. Adai, S. V. Date, S. Wieland, and E. M. Marcotte, “LGL: creating a map of protein function with an algorithm for visualizing very large biological networks,” Journal of Molecular Biology, vol. 340, no. 1, pp. 179–190, 2004.
[4]
K. Han and B.-H. Ju, “A fast layout algorithm for protein interaction networks,” Bioinformatics, vol. 19, no. 15, pp. 1882–1888, 2003.
[5]
J. Abello, M. Resende, and S. Sudarsky, “Massive quasi-clique detection,” in LATIN 2002: Theoretical Informatics, S. Rajsbaum, Ed., vol. 2286 of Lecture Notes in Computer Science, pp. 598–612, Springer, Berlin, Germany, 2002.
[6]
W. Li and H. Kurata, “A grid layout algorithm for automatic drawing of biochemical networks,” Bioinformatics, vol. 21, no. 9, pp. 2036–2042, 2005.
[7]
S. E. Schaeffer, “Graph clustering,” Computer Science Review, vol. 1, pp. 27–64, 2007.
[8]
N. Mishra, R. Schreiber, I. Stanton, and R. Tarjan, “Clustering social networks,” in Algorithms and Models For the Web-Graph, A. Bonato and F. Chung, Eds., vol. 4863 of Lecture Notes in Computer Science, pp. 56–67, Springer, Berlin, Germany, 2007.
[9]
J. Hastad, “Clique is hard to approximate within n1-ε,” Acta Mathematica, vol. 182, pp. 105–142, 1999.
[10]
A. Frick, A. Ludwig, and H. Mehldau, “A fast adaptive layout algorithm for undirected graphs (extended abstract and system demonstration),” in Graph Drawing, R. Tamassia and I. Tollis, Eds., vol. 894 of Lecture Notes in Computer Science, pp. 388–403, Springer, Berlin, Germany, 1995.
[11]
E. M. Reingold and J. S. Tilford, “Tidier drawings of trees,” IEEE Transactions on Software Engineering, vol. SE-7, no. 2, pp. 223–228, 1981.
[12]
S. Grivet, D. Auber, J. P. Domenger, and G. Melancon, “Bubble tree drawing algorithm,” in Computer Vision and Graphics, K. Wojciechowski, B. Smolka, H. Palus, R. Kozera, W. Skarbek, and L. Noakes, Eds., vol. 32 of Computational Imaging and Vision, pp. 633–641, Springer, Dordrecht, The Netherlands, 2006.
[13]
T. Hruz, O. Laule, G. Szabo et al., “Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes,” Advances in Bioinformatics, vol. 2008, Article ID 420747, 5 pages, 2008.
[14]
T. Asami, Y. K. Min, K. Sekimata et al., “Mode of action of brassinazole: a specific inhibitor of brassinosteroid biosynthesis,” ACS Symposium Series, vol. 774, pp. 269–280, 2001.
[15]
J.-X. He, J. M. Gendron, Y. Yang, J. Li, and Z.-Y. Wang, “The GSK3-like kinase BIN2 phosphorylates and destabilizes BZR1, a positive regulator of the brassinosteroid signaling pathway in Arabidopsis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 15, pp. 10185–10190, 2002.
[16]
K. J. Halliday, “Plant hormones: the interplay of brassinosteroids and auxin,” Current Biology, vol. 14, no. 23, pp. R1008–R1010, 2004.
[17]
K. Y. Yip, R. P. Alexander, K.-K. Yan, and M. Gerstein, “Improved reconstruction of in silico gene regulatory networks by integrating knockout and perturbation data,” PLoS ONE, vol. 5, no. 1, Article ID e8121, 2010.
[18]
M. Mutwil, B. Usadel, M. Schütte, A. Loraine, O. Ebenh?h, and S. Persson, “Assembly of an interactive correlation network for the Arabidopsis genome using a novel Heuristic Clustering Algorithm,” Plant Physiology, vol. 152, no. 1, pp. 29–43, 2010.
[19]
D. Marbach, R. J. Prill, T. Schaffter, C. Mattiussi, D. Floreano, and G. Stolovitzky, “Revealing strengths and weaknesses of methods for gene network inference,” Proceedings of the National Academy of Sciences of the United States of America, vol. 107, no. 14, pp. 6286–6291, 2010.
[20]
S. De Bodt, S. Proost, K. Vandepoele, P. Rouzé, and Y. Van de Peer, “Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression,” BMC genomics, vol. 10, p. 288, 2009.
[21]
D. R. Rhodes, S. A. Tomlins, S. Varambally et al., “Probabilistic model of the human protein-protein interaction network,” Nature Biotechnology, vol. 23, no. 8, pp. 951–959, 2005.
[22]
A. de la Fuente, N. Bing, I. Hoeschele, and P. Mendes, “Discovery of meaningful associations in genomic data using partial correlation coefficients,” Bioinformatics, vol. 20, no. 18, pp. 3565–3574, 2004.
[23]
J.-F. Rual, K. Venkatesan, T. Hao et al., “Towards a proteome-scale map of the human protein-protein interaction network,” Nature, vol. 437, no. 7062, pp. 1173–1178, 2005.
[24]
K. Ishii, T. Washio, T. Uechi, M. Yoshihama, N. Kenmochi, and M. Tomita, “Characteristics and clustering of human ribosomal protein genes,” BMC Genomics, vol. 7, article 37, 2006.
[25]
O. Atias, B. Chor, and D. A. Chamovitz, “Large-scale analysis of Arabidopsis transcription reveals a basal co-regulation network,” BMC Systems Biology, vol. 3, p. 86, 2009.
[26]
R. Bourqui, D. Auber, and P. Mary, “How to draw clustered weighted graphs using a multilevel force-directed graph drawing algorithm,” in Proceedings of the 11th International Conference Information Visualization (IV '07), pp. 757–764, July 2007.
[27]
B. Kaba, N. Pinet, G. Lelandais, A. Sigayret, and A. Berry, “Clustering gene expression data using graph separators,” In Silico Biology, vol. 7, no. 4-5, pp. 433–452, 2007.