This paper proposes a novel algorithm for inferring gene regulatory networks that combines cubature Kalman filter (CKF) and Kalman filter (KF) techniques with compressed sensing methods. The gene network is described using a state-space model. A nonlinear model is considered for the evolution of gene expression, while the gene expression data are assumed to follow a linear Gaussian model. The hidden states are estimated using the CKF. The system parameters are modeled as a Gauss-Markov process and are estimated using a compressed sensing-based KF. These parameters provide insight into the regulatory relations among the genes. The Cramér-Rao lower bound of the parameter estimates is calculated for the system model and used as a benchmark to assess the estimation accuracy. The proposed algorithm is evaluated rigorously using synthetic data in scenarios that differ in the number of genes and the number of sample points. In addition, the algorithm is tested on the DREAM4 in silico data sets as well as the in vivo data sets from the IRMA network. The proposed algorithm shows superior performance in terms of accuracy, robustness, and scalability.
1. Introduction
Gene regulation is one of the most intriguing processes taking place in living cells. With thousands of genes at their disposal, cells must decide which genes to express at a particular time. As a cell develops, changing needs and functions call for an efficient mechanism that turns the required genes on while leaving the others off. Cells can also activate new genes to respond effectively to environmental changes and perform specific roles. Knowing which gene triggers a particular genetic condition can help us ward off its potentially harmful effects by switching that gene off. For instance, cancer could potentially be controlled by deactivating the genes that cause it.
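The state-space formulation summarized in the abstract (nonlinear state evolution, linear Gaussian observations, hidden states tracked by a CKF) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names, the sigmoid-based dynamics, and the matrices `A`, `B`, `H` are hypothetical and are not taken from the paper. The sketch uses the standard third-degree spherical-radial cubature rule for the time update and, because the observation model is linear, a standard Kalman measurement update. The compressed sensing-based parameter estimation is not shown.

```python
import numpy as np

def cubature_points(mean, cov):
    """Return the 2n equally weighted third-degree cubature points of N(mean, cov)."""
    n = mean.size
    sqrt_cov = np.linalg.cholesky(cov)  # lower-triangular S with cov = S @ S.T
    unit = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])  # shape (n, 2n)
    return mean[:, None] + sqrt_cov @ unit

def ckf_predict(mean, cov, f, process_cov):
    """CKF time update: push the cubature points through the nonlinear map f."""
    points = cubature_points(mean, cov)
    propagated = np.column_stack([f(p) for p in points.T])
    pred_mean = propagated.mean(axis=1)
    dev = propagated - pred_mean[:, None]
    pred_cov = dev @ dev.T / points.shape[1] + process_cov
    return pred_mean, pred_cov

def linear_update(mean, cov, y, H, obs_cov):
    """Measurement update for a linear Gaussian observation y = H x + v."""
    innovation = y - H @ mean
    S = H @ cov @ H.T + obs_cov            # innovation covariance
    gain = np.linalg.solve(S, H @ cov).T   # K = cov H^T S^-1 (cov and S symmetric)
    new_mean = mean + gain @ innovation
    new_cov = cov - gain @ S @ gain.T
    return new_mean, new_cov

def sigmoid_dynamics(x, A, B):
    """Hypothetical gene-expression dynamics: linear term plus sigmoid regulation."""
    return A @ x + B @ (1.0 / (1.0 + np.exp(-x)))
```

One filtering step would then be `ckf_predict(m, P, lambda x: sigmoid_dynamics(x, A, B), Q)` followed by `linear_update(...)` on the new measurement. When `f` is linear, the cubature rule is exact and the prediction reduces to the ordinary Kalman time update, which gives a convenient sanity check.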
Gene expression is the process of generating functional gene products, for example, mRNA and protein. The level of gene expression can be measured using microarrays or gene chips to produce gene expression data [1]. More accurate estimation of gene expression is now possible using the RNA-Seq method. Intelligent use of such data can help improve our understanding of how genes interact in a living organism [2–4]. Gene regulation is known to exhibit several modes; two important ones are transcription regulation and posttranscription regulation [5]. While the theoretical applications of gene regulation are extremely promising, it requires a thorough understanding
References
[1]
X. Zhou, X. Wang, and E. R. Dougherty, Genomic Networks: Statistical Inference from Microarray Data, John Wiley & Sons, New York, NY, USA, 2006.
[2]
H. Kitano, “Computational systems biology,” Nature, vol. 420, pp. 206–210, 2002.
[3]
X. Zhou and S. T. C. Wong, Computational Systems Bioinformatics, World Scientific, River Edge, NJ, USA, 2008.
[4]
X. Cai and X. Wang, “Stochastic modeling and simulation of gene networks,” IEEE Signal Processing Magazine, vol. 24, no. 1, pp. 27–36, 2007.
[5]
D. Yue, J. Meng, M. Lu, C. L. P. Chen, M. Guo, and Y. Huang, “Understanding micro-RNA regulation: a computational perspective,” IEEE Signal Processing Magazine, vol. 29, no. 1, pp. 77–88, 2012.
[6]
R. Pal, S. Bhattacharya, and M. U. Caglar, “Robust approaches for genetic regulatory network modeling and intervention: a review of recent advances,” IEEE Signal Processing Magazine, vol. 29, no. 1, pp. 66–76, 2012.
[7]
H. Hache, H. Lehrach, and R. Herwig, “Reverse engineering of gene regulatory networks: a comparative study,” Eurasip Journal on Bioinformatics and Systems Biology, vol. 2009, Article ID 617281, 2009.
[8]
T. Schlitt and A. Brazma, “Current approaches to gene regulatory network modelling,” BMC Bioinformatics, vol. 8, no. 6, p. 9, 2007.
[9]
H. D. Jong, “Modeling and simulation of genetic regulatory systems: a literature review,” Journal of Computational Biology, vol. 9, no. 1, pp. 67–103, 2002.
[10]
I. Nachman, A. Regev, and N. Friedman, “Inferring quantitative models of regulatory networks from expression data,” Bioinformatics, vol. 20, no. 1, pp. i248–i256, 2004.
[11]
C. D. Giurcaneanu, I. Tabus, and J. Astola, “Clustering time series gene expression data based on sum-of-exponentials fitting,” EURASIP Journal on Advances in Signal Processing, vol. 2005, no. 8, Article ID 358568, pp. 1159–1173, 2005.
[12]
C. D. Giurcaneanu, I. Tabus, J. Astola, J. Ollila, and M. Vihinen, “Fast iterative gene clustering based on information theoretic criteria for selecting the cluster structure,” Journal of Computational Biology, vol. 11, no. 4, pp. 660–682, 2004.
[13]
X. Cai and G. B. Giannakis, “Identifying differentially expressed genes in microarray experiments with model-based variance estimation,” IEEE Transactions on Signal Processing, vol. 54, no. 6, pp. 2418–2426, 2006.
[14]
X. Zhou, X. Wang, and E. R. Dougherty, “Gene clustering based on cluster-wide mutual information,” Journal of Computational Biology, vol. 11, no. 1, pp. 151–165, 2004.
[15]
W. Zhao, E. Serpedin, and E. R. Dougherty, “Inferring connectivity of genetic regulatory networks using information-theoretic criteria,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 5, no. 2, pp. 262–274, 2008.
[16]
J. Dougherty, I. Tabus, and J. Astola, “Inference of gene regulatory networks based on a universal minimum description length,” Eurasip Journal on Bioinformatics and Systems Biology, vol. 2008, Article ID 482090, 2008.
[17]
L. Qian, H. Wang, and E. R. Dougherty, “Inference of noisy nonlinear differential equation models for gene regulatory networks using genetic programming and Kalman filtering,” IEEE Transactions on Signal Processing, vol. 56, no. 7, pp. 3327–3339, 2008.
[18]
W. Zhao, E. Serpedin, and E. R. Dougherty, “Inferring gene regulatory networks from time series data using the minimum description length principle,” Bioinformatics, vol. 22, no. 17, pp. 2129–2135, 2006.
[19]
X. Zhou, X. Wang, R. Pal, I. Ivanov, M. Bittner, and E. R. Dougherty, “A Bayesian connectivity-based approach to constructing probabilistic gene regulatory networks,” Bioinformatics, vol. 20, no. 17, pp. 2918–2927, 2004.
[20]
J. Meng, M. Lu, Y. Chen, S.-J. Gao, and Y. Huang, “Robust inference of the context specific structure and temporal dynamics of gene regulatory network,” BMC Genomics, vol. 11, no. 3, p. S11, 2010.
[21]
Y. Zhang, Z. Deng, H. Jiang, and P. Jia, “Inferring gene regulatory networks from multiple data sources via a dynamic Bayesian network with structural EM,” in DILS, S. C. Boulakia and V. Tannen, Eds., vol. 4544 of Lecture Notes in Computer Science, pp. 204–214, Springer, New York, NY, USA, 2007.
[22]
K. Murphy and S. Mian, Modeling gene expression data using dynamic Bayesian networks, University of California, Berkeley, Calif, USA, 2001.
[23]
H. Liu, D. Yue, L. Zhang, Y. Chen, S. J. Gao, and Y. Huang, “A Bayesian approach for identifying miRNA targets by combining sequence prediction and gene expression profiling,” BMC Genomics, vol. 11, no. 3, p. S12, 2010.
[24]
Y. Huang, J. Wang, J. Zhang, M. Sanchez, and Y. Wang, “Bayesian inference of genetic regulatory networks from time series microarray data using dynamic Bayesian networks,” Journal of Multimedia, vol. 2, no. 3, pp. 46–56, 2007.
[25]
B.-E. Perrin, L. Ralaivola, A. Mazurie, S. Bottani, J. Mallet, and F. D'Alché-Buc, “Gene networks inference using dynamic Bayesian networks,” Bioinformatics, vol. 19, no. 2, pp. ii138–ii148, 2003.
[26]
C. Rangel, D. L. Wild, F. Falciani, Z. Ghahramani, and A. Gaiba, “Modelling biological responses using gene expression profiling and linear dynamical systems,” Bioinformatics, pp. 349–356, 2005.
[27]
M. Quach, N. Brunel, and F. d'Alché-Buc, “Estimating parameters and hidden variables in non-linear state-space models based on ODEs for biological networks inference,” Bioinformatics, vol. 23, no. 23, pp. 3209–3216, 2007.
[28]
F.-X. Wu, W.-J. Zhang, and A. J. Kusalik, “Modeling gene expression from microarray expression data with state-space equations,” in Pacific Symposium on Biocomputing, R. B. Altman, A. K. Dunker, L. Hunter, T. A. Jung, and T. E. Klein, Eds., pp. 581–592, World Scientific, River Edge, NJ, USA, 2004.
[29]
R. Yamaguchi, S. Yoshida, S. Imoto, T. Higuchi, and S. Miyano, “Finding module-based gene networks with state-space models: mining high-dimensional and short time-course gene expression data,” IEEE Signal Processing Magazine, vol. 24, no. 1, pp. 37–46, 2007.
[30]
O. Hirose, R. Yoshida, S. Imoto et al., “Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models,” Bioinformatics, vol. 24, no. 7, pp. 932–942, 2008.
[31]
J. Angus, M. Beal, J. Li, C. Rangel, and D. Wild, “Inferring transcriptional networks using prior biological knowledge and constrained state-space models,” in Learning and Inference in Computational Systems Biology, N. Lawrence, M. Girolami, M. Rattray, and G. Sanguinetti, Eds., pp. 117–152, MIT Press, Cambridge, Mass, USA, 2010.
[32]
C. Rangel, J. Angus, Z. Ghahramani et al., “Modeling T-cell activation using gene expression profiling and state-space models,” Bioinformatics, vol. 20, no. 9, pp. 1361–1372, 2004.
[33]
A. Noor, E. Serpedin, M. N. Nounou, and H. N. Nounou, “Inferring gene regulatory networks via nonlinear state-space models and exploiting sparsity,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 4, pp. 1203–1211, 2012.
[34]
Z. Wang, X. Liu, Y. Liu, J. Liang, and V. Vinciotti, “An extended Kalman filtering approach to modeling nonlinear dynamic gene regulatory networks via short gene expression time series,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 3, pp. 410–419, 2009.
[35]
A. Noor, E. Serpedin, M. Nounou, H. Nounou, N. Mohamed, and L. Chouchane, “An overview of the statistical methods used for inferring gene regulatory networks and protein-protein interaction networks,” Advances in Bioinformatics, vol. 2013, Article ID 953814, 12 pages, 2013.
[36]
I. Arasaratnam and S. Haykin, “Cubature Kalman filters,” IEEE Transactions on Automatic Control, vol. 54, no. 6, pp. 1254–1269, 2009.
[37]
A. Noor, E. Serpedin, M. N. Nounou, and H. N. Nounou, “A cubature Kalman filter approach for inferring gene regulatory networks using time series data,” in Proceedings of the IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS '11), pp. 25–28, 2011.
[38]
A. Carmi, P. Gurfil, and D. Kanevsky, “Methods for sparse signal recovery using Kalman filtering with embedded pseudo-measurement norms and quasi-norms,” IEEE Transactions on Signal Processing, vol. 58, no. 4, pp. 2405–2409, 2010.
[39]
C. A. Penfold and D. L. Wild, “How to infer gene networks from expression profiles, revisited,” Interface Focus, pp. 857–870, 2011.
[40]
I. Cantone, L. Marucci, F. Iorio et al., “A yeast synthetic network for in vivo assessment of reverse-engineering and modeling approaches,” Cell, vol. 137, no. 1, pp. 172–181, 2009.
[41]
Y. Huang, I. M. Tienda-Luna, and Y. Wang, “Reverse engineering gene regulatory networks: a survey of statistical models,” IEEE Signal Processing Magazine, vol. 26, no. 1, pp. 76–97, 2009.
[42]
Z. Wang, F. Yang, D. W. C. Ho, S. Swift, A. Tucker, and X. Liu, “Stochastic dynamic modeling of short gene expression time-series data,” IEEE Transactions on Nanobioscience, vol. 7, no. 1, pp. 44–55, 2008.
[43]
H. Xiong and Y. Choe, “Structural systems identification of genetic regulatory networks,” Bioinformatics, vol. 24, no. 4, pp. 553–560, 2008.
[44]
R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society B, vol. 58, no. 1, pp. 267–288, 1996.
[45]
E. J. Candès and T. Tao, “Decoding by linear programming,” IEEE Transactions on Information Theory, vol. 51, no. 12, pp. 4203–4215, 2005.
[46]
J. D. Geeter, H. V. Brussel, and J. D. Schutter, “A smoothly constrained Kalman filter,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 10, pp. 1171–1177, 1997.
[47]
S. M. Kay, Fundamentals of Statistical Signal Processing. Estimation Theory, Prentice-Hall, New York, NY, USA, 1993.