Identification of Robust Pathway Markers for Cancer through Rank-Based Pathway Activity Inference

DOI: 10.1155/2013/618461

Abstract:

One important problem in translational genomics is the identification of reliable and reproducible markers that can be used to discriminate between different classes of a complex disease, such as cancer. The small sample sizes typical of such studies make the prediction of these markers very challenging, and various approaches have been proposed to address this problem. For example, it has been shown that pathway markers, which aggregate the activities of the genes in the same pathway, tend to be more robust than gene markers. Furthermore, gene expression ranking has been shown to be robust to batch effects and to lead to more interpretable results. In this paper, we propose an enhanced pathway activity inference method that uses gene ranking to predict the pathway activity in a probabilistic manner. The main focus of this work is on identifying robust pathway markers that can ultimately lead to robust classifiers with reproducible performance across datasets. Simulation results based on multiple breast cancer datasets show that the proposed inference method identifies better pathway markers that can predict breast cancer metastasis with higher accuracy. Moreover, the identified pathway markers can lead to better classifiers with more consistent classification performance across independent datasets.

1. Introduction

Advances in microarray and sequencing technologies have enabled the measurement of genome-wide expression profiles, which has spawned a large number of studies aiming to make accurate diagnoses and prognoses based on gene expression profiles [1–4]. For example, there has been a significant amount of work on identifying markers and building classifiers that can be used to predict breast cancer metastasis [2, 4]. Many existing methods directly employ gene expression data without any knowledge of the interrelations between genes. As a result, the predicted gene markers often lack interpretability, and many of them are not reproducible in other independent datasets.

To overcome this problem, several approaches have been proposed. For example, Geman et al. [3] proposed an approach that uses the relative expression between genes rather than their absolute expression values. The resulting markers were shown to be easier to interpret, robust to chip-to-chip variations, and more reproducible across datasets. Another way to address the problem is to interpret the gene expression data at a “modular” level through data integration [5–11]. These methods utilize additional data

References

[1]  M. West, C. Blanchette, H. Dressman et al., “Predicting the clinical status of human breast cancer by using gene expression profiles,” Proceedings of the National Academy of Sciences of the United States of America, vol. 98, no. 20, pp. 11462–11467, 2001.
[2]  L. J. Van't Veer, H. Dai, M. J. Van de Vijver et al., “Gene expression profiling predicts clinical outcome of breast cancer,” Nature, vol. 415, no. 6871, pp. 530–536, 2002.
[3]  D. Geman, C. D'Avignon, D. Q. Naiman, and R. L. Winslow, “Classifying gene expression profiles from pairwise mRNA comparisons,” Statistical Applications in Genetics and Molecular Biology, vol. 3, no. 1, article 19, 2004.
[4]  Y. Wang, J. G. M. Klijn, Y. Zhang et al., “Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer,” The Lancet, vol. 365, no. 9460, pp. 671–679, 2005.
[5]  L. Tian, S. A. Greenberg, S. W. Kong, J. Altschuler, I. S. Kohane, and P. J. Park, “Discovering statistically significant pathways in expression profiling studies,” Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 38, pp. 13544–13549, 2005.
[6]  Z. Guo, T. Zhang, X. Li et al., “Towards precise classification of cancers based on robust gene functional expression profiles,” BMC Bioinformatics, vol. 6, article 58, 2005.
[7]  C. Auffray, “Protein subnetwork markers improve prediction of cancer outcome,” Molecular Systems Biology, vol. 3, article 141, 2007.
[8]  H. Y. Chuang, E. Lee, Y. T. Liu, D. Lee, and T. Ideker, “Network-based classification of breast cancer metastasis,” Molecular Systems Biology, vol. 3, article 140, 2007.
[9]  E. Lee, H. Y. Chuang, J. W. Kim, T. Ideker, and D. Lee, “Inferring pathway activity toward precise disease classification,” PLoS Computational Biology, vol. 4, no. 11, Article ID e1000217, 2008.
[10]  J. Su, B. J. Yoon, and E. R. Dougherty, “Accurate and reliable cancer classification based on probabilistic inference of pathway activity,” PLoS ONE, vol. 4, no. 12, Article ID e8161, 2009.
[11]  J. Su, B. J. Yoon, and E. R. Dougherty, “Identification of diagnostic subnetwork markers for cancer in human protein-protein interaction network,” BMC Bioinformatics, vol. 11, no. 6, article 8, 2010.
[12]  J. A. Eddy, L. Hood, N. D. Price, and D. Geman, “Identifying tightly regulated and variably expressed networks by Differential Rank Conservation (DIRAC),” PLoS Computational Biology, vol. 6, no. 5, Article ID e1000792, 2010.
[13]  N. Khunlertgit and B. J. Yoon, “Finding robust pathway markers for cancer classification,” in Proceedings of the IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS '12), 2012.
[14]  M. J. Van de Vijver, Y. D. He, L. J. Van't Veer et al., “A gene-expression signature as a predictor of survival in breast cancer,” New England Journal of Medicine, vol. 347, no. 25, pp. 1999–2009, 2002.
[15]  C. Desmedt, F. Piette, S. Loi et al., “Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series,” Clinical Cancer Research, vol. 13, no. 11, pp. 3207–3214, 2007.
[16]  Y. Pawitan, J. Bjöhle, L. Amler, and A. L. Borg, “Gene expression profiling spares early breast cancer patients from adjuvant therapy: derived and validated in two population-based cohorts,” Breast Cancer Research, vol. 7, pp. R953–R964, 2005.
[17]  H. Y. Chang, D. S. A. Nuyten, J. B. Sneddon et al., “Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival,” Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 10, pp. 3738–3743, 2005.
[18]  R. Edgar, M. Domrachev, and A. E. Lash, “Gene Expression Omnibus: NCBI gene expression and hybridization array data repository,” Nucleic Acids Research, vol. 30, no. 1, pp. 207–210, 2002.
[19]  R. C. Gentleman, V. J. Carey, D. M. Bates et al., “Bioconductor: open software development for computational biology and bioinformatics,” Genome Biology, vol. 5, no. 10, p. R80, 2004.
[20]  A. Liberzon, A. Subramanian, R. Pinchback, H. Thorvaldsdóttir, P. Tamayo, and J. P. Mesirov, “Molecular signatures database (MSigDB) 3.0,” Bioinformatics, vol. 27, no. 12, pp. 1739–1740, 2011.
[21]  T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley Interscience, New York, NY, USA, 2006.
[22]  T. Fawcett, “An introduction to ROC analysis,” Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.
