A Robust Topology-Based Algorithm for Gene Expression Profiling

DOI: 10.5402/2012/381023

Abstract:

Early and accurate diagnoses of cancer can significantly improve the design of personalized therapy and enhance the success of therapeutic interventions. Histopathological approaches, which rely on microscopic examinations of malignant tissue, are not conducive to timely diagnoses. High-throughput genomics offers a possible new classification of cancer subtypes. Unfortunately, most clustering algorithms have not been proven sufficiently robust. We propose a novel approach that relies on the use of statistical invariants and persistent homology, one of the most exciting recent developments in topology. It identifies a sufficient but compact set of genes for the analysis as well as a core group of tightly correlated patient samples for each subtype. Partitioning occurs hierarchically and allows for the identification of genetically similar subtypes. We analyzed the gene expression profiles of 202 tumors of the brain cancer glioblastoma multiforme (GBM) available from the Cancer Genome Atlas (TCGA) site. We identify core patient groups associated with the classical, mesenchymal, and proneural subtypes of GBM. In our analysis, the neural subtype consists of several small groups rather than a single component. A subtype prediction model is introduced which partitions tumors in a manner consistent with clustering algorithms but requires the genetic signature of only 59 genes.

1. Introduction

Cancers in many tissues are heterogeneous, and the efficacy of therapeutic interventions depends on the specific subtype of the malignancy. Hence, early and accurate identification of the cancer subtype is critical in designing an effective personalized therapy. Current methods for assessment rely on microscopic examinations of the malignant tissue for previously established histopathological abnormalities. Unfortunately, such features may not be apparent during early stages of the disease; moreover, differentiating between abnormalities in distinct cancer subtypes can be challenging.

Recent advances in high-throughput genomics offer an exciting new alternative for early and reliable cancer prognosis. Mutations that underlie a malignancy modify the levels of many genes within a cell; the goal of gene expression profiling is to define a signature for each cancer subtype through statistically significant up-/downregulation of a panel of genes. The National Institutes of Health, through the Cancer Genome Atlas (TCGA) [1, 2], will aid this effort by establishing large sets of genomic data on human cancers in at least 20 tissues [3–8]. The premise behind TCGA is that

References

[1]  R. G. W. Verhaak, K. A. Hoadley, E. Purdom et al., “Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1,” Cancer Cell, vol. 17, no. 1, pp. 98–110, 2010.
[2]  L. Chin, M. Meyerson, R. McLendon et al., “Comprehensive genomic characterization defines human glioblastoma genes and core pathways,” Nature, vol. 455, no. 7216, pp. 1061–1068, 2008.
[3]  Y. Liang, M. Diehn, N. Watson et al., “Gene expression profiling reveals molecularly and clinically distinct subtypes of glioblastoma multiforme,” Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 16, pp. 5814–5819, 2005.
[4]  E. A. Maher, C. Brennan, P. Y. Wen et al., “Marked genomic differences characterize primary and secondary glioblastoma subtypes and identify two distinct molecular and clinical secondary glioblastoma entities,” Cancer Research, vol. 66, no. 23, pp. 11502–11513, 2006.
[5]  P. S. Mischel and T. F. Cloughesy, “Targeted molecular therapy of GBM,” Brain Pathology, vol. 13, no. 1, pp. 52–61, 2003.
[6]  P. S. Mischel, S. F. Nelson, and T. F. Cloughesy, “Molecular analysis of glioblastoma: Pathway profiling and its implications for patient therapy,” Cancer Biology and Therapy, vol. 2, no. 3, pp. 242–247, 2003.
[7]  A. von Deimling, D. N. Louis, and O. D. Wiestler, “Molecular pathways in the formation of gliomas,” Glia, vol. 15, no. 3, pp. 328–338, 1995.
[8]  W. A. Freije, F. E. Castro-Vargas, Z. Fang et al., “Gene expression profiling of gliomas strongly predicts survival,” Cancer Research, vol. 64, no. 18, pp. 6503–6510, 2004.
[9]  E. C. Hayden, “Genomics boosts brain-cancer work,” Nature, vol. 463, no. 7279, p. 278, 2010.
[10]  B. M. Kuehn, “Genomics illuminates a deadly brain cancer,” The Journal of the American Medical Association, vol. 303, no. 10, pp. 925–927, 2010.
[11]  S. Monti, P. Tamayo, J. Mesirov, and T. Golub, “Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data,” Machine Learning, vol. 52, no. 1-2, pp. 91–118, 2003.
[12]  Y. Liu, D. N. Hayes, A. Nobel, and J. S. Marron, “Statistical significance of clustering for high-dimension, low-sample size data,” Journal of the American Statistical Association, vol. 103, no. 483, pp. 1281–1293, 2008.
[13]  G. Carlsson, “Topology and data,” Bulletin of the American Mathematical Society, vol. 46, no. 2, pp. 255–308, 2009.
[14]  G. Carlsson and A. Zomorodian, “The theory of multidimensional persistence,” Discrete and Computational Geometry, vol. 42, no. 1, pp. 71–93, 2009.
[15]  D. Horak, S. Maletić, and M. Rajković, “Persistent homology of complex networks,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2009, no. 3, Article ID P03034, 2009.
[16]  M. J. L. de Hoon, S. Imoto, J. Nolan, and S. Miyano, “Open source clustering software,” Bioinformatics, vol. 20, no. 9, pp. 1453–1454, 2004.
[17]  M. Fiedler, “Algebraic connectivity of graphs,” Czechoslovak Mathematical Journal, vol. 23, no. 2, pp. 298–305, 1973.
[18]  M. Fiedler, “Additive compound matrices and an inequality for eigenvalues of symmetric stochastic matrices,” Czechoslovak Mathematical Journal, vol. 24, no. 12, pp. 392–402, 1974.
[19]  M. Newman, “Fast algorithm for detecting community structure in networks,” Physical Review E, vol. 69, no. 6, Article ID 066133, 2004.
[20]  U. von Luxburg, “A tutorial on spectral clustering,” Statistics and Computing, vol. 17, no. 4, pp. 395–416, 2007.
[21]  D. Mishra, R. Dash, A. K. Rath, and M. Acharya, “Feature selection in gene expression data using principal component analysis and rough set theory,” Advances in Experimental Medicine and Biology, vol. 696, pp. 91–100, 2011.
[22]  E. Cerami, E. Demir, N. Schultz, B. S. Taylor, and C. Sander, “Automated network analysis identifies core pathways in glioblastoma,” PLoS ONE, vol. 5, no. 2, Article ID e8918, 2010.
[23]  H. S. Phillips, S. Kharbanda, R. Chen et al., “Molecular subclasses of high-grade glioma predict prognosis, delineate a pattern of disease progression, and resemble stages in neurogenesis,” Cancer Cell, vol. 9, no. 3, pp. 157–173, 2006.
[24]  R. A. Fisher, “The use of multiple measurements in taxonomic problems,” Annals of Eugenics, vol. 7, no. 2, pp. 179–188, 1936.
[25]  J. Friedman, “Regularized discriminant-analysis,” Journal of the American Statistical Association, vol. 84, no. 405, pp. 165–175, 1989.
[26]  D. N. Joanes and C. A. Gill, “Comparing measures of sample skewness and kurtosis,” Journal of the Royal Statistical Society Series D, vol. 47, no. 1, pp. 183–189, 1998.
[27]  W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, New York, NY, USA, 3rd edition, 2007.
[28]  M. Kendall and A. Stuart, The Advanced Theory of Statistics, vol. 3, Griffin, High Wycombe, UK, 1983.
[29]  J. Ward, “Hierarchical grouping to optimize an objective function,” Journal of the American Statistical Association, vol. 58, no. 301, pp. 236–244, 1963.
[30]  M. L. Brewer, “Development of a spectral clustering method for the analysis of molecular data sets,” Journal of Chemical Information and Modeling, vol. 47, no. 5, pp. 1727–1733, 2007.
[31]  D. DeWoskin, J. Climent, I. Cruz-White, M. Vazquez, C. Park, and J. Arsuaga, “Applications of computational homology to the analysis of treatment response in breast cancer patients,” Topology and its Applications, vol. 157, no. 1, pp. 157–164, 2010.
[32]  D. DeWoskin, R. Scharein, J. Arsuaga, and C. Park, “A computational homology analysis of CGH data finds recurrent genomic instability in older breast cancer patients,” International Journal of Radiation Oncology Biology Physics, vol. 75, no. 3, supplement, p. S135, 2009.
[33]  P. H. A. Sneath, “The application of computers to taxonomy,” Journal of General Microbiology, vol. 17, no. 1, pp. 201–226, 1957.
[34]  M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein, “Cluster analysis and display of genome-wide expression patterns,” Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 25, pp. 14863–14868, 1998.
[35]  P. T. Spellman, G. Sherlock, M. Q. Zhang et al., “Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization,” Molecular Biology of the Cell, vol. 9, no. 12, pp. 3273–3297, 1998.
[36]  I. Ostrovnaya, G. Nanjangud, and A. B. Olshen, “A classification model for distinguishing copy number variants from cancer-related alterations,” BMC Bioinformatics, vol. 11, article 297, 2010.
[37]  G. Gundem, C. Perez-Llamas, A. Jene-Sanz et al., “IntOGen: integration and data mining of multidimensional oncogenomic data,” Nature Methods, vol. 7, no. 2, pp. 92–93, 2010.
[38]  C. L. Nutt, D. R. Mani, R. A. Betensky et al., “Gene expression-based classification of malignant gliomas correlates better with survival than histological classification,” Cancer Research, vol. 63, no. 7, pp. 1602–1607, 2003.
[39]  D. W. Parsons, S. Jones, X. Zhang et al., “An integrated genomic analysis of human glioblastoma multiforme,” Science, vol. 321, no. 5897, pp. 1807–1812, 2008.
