To understand how integration of multiple data types can help decipher cellular responses at the systems level, we analyzed the mitogenic response of human mammary epithelial cells to epidermal growth factor (EGF) using whole-genome microarrays, mass spectrometry-based proteomics and large-scale western blots with over 1000 antibodies. A time-course analysis revealed significant differences in the expression of 3172 genes and 596 proteins, including protein phosphorylation changes measured by western blot. Integration of these disparate data types showed that each contributed qualitatively different components to the observed cell response to EGF and that varying degrees of concordance between gene expression and protein abundance measurements could be linked to specific biological processes. Networks inferred from individual data types were relatively limited, whereas networks derived from the integrated data recapitulated the known major cellular responses to EGF and exhibited more highly connected signaling nodes than networks derived from any individual dataset. While cell cycle regulatory pathways were altered as anticipated, we found that the most robust response to mitogenic concentrations of EGF was the induction of matrix metalloprotease cascades, highlighting the importance of the EGFR system as a regulator of the extracellular environment. These results demonstrate the value of integrating multiple levels of biological information to more accurately reconstruct networks of cellular response.