
ADEMA: An Algorithm to Determine Expected Metabolite Level Alterations Using Mutual Information

DOI: 10.1371/journal.pcbi.1002859



Metabolomics is a relatively new “omics” platform that analyzes a discrete set of metabolites detected in biofluids or tissue samples of organisms. It has been used in a diverse array of studies to detect biomarkers and to determine pathway activity rates based on changes caused by disease or drugs. Recent improvements in analytical methodology and sample throughput allow the creation of large metabolite datasets that reflect changes in metabolic dynamics due to disease or a perturbation in the metabolic network. However, current methods for comprehensive analysis of large metabolomics datasets are limited, unlike in other “omics” approaches, where complex techniques for analyzing coexpression/coregulation of multiple variables are applied. This paper discusses the shortcomings of current metabolomics data analysis techniques and proposes a new multivariate technique (ADEMA), based on mutual information, to identify expected metabolite level changes with respect to a specific condition. We show that ADEMA predicts de novo lipogenesis pathway metabolite level changes in samples with cystic fibrosis (CF) better than prediction based on the significance of individual metabolite level changes. We also applied ADEMA's classification scheme to three different cohorts of CF and wild-type mice. ADEMA predicted whether an unknown mouse has a CF or a wild-type genotype with accuracies of 1.0, 0.84, and 0.9 on the three datasets, respectively. ADEMA's accuracy was up to 31% higher than that of other classification algorithms. In conclusion, ADEMA advances the state of the art in metabolomics analysis by providing accurate and interpretable classification results.
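
As an illustration of the quantity the method builds on, the following minimal Python sketch estimates mutual information between a discretized metabolite level and a condition label. It is not the ADEMA algorithm itself: it uses a simple plug-in (frequency-count) estimator, and the function names `discretize` and `mutual_information`, the thresholds, and the sample values are all hypothetical, chosen only for the example.

```python
import math
from collections import Counter


def discretize(values, thresholds):
    """Map each value to a level index: the number of thresholds it exceeds.

    The thresholds are an assumption of this sketch, not taken from the paper.
    """
    return [sum(1 for t in thresholds if v > t) for v in values]


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete observations."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))   # counts of each (x, y) pair
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += p_xy * math.log2(c * n / (px[x] * py[y]))
    return mi


# Made-up metabolite concentrations and condition labels (1 = CF, 0 = wild type).
concentrations = [0.4, 0.6, 1.8, 2.1]
condition = [0, 0, 1, 1]
levels = discretize(concentrations, thresholds=[1.0])
print(mutual_information(levels, condition))
```

In this toy input the discretized level separates the two groups perfectly, so the estimate is 1 bit; a level that is statistically independent of the condition would give 0 bits. With small samples the plug-in estimator is biased, which is one reason smoother estimators are often preferred in practice.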



