PLOS Genetics  2015 

Discovering Genetic Interactions in Large-Scale Association Studies by Stage-wise Likelihood Ratio Tests

DOI: 10.1371/journal.pgen.1005502


Abstract:

Despite the success of genome-wide association studies in medical genetics, the underlying genetics of many complex diseases remains enigmatic. One plausible reason is that current analyses fail to account for genetic interactions. Exhaustive investigations of interactions are typically infeasible because the vast number of possible interactions imposes hard statistical and computational challenges. There is, therefore, a need for computationally efficient methods that build on models that appropriately capture interaction. We introduce a new methodology in which we augment the interaction hypothesis with a set of simpler hypotheses that are tested, in order of their complexity, against a saturated alternative hypothesis representing interaction. This sequential testing provides an efficient way to reduce the number of non-interacting variant pairs before the final interaction test. We devise two different methods: one that relies on a priori estimated numbers of marginally associated variants to correct for multiple tests, and a second that does this adaptively. We show that our methodology in general has improved statistical power in comparison to seven other methods and, using the idea of closed testing, that it controls the family-wise error rate. We apply our methodology to genetic data from the PROCARDIS coronary artery disease case/control cohort and discover three distinct interactions. While analyses on simulated data suggest that the statistical power may suffice for an exhaustive search of all variant pairs in ideal cases, we also explore strategies for selecting, a priori, subsets of variant pairs to test. Our new methodology facilitates identification of new disease-relevant interactions from existing and future genome-wide association data, which may involve genes with previously unknown association to the disease. Moreover, it enables construction of interaction networks that provide a systems biology view of complex diseases, serving as a basis for a more comprehensive understanding of disease pathophysiology and its clinical consequences.
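The idea of testing simpler hypotheses in order of complexity against a saturated alternative can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a 3x3x2 table of counts (genotype of variant A, genotype of variant B, case/control status) with no empty margins, defines "no interaction" on the logit scale, and uses a single fixed `alpha` at every stage as a placeholder, so it does not reproduce the paper's multiple-testing correction or its family-wise error control. All function names are hypothetical.

```python
import math

R3, R2 = range(3), range(2)

def chi2_sf_even(x, df):
    """Chi-square upper tail; this closed form is valid for even df only."""
    half = max(x, 0.0) / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

def g_stat(obs, exp):
    """Likelihood ratio statistic of a fitted table against the saturated model."""
    return sum(2.0 * obs[a][b][d] * math.log(obs[a][b][d] / exp[a][b][d])
               for a in R3 for b in R3 for d in R2 if obs[a][b][d] > 0)

def table(f):
    return [[[f(a, b, d) for d in R2] for b in R3] for a in R3]

def fit_no_interaction(n_ab, n_ad, n_bd, iters=200):
    """Iterative proportional fitting of the model with all two-way margins."""
    e = table(lambda a, b, d: 1.0)
    for _ in range(iters):
        for a in R3:
            for b in R3:
                s = sum(e[a][b])
                for d in R2:
                    e[a][b][d] *= n_ab[a][b] / s
        for a in R3:
            for d in R2:
                s = sum(e[a][b][d] for b in R3)
                for b in R3:
                    e[a][b][d] *= n_ad[a][d] / s
        for b in R3:
            for d in R2:
                s = sum(e[a][b][d] for a in R3)
                for a in R3:
                    e[a][b][d] *= n_bd[b][d] / s
    return e

def stagewise_test(obs, alpha=0.05):
    """Return (first hypothesis not rejected, its p-value), or ('interaction', p)."""
    n_ab = [[sum(obs[a][b]) for b in R3] for a in R3]
    n_ad = [[sum(obs[a][b][d] for b in R3) for d in R2] for a in R3]
    n_bd = [[sum(obs[a][b][d] for a in R3) for d in R2] for b in R3]
    n_d = [sum(n_ad[a][d] for a in R3) for d in R2]
    n = sum(n_d)
    # Hypotheses ordered from simplest to most complex, each with its
    # degrees of freedom relative to the saturated model. Fits are lazy,
    # so later (costlier) stages run only if earlier ones are rejected.
    stages = [
        ("no association",
         lambda: table(lambda a, b, d: n_ab[a][b] * n_d[d] / n), 8),
        ("only first variant associated",
         lambda: table(lambda a, b, d: n_ab[a][b] * n_ad[a][d] / sum(n_ad[a])), 6),
        ("only second variant associated",
         lambda: table(lambda a, b, d: n_ab[a][b] * n_bd[b][d] / sum(n_bd[b])), 6),
        ("both associated, no interaction",
         lambda: fit_no_interaction(n_ab, n_ad, n_bd), 4),
    ]
    for name, fit, df in stages:
        p = chi2_sf_even(g_stat(obs, fit()), df)
        if p > alpha:
            return name, p
    return "interaction", p
```

In this sketch a variant pair is reported as interacting only if every simpler hypothesis is rejected. Because evaluation stops at the first hypothesis that is not rejected, most pairs would be discarded by the cheap closed-form stages and never reach the iterative fit.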
