PLOS ONE  2008 

Population Substructure and Control Selection in Genome-Wide Association Studies

DOI: 10.1371/journal.pone.0002551

Abstract:

Determination of the relevance of both demanding classical epidemiologic criteria for control selection and robust handling of population stratification (PS) represents a major challenge in the design and analysis of genome-wide association studies (GWAS). Empirical data from two GWAS in European Americans of the Cancer Genetic Markers of Susceptibility (CGEMS) project were used to evaluate the impact of PS in studies with different control selection strategies. In each of the two original case-control studies nested in corresponding prospective cohorts, a minor confounding effect due to PS (inflation factor λ of 1.025 and 1.005) was observed. In contrast, when the control groups were exchanged to mimic a cost-effective but theoretically less desirable control selection strategy, the confounding effects were larger (λ of 1.090 and 1.062). A panel of 12,898 autosomal SNPs common to both the Illumina and Affymetrix commercial platforms and with low local background linkage disequilibrium (pair-wise r² < 0.004) was selected to infer population substructure with principal component analysis. A novel permutation procedure was developed for the correction of PS that identified a smaller set of principal components and achieved a better control of type I error (to λ of 1.032 and 1.006, respectively) than currently used methods. The overlap between sets of SNPs in the bottom 5% of p-values based on the new test and the test without PS correction was about 80%, with the majority of discordant SNPs having both ranks close to the threshold. Thus, for the CGEMS GWAS of prostate and breast cancer conducted in European Americans, PS does not appear to be a major problem in well-designed studies. A study using suboptimal controls can have acceptable type I error when an effective strategy for the correction of PS is employed.
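
The abstract reports its results as inflation factors λ but does not say how they were computed. A common convention, borrowed from genomic control, takes λ as the median of the observed 1-df chi-square association statistics divided by the median expected under the null (about 0.455). The sketch below assumes that convention; the function names are hypothetical and it is not the authors' procedure.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
# If Z is standard normal, Z**2 is chi-square(1), so the median is z_0.75 squared.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def chi2_from_pvalues(p_values):
    """Convert two-sided p-values in (0, 1) to 1-df chi-square statistics."""
    nd = NormalDist()
    return [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]


def inflation_factor(chi2_stats):
    """Assumed definition: observed median statistic over the null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Illustration with synthetic data only: evenly spaced p-values mimic a
# perfectly null scan, so lambda should be 1.
null_p = [(i + 0.5) / 1001 for i in range(1001)]
null_chi2 = chi2_from_pvalues(null_p)
lam_null = inflation_factor(null_chi2)

# Scaling every statistic by a constant scales lambda by the same constant.
lam_inflated = inflation_factor([1.09 * x for x in null_chi2])
```

Under this definition a value near 1, such as the 1.005 and 1.025 quoted for the original designs, indicates test statistics that follow the null distribution closely, while larger values indicate systematic inflation of the kind PS can cause.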
