Population stratification is always a concern in association
analysis, but the extent of the problem in less extreme
situations is debated (Thomas and Witte [1], Wacholder et al. [2]). Wacholder et
al. [3] and Ardlie et al. [4] reported that hidden population structure is not a
serious threat to case-control designs. We propose a method for assessing the
seriousness of population stratification before an association study is
designed. If population stratification is not a serious problem, one may
consider using a case-control study instead of a family-based design to gain
power. In a case-control design, we compare
chi-square statistics from a structured population (a union of two
subpopulations) and a homogeneous population with the same prevalence and
allele frequencies. We provide an explicit formula to calculate the chi-square
statistics from 17 parameters, such as the proportions of the subpopulations and
the allele frequencies in each subpopulation. We chose these parameters because
they have the potential to cause false associations. Each parameter takes a random value in a
chosen range. We then calculate the likelihood of getting opposite conclusions
in the structured and the homogeneous populations. This is the likelihood of
having false positives caused by population stratification. The advantage of
this method is that it provides a cost-effective way to choose between
case-control data and family data before actually collecting those data. We
conclude that sample size has a significant effect on the likelihood of false
positives caused by population stratification: the larger the sample size, the
more likely a false positive becomes if the population structure is ignored. If
budget constraints limit the sample size to fewer than 200, then a case-control
study may be the better choice because of its power.
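
The comparison described above can be illustrated with a small Python sketch. It is not the 17-parameter formula of the paper: it assumes only five parameters (the subpopulation proportion `w`, allele frequencies `p1` and `p2`, and prevalences `k1` and `k2`), assumes the allele has no effect on disease in either subpopulation, and works with expected allele counts rather than sampled data. All function names and parameter ranges are hypothetical choices made for illustration.

```python
import random

CHI2_CRIT_1DF = 3.841  # 5% critical value, chi-square with 1 degree of freedom


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def allelic_chi2(p_case, p_ctrl, n_cases, n_controls):
    """Chi-square on expected allele counts (two alleles per person)."""
    return chi2_2x2(2 * n_cases * p_case, 2 * n_cases * (1 - p_case),
                    2 * n_controls * p_ctrl, 2 * n_controls * (1 - p_ctrl))


def structured_chi2(w, p1, p2, k1, k2, n_cases, n_controls):
    """Union of two subpopulations; any signal here is due to stratification."""
    # Share of cases and of controls that come from subpopulation 1.
    case_share1 = w * k1 / (w * k1 + (1 - w) * k2)
    ctrl_share1 = w * (1 - k1) / (w * (1 - k1) + (1 - w) * (1 - k2))
    p_case = case_share1 * p1 + (1 - case_share1) * p2
    p_ctrl = ctrl_share1 * p1 + (1 - ctrl_share1) * p2
    return allelic_chi2(p_case, p_ctrl, n_cases, n_controls)


def homogeneous_chi2(w, p1, p2, n_cases, n_controls):
    """Homogeneous population with the same overall allele frequency."""
    p_bar = w * p1 + (1 - w) * p2
    return allelic_chi2(p_bar, p_bar, n_cases, n_controls)


def false_positive_rate(n_per_group, draws=10000, seed=0):
    """Fraction of random parameter draws giving opposite conclusions."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        # Assumed ranges; the paper's actual ranges are not given here.
        w = rng.uniform(0.1, 0.9)
        p1, p2 = rng.uniform(0.05, 0.95), rng.uniform(0.05, 0.95)
        k1, k2 = rng.uniform(0.01, 0.2), rng.uniform(0.01, 0.2)
        s = structured_chi2(w, p1, p2, k1, k2, n_per_group, n_per_group)
        h = homogeneous_chi2(w, p1, p2, n_per_group, n_per_group)
        if (s > CHI2_CRIT_1DF) != (h > CHI2_CRIT_1DF):
            hits += 1
    return hits / draws


if __name__ == "__main__":
    for n in (50, 200, 1000):
        print(n, false_positive_rate(n))
```

Under these assumptions the expected chi-square in the structured population grows in proportion to the sample size, so with a fixed seed the fraction of opposite conclusions cannot decrease as the groups get larger. This is consistent with the direction of the conclusion stated above, although the numerical values depend entirely on the assumed ranges.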
References
[1]
Thomas, D.C. and Witte, J.S. (2002) Point: Population Stratification: A Problem for Case-Control Studies of Candidate-Gene Associations? Cancer Epidemiology, Biomarkers & Prevention, 11, 505-512.
[2]
Wacholder, S., Rothman, N. and Caporaso, N. (2002) Counterpoint: Bias from Population Stratification Is Not a Major Threat to the Validity of Conclusions from Epidemiological Studies of Common Polymorphisms and Cancer. Cancer Epidemiology, Biomarkers & Prevention, 11, 513-520.
[3]
Wacholder, S., Rothman, N. and Caporaso, N. (2000) Population Stratification in Epidemiologic Studies of Common Genetic Variants and Cancer: Quantification of Bias. Journal of the National Cancer Institute, 92, 1151-1158. https://doi.org/10.1093/jnci/92.14.1151
[4]
Ardlie, K.G., Lunetta, K.L. and Seielstad, M. (2002) Testing for Population Subdivision and Association in Four Case-Control Studies. American Journal of Human Genetics, 71, 304-311. https://doi.org/10.1086/341719
[5]
Visscher, P.M., et al. (2017) 10 Years of GWAS Discovery: Biology, Function, and Translation. American Journal of Human Genetics, 101, 5-22. https://doi.org/10.1016/j.ajhg.2017.06.005
[6]
Oetjens, M.T. (2016) Population Stratification in the Context of Diverse Epidemiologic Surveys Sans Genome-Wide Data. Frontiers in Genetics, 7, 76. https://doi.org/10.3389/fgene.2016.00076
[7]
Cardon, L.R. and Palmer, L.J. (2003) Population Stratification and Spurious Allelic Association. The Lancet, 361, 598-604. https://doi.org/10.1016/S0140-6736(03)12520-2
[8]
Hellwege, J., et al. (2018) Population Stratification in Genetic Association Studies. Current Protocols in Human Genetics, 95, 1.22.1-1.22.23.
[9]
Ali-Khan, S.E., et al. (2011) The Use of Race, Ethnicity and Ancestry in Human Genetic Research. The HUGO Journal, 5, 47-63. https://doi.org/10.1007/s11568-011-9154-5
[10]
Spielman, R.S., McGinnis, R.E. and Ewens, W.J. (1993) The Transmission Test for Linkage Disequilibrium: The Insulin Gene and Insulin-Dependent Diabetes Mellitus (IDDM). American Journal of Human Genetics, 52, 506-516.
[11]
Devlin, B. and Roeder, K. (1999) Genomic Control for Association Studies. Biometrics, 55, 997-1004. https://doi.org/10.1111/j.0006-341X.1999.00997.x
[12]
Pritchard, J.K. and Rosenberg, N.A. (1999) Use of Unlinked Genetic Markers to Detect Population Stratification in Association Studies. American Journal of Human Genetics, 65, 220-228. https://doi.org/10.1086/302449
[13]
Shin, J. and Lee, C. (2015) A Mixed Model Reduces Spurious Genetic Associations Produced by Population Stratification in Genome-Wide Association Studies. Genomics, 105, 191-196. https://doi.org/10.1016/j.ygeno.2015.01.006
[14]
Price, A.L., Zaitlen, N.A., Reich, D. and Patterson, N. (2010) New Approaches to Population Stratification in Genome-Wide Association Studies. Nature Reviews Genetics, 11, 459-463. https://doi.org/10.1038/nrg2813
[15]
Alexander, D.H., Novembre, J. and Lange, K. (2009) Fast Model-Based Estimation of Ancestry in Unrelated Individuals. Genome Research, 19, 1655-1664. https://doi.org/10.1101/gr.094052.109
[16]
Barfield, R.T., et al. (2014) Accounting for Population Stratification in DNA Methylation Studies. Genetic Epidemiology, 38, 231-241. https://doi.org/10.1002/gepi.21789
[17]
Hill, W.D., et al. (2016) Molecular Genetic Contributions to Social Deprivation and Household Income in UK Biobank. Current Biology, 26, 3083-3089. https://doi.org/10.1016/j.cub.2016.09.035
[18]
Pankow, J.S., Province, M.A., Hunt, S.C. and Arnett, D.K. (2002) Regarding “Testing for Population Subdivision and Association in Four Case-Control Studies”. American Journal of Human Genetics, 71, 1478-1480. https://doi.org/10.1086/344582
[19]
Morton, N.E. and Collins, A. (1998) Tests and Estimates of Allelic Association in Complex Inheritance. Proceedings of the National Academy of Sciences of the United States of America, 95, 11389-11393. https://doi.org/10.1073/pnas.95.19.11389
[20]
Bacanu, S.A., Devlin, B. and Roeder, K. (2000) The Power of Genomic Control. American Journal of Human Genetics, 66, 1933-1944. https://doi.org/10.1086/302929
[21]
Spence, M.A., Greenberg, D.A., Hodge, S.E. and Vieland, V.J. (2003) The Emperor’s New Methods. American Journal of Human Genetics, 72, 1084-1087. https://doi.org/10.1086/374826
[22]
Hartl, D.L. (1999) A Primer of Population Genetics. 3rd Edition, Sinauer, Sunderland, MA.