BMC Research Notes 2009
Markers typed in genome-wide analysis identify regions showing deviation from Hardy-Weinberg equilibrium

Abstract: We investigated genotypes from 463842 autosomal markers from 1504 British subjects. We identified regions in which several neighbouring markers exhibited deviation from HWE in the same direction by considering "heterozygosity scores" in windows of 10 markers. The heterozygosity score for each marker was defined as -log(p) or log(p) according to whether the marker demonstrated increased heterozygosity or homozygosity. In each window the marker with the highest absolute score was ignored and the positive and negative scores were summed for the other nine markers. Windows were selected on the basis of this sum exceeding a given threshold, for which we used values of 50 or 15.

For the threshold of 50, we identified 7 regions with increased heterozygosity and for the threshold of 15 we identified 22 regions with increased heterozygosity, 23 with increased homozygosity and 2 containing both kinds of window. The most impressive of these results came from a group of 6 markers at 17q21, each of which showed increased heterozygosity significant at p < 10^-190.

The human genome contains regions which deviate markedly from HWE and these might harbour genes influencing embryonic survival.

When marker allele frequencies in controls deviate markedly from Hardy-Weinberg equilibrium (HWE) this is commonly taken as an indicator that the genotyping is unreliable or that there is marked population stratification and the marker is discarded [1]. However if common polymorphisms influence embryonic survival then it is expected that these may also lead to such deviations.
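
The window-scoring procedure summarised in the abstract can be illustrated with a minimal Python sketch. Several details are assumptions rather than the authors' stated method: the HWE p-value is taken from a 1-df chi-square test, the logarithm is base 10, windows slide one marker at a time, and the "sum" is read as the net signed sum of the nine remaining scores. The function names (`hwe_p_value`, `heterozygosity_score`, `window_sums`, `flag_windows`) are hypothetical.

```python
import math
import sys


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of HWE from genotype counts (assumed test).

    Returns (p, excess_het), where excess_het is True when observed
    heterozygotes exceed the HWE expectation. Assumes at least one genotype.
    """
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    freq_b = 1.0 - freq_a
    expected = (n * freq_a ** 2, 2 * n * freq_a * freq_b, n * freq_b ** 2)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return p, n_ab > expected[1]


def heterozygosity_score(n_aa, n_ab, n_bb):
    """-log10(p) for excess heterozygosity, log10(p) for excess homozygosity."""
    p, excess_het = hwe_p_value(n_aa, n_ab, n_bb)
    # Guard against underflow to zero for extreme deviations (an assumption).
    p = max(p, sys.float_info.min)
    magnitude = -math.log10(p)
    return magnitude if excess_het else -magnitude


def window_sums(scores, size=10):
    """Net score per sliding window after dropping the most extreme marker."""
    sums = []
    for start in range(len(scores) - size + 1):
        window = list(scores[start:start + size])
        extreme = max(range(size), key=lambda i: abs(window[i]))
        del window[extreme]  # ignore the marker with the highest |score|
        sums.append(sum(window))
    return sums


def flag_windows(scores, threshold, size=10):
    """Return (start_index, direction) for windows whose sum passes threshold."""
    flagged = []
    for start, total in enumerate(window_sums(scores, size)):
        if total > threshold:
            flagged.append((start, "heterozygosity"))
        elif total < -threshold:
            flagged.append((start, "homozygosity"))
    return flagged
```

Dropping the most extreme marker means that a single aberrant marker, for example one affected by a genotyping artefact, cannot by itself carry a window over the threshold; the remaining nine markers must jointly deviate in the same direction.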
The existence of such loci is supported by a genome-wide tendency for siblings to share alleles more than would be expected by chance [2].

As previously suggested, we reasoned that if groups of nearby markers all showed deviation from HWE then this could not result purely from genotyping errors since there would be no reason for the same kind of error to be replicated in each marker [3]. Hence we used the control data from the