BMC Genetics 2011
Empirical evaluations of analytical issues arising from predicting HLA alleles using multiple SNPs

Abstract: Applying this new methodology to three large independent study cohorts, we have evaluated the performance of the predictive models in ethnically diverse populations. Specifically, we have found that utilizing imputed in addition to genotyped SNPs generally yields comparable if not better performance in prediction accuracies. Our evaluation also supports the idea that predictive models trained on one population are transferable to other populations of the same ethnicity. Further, when the training set includes multi-ethnic populations, the resulting models are reliable and perform well for the same subpopulations across all HLA genes. In contrast, the predictive models built from single ethnic populations have superior performance within the same ethnic population, but are not likely to perform well in other ethnic populations.

The empirical explorations reported here provide further evidence in support of the application of this approach for predicting HLA alleles with GWAS-derived SNP data. Utilizing all available samples, we have built "state of the art" predictive models for HLA-A, -B, -C, -DRB1, -DQB1 and -DPB1. The HLA allele predictive models, along with the program used to carry out the prediction, are available on our website.

The genes encoding the human leukocyte antigens, HLA-A, -B, -C, -DRB1, -DQB1 and -DPB1, located within the major histocompatibility complex (MHC), play critical roles in immunity and host defense, risk of autoimmune disease and cancer [1-4]. HLA also plays an important role in organ and cellular transplantation where mismatched HLA can lead to graft rejection and graft-versus-host disease [5,6].
HLA genes are among the most polymorphic systems in the human genome. Due to the complex and redundant nature of the sequence variations that distinguish class I (HLA-A, -B, -C) and class II (HLA-DRB1, -DQB1 and -DPB1) alleles, which are located within 2 to 3 hypervariable regions in the class I (exon 2 and 3) and class II (exon 2) genes, and the