oalib
All listed articles are free for downloading (OA Articles)
An evaluation of the performance of HapMap SNP data in a Shanghai Chinese population: Analyses of allele frequency, linkage disequilibrium pattern and tagging SNPs transferability on chromosome 1q21-q25
Cheng Hu, Weiping Jia, Weihua Zhang, Congrong Wang, Rong Zhang, Jie Wang, Xiaojing Ma, The International Type 2 Diabetes 1q Consortium, Kunsan Xiang
BMC Genetics , 2008, DOI: 10.1186/1471-2156-9-19
Abstract: Three thousand and forty-two SNPs were analyzed after removal of SNPs that failed quality control and those not in the HapMap panel. We compared the allele frequency distributions, linkage disequilibrium patterns, haplotype frequency distributions and transferability of tagging SNP sets between the HapMap populations and the Shanghai Chinese population. Among the four HapMap populations, Beijing Chinese showed the best correlation with the Shanghai population on allele frequencies, linkage disequilibrium and haplotype frequencies. Tagging SNP sets selected from the four HapMap populations at different thresholds were evaluated in the Shanghai sample. Under the threshold of r² equal to 0.8 or 0.5, both HapMap Chinese and Japanese data showed better coverage and tagging efficiency than Caucasian and African data. Our study supported the applicability of HapMap Beijing Chinese SNP data to the study of complex diseases among the southern Chinese population. The International HapMap Project aimed at determining the common patterns of DNA sequence variants, their frequencies, and correlations between them, through genotyping samples from four large populations: Centre d'Etude du Polymorphisme Humain reference individuals from Utah, USA (CEU), Han Chinese in Beijing, China (CHB), Japanese in Tokyo, Japan (JPT), and Yoruba in Ibadan, Nigeria (YRI), at a density of 1 SNP every 5 kb. The populations genotyped in the HapMap can serve as reference populations for the selection of tagging SNPs (tSNPs) that capture most of the variation in the genome. It provides an important shortcut for carrying out candidate-gene and genome-wide association studies in a given population by minimizing the number of SNPs that need to be genotyped [1-3]. As stated by the International HapMap Consortium, the general applicability of the HapMap data should be confirmed in other populations [1]. 
Several studies previously performed showed high concordance with HapMap data in allele frequencies and haplotype distributions, and
An Evaluation of the Performance of Tag SNPs Derived from HapMap in a Caucasian Population
Alexandre Montpetit (equal contributor), Mari Nelis (equal contributor), Philippe Laflamme, Reedik Magi, Xiayi Ke, Maido Remm, Lon Cardon, Thomas J Hudson, Andres Metspalu
PLOS Genetics , 2006, DOI: 10.1371/journal.pgen.0020027
Abstract: The Haplotype Map (HapMap) project recently generated genotype data for more than 1 million single-nucleotide polymorphisms (SNPs) in four population samples. The main application of the data is in the selection of tag single-nucleotide polymorphisms (tSNPs) to use in association studies. The usefulness of this selection process needs to be verified in populations outside those used for the HapMap project. In addition, it is not known how well the data represent the general population, as only 90–120 chromosomes were used for each population and since the genotyped SNPs were selected so as to have high frequencies. In this study, we analyzed more than 1,000 individuals from Estonia. The population of this northern European country has been influenced by many different waves of migrations from Europe and Russia. We genotyped 1,536 randomly selected SNPs from two 500-kbp ENCODE regions on Chromosome 2. We observed that the tSNPs selected from the CEPH (Centre d'Etude du Polymorphisme Humain) from Utah (CEU) HapMap samples (derived from US residents with northern and western European ancestry) captured most of the variation in the Estonia sample. (Between 90% and 95% of the SNPs with a minor allele frequency of more than 5% have an r² of at least 0.8 with one of the CEU tSNPs.) Using the reverse approach, tags selected from the Estonia sample could almost equally well describe the CEU sample. Finally, we observed that the sample size, the allelic frequency, and the SNP density in the dataset used to select the tags each have important effects on the tagging performance. Overall, our study supports the use of HapMap data in other Caucasian populations, but the SNP density and the bias towards high-frequency SNPs have to be taken into account when designing association studies.
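The coverage figure quoted in this abstract (the share of common SNPs that have r² of at least 0.8 with some tag SNP) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes genotypes coded as minor-allele counts (0/1/2) and uses the squared Pearson correlation of those counts as a stand-in for haplotype-based r². The SNP names and genotype vectors are made up for illustration.

```python
from statistics import mean

def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:      # monomorphic SNP: correlation undefined
        return 0.0
    return sxy * sxy / (sxx * syy)

def minor_allele_frequency(genotype):
    """MAF from allele counts, assuming diploid individuals."""
    p = sum(genotype) / (2 * len(genotype))
    return min(p, 1 - p)

def tagging_coverage(genotypes, tags, r2_min=0.8, maf_min=0.05):
    """Fraction of SNPs with MAF > maf_min captured by at least one tag."""
    common = [s for s, g in genotypes.items()
              if minor_allele_frequency(g) > maf_min]
    if not common:
        return 0.0
    captured = [s for s in common
                if any(r_squared(genotypes[s], genotypes[t]) >= r2_min
                       for t in tags)]
    return len(captured) / len(common)

# Hypothetical toy data: 8 individuals, 5 SNPs.
genotypes = {
    "snpA": [0, 1, 2, 0, 1, 2, 0, 1],
    "snpB": [0, 1, 2, 0, 1, 2, 0, 1],   # identical to snpA
    "snpC": [2, 1, 0, 2, 1, 0, 2, 1],   # mirror image of snpA
    "snpD": [0, 0, 1, 2, 2, 1, 0, 1],   # weakly correlated with snpA
    "snpE": [0, 0, 0, 0, 0, 0, 0, 0],   # monomorphic, excluded by the MAF filter
}
coverage = tagging_coverage(genotypes, tags=["snpA"])
print(coverage)
```

With snpA as the only tag, snpA, snpB and snpC are captured and snpD is not, so three of the four common SNPs are covered.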
How well do HapMap SNPs capture the untyped SNPs?
Erwin Tantoso, Yuchen Yang, Kuo-Bin Li
BMC Genomics , 2006, DOI: 10.1186/1471-2164-7-238
Abstract: Our analysis shows that HapMap data are not robust enough to capture the untyped variants for most human genes. The performance of SNPs for the European and Asian samples is marginal in capturing the untyped variants, i.e. approximately 55%. As expected, the SNPs from the HapMap YRI panel can capture only approximately 30% of the variants. Although the overall performance is low, the SNPs for some genes perform very well and are able to capture most of the variants along the gene. This is observed in the European and Asian panels, but not in the African panel. Through observation, we concluded that in order to have a well-covered SNP reference panel, the SNP density and the association among reference SNPs are important to estimate the robustness of the chosen SNPs. We have analyzed the coverage of HapMap SNPs using NIEHS EGP data. The results show that HapMap SNPs are transferable to the NIEHS SNPs. However, HapMap SNPs cannot capture some of the untyped SNPs and therefore resequencing may be needed to uncover more SNPs in the missing region. The abundance of single nucleotide polymorphisms (SNPs) in the human genome sequence offers a way for genetic association studies. Association studies usually involve comparing the allele frequency of a particular SNP in unrelated controls and cases (patients) [1]. A SNP that is observed at a higher incidence in cases compared to controls can be shown to be significantly associated with the phenotype, which is dependent on the panel sizes and the measure of the difference in observed allele frequencies between the panels. However, a statistically significant association of a SNP with a phenotype does not necessarily implicate the SNP as a causal variant; rather, it could be that the observed SNP is in linkage disequilibrium (LD) with the causal variant [1-4]. Therefore, involving the causal SNP or the marker SNP that is in LD with the causal variant will be important to detect disease association. 
Ideally, we can include all the S
FstSNP-HapMap3: a database of SNPs with high population differentiation for HapMap3
Shiwei Duan, Wei Zhang, Nancy Jean Cox, Mary Eileen Dolan
Bioinformation , 2008,
Abstract: The International HapMap Project has recently made available genotypes and frequency data for phase 3 (NCBI build 36, dbSNP b129) of the HapMap providing an enriched genotype dataset for approximately 1.6 million single nucleotide polymorphisms (SNPs) from 1,115 individuals with ancestry from parts of Africa, Asia, Europe, North America and Mexico. In the present study, we aim to facilitate pharmacogenetics studies by providing a database of SNPs with high population differentiation through a genomewide test on allele frequency variation among 11 HapMap3 samples. Common SNPs with minor allele frequency greater than 5% from each of 11 HapMap3 samples were included in the present analysis. The population differentiation is measured in terms of fixation index (Fst), and the SNPs with Fst values over 0.5 were defined as highly differentiated SNPs. Our tests were carried out between all pairs of the 11 HapMap3 samples or among subgroups with the same continental ancestries. Altogether we carried out 64 genomewide Fst tests and identified 28,215 highly differentiated SNPs for 49 different combinations of HapMap3 samples in the current database.
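To make the Fst > 0.5 criterion concrete, here is a minimal Python sketch of a pairwise fixation index for one biallelic SNP. The abstract does not say which Fst estimator was used, so the heterozygosity-based form Fst = (H_T − H_S) / H_T with equal weight per population is an assumption, and the SNP identifiers and allele frequencies are hypothetical.

```python
def fst(p1, p2):
    """Pairwise Fst for one biallelic SNP from allele frequencies p1 and p2.

    Assumes equal sample sizes and uses Fst = (H_T - H_S) / H_T, where H_T is
    the expected heterozygosity of the pooled population and H_S is the mean
    within-population expected heterozygosity.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:                  # allele fixed or absent in both populations
        return 0.0
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_t - h_s) / h_t

def highly_differentiated(freqs, threshold=0.5):
    """Names of SNPs whose pairwise Fst exceeds the threshold."""
    return [snp for snp, (p1, p2) in freqs.items() if fst(p1, p2) > threshold]

# Hypothetical allele frequencies in two population samples.
freqs = {
    "rs_hypothetical_1": (0.05, 0.95),   # strongly differentiated
    "rs_hypothetical_2": (0.40, 0.50),   # similar frequencies
}
selected = highly_differentiated(freqs)
print(selected)
```

Only the first SNP passes the threshold: its Fst is about 0.81, while the second is about 0.01.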
Evaluating the transferability of Hapmap SNPs to a Singapore Chinese population
Anand Andiappan, Ramani Anantharaman, Pallavi Nilkanth, De Yun Wang, Fook Chew
BMC Genetics , 2010, DOI: 10.1186/1471-2156-11-36
Abstract: A total of 237 SNPs were identified through resequencing, of which only 95 SNPs (40%) were in Hapmap; however, an additional 56 SNPs (24%) were not genotyped directly but had a proxy SNP in the Hapmap. At the genome-wide level, Singapore Chinese were highly correlated with Hapmap Han Chinese, with correlations of 0.954 and 0.947 for the Illumina and Affymetrix platforms respectively, and with deviant SNPs randomly distributed within and across all chromosomes. The high correlation between our population and Hapmap Han Chinese reaffirms the applicability of Hapmap-based genome-wide chips for GWA studies. There is a clear population signature for the Singapore Chinese samples and they predominantly resemble the southern Han Chinese population; however, when new migrants, particularly those with a northern Han Chinese background, are included, population stratification issues may arise. Future studies need to address population stratification within the sample collection while designing and interpreting GWAS in the Chinese population. The International Hapmap Project is a multi-centre effort aimed at identifying genetic variations across the human genome among different individuals to aid biomedical researchers in identifying genetic links to various diseases and variable drug response [1-3]. The Hapmap Consortium developed a human haplotype map by genotyping 270 samples from four populations with diverse geographic ancestry. These samples included 30 trios (mother, father, and adult child) from the Yoruba in Ibadan, Nigeria (YRI); 30 trios from the Centre d'Etude du Polymorphisme Humain (CEPH) collection of Utah residents of Northern and Western European ancestry; 45 unrelated Han Chinese in Beijing (CHB); and 45 unrelated Japanese in Tokyo (JPT) [4]. 
While the latest published update to the Hapmap project indicates the availability of data for more than 3.1 million single nucleotide polymorphisms (SNPs) in the four populations [3], this number has grown to more than 26 million SNPs
Effective selection of informative SNPs and classification on the HapMap genotype data
Nina Zhou, Lipo Wang
BMC Bioinformatics , 2007, DOI: 10.1186/1471-2105-8-484
Abstract: In this paper, we propose to first rank each feature (SNP) using a ranking measure, i.e., a modified t-test or F-statistics. Then from the ranking list, we form different feature subsets by sequentially choosing different numbers of features (e.g., 1, 2, 3, ..., 100) with top ranking values, train and test them by a classifier, e.g., the support vector machine (SVM), thereby finding one subset which has the highest classification accuracy. Compared to the classification method of Park et al., we obtain a better result, i.e., good classification of the 3 populations using on average 64 SNPs. Experimental results show that both the modified t-test and the F-statistics method are very effective in ranking SNPs by their classification capabilities. Combined with the SVM classifier, a desirable feature subset (with the minimum size and most informativeness) can be quickly found in a greedy manner after ranking all SNPs. Our method is able to identify a very small number of important SNPs that can determine the populations of individuals. When any single nucleotide of A, T, C and G in the genome sequence is replaced by one of the other 3 nucleotides, e.g., from AAATCCGG to AAATTCGG, we call this single base variation (C → T) a single nucleotide polymorphism (SNP). It has the following three characteristics [1]: 1) very common in the human genome (a SNP occurs every 100 to 300 bases along the 3-billion-base human genome); 2) among the SNPs, two of every three SNPs are variations from cytosine (C) to thymine (T); 3) very stable from generation to generation. Due to these characteristics, much research on SNPs has been developed, such as using SNPs to study the association of sequence variation [2-5] and to do population classification [6,7]. In association studies [2-5], informative SNPs were usually selected based on certain correlation measures and therefore could represent other SNPs in close proximity. For example, Bafna et al. 
[2] and Halldórsson et al
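The ranking step described in this abstract can be sketched in Python as follows. This covers only the ranking: it uses the ordinary one-way ANOVA F-statistic rather than the authors' modified version (which the abstract does not define), and it omits the SVM training and testing of the top-k subsets. The SNP names, genotypes and population labels are hypothetical.

```python
def f_statistic(values, labels):
    """One-way ANOVA F-statistic of a genotype vector across population labels.

    Assumes at least two groups and more observations than groups.
    """
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    if ss_within == 0:            # no within-group variation at all
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_snps(genotypes, labels):
    """SNP names sorted from most to least population-informative."""
    scores = {snp: f_statistic(g, labels) for snp, g in genotypes.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical data: 9 individuals from 3 populations, genotypes coded 0/1/2.
labels = ["P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3", "P3"]
genotypes = {
    "snp2": [0, 1, 0, 1, 1, 2, 2, 1, 2],   # partly informative
    "snp3": [1, 1, 1, 1, 1, 1, 1, 1, 1],   # uninformative
    "snp1": [0, 0, 0, 1, 1, 1, 2, 2, 2],   # separates the populations perfectly
}
ranking = rank_snps(genotypes, labels)
print(ranking)
```

A greedy subset search in the spirit of the abstract would then take the first 1, 2, 3, ... entries of `ranking`, train a classifier on each subset, and keep the smallest subset with the best accuracy.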
GLIDERS - A web-based search engine for genome-wide linkage disequilibrium between HapMap SNPs
Robert Lawrence, Aaron G Day-Williams, Richard Mott, John Broxholme, Lon R Cardon, Eleftheria Zeggini
BMC Bioinformatics , 2009, DOI: 10.1186/1471-2105-10-367
Abstract: GLIDERS is an easy-to-use web tool that only requires the user to enter the rs numbers of the SNPs for which they want to retrieve genome-wide LD (both nearby and long-range). The intuitive web interface handles manual entry of SNP IDs and also allows users to upload files of SNP IDs. The user can limit the resulting inter-SNP associations with easy-to-use menu options. These include MAF limit (5-45%), distance limits between SNPs (minimum and maximum), r² (0.3 to 1), HapMap population sample (CEU, YRI and JPT+CHB combined) and HapMap build/release. All resulting genome-wide inter-SNP associations are displayed on a single output page, which has a link to a downloadable tab-delimited text file. GLIDERS is a quick and easy way to retrieve genome-wide inter-SNP associations and to explore LD patterns for any number of SNPs of interest. GLIDERS can be useful in identifying SNPs with long-range LD. This can highlight mis-mapping or other potential association signal localisation problems. The discovery of the block structure of haplotypes has led to much research into patterns of local linkage disequilibrium (LD) in the genome [1-3]. The International HapMap Project was motivated by these discoveries to create a fine-scale catalogue of common single nucleotide polymorphisms (SNPs) in different populations to allow further investigations into LD [4,5]. The HapMap has allowed researchers to better understand patterns of LD and utilize the information in the design of genome-wide association studies (GWAS). Several tools have been developed allowing researchers to utilize the HapMap data to investigate LD, including Haploview and SNAP [6,7]. Analysis of HapMap has revealed widespread, and often complex, patterns of LD, which have made the localization of causal variants difficult in regions identified to be associated with disease through GWAS. 
Research thus far has been focused on regional patterns of LD and has revealed that LD in most genomic regions decays substantially over s
Tagging Single Nucleotide Polymorphisms in the BRIP1 Gene and Susceptibility to Breast and Ovarian Cancer
Honglin Song, Susan J. Ramus, Susanne Krüger Kjaer, Estrid Hogdall, Richard A. DiCioccio, Alice S. Whittemore, Valerie McGuire, Claus Hogdall, Ian J. Jacobs, Douglas F. Easton, Bruce A.J. Ponder, Alison M. Dunning, Simon A. Gayther, Paul D.P. Pharoah
PLOS ONE , 2007, DOI: 10.1371/journal.pone.0000268
Abstract: Background: BRIP1 interacts with BRCA1 and functions in regulating DNA double strand break repair pathways. Germline BRIP1 mutations are associated with breast cancer and Fanconi anemia. Thus, common variants in BRIP1 are candidates for breast and ovarian cancer susceptibility. Methods: We used a SNP tagging approach to evaluate the association between common variants (minor allele frequency ≥ 0.05) in BRIP1 and the risks of breast cancer and invasive ovarian cancer. 12 tagging SNPs (tSNPs) in the gene were identified and genotyped in up to 2,270 breast cancer cases and 2,280 controls from the UK and up to 1,513 invasive ovarian cancer cases and 2,515 controls from the UK, Denmark and USA. Genotype frequencies in cases and controls were compared using logistic regression. Results: Two tSNPs showed a marginally significant association with ovarian cancer: carriers of the minor allele of rs2191249 were at reduced risk compared with the common homozygotes (Odds Ratio (OR) = 0.90 (95% CI, 0.82–1.0), P-trend = 0.045) and the minor allele of rs4988344 was associated with increased risk (OR = 1.15 (95% CI, 1.02–1.30), P-trend = 0.02). When the analyses were restricted to serous ovarian cancers, these effects became slightly stronger. These results were not significant at the 5% level after adjusting for multiple testing. None of the tSNPs was associated with breast cancer. Conclusions: It is unlikely that common variants in BRIP1 contribute significantly to breast cancer susceptibility. The possible association of rs2191249 and rs4988344 with ovarian cancer risk warrants confirmation in independent case-control studies.
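For readers unfamiliar with the OR and 95% CI notation used in this abstract, the following Python sketch computes an odds ratio with a Wald confidence interval from a 2×2 carrier table. It is a simplified stand-in, not the study's analysis: the authors used logistic regression, and the counts below are hypothetical rather than taken from the paper.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    All four counts must be positive; z=1.96 gives an approximate 95% interval.
    """
    a, b, c, d = (case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers)
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: minor-allele carriers vs. non-carriers.
odds_ratio, lower, upper = odds_ratio_ci(100, 200, 50, 200)
print(f"OR = {odds_ratio:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 corresponds to nominal significance at the 5% level, before any correction for multiple testing of the kind the abstract mentions.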
Effects of cutoff thresholds for minor allele frequencies on HapMap resolution: A real dataset-based evaluation of the Chinese Han and Tibetan populations
ShiYi Xiong, YuanTao Hao, ShaoQi Rao, WeiJun Huang, Bin Hu, Labu, Pubuzhuoma, Gesangzhuogab, YiMing Wang
Chinese Science Bulletin , 2009, DOI: 10.1007/s11434-009-0302-4
Abstract: Genomic variation is the genetic basis of phenotypic diversity among individuals, including variation in disease susceptibility and drug response. The greatest promise of the International HapMap is to provide roadmaps for identifying genetic variants predisposing to complex diseases. The single nucleotide polymorphism (SNP) is the fundamental element of the HapMap. Allele frequency of SNPs is one of the major factors affecting the resulting HapMap, being the factor upon which linkage disequilibrium (LD) is calculated, haplotypes are constructed, and tagging SNPs (tagSNPs) are selected. The cutoff thresholds for the frequency of minor alleles used in the making of the map therefore have profound effects on the resolution of that map. To date most researchers have adopted their own cutoff thresholds, and there has been little real dataset-based evaluation of the effects of different cutoff thresholds on HapMap resolution. In an attempt to assess the implications of different cutoff values, we analyzed our own data for the centromeric genes on Chromosome 15 in Chinese Han and Tibetan populations, with respect to minor allele frequency cutoff values of ≥0.01 (0.01 group), ≥0.05 (0.05 group), and ≥0.10 (0.10 group), and constructed HapMaps from each of the datasets. The resolution, study power and cost-effectiveness for each of the maps were compared. Our results show that the 0.01 threshold provides the greatest power (P = 0.019 in Han and P = 0.029 in Tibetan for 0.01 vs. 0.05 threshold) and detects the most population-specific haplotypes (P = 0.012 for 0.01 vs. 0.05 threshold). However, in the regions studied, the 0.05 cutoff threshold did not significantly increase power above the 0.10 threshold (P = 0.191 in Han; 1.000 in Tibetans), and did not improve resolution over the 0.10 value for population-specific haplotypes (P = 0.592) either. 
Furthermore, the 0.05 and 0.10 values produced the same figures for tagging efficiency, LD block number, LD length, study power and cost-savings in the Tibetan population. These results suggest that a lower cutoff value is more appropriate for studies in which population-specific haplotypes are crucial, and that the most appropriate cutoff value may differ between populations. Because only a limited number of genes were studied in this project, more studies should be conducted to further address this important issue.
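The three cutoff groups compared in this abstract amount to filtering the same SNP set at different minor allele frequency thresholds. A minimal Python sketch of that filtering step, with hypothetical SNP names and frequencies (the downstream LD, haplotype and power analyses are not reproduced):

```python
def retained_by_cutoff(mafs, cutoffs=(0.01, 0.05, 0.10)):
    """Map each MAF cutoff to the SNPs whose MAF is at or above it."""
    return {c: [snp for snp, maf in mafs.items() if maf >= c]
            for c in cutoffs}

# Hypothetical minor allele frequencies for four SNPs.
mafs = {"snp_a": 0.008, "snp_b": 0.03, "snp_c": 0.07, "snp_d": 0.25}
kept = retained_by_cutoff(mafs)
counts = {cutoff: len(snps) for cutoff, snps in kept.items()}
print(counts)
```

Lower cutoffs keep rarer variants (here `snp_b` survives only the 0.01 cutoff), which is the mechanism by which the 0.01 group can detect haplotypes that the 0.05 and 0.10 groups miss.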
Copyright © 2008-2017 Open Access Library. All rights reserved.