Search Results: 1 - 10 of 100 matches for " "
All listed articles are free for downloading (OA Articles)
Estimating haplotype frequencies in pooled DNA samples when there is genotyping error
Shannon RE Quade, Robert C Elston, Katrina AB Goddard
BMC Genetics , 2005, DOI: 10.1186/1471-2156-6-25
Abstract: Pool sizes of 2, 5, and 10 individuals achieved comparable levels of accuracy in the estimation procedure. Common marker allele frequencies and no inter-marker LD result in less accurate estimates. This pattern is observed regardless of the amount of genotyping error simulated. Genotyping error slightly decreases the accuracy of haplotype frequency estimates. However, the EM algorithm performs well even in the presence of genotyping error. Overall, pools of 2, 5, and 10 individuals yield similar accuracy of the haplotype frequency estimates, while reducing costs due to genotyping. Association studies offer several advantages to linkage analysis for mapping susceptibility loci in complex diseases. They may be more powerful than linkage analysis for loci with a small effect, since the excess sharing across families is expected to be greater than the excess sharing within a family (identity-by-descent (IBD)) [1]. In addition, association studies are expected to provide greater precision in pinpointing the location of susceptibility loci. Finally, association studies do not require the collection of groups of relatives or extended pedigrees, which can be challenging – particularly for late onset diseases. However, even for association studies, the large sample sizes necessary to study the genetics of complex disease appear unavoidable, so recent interest has focused on methods to reduce the cost. One approach is to use diallelic nucleotide bases, or single nucleotide polymorphisms (SNPs), to help identify susceptibility genes [2]. SNPs are abundantly available in the human genome (approximately 1 per kb of DNA) [3], providing a plentiful source in the genome from which to choose. Additionally, SNP genotyping can be completely automated, and recent technologies have decreased the time necessary to perform the genotyping (as reviewed by Syvanen 2001) [4]. As a result, SNPs are relatively easy, fast, and inexpensive to genotype compared to other existing technologies, such as
Maximum Likelihood Estimation of Frequencies of Known Haplotypes from Pooled Sequence Data
Darren Kessner, Tom Turner, John Novembre
Quantitative Biology , 2012, DOI: 10.1093/molbev/mst016
Abstract: DNA samples are often pooled, either by experimental design, or because the sample itself is a mixture. For example, when population allele frequencies are of primary interest, individual samples may be pooled together to lower the cost of sequencing. Alternatively, the sample itself may be a mixture of multiple species or strains (e.g. bacterial species comprising a microbiome, or pathogen strains in a blood sample). We present an expectation-maximization (EM) algorithm for estimating haplotype frequencies in a pooled sample directly from mapped sequence reads, in the case where the possible haplotypes are known. This method is relevant to the analysis of pooled sequencing data from selection experiments, as well as the calculation of proportions of different strains within a metagenomics sample. Our method outperforms existing methods based on single-site allele frequencies, as well as simple approaches using sequence read data. We have implemented the method in a freely available open-source software tool.
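The idea of estimating the frequencies of known haplotypes from reads can be illustrated with a minimal EM sketch. This is not the authors' implementation: the read representation (a dict mapping site index to observed base), the single per-base `error_rate`, and the function names are all assumptions made for illustration. Each read is treated as drawn from one haplotype in the pool, and the E-step and M-step are the standard updates for mixture proportions.

```python
from math import prod


def read_likelihood(read, haplotype, error_rate):
    """P(read | haplotype): a covered site matches with prob 1 - error_rate."""
    return prod(
        (1.0 - error_rate) if haplotype[site] == base else error_rate
        for site, base in read.items()
    )


def em_haplotype_frequencies(reads, haplotypes, error_rate=0.01,
                             max_iter=200, tol=1e-10):
    """Estimate mixture proportions of known haplotypes from reads via EM."""
    k = len(haplotypes)
    freqs = [1.0 / k] * k  # start from uniform frequencies
    # Read-vs-haplotype likelihoods are fixed, so compute them once.
    lik = [[read_likelihood(r, h, error_rate) for h in haplotypes]
           for r in reads]
    for _ in range(max_iter):
        totals = [0.0] * k
        for row in lik:
            # E-step: posterior probability that this read came from each haplotype.
            weights = [f * l for f, l in zip(freqs, row)]
            norm = sum(weights)
            for j in range(k):
                totals[j] += weights[j] / norm
        # M-step: new frequency is the average posterior over reads.
        new_freqs = [t / len(reads) for t in totals]
        converged = max(abs(a - b) for a, b in zip(freqs, new_freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


# Hypothetical toy data: three known haplotypes over three sites.
haplotypes = ["AAT", "GAT", "GCC"]
reads = [{0: "A", 1: "A"}, {1: "C", 2: "C"}, {0: "G"}, {2: "T"}]
print(em_haplotype_frequencies(reads, haplotypes))
```

Precomputing the likelihood table keeps each iteration to a single pass over the reads; a real tool would also need to handle base qualities and mapping uncertainty, which this sketch ignores.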
Fast and accurate haplotype frequency estimation for large haplotype vectors from pooled DNA data
Iliadis Alexandros, Anastassiou Dimitris, Wang Xiaodong
BMC Genetics , 2012, DOI: 10.1186/1471-2156-13-94
Abstract: Background: Typically, the first phase of a genome wide association study (GWAS) includes genotyping across hundreds of individuals and validation of the most significant SNPs. Allelotyping of pooled genomic DNA is a common approach to reduce the overall cost of the study. Knowledge of haplotype structure can provide additional information to single locus analyses. Several methods have been proposed for estimating haplotype frequencies in a population from pooled DNA data. Results: We introduce a technique for haplotype frequency estimation in a population from pooled DNA samples, focusing on datasets containing a small number of individuals per pool (2 or 3 individuals) and a large number of markers. We compare our method with the publicly available state-of-the-art algorithms HIPPO and HAPLOPOOL on datasets with varying numbers of pools and markers. We demonstrate that our algorithm provides improvements in terms of accuracy and computational time over competing methods for large numbers of markers while demonstrating comparable performance for smaller numbers of markers. Our method is implemented in the "Tree-Based Deterministic Sampling Pool" (TDSPool) package, which is available for download at http://www.ee.columbia.edu/~anastas/tdspool. Conclusions: Using a tree-based deterministic sampling technique, we present an algorithm for haplotype frequency estimation from pooled data. Our method demonstrates superior performance in datasets with large numbers of markers and could be the method of choice for haplotype frequency estimation in such datasets.
HLA Class II Allele and Haplotype Frequencies in Iranian Patients with Leukemia
Farideh Khosravi, Aliakbar Amirzargar, Abdolfatah Sarafnejad, Mohammad Hossein Nicknam
Iranian Journal Of Allergy, Asthma and Immunology , 2007,
Abstract: Previous studies demonstrated significant differences in a number of HLA allele frequencies between leukemia patients and normal subjects. In this study, we have analyzed HLA class II alleles and haplotypes in 110 leukemia patients (60 acute myelogenous leukemia "AML", 50 chronic myelogenous leukemia "CML") and 180 unrelated normal subjects. Blood samples were collected from all of the patients and control subjects. DNA was extracted by the salting-out method and HLA typing was performed using the PCR-SSP method. A significant positive association with AML was obtained for the HLA-DRB1*11 allele (35% vs. 24.7%, P=0.033). Two alleles, HLA-DRB4 and -DQB1*0303, were significantly less frequent in AML patients than in controls. The HLA-DQB1*0303 allele was never observed in CML patients, compared with an allele frequency of 4.2% in controls. According to haplotype analysis, HLA-DRB1*0101/DQA1*0104/-DQB1*0501 frequencies were significantly higher and -DRB1*16/-DQA1*01021/-DQB1*0501 frequencies were significantly lower in CML patients than in controls. In conclusion, it is suggested that the HLA-DRB1*16 allele and the HLA-DRB1*15/-DQA1*0103/-DQB1*06011 and -DRB1*16/-DQA1*01021/-DQB1*0501 haplotypes predispose individuals to AML and that the HLA-DRB4 allele predisposes to CML. Future studies are needed to confirm these results and establish the role of these associations in AML and CML.
Haplotype frequencies at the DRD2 locus in populations of the East European Plain
Olga V Flegontova, Andrey V Khrunin, Olga I Lylova, Larisa A Tarskaia, Victor A Spitsyn, Alexey I Mikulich, Svetlana A Limborska
BMC Genetics , 2009, DOI: 10.1186/1471-2156-10-62
Abstract: We investigated TaqI B, BclI, MboI, TaqI D, and TaqI A RFLPs in 17 contemporary populations of the East European Plain and Siberia. Most of these populations belong to the Indo-European or Uralic language families. We identified three common haplotypes, which occurred in more than 90% of chromosomes investigated. The frequencies of the haplotypes differed according to linguistic and geographical affiliation. Populations in the northwestern (Byelorussians from Mjadel'), northern (Russians from Mezen' and Oshevensk), and eastern (Russians from Puchezh) parts of the East European Plain had relatively high frequencies of haplotype B2-D2-A2, which may reflect admixture with Uralic-speaking populations that inhabited all of these regions in the Early Middle Ages. The DRD2 gene is located on chromosome 11 and encodes the neuronal dopamine receptor D2, which plays a role in movement, emotional memory, and appetitive behavior [1]. The DRD2 locus was an object of numerous genetic association studies [2-5], and the most extensively studied polymorphism is a TaqI A RFLP (rs1800497; in the vicinity of the DRD2 gene), which has been associated with the pathology of psychoses (schizophrenia and manic-depressive disorder), Parkinson's disease, and various substance abuse syndromes. It has been proposed that TaqI A might be in linkage disequilibrium with some unidentified polymorphisms within the exons or regulatory regions of the DRD2 gene, but recently it has been mapped to the last exon of the ANKK1 (ankyrin repeat and kinase domain containing 1) gene, and it results in a Glu-Lys substitution [6]. Other frequently studied RFLPs, for example, TaqI B and D (rs1079597 and rs1800498, respectively) are located in the introns of the DRD2 gene and, most probably, have no functional significance. TaqI B, TaqI D, and TaqI A polymorphisms have also been studied on a worldwide scale [[7-11]; the ALFRED database, http://alfred.med.yale.edu/alfred/index.asp], and centers of dispersal, wh
Validation of SNP Allele Frequencies Determined by Pooled Next-Generation Sequencing in Natural Populations of a Non-Model Plant Species
Christian Rellstab, Stefan Zoller, Andrew Tedder, Felix Gugerli, Martin C. Fischer
PLOS ONE , 2013, DOI: 10.1371/journal.pone.0080422
Abstract: Sequencing of pooled samples (Pool-Seq) using next-generation sequencing technologies has become increasingly popular, because it represents a rapid and cost-effective method to determine allele frequencies for single nucleotide polymorphisms (SNPs) in population pools. Validation of allele frequencies determined by Pool-Seq has been attempted using an individual genotyping approach, but these studies tend to use samples from existing model organism databases or DNA stores, and do not validate a realistic setup for sampling natural populations. Here we used pyrosequencing to validate allele frequencies determined by Pool-Seq in three natural populations of Arabidopsis halleri (Brassicaceae). The allele frequency estimates of the pooled population samples (consisting of 20 individual plant DNA samples) were determined after mapping Illumina reads to (i) the publicly available, high-quality reference genome of a closely related species (Arabidopsis thaliana) and (ii) our own de novo draft genome assembly of A. halleri. We then pyrosequenced nine selected SNPs using the same individuals from each population, resulting in a total of 540 samples. Our results show a highly significant and accurate relationship between pooled and individually determined allele frequencies, irrespective of the reference genome used. Allele frequencies differed on average by less than 4%. There was no tendency that either the Pool-Seq or the individual-based approach resulted in higher or lower estimates of allele frequencies. Moreover, the rather high coverage in the mapping to the two reference genomes, ranging from 55 to 284x, had no significant effect on the accuracy of the Pool-Seq. A resampling analysis showed that only very low coverage values (below 10-20x) would substantially reduce the precision of the method. We therefore conclude that a pooled re-sequencing approach is well suited for analyses of genetic variation in natural populations.
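Why very low coverage hurts precision can be seen from a back-of-the-envelope calculation. The sketch below is not the resampling analysis from the study; it assumes that every read at a SNP is an independent draw from the pool, that all individuals contribute equal amounts of DNA, and that there is no sequencing error, so the read-based estimate has a binomial standard error. The function name and the example frequency are hypothetical.

```python
from math import sqrt


def read_sampling_standard_error(pool_freq, coverage):
    """Binomial standard error of a Pool-Seq allele frequency estimate.

    pool_freq: true allele frequency among the pooled chromosomes.
    coverage:  number of reads covering the SNP.
    Assumes independent reads, equal DNA contribution, no sequencing error.
    """
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return sqrt(pool_freq * (1.0 - pool_freq) / coverage)


# Compare a few coverage values for a hypothetical SNP at frequency 0.5.
for cov in (10, 20, 55, 284):
    print(cov, round(read_sampling_standard_error(0.5, cov), 4))
```

Under these assumptions the standard error shrinks with the square root of coverage, so the gain from each additional read diminishes quickly at high coverage.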
Novel Quantitative Real-Time LCR for the Sensitive Detection of SNP Frequencies in Pooled DNA: Method Development, Evaluation and Application
Androniki Psifidi, Chrysostomos Dovas, Georgios Banos
PLOS ONE , 2012, DOI: 10.1371/journal.pone.0014560
Abstract: Single nucleotide polymorphisms (SNP) have proven to be powerful genetic markers for genetic applications in medicine, life science and agriculture. A variety of methods exist for SNP detection but few can quantify SNP frequencies when the mutated DNA molecules correspond to a small fraction of the wild-type DNA. Furthermore, there is no generally accepted gold standard for SNP quantification, and, in general, currently applied methods give inconsistent results in selected cohorts. In the present study we sought to develop a novel method for accurate detection and quantification of SNP in DNA pooled samples.
Allele-Level Haplotype Frequencies and Pairwise Linkage Disequilibrium for 14 KIR Loci in 506 European-American Individuals
Cynthia Vierra-Green, David Roe, Lihua Hou, Carolyn Katovich Hurley, Raja Rajalingam, Elaine Reed, Tatiana Lebedeva, Neng Yu, Mary Stewart, Harriet Noreen, Jill A. Hollenbach, Lisbeth A. Guethlein, Tao Wang, Stephen Spellman, Martin Maiers
PLOS ONE , 2012, DOI: 10.1371/journal.pone.0047491
Abstract: The immune responses of natural killer cells are regulated, in part, by killer cell immunoglobulin-like receptors (KIR). The 16 closely-related genes in the KIR gene system have been diversified by gene duplication and unequal crossing over, thereby generating haplotypes with variation in gene copy number. Allelic variation also contributes to diversity within the complex. In this study, we estimated allele-level haplotype frequencies and pairwise linkage disequilibrium statistics for 14 KIR loci. The typing utilized multiple methodologies by four laboratories to provide at least 2x coverage for each allele. The computational methods generated maximum-likelihood estimates of allele-level haplotypes. Our results indicate the most extensive allele diversity was observed for the KIR framework genes and for the genes localized to the telomeric region of the KIR A haplotype. Particular alleles of the stimulatory loci appear to be nearly fixed on specific, common haplotypes while many of the less frequent alleles of the inhibitory loci appeared on multiple haplotypes, some with common haplotype structures. Haplotype structures cA01 and/or tA01 predominate in this cohort, as has been observed in most populations worldwide. Linkage disequilibrium is high within the centromeric and telomeric haplotype regions but not between them and is particularly strong between centromeric gene pairs KIR2DL5~KIR2DS3S5 and KIR2DS3S5~KIR2DL1, and telomeric KIR3DL1~KIR2DS4. Although 93% of the individuals have unique pairs of full-length allelic haplotypes, large genomic blocks sharing specific sets of alleles are seen in the most frequent haplotypes. These high-resolution, high-quality haplotypes extend our basic knowledge of the KIR gene system and may be used to support clinical studies beyond single gene analysis.
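For readers unfamiliar with pairwise linkage disequilibrium, the two most common statistics can be computed directly from haplotype and allele frequencies. The abstract does not say which statistics the authors used, and KIR loci are multi-allelic with copy-number variation, so the biallelic D and r² below are only a simplified illustration; the function name and the example frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a:  frequency of allele A at the first locus.
    p_b:  frequency of allele B at the second locus.
    """
    d = p_ab - p_a * p_b  # deviation from independence
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d, d * d / denom


# Hypothetical frequencies for illustration only.
d, r2 = pairwise_ld(p_ab=0.30, p_a=0.40, p_b=0.50)
print(d, r2)
```

D measures how far the observed haplotype frequency departs from the product of the allele frequencies, and r² rescales it to lie between 0 and 1.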
NullHap – a versatile application to estimate haplotype frequencies from unphased genotypes in the presence of null alleles
Robert M Nowak, Rafał Płoski
BMC Bioinformatics , 2008, DOI: 10.1186/1471-2105-9-330
Abstract: Here we present a description of a modified Expectation-Maximization algorithm as well as its implementation (NullHap), which allow these limitations to be overcome effectively. As an example of application we used NullHap to reanalyze published data on the distribution of KIR genotypes in Polish psoriasis patients and controls, showing that the KIR2DS4/1D locus may be a marker of KIR2DS1 haplotypes with different effects on disease susceptibility. The developed application can estimate haplotype frequencies for every type of polymorphism and can effectively be used in genetic research, as illustrated by a novel finding regarding the genetic susceptibility to psoriasis. Laboratory techniques used to determine haplotypes [1] are often too expensive for large-scale studies. The lack of phase information provided by the popular typing methods could be overcome using likelihood-based calculations [2], which estimate haplotype frequencies in a population and reconstruct the haplotype pair in each individual. This approach is more cost-effective and powerful than linkage analysis [3], and gives more information than single marker-based methods [4]. Haplotype estimation procedures typically use a maximum likelihood approach. The most popular algorithm, implemented for example in Arlequin [5], is the Expectation-Maximization algorithm (EM) [6], but other methods have also been proposed: a Bayesian method using a pseudo-Gibbs sampler [7], partition-ligation [8], Monte Carlo [9] and Hidden Markov Model [10]. A frequent shortcoming of available software packages [5,7] is the inability to analyze loci where null variants occur with an appreciable frequency. In a diploid organism, a null allele is a variant which is not detected in genotyping, because of a deletion of an entire locus or because of a mutation interfering with analysis. This makes it impossible to distinguish between some heterozygous and homozygous genotypes [11]. For example, if there is only one alternative allele A1 besides
Genetic relationships among Native Americans based on beta-globin gene cluster haplotype frequencies
Mousinho-Ribeiro, Rita de Cassia; Pante-de-Sousa, Gabriella; Santos, Eduardo José Melo dos; Guerreiro, João Farias
Genetics and Molecular Biology , 2003, DOI: 10.1590/S1415-47572003000300002
Abstract: The distribution of β-globin gene haplotypes was studied in 209 Amerindians from eight tribes of the Brazilian Amazon: Asurini from Xingú, Awá-Guajá, Parakanã, Urubú-Kaapór, Zoé, Kayapó (Xikrin from the Bacajá village), Katuena, and Tiriyó. Nine different haplotypes were found, two of which (n. 11 and 13) had not been previously identified in Brazilian indigenous populations. Haplotype 2 (+ - - - -) was the most common in all groups studied, with frequencies varying from 70% to 100%, followed by haplotype 6 (- + + - +), with frequencies between 7% and 18%. The frequency distribution of the β-globin gene haplotypes in the eighteen Brazilian Amerindian populations studied to date is characterized by a reduced number of haplotypes (average of 3.5) and low levels of heterozygosity and intrapopulational differentiation, with a single clearly predominant haplotype in most tribes (haplotype 2). The Parakanã, Urubú-Kaapór, Tiriyó and Xavante tribes constitute exceptions, presenting at least four haplotypes with relatively high frequencies. The closest genetic relationships were observed between the Brazilian and the Colombian Amerindians (Wayuu, Kamsa and Inga), and, to a lesser extent, with the Huichol of Mexico. North-American Amerindians are more differentiated and clearly separated from all other tribes, except the Xavante, from Brazil, and the Mapuche, from Argentina. A restricted pool of ancestral haplotypes may explain the low diversity observed among most present-day Brazilian and Colombian Amerindian groups, while interethnic admixture could be the most important factor to explain the high number of haplotypes and high levels of diversity observed in some South-American and most North-American tribes.