GWAMA: software for genome-wide association meta-analysis
Reedik Mägi, Andrew P Morris
BMC Bioinformatics , 2010, DOI: 10.1186/1471-2105-11-288
Abstract: We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. Software with source files, documentation and example data files is freely available online at http://www.well.ox.ac.uk/GWAMA. Genome-wide association (GWA) studies of hundreds of thousands of single nucleotide polymorphisms (SNPs), genotyped in samples of thousands of individuals, such as those undertaken by the Wellcome Trust Case Control Consortium [1], have proved successful in identifying novel common variants contributing moderate effects to a wide range of complex human traits (odds ratios greater than 1.2 for dichotomous traits or heritability of at least 1% for quantitative phenotypes). However, much of the genetic variation underpinning variation in these traits remains, as yet, unexplained. One natural way to increase power to detect rarer variants of more modest effect is to increase sample size. This can most readily be achieved through meta-analysis of multiple studies from the same or closely related populations, increasing the sample size to the order of tens of thousands.
Such analyses have led to the identification of multiple, now established associations that would not otherwise have been identified in any individual study [2-4]. Meta-analysis of GWA studies has been greatly assisted by the development of imputation techniques [5,6] which predict genotypes not directly typed on available GWA genotypin
Local Exome Sequences Facilitate Imputation of Less Common Variants and Increase Power of Genome Wide Association Studies
Peter K. Joshi, James Prendergast, Ross M. Fraser, Jennifer E. Huffman, Veronique Vitart, Caroline Hayward, Ruth McQuillan, Dominik Glodzik, Ozren Polašek, Nicholas D. Hastie, Igor Rudan, Harry Campbell, Alan F. Wright, Chris S. Haley, James F. Wilson, Pau Navarro
PLOS ONE , 2013, DOI: 10.1371/journal.pone.0068604
Abstract: The analysis of less common variants in genome-wide association studies promises to elucidate complex trait genetics but is hampered by low power to reliably detect association. We show that addition of population-specific exome sequence data to global reference data allows more accurate imputation, particularly of less common SNPs (minor allele frequency 1–10%) in two very different European populations. The imputation improvement corresponds to an increase in effective sample size of 28–38%, for SNPs with a minor allele frequency in the range 1–3%.
Genomic Risk Profiling of Ischemic Stroke: Results of an International Genome-Wide Association Meta-Analysis
James F. Meschia, Andrew Singleton, Michael A. Nalls, Stephen S. Rich, Pankaj Sharma, Luigi Ferrucci, Mar Matarin, Dena G. Hernandez, Kerra Pearce, Thomas G. Brott, Robert D. Brown, John Hardy, Bradford B. Worrall
PLOS ONE , 2011, DOI: 10.1371/journal.pone.0023161
Abstract: Introduction: Familial aggregation of ischemic stroke derives from shared genetic and environmental factors. We present a meta-analysis of genome-wide association scans (GWAS) from 3 cohorts to identify the contribution of common variants to ischemic stroke risk. Methods: This study involved 1464 ischemic stroke cases and 1932 controls. Cases were genotyped using the Illumina 610 or 660 genotyping arrays; controls, with Illumina HumanHap 550Kv1 or 550Kv3 genotyping arrays. Imputation was performed with the 1000 Genomes European ancestry haplotypes (August 2010 release) as a reference. A total of 5,156,597 single-nucleotide polymorphisms (SNPs) were incorporated into the fixed effects meta-analysis. All SNPs associated with ischemic stroke (P<1×10⁻⁵) were incorporated into a multivariate risk profile model. Results: No SNP reached genome-wide significance for ischemic stroke (P<5×10⁻⁸). Secondary analysis identified a significant cumulative effect for age at onset of stroke (first versus fifth quintile of cumulative profiles based on SNPs associated with late onset, β = 14.77 [10.85,18.68], P = 5.5×10⁻¹²), as well as a strong effect showing increased risk across samples with a high propensity for stroke among samples with enriched counts of suggestive risk alleles (P<5×10⁻⁶). Risk profile scores based only on genomic information offered little incremental prediction. Discussion: There is little evidence of a common genetic variant contributing to moderate risk of ischemic stroke. Quintiles based on genetic loading of alleles associated with a younger age at onset of ischemic stroke revealed a significant difference in age at onset between those in the upper and lower quintiles. Using common variants from GWAS and imputation, genomic profiling remains inferior to family history of stroke for defining risk. Inclusion of genomic (rare variant) information may be required to improve clinical risk profiling.
Using Family-Based Imputation in Genome-Wide Association Studies with Large Complex Pedigrees: The Framingham Heart Study
Ming-Huei Chen, Jie Huang, Wei-Min Chen, Martin G. Larson, Caroline S. Fox, Ramachandran S. Vasan, Sudha Seshadri, Christopher J. O’Donnell, Qiong Yang
PLOS ONE , 2012, DOI: 10.1371/journal.pone.0051589
Abstract: Imputation has been widely used in genome-wide association studies (GWAS) to infer genotypes of un-genotyped variants based on the linkage disequilibrium in external reference panels such as the HapMap and 1000 Genomes. However, imputation has only rarely been performed based on family relationships to infer genotypes of un-genotyped individuals. Using 8998 Framingham Heart Study (FHS) participants genotyped with Affymetrix 550K SNPs, we imputed genotypes of the same set of SNPs for an additional 3121 participants, most of whom were never genotyped due to lack of a DNA sample. Prior to imputation, 122 pedigrees were too large to be handled by the imputation software Merlin. Therefore, we developed a novel pedigree splitting algorithm that can maximize the number of genotyped relatives for imputing each un-genotyped individual, while keeping new sub-pedigrees under a pre-specified size. In GWAS of four phenotypes available in FHS (Alzheimer disease, circulating levels of fibrinogen, high-density lipoprotein cholesterol, and uric acid), we compared results using genotyped individuals only with results using both genotyped and imputed individuals. We studied the impact of applying different imputation quality filtering thresholds on the association results and did not find a universal threshold that always resulted in a more significant p-value for previously identified loci. However, most of these loci had a lower p-value when we only included imputed genotypes with ≥60% SNP- and ≥50% person-specific imputation certainty. In summary, we developed a novel algorithm for splitting large pedigrees for imputation and found a plausible imputation quality filtering threshold based on FHS. Further examination may be required to generalize this threshold to other studies.
A Flexible and Accurate Genotype Imputation Method for the Next Generation of Genome-Wide Association Studies
Bryan N. Howie, Peter Donnelly, Jonathan Marchini
PLOS Genetics , 2009, DOI: 10.1371/journal.pgen.1000529
Abstract: Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000 Genomes Project) will soon allow a broader range of SNPs to be imputed with higher accuracy, thereby increasing power. We describe a genotype imputation method (IMPUTE version 2) that is designed to address the challenges presented by these new datasets. The main innovation of our approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels while remaining computationally feasible. We find that IMPUTE v2 attains higher accuracy than other methods when the HapMap provides the sole reference panel, but that the size of the panel constrains the improvements that can be made. We also find that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%–20% lower than those of the closest competing method. One particularly challenging aspect of next-generation association studies is to integrate information across multiple reference panels genotyped on different sets of SNPs; we show that our approach to this problem has practical advantages over other suggested solutions.
Designing Genome-Wide Association Studies: Sample Size, Power, Imputation, and the Choice of Genotyping Chip
Chris C. A. Spencer (equal contributor), Zhan Su (equal contributor), Peter Donnelly, Jonathan Marchini
PLOS Genetics , 2009, DOI: 10.1371/journal.pgen.1000477
Abstract: Genome-wide association studies are revolutionizing the search for the genes underlying human complex diseases. The main decisions to be made at the design stage of these studies are the choice of the commercial genotyping chip to be used and the numbers of case and control samples to be genotyped. The most common method of comparing different chips is using a measure of coverage, but this fails to properly account for the effects of sample size, the genetic model of the disease, and linkage disequilibrium between SNPs. In this paper, we argue that the statistical power to detect a causative variant should be the major criterion in study design. Because of the complicated pattern of linkage disequilibrium (LD) in the human genome, power cannot be calculated analytically and must instead be assessed by simulation. We describe in detail a method of simulating case-control samples at a set of linked SNPs that replicates the patterns of LD in human populations, and we used it to assess power for a comprehensive set of available genotyping chips. Our results allow us to compare the performance of the chips to detect variants with different effect sizes and allele frequencies, look at how power changes with sample size in different populations or when using multi-marker tags and genotype imputation approaches, and how performance compares to a hypothetical chip that contains every SNP in HapMap. A main conclusion of this study is that marked differences in genome coverage may not translate into appreciable differences in power and that, when taking budgetary considerations into account, the most powerful design may not always correspond to the chip with the highest coverage. We also show that genotype imputation can be used to boost the power of many chips up to the level obtained from a hypothetical “complete” chip containing all the SNPs in HapMap. 
Our results have been encapsulated into an R software package that allows users to design future association studies and our methods provide a framework with which new chip sets can be evaluated.
Interpreting Meta-Analyses of Genome-Wide Association Studies
Buhm Han, Eleazar Eskin
PLOS Genetics , 2012, DOI: 10.1371/journal.pgen.1002555
Abstract: Meta-analysis is an increasingly popular tool for combining multiple genome-wide association studies in a single analysis to identify associations with small effect sizes. The effect sizes between studies in a meta-analysis may differ and these differences, or heterogeneity, can be caused by many factors. If heterogeneity is observed in the results of a meta-analysis, interpreting the cause of heterogeneity is important because the correct interpretation can lead to a better understanding of the disease and a more effective design of a replication study. However, interpreting heterogeneous results is difficult. The standard approach of examining the association p-values of the studies does not effectively predict if the effect exists in each study. In this paper, we propose a framework facilitating the interpretation of the results of a meta-analysis. Our framework is based on a new statistic representing the posterior probability that the effect exists in each study, which is estimated utilizing cross-study information. Simulations and application to the real data show that our framework can effectively segregate the studies predicted to have an effect, the studies predicted to not have an effect, and the ambiguous studies that are underpowered. In addition to helping interpretation, the new framework also allows us to develop a new association testing procedure taking into account the existence of effect.
Heterogeneity in Meta-Analyses of Genome-Wide Association Investigations
John P.A. Ioannidis, Nikolaos A. Patsopoulos, Evangelos Evangelou
PLOS ONE , 2007, DOI: 10.1371/journal.pone.0000841
Abstract: Background: Meta-analysis is the systematic and quantitative synthesis of effect sizes and the exploration of their diversity across different studies. Meta-analyses are increasingly applied to synthesize data from genome-wide association (GWA) studies and from other teams that try to replicate the genetic variants that emerge from such investigations. Between-study heterogeneity is important to document and may point to interesting leads. Methodology/Principal Findings: To exemplify these issues, we used data from three GWA studies on type 2 diabetes and their replication efforts where meta-analyses of all data using fixed effects methods (not incorporating between-study heterogeneity) have already been published. We considered 11 polymorphisms that at least one of the three teams has suggested as susceptibility loci for type 2 diabetes. The I² inconsistency metric (measuring the amount of heterogeneity not due to chance) was different from 0 (no detectable heterogeneity) for 6 of the 11 genetic variants; inconsistency was moderate to very large (I² = 32–77%) for 5 of them. For these 5 polymorphisms, random effects calculations incorporating between-study heterogeneity revealed more conservative p-values for the summary effects compared with the fixed effects calculations. These 5 associations were perused in detail to highlight potential explanations for between-study heterogeneity. These include identification of a marker for a correlated phenotype (e.g., FTO rs8050136 being associated with type 2 diabetes through its effect on obesity); differential linkage disequilibrium across studies of the identified genetic markers with the respective culprit polymorphisms (e.g., possibly the case for CDKAL1 polymorphisms or for rs9300039 and markers in linkage disequilibrium, as shown by additional studies); and potential bias. Results were largely similar when we treated the discovery and replication data from each GWA investigation as separate studies.
Significance: Between-study heterogeneity is useful to document in the synthesis of data from GWA investigations and can offer valuable insights for further clarification of gene-disease associations.
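The contrast this abstract draws between fixed effects and random effects calculations, and the I² metric it reports, can be illustrated with a minimal sketch. The abstract does not say which estimators were used, so the choices here are assumptions: inverse-variance weighting for the fixed effects summary, Cochran's Q with I² = max(0, (Q - df)/Q), and the DerSimonian-Laird estimate of between-study variance. The function names and the input numbers are hypothetical, and the inputs are taken to be per-study effect estimates (for example, log odds ratios) with their standard errors.

```python
import math


def fixed_effects(betas, ses):
    """Inverse-variance weighted summary effect and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def heterogeneity(betas, ses):
    """Cochran's Q and the I^2 inconsistency metric (as a fraction, 0..1)."""
    beta_fe, _ = fixed_effects(betas, ses)
    weights = [1.0 / se ** 2 for se in ses]
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


def random_effects(betas, ses):
    """DerSimonian-Laird random effects summary (assumed estimator)."""
    weights = [1.0 / se ** 2 for se in ses]
    q, _ = heterogeneity(betas, ses)
    df = len(betas) - 1
    # Scaling constant from the method-of-moments derivation.
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Between-study variance tau2 is added to each study's own variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(re_weights)
    beta = sum(w * b for w, b in zip(re_weights, betas)) / total
    return beta, math.sqrt(1.0 / total), tau2


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors from three studies.
    betas = [0.1, 0.3, 0.5]
    ses = [0.1, 0.1, 0.1]
    print("fixed:", fixed_effects(betas, ses))
    print("Q, I2:", heterogeneity(betas, ses))
    print("random:", random_effects(betas, ses))
```

With these made-up inputs the three studies disagree more than chance would suggest, so I² is large and the random effects standard error is wider than the fixed effects one. A wider standard error around the same summary effect gives a less extreme test statistic, which is the sense in which random effects p-values are more conservative.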
An empirical evaluation of imputation accuracy for association statistics reveals increased type-I error rates in genome-wide associations
Marcio AA Almeida, Paulo SL Oliveira, Tiago V Pereira, José E Krieger, Alexandre C Pereira
BMC Genetics , 2011, DOI: 10.1186/1471-2156-12-10
Abstract: In this report we identified a total of 73 imputed markers that yielded a nominally statistically significant association at P < 10⁻⁵ for type 2 Diabetes Mellitus and compared them with results obtained based on empirical allelic frequencies. Interestingly, despite their overall high correlation, association statistics based on imputed frequencies were discordant in 35 of the 73 (47%) associated markers, considerably inflating the type I error rate of imputed markers. We comprehensively tested several quality thresholds, the haplotypic structure underlying imputed markers and the use of flanking markers as predictors of inaccurate association statistics derived from imputed markers. Our results suggest that association statistics from imputed markers showing specific MAF (Minor Allele Frequencies) range, located in weak linkage disequilibrium blocks or strongly deviating from local patterns of association are prone to have inflated false positive association signals. The present study highlights the potential of imputation procedures and proposes simple procedures for selecting the best imputed markers for follow-up genotyping studies. Genome-wide association studies (GWAS) are a promising tool for the identification of genetic markers underlying phenotypes of interest and recently allowed the identification of markers associated with several human complex phenotypes [1]. These studies have accomplished their goals in improving our knowledge of genetic patterns underlying diseases such as diabetes mellitus type I [1] and II [2] and Crohn's disease [3]. Although methodologically appealing, these high-throughput experiments are not free from biases and limitations. Indeed, it is highly acknowledged that GWAS are not only prone to major drawbacks such as genotyping errors and sample failures, but also to varying levels of genome coverage across samples [4].
In practice, a further complication arises from the barrier imposed by the comparison of results among different GW
Meta-Analysis in Genome-Wide Association Datasets: Strategies and Application in Parkinson Disease
Evangelos Evangelou, Demetrius M. Maraganore, John P.A. Ioannidis
PLOS ONE , 2007, DOI: 10.1371/journal.pone.0000196
Abstract: Background: Genome-wide association studies hold substantial promise for identifying common genetic variants that regulate susceptibility to complex diseases. However, for the detection of small genetic effects, single studies may be underpowered. Power may be improved by combining genome-wide datasets with meta-analytic techniques. Methodology/Principal Findings: Both single and two-stage genome-wide data may be combined and there are several possible strategies. In the two-stage framework, we considered the options of (1) enhancement of replication data and (2) enhancement of first-stage data, and then, we also considered (3) joint meta-analyses including all first-stage and second-stage data. These strategies were examined empirically using data from two genome-wide association studies (three datasets) on Parkinson disease. In the three strategies, we derived 12, 5, and 49 single nucleotide polymorphisms that show significant associations at conventional levels of statistical significance. None of these remained significant after conservative adjustment for the number of performed analyses in each strategy. However, some may warrant further consideration: 6 SNPs were identified with at least 2 of the 3 strategies and 3 SNPs [rs1000291 on chromosome 3, rs2241743 on chromosome 4 and rs3018626 on chromosome 11] were identified with all 3 strategies and had no or minimal between-dataset heterogeneity (I² = 0, 0 and 15%, respectively). Analyses were primarily limited by the suboptimal overlap of tested polymorphisms across different datasets (e.g., only 31,192 shared polymorphisms between the two tier 1 datasets). Conclusions/Significance: Meta-analysis may be used to improve the power and examine the between-dataset heterogeneity of genome-wide association studies. Prospective designs may be most efficient, if they try to maximize the overlap of genotyping platforms and anticipate the combination of data across many genome-wide association studies.