Search Results: 1 - 10 of 464764 matches for "Fred A. Wright"
All listed articles are free for downloading (OA Articles)
A geometric interpretation of the permutation $p$-value and its application in eQTL studies
Wei Sun,Fred A. Wright
Statistics , 2010, DOI: 10.1214/09-AOAS298
Abstract: Permutation $p$-values have been widely used to assess the significance of linkage or association in genetic studies. However, the application in large-scale studies is hindered by a heavy computational burden. We propose a geometric interpretation of permutation $p$-values, and based on this geometric interpretation, we develop an efficient permutation $p$-value estimation method in the context of regression with binary predictors. An application to a study of gene expression quantitative trait loci (eQTL) shows that our method provides reliable estimates of permutation $p$-values while requiring less than 5% of the computational time compared with direct permutations. In fact, our method takes a constant time to estimate permutation $p$-values, no matter how small the $p$-value. Our method enables a study of the relationship between nominal $p$-values and permutation $p$-values in a wide range, and provides a geometric perspective on the effective number of independent tests.
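For orientation, the direct permutation baseline that the abstract compares against can be sketched as follows. This is not the authors' geometric estimator; it is a minimal illustration under assumed choices (the function name `permutation_p_value`, the absolute difference in group means as the test statistic, and the add-one correction are all assumptions made here, not details from the abstract).

```python
import random


def permutation_p_value(y, x, n_perm=2000, seed=0):
    """Direct permutation p-value for a trait y and a binary predictor x.

    Hypothetical helper for illustration: the statistic is the absolute
    difference in group means, and labels are shuffled n_perm times.
    """
    rng = random.Random(seed)

    def abs_mean_diff(labels):
        g1 = [v for v, g in zip(y, labels) if g == 1]
        g0 = [v for v, g in zip(y, labels) if g == 0]
        return abs(sum(g1) / len(g1) - sum(g0) / len(g0))

    observed = abs_mean_diff(x)
    labels = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any association between y and x
        if abs_mean_diff(labels) >= observed:
            hits += 1
    # Add-one correction: the estimate can never be exactly zero.
    return (hits + 1) / (n_perm + 1)
```

The smallest value this estimate can take is 1/(n_perm + 1), so resolving very small $p$-values by direct permutation requires a correspondingly large number of shuffles. That cost is the computational burden the abstract refers to.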
A procedure to detect general association based on concentration of ranks
Pratyaydipta Rudra,Fred A. Wright
Statistics , 2014,
Abstract: In modern high-throughput applications, it is important to identify pairwise associations between variables, and desirable to use methods that are powerful and sensitive to a variety of association relationships. We describe RankCover, a new non-parametric test for association between two variables that measures the concentration of paired ranked points. Here "concentration" is quantified using a disk-covering statistic that is similar to those employed in spatial data analysis. Analysis of simulated datasets demonstrates that the method is robust and often powerful in comparison to competing general association tests. We illustrate RankCover in the analysis of several real datasets.
Convergence and prediction of principal component scores in high-dimensional settings
Seunggeun Lee,Fei Zou,Fred A. Wright
Statistics , 2012, DOI: 10.1214/10-AOS821
Abstract: A number of settings arise in which it is of interest to predict Principal Component (PC) scores for new observations using data from an initial sample. In this paper, we demonstrate that naive approaches to PC score prediction can be substantially biased toward 0 in the analysis of large matrices. This phenomenon is largely related to known inconsistency results for sample eigenvalues and eigenvectors as both dimensions of the matrix increase. For the spiked eigenvalue model for random matrices, we expand the generality of these results, and propose bias-adjusted PC score prediction. In addition, we compute the asymptotic correlation coefficient between PC scores from sample and population eigenvectors. Simulation and real data examples from the genetics literature show the improved bias and numerical properties of our estimators.
Consistent Testing for Recurrent Genomic Aberrations
Vonn Walter,Fred A. Wright,Andrew B. Nobel
Quantitative Biology , 2014,
Abstract: Genomic aberrations, such as somatic copy number alterations, are frequently observed in tumor tissue. Recurrent aberrations, occurring in the same region across multiple subjects, are of interest because they may highlight genes associated with tumor development or progression. A number of tools have been proposed to assess the statistical significance of recurrent DNA copy number aberrations, but their statistical properties have not been carefully studied. Cyclic shift testing, a permutation procedure using independent random shifts of genomic marker observations on the genome, has been proposed to identify recurrent aberrations, and is potentially useful for a wider variety of purposes, including identifying regions with methylation aberrations or overrepresented in disease association studies. For data following a countable-state Markov model, we prove the asymptotic validity of cyclic shift $p$-values under a fixed sample size regime as the number of observed markers tends to infinity. We illustrate cyclic shift testing for a variety of data types, producing biologically relevant findings for three publicly available datasets.
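The cyclic shift idea described above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the data layout (one list of marker values per subject), the use of the maximum column sum as the test statistic, and the names `cyclic_shift` and `cyclic_shift_p_value` are assumptions made for this example.

```python
import random


def cyclic_shift(row, k):
    """Rotate one subject's marker values by k positions, preserving their order."""
    return row[k:] + row[:k]


def cyclic_shift_p_value(data, n_shift=1000, seed=0):
    """Estimate a p-value for the most recurrent marker by cyclic shifting.

    data: list of per-subject lists, all of the same length (markers).
    Statistic (assumed here): the maximum column sum across markers.
    """
    rng = random.Random(seed)
    n_markers = len(data[0])

    def max_column_sum(rows):
        return max(sum(col) for col in zip(*rows))

    observed = max_column_sum(data)
    hits = 0
    for _ in range(n_shift):
        # Each subject gets its own independent random shift, which keeps
        # within-subject structure but breaks alignment across subjects.
        shifted = [cyclic_shift(row, rng.randrange(n_markers)) for row in data]
        if max_column_sum(shifted) >= observed:
            hits += 1
    return (hits + 1) / (n_shift + 1)
```

Because every row is rotated rather than shuffled, the dependence between neighboring markers within a subject is retained under the null, which is the feature that distinguishes this procedure from a full permutation of marker positions.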
A statistical framework for testing functional categories in microarray data
William T. Barry,Andrew B. Nobel,Fred A. Wright
Statistics , 2008, DOI: 10.1214/07-AOAS146
Abstract: Ready access to emerging databases of gene annotation and functional pathways has shifted assessments of differential expression in DNA microarray studies from single genes to groups of genes with shared biological function. This paper takes a critical look at existing methods for assessing the differential expression of a group of genes (functional category), and provides some suggestions for improved performance. We begin by presenting a general framework, in which the set of genes in a functional category is compared to the complementary set of genes on the array. The framework includes tests for overrepresentation of a category within a list of significant genes, and methods that consider continuous measures of differential expression. Existing tests are divided into two classes. Class 1 tests assume gene-specific measures of differential expression are independent, despite overwhelming evidence of positive correlation. Analytic and simulated results are presented that demonstrate Class 1 tests are strongly anti-conservative in practice. Class 2 tests account for gene correlation, typically through array permutation that by construction has proper Type I error control for the induced null. However, both Class 1 and Class 2 tests use a null hypothesis that all genes have the same degree of differential expression. We introduce a more sensible and general (Class 3) null under which the profile of differential expression is the same within the category and complement. Under this broader null, Class 2 tests are shown to be conservative. We propose standard bootstrap methods for testing against the Class 3 null and demonstrate they provide valid Type I error control and more power than array permutation in simulated datasets and real microarray experiments.
Gene Expression in Peripheral Blood Leukocytes in Monozygotic Twins Discordant for Chronic Fatigue: No Evidence of a Biomarker
Andrea Byrnes,Andreas Jacks,Karin Dahlman-Wright,Birgitta Evengard,Fred A. Wright,Nancy L. Pedersen,Patrick F. Sullivan
PLOS ONE , 2012, DOI: 10.1371/journal.pone.0005805
Abstract: Chronic fatiguing illness remains a poorly understood syndrome of unknown pathogenesis. We attempted to identify biomarkers for chronic fatiguing illness using microarrays to query the transcriptome in peripheral blood leukocytes.
An Empirical Bayes Approach for Multiple Tissue eQTL Analysis
Gen Li,Andrey A. Shabalin,Ivan Rusyn,Fred A. Wright,Andrew B. Nobel
Statistics , 2013,
Abstract: Expression quantitative trait loci (eQTL) analyses, which identify genetic markers associated with the expression of a gene, are an important tool in the understanding of diseases in human and other populations. While most eQTL studies to date consider the connection between genetic variation and expression in a single tissue, complex, multi-tissue data sets are now being generated by the GTEx initiative. These data sets have the potential to improve the findings of single tissue analyses by borrowing strength across tissues, and the potential to elucidate the genotypic basis of differences between tissues. In this paper we introduce and study a multivariate hierarchical Bayesian model (MT-eQTL) for multi-tissue eQTL analysis. MT-eQTL directly models the vector of correlations between expression and genotype across tissues. It explicitly captures patterns of variation in the presence or absence of eQTLs, as well as the heterogeneity of effect sizes across tissues. Moreover, the model is applicable to complex designs in which the set of donors can (i) vary from tissue to tissue, and (ii) exhibit incomplete overlap between tissues. The MT-eQTL model is marginally consistent, in the sense that the model for a subset of tissues can be obtained from the full model via marginalization. Fitting of the MT-eQTL model is carried out via empirical Bayes, using an approximate EM algorithm. Inferences concerning eQTL detection and the configuration of eQTLs across tissues are derived from adaptive thresholding of local false discovery rates, and maximum a-posteriori estimation, respectively. We investigate the MT-eQTL model through a simulation study, and rigorously establish the FDR control of the local FDR testing procedure under mild assumptions appropriate for dependent data.
Discovering collectively informative descriptors from high-throughput experiments
Clark D Jeffries, William O Ward, Diana O Perkins, Fred A Wright
BMC Bioinformatics , 2009, DOI: 10.1186/1471-2105-10-431
Abstract: This paper describes a novel algorithm called BLANKET for symmetric analysis of two experiments that assess informativeness of descriptors. The experiments are required to be related only in that their descriptor sets intersect substantially and their definitions of case and control are consistent. From resulting lists of n descriptors ranked by informativeness, BLANKET determines shortlists of descriptors from each experiment, generally of different lengths p and q. For any pair of shortlists, four numbers are evident: the number of descriptors appearing in both shortlists, in only the first shortlist, in only the second shortlist, or in neither shortlist. From the associated contingency table, BLANKET computes Right Fisher Exact Test (RFET) values used as scores over a plane of possible pairs of shortlist lengths [1,2]. BLANKET then chooses a pair or pairs with RFET score less than a threshold; the threshold depends upon n and shortlist length limits and represents a quality of intersection achieved by less than 5% of random lists. Researchers seek within a universe of descriptors some minimal subset that collectively and efficiently predicts experimental outcomes. Ideally, any smaller subset should be insufficient for reliable prediction and any larger subset should have little additional accuracy. As a method, BLANKET is easy to conceptualize and presents only moderate computational complexity. Many existing databases could be mined using BLANKET to suggest optimal sets of predictive descriptors. In contemporary high-throughput experiments, very many descriptor values can be measured, leading to the issue of correction for multiple testing to minimize false positives at the cost of a high number of false negatives. Reconciliation entails compromises that are to some extent arbitrary. A deterministic method is needed for selecting a minimal, distinguished set of descriptors that collectively provide effective, efficient prediction. Researchers can subsequently investigate members of such a subset
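The scoring step in this abstract, a right-tailed Fisher exact test on the 2 × 2 table formed by two shortlists, can be illustrated with the hypergeometric distribution. The sketch below is not the BLANKET code; the function names and the assumption that both experiments rank the same n descriptors are made here for illustration only.

```python
from math import comb


def right_fisher_exact(n, p, q, k):
    """Right-tailed Fisher exact probability P(overlap >= k).

    n: total number of descriptors; p, q: shortlist lengths;
    k: number of descriptors appearing in both shortlists.
    """
    upper = min(p, q)
    total = comb(n, q)
    tail = sum(comb(p, i) * comb(n - p, q - i) for i in range(k, upper + 1))
    return tail / total


def shortlist_score(ranked_a, ranked_b, p, q):
    """Score one (p, q) pair of shortlist lengths for two ranked lists.

    Assumes (for this sketch) that both lists rank the same n descriptors.
    """
    n = len(ranked_a)
    k = len(set(ranked_a[:p]) & set(ranked_b[:q]))
    return right_fisher_exact(n, p, q, k)
```

Evaluating `shortlist_score` over a grid of (p, q) values gives the kind of score plane the abstract mentions; how the threshold is derived from n and the shortlist length limits is not specified in the abstract and is left out here.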
Heading Down the Wrong Pathway: on the Influence of Correlation within Gene Sets
Daniel M Gatti, William T Barry, Andrew B Nobel, Ivan Rusyn, Fred A Wright
BMC Genomics , 2010, DOI: 10.1186/1471-2164-11-574
Abstract: We conducted a meta-analysis of over 200 datasets from the Gene Expression Omnibus in order to demonstrate the practical impact of strong gene correlation patterns that are highly consistent across experiments. We show that a common independence assumption-based gene set testing procedure produces very high false positive rates when applied to data sets for which treatment groups have been randomized, and that gene sets with high internal correlation are more likely to be declared significant. A reanalysis of the same datasets using an array resampling approach properly controls false positive rates, leading to more parsimonious and high-confidence gene set findings, which should facilitate pathway-based interpretation of the microarray data. These findings call into question many of the gene set testing results in the literature and argue strongly for the adoption of resampling based gene set testing criteria in the peer reviewed biomedical literature. Methods for statistical analysis of gene expression microarrays are maturing rapidly, and there are a variety of approaches to normalization, detection of differential expression, clustering, and class prediction [1]. In many experiments, a statistical test is performed to identify genes significantly associated with experimental condition, clinical response, or other sample attributes. The resulting list of significant genes may be so large that it defies easy interpretation, and it is natural to seek a concise, biological summary of results. One such approach is gene set testing (sometimes called "pathway analysis"), which detects over-representation of gene sets among the list of significant genes. Gene sets may be curated [2], or derived from databases such as Gene Ontology (GO) [3] or Kyoto Encyclopedia of Genes and Genomes (KEGG) [4]. The simplest approach to gene set testing relies on 2 × 2 tables of gene set membership (in gene set or not) vs. significance (significant or not). Gene set testing is often performed
Microarray analysis of peripheral blood lymphocytes from ALS patients and the SAFE detection of the KEGG ALS pathway
Jean-Luc C Mougeot, Zhen Li, Andrea E Price, Fred A Wright, Benjamin R Brooks
BMC Medical Genomics , 2011, DOI: 10.1186/1755-8794-4-74
Abstract: Differentially expressed genes were determined by LIMMA (Linear Models for MicroArray) and SAM (Significance Analysis of Microarrays) analyses. The SAFE (Significance Analysis of Function and Expression) procedure was used to identify molecular pathway perturbations. Proteasome inhibition assays were conducted on cultured peripheral blood mononuclear cells (PBMCs) from ALS patients to confirm alteration of the Ubiquitin/Proteasome System (UPS). For the first time, using SAFE in a global gene ontology analysis (gene set size 5-100), we show significant perturbation of the KEGG (Kyoto Encyclopedia of Genes and Genomes) ALS pathway of motor neuron degeneration in PBLs from ALS patients. This was the only KEGG disease pathway significantly upregulated among 25, and contributing genes, including SOD1, represented 54% of the encoded proteins or protein complexes of the KEGG ALS pathway. Further SAFE analysis, including gene set sizes >100, showed that only neurodegenerative diseases (4 out of 34 disease pathways) including ALS were significantly upregulated. Changes in UBR2 expression correlated inversely with time since onset of disease and directly with ALSFRS-R, implying that UBR2 was increased early in the course of ALS. Cultured PBMCs from ALS patients accumulated more ubiquitinated proteins than PBMCs from healthy controls in a serum-dependent manner confirming changes in this pathway. Our study indicates that PBLs from sALS patients are strong responders to systemic signals or local signals acquired by cell trafficking, representing changes in gene expression similar to those present in brain and spinal cord of sALS patients. PBLs may provide a useful means to study ALS pathogenesis. Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease causing muscle weakness and wasting resulting from the loss of motor neurons in brain and spinal cord characterized by ubiquitinated inclusions in brain and spinal cord of post mortem ALS patients [1]. Several
Copyright © 2008-2017 Open Access Library. All rights reserved.