Search Results: 1 - 10 of 100 matches
All listed articles are free for downloading (OA Articles)
Determinantal Point Process Priors for Bayesian Variable Selection in Linear Regression  [PDF]
Mutsuki Kojima, Fumiyasu Komaki
Statistics, 2014
Abstract: We propose discrete determinantal point processes (DPPs) for priors on the model parameter in Bayesian variable selection. By our variable selection method, collinear predictors are less likely to be selected simultaneously because of the repulsion property of discrete DPPs. Three types of DPP priors are proposed. We show the efficiency of the proposed priors through numerical experiments and applications to collinear datasets.
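The repulsion property described in this abstract can be illustrated numerically. The sketch below is an illustrative construction, not the authors' exact priors: it uses the Gram matrix of unit-norm predictors as an L-ensemble kernel, so a subset S of predictors gets unnormalized prior probability det(L_S), which is small when S contains collinear columns.

```python
import numpy as np

# Illustrative L-ensemble DPP: subset S has unnormalized probability
# det(L[S, S]). With L the Gram matrix of unit-norm predictors,
# collinear subsets receive low prior probability (the "repulsion").
rng = np.random.default_rng(0)
n = 100
x1 = rng.standard_normal(n)
x2 = x1 + 0.05 * rng.standard_normal(n)   # nearly collinear with x1
x3 = rng.standard_normal(n)               # roughly independent of x1

X = np.column_stack([x1, x2, x3])
X /= np.linalg.norm(X, axis=0)            # unit-norm columns
L = X.T @ X                               # kernel (Gram) matrix

def dpp_weight(subset):
    """Unnormalized DPP probability det(L_S) for an index subset S."""
    sub = L[np.ix_(subset, subset)]
    return np.linalg.det(sub)

collinear = dpp_weight([0, 1])   # {x1, x2}: near-singular minor, tiny det
diverse = dpp_weight([0, 2])     # {x1, x3}: near-orthogonal, det near 1
```

Because det(L_S) shrinks as columns in S become linearly dependent, sampling model indicators from such a prior discourages the joint selection of collinear predictors, which is the mechanism the abstract exploits.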
Bayesian variable selection with shrinking and diffusing priors  [PDF]
Naveen Naidu Narisetty, Xuming He
Statistics, 2014, DOI: 10.1214/14-AOS1207
Abstract: We consider a Bayesian approach to variable selection in the presence of high dimensional covariates based on a hierarchical model that places prior distributions on the regression coefficients as well as on the model space. We adopt the well-known spike and slab Gaussian priors with a distinct feature: the prior variances depend on the sample size, through which appropriate shrinkage can be achieved. We show the strong selection consistency of the proposed method in the sense that the posterior probability of the true model converges to one even when the number of covariates grows nearly exponentially with the sample size. This is arguably the strongest selection consistency result that has been available in the Bayesian variable selection literature; yet the proposed method can be carried out through posterior sampling with a simple Gibbs sampler. Furthermore, we argue that the proposed method is asymptotically similar to model selection with the $L_0$ penalty. We also demonstrate through empirical work the good performance of the proposed approach relative to some state-of-the-art alternatives.
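The sample-size-dependent spike-and-slab idea can be sketched for a single coefficient under an orthonormal design, where the least-squares estimate satisfies beta_hat ~ N(beta_j, sigma^2/n) and hence is marginally N(0, sigma^2/n + tau^2) under each mixture component. The variance rates and hyperparameters below are illustrative assumptions, not the paper's exact conditions.

```python
from math import log, pi, exp

def normal_logpdf(x, var):
    # Log density of N(0, var) evaluated at x.
    return -0.5 * (log(2.0 * pi * var) + x * x / var)

def inclusion_prob(beta_hat, n, sigma_sq, tau0_sq, tau1_sq, q=0.1):
    """Posterior inclusion probability for one coefficient under an
    orthonormal design: beta_hat is marginally N(0, sigma_sq/n + tau^2)
    under the spike (tau0) and slab (tau1) components. The prior
    inclusion probability q is an illustrative choice."""
    l0 = normal_logpdf(beta_hat, sigma_sq / n + tau0_sq)   # spike
    l1 = normal_logpdf(beta_hat, sigma_sq / n + tau1_sq)   # slab
    log_odds = log(q / (1.0 - q)) + l1 - l0
    return 1.0 / (1.0 + exp(-log_odds))

# Shrinking spike / diffusing slab: variances that move with n
# (illustrative rates, not the paper's exact scaling).
n = 200
tau0_sq, tau1_sq = 1.0 / n, 5.0
strong = inclusion_prob(1.0, n, 1.0, tau0_sq, tau1_sq)   # clear signal
weak = inclusion_prob(0.01, n, 1.0, tau0_sq, tau1_sq)    # near-zero effect
```

As n grows, the spike variance shrinks toward zero and the slab diffuses, so small estimates are pulled decisively into the spike while genuine signals retain inclusion probability near one.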
Bayesian variable selection with spherically symmetric priors  [PDF]
M. B. De Kock, H. C. Eggers
Statistics, 2014
Abstract: We propose that Bayesian variable selection for linear parametrisations with Gaussian iid likelihoods be based on the spherical symmetry of the diagonalised parameter space. Our r-prior results in closed forms for the evidence for four examples, including the hyper-g prior and the Zellner-Siow prior, which are shown to be special cases. Scenarios of a single variable dispersion parameter and of fixed dispersion are studied, and asymptotic forms comparable to the traditional information criteria are derived. A simulation exercise shows that model comparison based on our r-prior gives good results comparable to or better than current model comparison schemes.
Variable Selection in Bayesian Semiparametric Regression Models  [PDF]
Ofir Harari, David M. Steinberg
Statistics, 2014
Abstract: In this paper we extend existing Bayesian methods for variable selection in Gaussian process regression, to select both the regression terms and the active covariates in the spatial correlation structure. We then use the estimated posterior probabilities to choose between relatively few modes through cross-validation, and consequently improve prediction.
Robust Bayesian variable selection with sub-harmonic priors  [PDF]
Yuzo Maruyama, William E. Strawderman
Statistics, 2010
Abstract: This paper studies Bayesian variable selection in linear models with general spherically symmetric error distributions. We propose sub-harmonic priors which arise as a class of mixtures of Zellner's g-priors for which the Bayes factors are independent of the underlying error distribution, as long as it is in the spherically symmetric class. Because of this invariance to the spherically symmetric error distribution, we refer to our method as a robust Bayesian variable selection method. We demonstrate that our Bayes factors have model selection consistency and are coherent. We also develop Laplace approximations to Bayes factors for a number of mixtures of g-priors that have recently appeared in the literature (including our own) for Gaussian errors. These approximations, in each case, are given by the Gaussian Bayes factor based on BIC times a simple rational function of the prior's hyper-parameters and the R^2's for the respective models. We also extend model selection consistency for several g-prior based Bayes factor methods for Gaussian errors to the entire class of spherically symmetric error distributions. Additionally we demonstrate that sub-harmonic priors are the only ones, within a large class of mixtures of g-priors studied in the literature, that are robust in our sense. A simulation study and an analysis of two real data sets indicate good performance of our robust Bayes factors relative to BIC and to other mixture of g-prior based methods.
An Integrative Framework for Bayesian Variable Selection with Informative Priors for Identifying Genes and Pathways  [PDF]
Bin Peng, Dianwen Zhu, Bradley P. Ander, Xiaoshuai Zhang, Fuzhong Xue, Frank R. Sharp, Xiaowei Yang
PLOS ONE, 2013, DOI: 10.1371/journal.pone.0067672
Abstract: The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses have difficulty incorporating correlational, structural, or functional relationships amongst the molecular measures. For microarray gene expression data, we first summarize solutions for dealing with ‘large p, small n’ problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point of view of systems biology, iBVS enables users to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilistic and biological interpretations. Both simulated data and a set of microarray data in predicting stroke status are used to validate the performance of iBVS in a probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiency are also discussed.
Bayesian Variable Selection and Estimation for Group Lasso  [PDF]
Xiaofan Xu, Malay Ghosh
Statistics, 2015, DOI: 10.1214/14-BA929
Abstract: The paper revisits the Bayesian group lasso and uses spike and slab priors for group variable selection. In the process, the connection of our model with penalized regression is demonstrated, and the role of posterior median for thresholding is pointed out. We show that the posterior median estimator has the oracle property for group variable selection and estimation under orthogonal designs, while the group lasso has suboptimal asymptotic estimation rate when variable selection consistency is achieved. Next we consider bi-level selection problem and propose the Bayesian sparse group selection again with spike and slab priors to select variables both at the group level and also within a group. We demonstrate via simulation that the posterior median estimator of our spike and slab models has excellent performance for both variable selection and estimation.
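The role of the posterior median as a thresholding rule, noted in this abstract, can be seen directly from MCMC draws: whenever more than half of the draws place a group coefficient in the spike (exactly zero), its posterior median is exactly zero and the group is deselected. The draws below are a hypothetical toy construction, not output from the authors' sampler.

```python
import numpy as np

# Toy posterior draws for one group coefficient: with probability 0.3 the
# draw comes from the slab (centered at 2), otherwise from the spike
# (exactly 0). Hypothetical draws for illustration only.
rng = np.random.default_rng(1)
in_slab = rng.random(1000) < 0.3
draws = np.where(in_slab, rng.standard_normal(1000) + 2.0, 0.0)

# The posterior median thresholds: a majority of exact zeros among the
# draws yields an estimate of exactly zero, deselecting the group.
estimate = np.median(draws)
```

A posterior mean, by contrast, would average the slab draws in and almost never return an exact zero, which is why the median is the estimator with the sparsity (and, per the abstract, oracle) property.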
MCMC algorithms for Bayesian variable selection in the logistic regression model for large-scale genomic applications  [PDF]
Manuela Zucknick, Sylvia Richardson
Statistics, 2014
Abstract: In large-scale genomic applications vast numbers of molecular features are scanned in order to find a small number of candidates which are linked to a particular disease or phenotype. This is a variable selection problem in the "large p, small n" paradigm where many more variables than samples are available. Additionally, a complex dependence structure is often observed among the markers/genes due to their joint involvement in biological processes and pathways. Bayesian variable selection methods that introduce sparseness through additional priors on the model size are well suited to the problem. However, the model space is very large and standard Markov chain Monte Carlo (MCMC) algorithms such as a Gibbs sampler sweeping over all p variables in each iteration are often computationally infeasible. We propose to employ the dependence structure in the data to decide which variables should always be updated together and which are nearly conditionally independent and hence do not need to be considered together. Here, we focus on binary classification applications. We follow the implementation of the Bayesian probit regression model by Albert and Chib (1993) and the Bayesian logistic regression model by Holmes and Held (2006), which both lead to marginal Gaussian distributions. We investigate several MCMC samplers using the dependence structure in different ways. The mixing and convergence performances of the resulting Markov chains are evaluated and compared to standard samplers in two simulation studies and in an application to a real gene expression data set.
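The idea of using the dependence structure to decide which variables to update jointly can be sketched with a greedy correlation-threshold grouping. This is an illustrative stand-in for the paper's construction; the threshold value and the greedy rule are assumptions made here for the sketch.

```python
import numpy as np

def correlation_blocks(X, thresh=0.8):
    """Greedy grouping: variables whose absolute correlation with the
    block seed exceeds `thresh` are placed in one block and would be
    updated jointly by the sampler; the rest are treated as nearly
    conditionally independent. A sketch of the idea, not the paper's
    exact algorithm."""
    p = X.shape[1]
    R = np.abs(np.corrcoef(X, rowvar=False))
    unassigned = set(range(p))
    blocks = []
    while unassigned:
        j = min(unassigned)
        block = {j} | {k for k in unassigned if R[j, k] > thresh}
        blocks.append(sorted(block))
        unassigned -= block
    return blocks

# Toy data: columns 0 and 1 strongly dependent, column 2 independent.
rng = np.random.default_rng(0)
a = rng.standard_normal(200)
b = a + 0.01 * rng.standard_normal(200)
c = rng.standard_normal(200)
blocks = correlation_blocks(np.column_stack([a, b, c]))
```

A sampler sweeping over such blocks, rather than over all p variables individually, updates strongly dependent indicators together, which is the mixing improvement the abstract targets.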
Bayesian Variable Selection with Related Predictors  [PDF]
Hugh Chipman
Physics, 1995
Abstract: In data sets with many predictors, algorithms for identifying a good subset of predictors are often used. Most such algorithms do not account for any relationships between predictors. For example, stepwise regression might select a model containing an interaction AB but neither main effect A nor B. This paper develops mathematical representations of this and other relations between predictors, which may then be incorporated in a model selection procedure. A Bayesian approach that goes beyond the standard independence prior for variable selection is adopted, and preference for certain models is interpreted as prior information. Priors relevant to arbitrary interactions and polynomials, dummy variables for categorical factors, competing predictors, and restrictions on the size of the models are developed. Since the relations developed are for priors, they may be incorporated in any Bayesian variable selection algorithm for any type of linear model. The application of the methods here is illustrated via the Stochastic Search Variable Selection algorithm of George and McCulloch (1993), which is modified to utilize the new priors. The performance of the approach is illustrated with two constructed examples and a computer performance dataset.
Keywords: Model Selection, Prior Distributions, Interaction, Dummy Variable
Bayesian variable selection regression for genome-wide association studies and other large-scale problems  [PDF]
Yongtao Guan, Matthew Stephens
Statistics, 2011, DOI: 10.1214/11-AOAS455
Abstract: We consider applying Bayesian Variable Selection Regression, or BVSR, to genome-wide association studies and similar large-scale regression problems. Currently, typical genome-wide association studies measure hundreds of thousands, or millions, of genetic variants (SNPs), in thousands or tens of thousands of individuals, and attempt to identify regions harboring SNPs that affect some phenotype or outcome of interest. This goal can naturally be cast as a variable selection regression problem, with the SNPs as the covariates in the regression. Characteristic features of genome-wide association studies include the following: (i) a focus primarily on identifying relevant variables, rather than on prediction; and (ii) many relevant covariates may have tiny effects, making it effectively impossible to confidently identify the complete "correct" subset of variables. Taken together, these factors put a premium on having interpretable measures of confidence for individual covariates being included in the model, which we argue is a strength of BVSR compared with alternatives such as penalized regression methods. Here we focus primarily on analysis of quantitative phenotypes, and on appropriate prior specification for BVSR in this setting, emphasizing the idea of considering what the priors imply about the total proportion of variance in outcome explained by relevant covariates. We also emphasize the potential for BVSR to estimate this proportion of variance explained, and hence shed light on the issue of "missing heritability" in genome-wide association studies.
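The notion of a prior's implied proportion of variance explained (PVE), emphasized in this abstract, can be written as h = V_g / (V_g + sigma^2), where V_g is the variance in the outcome contributed by the included covariates. The sketch below assumes independent, standardized covariates, a simplification made here for illustration rather than by the paper.

```python
import numpy as np

# Implied PVE under an assumed sparse effect-size prior:
#   h = V_g / (V_g + sigma^2),  V_g = sum_j beta_j^2 * Var(x_j)
# for the covariates included in the model, assuming independent
# standardized covariates (an illustrative simplification).
def implied_pve(betas, sigma_sq, x_var=1.0):
    v_g = float(np.sum(np.square(betas))) * x_var
    return v_g / (v_g + sigma_sq)

h = implied_pve([1.0, 1.0], 2.0)   # V_g = 2, so h = 2 / (2 + 2) = 0.5
```

Reasoning in the reverse direction, from a plausible prior on h back to the effect-size distribution, is what lets such a prior encode beliefs about total signal rather than about individual coefficients.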

Copyright © 2008-2017 Open Access Library. All rights reserved.