Search Results: 1 - 10 of 100 matches for " "
All listed articles are free for downloading (OA Articles)
A Bayesian alternative to mutual information for the hierarchical clustering of dependent random variables  [PDF]
Guillaume Marrelec,Arnaud Messé,Pierre Bellec
Computer Science , 2015, DOI: 10.1371/journal.pone.0137278
Abstract: The use of mutual information as a similarity measure in agglomerative hierarchical clustering (AHC) raises an important issue: some correction needs to be applied for the dimensionality of variables. In this work, we formulate the decision of merging dependent multivariate normal variables in an AHC procedure as a Bayesian model comparison. We found that the Bayesian formulation naturally shrinks the empirical covariance matrix towards a matrix set a priori (e.g., the identity), provides an automated stopping rule, and corrects for dimensionality using a term that scales up the measure as a function of the dimensionality of the variables. Also, the resulting log Bayes factor is asymptotically proportional to the plug-in estimate of mutual information, with an additive correction for dimensionality in agreement with the Bayesian information criterion. We investigated the behavior of these Bayesian alternatives (in exact and asymptotic forms) to mutual information on simulated and real data. An encouraging result was first derived on simulations: the hierarchical clustering based on the log Bayes factor outperformed off-the-shelf clustering techniques as well as raw and normalized mutual information in terms of classification accuracy. On a toy example, we found that the Bayesian approaches led to results that were similar to those of mutual information clustering techniques, with the advantage of an automated thresholding. On real functional magnetic resonance imaging (fMRI) datasets measuring brain activity, it identified clusters consistent with the established outcome of standard procedures. On this application, normalized mutual information had a highly atypical behavior, in the sense that it systematically favored very large clusters. These initial experiments suggest that the proposed Bayesian alternatives to mutual information are a useful new tool for hierarchical clustering.
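The asymptotic relationship described in this abstract can be illustrated with a minimal sketch. It assumes jointly Gaussian data, uses the maximum-likelihood covariance estimate, and applies a generic BIC-style penalty of half the number of cross-covariance parameters times log n. The function names `gaussian_mi` and `bic_merge_score` are hypothetical, and this is not the authors' exact Bayes factor, which also involves a prior on the covariance matrix.

```python
import numpy as np

def gaussian_mi(cov, idx_x, idx_y):
    """Plug-in mutual information (nats) between two blocks of Gaussian variables."""
    idx_xy = list(idx_x) + list(idx_y)
    _, logdet_x = np.linalg.slogdet(cov[np.ix_(idx_x, idx_x)])
    _, logdet_y = np.linalg.slogdet(cov[np.ix_(idx_y, idx_y)])
    _, logdet_xy = np.linalg.slogdet(cov[np.ix_(idx_xy, idx_xy)])
    return 0.5 * (logdet_x + logdet_y - logdet_xy)

def bic_merge_score(data, idx_x, idx_y):
    """n * MI minus a BIC-style penalty; positive values favor merging the blocks."""
    n = data.shape[0]
    cov = np.cov(data, rowvar=False, bias=True)  # maximum-likelihood estimate
    mi = gaussian_mi(cov, idx_x, idx_y)
    # Extra free parameters of the dependent model: the cross-covariance entries.
    n_extra = len(idx_x) * len(idx_y)
    return n * mi - 0.5 * n_extra * np.log(n)

# Synthetic illustration: y depends on x, z is independent of both.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 2))
y = x @ np.array([[1.0, 0.5], [0.0, 1.0]]) + 0.5 * rng.normal(size=(n, 2))
z = rng.normal(size=(n, 2))
data = np.hstack([x, y, z])
print("x vs y:", bic_merge_score(data, [0, 1], [2, 3]))
print("x vs z:", bic_merge_score(data, [0, 1], [4, 5]))
```

In this sketch the sign of the score plays the role of the automated stopping rule mentioned in the abstract: a pair of blocks would be merged only when the score is positive.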
Linking Bovine Tuberculosis on Cattle Farms to White-Tailed Deer and Environmental Variables Using Bayesian Hierarchical Analysis  [PDF]
W. David Walter, Rick Smith, Mike Vanderklok, Kurt C. VerCauteren
PLOS ONE , 2014, DOI: 10.1371/journal.pone.0090925
Abstract: Bovine tuberculosis is a bacterial disease caused by Mycobacterium bovis in livestock and wildlife with hosts that include Eurasian badgers (Meles meles), brushtail possums (Trichosurus vulpecula), and white-tailed deer (Odocoileus virginianus). Risk-assessment efforts in Michigan have been initiated on farms to minimize interactions of cattle with wildlife hosts, but research on M. bovis on cattle farms has not investigated the spatial context of disease epidemiology. To incorporate spatially explicit data, initial likelihood of infection probabilities for cattle farms tested for M. bovis, prevalence of M. bovis in white-tailed deer, deer density, and environmental variables for each farm were modeled in a Bayesian hierarchical framework. We used geo-referenced locations of 762 cattle farms that have been tested for M. bovis, white-tailed deer prevalence, and several environmental variables that may lead to long-term survival and viability of M. bovis on farms and surrounding habitats (i.e., soil type, habitat type). Bayesian hierarchical analyses identified deer prevalence and proportion of sandy soil within our sampling grid as the most supported model. Analysis of cattle farms tested for M. bovis identified that every 1% increase in sandy soil resulted in a 4% increase in the odds of infection. Our analysis revealed that the influence of prevalence of M. bovis in white-tailed deer was still a concern even after considerable efforts to prevent cattle interactions with white-tailed deer through on-farm mitigation and reduction in the deer population. Cattle farms test positive for M. bovis annually in our study area, suggesting that an environmental source, either on farms or in the surrounding landscape, may be contributing to new infections or re-infections with M. bovis. Our research provides an initial assessment of potential environmental factors that could be incorporated into additional modeling efforts as more knowledge of deer herd factors and cattle farm prevalence is documented.
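To make the reported effect size concrete, the following sketch works through the arithmetic of a 4% increase in odds per 1% increase in sandy soil. It assumes the effect is multiplicative on the odds scale (as in a logistic-type model), which the abstract does not state explicitly. The baseline infection probability of 5% is purely hypothetical and is not taken from the study.

```python
def scaled_odds_ratio(or_per_unit, units):
    """Odds ratio for `units` one-unit increases, assuming effects multiply on the odds scale."""
    return or_per_unit ** units

def probability_from_odds(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)

# Odds ratio of 1.04 per 1% sandy soil, taken from the abstract.
or_10 = scaled_odds_ratio(1.04, 10)  # effect of a 10-point increase in sandy soil

baseline_prob = 0.05  # hypothetical baseline, NOT from the study
baseline_odds = baseline_prob / (1.0 - baseline_prob)
new_prob = probability_from_odds(baseline_odds * or_10)

print(f"odds ratio for +10% sandy soil: {or_10:.3f}")
print(f"probability moves from {baseline_prob:.3f} to {new_prob:.3f}")
```

Under these assumptions a 10-point increase in sandy soil multiplies the odds by about 1.48, not 1.40, because the per-unit effects compound.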
Bayesian Hierarchical Mixtures of Experts  [PDF]
Christopher M. Bishop,Markus Svensen
Computer Science , 2012,
Abstract: The Hierarchical Mixture of Experts (HME) is a well-known tree-based model for regression and classification, based on soft probabilistic splits. In its original formulation it was trained by maximum likelihood, and is therefore prone to over-fitting. Furthermore, the maximum likelihood framework offers no natural metric for optimizing the complexity and structure of the tree. Previous attempts to provide a Bayesian treatment of the HME model have relied either on ad-hoc local Gaussian approximations or have dealt with related models representing the joint distribution of both input and output variables. In this paper we describe a fully Bayesian treatment of the HME model based on variational inference. By combining local and global variational methods we obtain a rigorous lower bound on the marginal probability of the data under the model. This bound is optimized during the training phase, and its resulting value can be used for model order selection. We present results using this approach for a data set describing robot arm kinematics.
Fast Out-of-Sample Predictions for Bayesian Hierarchical Models of Latent Health States  [PDF]
Aaron J Fisher,R Yates Coley,Scott L Zeger
Statistics , 2015,
Abstract: Hierarchical Bayesian models can be especially useful in precision medicine settings, where clinicians are interested in estimating the patient-level latent variables associated with an individual's current health state and its trajectory. Such models are often fit using batch Markov Chain Monte Carlo (MCMC). However, the slow speed of batch MCMC computation makes it difficult to implement in clinical settings, where immediate latent variable estimates are often desired in response to new patient data. In this report, we discuss how importance sampling (IS) can instead be used to obtain fast, in-clinic estimates of patient-level latent variables. We apply IS to the hierarchical model proposed in Coley et al. (2015) for predicting an individual's underlying prostate cancer state. We find that latent variable estimates via IS can typically be obtained in 1-10 seconds per person and have high agreement with estimates coming from longer-running batch MCMC methods. Alternative options for out-of-sample fitting and online updating are also discussed.
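The idea of replacing a fresh MCMC run with importance sampling can be sketched on a toy model. The example below assumes a single latent variable with a standard normal prior and normally distributed observations with known noise, and it uses the prior as the proposal distribution. These choices, and the function names, are illustrative assumptions rather than the model or proposal used in the report. The toy model is chosen because its exact posterior mean is available for comparison.

```python
import numpy as np

def is_posterior_mean(y, sigma, n_draws=20000, seed=0):
    """Self-normalized importance sampling estimate of E[theta | y].

    Toy model (an assumption): theta ~ N(0, 1), y_i ~ N(theta, sigma^2).
    The prior serves as the proposal, so weights are proportional to the likelihood.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, 1.0, size=n_draws)
    # Log-likelihood of all observations under each proposed theta.
    log_w = -0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1) / sigma**2
    log_w -= log_w.max()  # stabilize before exponentiating
    w = np.exp(log_w)
    w /= w.sum()
    ess = 1.0 / np.sum(w**2)  # effective sample size diagnostic
    return float(np.sum(w * theta)), float(ess)

def exact_posterior_mean(y, sigma):
    """Closed-form posterior mean for the same conjugate toy model."""
    precision = 1.0 + len(y) / sigma**2
    return float(y.sum() / sigma**2 / precision)

y_new = np.array([0.8, 1.1, 0.5])  # hypothetical new patient measurements
estimate, ess = is_posterior_mean(y_new, sigma=1.0)
print("IS estimate:", estimate, "effective sample size:", ess)
print("exact value:", exact_posterior_mean(y_new, sigma=1.0))
```

The effective sample size is reported because prior-as-proposal importance sampling degrades when the new data are very informative; a low value would signal that the estimate should not be trusted.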
Rejoinder: Bayesian Checking of the Second Levels of Hierarchical Models  [PDF]
M. J. Bayarri,M. E. Castellanos
Statistics , 2008, DOI: 10.1214/07-STS235REJ
Abstract: Rejoinder: Bayesian Checking of the Second Levels of Hierarchical Models [arXiv:0802.0743]
Comment: Bayesian Checking of the Second Levels of Hierarchical Models  [PDF]
Valen E. Johnson
Statistics , 2008, DOI: 10.1214/07-STS235D
Abstract: Comment: Bayesian Checking of the Second Levels of Hierarchical Models [arXiv:0802.0743]
Comment: Bayesian Checking of the Second Levels of Hierarchical Models  [PDF]
Andrew Gelman
Statistics , 2008, DOI: 10.1214/07-STS235A
Abstract: Comment: Bayesian Checking of the Second Levels of Hierarchical Models [arXiv:0802.0743]
Bayesian Analysis for Semiparametric Reproductive Dispersion Models with Nonignorably Missing Data
CHEN Xuedong
系统科学与数学 , 2010,
Abstract: The semiparametric reproductive dispersion model (SRDNM) is an extension of reproductive dispersion models and semiparametric regression models, and includes the generalized partial linear model and the semiparametric generalized linear model as special cases. A method is proposed to obtain Bayesian estimates and to select an appropriate model based on the Bayes factor for such models with missing data in both covariates and response. First, nonparametric components are fitted by penalized splines and a Bayesian hierarchical model is set up for the smoothing parameters; then latent variables are introduced and a collapsed Gibbs sampler is implemented in order to improve the mixing and robustness of the MCMC. Finally, simulated and real datasets are presented to illustrate the proposed methods.
Risk and Regret of Hierarchical Bayesian Learners  [PDF]
Jonathan H. Huggins,Joshua B. Tenenbaum
Computer Science , 2015,
Abstract: Common statistical practice has shown that the full power of Bayesian methods is not realized until hierarchical priors are used, as these allow for greater "robustness" and the ability to "share statistical strength." Yet it is an ongoing challenge to provide a learning-theoretically sound formalism of such notions that: offers practical guidance concerning when and how best to utilize hierarchical models; provides insights into what makes for a good hierarchical prior; and, when the form of the prior has been chosen, can guide the choice of hyperparameter settings. We present a set of analytical tools for understanding hierarchical priors in both the online and batch learning settings. We provide regret bounds under log-loss, which show how certain hierarchical models compare, in retrospect, to the best single model in the model class. We also show how to convert a Bayesian log-loss regret bound into a Bayesian risk bound for any bounded loss, a result which may be of independent interest. Risk and regret bounds for Student's $t$ and hierarchical Gaussian priors allow us to formalize the concepts of "robustness" and "sharing statistical strength." Priors for feature selection are investigated as well. Our results suggest that the learning-theoretic benefits of using hierarchical priors can often come at little cost on practical problems.
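The notion of log-loss regret against the best single model can be illustrated with a small sketch. It assumes a finite class of Bernoulli models with a uniform prior, for which the regret of the Bayesian mixture is known to lie between 0 and the log of the number of models. This is a textbook simplification chosen for illustration, not the hierarchical-prior bounds derived in the paper, and the function names are hypothetical.

```python
import numpy as np

def mixture_log_loss(xs, params, prior):
    """Cumulative log-loss of the Bayesian mixture predicting a binary sequence online."""
    w = np.asarray(prior, dtype=float).copy()
    total = 0.0
    for x in xs:
        lik = np.where(x == 1, params, 1.0 - params)  # each model's probability of x
        pred = float(w @ lik)                         # mixture predictive probability
        total -= np.log(pred)
        w = w * lik / pred                            # Bayes update of model weights
    return total

def best_single_log_loss(xs, params):
    """Cumulative log-loss of the best fixed model in hindsight."""
    xs = np.asarray(xs)
    ones = xs.sum()
    zeros = len(xs) - ones
    losses = -(ones * np.log(params) + zeros * np.log(1.0 - params))
    return float(losses.min())

rng = np.random.default_rng(1)
xs = (rng.random(200) < 0.7).astype(int)      # synthetic binary sequence
params = np.array([0.1, 0.3, 0.5, 0.7, 0.9])  # finite model class (assumed)
prior = np.full(len(params), 1.0 / len(params))

regret = mixture_log_loss(xs, params, prior) - best_single_log_loss(xs, params)
print("regret:", regret, "upper bound log K:", np.log(len(params)))
```

The bound holds because the mixture's probability of the whole sequence is at least the prior weight of the best model times that model's probability, so the extra loss never exceeds the negative log of that prior weight.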
A Bayesian Hierarchical Model for Relating Multiple SNPs within Multiple Genes to Disease Risk  [PDF]
Lewei Duan,Duncan C. Thomas
International Journal of Genomics , 2013, DOI: 10.1155/2013/406217
Abstract: A variety of methods have been proposed for studying the association of multiple genes thought to be involved in a common pathway for a particular disease. Here, we present an extension of a Bayesian hierarchical modeling strategy that allows for multiple SNPs within each gene, with external prior information at either the SNP or gene level. The model involves variable selection at the SNP level through latent indicator variables and Bayesian shrinkage at the gene level towards a prior mean vector and covariance matrix that depend on external information. The entire model is fitted using Markov chain Monte Carlo methods. Simulation studies show that the approach is capable of recovering many of the truly causal SNPs and genes, depending upon their frequency and size of their effects. The method is applied to data on 504 SNPs in 38 candidate genes involved in DNA damage response in the WECARE study of second breast cancers in relation to radiotherapy exposure.
1. Introduction
The Women’s Environment, Cancer And Radiation Epidemiology (WECARE) study [1] is aimed at a comprehensive examination of genes involved in particular functional pathways. The study is a population-based nested case-control study of 708 contralateral breast cancers (CBC) within a notional cohort of about 65,000 survivors of a first breast cancer, 1401 of whom serve as controls, and the primary exposure of interest is ionizing radiation dose to the contralateral breast from radiotherapy for treatment of the first cancer. Ionizing radiation is known to cause double strand breaks (DSBs) in DNA, which can invoke any of several DNA damage response mechanisms, primarily DSB repair via either homologous recombination or nonhomologous end joining, cell cycle checkpoint regulation, or apoptosis. The original study focused on mutations in the ATM gene, which plays a central role in the recognition of DSBs.
The study was then extended to include BRCA1, BRCA2, and CHEK2, which are all involved in homologous recombination repair (HRR), and later still to include a broader set of 38 candidate genes involved in this and other pathways for DSB damage response. We have previously reported on the main effects of ionizing radiation [2, 3], ATM [4–6], BRCA1/2 [7–12], CHEK2 [13], and the interactions of radiation with ATM [14] and BRCA1/2 [15] as well as with other treatments and reproductive factors [16, 17], amongst other risk factors. The aim of this paper is to provide a comprehensive modeling strategy for examining the effects of all genes in a pathway and to apply the approach to candidate genes