Search results: 1 - 10 of 100 matches
All listed articles are open access (OA) and free to download.
Bayesian Additive Regression Trees With Parametric Models of Heteroskedasticity  [PDF]
Justin Bleich, Adam Kapelner
Statistics, 2014
Abstract: We incorporate heteroskedasticity into Bayesian Additive Regression Trees (BART) by modeling the log of the error variance parameter as a linear function of prespecified covariates. Under this scheme, the Gibbs sampling procedure for the original sum-of-trees model is easily modified, and the parameters of the variance model are updated via a Metropolis-Hastings step. We demonstrate the promise of our approach by showing that it provides more appropriate posterior predictive intervals than homoskedastic BART in heteroskedastic settings and resists overfitting. Our implementation will be offered in an upcoming release of the R package bartMachine.
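The log-linear variance scheme admits a compact sketch. The names below (`loglik`, `mh_step`) are illustrative only, and a flat prior on the variance coefficients is assumed for brevity; the paper's actual sampler embeds this update inside the full BART Gibbs sweep.

```python
import math
import random

def loglik(gamma, resid, Z):
    # Gaussian log-likelihood (up to a constant) with the log error variance
    # linear in prespecified covariates: log sigma_i^2 = z_i . gamma
    ll = 0.0
    for e, z in zip(resid, Z):
        log_var = sum(g * zj for g, zj in zip(gamma, z))
        ll += -0.5 * (log_var + e * e / math.exp(log_var))
    return ll

def mh_step(gamma, resid, Z, step=0.1, rng=None):
    # One random-walk Metropolis-Hastings update for gamma (flat prior assumed)
    rng = rng or random.Random(0)
    prop = [g + rng.gauss(0.0, step) for g in gamma]
    if math.log(rng.random()) < loglik(prop, resid, Z) - loglik(gamma, resid, Z):
        return prop
    return gamma
```

In the full sampler, `resid` would be the residuals left after subtracting the current sum-of-trees fit, recomputed before each variance update.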
Parallel Bayesian Additive Regression Trees  [PDF]
Matthew T. Pratola, Hugh A. Chipman, James R. Gattiker, David M. Higdon, Robert McCulloch, William N. Rust
Statistics, 2013
Abstract: Bayesian Additive Regression Trees (BART) is a Bayesian approach to flexible non-linear regression which has been shown to be competitive with the best modern predictive methods such as those based on bagging and boosting. BART offers some advantages. For example, the stochastic search Markov Chain Monte Carlo (MCMC) algorithm can provide a more complete search of the model space and variation across MCMC draws can capture the level of uncertainty in the usual Bayesian way. The BART prior is robust in that reasonable results are typically obtained with a default prior specification. However, the publicly available implementation of the BART algorithm in the R package BayesTree is not fast enough to be considered interactive with over a thousand observations, and is unlikely to even run with 50,000 to 100,000 observations. In this paper we show how the BART algorithm may be modified and then computed using single program, multiple data (SPMD) parallel computation implemented using the Message Passing Interface (MPI) library. The approach scales nearly linearly in the number of processor cores, enabling the practitioner to perform statistical inference on massive datasets. Our approach can also handle datasets too massive to fit on any single data repository.
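The SPMD strategy can be illustrated without MPI itself: each rank computes sufficient statistics on its own data shard, and an allreduce-style sum makes the combined statistics available on every rank. The sketch below simulates the ranks sequentially; in a real implementation the combine step would be an `MPI_Allreduce` call, and the statistics would be per-leaf quantities rather than global ones.

```python
# Simulate the SPMD pattern: each "rank" computes local sufficient
# statistics on its shard; an allreduce-style sum combines them.
def local_stats(shard):
    # Per-shard (count, sum, sum of squares): enough to update a leaf mean
    n = len(shard)
    s = sum(shard)
    ss = sum(x * x for x in shard)
    return (n, s, ss)

def allreduce_sum(stats_per_rank):
    # Stand-in for MPI_Allreduce with MPI_SUM: elementwise sum across ranks
    return tuple(sum(t) for t in zip(*stats_per_rank))

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
shards = [data[0:2], data[2:4], data[4:6]]   # one shard per rank
n, s, ss = allreduce_sum([local_stats(sh) for sh in shards])
mean = s / n   # identical on every rank after the allreduce
```

Because only fixed-size statistics cross the network, communication cost is independent of the shard sizes, which is what gives the near-linear scaling described above.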
Bayesian Additive Regression Trees using Bayesian Model Averaging  [PDF]
Belinda Hernández, Adrian E. Raftery, Stephen R. Pennington, Andrew C. Parnell
Statistics, 2015
Abstract: Bayesian Additive Regression Trees (BART) is a statistical sum-of-trees model. It can be considered a Bayesian version of machine learning tree ensemble methods, where the individual trees are the base learners. However, for data sets where the number of variables $p$ is large (e.g. $p>5,000$), the algorithm can become computationally prohibitive. Another method which is popular for high-dimensional data is random forests, a machine learning algorithm which grows trees using a greedy search for the best split points. However, as it is not a statistical model, it cannot produce probabilistic estimates or predictions. We propose an alternative algorithm for BART called BART-BMA, which uses Bayesian Model Averaging and a greedy search algorithm to produce a model which is much more efficient than BART for datasets with large $p$. BART-BMA incorporates elements of both BART and random forests to offer a model-based algorithm which can deal with high-dimensional data. We have found that BART-BMA can be run in a reasonable time on a standard laptop for the "small $n$, large $p$" scenario which is common in many areas of bioinformatics. We showcase this method using simulated data and data from two real proteomic experiments: one to distinguish between patients with cardiovascular disease and controls, and another to classify aggressive from non-aggressive prostate cancer. We compare our results to those of the main competing methods. Open source code written in R and Rcpp to run BART-BMA can be found at: https://github.com/BelindaHernandez/BART-BMA.git
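The model-averaging step at the heart of BART-BMA can be sketched with BIC-approximated posterior model probabilities. This is a generic BMA sketch, not the paper's exact weighting scheme; `bma_weights` and `bma_predict` are hypothetical names.

```python
import math

def bma_weights(bics):
    # Approximate posterior model probabilities from BIC scores:
    # w_m proportional to exp(-0.5 * (BIC_m - min BIC))
    b0 = min(bics)
    raw = [math.exp(-0.5 * (b - b0)) for b in bics]
    z = sum(raw)
    return [r / z for r in raw]

def bma_predict(preds, bics):
    # Model-averaged prediction: weighted sum of per-model predictions
    w = bma_weights(bics)
    return sum(wi * p for wi, p in zip(w, preds))
```

Subtracting the minimum BIC before exponentiating avoids underflow when models differ greatly in fit, a standard numerical precaution.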
Particle Gibbs for Bayesian Additive Regression Trees  [PDF]
Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Computer Science, 2015
Abstract: Additive regression trees are flexible non-parametric models and popular off-the-shelf tools for real-world non-linear regression. In application domains, such as bioinformatics, where there is also demand for probabilistic predictions with measures of uncertainty, the Bayesian additive regression trees (BART) model, introduced by Chipman et al. (2010), is increasingly popular. As data sets have grown in size, however, the standard Metropolis-Hastings algorithms used to perform inference in BART are proving inadequate. In particular, these Markov chains make local changes to the trees and suffer from slow mixing when the data are high-dimensional or the best fitting trees are more than a few layers deep. We present a novel sampler for BART based on the Particle Gibbs (PG) algorithm (Andrieu et al., 2010) and a top-down particle filtering algorithm for Bayesian decision trees (Lakshminarayanan et al., 2013). Rather than making local changes to individual trees, the PG sampler proposes a complete tree to fit the residual. Experiments show that the PG sampler outperforms existing samplers in many settings.
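A core building block of Particle Gibbs is the resampling step of the underlying particle filter. The sketch below shows systematic resampling of particle weights; the full PG sampler additionally conditions on a retained reference trajectory, which is omitted here.

```python
import random

def systematic_resample(weights, rng=None):
    # Systematic resampling: one uniform draw, then evenly spaced thresholds,
    # so a particle with weight share w is selected ~ n*w times with low variance.
    rng = rng or random.Random(0)
    n = len(weights)
    total = sum(weights)
    u0 = rng.random() / n
    cumsum, c = [], 0.0
    for w in weights:
        c += w / total
        cumsum.append(c)
    indices, i = [], 0
    for k in range(n):
        u = u0 + k / n
        while i < n - 1 and cumsum[i] < u:
            i += 1
        indices.append(i)
    return indices
```

In the tree setting, each "particle" is a partially grown decision tree, and resampling concentrates computation on trees that explain the residual well.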
BART: Bayesian additive regression trees  [PDF]
Hugh A. Chipman, Edward I. George, Robert E. McCulloch
Statistics, 2008, DOI: 10.1214/09-AOAS285
Abstract: We develop a Bayesian "sum-of-trees" model where each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian backfitting MCMC algorithm that generates samples from a posterior. Effectively, BART is a nonparametric Bayesian regression approach which uses dimensionally adaptive random basis elements. Motivated by ensemble methods in general, and boosting algorithms in particular, BART is defined by a statistical model: a prior and a likelihood. This approach enables full posterior inference including point and interval estimates of the unknown regression function as well as the marginal effects of potential predictors. By keeping track of predictor inclusion frequencies, BART can also be used for model-free variable selection. BART's many features are illustrated with a bake-off against competing methods on 42 different data sets, with a simulation experiment and on a drug discovery classification problem.
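The backfitting idea behind the sum-of-trees model can be sketched with deterministic stand-ins: each "tree" below is a single median-split stump refit to the partial residual left by the other trees. BART instead draws each tree from its conditional posterior under the regularization prior, but the residual-cycling structure is the same.

```python
def fit_stump(x, r):
    # Fit a one-split "tree": split at the median of x, predict leaf means
    split = sorted(x)[len(x) // 2]
    left = [ri for xi, ri in zip(x, r) if xi < split]
    right = [ri for xi, ri in zip(x, r) if xi >= split]
    ml = sum(left) / len(left) if left else 0.0
    mr = sum(right) / len(right) if right else 0.0
    return lambda xi: ml if xi < split else mr

def backfit(x, y, m=5, sweeps=20):
    # Backfitting skeleton: tree j is refit to the residual y minus the
    # predictions of all other trees, cycling until the sweep budget is spent.
    trees = [lambda xi: 0.0] * m
    for _ in range(sweeps):
        for j in range(m):
            others = [sum(t(xi) for k, t in enumerate(trees) if k != j)
                      for xi in x]
            r = [yi - fi for yi, fi in zip(y, others)]
            trees[j] = fit_stump(x, r)
    return lambda xi: sum(t(xi) for t in trees)
```
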
DART: Dropouts meet Multiple Additive Regression Trees  [PDF]
K. V. Rashmi, Ran Gilad-Bachrach
Computer Science, 2015
Abstract: Multiple Additive Regression Trees (MART), an ensemble model of boosted regression trees, is known to deliver high prediction accuracy for diverse tasks, and it is widely used in practice. However, it suffers from an issue which we call over-specialization, wherein trees added at later iterations tend to impact the prediction of only a few instances and make a negligible contribution towards the remaining instances. This negatively affects the performance of the model on unseen data, and also makes the model over-sensitive to the contributions of the few, initially added trees. We show that the commonly used tool to address this issue, shrinkage, alleviates the problem only to a certain extent, and the fundamental issue of over-specialization remains. In this work, we explore a different approach to the problem: employing dropouts, a tool that has recently been proposed in the context of learning deep neural networks. We propose a novel way of employing dropouts in MART, resulting in the DART algorithm. We evaluate DART on ranking, regression and classification tasks, using large-scale, publicly available datasets, and show that DART outperforms MART in each of the tasks, by a significant margin. We also show that DART overcomes the issue of over-specialization to a considerable extent.
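The dropout-and-normalize step that distinguishes DART from MART can be sketched at the level of tree weights alone; the fitting of the new tree to the residual of the kept trees is elided, and `dart_iteration` is an illustrative name.

```python
import random

def dart_iteration(tree_weights, drop_rate, rng=None):
    # One DART-style step on the ensemble's tree weights:
    #   1) drop each existing tree independently with probability drop_rate,
    #   2) a new tree (weight 1.0) is fit to the residual of the kept trees,
    #   3) normalize: scale the new tree by 1/(k+1) and each dropped tree by
    #      k/(k+1), so the ensemble's overall output scale is preserved.
    rng = rng or random.Random(0)
    dropped = [i for i in range(len(tree_weights)) if rng.random() < drop_rate]
    k = len(dropped)
    weights = list(tree_weights)
    for i in dropped:
        weights[i] *= k / (k + 1)
    weights.append(1.0 / (k + 1))   # weight of the newly fitted tree
    return weights, dropped
```

Because later trees are fit against the residual of a random subset of the ensemble, no single early tree can dominate, which is how dropouts counter the over-specialization described above.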
Prediction with Missing Data via Bayesian Additive Regression Trees  [PDF]
Adam Kapelner, Justin Bleich
Computer Science, 2013
Abstract: We present a method for incorporating missing data in non-parametric statistical learning without the need for imputation. We focus on a tree-based method, Bayesian Additive Regression Trees (BART), enhanced with "Missingness Incorporated in Attributes," an approach recently proposed incorporating missingness into decision trees (Twala, 2008). This procedure takes advantage of the partitioning mechanisms found in tree-based models. Simulations on generated models and real data indicate that our proposed method can forecast well on complicated missing-at-random and not-missing-at-random models as well as models where missingness itself influences the response. Our procedure has higher predictive performance and is more stable than competitors in many cases. We also illustrate BART's abilities to incorporate missingness into uncertainty intervals and to detect the influence of missingness on the model fit.
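"Missingness Incorporated in Attributes" treats the routing of missing values as part of the split definition itself: each candidate split also chooses a direction for missing entries. A minimal sketch of such a split rule, with `None` encoding a missing value:

```python
def mia_split(x, threshold, missing_goes_left):
    # A split is a pair (threshold, direction for missing values), so the
    # tree search can exploit missingness instead of requiring imputation.
    left, right = [], []
    for i, xi in enumerate(x):
        if xi is None:
            (left if missing_goes_left else right).append(i)
        elif xi <= threshold:
            left.append(i)
        else:
            right.append(i)
    return left, right
```

Evaluating both directions for every candidate threshold lets the sampler discover cases where missingness itself predicts the response, matching the not-missing-at-random settings discussed above.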
MBACT - Multiclass Bayesian Additive Classification Trees  [PDF]
Bereket P. Kindo, Hao Wang, Edsel A. Peña
Statistics, 2013
Abstract: In this article, we propose Multiclass Bayesian Additive Classification Trees (MBACT) as a nonparametric procedure for multiclass classification problems. MBACT is a multiclass extension of BART: Bayesian Additive Regression Trees (Chipman et al., 2010). In a range of data-generating schemes and real data applications, MBACT is shown to have good predictive performance, competitive with existing procedures; in particular, it outperforms most procedures when the relationship between the response and predictors is nonlinear.
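One generic way to turn per-class sum-of-trees fits into class probabilities is a softmax over the class scores, sketched below. This is an illustrative stand-in only: multiclass BART extensions often work through latent-variable augmentation (e.g. multinomial probit) rather than a direct softmax.

```python
import math

def multiclass_probs(class_scores):
    # One sum-of-trees score per class; numerically stable softmax
    m = max(class_scores)
    exps = [math.exp(s - m) for s in class_scores]
    z = sum(exps)
    return [e / z for e in exps]
```
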
Additive Gaussian Process Regression  [PDF]
Shaan Qamar, Surya T. Tokdar
Statistics, 2014
Abstract: Additive-interactive regression has recently been shown to offer attractive minimax error rates over traditional nonparametric multivariate regression in a wide variety of settings, including cases where the predictor count is much larger than the sample size and many of the predictors have important effects on the response, potentially through complex interactions. We present a Bayesian implementation of additive-interactive regression using an additive Gaussian process (AGP) prior and develop an efficient Markov chain sampler that extends stochastic search variable selection in this setting. Careful prior and hyper-parameter specification are developed in light of performance and computational considerations, and key innovations address difficulties in exploring a joint posterior distribution over multiple subsets of high dimensional predictor inclusion vectors. The method offers state-of-the-art support and interaction recovery while improving dramatically over competitors in terms of prediction accuracy on a diverse set of simulated and real data. Results from real data studies provide strong evidence that the additive-interactive framework is an attractive modeling platform for high-dimensional nonparametric regression.
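The "additive" in additive Gaussian process regression refers to the covariance: a sum of low-dimensional kernels, one per predictor. A minimal sketch of such a kernel, main effects only; the AGP prior described above also covers interaction terms and predictor-inclusion indicators, which are omitted here.

```python
import math

def additive_rbf_kernel(x, y, lengthscales):
    # Additive GP covariance: a sum of one-dimensional RBF kernels,
    # one per predictor, each with its own lengthscale.
    k = 0.0
    for xj, yj, lj in zip(x, y, lengthscales):
        k += math.exp(-((xj - yj) ** 2) / (2.0 * lj * lj))
    return k
```

Because each summand depends on a single coordinate, the implied regression function is a sum of univariate functions, which is what yields the favorable minimax rates in high dimensions.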
Additive isotone regression  [PDF]
Enno Mammen, Kyusang Yu
Mathematics, 2007, DOI: 10.1214/074921707000000355
Abstract: This paper is about optimal estimation of the additive components of a nonparametric, additive isotone regression model. It is shown that asymptotically up to first order, each additive component can be estimated as well as it could be by a least squares estimator if the other components were known. The algorithm for the calculation of the estimator uses backfitting. Convergence of the algorithm is shown. Finite sample properties are also compared through simulation experiments.
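The building block for estimating each additive component under a monotonicity constraint is the least-squares isotone fit, computed by the Pool Adjacent Violators Algorithm (PAVA). The backfitting procedure described above cycles such fits over the components; this sketch shows the one-component PAVA step for a non-decreasing fit.

```python
def pava(y):
    # Pool Adjacent Violators: least-squares isotone (non-decreasing) fit.
    # Maintain a stack of blocks [mean, size]; merge whenever two adjacent
    # block means violate the ordering.
    merged = []
    for v in y:
        merged.append([v, 1])
        while len(merged) > 1 and merged[-2][0] > merged[-1][0]:
            m2, m1 = merged.pop(), merged.pop()
            n = m1[1] + m2[1]
            merged.append([(m1[0] * m1[1] + m2[0] * m2[1]) / n, n])
    return [m for mean, size in merged for m in [mean] * size]
```
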

Copyright © 2008-2017 Open Access Library. All rights reserved.