Abstract:
The regression discontinuity (RD) design is a popular approach to causal inference in non-randomized studies. This is because it can be used to identify and estimate causal effects under mild conditions. Specifically, for each subject, the RD design assigns a treatment or non-treatment, depending on whether or not an observed value of an assignment variable exceeds a fixed and known cutoff value. In this paper, we propose a Bayesian nonparametric regression modeling approach to RD designs, which exploits a local randomization feature. In this approach, the assignment variable is treated as a covariate, and a scalar-valued confounding variable is treated as a dependent variable (which may be a multivariate confounder score). Then, over the model's posterior distribution of locally-randomized subjects that cluster around the cutoff of the assignment variable, inference for causal effects are made within this random cluster, via two-group statistical comparisons of treatment outcomes and non-treatment outcomes. We illustrate the Bayesian nonparametric approach through the analysis of a real educational data set, to investigate the causal link between basic skills and teaching ability.

Abstract:
We introduce a random partition model for Bayesian nonparametric regression. The model is based on infinitely-many disjoint regions of the range of a latent covariate-dependent Gaussian process. Given a realization of the process, the cluster of dependent variable responses that share a common region are assumed to arise from the same distribution. Also, the latent Gaussian process prior allows for the random partitions (i.e., clusters of the observations) to exhibit dependencies among one another. The model is illustrated through the analysis of a real data set arising from education, and through the analysis of simulated data that were generated from complex data-generating models.

Abstract:
This paper is a note on the use of Bayesian nonparametric mixture models for continuous time series. We identify a key requirement for such models, and then establish that there is a single type of model which meets this requirement. As it turns out, the model is well known in multiple change-point problems.

Abstract:
For the important classical problem of inference on a sparse high-dimensional normal mean vector, we propose a novel empirical Bayes model that admits a posterior distribution with desirable properties under mild conditions. In particular, our empirical Bayes posterior distribution concentrates on balls, centered at the true mean vector, with squared radius proportional to the minimax rate, and its posterior mean is an asymptotically minimax estimator. We also show that, asymptotically, the support of our empirical Bayes posterior has roughly the same effective dimension as the true sparse mean vector. Simulation from our empirical Bayes posterior is straightforward, and our numerical results demonstrate the quality of our method compared to others having similar large-sample properties.

Abstract:
For non-randomized studies, the regression discontinuity design (RDD) can be used to identify and estimate causal effects from a "locally-randomized" subgroup of subjects, under relatively mild conditions. However, current models focus causal inferences on the impact of the treatment (versus non-treatment) variable on the mean of the dependent variable, via linear regression. For RDDs, we propose a flexible Bayesian nonparametric regression model that can provide accurate estimates of causal effects, in terms of the predictive mean, variance, quantile, probability density, distribution function, or any other chosen function of the outcome variable. We illustrate the model through the analysis of two real educational data sets, involving (resp.) a sharp RDD and a fuzzy RDD.

Abstract:
Motivated by the fundamental problem of measuring species diversity, this paper introduces the concept of a cluster structure to define an exchangeable cluster probability function that governs the joint distribution of a random count and its exchangeable random partitions. A cluster structure, naturally arising from a completely random measure mixed Poisson process, allows the probability distribution of the random partitions of a subset of a sample to be dependent on the sample size, a distinct and motivated feature that differs it from a partition structure. A generalized negative binomial process model is proposed to generate a cluster structure, where in the prior the number of clusters is finite and Poisson distributed, and the cluster sizes follow a truncated negative binomial distribution. We construct a nonparametric Bayesian estimator of Simpson's index of diversity under the generalized negative binomial process. We illustrate our results through the analysis of two real sequencing count datasets.

Abstract:
P\'{o}lya trees fix partitions and use random probabilities in order to construct random probability measures. With quantile pyramids we instead fix probabilities and use random partitions. For nonparametric Bayesian inference we use a prior which supports piecewise linear quantile functions, based on the need to work with a finite set of partitions, yet we show that the limiting version of the prior exists. We also discuss and investigate an alternative model based on the so-called substitute likelihood. Both approaches factorize in a convenient way leading to relatively straightforward analysis via MCMC, since analytic summaries of posterior distributions are too complicated. We give conditions securing the existence of an absolute continuous quantile process, and discuss consistency and approximate normality for the sequence of posterior distributions. Illustrations are included.

Abstract:
We provide a reason for Bayesian updating, in the Bernoulli case, even when it is assumed that observations are independent and identically distributed with a fixed but unknown parameter $\theta_0$. The motivation relies on the use of loss functions and asymptotics. Such a justification is important due to the recent interest and focus on Bayesian consistency which indeed assumes that the observations are independent and identically distributed rather than being conditionally independent with joint distribution depending on the choice of prior.

Abstract:
The current definition of a conditional probability distribution enables one to update probabilities only on the basis of stochastic information. This paper provides a definition for conditional probability distributions with non-stochastic information. The definition is derived as a solution of a decision theoretic problem, where the information is connected to the outcome of interest via a loss function. We shall show that the Kullback-Leibler divergence plays a central role. Some illustrations are presented.