Abstract:
Nonparametric and nonlinear measures of statistical dependence between pairs of random variables have proved themselves important tools in modern data analysis, where the emergence of large data sets can support the relaxation of linearity assumptions implicit in traditional association scores such as correlation. Recent proposals based around estimating information theoretic measures such as Mutual Information (MI) have been particularly popular. Here we describe a Bayesian nonparametric procedure that leads to a tractable, explicit and analytic quantification of the probability of dependence, using Polya tree priors on the space of probability measures. Our procedure can accommodate known uncertainty in the form of the underlying sampling distribution and provides an explicit posterior probability measure of both dependence and independence. Well-known advantages of having an explicit probability measure include the easy comparison of evidence across different studies, the inclusion of prior information, and the integration of results within decision analysis.

Abstract:
In recent years, Bayesian nonparametric statistics has attracted extraordinary attention. Nonetheless, relatively little work has been devoted to Bayesian nonparametric hypothesis testing. In this paper, a novel Bayesian nonparametric approach to the two-sample problem is established. Precisely, given two samples $\mathbf{X}=X_1,\ldots,X_{m_1} \overset{i.i.d.}{\sim} F$ and $\mathbf{Y}=Y_1,\ldots,Y_{m_2} \overset{i.i.d.}{\sim} G$, with $F$ and $G$ being unknown continuous cumulative distribution functions, we wish to test the null hypothesis $\mathcal{H}_0:~F=G$. The method is based on the Kolmogorov distance and approximate samples from the Dirichlet process centered at the standard normal distribution with concentration parameter 1. It is demonstrated that the proposed test is robust with respect to any prior specification of the Dirichlet process. A power comparison with several well-known tests is included. In particular, the proposed test dominates the standard Kolmogorov-Smirnov test in all the cases examined in the paper.
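
As an illustration of the distance the test is built on, the Kolmogorov distance between two empirical distribution functions can be computed as in the sketch below. It covers only the distance itself, not the Dirichlet process sampling or the decision rule of the paper, and the function name `kolmogorov_distance` is a hypothetical choice made for this example.

```python
from bisect import bisect_right


def kolmogorov_distance(x, y):
    """Sup-norm distance between the empirical CDFs of samples x and y.

    Illustrative helper (name is an assumption, not from the paper).
    """
    xs, ys = sorted(x), sorted(y)
    # Both empirical CDFs are step functions that jump only at sample
    # points, so the supremum is attained on the pooled sample.
    grid = xs + ys
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in grid
    )
```

Identical samples give a distance of 0, and two samples with disjoint ranges give a distance of 1.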

Abstract:
This paper proposes a computationally inexpensive Bayesian algorithm for noisy sparse recovery (NSR), called BHT-BP. In this framework, we consider LDPC-like measurement matrices, which have a tree-structured property, and additive white Gaussian noise. BHT-BP has a joint detection-and-estimation structure consisting of a sparse support detector and a nonzero estimator. The support detector is designed under the criterion of the minimum detection error probability using nonparametric belief propagation (nBP) and composite binary hypothesis tests. The nonzeros are estimated in the linear MMSE sense, using the support detection result. The strength of BHT-BP lies in noise-robust support detection, which effectively removes quantization errors caused by the uniform sampling-based nBP. Therefore, in NSR problems, BHT-BP has advantages over CS-BP, an existing nBP algorithm, while being comparable to other recent CS solvers in several aspects. In addition, we examine the impact of the minimum nonzero value of sparse signals via BHT-BP, on the basis of results in the recent literature. Our empirical results show that variation of x_min is reflected in recovery performance in the form of an SNR shift.

Abstract:
We develop a set of scalable Bayesian inference procedures for a general class of nonparametric regression models based on embarrassingly parallel MCMC. Specifically, we first perform independent nonparametric Bayesian inference on each subset split from a massive dataset, and then aggregate those results into global counterparts. By partitioning the dataset carefully, we show that our aggregated inference results obtain the oracle rule in the sense that they are equivalent to those obtained directly from the massive data (which are, however, computationally prohibitive in practice). For example, the aggregated credible sets achieve the desirable credibility level and frequentist coverage possessed by the oracle counterparts (with similar radius). The oracle matching phenomenon occurs due to the nice geometric structures of the infinite-dimensional parameter space. A technical by-product is a new version of a uniformly consistent test that applies to a general regression model under the Sobolev norm.

Abstract:
This paper studies the problem of testing whether a function is monotone from a nonparametric Bayesian perspective. Two new families of tests are constructed. The first uses constrained smoothing splines, together with a hierarchical stochastic-process prior that explicitly controls the prior probability of monotonicity. The second uses regression splines, together with two proposals for the prior over the regression coefficients. The finite-sample performance of the tests is shown via simulation to improve upon existing frequentist and Bayesian methods. The asymptotic properties of the Bayes factor for comparing monotone versus non-monotone regression functions in a Gaussian model are also studied. Our results significantly extend those currently available, which chiefly focus on determining the dimension of a parametric linear model.

Abstract:
There is an increasing interest in understanding the dependence structure of a random vector not only in the center of its distribution but also in the tails. Extreme-value theory tackles the problem of modelling the joint tail of a multivariate distribution by modelling the marginal distributions and the dependence structure separately. For estimating dependence at high levels, the stable tail dependence function and the spectral measure are particularly convenient. These objects also lie at the basis of nonparametric techniques for modelling the dependence among extremes in the max-domain of attraction setting. In the case of asymptotic independence, this setting is inadequate, and more refined tail dependence coefficients exist, serving, among other purposes, to discriminate between asymptotic dependence and independence. Throughout, the methods are illustrated on financial data.

Abstract:
We develop a Bayesian nonparametric model for reconstructing magnetic resonance images (MRI) from highly undersampled k-space data. We perform dictionary learning as part of the image reconstruction process. To this end, we use the beta process as a nonparametric dictionary learning prior for representing an image patch as a sparse combination of dictionary elements. The size of the dictionary and the patch-specific sparsity pattern are inferred from the data, in addition to other dictionary learning variables. Dictionary learning is performed directly on the compressed image, and so is tailored to the MRI being considered. In addition, we investigate a total variation penalty term in combination with the dictionary learning model, and show how the denoising property of dictionary learning removes dependence on regularization parameters in the noisy setting. We derive a stochastic optimization algorithm based on Markov Chain Monte Carlo (MCMC) for the Bayesian model, and use the alternating direction method of multipliers (ADMM) for efficiently performing total variation minimization. We present empirical results on several MRI, which show that the proposed regularization framework can improve reconstruction accuracy over other methods.

Abstract:
A key problem in statistical modeling is model selection: how to choose a model at an appropriate level of complexity. This problem appears in many settings, most prominently in choosing the number of clusters in mixture models or the number of factors in factor analysis. In this tutorial we describe Bayesian nonparametric methods, a class of methods that sidesteps this issue by allowing the data to determine the complexity of the model. This tutorial is a high-level introduction to Bayesian nonparametric methods and contains several examples of their application.
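
To make the idea of data-determined complexity concrete, the sketch below simulates the Chinese restaurant process, a standard Bayesian nonparametric prior over partitions. The abstract does not name this prior, so its use here is an assumption made for illustration. The number of clusters is not fixed in advance and grows with the number of observations; the function name and the parameter `alpha` are choices made for this example.

```python
import random


def chinese_restaurant_process(n, alpha, rng):
    """Seat n customers sequentially and return the list of table sizes.

    Illustrative sketch: `alpha` is the concentration parameter and `rng`
    is a random.Random instance supplied by the caller.
    """
    tables = []
    for i in range(n):
        # Customer i joins existing table k with probability
        # size_k / (i + alpha) and opens a new table with probability
        # alpha / (i + alpha).
        u = rng.random() * (i + alpha)
        acc = 0.0
        for k, size in enumerate(tables):
            acc += size
            if u < acc:
                tables[k] += 1
                break
        else:
            # No existing table was selected, so open a new one.
            tables.append(1)
    return tables


sizes = chinese_restaurant_process(100, 1.0, random.Random(0))
num_clusters = len(sizes)  # random; not specified by the user
```

Larger values of `alpha` favor more clusters, but the count itself remains a random quantity rather than a tuning parameter chosen up front.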

Abstract:
We consider the problem of flexible modeling of higher order Markov chains when an upper bound on the order of the chain is known but the true order and nature of the serial dependence are unknown. We propose Bayesian nonparametric methodology based on conditional tensor factorizations, which can characterize any transition probability with a specified maximal order. The methodology selects the important lags and captures higher order interactions among the lags, while also facilitating calculation of Bayes factors for a variety of hypotheses of interest. We design efficient Markov chain Monte Carlo algorithms for posterior computation, allowing for uncertainty in the set of important lags to be included and in the nature and order of the serial dependence. The methods are illustrated using simulation experiments and real world applications.

Abstract:
A Bayesian approach to the classification problem is proposed in which random partitions play a central role. It is argued that the partitioning approach has the capacity to take advantage of a variety of large-scale spatial structures, if they are present in the unknown regression function $f_0$. An idealized one-dimensional problem is considered in detail. The proposed nonparametric prior uses random split points to partition the unit interval into a random number of pieces. This prior is found to provide a consistent estimate of the regression function in the $L^p$ topology, for any $1 \leq p < \infty$, and for arbitrary measurable $f_0:[0,1] \to [0,1]$. A Markov chain Monte Carlo (MCMC) implementation is outlined and analyzed. Simulation experiments are conducted to show that the proposed estimate compares favorably with a variety of conventional estimators. A striking resemblance between the posterior mean estimate and the bagged CART estimate is noted and discussed. For higher dimensions, a generalized prior is introduced which employs a random Voronoi partition of the covariate-space. The resulting estimate displays promise on a two-dimensional problem, and extends with a minimum of additional computational effort to arbitrary metric spaces.
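
The one-dimensional partitioning idea can be sketched as follows: split points cut the unit interval into pieces, and the regression function is estimated by a constant on each piece. This is a minimal illustration for a single fixed partition, not the paper's prior or its MCMC sampler. The uniform distribution of the split points, the function names, and the fallback value `default` for empty pieces are all assumptions made for this example.

```python
import random
from bisect import bisect_right


def draw_split_points(k, rng):
    """Draw k split points uniformly on [0, 1) and return them sorted.

    The uniform draw is an assumption for illustration only.
    """
    return sorted(rng.random() for _ in range(k))


def piecewise_estimate(xs, ys, splits, default=0.5):
    """Return a step function equal to the mean of y on each piece.

    `splits` must be sorted; pieces with no data fall back to `default`
    (a hypothetical choice for this sketch).
    """
    sums = [0.0] * (len(splits) + 1)
    counts = [0] * (len(splits) + 1)
    for x, y in zip(xs, ys):
        j = bisect_right(splits, x)  # index of the piece containing x
        sums[j] += y
        counts[j] += 1
    levels = [s / c if c else default for s, c in zip(sums, counts)]
    return lambda t: levels[bisect_right(splits, t)]


# Example with one split point at 0.5 and binary class labels.
f_hat = piecewise_estimate([0.1, 0.2, 0.7, 0.9], [0, 0, 1, 1], [0.5])
```

Here `f_hat(0.3)` is the mean label on the left piece and `f_hat(0.8)` the mean label on the right piece.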