Abstract:
The regression discontinuity (RD) design is a popular approach to causal inference in non-randomized studies, because it can be used to identify and estimate causal effects under mild conditions. Specifically, for each subject, the RD design assigns a treatment or non-treatment, depending on whether or not an observed value of an assignment variable exceeds a fixed and known cutoff value. In this paper, we propose a Bayesian nonparametric regression modeling approach to RD designs, which exploits a local randomization feature. In this approach, the assignment variable is treated as a covariate, and a scalar-valued confounding variable (which may be a multivariate confounder score) is treated as a dependent variable. Then, over the model's posterior distribution of locally randomized subjects that cluster around the cutoff of the assignment variable, inferences for causal effects are made within this random cluster, via two-group statistical comparisons of treatment outcomes and non-treatment outcomes. We illustrate the Bayesian nonparametric approach through the analysis of a real educational data set, to investigate the causal link between basic skills and teaching ability.

Abstract:
In this paper, we propose an explicit closed-form Bayes factor for the problem of two-sample hypothesis testing. The proposed approach can be regarded as a Bayesian version of the pooled-variance t-statistic and has various appealing properties in practical applications. It relies on the data only through the t-statistic and can thus be calculated using an Excel spreadsheet or a pocket calculator. It avoids several undesirable paradoxes that may be encountered with the previous Bayesian approach of Gonen et al. (2005). Moreover, the proposed approach can be easily taught in an introductory statistics course with an emphasis on Bayesian thinking. Simulated and real data examples are provided for illustrative purposes.
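
Since the proposed Bayes factor depends on the data only through the pooled-variance t-statistic, a minimal sketch of that statistic may be helpful. The function name `pooled_t_statistic` is an illustrative assumption, and the closed-form Bayes factor itself is not reproduced here.

```python
import math


def pooled_t_statistic(x, y):
    """Return the pooled-variance two-sample t-statistic and its degrees of freedom.

    Hypothetical helper for illustration only; it computes the classical
    statistic, not the Bayes factor proposed in the paper.
    """
    n1, n2 = len(x), len(y)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    mean1 = sum(x) / n1
    mean2 = sum(y) / n2
    # Sums of squared deviations from each sample mean.
    ss1 = sum((v - mean1) ** 2 for v in x)
    ss2 = sum((v - mean2) ** 2 for v in y)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


t_value, dof = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```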

Abstract:
In recent years, Bayesian nonparametric statistics has attracted extraordinary attention. Nonetheless, relatively little work has been devoted to Bayesian nonparametric hypothesis testing. In this paper, a novel Bayesian nonparametric approach to the two-sample problem is established. Precisely, given two samples $\mathbf{X}=X_1,\ldots,X_{m_1} \overset{i.i.d.}{\sim} F$ and $\mathbf{Y}=Y_1,\ldots,Y_{m_2} \overset{i.i.d.}{\sim} G$, with $F$ and $G$ being unknown continuous cumulative distribution functions, we wish to test the null hypothesis $\mathcal{H}_0:~F=G$. The method is based on the Kolmogorov distance and approximate samples from the Dirichlet process centered at the standard normal distribution with concentration parameter 1. It is demonstrated that the proposed test is robust with respect to any prior specification of the Dirichlet process. A power comparison with several well-known tests is included. In particular, the proposed test dominates the standard Kolmogorov-Smirnov test in all the cases examined in the paper.
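
As a rough illustration of the Kolmogorov distance on which the method is based, the sketch below computes the largest absolute gap between two empirical distribution functions. It assumes plain empirical CDFs rather than the Dirichlet process samples used in the paper, and the name `kolmogorov_distance` is hypothetical.

```python
import numpy as np


def kolmogorov_distance(x, y):
    """Sup-norm distance between the empirical CDFs of two samples.

    Illustrative assumption: empirical CDFs stand in for the distribution
    functions that the paper draws from a Dirichlet process.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    # The supremum of |F - G| is attained at one of the observed points.
    grid = np.concatenate([x, y])
    cdf_x = np.searchsorted(x, grid, side="right") / x.size
    cdf_y = np.searchsorted(y, grid, side="right") / y.size
    return float(np.max(np.abs(cdf_x - cdf_y)))


distance = kolmogorov_distance([1.0, 2.0], [1.5, 2.5])
```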

Abstract:
We consider the problem of testing whether two finite-dimensional random dot product graphs have generating latent positions that are independently drawn from the same distribution, or distributions that are related via scaling or projection. We propose a test statistic that is a kernel-based function of the adjacency spectral embedding for each graph. We obtain a limiting distribution for our test statistic under the null and we show that our test procedure is consistent across a broad range of alternatives.
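
For readers unfamiliar with the adjacency spectral embedding that the test statistic is built on, a minimal sketch follows. It assumes a symmetric adjacency matrix and selects the `d` eigenvalues of largest magnitude, which is one common convention; the function name is hypothetical, and the kernel-based statistic itself is not shown.

```python
import numpy as np


def adjacency_spectral_embedding(adjacency, d):
    """Embed each vertex of a graph into d dimensions.

    Illustrative sketch: scales the d leading eigenvectors of a symmetric
    adjacency matrix by the square roots of the eigenvalue magnitudes.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(adjacency)
    # Indices of the d eigenvalues with largest absolute value.
    leading = np.argsort(np.abs(eigenvalues))[::-1][:d]
    return eigenvectors[:, leading] * np.sqrt(np.abs(eigenvalues[leading]))


# Toy rank-one symmetric matrix used only to exercise the function.
toy_matrix = np.ones((4, 4))
positions = adjacency_spectral_embedding(toy_matrix, d=1)
```

The rows of `positions` are the estimated latent positions, identifiable only up to an orthogonal transformation (here, a sign flip).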

Abstract:
This paper studies the problem of testing whether a function is monotone from a nonparametric Bayesian perspective. Two new families of tests are constructed. The first uses constrained smoothing splines, together with a hierarchical stochastic-process prior that explicitly controls the prior probability of monotonicity. The second uses regression splines, together with two proposals for the prior over the regression coefficients. The finite-sample performance of the tests is shown via simulation to improve upon existing frequentist and Bayesian methods. The asymptotic properties of the Bayes factor for comparing monotone versus non-monotone regression functions in a Gaussian model are also studied. Our results significantly extend those currently available, which chiefly focus on determining the dimension of a parametric linear model.

Abstract:
This paper describes a framework for flexible multiple hypothesis testing of autoregressive time series. The modeling approach is Bayesian, though a blend of frequentist and Bayesian reasoning is used to evaluate procedures. Nonparametric characterizations of both the null and alternative hypotheses will be shown to be the key robustification step necessary to ensure reasonable Type-I error performance. The methodology is applied to part of a large database containing up to 50 years of corporate performance statistics on 24,157 publicly traded American companies, where the primary goal of the analysis is to flag companies whose historical performance is significantly different from that expected due to chance.

Abstract:
In this paper, we propose a Bayesian Hypothesis Testing Algorithm (BHTA) for sparse representation. It uses the Bayesian framework to determine the active atoms in the sparse representation of a signal. Bayesian hypothesis testing, based on three assumptions, determines the active atoms from the correlations and leads to the activity measure proposed in the Iterative Detection Estimation (IDE) algorithm. However, IDE uses an arbitrary decreasing sequence of thresholds, whereas the proposed algorithm is based on a sequence derived from hypothesis testing. Thus, the Bayesian hypothesis testing framework leads to an improved version of the IDE algorithm. The simulations show that the hard version of our suggested algorithm achieves one of the best results in terms of estimation accuracy among the algorithms implemented in our simulations, while it has the greatest complexity in terms of simulation time.

Abstract:
This letter presents a novel Block Bayesian Hypothesis Testing Algorithm (Block-BHTA) for reconstructing block sparse signals with unknown block structures. The Block-BHTA comprises the detection and recovery of the supports, and the estimation of the amplitudes of the block sparse signal. The support detection and recovery are performed using Bayesian hypothesis testing. Then, based on the detected and reconstructed supports, the nonzero amplitudes are estimated by linear MMSE. The effectiveness of Block-BHTA is demonstrated by numerical experiments.

Abstract:
This paper deals with the problem of nonparametric independence testing, a fundamental decision-theoretic problem that asks whether two arbitrary (possibly multivariate) random variables $X,Y$ are independent, a question that arises in many fields such as causality and neuroscience. While quantities like the correlation of $X,Y$ only test for (univariate) linear independence, natural alternatives like the mutual information of $X,Y$ are hard to estimate due to a serious curse of dimensionality. A recent approach, avoiding both issues, estimates norms of an \textit{operator} in Reproducing Kernel Hilbert Spaces (RKHSs). Our main contribution is strong empirical evidence that by employing \textit{shrunk} operators when the sample size is small, one can attain an improvement in power at low false positive rates. We analyze the effects of Stein shrinkage on a popular test statistic called HSIC (Hilbert-Schmidt Independence Criterion). Our observations provide insights into two recently proposed shrinkage estimators, SCOSE and FCOSE: we prove that SCOSE is (essentially) the optimal linear shrinkage method for \textit{estimating} the true operator; however, the non-linearly shrunk FCOSE usually achieves greater improvements in \textit{test power}. This work is important for more powerful nonparametric detection of subtle nonlinear dependencies for small samples.
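
To make the HSIC statistic concrete, the following is a minimal sketch of the standard (unshrunk) biased estimator, $\frac{1}{n^2}\operatorname{tr}(KHLH)$, where $K$ and $L$ are Gram matrices and $H$ is the centering matrix. The Gaussian kernel, the bandwidth defaults, the $1/n^2$ normalization, and the function names are all assumptions made for illustration; the SCOSE and FCOSE shrinkage estimators are not implemented here.

```python
import numpy as np


def gaussian_gram(z, bandwidth):
    """Gaussian-kernel Gram matrix for samples stored as rows of z."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    squared_norms = np.sum(z ** 2, axis=1)
    squared_dists = squared_norms[:, None] + squared_norms[None, :] - 2.0 * z @ z.T
    # Clip tiny negative values caused by floating-point rounding.
    squared_dists = np.maximum(squared_dists, 0.0)
    return np.exp(-squared_dists / (2.0 * bandwidth ** 2))


def hsic_biased(x, y, bandwidth_x=1.0, bandwidth_y=1.0):
    """Biased HSIC estimate tr(K H L H) / n^2 (illustrative, no shrinkage)."""
    K = gaussian_gram(x, bandwidth_x)
    L = gaussian_gram(y, bandwidth_y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n ** 2


sample = np.array([0.0, 1.0, 2.0, 3.0])
dependent_value = hsic_biased(sample, sample)
constant_value = hsic_biased(sample, np.zeros(4))
```

With a constant second sample, the centered Gram matrix vanishes and the estimate is zero up to rounding, whereas pairing a non-constant sample with itself yields a strictly positive value.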

Abstract:
In this article, we propose a new method for the fundamental task of testing for dependence between two groups of variables. The response densities under the null hypothesis of independence and the alternative hypothesis of dependence are specified by nonparametric Bayesian models. Under the null hypothesis, the joint distribution is modeled by the product of two independent Dirichlet Process Mixture (DPM) priors; under the alternative, the full joint density is modeled by a multivariate DPM prior. The test is then based on the posterior probability of favoring the alternative hypothesis. The proposed test not only performs well in testing linear dependence compared with other popular nonparametric tests, but is also preferred to other methods in testing many of the nonlinear dependencies we explored. In the analysis of gene expression data, we compare different methods for testing pairwise dependence between genes. The results show that the proposed test identifies some dependence structures that are not detected by other tests.