Abstract:
This paper describes a framework for flexible multiple hypothesis testing of autoregressive time series. The modeling approach is Bayesian, though a blend of frequentist and Bayesian reasoning is used to evaluate procedures. Nonparametric characterizations of both the null and alternative hypotheses will be shown to be the key robustification step necessary to ensure reasonable Type-I error performance. The methodology is applied to part of a large database containing up to 50 years of corporate performance statistics on 24,157 publicly traded American companies, where the primary goal of the analysis is to flag companies whose historical performance is significantly different from that expected due to chance.

Abstract:
In this article we describe Bayesian nonparametric procedures for two-sample hypothesis testing. Namely, given two sets of samples $\mathbf{y}^{\scriptscriptstyle(1)}\;\stackrel{\scriptscriptstyle{iid}}{\sim}\;F^{\scriptscriptstyle(1)}$ and $\mathbf{y}^{\scriptscriptstyle(2)}\;\stackrel{\scriptscriptstyle{iid}}{\sim}\;F^{\scriptscriptstyle(2)}$, with $F^{\scriptscriptstyle(1)},F^{\scriptscriptstyle(2)}$ unknown, we wish to evaluate the evidence for the null hypothesis $H_0:F^{\scriptscriptstyle(1)}\equiv F^{\scriptscriptstyle(2)}$ versus the alternative $H_1:F^{\scriptscriptstyle(1)}\neq F^{\scriptscriptstyle(2)}$. Our method is based upon a nonparametric P\'{o}lya tree prior centered either subjectively or using an empirical procedure. We show that the P\'{o}lya tree prior leads to an analytic expression for the marginal likelihood under the two hypotheses and hence an explicit measure of the probability of the null $\mathrm{Pr}(H_0|\{\mathbf{y}^{\scriptscriptstyle(1)},\mathbf{y}^{\scriptscriptstyle(2)}\})$.
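
To illustrate how a P\'{o}lya tree prior can yield a closed-form posterior probability of the null, the following is a minimal sketch and not the authors' implementation. It assumes a tree truncated at a finite depth, a standard normal centering distribution, symmetric Beta parameters $c\,m^2$ at level $m$, independent P\'{o}lya tree priors on $F^{(1)}$ and $F^{(2)}$ under $H_1$, and a prior probability of 1/2 on $H_0$. The function names (`polya_tree_prob_null`, `log_beta`, `normal_cdf`) and the default values are hypothetical choices made for this example.

```python
import math

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def normal_cdf(x):
    """Standard normal CDF, used here as the (assumed) centering distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bin_counts(u_values, n_bins):
    """Count how many values in [0, 1] fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for u in u_values:
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    return counts

def polya_tree_prob_null(y1, y2, depth=8, c=1.0, prior_null=0.5):
    """Posterior probability of H0 (same distribution) under a truncated Polya tree."""
    # Map data to [0, 1] so the tree is a dyadic partition of the unit interval.
    u1 = [normal_cdf(v) for v in y1]
    u2 = [normal_cdf(v) for v in y2]
    log_bf = 0.0  # log of p(data | H0) / p(data | H1)
    for level in range(1, depth + 1):
        alpha = c * level ** 2  # assumed Beta(alpha, alpha) prior at this level
        n_bins = 2 ** level
        counts1 = bin_counts(u1, n_bins)
        counts2 = bin_counts(u2, n_bins)
        for parent in range(n_bins // 2):
            # Left/right child counts of this parent node, per sample.
            l1, r1 = counts1[2 * parent], counts1[2 * parent + 1]
            l2, r2 = counts2[2 * parent], counts2[2 * parent + 1]
            base = log_beta(alpha, alpha)
            # H0: one shared branching probability for the pooled counts.
            log_h0 = log_beta(alpha + l1 + l2, alpha + r1 + r2) - base
            # H1: a separate branching probability for each sample.
            log_h1 = (log_beta(alpha + l1, alpha + r1) - base) \
                   + (log_beta(alpha + l2, alpha + r2) - base)
            log_bf += log_h0 - log_h1
    log_post_odds = log_bf + math.log(prior_null / (1.0 - prior_null))
    # Numerically stable logistic transform of the posterior log-odds.
    if log_post_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_post_odds))
    e = math.exp(log_post_odds)
    return e / (1.0 + e)
```

Nodes in which only one sample has observations contribute nothing to the Bayes factor in this sketch, so evidence accumulates only where the two samples can actually be compared.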

Abstract:
In this paper we study Bayesian answers to testing problems when the hypotheses are not well separated and propose a general approach with a special focus on testing shape constraints. We then apply our method to several testing problems including testing for positivity and monotonicity in a nonparametric regression setting. For each of these problems, we show that our approach leads to the optimal separation rate of testing, which indicates that our tests have the best power. To our knowledge, separation rates have not been studied in the Bayesian literature so far.

Abstract:
In this paper, we establish a uniform error rate of a Bahadur representation for local polynomial estimators of quantile regression functions. The error rate is uniform over a range of quantiles, a range of evaluation points in the regressors, and over a wide class of probabilities for observed random variables. Most of the existing results on Bahadur representations for local polynomial quantile regression estimators apply to a fixed data generating process. In the context of testing monotonicity, where the null hypothesis is a complex composite hypothesis, it is particularly relevant to establish Bahadur expansions that hold uniformly over a large class of data generating processes. In addition, we establish the same error rate for bootstrap local polynomial estimators, which can be useful for various forms of bootstrap inference. As an illustration, we apply our results to testing monotonicity of quantile regression and present Monte Carlo experiments based on this example.

Abstract:
The regression discontinuity (RD) design is a popular approach to causal inference in non-randomized studies. This is because it can be used to identify and estimate causal effects under mild conditions. Specifically, for each subject, the RD design assigns a treatment or non-treatment, depending on whether or not an observed value of an assignment variable exceeds a fixed and known cutoff value. In this paper, we propose a Bayesian nonparametric regression modeling approach to RD designs, which exploits a local randomization feature. In this approach, the assignment variable is treated as a covariate, and a scalar-valued confounding variable is treated as a dependent variable (which may be a multivariate confounder score). Then, over the model's posterior distribution of locally-randomized subjects that cluster around the cutoff of the assignment variable, inference for causal effects is made within this random cluster, via two-group statistical comparisons of treatment outcomes and non-treatment outcomes. We illustrate the Bayesian nonparametric approach through the analysis of a real educational data set, to investigate the causal link between basic skills and teaching ability.
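
The assignment rule and the two-group comparison near the cutoff can be sketched as follows. This is a deliberately simplified illustration: it replaces the paper's posterior cluster of locally-randomized subjects with a fixed window of half-width `bandwidth` around the cutoff, and it uses a plain difference in means. The function names and the `bandwidth` parameter are assumptions made for this example and do not come from the paper.

```python
from statistics import mean

def assign_treatment(assignment_values, cutoff):
    """Sharp RD rule: a subject is treated when its assignment value exceeds the cutoff."""
    return [a > cutoff for a in assignment_values]

def local_difference_in_means(assignment_values, outcomes, cutoff, bandwidth):
    """Compare treated and untreated outcomes among subjects within `bandwidth` of the cutoff."""
    treated, control = [], []
    for a, y in zip(assignment_values, outcomes):
        if abs(a - cutoff) > bandwidth:
            continue  # subject is too far from the cutoff to be treated as locally randomized
        (treated if a > cutoff else control).append(y)
    if not treated or not control:
        raise ValueError("need at least one subject on each side of the cutoff")
    return mean(treated) - mean(control)
```

The fixed window stands in for the model-based cluster only to show the shape of the comparison; choosing the window is exactly the step that the Bayesian nonparametric model is meant to handle.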

Abstract:
Top monotonicity is a relaxation of various well-known domain restrictions such as single-peaked and single-crossing for which negative impossibility results are circumvented and for which the median-voter theorem still holds. We examine the problem of testing top monotonicity and present a characterization of top monotonicity with respect to non-betweenness constraints. We then extend the definition of top monotonicity to partial orders and show that testing top monotonicity of partial orders is NP-complete.

Abstract:
In this article, we propose a new method for the fundamental task of testing for dependence between two groups of variables. The response densities under the null hypothesis of independence and the alternative hypothesis of dependence are specified by nonparametric Bayesian models. Under the null hypothesis, the joint distribution is modeled by the product of two independent Dirichlet Process Mixture (DPM) priors; under the alternative, the full joint density is modeled by a multivariate DPM prior. The test is then based on the posterior probability of the alternative hypothesis. The proposed test not only performs well relative to other popular nonparametric tests in testing linear dependence, but is also preferred to other methods in testing many of the nonlinear dependencies we explored. In the analysis of gene expression data, we compare different methods for testing pairwise dependence between genes. The results show that the proposed test identifies some dependence structures that are not detected by other tests.

Abstract:
For many real-life Bayesian networks, common knowledge dictates that the output established for the main variable of interest increases with higher values for the observable variables. We define two concepts of monotonicity to capture this type of knowledge. We say that a network is isotone in distribution if the probability distribution computed for the output variable given specific observations is stochastically dominated by any such distribution given higher-ordered observations; a network is isotone in mode if a probability distribution given higher observations has a higher mode. We show that establishing whether a network exhibits any of these properties of monotonicity is coNP$^{\mathrm{PP}}$-complete in general, and remains coNP-complete for polytrees. We present an approximate algorithm for deciding whether a network is monotone in distribution and illustrate its application to a real-life network in oncology.
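
The stochastic dominance condition behind "isotone in distribution" can be checked for a single pair of output distributions as in the sketch below. It assumes the output variable takes finitely many ordered values, with each distribution given as a list of probabilities in that order. The function name `is_stochastically_dominated` and the tolerance `tol` are hypothetical. The sketch checks one pair only; it is not the approximate algorithm of the paper, which must reason over all ordered pairs of observations.

```python
from itertools import accumulate

def is_stochastically_dominated(p_low, p_high, tol=1e-12):
    """Return True if p_high first-order stochastically dominates p_low.

    Both arguments list probabilities over the same ordered output values.
    Dominance holds when the CDF of p_high never exceeds the CDF of p_low.
    """
    if len(p_low) != len(p_high):
        raise ValueError("distributions must be defined over the same values")
    cdf_low = accumulate(p_low)
    cdf_high = accumulate(p_high)
    return all(high <= low + tol for low, high in zip(cdf_low, cdf_high))
```

For example, shifting probability mass toward higher output values, as in moving from `[0.5, 0.3, 0.2]` to `[0.2, 0.3, 0.5]`, satisfies the condition, while the reverse move does not.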

Abstract:
Monotonicity is a key qualitative prediction of a wide array of economic models derived via robust comparative statics. It is therefore important to design effective and practical econometric methods for testing this prediction in empirical analysis. This paper develops a general nonparametric framework for testing monotonicity of a regression function. Using this framework, a broad class of new tests is introduced, which gives an empirical researcher a lot of flexibility to incorporate ex ante information she might have. The paper also develops new methods for simulating critical values, which are based on the combination of a bootstrap procedure and new selection algorithms. These methods yield tests that have correct asymptotic size and are asymptotically nonconservative. It is also shown how to obtain an adaptive rate optimal test that has the best attainable rate of uniform consistency against models whose regression function has Lipschitz-continuous first-order derivatives and that automatically adapts to the unknown smoothness of the regression function. Simulations show that the power of the new tests in many cases significantly exceeds that of some prior tests, e.g. that of Ghosal, Sen, and Van der Vaart (2000). An application of the developed procedures to the dataset of Ellison and Ellison (2011) shows that there is some evidence of strategic entry deterrence in the pharmaceutical industry, where incumbents may use strategic investment to prevent generic entries when their patents expire.

Abstract:
A key problem in statistical modeling is model selection: how to choose a model at an appropriate level of complexity. This problem appears in many settings, most prominently in choosing the number of clusters in mixture models or the number of factors in factor analysis. In this tutorial we describe Bayesian nonparametric methods, a class of methods that side-steps this issue by allowing the data to determine the complexity of the model. This tutorial is a high-level introduction to Bayesian nonparametric methods and contains several examples of their application.
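
One standard construction in which the number of clusters is not fixed in advance is the Chinese restaurant process. The sketch below draws a random partition from it, purely to illustrate how the number of clusters can grow with the amount of data. It is not taken from the tutorial; the function name `sample_crp`, the `concentration` parameter name, and the seeding scheme are assumptions made for this example.

```python
import random

def sample_crp(n_customers, concentration, seed=0):
    """Draw a random partition of n_customers from a Chinese restaurant process.

    Each new customer joins an existing table with probability proportional
    to its current size, or opens a new table with probability proportional
    to `concentration` (which must be positive).
    """
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index of customer i
    for _ in range(n_customers):
        # The last weight corresponds to opening a new table.
        weights = table_sizes + [concentration]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes
```

Calling `sample_crp(100, 1.0)` returns a table index for each of the 100 customers together with the table sizes; the number of occupied tables is random rather than specified by the user, which is the sense in which the model's complexity is left to be determined.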