Abstract:
This paper discusses asymptotically distribution-free tests for the classical goodness-of-fit hypothesis of an error distribution in nonparametric regression models. These tests are based on the same martingale transform of the residual empirical process as is used in the one-sample location model. This transformation eliminates the extra randomization due to covariates, but not that due to the errors, which is intrinsically present in the estimators of the regression function. Thus, tests based on the transformed process generally have better power. The results of this paper are applicable as soon as asymptotic uniform linearity of the nonparametric residual empirical process is available. In particular, they are applicable under the conditions stipulated in recent papers of Akritas and Van Keilegom and of M\"uller, Schick and Wefelmeyer.

Abstract:
This paper applies the recently axiomatized Optimum Information Principle (minimize the Kullback-Leibler information subject to all relevant information) to nonparametric density estimation, which provides a theoretical foundation as well as a computational algorithm for maximum entropy density estimation. The estimator, called the optimum information estimator, approximates the true density arbitrarily well. As a by-product, I obtain a measure of goodness of fit of parametric models (both conditional and unconditional) and an absolute criterion for model selection, as opposed to conventional methods such as AIC and BIC, which are relative measures.
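The maximum-entropy construction underlying such estimators can be illustrated with a minimal numerical sketch. The toy sample, the choice of only two moment constraints, and the quadrature grid below are all assumptions for illustration, not the paper's algorithm: with the first two sample moments as constraints, the entropy-maximizing density is the exponential family $f(x)\propto\exp(\lambda_1 x+\lambda_2 x^2)$, and the multipliers can be found by minimizing the convex dual.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

rng = np.random.default_rng(0)
x = rng.normal(1.0, 0.8, 500)              # toy sample (assumption)

grid = np.linspace(-4.0, 6.0, 2001)        # quadrature grid covering the data
dx = grid[1] - grid[0]
feats = np.vstack([grid, grid**2])         # constraint features: x, x^2
m = np.array([x.mean(), (x**2).mean()])    # sample moments to match

def dual(lam):
    # Convex dual of the maximum-entropy problem: log Z(lam) - lam . m
    return logsumexp(lam @ feats) + np.log(dx) - lam @ m

res = minimize(dual, np.zeros(2), method="BFGS")
lw = res.x @ feats
w = np.exp(lw - lw.max())
f = w / (w.sum() * dx)                     # fitted maxent density on the grid

# With two moment constraints the solution is the normal density with the
# sample mean and variance, which gives a quick sanity check on the fit.
fitted_mean = np.sum(f * grid) * dx
```

At the optimum the dual's gradient vanishes, which forces the fitted density's moments to equal the sample moments; richer constraint sets give richer exponential families by the same mechanism.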

Abstract:
This article describes an extension of classical $\chi^2$ goodness-of-fit tests to Bayesian model assessment. The extension, which essentially involves evaluating Pearson's goodness-of-fit statistic at a parameter value drawn from its posterior distribution, has the important property that it is asymptotically distributed as a $\chi^2$ random variable on $K-1$ degrees of freedom, independently of the dimension of the underlying parameter vector. By examining the posterior distribution of this statistic, global goodness-of-fit diagnostics are obtained. Advantages of these diagnostics include ease of interpretation, computational convenience and favorable power properties. The proposed diagnostics can be used to assess the adequacy of a broad class of Bayesian models, essentially requiring only a finite-dimensional parameter vector and conditionally independent observations.
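The construction can be sketched numerically. The model, prior, and binning below (a normal mean with a flat prior and equal-probability cells) are illustrative assumptions, not the paper's examples: draw a parameter from its posterior, evaluate Pearson's statistic at that draw, and refer it to the $\chi^2_{K-1}$ distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative model (assumption): y_i ~ N(theta, 1) with a flat prior,
# so the posterior of theta is N(ybar, 1/n).
n, K = 200, 5
y = rng.normal(0.3, 1.0, n)
ybar = y.mean()

def pearson_at_posterior_draw():
    theta = rng.normal(ybar, 1.0 / np.sqrt(n))        # posterior draw
    # K equal-probability cells under the model at theta (interior cuts).
    cuts = stats.norm.ppf(np.linspace(0.0, 1.0, K + 1)[1:-1], loc=theta)
    observed = np.bincount(np.searchsorted(cuts, y), minlength=K)
    expected = n / K
    return np.sum((observed - expected) ** 2 / expected)

# Posterior distribution of the statistic; each draw is asymptotically
# chi^2 on K-1 degrees of freedom, whatever the parameter dimension.
draws = np.array([pearson_at_posterior_draw() for _ in range(1000)])
pvalues = stats.chi2.sf(draws, df=K - 1)
```

Summaries of `pvalues` (for example, the posterior probability that the tail p-value falls below a threshold) then serve as the global goodness-of-fit diagnostic.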

Abstract:
The problem of curve registration appears in many different areas of application, ranging from neuroscience to road traffic modeling. In the present work, we propose a nonparametric testing framework in which we develop a generalized likelihood ratio test to perform curve registration. We first prove that, under the null hypothesis, the resulting test statistic is asymptotically distributed as a chi-squared random variable. This result, often referred to as Wilks' phenomenon, provides a natural threshold for the test of a prescribed asymptotic significance level and a natural measure of lack-of-fit in terms of the $p$-value of the $\chi^2$-test. We also prove that the proposed test is consistent, \textit{i.e.}, its power is asymptotically equal to $1$. Finite sample properties of the proposed methodology are demonstrated by numerical simulations. As an application, a new local descriptor for digital images is introduced and an experimental evaluation of its discriminative power is conducted.

Abstract:
In the framework of quantum optics, we study the problem of goodness-of-fit testing in a severely ill-posed inverse problem. A novel testing procedure is introduced and its rates of convergence are investigated under various smoothness assumptions. The procedure is derived from a projection-type estimator, where the projection is done in $\mathbb{L}_2$ distance on some suitably chosen pattern functions. The proposed methodology is illustrated with simulated data sets.

Abstract:
We consider an unknown response function $f$ defined on $\Delta=[0,1]^d$, $1\le d\le\infty$, taken at $n$ random uniform design points and observed with Gaussian noise of known variance. Given a positive sequence $r_n\to 0$ as $n\to\infty$ and a known function $f_0 \in L_2(\Delta)$, we propose, under general conditions, a unified framework for the goodness-of-fit testing problem of testing the null hypothesis $H_0: f=f_0$ against the alternative $H_1: f\in\CF, \|f-f_0\|\ge r_n$, where $\CF$ is an ellipsoid in the Hilbert space $L_2(\Delta)$ with respect to the tensor product Fourier basis and $\|\cdot\|$ is the norm in $L_2(\Delta)$. We obtain both rate and sharp asymptotics for the error probabilities in the minimax setup. The derived tests are inherently non-adaptive. Several illustrative examples are presented. In particular, we consider functions belonging to ellipsoids arising from the well-known multidimensional Sobolev and tensor product Sobolev norms as well as from the lesser-known Sloan-Wo\'{z}niakowski norm and a norm constructed from multivariable analytic functions on the complex strip. Some extensions of the suggested minimax goodness-of-fit testing methodology, covering the cases of general design schemes with a known product probability density function, unknown variance, other basis functions and adaptivity of the suggested tests, are also briefly discussed.

Abstract:
In this article we describe Bayesian nonparametric procedures for two-sample hypothesis testing. Namely, given two sets of samples $\mathbf{y}^{\scriptscriptstyle(1)}\stackrel{\scriptscriptstyle iid}{\sim}F^{\scriptscriptstyle(1)}$ and $\mathbf{y}^{\scriptscriptstyle(2)}\stackrel{\scriptscriptstyle iid}{\sim}F^{\scriptscriptstyle(2)}$, with $F^{\scriptscriptstyle(1)},F^{\scriptscriptstyle(2)}$ unknown, we wish to evaluate the evidence for the null hypothesis $H_0:F^{\scriptscriptstyle(1)}\equiv F^{\scriptscriptstyle(2)}$ versus the alternative $H_1:F^{\scriptscriptstyle(1)}\neq F^{\scriptscriptstyle(2)}$. Our method is based upon a nonparametric P\'{o}lya tree prior centered either subjectively or using an empirical procedure. We show that the P\'{o}lya tree prior leads to an analytic expression for the marginal likelihood under the two hypotheses and hence an explicit measure of the probability of the null, $\mathrm{Pr}(H_0\mid\{\mathbf{y}^{\scriptscriptstyle(1)},\mathbf{y}^{\scriptscriptstyle(2)}\})$.
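A simplified version of such a test can be sketched in closed form. The sketch below assumes a dyadic Pólya tree on $(0,1)$ centered at the uniform distribution, truncated at a fixed number of levels, with Beta$(cj^2, cj^2)$ parameters at level $j$ (a common default choice, not necessarily the paper's centering); constants that cancel in the Bayes factor are omitted.

```python
import numpy as np
from scipy.special import betaln

def pt_log_marginal(x, levels=8, c=1.0):
    """Log marginal likelihood of x in (0,1) under a Polya tree prior
    centered at Uniform(0,1), truncated at `levels` dyadic levels, with
    Beta(c*j^2, c*j^2) parameters at level j (assumed default).
    A constant that cancels in the Bayes factor below is omitted."""
    x = np.asarray(x, dtype=float)
    logml = 0.0
    for j in range(1, levels + 1):
        a = c * j * j
        bins = np.minimum((x * 2**j).astype(int), 2**j - 1)
        counts = np.bincount(bins, minlength=2**j)
        left, right = counts[0::2], counts[1::2]
        # Each dyadic split contributes a Beta-function factor in closed form.
        logml += np.sum(betaln(a + left, a + right) - betaln(a, a))
    return logml

def log_bayes_factor(y1, y2, **kw):
    """log Pr(data | H0) - log Pr(data | H1): one pooled Polya tree
    versus independent Polya trees for the two samples."""
    pooled = pt_log_marginal(np.concatenate([y1, y2]), **kw)
    return pooled - pt_log_marginal(y1, **kw) - pt_log_marginal(y2, **kw)

rng = np.random.default_rng(0)
y1 = rng.uniform(0.0, 0.5, 200)   # toy samples with disjoint supports
y2 = rng.uniform(0.5, 1.0, 200)
# log_bayes_factor(y1, y2) is strongly negative: evidence against H_0.
```

The closed-form Beta-function factors are what make the marginal likelihoods, and hence the posterior probability of the null, available without simulation.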

Abstract:
Given a sample of size $n$ from a population of individuals belonging to different species with unknown proportions, a popular problem of practical interest consists in making inference on the probability $D_{n}(l)$ that the $(n+1)$-th draw coincides with a species with frequency $l$ in the sample, for any $l=0,1,\ldots,n$. This paper contributes to the methodology of Bayesian nonparametric inference for $D_{n}(l)$. Specifically, under the general framework of Gibbs-type priors we show how to derive credible intervals for the Bayesian nonparametric estimator of $D_{n}(l)$, and we investigate the large $n$ asymptotic behaviour of such an estimator. Of particular interest are special cases of our results obtained under the assumption of the two parameter Poisson-Dirichlet prior and the normalized generalized Gamma prior, which are two of the most commonly used Gibbs-type priors. With respect to these two prior assumptions, the proposed results are illustrated through a simulation study and a benchmark Expressed Sequence Tags dataset. To the best of our knowledge, this illustration provides the first comparative study between the two parameter Poisson-Dirichlet prior and the normalized generalized Gamma prior in the context of Bayesian nonparametric inference for $D_{n}(l)$.

Abstract:
The process comparing the empirical cumulative distribution function of the sample with a parametric estimate of the cumulative distribution function is known as the empirical process with estimated parameters and has been extensively employed in the literature for goodness-of-fit testing. The simplest way to carry out such goodness-of-fit tests, especially in a multivariate setting, is to use a parametric bootstrap. Although very easy to implement, the parametric bootstrap can become very computationally expensive as the sample size, the number of parameters, or the dimension of the data increases. An alternative resampling technique based on a fast weighted bootstrap is proposed in this paper, and is studied both theoretically and empirically. The outcome of this work is a generic and computationally efficient multiplier goodness-of-fit procedure that can be used as a large-sample alternative to the parametric bootstrap. In order to approximately determine how large the sample size needs to be for the parametric and weighted bootstraps to have roughly equivalent powers, extensive Monte Carlo experiments are carried out in dimensions one, two and three, and for models containing up to nine parameters. The computational gains resulting from the use of the proposed multiplier goodness-of-fit procedure are illustrated on trivariate financial data. A by-product of this work is a fast large-sample goodness-of-fit procedure for the bivariate and trivariate $t$ distribution whose degrees of freedom are fixed.
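The parametric-bootstrap baseline that the multiplier procedure is designed to accelerate can be sketched as follows; the univariate normal location-scale model and the Cramer-von Mises statistic are illustrative assumptions, not the paper's setting. Each bootstrap iteration re-estimates the parameters and recomputes the statistic, which is precisely the cost that grows with the sample size, the number of parameters and the dimension.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def cvm_stat(x):
    """Cramer-von Mises statistic with parameters estimated from x
    (normal location-scale model, used purely for illustration)."""
    n = len(x)
    mu, sigma = x.mean(), x.std(ddof=1)
    u = np.sort(stats.norm.cdf(x, mu, sigma))
    i = np.arange(1, n + 1)
    return np.sum((u - (2 * i - 1) / (2 * n)) ** 2) + 1.0 / (12 * n)

def parametric_bootstrap_pvalue(x, B=500):
    """Refit the model and recompute the statistic on each of B bootstrap
    samples drawn from the fitted model; the full refit at every iteration
    is what makes this expensive in larger problems."""
    t0 = cvm_stat(x)
    mu, sigma = x.mean(), x.std(ddof=1)
    tb = np.array([cvm_stat(rng.normal(mu, sigma, len(x))) for _ in range(B)])
    return np.mean(tb >= t0)

x = rng.normal(2.0, 1.5, 150)
p = parametric_bootstrap_pvalue(x)
```

The multiplier approach replaces the repeated refitting with cheap weighted perturbations of a single fitted process, which is where the reported computational gains come from.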

Abstract:
This paper studies the problem of nonparametric testing for the effect of a random functional covariate on a real-valued error term. The covariate takes values in $L^2[0,1]$, the Hilbert space of square-integrable real-valued functions on the unit interval. The error term could be directly observed as a response or \emph{estimated} from a functional parametric model, such as functional linear regression. Our test is based on the observation that checking the no-effect of the functional covariate is equivalent to checking the nullity of the conditional expectation of the error term given a sufficiently rich set of projections of the covariate. Such projections could be on elements of norm 1 from finite-dimensional subspaces of $L^2[0,1]$. Next, the idea is to search for a finite-dimensional element of norm 1 that is, in some sense, the least favorable for the null hypothesis. Finally, it remains to perform a nonparametric check of the nullity of the conditional expectation of the error term given the scalar product between the covariate and the selected least favorable direction. For both the finite-dimensional search and the nonparametric check we use a kernel-based approach. As a result, our test statistic is a quadratic form based on univariate kernel smoothing and the asymptotic critical values are given by the standard normal law. The test is able to detect nonparametric alternatives, including polynomial ones. The error term may exhibit heteroscedasticity of unknown form. We do not require the law of the covariate $X$ to be known. The test can be implemented quite easily and performs well in simulations and real data applications. We illustrate the performance of our test for checking the functional linear regression model.
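The final kernel-based check of $E[U\mid Z]=0$, with $Z$ the scalar product between the covariate and a selected direction, can be illustrated by a standardized quadratic-form statistic of Zheng type. This is a simplified, fixed-direction sketch with an assumed Gaussian kernel and rule-of-thumb bandwidth, not the paper's full procedure with a data-driven least favorable direction.

```python
import numpy as np

def quadratic_form_check(z, u, h=None):
    """Standardized kernel quadratic form for H0: E[u | z] = 0.
    Gaussian kernel and rule-of-thumb bandwidth are assumptions; the
    statistic is approximately standard normal under the null."""
    n = len(z)
    if h is None:
        h = 1.06 * z.std() * n ** (-0.2)   # rule-of-thumb bandwidth
    d = (z[:, None] - z[None, :]) / h
    K = np.exp(-0.5 * d**2)                # Gaussian kernel (unnormalized)
    np.fill_diagonal(K, 0.0)               # drop i = j terms
    uu = np.outer(u, u)
    num = np.sum(K * uu)                   # sum over i != j of K_ij u_i u_j
    var = 2.0 * np.sum(K**2 * uu**2)       # its variance estimate under H0
    return num / np.sqrt(var)

rng = np.random.default_rng(0)
z = rng.normal(size=300)
stat_null = quadratic_form_check(z, rng.normal(size=300))  # u independent of z
stat_alt = quadratic_form_check(z, np.sin(3 * z))          # E[u | z] != 0
```

Under the null the statistic is compared with standard normal critical values; under a smooth alternative such as `np.sin(3 * z)` the off-diagonal products reinforce each other and the statistic grows large.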