Abstract:
We propose a new method to test conditional independence of two real random variables $Y$ and $Z$ conditionally on an arbitrary third random variable $X$. The partial copula is introduced, defined as the joint distribution of $U=F_{Y|X}(Y|X)$ and $V=F_{Z|X}(Z|X)$, where $F_{\cdot|\cdot}$ denotes a conditional distribution function. We call this transformation of $(Y,Z)$ into $(U,V)$ the partial copula transform. It is easy to show that if $Y$ and $Z$ are continuous for any given value of $X$, then $Y\ind Z|X$ implies $U\ind V$. Conditional independence can then be tested by (i) applying the partial copula transform to the data points and (ii) applying a test of ordinary independence to the transformed data. In practice, $F_{Y|X}$ and $F_{Z|X}$ will need to be estimated, which can be done by, e.g., standard kernel methods. We show that under easily satisfied conditions, and for a very large class of test statistics for independence which includes the covariance, Kendall's tau, and Hoeffding's test statistic, the effect of this estimation vanishes asymptotically. Thus, for large samples, the estimation can be ignored and we have a simple method which can be used to apply a wide range of tests of independence, including ones with consistency for arbitrary alternatives, to test for conditional independence. A simulation study indicates good small sample performance. Advantages of the partial copula approach compared to competitors seem to be simplicity and generality.
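
The two-step procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the conditional distribution functions are estimated with a Gaussian-kernel (Nadaraya-Watson type) estimator, the bandwidth is fixed by hand, Kendall's tau is the independence statistic, and the p-value uses the usual large-sample normal approximation for tau under independence. All function names are hypothetical.

```python
import math
import random


def conditional_cdf_at_points(x, y, bandwidth):
    """Kernel estimate of F_{Y|X}(y_i | x_i) for every observation i.

    Assumption: scalar X, Gaussian kernel, one fixed bandwidth.
    """
    n = len(x)
    u = []
    for i in range(n):
        weights = [math.exp(-0.5 * ((x[i] - x[j]) / bandwidth) ** 2)
                   for j in range(n)]
        total = sum(weights)
        below = sum(w for w, yj in zip(weights, y) if yj <= y[i])
        u.append(below / total)
    return u


def kendall_tau(u, v):
    """Kendall's tau computed from concordant and discordant pairs."""
    n = len(u)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (u[i] - u[j]) * (v[i] - v[j])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def partial_copula_test(x, y, z, bandwidth):
    """Step (i): partial copula transform; step (ii): test U against V."""
    u = conditional_cdf_at_points(x, y, bandwidth)
    v = conditional_cdf_at_points(x, z, bandwidth)
    tau = kendall_tau(u, v)
    n = len(x)
    # Standard error of tau under independence (no ties): a textbook
    # approximation that ignores the estimation of the conditional CDFs.
    se = math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    p_value = math.erfc(abs(tau / se) / math.sqrt(2))  # two-sided
    return tau, p_value


# Simulated example: Y and Z both depend on X but are independent given X.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(200)]
ys = [a + random.gauss(0, 0.5) for a in xs]
zs = [a + random.gauss(0, 0.5) for a in xs]
print("tau of raw (Y, Z):        ", kendall_tau(ys, zs))
print("partial copula (tau, p):  ", partial_copula_test(xs, ys, zs, 0.3))
```

In this simulated setting the raw pair $(Y,Z)$ is strongly dependent through $X$, while the transformed pair $(U,V)$ should show much weaker dependence. In finite samples the kernel smoothing leaves some residual bias, so the bandwidth choice matters.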

Abstract:
The goal of this paper is to integrate the notions of stochastic conditional independence and variation conditional independence under a more general notion of extended conditional independence. We show that under appropriate assumptions the calculus that applies for the two cases separately (axioms of a separoid) still applies for the extended case. These results provide a rigorous basis for a wide range of statistical concepts, including ancillarity and sufficiency, and, in particular, the Decision Theoretic framework for statistical causality, which uses the language and calculus of conditional independence in order to express causal properties and make causal inferences.

Abstract:
We consider inference procedures, conditional on an observed ancillary statistic, for regression coefficients under a linear regression setup where the unknown error distribution is specified nonparametrically. We establish conditional asymptotic normality of the regression coefficient estimators under regularity conditions, and formally justify the approach of plugging in kernel-type density estimators in conditional inference procedures. Simulation results show that the approach yields accurate conditional coverage probabilities when used for constructing confidence intervals. The plug-in approach can be applied in conjunction with configural polysampling to derive robust conditional estimators adaptive to a confrontation of contrasting scenarios. We demonstrate this by investigating the conditional mean squared error of location estimators under various confrontations in a simulation study, which successfully extends configural polysampling to a nonparametric context.

Abstract:
Logical inference algorithms for conditional independence (CI) statements have important applications from testing consistency during knowledge elicitation to constraint-based structure learning of graphical models. We prove that the implication problem for CI statements is decidable, given that the size of the domains of the random variables is known and fixed. We will present an approximate logical inference algorithm which combines a falsification and a novel validation algorithm. The validation algorithm represents each set of CI statements as a sparse 0-1 matrix A and validates instances of the implication problem by solving specific linear programs with constraint matrix A. We will show experimentally that the algorithm is both effective and efficient in validating and falsifying instances of the probabilistic CI implication problem.

Abstract:
The graphoid axioms for conditional independence, originally described by Dawid [1979], are fundamental to probabilistic reasoning [Pearl, 1988]. Such axioms provide a mechanism for manipulating conditional independence assertions without resorting to their numerical definition. This paper explores a representation for independence statements using multiple undirected graphs and some simple graphical transformations. The independence statements derivable in this system are equivalent to those obtainable by the graphoid axioms. Therefore, this is a purely graphical proof technique for conditional independence.
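
For reference, the graphoid axioms are usually stated as follows. This is the standard textbook formulation, not quoted from the paper, and the notation $X \perp\!\!\!\perp Y \mid Z$ for "X is independent of Y given Z" is an assumption of this note. The first four properties are often called the semi-graphoid axioms; the fifth (intersection) holds for strictly positive distributions.

```latex
\begin{align*}
\text{Symmetry:}      \quad & X \perp\!\!\!\perp Y \mid Z \;\Rightarrow\; Y \perp\!\!\!\perp X \mid Z \\
\text{Decomposition:} \quad & X \perp\!\!\!\perp (Y, W) \mid Z \;\Rightarrow\; X \perp\!\!\!\perp Y \mid Z \\
\text{Weak union:}    \quad & X \perp\!\!\!\perp (Y, W) \mid Z \;\Rightarrow\; X \perp\!\!\!\perp Y \mid (Z, W) \\
\text{Contraction:}   \quad & X \perp\!\!\!\perp Y \mid Z \ \text{ and } \ X \perp\!\!\!\perp W \mid (Z, Y) \;\Rightarrow\; X \perp\!\!\!\perp (Y, W) \mid Z \\
\text{Intersection:}  \quad & X \perp\!\!\!\perp Y \mid (Z, W) \ \text{ and } \ X \perp\!\!\!\perp W \mid (Z, Y) \;\Rightarrow\; X \perp\!\!\!\perp (Y, W) \mid Z
\end{align*}
```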

Abstract:
Bayesian properties of the signed root likelihood ratio statistic are analysed. Conditions for first-order probability matching are derived by the examination of the Bayesian posterior and frequentist means of this statistic. Second-order matching conditions are shown to arise from matching of the Bayesian posterior and frequentist variances of a mean-adjusted version of the signed root statistic. Conditions for conditional probability matching in ancillary statistic models are derived and discussed.

Abstract:
Prior specification for nonparametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. Realistically, a statistician is unlikely to have informed opinions about all aspects of such a parameter, but may have real information about functionals of the parameter, such as the population mean or variance. This article proposes a new framework for nonparametric Bayes inference in which the prior distribution for a possibly infinite-dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a nonparametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard nonparametric prior distributions in common use, and inherit the large support of the standard priors upon which they are based. Additionally, posterior approximations under these informative priors can generally be made via minor adjustments to existing Markov chain approximation algorithms for standard nonparametric prior distributions. We illustrate the use of such priors in the context of multivariate density estimation using Dirichlet process mixture models, and in the modeling of high-dimensional sparse contingency tables.

Abstract:
Conditional independence in a multivariate normal (or Gaussian) distribution is characterized by the vanishing of subdeterminants of the distribution's covariance matrix. Gaussian conditional independence models thus correspond to algebraic subsets of the cone of positive definite matrices. For statistical inference in such models it is important to know whether or not the model contains singularities. We study this issue in models involving up to four random variables. In particular, we give examples of conditional independence relations which, despite being probabilistically representable, yield models that non-trivially decompose into a finite union of several smooth submodels.
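
The determinantal characterization mentioned above can be made concrete with a small sketch. For a Gaussian vector with positive definite covariance matrix $\Sigma$, the statement $X_i \ind X_j | X_K$ holds exactly when the submatrix of $\Sigma$ with rows $\{i\}\cup K$ and columns $\{j\}\cup K$ has determinant zero. The helper names below are hypothetical, the example matrix is an illustrative assumption (a three-variable chain with correlation 1/2 between neighbours), and exact rational arithmetic is used so that "vanishing" can be checked without rounding error.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 0:
        return Fraction(1)
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def ci_minor(sigma, i, j, cond):
    """Determinant of Sigma restricted to rows {i} + cond and columns {j} + cond.

    For a positive definite sigma, this is zero exactly when X_i and X_j are
    conditionally independent given the variables indexed by cond.
    """
    rows = [i] + list(cond)
    cols = [j] + list(cond)
    return det([[sigma[r][c] for c in cols] for r in rows])


# Illustrative covariance: X0 - X1 - X2 chain with neighbour correlation 1/2.
half, quarter, one = Fraction(1, 2), Fraction(1, 4), Fraction(1)
sigma = [
    [one, half, quarter],
    [half, one, half],
    [quarter, half, one],
]

print(ci_minor(sigma, 0, 2, [1]))  # minor for X0 and X2 given X1
print(ci_minor(sigma, 0, 2, []))   # minor for X0 and X2 marginally
```

Here the first minor vanishes, since it equals $\tfrac14 \cdot 1 - \tfrac12 \cdot \tfrac12$, so $X_0$ and $X_2$ are independent given $X_1$. The second minor is just the covariance $\tfrac14$, so they are not marginally independent.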

Abstract:
One of the benefits of belief networks and influence diagrams is that so much knowledge is captured in the graphical structure. In particular, statements of conditional irrelevance (or independence) can be verified in time linear in the size of the graph. To resolve a particular inference query or decision problem, only some of the possible states and probability distributions must be specified, the "requisite information." This paper presents a new, simple, and efficient "Bayes-ball" algorithm which is well-suited to both new students of belief networks and state of the art implementations. The Bayes-ball algorithm determines irrelevant sets and requisite information more efficiently than existing methods, and is linear in the size of the graph for belief networks and influence diagrams.
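
To illustrate how conditional irrelevance can be read off the graph in linear time, here is a minimal sketch of a reachability procedure in the spirit of Bayes-ball. It is not the paper's algorithm: it is the familiar textbook d-separation search that passes a "ball" along active trails, it handles belief networks only (no decision or deterministic nodes), and it does not compute requisite information. The graph encoding (a dict mapping each node to its list of parents) and the function names are assumptions made for this example.

```python
def reachable(parents, source, observed):
    """Nodes connected to `source` by an active trail given `observed`."""
    observed = set(observed)
    children = {node: [] for node in parents}
    for node, pars in parents.items():
        for p in pars:
            children[p].append(node)

    # Phase 1: observed nodes together with all of their ancestors.
    ancestors = set()
    stack = list(observed)
    while stack:
        node = stack.pop()
        if node not in ancestors:
            ancestors.add(node)
            stack.extend(parents[node])

    # Phase 2: pass the ball. "up" means the ball arrived from a child,
    # "down" means it arrived from a parent.
    visited = set()
    result = set()
    stack = [(source, "up")]
    while stack:
        node, direction = stack.pop()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in observed:
            result.add(node)
        if direction == "up" and node not in observed:
            stack.extend((p, "up") for p in parents[node])
            stack.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in observed:
                # Chain: keep moving down through an unobserved node.
                stack.extend((c, "down") for c in children[node])
            if node in ancestors:
                # Collider that is observed or has an observed descendant:
                # the ball bounces back up to the parents.
                stack.extend((p, "up") for p in parents[node])
    return result


def d_separated(parents, a, b, observed):
    """True if a and b are d-separated given the observed set."""
    return b not in reachable(parents, a, observed)


# Hypothetical example network: A -> C <- B, C -> D.
graph = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
print(d_separated(graph, "A", "B", []))     # collider C blocks the trail
print(d_separated(graph, "A", "B", ["D"]))  # observing a descendant of C opens it
print(d_separated(graph, "A", "D", ["C"]))  # observing C blocks the chain
```

Each (node, direction) pair is processed at most once and only its parents and children are scanned, so the search runs in time proportional to the number of nodes plus edges.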

Abstract:
This paper presents an approximate method for performing Bayesian inference in models with conditional independence over a decentralized network of learning agents. The method first employs variational inference on each individual learning agent to generate a local approximate posterior; the agents then transmit their local posteriors to other agents in the network; and finally each agent combines its set of received local posteriors. The key insight in this work is that, for many Bayesian models, approximate inference schemes destroy symmetry and dependencies in the model that are crucial to the correct application of Bayes' rule when combining the local posteriors. The proposed method addresses this issue by including an additional optimization step in the combination procedure that accounts for these broken dependencies. Experiments on synthetic and real data demonstrate that the decentralized method provides advantages in computational performance and predictive test likelihood over previous batch and distributed methods.