Abstract:
Matrix Schubert varieties are certain varieties in the affine space of square matrices which are determined by specifying rank conditions on submatrices. We study these varieties for generic matrices, symmetric matrices, and upper triangular matrices in view of two applications to algebraic statistics: we observe that special conditional independence models for Gaussian random variables are intersections of matrix Schubert varieties in the symmetric case. Consequently, we obtain a combinatorial primary decomposition algorithm for some conditional independence ideals. We also characterize the vanishing ideals of Gaussian graphical models for generalized Markov chains. In the course of this investigation, we are led to consider three related stratifications, which come from the Schubert stratification of a flag variety. We provide some combinatorial results, including describing the stratifications using the language of rank arrays and enumerating the strata in each case.
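
As a small illustration of the rank arrays mentioned above, the sketch below computes the rank array of a permutation in the generic-matrix setting. It assumes the common convention $r_w(p,q) = \#\{i \le p : w(i) \le q\}$, under which a matrix lies in the matrix Schubert variety of $w$ when each northwest $p \times q$ submatrix has rank at most $r_w(p,q)$. The paper's own conventions, especially in the symmetric and upper triangular cases, may differ, and the function name `rank_array` is hypothetical.

```python
def rank_array(w):
    """Rank array of a permutation w given in one-line notation (values 1..n).

    Entry r[p][q] (0-based storage) counts the indices i <= p+1 with
    w(i) <= q+1, i.e. the rank of the northwest (p+1) x (q+1) block of
    the permutation matrix with ones at positions (i, w(i)).
    """
    n = len(w)
    return [
        [sum(1 for i in range(p + 1) if w[i] <= q + 1) for q in range(n)]
        for p in range(n)
    ]


# Example: w = 213. The top-left entry 0 encodes the single condition
# "the (1,1) entry of the matrix vanishes".
r = rank_array([2, 1, 3])
```

Here `r` equals `[[0, 1, 1], [1, 2, 2], [1, 2, 3]]`; only the first entry imposes a nontrivial rank condition on a generic $3 \times 3$ matrix.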

Abstract:
Ancestral graph models, introduced by Richardson and Spirtes (2002), generalize both Markov random fields and Bayesian networks to a class of graphs with a global Markov property that is closed under conditioning and marginalization. By design, ancestral graphs encode precisely the conditional independence structures that can arise from Bayesian networks with selection and unobserved (hidden/latent) variables. Thus, ancestral graph models provide a potentially very useful framework for exploratory model selection when unobserved variables might be involved in the data-generating process but no particular hidden structure can be specified. In this paper, we present the Iterative Conditional Fitting (ICF) algorithm for maximum likelihood estimation in Gaussian ancestral graph models. The name reflects that in each step of the procedure a conditional distribution is estimated, subject to constraints, while a marginal distribution is held fixed. This approach is in duality to the well-known Iterative Proportional Fitting algorithm, in which marginal distributions are fitted while conditional distributions are held fixed.
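
For contrast with ICF, the following is a minimal sketch of classical Iterative Proportional Fitting on a two-way table. It is not the ICF algorithm of the paper; the function name `ipf` and the example numbers are illustrative assumptions. Each pass rescales the table so that one margin matches its target, which leaves the odds ratios of the starting table (its conditional structure) unchanged.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Classical IPF on a two-way table (illustrative sketch).

    Assumes strictly positive entries and that the row and column
    targets have the same total.
    """
    table = [[float(x) for x in row] for row in seed]
    n_rows, n_cols = len(table), len(table[0])
    for _ in range(max_iter):
        # Row step: scale each row so its sum matches the row target.
        for i in range(n_rows):
            scale = row_targets[i] / sum(table[i])
            table[i] = [scale * x for x in table[i]]
        # Column step: scale each column so its sum matches the column target.
        for j in range(n_cols):
            col_sum = sum(table[i][j] for i in range(n_rows))
            scale = col_targets[j] / col_sum
            for i in range(n_rows):
                table[i][j] *= scale
        # Stop once the row margins still match after the column step.
        if all(abs(sum(table[i]) - row_targets[i]) < tol for i in range(n_rows)):
            break
    return table


seed = [[1.0, 2.0], [3.0, 4.0]]
fitted = ipf(seed, row_targets=[40.0, 60.0], col_targets=[30.0, 70.0])

# The odds ratio is invariant under row and column rescaling.
odds_seed = seed[0][0] * seed[1][1] / (seed[0][1] * seed[1][0])
odds_fit = fitted[0][0] * fitted[1][1] / (fitted[0][1] * fitted[1][0])
```

The fitted table reproduces both target margins while `odds_fit` agrees with `odds_seed` up to rounding, which is the sense in which IPF fits marginals and holds conditionals fixed.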

Abstract:
We show that there can be no finite list of conditional independence relations which can be used to deduce all conditional independence implications among Gaussian random variables. To do this, we construct, for each $n > 3$, a family of $n$ conditional independence statements on $n$ random variables which together imply that $X_1 \ind X_2$, and such that no proper subset has the same implication. The proof relies on binomial primary decomposition.
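
For orientation, a standard fact not stated in the abstract is that Gaussian conditional independence is algebraic: with covariance matrix $\Sigma = (\sigma_{ij})$ and a single conditioning variable $X_k$, the statement $X_i \ind X_j \mid X_k$ holds exactly when $\sigma_{ij}\sigma_{kk} - \sigma_{ik}\sigma_{kj} = 0$. The sketch below checks this for a hypothetical three-variable example; the helper name `ci_minor` and the matrix are assumptions made for illustration only.

```python
def ci_minor(sigma, i, j, k):
    """2x2 minor of sigma with rows {i, k} and columns {j, k}.

    For a Gaussian vector with positive definite covariance sigma,
    X_i is independent of X_j given X_k exactly when this is zero.
    """
    return sigma[i][j] * sigma[k][k] - sigma[i][k] * sigma[k][j]


# Hypothetical model: X3 ~ N(0, 1), X1 = X3 + noise, X2 = X3 + noise,
# with independent unit-variance noise. Indices 0, 1, 2 stand for X1, X2, X3.
sigma = [
    [2, 1, 1],
    [1, 2, 1],
    [1, 1, 1],
]

given_x3 = ci_minor(sigma, 0, 1, 2)   # zero: X1 and X2 independent given X3
marginal = sigma[0][1]                # nonzero: X1 and X2 marginally dependent
given_x2 = ci_minor(sigma, 0, 2, 1)   # nonzero: X1 and X3 dependent given X2
```

Because each statement is a polynomial equation in the entries of $\Sigma$, implications among such statements become questions about ideals, which is where primary decomposition enters.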

Abstract:
Many relations of scientific interest are nonlinear, and even in linear systems distributions are often non-Gaussian, for example in fMRI BOLD data. A class of search procedures for causal relations in high-dimensional data relies on sample-derived conditional independence decisions. The most common applications rely on Gaussian tests that can be systematically erroneous in nonlinear, non-Gaussian cases. Recent work (Gretton et al. (2009), Tillman et al. (2009), Zhang et al. (2011)) has proposed conditional independence tests using Reproducing Kernel Hilbert Spaces (RKHS). Among these, perhaps the most efficient has been KCI (Kernel Conditional Independence, Zhang et al. (2011)), with computational requirements that grow effectively at least as $O(N^3)$, placing it out of range of large sample size analysis and restricting its applicability to high-dimensional data sets. We propose a class of $O(N^2)$ tests using conditional correlation independence (CCI) that require a few seconds on a standard workstation for tests that require tens of minutes to hours for the KCI method, depending on degree of parallelization, with similar accuracy. For accuracy on difficult nonlinear, non-Gaussian data sets, we also compare against a recent test due to Harris & Drton (2012), applicable to nonlinear, non-Gaussian distributions in the Gaussian copula, and against partial correlation, a linear Gaussian test.
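
To make the linear Gaussian baseline concrete, here is a minimal sketch of a partial correlation test with one conditioning variable, using the usual Fisher z-transform. It is not the proposed CCI test, and the function names and simulated data are assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr_test(x, y, z):
    """Partial correlation of x and y given z, with a Fisher z p-value."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    r = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    n, n_cond = len(x), 1
    stat = math.sqrt(n - n_cond - 3) * abs(0.5 * math.log((1 + r) / (1 - r)))
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(stat / math.sqrt(2))
    return r, p_value


# Simulated linear Gaussian data in which x and y depend only on z.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_marginal = pearson(x, y)
r_partial, p_value = partial_corr_test(x, y, z)
```

In this linear Gaussian setting the marginal correlation is close to 0.5 while the partial correlation is close to zero. The test can be misleading when the dependence is nonlinear or the noise is non-Gaussian, which is the situation the kernel-based and CCI tests target.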

Abstract:
We propose a new class of models for random permutations, which we call log-linear models, by analogy with the log-linear models used in the analysis of contingency tables. As a special case, we study the family of all Luce-decomposable distributions, and the family of those random permutations for which the distributions of both the permutation and its inverse are Luce-decomposable. We show that these latter models can be described by conditional independence relations. We calculate the number of free parameters in these models and describe an iterative algorithm for maximum likelihood estimation, which enables us to test whether a set of data satisfies the conditional independence relations.

Abstract:
Spatio-temporal models are widely used for inference in statistics and many applied areas. In such contexts, interest often lies in the fractal nature of the sample surfaces and in the rate of change of the spatial surface at a given location in a given direction. In this paper we apply the theory of Yaglom (1957) to construct a large class of space-time Gaussian models with stationary increments, establish bounds on the prediction errors, and determine the smoothness and fractal properties of this class of Gaussian models. Our results can be applied directly to analyze the stationary space-time models introduced by Cressie and Huang (1999), Gneiting (2002) and Stein (2005), respectively.

Abstract:
The semigraphoid closure of every couple of CI-statements (CI = conditional independence) is a stochastic CI-model. As a consequence of this result it is shown that every probabilistically sound inference rule for CI-models, having at most two antecedents, is derivable from the semigraphoid inference rules. This justifies the use of semigraphoids as approximations of stochastic CI-models in probabilistic reasoning. The list of all 19 potential dominant elements of the mentioned semigraphoid closure is given as a byproduct.
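
For reference, the semigraphoid inference rules are usually stated as follows for pairwise disjoint sets of variables $A$, $B$, $C$, $D$. This is the standard textbook formulation, given here as a reminder rather than quoted from the paper, and the paper's notation may differ. Each rule has at most two antecedents.

```latex
\begin{align*}
\text{Symmetry:}      \quad & A \perp\!\!\!\perp B \mid C
                              \;\Longrightarrow\; B \perp\!\!\!\perp A \mid C \\
\text{Decomposition:} \quad & A \perp\!\!\!\perp B \cup D \mid C
                              \;\Longrightarrow\; A \perp\!\!\!\perp B \mid C \\
\text{Weak union:}    \quad & A \perp\!\!\!\perp B \cup D \mid C
                              \;\Longrightarrow\; A \perp\!\!\!\perp B \mid C \cup D \\
\text{Contraction:}   \quad & A \perp\!\!\!\perp B \mid C \cup D
                              \;\text{ and }\; A \perp\!\!\!\perp D \mid C
                              \;\Longrightarrow\; A \perp\!\!\!\perp B \cup D \mid C
\end{align*}
```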

Abstract:
For the nonparametric estimation of multivariate finite mixture models with the conditional independence assumption, we propose a new formulation of the objective function in terms of penalized smoothed Kullback-Leibler distance. The nonlinearly smoothed majorization-minimization (NSMM) algorithm is derived from this perspective. An elegant representation of the NSMM algorithm is obtained using a novel projection-multiplication operator, a more precise monotonicity property of the algorithm is discovered, and the existence of a solution to the main optimization problem is proved for the first time.

Abstract:
Conditional Gaussian graphical models (cGGM) are a recent reparametrization of the multivariate linear regression model which explicitly exhibits $i)$ the partial covariances between the predictors and the responses, and $ii)$ the partial covariances between the responses themselves. Such models are particularly suitable for interpretability since partial covariances describe strong relationships between variables. In this framework, we propose a regularization scheme to enhance the learning strategy of the model by driving the selection of the relevant input features by prior structural information. It comes with an efficient alternating optimization procedure which is guaranteed to converge to the global minimum. On top of showing competitive performance on artificial and real datasets, our method demonstrates capabilities for fine interpretation of its parameters, as illustrated on three high-dimensional datasets from spectroscopy, genetics, and genomics.

Abstract:
Log-linear models are a classical tool for the analysis of contingency tables. In particular, the subclass of graphical log-linear models provides a general framework for modelling conditional independences. However, with the exception of special structures, marginal independence hypotheses cannot be accommodated by these traditional models. Focusing on binary variables, we present a model class that provides a framework for modelling marginal independences in contingency tables. The approach taken is graphical and draws on analogies to multivariate Gaussian models for marginal independence. For the graphical model representation we use bi-directed graphs, which are in the tradition of path diagrams. We show how the models can be parameterized in a simple fashion, and how maximum likelihood estimation can be performed using a version of the Iterated Conditional Fitting algorithm. Finally we consider combining these models with symmetry restrictions.