Abstract:
We propose an adaptive diffusion mechanism to optimize a global cost function in a distributed manner over a network of nodes. The cost function is assumed to consist of a collection of individual components. Diffusion adaptation allows the nodes to cooperate and diffuse information in real time; it also helps alleviate the effects of stochastic gradient noise and measurement noise through a continuous learning process. We analyze the mean-square-error performance of the algorithm in some detail, including its transient and steady-state behavior. We also apply the diffusion algorithm to two problems: distributed estimation with sparse parameters and distributed localization. Compared to well-studied incremental methods, diffusion methods do not require the use of a cyclic path over the nodes and are robust to node and link failures. Diffusion methods also endow networks with adaptation abilities that enable the individual nodes to continue learning even when the cost function changes with time. Examples involving such dynamic cost functions with moving targets are common in the context of biological networks.
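A minimal sketch of the adapt-then-combine diffusion recursion the abstract describes, on a toy ring network with LMS (least-mean-squares) individual costs; the topology, step-size, and noise level are all illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 10, 4                      # nodes, parameter dimension
w_true = rng.standard_normal(M)   # common minimizer of the global cost

# Combination matrix for a ring topology, columns summing to one:
# A[l, k] is the weight node k assigns to neighbor l.
A = np.zeros((N, N))
for k in range(N):
    for l in (k - 1, k, k + 1):
        A[l % N, k] = 1.0
A /= A.sum(axis=0)

mu = 0.02                         # constant step-size (enables ongoing adaptation)
W = np.zeros((N, M))              # row k holds node k's current estimate
for _ in range(2000):
    # Adapt: each node takes a stochastic-gradient (LMS) step on its own data.
    psi = np.empty_like(W)
    for k in range(N):
        u = rng.standard_normal(M)
        d = u @ w_true + 0.1 * rng.standard_normal()
        psi[k] = W[k] + mu * (d - u @ W[k]) * u
    # Combine: each node fuses its neighbors' intermediate estimates.
    W = A.T @ psi

msd = np.mean(np.sum((W - w_true) ** 2, axis=1))  # network mean-square deviation
```

Because the step-size is constant rather than decaying, the same recursion keeps tracking `w_true` if it drifts over time, which is the adaptation property emphasized in the abstract.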

Abstract:
We deal with the estimation of the number of regimes in a linear Gaussian autoregressive process with a Markov regime (AR-MR). Estimating the number of regimes in this type of series amounts to determining the number of states of the hidden Markov chain controlling the process. We propose a method based on penalized maximum likelihood estimation and establish its strong (almost sure) consistency without assuming a priori bounds on the number of states.
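A toy sketch of penalized maximum likelihood order selection. As a deliberately simplified stand-in for the AR-MR setting, it selects the number of components of an i.i.d. Gaussian mixture (no Markov dynamics) by maximizing log-likelihood minus a BIC-type penalty; the data, penalty constant, and EM details are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data with two well-separated regimes (i.i.d. stand-in for AR-MR data).
x = np.concatenate([rng.normal(-3.0, 1.0, 300), rng.normal(3.0, 1.0, 300)])
n = x.size

def loglik(x, pi, mu, sig):
    """Mixture log-likelihood and per-point per-component log-densities."""
    logp = (-0.5 * ((x[:, None] - mu) / sig) ** 2
            - np.log(sig) - 0.5 * np.log(2 * np.pi) + np.log(pi))
    m = logp.max(axis=1, keepdims=True)
    return np.sum(m[:, 0] + np.log(np.exp(logp - m).sum(axis=1))), logp

def fit(x, K, n_iter=300):
    """Plain EM for a K-component Gaussian mixture; returns the maximized log-likelihood."""
    mu = np.quantile(x, (np.arange(K) + 0.5) / K)   # spread initial means over the data
    sig, pi = np.full(K, x.std()), np.full(K, 1.0 / K)
    for _ in range(n_iter):
        _, logp = loglik(x, pi, mu, sig)
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)           # E-step: responsibilities
        nk = r.sum(axis=0)                          # M-step: weighted updates
        pi, mu = nk / n, (r * x[:, None]).sum(axis=0) / nk
        sig = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
    return loglik(x, pi, mu, sig)[0]

def dim(K):
    return 3 * K - 1                                # free weights + means + variances

# Penalized ML: log-likelihood minus a penalty increasing in the order K.
score = {K: fit(x, K) - 0.5 * dim(K) * np.log(n) for K in (1, 2, 3, 4)}
K_hat = max(score, key=score.get)
```

The penalty term is what prevents the likelihood, which never decreases as states are added, from always favoring the largest candidate order.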

Abstract:
Adaptive networks consist of a collection of nodes with adaptation and learning abilities. The nodes interact with each other on a local level and diffuse information across the network to solve estimation and inference tasks in a distributed manner. In this work, we compare the mean-square performance of two main strategies for distributed estimation over networks: consensus strategies and diffusion strategies. The analysis in the paper confirms that under constant step-sizes, diffusion strategies allow information to diffuse more thoroughly through the network and this property has a favorable effect on the evolution of the network: diffusion networks are shown to converge faster and reach lower mean-square deviation than consensus networks, and their mean-square stability is insensitive to the choice of the combination weights. In contrast, and surprisingly, it is shown that consensus networks can become unstable even if all the individual nodes are stable and able to solve the estimation task on their own. When this occurs, cooperation over the network leads to a catastrophic failure of the estimation task. This phenomenon does not occur for diffusion networks: we show that stability of the individual nodes always ensures stability of the diffusion network irrespective of the combination topology. Simulation results support the theoretical findings.
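The two strategies compared in the abstract differ only in where the combination step enters the recursion. A side-by-side sketch under one stable toy configuration (ring network, LMS costs, small constant step-size; all parameters illustrative):

```python
import numpy as np

N, M, mu, T = 10, 4, 0.02, 2000   # nodes, dimension, step-size, iterations

def run(strategy, seed=0):
    rng = np.random.default_rng(seed)
    w_true = rng.standard_normal(M)
    # Ring topology; combination-matrix columns sum to one.
    A = np.zeros((N, N))
    for k in range(N):
        for l in (k - 1, k, k + 1):
            A[l % N, k] = 1.0
    A /= A.sum(axis=0)
    W = np.zeros((N, M))
    for _ in range(T):
        U = rng.standard_normal((N, M))
        d = U @ w_true + 0.1 * rng.standard_normal(N)
        grad = (d - np.einsum('km,km->k', U, W))[:, None] * U
        if strategy == "consensus":
            # Consensus: combine neighbors, but adapt around the *prior* iterate.
            W = A.T @ W + mu * grad
        else:
            # Diffusion (adapt-then-combine): adapt first, then combine the result.
            W = A.T @ (W + mu * grad)
    return np.mean(np.sum((W - w_true) ** 2, axis=1))

msd_cons, msd_diff = run("consensus"), run("diffusion")
```

In this benign regime both variants converge; the instability phenomenon the abstract describes for consensus arises under other step-size/topology combinations, which this sketch does not attempt to reproduce.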

Abstract:
The solution to a multi-dimensional linear Stochastic Differential Equation (SDE) with constant initial state is well known to be a Gaussian Markov process, but its covariance kernel involves the solution to an integral equation in the general case. We show that the covariance kernel has a simpler semi-parametric form for families of such solutions representing increments of a common process. We then show that a covariance kernel of a particular parametric form is necessary and sufficient for a solution to have stationary increments and, in considerable generality, for a mean-square continuous Gaussian process to have stationary increments and the Markov property. Applying a Gaussian process with such a parametric kernel to the problem of predicting multi-dimensional time series, we derive closed-form expressions for the posterior moments and for maximum likelihood estimators of the parameters that are unbiased, jointly sufficient, and easily computed for any dimension.
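A small illustration of closed-form Gaussian prediction with a stationary-increment Gauss-Markov kernel. The Brownian-motion covariance $\sigma^2\min(s,t)$ is used as a canonical example of such a kernel; the observation times and sampled path are made up for the demonstration:

```python
import numpy as np

def kernel(s, t, sigma2=1.0):
    # Brownian-motion covariance: the canonical stationary-increment,
    # Gauss-Markov kernel (illustrative choice of parametric form).
    return sigma2 * np.minimum.outer(s, t)

rng = np.random.default_rng(3)
t_obs = np.array([0.5, 1.0, 2.0, 3.0])
K = kernel(t_obs, t_obs)
# Sample one path of the process at the observation times.
y = np.linalg.cholesky(K + 1e-10 * np.eye(len(t_obs))) @ rng.standard_normal(len(t_obs))

# Closed-form Gaussian conditioning gives the posterior moments at new times.
t_new = np.array([1.5, 2.5, 4.0])
Ks = kernel(t_obs, t_new)
Kss = kernel(t_new, t_new)
alpha = np.linalg.solve(K, y)
post_mean = Ks.T @ alpha
post_cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
```

The Markov property is visible in the output: the posterior mean at an interior time depends only on the two flanking observations (the midpoint of a bridge), and the posterior at a time beyond the last observation depends only on that last observation.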

Abstract:
We find the class, ${\cal{C}}_k$, $k \ge 0$, of all zero-mean stationary Gaussian processes, $Y(t)$, $t \in \mathbb{R}$, with $k$ derivatives, for which \begin{equation} Z(t) \equiv (Y^{(0)}(t), Y^{(1)}(t), \ldots, Y^{(k)}(t)), \quad t \ge 0 \end{equation} is a $(k+1)$-vector Markov process (here, $Y^{(0)}(t) = Y(t)$).
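For orientation, a classical special case (not stated in the abstract): when $k = 0$ the condition is that $Y$ itself be Markov, and by Doob's theorem a zero-mean, mean-square continuous stationary Gaussian process is Markov exactly when its covariance is exponential, so ${\cal{C}}_0$ consists of the Ornstein-Uhlenbeck processes:

```latex
% k = 0: stationary Gaussian + Markov  <=>  Ornstein-Uhlenbeck covariance
R(\tau) \;=\; \mathbb{E}\bigl[\,Y(t+\tau)\,Y(t)\,\bigr] \;=\; \sigma^2 e^{-\alpha |\tau|},
\qquad \alpha > 0 .
```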

Abstract:
Given a Gaussian Markov random field, we consider the problem of selecting a subset of variables to observe which minimizes the total expected squared prediction error of the unobserved variables. We first show that finding an exact solution is NP-hard even for a restricted class of Gaussian Markov random fields, called Gaussian free fields, which arise in semi-supervised learning and computer vision. We then give a simple greedy approximation algorithm for Gaussian free fields on arbitrary graphs. Finally, we give a message passing algorithm for general Gaussian Markov random fields on bounded tree-width graphs.
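A minimal sketch of the greedy approximation idea on a toy Gaussian Markov random field. The rank-one conditioning update below is standard Gaussian algebra; the path-graph precision matrix, budget, and coupling strength are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

def greedy_select(Sigma, k):
    """Greedily pick k variables to observe, each step choosing the one that most
    reduces the total conditional variance of the remaining variables (sketch)."""
    n = Sigma.shape[0]
    S = []
    C = Sigma.copy()          # conditional covariance given the current set S
    for _ in range(k):
        # Observing v applies the rank-1 update C - C[:,v]C[v,:]/C[v,v], so the
        # trace (total expected squared prediction error) drops by sum(C[:,v]^2)/C[v,v].
        gains = [np.sum(C[:, v] ** 2) / C[v, v] if v not in S else -np.inf
                 for v in range(n)]
        v = int(np.argmax(gains))
        S.append(v)
        C = C - np.outer(C[:, v], C[v, :]) / C[v, v]
    return S

# Toy GMRF: precision matrix of a path graph with nearest-neighbor coupling.
n = 6
P = 2.0 * np.eye(n)
for i in range(n - 1):
    P[i, i + 1] = P[i + 1, i] = -0.9
Sigma = np.linalg.inv(P)
S = greedy_select(Sigma, 2)
```

On a path graph with positive correlations, the first greedy pick lands in the interior, where one observation explains variance in both directions.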

Abstract:
This paper investigates the Gaussian quasi-likelihood estimation of an exponentially ergodic multidimensional Markov process, which is expressed as a solution to a L\'{e}vy driven stochastic differential equation whose coefficients are known except for the finite-dimensional parameters to be estimated, where the diffusion coefficient may be degenerate or even null. We suppose that the process is discretely observed under a rapidly increasing experimental design with step size $h_n$. By means of a polynomial-type large deviation inequality, we derive convergence of the corresponding statistical random fields in a mode strong enough to yield asymptotic normality at rate $\sqrt{nh_n}$ for all the target parameters, together with convergence of their moments. As the Gaussian quasi-likelihood looks only at the local-mean and local-covariance structures, the efficiency loss may be large in some instances. Nevertheless, it has two practically important advantages: first, computing the estimates requires no fine tuning and is therefore straightforward; second, the estimation procedure can be adopted without full specification of the L\'{e}vy measure.
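A one-dimensional toy version of the Gaussian quasi-likelihood idea: only the local mean and local variance of the increments are modeled. The Ornstein-Uhlenbeck example, Euler simulation, and closed-form drift estimator below are illustrative assumptions, far simpler than the paper's Lévy-driven setting:

```python
import numpy as np

rng = np.random.default_rng(4)

# Euler simulation of an ergodic OU process dX = -theta*X dt + dW,
# observed discretely at step size h with terminal time n*h growing.
theta_true, h, n = 1.5, 0.01, 20000
X = np.empty(n + 1)
X[0] = 0.0
for i in range(n):
    X[i + 1] = X[i] - theta_true * X[i] * h + np.sqrt(h) * rng.standard_normal()

# Gaussian quasi-likelihood matches only the local mean (-theta*X_i*h) and the
# local variance (h); minimizing sum((dX_i + theta*X_i*h)**2 / h) over theta
# gives a closed-form estimator -- no fine tuning, no noise-law specification.
dX = np.diff(X)
theta_hat = -np.sum(X[:-1] * dX) / (h * np.sum(X[:-1] ** 2))
```

With terminal time $nh = 200$, the estimator lands within a few standard errors of the true drift parameter, consistent with $\sqrt{nh_n}$-rate asymptotic normality.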

Abstract:
We prove a version of the multidimensional Fourth Moment Theorem for chaotic random vectors in the general context of diffusion Markov generators. In addition to the usual componentwise convergence, and unlike in the case of the infinite-dimensional Ornstein-Uhlenbeck generator, a further moment-type condition is required to imply joint convergence of a given sequence of vectors.

Abstract:
Assuming that the loss function is convex in the prediction, we construct a prediction strategy universal for the class of Markov prediction strategies, not necessarily continuous. Allowing randomization, we remove the requirement of convexity.
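A toy instance of the universality statement for a *finite* class of order-1 Markov strategies on binary outcomes, aggregated with the exponentially weighted average forecaster under square loss (convex in the prediction). The expert class, learning rate, and outcome sequence are illustrative assumptions, not the paper's construction:

```python
import numpy as np

# All four deterministic order-1 Markov experts on binary outcomes:
# expert (a, b) predicts a after observing 0 and b after observing 1.
experts = [(a, b) for a in (0, 1) for b in (0, 1)]
eta = 2.0                          # learning rate for the exponential weighting
logw = np.zeros(len(experts))      # log-weights (start uniform)

outcomes = [0, 1] * 250            # alternating sequence; expert (1, 0) is perfect
prev, total_loss = outcomes[0], 0.0
for y in outcomes[1:]:
    preds = np.array([e[prev] for e in experts], dtype=float)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    p = w @ preds                  # aggregated (convex-combination) prediction
    total_loss += (p - y) ** 2
    logw -= eta * (preds - y) ** 2 # penalize each expert by its own square loss
    prev = y

avg_loss = total_loss / (len(outcomes) - 1)
```

The forecaster's average loss rapidly approaches that of the best Markov expert (zero here); the paper's result extends this competitiveness to the full, infinite class of Markov strategies, with randomization removing the convexity requirement.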