Abstract:
Recently there has been an increasing interest in methods that deal with multiple outputs. This has been motivated partly by frameworks like multitask learning, multisensor networks or structured output data. From a Gaussian processes perspective, the problem reduces to specifying an appropriate covariance function that, whilst being positive semi-definite, captures the dependencies between all the data points and across all the outputs. One approach to account for non-trivial correlations between outputs employs convolution processes. Under a latent function interpretation of the convolution transform we establish dependencies between output variables. The main drawbacks of this approach are the associated computational and storage demands. In this paper we address these issues. We present different sparse approximations for dependent output Gaussian processes constructed through the convolution formalism. We exploit the conditional independencies present naturally in the model. This leads to a form of the covariance similar in spirit to the so-called PITC and FITC approximations for a single output. We show experimental results with synthetic and real data; in particular, we show results in pollution prediction, school exam score prediction and gene expression data.

Abstract:
Many signal processing and machine learning methods share essentially the same linear-in-the-parameters model, with as many parameters as available samples, as in kernel-based machines. Sparse approximation is essential in many disciplines, with new challenges emerging in online learning with kernels. To this end, several sparsity measures have been proposed in the literature to quantify sparse dictionaries and construct relevant ones, the most prolific ones being the distance, the approximation, the coherence and the Babel measures. In this paper, we analyze sparse dictionaries based on these measures. By conducting an eigenvalue analysis, we show that these sparsity measures share many properties, including the linear independence condition and inducing a well-posed optimization problem. Furthermore, we prove that there exists a quasi-isometry between the parameter (i.e., dual) space and the dictionary's induced feature space.
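
As an illustration of two of the measures named above, the following minimal sketch computes the coherence and the Babel measure of a small kernel dictionary from its Gram matrix. The definitions used here (coherence as the largest normalized off-diagonal Gram entry in absolute value, Babel as the largest off-diagonal absolute row sum) are the common ones for unit-norm kernels and are assumed rather than quoted from the paper; the Gaussian kernel, its bandwidth, and the three atoms are hypothetical choices.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def gram_matrix(dictionary, kernel):
    """Matrix of kernel evaluations between all pairs of dictionary atoms."""
    return [[kernel(xi, xj) for xj in dictionary] for xi in dictionary]

def coherence(gram):
    """Largest absolute normalized kernel value between distinct atoms."""
    n = len(gram)
    return max(
        abs(gram[i][j]) / math.sqrt(gram[i][i] * gram[j][j])
        for i in range(n) for j in range(n) if i != j
    )

def babel(gram):
    """Largest row sum of absolute off-diagonal Gram entries."""
    n = len(gram)
    return max(
        sum(abs(gram[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )

# Hypothetical one-dimensional dictionary with three atoms.
atoms = [(0.0,), (1.0,), (3.0,)]
K = gram_matrix(atoms, gaussian_kernel)
mu = coherence(K)
babel_value = babel(K)
```

For a unit-norm kernel the Babel measure is at most (n - 1) times the coherence, and by Gershgorin's circle theorem a Babel value below 1 keeps every eigenvalue of the Gram matrix strictly positive, which is one way to see the link to linear independence mentioned in the abstract.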

Abstract:
A series of stationary principles is developed for dynamical systems by formulating the concept of mixed convolved action, which is written in terms of mixed variables, using temporal convolutions and fractional derivatives. Dynamical systems with discrete and continuous spatial representations are considered as initial applications. In each case, a single scalar functional provides the governing differential equations, along with all the pertinent initial and boundary conditions, as the Euler-Lagrange equations emanating from the stationarity of this mixed convolved action. Both conservative and non-conservative processes can be considered within a common framework, thus resolving a long-standing limitation of variational approaches for dynamical systems. Several results in fractional calculus are also developed.

Abstract:
Multi-output Gaussian processes have received increasing attention during the last few years as a natural mechanism to extend the powerful flexibility of Gaussian processes to the setup of multiple output variables. The key point here is the ability to design kernel functions that allow exploiting the correlations between the outputs while fulfilling the positive definiteness requirement for the covariance function. Alternatives to construct these covariance functions are the linear model of coregionalization and process convolutions. Each of these methods demands the specification of the number of latent Gaussian processes used to build the covariance function for the outputs. In this paper we propose the use of an Indian Buffet process as a way to perform model selection over the number of latent Gaussian processes. This type of model is particularly important in the context of latent force models, where the latent forces are associated with physical quantities like protein profiles or latent forces in mechanical systems. We use variational inference to estimate posterior distributions over the variables involved, and show examples of the model performance over artificial data, a motion capture dataset, and a gene expression dataset.

Abstract:
In sparse coding it is common to tile an image into nonoverlapping patches, and then use a dictionary to create a sparse representation of each tile independently. In this situation, the overcompleteness of the dictionary is the number of dictionary elements divided by the patch size. In deconvolutional neural networks (DCNs), dictionaries learned on nonoverlapping tiles are replaced by a family of convolution kernels. Hence adjacent points in the feature maps (V1 layers) have receptive fields in the image that are translations of each other. The translational distance is determined by the dimensions of V1 in comparison to the dimensions of the image space. We refer to this translational distance as the stride. We implement a type of DCN using a modified Locally Competitive Algorithm (LCA) to investigate the relationship between the number of kernels, the stride, the receptive field size, and the quality of reconstruction. We find, for example, that for 16x16-pixel receptive fields, using eight kernels and a stride of 2 leads to sparse reconstructions of quality comparable to using 512 kernels and a stride of 16 (the nonoverlapping case). We also find that for a given stride and number of kernels, the patch size does not significantly affect reconstruction quality. Instead, the learned convolution kernels have a natural support radius independent of the patch size.
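
The two configurations compared above can be related by a short calculation. The sketch below assumes that, in the convolutional case, overcompleteness is the number of kernels divided by the square of the stride (one coefficient per kernel for every stride-by-stride block of pixels); this formula is an assumption made for illustration and is not stated in the abstract. The function names are hypothetical.

```python
def tiled_overcompleteness(num_elements, patch_side):
    """Dictionary elements per pixel for nonoverlapping square patches."""
    return num_elements / (patch_side ** 2)

def convolutional_overcompleteness(num_kernels, stride):
    """Feature coefficients per pixel, assuming equal stride in both image axes."""
    return num_kernels / (stride ** 2)

# Nonoverlapping case from the abstract: 512 elements on 16x16 patches.
tiled = tiled_overcompleteness(512, 16)

# Convolutional case from the abstract: eight kernels with a stride of 2.
strided = convolutional_overcompleteness(8, 2)

# A stride equal to the patch side reproduces the nonoverlapping value.
full_stride = convolutional_overcompleteness(512, 16)
```

Under this assumption both configurations have the same overcompleteness of 2, which is consistent with the abstract reporting reconstructions of comparable quality.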

Abstract:
In the $k$-Leaf Out-Branching and $k$-Internal Out-Branching problems we are given a directed graph $D$ with a designated root $r$ and a nonnegative integer $k$. The question is to determine the existence of an out-branching rooted at $r$ that has at least $k$ leaves, or at least $k$ internal vertices, respectively. Both these problems were intensively studied from the points of view of parameterized complexity and kernelization, and in particular for both of them kernels with $O(k^2)$ vertices are known on general graphs. In this work we show that $k$-Leaf Out-Branching admits a kernel with $O(k)$ vertices on $\mathcal{H}$-minor-free graphs, for any fixed family of graphs $\mathcal{H}$, whereas $k$-Internal Out-Branching admits a kernel with $O(k)$ vertices on any graph class of bounded expansion.

Abstract:
A fast multilevel algorithm based on directionally scaled tensor-product Gaussian kernels on structured sparse grids is proposed for interpolation of high-dimensional functions and for the numerical integration of high-dimensional integrals. The algorithm is based on the recent Multilevel Sparse Kernel-based Interpolation (MLSKI) method (Georgoulis, Levesley \& Subhan, \emph{SIAM J. Sci. Comput.}, 35(2), pp.~A815--A831, 2013), with particular focus on the fast implementation of Gaussian-based MLSKI for interpolation and integration problems of high-dimensional functions $f:[0,1]^d\to\mathbb{R}$, with $5\le d\le 10$. The MLSKI interpolation procedure is shown to be interpolatory and a fast implementation is proposed. More specifically, exploiting the tensor-product nature of anisotropic Gaussian kernels, one-dimensional cardinal basis functions on a sequence of hierarchical equidistant nodes are precomputed to machine precision, rendering the interpolation problem into a fully parallelisable ensemble of linear combinations of function evaluations. A numerical integration algorithm is also proposed, based on interpolating the (high-dimensional) integrand. A series of numerical experiments highlights the applicability of the proposed algorithm for interpolation and integration for up to 10-dimensional problems.
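
The tensor-product property that the fast implementation relies on can be checked numerically. The sketch below is not the MLSKI algorithm; it only shows that an anisotropic Gaussian kernel factorizes into one-dimensional Gaussians, one per coordinate. The kernel form exp(-(c_k (x_k - y_k))^2), the shape parameters, and the sample points are assumptions chosen for illustration.

```python
import math

def gaussian_1d(x, y, shape):
    """One-dimensional Gaussian kernel with a direction-specific shape parameter."""
    return math.exp(-((shape * (x - y)) ** 2))

def tensor_product_gaussian(x, y, shapes):
    """Product of one-dimensional Gaussian kernels, one factor per coordinate."""
    value = 1.0
    for xk, yk, ck in zip(x, y, shapes):
        value *= gaussian_1d(xk, yk, ck)
    return value

def anisotropic_gaussian(x, y, shapes):
    """The same kernel evaluated directly from the scaled squared distance."""
    exponent = sum((ck * (xk - yk)) ** 2 for xk, yk, ck in zip(x, y, shapes))
    return math.exp(-exponent)

# Hypothetical five-dimensional points in [0, 1]^5 and per-direction scalings.
x = (0.1, 0.4, 0.5, 0.9, 0.3)
y = (0.2, 0.1, 0.7, 0.6, 0.3)
shapes = (1.0, 2.0, 4.0, 2.0, 1.0)

product_value = tensor_product_gaussian(x, y, shapes)
direct_value = anisotropic_gaussian(x, y, shapes)
```

Because the two evaluations agree, quantities built from the kernel on tensor-product grids can be assembled from one-dimensional pieces, which is the structure the abstract says is exploited when precomputing one-dimensional cardinal basis functions.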

Abstract:
Integer linear programs (ILPs) are a widely applied framework for dealing with combinatorial problems that arise in practice. It is known, e.g., by the success of CPLEX, that preprocessing and simplification can greatly speed up the process of optimizing an ILP. The present work seeks to further the theoretical understanding of preprocessing for ILPs by initiating a rigorous study within the framework of parameterized complexity and kernelization. A famous result of Lenstra (Mathematics of Operations Research, 1983) shows that feasibility of any ILP with n variables and m constraints can be decided in time O(c^{n^3} m^{c'}). Thus, by a folklore argument, any such ILP admits a kernelization to an equivalent instance of size O(c^{n^3}). It is known that, unless NP \subseteq coNP/poly (in which case the polynomial hierarchy collapses), no kernelization with size bound polynomial in n is possible. However, this lower bound only applies to the case when constraints may include an arbitrary number of variables, since it follows from lower bounds for Satisfiability and Hitting Set, whose bounded arity variants admit polynomial kernelizations. We consider the feasibility problem for ILPs Ax <= b where A is an r-row-sparse matrix, parameterized by the number of variables. We show that the kernelizability of this problem depends strongly on the range of the variables. If the range is unbounded then this problem does not admit a polynomial kernelization unless NP \subseteq coNP/poly. If, on the other hand, the range of each variable is polynomially bounded in n then we do get a polynomial kernelization. Additionally, this also holds for the more general case when the maximum range d is an additional parameter, i.e., the size obtained is polynomial in n+d.

Abstract:
We define the convolved h(x)-Fibonacci polynomials as an extension of the classical convolved Fibonacci numbers. Then we give some combinatorial formulas involving the h(x)-Fibonacci and h(x)-Lucas polynomials. Moreover, we obtain the convolved h(x)-Fibonacci polynomials from a family of Hessenberg matrices.

Abstract:
The convolved Fibonacci numbers F_j^{(r)} are defined by (1-z-z^2)^{-r}=\sum_{j\ge 0}F_{j+1}^{(r)}z^j. In this note some related numbers that can be expressed in terms of convolved Fibonacci numbers are considered. These numbers appear in the numerical evaluation of a certain number theoretical constant. This note is a case study of the transform \frac{1}{n}\sum_{d|n}\mu(d)f(z^d)^{n/d} (with f any formal series and \mu the Moebius function), which is studied in a companion paper entitled `The formal series Witt transform'.
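
The defining generating function can be expanded directly. The following minimal sketch computes the first coefficients of (1-z-z^2)^{-r} by dividing the constant series 1 by (1-z-z^2) a total of r times; the function name and its interface are hypothetical, and the "related numbers" studied in the note are not computed here.

```python
def convolved_fibonacci(r, num_terms):
    """Return [F_1^(r), ..., F_num_terms^(r)], the coefficients of z^0, z^1, ...
    in the power series of (1 - z - z^2)^(-r). Requires num_terms >= 1."""
    coeffs = [1] + [0] * (num_terms - 1)  # the series for r = 0 is the constant 1
    for _ in range(r):
        # Divide by (1 - z - z^2): if B * (1 - z - z^2) = A,
        # then b_j = a_j + b_{j-1} + b_{j-2}.
        quotient = []
        for j in range(num_terms):
            term = coeffs[j]
            if j >= 1:
                term += quotient[j - 1]
            if j >= 2:
                term += quotient[j - 2]
            quotient.append(term)
        coeffs = quotient
    return coeffs

fibonacci = convolved_fibonacci(1, 6)    # [1, 1, 2, 3, 5, 8]
self_convolved = convolved_fibonacci(2, 6)  # [1, 2, 5, 10, 20, 38]
```

For r = 1 this recovers the ordinary Fibonacci numbers, and for r = 2 it gives the convolution of the Fibonacci sequence with itself.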