Abstract:
The seemingly stochastic transient dynamics of neocortical circuits observed in vivo have been hypothesized to represent a signature of ongoing stochastic inference. In vitro, on the other hand, neurons exhibit a highly deterministic response to various types of stimulation. We show that an ensemble of deterministic leaky integrate-and-fire neurons embedded in a noisy spiking environment can attain the firing statistics required to sample from a well-defined target distribution. We provide an analytical derivation of the activation function at the single-cell level; for recurrent networks, we examine convergence towards stationarity in computer simulations and demonstrate sample-based Bayesian inference in a mixed graphical model. This establishes a rigorous link between deterministic neuron models and functional stochastic dynamics at the network level.
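As a toy illustration of how a deterministic neuron in a noisy spiking environment acquires a stochastic activation function, consider the following sketch: a leaky integrate-and-fire neuron with a hard threshold is driven by a Poisson background, and its firing rate rises smoothly with a deterministic bias input. All parameters and the function name are illustrative, not taken from the paper.

```python
import random

def lif_rate(bias, t_sim=2000.0, dt=0.1, seed=0):
    """Deterministic leaky integrate-and-fire neuron in a noisy spiking
    environment (Poisson background); parameters are illustrative."""
    rng = random.Random(seed)
    tau_m, v_rest, v_th, v_reset = 20.0, 0.0, 1.0, 0.0
    bg_rate, bg_weight = 0.4, 0.15   # background spikes/ms, PSP amplitude
    v, spikes = v_rest, 0
    for _ in range(int(t_sim / dt)):
        noise = 0.0
        if rng.random() < bg_rate * dt:  # a background spike arrives
            noise = bg_weight if rng.random() < 0.5 else -bg_weight
        v += dt * (bias - (v - v_rest)) / tau_m + noise
        if v >= v_th:                    # deterministic threshold crossing
            v = v_reset
            spikes += 1
    return spikes / t_sim                # firing rate in spikes per ms

# the noise smooths the hard threshold into a graded, sigmoid-like response
rates = [lif_rate(b) for b in (0.0, 1.0, 2.0)]
```

Without the noisy background, the response to a subthreshold bias would be exactly zero; with it, the firing probability varies gradually with the bias, which is the single-cell property the sampling construction relies on.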

Abstract:
Research on the performance, robustness, and evolution of the global Internet is fundamentally handicapped without accurate and thorough knowledge of the nature and structure of the contractual relationships between Autonomous Systems (ASs). In this work we introduce novel heuristics for inferring AS relationships. Our heuristics improve upon previous work in several technical aspects, which we outline in detail and demonstrate with several examples. Seeking to increase the value and reliability of our inference results, we then focus on validation of the inferred AS relationships. We survey the network administrators of ASs to collect information on the actual connectivity and policies of the surveyed ASs. Based on the survey results, we find that our new AS relationship inference techniques achieve high levels of accuracy: we correctly infer 96.5% of customer-to-provider (c2p), 82.8% of peer-to-peer (p2p), and 90.3% of sibling-to-sibling (s2s) relationships. We then cross-compare the reported AS connectivity with the AS connectivity data contained in BGP tables. We find that BGP tables miss up to 86.2% of the true adjacencies of the surveyed ASs. The majority of the missing links are of the p2p type, which highlights the limitations of present measurement techniques in capturing links of this type. Finally, to make our results easily accessible and practically useful for the community, we open an AS relationship repository where we archive, on a weekly basis, and make publicly available the complete Internet AS-level topology annotated with AS relationship information for every pair of AS neighbors.
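To make the flavor of such heuristics concrete, here is a toy degree-based inference in the spirit of Gao's earlier algorithm: in each observed AS path, the highest-degree AS is taken as the "top provider", and links on either side are oriented as customer-to-provider accordingly. This is an illustration only; the function name and path data are invented, and the paper's heuristics are considerably more refined.

```python
from collections import Counter

def infer_relationships(as_paths):
    """Toy Gao-style heuristic: orient each link in a BGP AS path as
    customer-to-provider (c2p), uphill toward the highest-degree AS."""
    degree = Counter()
    for path in as_paths:
        for a, b in zip(path, path[1:]):
            degree[a] += 1
            degree[b] += 1
    rel = {}
    for path in as_paths:
        top = max(range(len(path)), key=lambda i: degree[path[i]])
        for i, (a, b) in enumerate(zip(path, path[1:])):
            if i < top:
                rel[(a, b)] = "c2p"   # a buys transit from b
            else:
                rel[(b, a)] = "c2p"   # b buys transit from a
    return rel

# two observed BGP paths; AS2 has the highest degree, so it sits on top
rel = infer_relationships([("AS1", "AS2", "AS3"), ("AS4", "AS2", "AS5")])
```

Degree-based orientation of this kind cannot distinguish p2p and s2s links, which is one of the technical aspects where more careful heuristics are needed.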

Abstract:
We present a deterministic near-linear time algorithm that computes the edge-connectivity and finds a minimum cut for a simple undirected unweighted graph G with n vertices and m edges. This is the first o(mn)-time deterministic algorithm for the problem. In near-linear time we can also construct the classic cactus representation of all minimum cuts. The previous fastest deterministic algorithm, by Gabow from STOC'91, took ~O(m + k^2 n) time, where k is the edge connectivity, but k could be Omega(n). At STOC'96, Karger presented a randomized near-linear time Monte Carlo algorithm for the minimum cut problem. As he points out, there is no better way of certifying the minimality of the returned cut than to use Gabow's slower deterministic algorithm and compare sizes. Our main technical contribution is a near-linear time algorithm that contracts vertex sets of a simple input graph G with minimum degree d, producing a multigraph with ~O(m/d) edges which preserves all minimum cuts of G with at least two vertices on each side. In our deterministic near-linear time algorithm, we decompose the problem via low-conductance cuts found using PageRank a la Brin and Page (1998), as analyzed by Andersen, Chung, and Lang at FOCS'06. Normally such algorithms for low-conductance cuts are randomized Monte Carlo algorithms, because they rely on guessing a good start vertex. However, in our case, we have so much structure that no guessing is needed.
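The low-conductance-cut subroutine can be sketched as follows: compute a personalized PageRank vector from a seed vertex, order vertices by degree-normalized PageRank, and sweep prefixes of that order for the minimum-conductance cut. This toy version uses dense power iteration rather than the local "push" operations of Andersen, Chung, and Lang, and all names and parameters are illustrative.

```python
def sweep_cut(adj, seed, alpha=0.15, iters=500):
    """Personalized-PageRank sweep for a low-conductance cut (toy version:
    dense power iteration instead of local push; adj maps vertex -> set
    of neighbors)."""
    nodes = sorted(adj)
    p = {v: 1.0 if v == seed else 0.0 for v in nodes}
    for _ in range(iters):
        q = {v: (alpha if v == seed else 0.0) for v in nodes}
        for v in nodes:
            share = (1 - alpha) * p[v] / len(adj[v])
            for u in adj[v]:
                q[u] += share
        p = q
    # sweep: order vertices by degree-normalized PageRank, track conductance
    order = sorted(nodes, key=lambda v: -p[v] / len(adj[v]))
    vol_total = sum(len(adj[v]) for v in nodes)
    best_phi, best_set = None, None
    in_set, vol, cut = set(), 0, 0
    for v in order[:-1]:
        in_set.add(v)
        vol += len(adj[v])
        cut += sum(-1 if u in in_set else 1 for u in adj[v])
        phi = cut / min(vol, vol_total - vol)
        if best_phi is None or phi < best_phi:
            best_phi, best_set = phi, set(in_set)
    return best_phi, best_set

# two triangles joined by a single bridge edge
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
phi, side = sweep_cut(adj, seed=0)
```

On this example the sweep recovers one triangle as the low-conductance side, cutting only the bridge edge. The dependence on a seed vertex is exactly where such routines are normally randomized.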

Abstract:
We treat causal inference as a 'chain' of mathematical conditions that must be satisfied to conclude that the potential mediator is causal for the trait, where the inference is only as good as the weakest link in the chain. P-values are computed for the component conditions, which include tests of linkage and conditional independence. The Intersection-Union Test, in which a series of statistical tests are combined to form an omnibus test, is then employed to generate the overall test result. Using computer-simulated mouse crosses, we show that type I error is low under a variety of conditions that include hidden variables and reactive pathways. We show that power under a simple causal model is comparable to other model selection techniques as well as to Bayesian network reconstruction methods. We further show empirically that this method compares favorably to Bayesian network reconstruction methods for reconstructing transcriptional regulatory networks in yeast, recovering 7 out of 8 experimentally validated regulators.

Here we propose a novel statistical framework in which existing notions of causal mediation are formalized into a hypothesis test, thus providing a standard quantitative measure of uncertainty in the form of a p-value. The method is theoretically and computationally accessible and, with the provided software, may prove a useful tool in disentangling molecular relationships.

It has become increasingly appreciated in recent years that empirical evidence of causal links between genotype and multiple quantitative traits, such as transcript abundances and clinical phenotypes, can provide information on causal relationships between those quantitative traits [1-10]. The conceptual foundation of the inferred causal relationships is built on the idea that random segregation of chromosomes during gametogenesis insulates against confounding in a manner analogous to treatment randomization in a clinical trial [11,12]. Markov properties and conditional correlation hav
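The Intersection-Union Test itself is simple to state in code: since every link of the causal chain must hold under the alternative, the omnibus p-value is the largest of the component p-values. The component values below are invented for illustration.

```python
def iut_pvalue(component_pvalues):
    """Intersection-Union Test: the alternative hypothesis is the
    intersection of the component alternatives (every link of the chain
    must hold), so the omnibus p-value is the largest component p-value --
    the chain is only as strong as its weakest link."""
    return max(component_pvalues)

# e.g. p-values from the linkage and conditional-independence tests
p_overall = iut_pvalue([0.001, 0.03, 0.2])
```

So even if most component tests are highly significant, a single weak link (here 0.2) dominates the overall result.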

Abstract:
Estimation of epidemiological and population parameters from molecular sequence data has become central to the understanding of infectious disease dynamics. Various models have been proposed to infer details of the dynamics that describe epidemic progression, including inference approaches derived from Kingman's coalescent theory. Here, we use recently described coalescent theory for epidemic dynamics to develop stochastic and deterministic coalescent SIR tree priors. We implement these in a Bayesian phylogenetic inference framework to permit joint estimation of SIR epidemic parameters and the sample genealogy. We assess the performance of the two coalescent models and also juxtapose results obtained with BDSIR, a recently published birth-death-sampling model for epidemic inference. Comparisons are made by analyzing sets of genealogies simulated under precisely known epidemiological parameters. Additionally, we analyze influenza A (H1N1) sequence data sampled in the Canterbury region of New Zealand and HIV-1 sequence data obtained from known UK infection clusters. We show that both coalescent SIR models are effective at estimating epidemiological parameters from data with a large basic reproductive number $R_0$ and large initial susceptible population size $S_0$. Furthermore, we find that the stochastic variant generally outperforms its deterministic counterpart in terms of error, bias, and highest posterior density coverage, particularly for smaller $R_0$ and $S_0$. However, each of these inference models is shown to have undesirable properties in certain circumstances, especially for epidemic outbreaks with $R_0$ close to one or with a small effective susceptible population.
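For intuition, the deterministic SIR dynamics underlying one of the two priors can be sketched by forward-Euler integration. The parameters below are illustrative only; this is the kind of trajectory a deterministic coalescent SIR prior is built on, not the paper's inference machinery.

```python
def sir_trajectory(beta, gamma, s0, i0, dt=0.01, steps=10000):
    """Deterministic SIR dynamics by forward-Euler integration."""
    s, i, r = float(s0), float(i0), 0.0
    n = s + i
    traj = [(s, i, r)]
    for _ in range(steps):
        new_inf = beta * s * i / n * dt   # S -> I transitions
        new_rec = gamma * i * dt          # I -> R transitions
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        traj.append((s, i, r))
    return traj

# R0 = beta / gamma = 2: the epidemic takes off and then burns out
traj = sir_trajectory(beta=2.0, gamma=1.0, s0=999, i0=1)
s_end, i_end, r_end = traj[-1]
```

The stochastic variant replaces these deterministic increments with random jump events, which matters most exactly where the abstract reports differences: small $R_0$ and small $S_0$, where demographic noise dominates.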

Abstract:
We study systems in which transport is possible because of their large extension, with open boundaries in certain directions, but in which the particles responsible for transport can disappear by leaving the system in other directions, by chemical reaction, or by adsorption. The connection between the total escape rate, the rate of disappearance, and the diffusion constant is investigated. This leads to the observation that the diffusion coefficient defined via $\langle x^2 \rangle$ is in general different from the one appearing in the effective Fokker-Planck equation. The result makes it possible to generalize the Gaspard-Nicolis formula for deterministic systems to this transient case.
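For reference, the two coefficients being contrasted can be stated in generic form (standard definitions only, not the paper's derivation):

```latex
D_{\langle x^2\rangle} \;=\; \lim_{t\to\infty} \frac{\langle x^2(t)\rangle}{2t},
\qquad\text{vs.}\qquad
\partial_t \rho(x,t) \;=\; D_{\mathrm{FP}}\,\partial_x^2 \rho(x,t) \;-\; \gamma\,\rho(x,t),
```

where $\gamma$ is the disappearance rate; the abstract's claim is that, for such transient systems, $D_{\langle x^2\rangle} \neq D_{\mathrm{FP}}$ in general.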

Abstract:
We show how to construct a near-deterministic CNOT gate using several single-photon sources, linear optics, photon-number-resolving quantum non-demolition detectors, and feed-forward. This gate does not require the massively entangled states common to other implementations and is very efficient in its use of resources, requiring only one ancilla photon. The key elements of this gate are non-demolition detectors that use a weak cross-Kerr nonlinearity to conditionally generate a phase shift on a coherent probe if a photon is present in the signal mode. These potential phase shifts can then be measured using highly efficient homodyne detection.
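The cross-Kerr mechanism behind these detectors can be summarized in standard notation (a textbook sketch, not the paper's full analysis): the interaction Hamiltonian $H = \hbar\chi\, \hat{n}_s \hat{n}_p$ couples the signal photon number to the probe, so after an interaction time $t$ a coherent probe acquires a conditional phase,

```latex
|n\rangle_s\,|\alpha\rangle_p \;\longrightarrow\; |n\rangle_s\,|\alpha\, e^{i n \theta}\rangle_p,
\qquad \theta = \chi t,
```

and homodyne detection of the probe phase reveals whether $n = 0$ or $n = 1$ without destroying the signal photon.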

Abstract:
Distinguishing mutations that determine an organism's phenotype from (near-)neutral ‘hitchhikers’ is a fundamental challenge in genome research, and is relevant for numerous medical and biotechnological applications. For human influenza viruses, recognizing changes in the antigenic phenotype and a strain's capability to evade pre-existing host immunity is important for the production of efficient vaccines. We have developed a method for inferring ‘antigenic trees’ for the major viral surface protein hemagglutinin. In the antigenic tree, antigenic weights are assigned to all tree branches, which allows us to resolve the antigenic impact of the associated amino acid changes. Our technique predicted antigenic distances with accuracy comparable to that of antigenic cartography. Additionally, it identified both known and novel sites, and amino acid changes with antigenic impact, in the evolution of influenza A (H3N2) viruses from 1968 to 2003. The technique can also be applied to the inference of ‘phenotype trees’ and genotype–phenotype relationships from other types of pairwise phenotype distances.
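The idea of assigning weights to branches so that path sums reproduce measured pairwise distances has a simple closed form for a 3-leaf star tree; larger trees require a least-squares fit. This is a toy special case for intuition, not the paper's estimator.

```python
def star_branch_weights(d_ab, d_ac, d_bc):
    """Branch ('antigenic') weights of a 3-leaf star tree chosen so that
    the path sum between any two leaves equals the measured pairwise
    distance: d_ab = w_a + w_b, d_ac = w_a + w_c, d_bc = w_b + w_c."""
    w_a = (d_ab + d_ac - d_bc) / 2
    w_b = (d_ab + d_bc - d_ac) / 2
    w_c = (d_ac + d_bc - d_ab) / 2
    return w_a, w_b, w_c

# pairwise distances 3, 4, 5 decompose into branch weights 1, 2, 3
weights = star_branch_weights(3.0, 4.0, 5.0)
```

A large fitted weight on a branch then points to the amino acid changes on that branch as antigenically important, which is how the tree resolves genotype-phenotype relationships.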

Abstract:
Inferring an appropriate DTD or XML Schema Definition (XSD) for a given collection of XML documents essentially reduces to learning deterministic regular expressions from sets of positive example words. Unfortunately, there is no algorithm capable of learning the complete class of deterministic regular expressions from positive examples only, as we will show. The regular expressions occurring in practical DTDs and XSDs, however, are such that every alphabet symbol occurs only a small number of times. As such, in practice it suffices to learn the subclass of deterministic regular expressions in which each alphabet symbol occurs at most k times, for some small k. We refer to such expressions as k-occurrence regular expressions (k-OREs for short). Motivated by this observation, we provide a probabilistic algorithm that learns k-OREs for increasing values of k, and selects the deterministic one that best describes the sample based on a Minimum Description Length argument. The effectiveness of the method is empirically validated both on real world and synthetic data. Furthermore, the method is shown to be conservative over the simpler classes of expressions considered in previous work.
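A minimal caricature of the MDL-based selection step: among candidate expressions that accept every positive example, prefer the one with the shortest description. Here expression length stands in for the full MDL score, which would also weigh the cost of encoding the sample under each candidate; the real algorithm scores k-OREs far more carefully.

```python
import re

def mdl_choice(candidates, sample):
    """Crude MDL-style model selection over candidate regular expressions:
    keep only candidates that accept every positive example word, then
    prefer the shortest (expression length as a stand-in for description
    length)."""
    accepting = [pat for pat in candidates
                 if all(re.fullmatch(pat, w) for w in sample)]
    return min(accepting, key=len)

sample = ["", "ab", "abab", "ababab"]
best = mdl_choice([r"(ab)*", r"(a|b)*", r"abab"], sample)
```

The trade-off this caricatures is real: `abab` is rejected because it fails on some examples, while `(a|b)*` overgeneralizes and is penalized relative to the tighter, shorter `(ab)*`.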

Abstract:
Reinforcement learning has solid foundations but becomes inefficient in partially observed (non-Markovian) environments. Thus, a learning agent, born with a representation and a policy, might wish to investigate to what extent the Markov property holds. We propose a learning architecture that utilizes combinatorial policy optimization to overcome non-Markovity and to develop efficient behaviors that are easy to inherit, tests the Markov property of the behavioral states, and corrects against non-Markovity by running a deterministic factored Finite State Model, which can be learned. We illustrate the properties of the architecture in the near-deterministic Ms. Pac-Man game. We analyze the architecture from the point of view of evolutionary, individual, and social learning.
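Testing the Markov property of behavioral states can be caricatured with a simple frequency comparison: if next-state statistics conditioned on the (previous, current) pair differ from those conditioned on the current state alone, the states carry hidden history. This is an illustrative statistic, not the paper's actual test.

```python
from collections import defaultdict

def markov_violation_score(states):
    """Crude Markov-property diagnostic: total absolute difference between
    next-state frequencies given the current state alone and given the
    (previous, current) pair; zero for a sequence whose next state depends
    only on the current state."""
    cond1 = defaultdict(lambda: defaultdict(int))
    cond2 = defaultdict(lambda: defaultdict(int))
    for prev, cur, nxt in zip(states, states[1:], states[2:]):
        cond1[cur][nxt] += 1
        cond2[(prev, cur)][nxt] += 1
    score = 0.0
    for (prev, cur), counts in cond2.items():
        tot2 = sum(counts.values())
        tot1 = sum(cond1[cur].values())
        for nxt, c in counts.items():
            score += tot2 * abs(c / tot2 - cond1[cur][nxt] / tot1)
    return score

markov = "AB" * 50    # next state is fully determined by the current one
hidden = "AAB" * 50   # the successor of 'A' depends on what preceded it
```

A real test would turn such discrepancies into a calibrated statistic; the point here is only that Markov violations are detectable from the trajectory itself, which is what lets the architecture decide when to correct with its factored Finite State Model.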