Abstract:
We explore the possibility of applying the framework of frequent pattern mining to a class of continuous objects appearing in nature, namely knots. We introduce the frequent knot mining problem and present a solution. The key observation is that a database consisting of knots can be transformed into a transactional database. This observation is based on the Prime Decomposition Theorem of knots.

Abstract:
We show that there is a query expressible in first-order logic over the reals that returns, on any given semi-algebraic set A, for every point a radius around which A is conical. We obtain this result by combining famous results from calculus and real algebraic geometry, notably Sard's theorem and Thom's first isotopy lemma, with recent algorithmic results by Rannou.

Abstract:
We define three novel statistical models and give an efficient algorithm for haplotype reconstruction, jointly called HaploRec. HaploRec is based on exploiting local regularities conserved in haplotypes: it reconstructs haplotypes so that they have maximal local coherence. This approach – not assuming statistical dependence for remotely located markers – has two useful properties: it is well-suited for sparse marker maps, such as those used in gene mapping, and it can actually take advantage of long maps.Our experimental results with simulated and real data show that HaploRec is a powerful method for the large scale haplotyping needed in association studies. With sample sizes large enough for gene mapping it appeared to be the best compared to all other tested methods (Phase, fastPhase, PL-EM, Snphap, Gerbil; simulated data), with small samples it was competitive with the best available methods (real data). HaploRec is several orders of magnitude faster than Phase and comparable to the other methods; the running times are roughly linear in the number of subjects and the number of markers. HaploRec is publicly available at http://www.cs.helsinki.fi/group/genetics/haplotyping.html webcite.The problem we consider is haplotype reconstruction: given the genotypes of a sample of individuals, the task is to predict the most likely haplotype pair for each individual. Computational haplotype reconstruction methods are based on statistical dependency between closely located markers, known as linkage disequilibrium. Many computational methods have been developed for the reconstruction of haplotypes. Some of these methods do not rely on the statistical modeling of the haplotypes [1-3], but most of them, like our proposed algorithm HaploRec, do [4-10]. For a review of these and other haplotyping methods we refer to [11-13]. Laboratory techniques are being developed for direct molecular haplotyping (see, e.g., [14,15]), but these techniques are not mature yet, and are currently t

Abstract:
We address a fundamental question concerning spatio-temporal database systems: ``What are exactly spatio-temporal queries?'' We define spatio-temporal queries to be computable mappings that are also generic, meaning that the result of a query may only depend to a limited extent on the actual internal representation of the spatio-temporal data. Genericity is defined as invariance under groups of geometric transformations that preserve certain characteristics of spatio-temporal data (e.g., collinearity, distance, velocity, acceleration, ...). These groups depend on the notions that are relevant in particular spatio-temporal database applications. These transformations also have the distinctive property that they respect the monotone and unidirectional nature of time. We investigate different genericity classes with respect to the constraint database model for spatio-temporal databases and we identify sound and complete languages for the first-order and the computable queries in these genericity classes. We distinguish between genericity determined by time-invariant transformations, genericity notions concerning physical quantities and genericity determined by time-dependent transformations.

Abstract:
We describe two efficient on-line algorithms to simplify weighted graphs by eliminating degree-two vertices. Our algorithms are on-line in that they react to updates on the data, keeping the simplification up-to-date. The supported updates are insertions of vertices and edges; hence, our algorithms are partially dynamic. We provide both analytical and empirical evaluations of the efficiency of our approaches. Specifically, we prove an O(log n) upper bound on the amortized time complexity of our maintenance algorithms, with n the number of insertions.

Abstract:
In the context of mining for frequent patterns using the standard levelwise algorithm, the following question arises: given the current level and the current set of frequent patterns, what is the maximal number of candidate patterns that can be generated on the next level? We answer this question by providing a tight upper bound, derived from a combinatorial result from the sixties by Kruskal and Katona. Our result is useful to reduce the number of database scans.

Abstract:
This paper addresses the question whether one can determine the connectivity of a semi-algebraic set in three dimensions by testing the connectivity of a finite number of two-dimensional ``samples'' of the set, where these samples are defined by first-order queries. The question is answered negatively for two classes of first-order queries: cartesian-product-free, and positive one-pass.

The solution of an n-dimensional
stochastic differential equation driven by Gaussian white noises is a Markov vector.
In this way, the transition joint probability density function (JPDF) of this
vector is given by a deterministic parabolic partial differential
equation, the so-called Fokker-Planck-Kolmogorov (FPK) equation. There exist few
exact solutions of this equation so that the analyst must resort to approximate
or numerical procedures. The finite element method (FE) is among the latter,
and is reviewed in this paper. Suitable computer codes are written for the two
fundamental versions of the FE method, the Bubnov-Galerkin and the
Petrov-Galerkin method. In order to reduce the computational effort, which is
to reduce the number of nodal points, the following refinements to the method
are proposed: 1) exponential (Gaussian) weighting functions different from the
shape functions are tested; 2) quadratic and cubic splines are used to
interpolate the nodal values that are known in a limited number of points. In
the applications, the transient state is studied for first order systems only,
while for second order systems, the steady-state JPDF is determined, and it is
compared with exact solutions or with simulative solutions: a very good
agreement is found.

Abstract:
Palmar intertriradial ridge counts (a-b, b-c, c-d, a-d) in 260 males and 260 females of Sardinian origin were considered. Bilateral and sex differences, correlation coefficients, skewness and kurtosis of the four distributions and an index of asymmetry were calculated. There are no sex differences, ridge counts show positive and almost always significant correlations. The a-d ridge count shows a normal distribution in both sexes.

Abstract:
A second order oscillator with
nonlinear restoring force and nonlinear damping is considered: it is subject to
both external and internal (parametric) excitations of Gaussian white noise
type. The nonlinearities are chosen in such a way that the associated Fokker-Planck-Kolmogorov
equation is solvable in the steady state. Different choices of some system
parameters give rise to different and interesting shapes of the joint
probability density function of the response, which in some cases appears to be
multimodal. The problem of the determination of the power spectral density of
the response is also addressed by using the true statistical linearization
method.