
Evolutionary models for insertions and deletions in a probabilistic modeling framework

DOI: 10.1186/1471-2105-6-63



Probabilistic models of substitution events are well established, but there has not been a completely satisfactory theoretical framework for modeling insertion and deletion events.

I have developed a method for extending standard Markov substitution models to include gap characters, and another method for the evolution of state transition probabilities in a probabilistic model. These methods use instantaneous rate matrices in a way that is more general than those used for substitution processes, and are sufficient to provide time-dependent models for standard linear and affine gap penalties, respectively.

Given a probabilistic model, we can make all of its emission probabilities (including gap characters) and all its transition probabilities conditional on a chosen divergence time. To do this, we only need to know the parameters of the model at one particular divergence time instance, as well as the parameters of the model at the two extremes of zero and infinite divergence.

I have implemented these methods in a new generation of the RNA genefinder QRNA (eQRNA).

These methods can be applied to incorporate evolutionary models of insertions and deletions into any hidden Markov model or stochastic context-free grammar, in a pair or profile form, for sequence modeling.

Probabilistic models are widely used for sequence analysis [1]. Hidden Markov models (HMMs) are a very large class of probabilistic models used for many problems in biological sequence analysis such as sequence homology searches [2-4], sequence alignment [5], or protein genefinding [6-8]. Stochastic context-free grammars (SCFGs) are another class of probabilistic models used for structural RNAs for problems such as RNA homology searches [9-13], RNA structure prediction [14,15], and RNA genefinding [16].

Sequence similarity methods based on HMMs or SCFGs can take the form of profile or pair models and are very important for comparative genomics. These probabilistic methods for sequence comparison assume a certai

