A method for the prediction of GPCRs coupling specificity to G-proteins using refined profile Hidden Markov Models
Nikolaos G Sgourakis, Pantelis G Bagos, Panagiotis K Papasaikas, Stavros J Hamodrakas
BMC Bioinformatics, 2005, DOI: 10.1186/1471-2105-6-104
Abstract: Using a dataset of 282 GPCR sequences of known coupling preference to G-proteins and adopting a five-fold cross-validation procedure, the method yielded an 89.7% correct classification rate. In a validation set comprising all receptor sequences that are species homologues to GPCRs with known coupling preferences, excluding the sequences used to train the models, our method yields a correct classification rate of 91.0%. Furthermore, promiscuous coupling properties were correctly predicted for 6 of the 24 GPCRs that are known to interact with more than one subfamily of G-proteins. Our method demonstrates a high correct classification rate. Unlike previously published methods performing the same task, it does not require any transmembrane topology prediction in a preceding step. A web server for the prediction of GPCR coupling specificity to G-proteins, available to non-commercial users, is located at http://bioinformatics.biol.uoa.gr/PRED-COUPLE. G-protein coupled receptors are important receivers of information input to eukaryotic cells. They share a common fold of seven transmembrane helices arranged as a seven α-helix bundle, as confirmed by analysis of the crystal structure of Rhodopsin [1] that has been extensively used as a template for homology-based modeling of GPCRs [2-4]. A collection of messages of extreme diversity including photons and native agonists, such as ions, odorants and pheromones, amino acids, nucleotides, peptides, biogenic amines, prostaglandins and glycoprotein hormones [5] interact with different extracellular and/or transmembrane domains of GPCRs, in order to convey their messages to the interior of the cell [2,6]. Based primarily on shared sequence motifs, six distinct families of GPCRs are traditionally defined: A, B, C, D, E and the frizzled/smoothened family, as summarized in the GPCRDB classification scheme [7].
Various methods have been deployed for higher-level classification of GPCRs including profile Hidden Markov Models [8,9]
Algorithms for incorporating prior topological information in HMMs: application to transmembrane proteins
Pantelis G Bagos, Theodore D Liakopoulos, Stavros J Hamodrakas
BMC Bioinformatics, 2006, DOI: 10.1186/1471-2105-7-189
Abstract: We present here a simple method that allows incorporation of prior topological information concerning the sequences at hand, while at the same time the HMMs retain their full probabilistic interpretation in terms of conditional probabilities. We present modifications to the standard Forward and Backward algorithms of HMMs and we also show explicitly how reliable predictions may arise from these modifications, using all the algorithms currently available for decoding HMMs. A similar procedure may be used in the training procedure, aiming at optimizing the labels of the HMM's classes, especially in cases such as transmembrane proteins where the labels of the membrane-spanning segments are inherently misplaced. We present an application of this approach, developing a method to predict the transmembrane regions of alpha-helical membrane proteins, trained on crystallographically solved data. We show that this method compares well against already established algorithms presented in the literature, and it is extremely useful in practical applications. The algorithms presented here are easily implemented in any kind of Hidden Markov Model, whereas the prediction method (HMM-TM) is freely available for academic users at http://bioinformatics.biol.uoa.gr/HMM-TM, offering the most advanced decoding options currently available. Hidden Markov Models (HMMs) are probabilistic models [1], commonly used in recent years for applications in bioinformatics [2]. These tasks include gene finding [3], multiple alignments [4] and database searches [5], prediction of signal peptides [6,7], prediction of protein secondary structure [8], prediction of transmembrane protein topology [9,10], as well as joint prediction of transmembrane helices and signal peptides [11].
Especially in the case of transmembrane proteins, HMMs have been found to perform significantly better compared to other sophisticated Machine-Learning techniques such as Neural Networks (NNs) or Support Vector Mach
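The abstract above describes modifying the Forward algorithm so that known labels at some sequence positions constrain the HMM without breaking its probabilistic interpretation. The following is a minimal sketch of that general idea, not the authors' implementation: the function name `forward`, the two-state toy model, and the label names "i" and "M" are all hypothetical, and no scaling against numerical underflow is applied, so it is only suitable for short sequences.

```python
import numpy as np

def forward(obs, start, trans, emit, state_labels, prior_labels=None):
    """Likelihood of obs under an HMM, optionally restricted by known labels.

    prior_labels[t] is a label string, or None where position t is unlabeled.
    States whose label disagrees with a known label get zero weight there.
    """
    state_labels = np.asarray(state_labels)
    n_states = len(start)

    def mask(t):
        # 1.0 for states compatible with the prior label at position t.
        if prior_labels is None or prior_labels[t] is None:
            return np.ones(n_states)
        return (state_labels == prior_labels[t]).astype(float)

    alpha = start * emit[:, obs[0]] * mask(0)
    for t in range(1, len(obs)):
        alpha = (alpha @ trans) * emit[:, obs[t]] * mask(t)
    return alpha.sum()

# Hypothetical toy model: state 0 is labeled "i" (inside), state 1 "M" (membrane).
labels = ["i", "M"]
start = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
emit = np.array([[0.8, 0.2],   # emission probabilities of symbols 0 and 1 in state "i"
                 [0.3, 0.7]])  # emission probabilities of symbols 0 and 1 in state "M"
obs = [0, 1, 1]

total = forward(obs, start, trans, emit, labels)
p_inside = forward(obs, start, trans, emit, labels, [None, "i", None])
p_membrane = forward(obs, start, trans, emit, labels, [None, "M", None])

# Joint probability of the sequence and the label, divided by the sequence
# likelihood, gives the conditional probability of the label at position 1.
posterior_membrane = p_membrane / total
```

Because the masks for the two labels partition the states, the two constrained likelihoods add up to the unconstrained one, which is what keeps the conditional-probability reading intact in this sketch.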
IgTM: An algorithm to predict transmembrane domains and topology in proteins
Piedachu Peris, Damián López, Marcelino Campos
BMC Bioinformatics, 2008, DOI: 10.1186/1471-2105-9-367
Abstract: We obtained values close to 80% in both specificity and sensitivity. Six datasets have been used for the experiments, considering different encodings for the input sequences. An encoding that includes the topology changes in the sequence (from inside and outside the membrane to it and vice versa) allowed us to obtain the best results. This software is publicly available at: http://www.dsic.upv.es/users/tlcc/bio/bio.html. We compared our results with other well-known methods, which obtain a slightly better precision. However, this work shows that it is possible to apply Grammatical Inference techniques in an effective way to bioinformatics problems. Membrane proteins are involved in a variety of important biological functions [1,2] where they play the role of receptors or transporters. The number of transmembrane segments of a protein and some characteristics such as loop lengths can identify features of the proteins, as well as their role [3]. Therefore, it is very important to predict the location of transmembrane domains along the sequence, since these are the basic structural building blocks defining the protein topology. Several works have dealt with this prediction task from different approaches, mainly using Hidden Markov Models (HMM) [4-6], neural networks [7,8] or statistical analysis [9]. A rich literature is available on protein prediction. For reviews on different methods for predicting transmembrane domains in proteins, we refer the reader to [10-12]. This work addresses the problem of protein transmembrane domain prediction by making use of a Grammatical Inference (GI) based approach. GI is a particular case of Inductive Inference, an iterative process that takes into account a set of facts and tries to obtain a model consistent with the available data.
In GI the model resulting from the induction process is a formal grammar (that generates a formal language) inferred from a set of sample strings, composed by a set M+ of strings belonging to a targ
Identification of Plasmodium vivax Proteins with Potential Role in Invasion Using Sequence Redundancy Reduction and Profile Hidden Markov Models
Daniel Restrepo-Montoya, David Becerra, Juan G. Carvajal-Patiño, Alvaro Mongui, Luis F. Niño, Manuel E. Patarroyo, Manuel A. Patarroyo
PLOS ONE, 2012, DOI: 10.1371/journal.pone.0025189
Abstract: This study describes a bioinformatics approach designed to identify Plasmodium vivax proteins potentially involved in reticulocyte invasion. Specifically, different protein training sets were built and tuned based on different biological parameters, such as experimental evidence of secretion and/or involvement in invasion-related processes. A profile-based sequence method supported by hidden Markov models (HMMs) was then used to build classifiers to search for biologically-related proteins. The transcriptional profile of the P. vivax intra-erythrocyte developmental cycle was then screened using these classifiers.
RSpred, a set of Hidden Markov Models to detect and classify the RIFIN and STEVOR proteins of Plasmodium falciparum
Nicolas Joannin, Yvonne Kallberg, Mats Wahlgren, Bengt Persson
BMC Genomics, 2011, DOI: 10.1186/1471-2164-12-119
Abstract: We have manually curated the rif and stevor gene repertoires of two Plasmodium falciparum genomes, isolates DD2 and HB3. We have identified 25% of mis-annotated and ~30 missing rif and stevor genes. Using these data sets, as well as sequences from the well curated reference genome (isolate 3D7) and field isolate data from UniProt, we have developed a tool named RSpred. The tool, based on a set of hidden Markov models and an evaluation program, automatically identifies STEVOR and RIFIN sequences as well as the sub-groups: A-RIFIN, B-RIFIN, B1-RIFIN and B2-RIFIN. In addition to these groups, we distinguish a small subset of STEVOR proteins that we named STEVOR-like, as they either differ remarkably from typical STEVOR proteins or are too fragmented to reach a high enough score. When compared to Pfam and TIGRFAMs, RSpred proves to be a more robust and more sensitive method. We have applied RSpred to the proteomes of several P. falciparum strains, P. reichenowi, P. vivax, P. knowlesi and the rodent malaria species. All groups were found in the P. falciparum strains, and also in the P. reichenowi parasite, whereas none were predicted in the other species. We have generated a tool for the sorting of RIFIN and STEVOR proteins, large antigenic variant protein groups, into homogeneous sub-families. Assigning functions to such protein families requires their subdivision into meaningful groups such as we have shown for the RIFIN protein family. RSpred removes the need for complicated and time-consuming phylogenetic analysis methods. It will benefit both research groups sequencing whole genomes as well as others working with field isolates. RSpred is freely accessible via http://www.ifm.liu.se/bioinfo/. Many pathogens have evolved strategies to survive within the hosts they infect.
One strategy consists of varying the antigens the pathogen exposes to its host immune system, usually resulting in the proliferation of multicopy protein families, commonly named Variable Surfa
Efficient algorithms for training the parameters of hidden Markov models using stochastic expectation maximization (EM) training and Viterbi training
Tin Y Lam, Irmtraud M Meyer
Algorithms for Molecular Biology, 2010, DOI: 10.1186/1748-7188-5-38
Abstract: We introduce two computationally efficient training algorithms, one for Viterbi training and one for stochastic expectation maximization (EM) training, which render the memory requirements independent of the sequence length. Unlike the existing algorithms for Viterbi and stochastic EM training which require a two-step procedure, our two new algorithms require only one step and scan the input sequence in only one direction. We also implement these two new algorithms and the already published linear-memory algorithm for EM training into the hidden Markov model compiler HMM-CONVERTER and examine their respective practical merits for three small example models. Bioinformatics applications employing hidden Markov models can use the two algorithms in order to make Viterbi training and stochastic EM training more computationally efficient. Using these algorithms, parameter training can thus be attempted for more complex models and longer training sequences. The two new algorithms have the added advantage of being easier to implement than the corresponding default algorithms for Viterbi training and stochastic EM training. Hidden Markov models (HMMs) and their variants are widely used for analyzing biological sequence data. Bioinformatics applications range from methods for comparative gene prediction (e.g. [1,2]) to methods for modeling promoter grammars (e.g. [3]), identifying protein domains (e.g. [4]), predicting protein interfaces (e.g. [5]), the topology of transmembrane proteins (e.g. [6]) and residue-residue contacts in protein structures (e.g. [7]), querying pathways in protein interaction networks (e.g. [8]), predicting the occupancy of transcription factors (e.g. [9]) as well as inference models for genome-wide association studies (e.g. [10]) and disease association tests for inferring ancestral haplotypes (e.g. [11]). Most of these bioinformatics applications have been set up for a specific type of analysis and a specific biological data set, at least initially. Th
Hidden Semi Markov Models for Multiple Observation Sequences: The mhsmm Package for R
Jared O'Connell, Søren Højsgaard
Journal of Statistical Software, 2011
Abstract: This paper describes the R package mhsmm which implements estimation and prediction methods for hidden Markov and semi-Markov models for multiple observation sequences. Such techniques are of interest when observed data is thought to be dependent on some unobserved (or hidden) state. Hidden Markov models only allow a geometrically distributed sojourn time in a given state, while hidden semi-Markov models extend this by allowing an arbitrary sojourn distribution. We demonstrate the software with simulation examples and an application involving the modelling of the ovarian cycle of dairy cows.
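The contrast between geometric and arbitrary sojourn times can be written out explicitly. This is a standard result rather than notation taken from the paper: the symbols $a_{ii}$ (self-transition probability of state $i$), $D_i$ (number of consecutive time steps spent in state $i$) and $p_i(d)$ (sojourn distribution) are assumed here for illustration.

```latex
% Hidden Markov model: the sojourn time in state i is geometric,
% because the chain stays with probability a_{ii} at every step.
P(D_i = d) = a_{ii}^{\,d-1}\,(1 - a_{ii}), \qquad d = 1, 2, \dots
\qquad\Longrightarrow\qquad
\mathbb{E}[D_i] = \frac{1}{1 - a_{ii}}

% Hidden semi-Markov model: the sojourn time follows an arbitrary
% distribution p_i chosen by the modeller.
P(D_i = d) = p_i(d), \qquad \sum_{d \ge 1} p_i(d) = 1
```

The geometric form always puts its highest probability on $d = 1$, which is why a semi-Markov model is the natural choice when a state typically lasts for some characteristic duration.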
Transmembrane Topology and Signal Peptide Prediction Using Dynamic Bayesian Networks
Sheila M. Reynolds, Lukas Käll, Michael E. Riffle, Jeff A. Bilmes, William Stafford Noble
PLOS Computational Biology, 2008, DOI: 10.1371/journal.pcbi.1000213
Abstract: Hidden Markov models (HMMs) have been successfully applied to the tasks of transmembrane protein topology prediction and signal peptide prediction. In this paper we expand upon this work by making use of the more powerful class of dynamic Bayesian networks (DBNs). Our model, Philius, is inspired by a previously published HMM, Phobius, and combines a signal peptide submodel with a transmembrane submodel. We introduce a two-stage DBN decoder that combines the power of posterior decoding with the grammar constraints of Viterbi-style decoding. Philius also provides protein type, segment, and topology confidence metrics to aid in the interpretation of the predictions. We report a relative improvement of 13% over Phobius in full-topology prediction accuracy on transmembrane proteins, and a sensitivity and specificity of 0.96 in detecting signal peptides. We also show that our confidence metrics correlate well with the observed precision. In addition, we have made predictions on all 6.3 million proteins in the Yeast Resource Center (YRC) database. This large-scale study provides an overall picture of the relative numbers of proteins that include a signal peptide and/or one or more transmembrane segments as well as a valuable resource for the scientific community. All DBNs are implemented using the Graphical Models Toolkit. Source code for the models described here is available at http://noble.gs.washington.edu/proj/philius. A Philius Web server is available at http://www.yeastrc.org/philius, and the predictions on the YRC database are available at http://www.yeastrc.org/pdr.
Logical Hidden Markov Models
L. De Raedt, K. Kersting, T. Raiko
Computer Science, 2011, DOI: 10.1613/jair.1675
Abstract: Logical hidden Markov models (LOHMMs) upgrade traditional hidden Markov models to deal with sequences of structured symbols in the form of logical atoms, rather than flat characters. This note formally introduces LOHMMs and presents solutions to the three central inference problems for LOHMMs: evaluation, most likely hidden state sequence and parameter estimation. The resulting representation and algorithms are experimentally evaluated on problems from the domain of bioinformatics.
Prediction of State of Wireless Network Using Markov and Hidden Markov Model
MD. Osman Gani, Hasan Sarwar, Chowdhury Mofizur Rahman
Journal of Networks, 2009, DOI: 10.4304/jnw.4.10.976-984
Abstract: Optimal resource allocation and higher quality of service are much-needed requirements in wireless networks. To improve these factors, intelligent prediction of network behavior plays a very important role. Markov Models (MM) and Hidden Markov Models (HMM) are proven prediction techniques used in many fields. In this paper, we have used Markov and Hidden Markov prediction tools to predict the number of wireless devices that are connected to a specific Access Point (AP) at a specific instant of time. Prediction has been performed in two stages. In the first stage, we have found the state sequence of wireless access points in a wireless network by observing the traffic load sequence over time. It is found that a particular choice of data may lead to 91% accuracy in predicting the real scenario. In the second stage, we have used a Markov Model to find the future state sequence that follows the sequence found in the first stage. The prediction of the next state of an AP performed by the Markov tool shows 88.71% accuracy. It is found that the Markov Model can predict with an accuracy of 95.55% if the initial transition matrix is calculated directly. We have also shown that the O(1) Markov Model gives slightly better accuracy than the O(2) MM when predicting the far future.
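The second stage described above, predicting the next state from a transition matrix, can be illustrated with a small sketch. The abstract does not say how the matrix is computed, so counting observed transitions is an assumption here, as are the function names, the three-state encoding of access-point load, and the uniform fallback for states never seen as a source.

```python
import numpy as np

def estimate_transitions(states, n_states):
    """First-order transition matrix estimated by counting observed transitions."""
    counts = np.zeros((n_states, n_states))
    for current, nxt in zip(states[:-1], states[1:]):
        counts[current, nxt] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Avoid division by zero for states that never appear as a source.
    probs = counts / np.where(row_sums > 0, row_sums, 1.0)
    # Assumed fallback: such states get a uniform distribution over next states.
    probs[row_sums[:, 0] == 0] = 1.0 / n_states
    return probs

def predict_next(probs, current):
    """Most probable next state given the current state."""
    return int(np.argmax(probs[current]))

# Hypothetical state sequence: 0 = low, 1 = medium, 2 = high load on one AP.
history = [0, 1, 1, 2, 1, 1, 0, 1]
P = estimate_transitions(history, n_states=3)
next_state = predict_next(P, history[-1])
```

A second-order model, the O(2) case in the abstract, would condition on the previous two states instead of one, which needs more data to estimate each row reliably.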