
Analysis of an optimal hidden Markov model for secondary structure prediction

DOI: 10.1186/1472-6807-6-25



Our HMM is designed without prior knowledge. It is chosen from a collection of models of increasing size, using statistical and accuracy criteria. The resulting model has 36 hidden states: 15 that model α-helices, 12 that model coil and 9 that model β-strands. Connections between hidden states and state emission probabilities reflect the organization of protein structures into secondary structure segments. We start by analyzing the model features and show how the model offers a new view of local structures. We then use it for secondary structure prediction. Our model appears to be very efficient on single sequences, with a Q3 score of 68.8%, more than one point above PSIPRED prediction on single sequences. A straightforward extension of the method allows the use of multiple sequence alignments, raising the Q3 score to 75.5%.

The hidden Markov model presented here achieves valuable prediction results using only a limited number of parameters. It provides an interpretable framework for protein secondary structure architecture. Furthermore, it can be used as a tool for generating protein sequences with a given secondary structure content.

Predicting the secondary structure of a protein is often a first step toward predicting its 3D structure. In comparative modeling, secondary structure prediction is used to refine sequence alignments, or to improve the detection of distant homologs [1]. Moreover, it is of prime importance when prediction is made without a template [2]. For all these reasons, protein secondary structure prediction has remained an active field for years. Virtually all statistical and learning methods have been applied to this task. Nowadays, the best methods achieve a prediction rate of about 80% using homologous sequence information. A survey of the Eva on-line evaluation [3] shows that the top performing methods include several approaches based on neural networks, e.g. PSIPRED by Jones et al. [4], PROFsec and PHDpsi by Rost et al. [5]. Recentl
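
The Q3 scores quoted above are per-residue accuracies over three structural classes. As a minimal sketch, assuming the usual definition (the percentage of residues whose predicted class matches the observed one) and a hypothetical one-letter encoding `H` (helix), `E` (strand), `C` (coil), the score could be computed as follows. The function name `q3_score` and the example strings are illustrative and are not taken from the paper.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the predicted three-state
    class (assumed encoding: H, E, C) equals the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the two annotations agree.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy example (made-up strings, not data from the paper).
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEECCC"
print(f"Q3 = {q3_score(predicted, observed):.1f}%")
```

In this toy case two of the sixteen positions disagree, so the function returns 87.5. In practice the score is usually aggregated over a whole test set of proteins rather than computed for a single chain.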

