
PLOS ONE  2013 

Protein Structural Model Selection by Combining Consensus and Single Scoring Methods

DOI: 10.1371/journal.pone.0074006



Quality assessment (QA) of predicted protein structural models is an important and challenging problem in protein structure prediction. Consensus Global Distance Test (CGDT) methods assess each decoy (predicted structural model) by its structural similarity to all other decoys in a decoy set, and have been shown to work well when good decoys form a majority cluster. Single-model scoring functions instead evaluate each decoy on its own structural properties. Both approaches have merits and limitations. In this paper, we present a novel method called PWCom, which uses two neural networks in sequence to combine CGDT with single-model scoring methods such as RW, DDFire and OPUS-Ca. Specifically, for every pair of decoys, the difference between their feature vectors is input to the first neural network, which predicts whether the two decoys differ significantly in their GDT scores to the native structure. If so, the second neural network decides which of the two is closer to the native structure. The quality score of each decoy in the pool is based on the number of pairwise comparisons it wins. Tests on three benchmark datasets generated by different model generation methods showed that PWCom significantly improves over consensus GDT and single scoring methods. The QA server (MUFOLD-Server) applying this method in the CASP 10 QA category ranked second in terms of Pearson and Spearman correlation performance.
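The two ideas in the abstract, consensus scoring and pairwise win counting, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the callables `is_different` and `first_is_better` (stand-ins for the two trained neural networks), the contents of the feature vectors, and the choice to award no win when a pair is judged not significantly different are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, List, Sequence

Vector = Sequence[float]


def consensus_scores(similarity: Sequence[Sequence[float]]) -> List[float]:
    """Consensus score of each decoy: mean similarity to all other decoys.

    `similarity[i][j]` is assumed to hold a pairwise similarity
    (e.g. a GDT-like score) between decoys i and j.
    """
    n = len(similarity)
    if n < 2:
        return [0.0] * n
    return [
        sum(similarity[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def pairwise_win_scores(
    features: Sequence[Vector],
    is_different: Callable[[List[float]], bool],
    first_is_better: Callable[[List[float]], bool],
) -> List[int]:
    """Count, for each decoy, how many pairwise comparisons it wins.

    `is_different` plays the role of the first network (are the two decoys
    significantly different in quality?) and `first_is_better` the role of
    the second (which one is closer to the native?). Both are hypothetical
    callables that take the feature-vector difference of a pair.
    """
    wins = [0] * len(features)
    for i, j in combinations(range(len(features)), 2):
        diff = [a - b for a, b in zip(features[i], features[j])]
        if not is_different(diff):
            continue  # assumption: no win is awarded for a near-tie
        if first_is_better(diff):
            wins[i] += 1
        else:
            wins[j] += 1
    return wins


# Toy usage with hand-written rules in place of trained networks.
toy_features = [[0.80], [0.50], [0.78], [0.30]]
toy_wins = pairwise_win_scores(
    toy_features,
    is_different=lambda d: abs(d[0]) > 0.05,
    first_is_better=lambda d: d[0] > 0,
)
toy_consensus = consensus_scores(
    [[1.0, 0.8, 0.2], [0.8, 1.0, 0.4], [0.2, 0.4, 1.0]]
)
```

In this toy run, decoys 0 and 2 are treated as a near-tie and neither gains a win from that pair, so the ranking among clearly different decoys is driven only by comparisons the first stage considers meaningful.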



