
Consensus Decision for Protein Structure Classification

DOI: 10.4236/jilsa.2012.43022, PP. 216-222

Keywords: Bioinformatics Databases, Classification, Mining Methods and Algorithms, Similarity Measures



The fundamental aim of protein classification is to recognize the family of a given protein and thereby determine its biological function. In the literature, the most common approaches rely on sequence or structure similarity comparisons, while other methods use evolutionary distances between proteins. To increase classification performance, this work proposes a novel method, named Consensus, which combines the decisions of several sequence and structure comparison tools to classify a given structure. Consensus additionally uses the evolutionary information of the compared structures. The method is tested on three databases and evaluated according to several criteria. The evaluation shows that Consensus outperforms each of the individual classifiers used separately and achieves higher classification performance than ProtClass, an alignment-free method.
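
The abstract does not state how the individual tool decisions are combined, so the following is only a minimal sketch of one common way to form a consensus: a (optionally weighted) plurality vote over the family labels proposed by each tool. The function name `consensus_label`, the tool names, the family labels, and the `weights` argument are all hypothetical and are not taken from the paper; in particular, the weights are merely a placeholder for any extra evidence, such as evolutionary information, and do not represent the authors' scheme.

```python
from collections import Counter


def consensus_label(predictions, weights=None):
    """Return the family label with the largest (weighted) vote.

    predictions: dict mapping a tool name to the family label it predicts.
    weights: optional dict mapping a tool name to a non-negative weight;
             tools without an entry get weight 1.0 (hypothetical scheme).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")

    votes = Counter()
    for tool, label in predictions.items():
        votes[label] += 1.0 if weights is None else weights.get(tool, 1.0)

    # Break ties deterministically by taking the alphabetically first label.
    best = max(votes.values())
    return min(label for label, score in votes.items() if score == best)


# Hypothetical tools and labels, for illustration only.
example = {
    "sequence_tool_a": "lipase",
    "structure_tool_b": "lipase",
    "structure_tool_c": "esterase",
}
print(consensus_label(example))
```

With equal weights the two tools that agree decide the outcome; raising the weight of a single dissenting tool above the combined weight of the others would flip the decision, which is one simple way additional evidence could be folded into such a vote.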



