Improved Protein Phosphorylation Site Prediction by a New Combination of Feature Set and Feature Selection

DOI: 10.4236/jbise.2018.116013, pp. 144-157

Keywords: Protein Phosphorylation, Phosphorylation Site Prediction, Sequence Feature, Feature Selection with Grid Search


Abstract:

Protein phosphorylation is an important post-translational modification that activates various enzymes and receptors involved in signaling pathways. Because identifying phosphorylation sites experimentally is laborious and costly, computational prediction of these sites has been actively studied. In this study, we adopt a new set of features and apply feature selection by Random Forest with grid search before training a Support Vector Machine. The resulting method achieved better or comparable phosphorylation site prediction performance on two different data sets.
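The pipeline named in the abstract (Random Forest feature selection, grid search, then a Support Vector Machine) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The abstract does not say which parameters were searched, so the grid here (number of retained features plus SVM hyperparameters) is an assumption, and the synthetic data only stands in for sequence-derived feature vectors.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for per-site feature vectors (assumption: not the paper's data).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    # Rank features by Random Forest importance and keep the top max_features.
    # threshold=-np.inf disables the importance cutoff so only the count matters.
    ("select", SelectFromModel(
        RandomForestClassifier(n_estimators=50, random_state=0),
        threshold=-np.inf,
        max_features=10,
    )),
    ("svm", SVC(kernel="rbf")),
])

# Hypothetical grid: the values below are illustrative, not taken from the paper.
param_grid = {
    "select__max_features": [5, 10, 20],
    "svm__C": [0.1, 1, 10],
    "svm__gamma": ["scale", 0.01],
}

# Selection is refit inside each cross-validation fold, which avoids leaking
# held-out labels into the feature ranking.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="roc_auc")
search.fit(X_train, y_train)

selector = search.best_estimator_.named_steps["select"]
n_selected = int(selector.get_support().sum())
test_auc = search.score(X_test, y_test)  # ROC AUC on the held-out split

print("best parameters:", search.best_params_)
print("features kept:", n_selected)
print("held-out ROC AUC:", round(test_auc, 3))
```

Placing the selector inside the pipeline means the number of retained features is tuned jointly with the SVM settings. Whether the original study tuned them jointly or in separate stages is not stated in the abstract.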

