Protein Secondary Structure Prediction Based on Long Short-Term Memory Recurrent Network and Radical Group Features

DOI: 10.12677/HJCB.2020.104007, PP. 57-68

Keywords: Protein, Protein Secondary Structure Prediction, Recurrent Network, Radical Group, Structure Prediction

Abstract:

Protein secondary structure prediction is an important topic in protein structure research. With the development of machine learning and deep learning, a wide variety of prediction models have been proposed. This study uses a bidirectional long short-term memory (LSTM) recurrent network, removes the sliding-window restriction, and takes full account of long-range interactions between amino acids and of the mutual influence between the preceding and following context in the amino acid sequence. The input features of the network are redesigned: 42 radical group features are added to the position-specific scoring matrix (PSSM), and the model is trained on a large data set. On the public test sets CASP9, CASP10, CASP11 and CASP12, the Q3 accuracy reaches 85.74%, 86.83%, 84.73% and 83.79%, respectively. The results suggest that protein secondary structure prediction can be advanced further through new feature design, the modeling of long-range amino acid interactions, and the use of large data sets.
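
The Q3 accuracy reported in the abstract is the percentage of residues whose three-state label (helix, strand, coil) is predicted correctly. The sketch below illustrates that metric in plain Python. It is not the authors' evaluation code: the function names are hypothetical, and the reduction from eight DSSP states to three (H, G, I to helix; E, B to strand; everything else to coil) is one common convention that the abstract does not confirm.

```python
# Assumed 8-state to 3-state reduction (a common convention, not taken
# from the paper): H, G, I -> helix (H); E, B -> strand (E); others -> coil (C).
DSSP_TO_Q3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def to_three_state(dssp: str) -> str:
    """Map a string of DSSP state codes to the three classes H, E, C."""
    return "".join(DSSP_TO_Q3.get(state, "C") for state in dssp)


def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose three-state labels agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Toy ten-residue example; the labels are invented for illustration.
    observed = to_three_state("HHGGTTEEBS")
    predicted = "HHHHCCEECC"
    print(observed)
    print(q3_accuracy(predicted, observed))
```

In the toy example, nine of the ten residues agree, so the function returns 90.0. Scores on a test set such as CASP9 would be obtained the same way, either per protein or pooled over all residues; the abstract does not say which averaging the authors used.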
