Prediction of DNA 6mA Methylation Sites Based on Bidirectional Long Short-Term Memory Network and Convolutional Neural Network

DOI: 10.12677/hjcb.2024.143003, PP. 29-38

Keywords: DNA 6mA Sites, Bidirectional Long Short-Term Memory Network, Convolutional Neural Network, Feature Encoding


Abstract:

DNA N6-methyladenine (6mA) is an important epigenetic modification involved in biological processes such as gene regulation, DNA replication, and DNA repair, and it is also significant for disease research. Accurately identifying DNA 6mA sites is therefore crucial for understanding their functions and mechanisms. Although existing DNA 6mA site prediction methods have achieved considerable success, there is still room for improvement in prediction accuracy and cross-species generalization. In this study, we propose a hybrid deep learning model (BiLSTMCNN) that combines a bidirectional long short-term memory network (BiLSTM) with a convolutional neural network (CNN) to improve DNA 6mA site prediction. The model first encodes DNA sequences using three schemes: one-hot, EIIP, and DNA dimer encoding. It is then optimized across different network architectures, numbers of layers, and optimizers. Extensive experiments on datasets from Rosaceae, rice, and Arabidopsis thaliana show that BiLSTMCNN achieves an accuracy (ACC) of 94.5% on Rosaceae, 93.8% on rice, and 86.6% on Arabidopsis thaliana. Compared with other methods, BiLSTMCNN performs well in 6mA site prediction for all three plant species and shows excellent cross-species generalization.
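
The three sequence encodings named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the EIIP values are the commonly cited per-nucleotide constants, and the dimer encoding shown here (an integer index over the 16 dinucleotides) is only an assumed form, since the abstract does not specify the exact scheme. Input sequences are assumed to contain only A, C, G, and T.

```python
from itertools import product

BASES = "ACGT"

# Commonly cited EIIP (electron-ion interaction pseudopotential) values.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

# Assumed dimer vocabulary: all 16 dinucleotides, ordered AA, AC, ..., TT.
DIMERS = ["".join(pair) for pair in product(BASES, repeat=2)]
DIMER_INDEX = {dimer: i for i, dimer in enumerate(DIMERS)}


def one_hot(seq):
    """Map each base to a 4-dimensional indicator vector in A, C, G, T order."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def eiip(seq):
    """Map each base to its EIIP value."""
    return [EIIP[base] for base in seq.upper()]


def dimer_indices(seq):
    """Map each overlapping dinucleotide to its index in DIMERS (assumed scheme)."""
    seq = seq.upper()
    return [DIMER_INDEX[seq[i:i + 2]] for i in range(len(seq) - 1)]


if __name__ == "__main__":
    example = "GATCA"  # toy sequence, not taken from the paper's datasets
    print(one_hot(example))
    print(eiip(example))
    print(dimer_indices(example))
```

A sequence of length n yields n one-hot vectors, n EIIP values, and n - 1 dimer indices; how these are combined before the BiLSTM and CNN layers is not stated in the abstract.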

