Gene Expression Model for the Disease Prediction with Auto-Encoder Model with Classifiers

DOI: 10.4236/jbm.2025.133013, PP. 155-182

Keywords: Auto-Encoder, Stacked Voting, Classification, Cancer Diagnosis, Gene Expression, Prediction

Abstract:

Gene expression is the process through which genetic information in DNA is converted into functional products, primarily proteins. It involves two main steps: transcription, in which DNA is copied into messenger RNA (mRNA), and translation, in which ribosomes decode mRNA to synthesize proteins. Gene expression is tightly regulated to ensure proper cellular function, and its analysis is vital in fields such as cancer research, drug development, and genetic engineering. This paper therefore proposes a Voting-based Stacked Denoising Auto-encoder (VSDA) for disease prediction. The VSDA model uses a stacked model within the auto-encoder for accurate prediction of gene expression. The paper investigates the performance of four machine learning classifiers, namely Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbours (KNN), and Multi-Layer Perceptron (MLP), on a cancer diagnosis dataset, reporting Precision, Recall, F1-Score, and Support across multiple cancer types. The results show that MLP achieves the highest average Precision (0.92), with a Recall of 0.75 and an F1-Score of 0.74. SVM follows closely with an average Precision of 0.89, Recall of 0.78, and F1-Score of 0.79, demonstrating strong reliability, particularly for cancers such as LUAD, KIRC, and THCA. RF has an average Precision of 0.75, Recall of 0.68, and F1-Score of 0.66, indicating balanced performance but lower scores than SVM and MLP. KNN performs well on certain cancer types but has the lowest overall F1-Score (0.60) and Precision (0.71), showing greater variability across cancer types. These results underscore the superiority of MLP in most scenarios, with SVM offering a competitive alternative for specific cancers.
The study highlights the importance of selecting a classifier based on the specific cancer dataset, with the goal of improving diagnostic accuracy and supporting clinical decision-making.
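
The two ideas the abstract leans on, combining classifier outputs by voting and scoring them per cancer type with Precision, Recall, F1-Score, and Support, can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration: the abstract does not say how VSDA votes (hard majority voting is assumed), the labels and predictions are toy values rather than the paper's data, and the function names `majority_vote`, `per_class_report`, and `macro_average` are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting (assumed): for each sample, pick the label most classifiers chose.

    `predictions` is a list of per-classifier label lists of equal length.
    Ties are broken in favour of the earliest classifier in the list.
    """
    voted = []
    for sample_preds in zip(*predictions):
        counts = Counter(sample_preds)
        top = max(counts.values())
        # First label, in classifier order, that reaches the top vote count.
        voted.append(next(p for p in sample_preds if counts[p] == top))
    return voted


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for each true label."""
    report = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        # Support is the number of true samples of this class.
        report[label] = (precision, recall, f1, tp + fn)
    return report


def macro_average(report):
    """Unweighted mean of precision, recall, and F1 over all classes."""
    n = len(report)
    return tuple(sum(row[i] for row in report.values()) / n for i in range(3))


# Toy labels only; these are NOT the paper's data or results.
y_true = ["LUAD", "LUAD", "KIRC", "KIRC", "THCA", "THCA"]
svm = ["LUAD", "LUAD", "KIRC", "THCA", "THCA", "THCA"]
rf = ["LUAD", "KIRC", "KIRC", "KIRC", "THCA", "LUAD"]
knn = ["KIRC", "LUAD", "KIRC", "KIRC", "LUAD", "THCA"]

y_vote = majority_vote([svm, rf, knn])
for name, preds in [("SVM", svm), ("RF", rf), ("KNN", knn), ("Vote", y_vote)]:
    p, r, f = macro_average(per_class_report(y_true, preds))
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy setup each classifier makes different mistakes, so the majority vote recovers every true label. Real gene expression data would not behave this cleanly; the sketch only shows how averaged Precision, Recall, and F1-Score figures like those in the abstract are typically derived from per-class counts.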
