SMLS: A Classification Model for Cervical Image Recognition
Abstract:
Cervical cancer is the fourth most prevalent and most dangerous cancer among women worldwide, yet if it is detected and treated in time, the cure rate approaches 100%. Colposcopy is a common medical procedure used to assess the severity of cervical lesions. However, as the number of cervical cancer cases rises every year, physicians face a growing workload, which can lead to diagnostic errors and missed diagnoses during manual visual inspection. Because the traditional Vision Transformer (ViT) model does not fully exploit its feature extraction capability and runs inefficiently, this study proposes a new model that improves ViT with depthwise separable convolution and multi-level multi-scale feature fusion (SMLS), supplemented by transfer learning to improve accuracy. The method aims to categorize cervical colposcopy images into three classes: normal, cervical intraepithelial neoplasia, and invasive carcinoma. First, data augmentation is applied to enlarge the dataset; next, the model parameters are fine-tuned on the cervical image dataset; the model's effectiveness is then validated by comparison with four traditional neural network models; finally, ablation experiments verify the contribution of each module. The experimental findings demonstrate that the proposed method achieves an accuracy of 87.80% on a limited dataset. The method makes full use of the model's feature extraction capability, delivers good recognition performance, and can be applied to the rapid diagnostic recognition of cervical images.
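The efficiency gain attributed to depthwise separable convolution can be illustrated with a quick parameter count. The sketch below is a generic illustration, not the paper's implementation; the layer sizes (64 input channels, 128 output channels, 3×3 kernel) are assumptions chosen only for the example.

```python
def conv_params(c_in, c_out, k):
    # Standard convolution: every output channel mixes all input
    # channels with its own k x k kernel.
    return c_in * c_out * k * k

def separable_conv_params(c_in, c_out, k):
    # Depthwise step: one k x k kernel per input channel.
    depthwise = c_in * k * k
    # Pointwise step: a 1 x 1 convolution that mixes channels.
    pointwise = c_in * c_out
    return depthwise + pointwise

# Assumed layer sizes, for illustration only.
c_in, c_out, k = 64, 128, 3
standard = conv_params(c_in, c_out, k)             # 73728 weights
separable = separable_conv_params(c_in, c_out, k)  # 8768 weights
print(standard, separable, round(separable / standard, 3))
```

The ratio works out to roughly 1/c_out + 1/k², so the saving grows with the kernel size and the number of output channels, which is the motivation behind lightweight Xception-style architectures.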