A Liver Tumor Segmentation Network with Hybrid CNN-Transformer Multi-Scale Features
Abstract:
Liver tumor segmentation is challenging because the boundary between tumor regions and normal liver tissue is blurred and because tumors vary widely in size and shape. To address these challenges, this work studies a dual-encoder segmentation algorithm that fuses multi-scale features and proposes MCT-Net (Mix CNN-Transformer Multi-scale Feature Network), a medical image segmentation algorithm built on that design. First, in the encoder stage, a Detail Feature Extraction Module (DFEM) extracts detailed local features of the tumor edge, while a Transformer encoder provides a larger receptive field and improves perception of the tumor as a whole without sacrificing the segmentation of edge details. Second, a Local Spatial Attention Module (LAM) and a Global Spatial Attention Module (GAM) are designed for the respective branches; they reduce computational complexity while strengthening the representation of tumor features, so that the local edge information and the global semantic information of variable tumors are both fully learned. Furthermore, multi-axis self-attention (Max-SA) is added to the Transformer encoder, decomposing the fully dense attention mechanism into two more lightweight variants. In the decoder stage, a Multi-scale Information Fusion module (MSIF) is designed to make effective use of the different information produced by the parallel encoders; it passes and complements information across scales and thereby improves boundary segmentation accuracy. Finally, the proposed method is evaluated on the public LiTS2017 dataset and tested for generalization on 3D-IRCADb. On LiTS2017 it achieves a Dice score of 72.16% and an ASD of 3.380 mm.
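The Dice score reported above measures the overlap between a predicted mask and the ground-truth mask. The following is a minimal sketch of how such a score is commonly computed for binary masks, using the standard definition Dice = 2|A ∩ B| / (|A| + |B|). The function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 for two empty masks are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks (nested lists of 0/1)."""
    flat_pred = [v for row in pred for v in row]
    flat_target = [v for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labeled as tumor in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    # Total tumor pixels in the prediction plus the ground truth.
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)

    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 3x4 masks: 4 predicted tumor pixels, 3 true ones, 3 shared.
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
target = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

score = dice_coefficient(pred, target)
print(f"Dice = {score:.4f}")  # 2*3 / (4+3) = 0.8571
```

In practice the score is usually computed per case over the whole volume and then averaged, but how the paper aggregates its reported value is not stated in the abstract.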