Semantic Segmentation of Remote Sensing Images Based on an Improved DeepLabV3+ Model
Abstract:
Since China launched its rural revitalization strategy, the improvement and expansion of rural farmland has been continuously promoted, and detecting and extracting farmland area from remote sensing images plays a vital role in safeguarding the development of rural industry. On the dataset used in this paper, the standard DeepLabV3+ model pays insufficient attention to the features of strip-shaped and discretely distributed regions, and it lacks an adaptive process for fusing feature layers. To address these problems, this paper proposes a lightweight improved DeepLabV3+ model with MobileNetV2 as the backbone network. First, MobileNetV2 extracts features from the remote sensing image. Second, the global average pooling layer in the atrous spatial pyramid pooling (ASPP) module is replaced with a strip pooling layer, which makes it easier to extract feature information from strip-shaped and discrete regions. Finally, to obtain multi-scale feature information, an adaptive feature fusion module is added to fuse features from different feature layers and improve the feature representation ability of the network. Experimental results show that, across different types of datasets, the improved model outperforms the standard DeepLabV3+ model and existing mainstream network models to some extent, mitigating some shortcomings of DeepLabV3+ and improving the semantic segmentation of remote sensing images.
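To illustrate why replacing global average pooling with strip pooling can help with strip-shaped regions, the following is a minimal NumPy sketch, not the implementation used in the paper. The function names `global_average_pool` and `strip_pool` and the toy feature map are assumptions introduced for illustration, and the sketch omits the learned convolutions and gating that a full strip pooling layer would normally include.

```python
import numpy as np


def global_average_pool(x):
    """Collapse each channel of a (C, H, W) map to one value, shape (C, 1, 1)."""
    return x.mean(axis=(1, 2), keepdims=True)


def strip_pool(x):
    """Average over full-width rows and full-height columns of a (C, H, W) map.

    Position (i, j) of the result holds mean(row i) + mean(column j),
    so the output keeps the (C, H, W) shape of the input.
    Hypothetical simplification: no learned 1D convolutions or gating.
    """
    row_strips = x.mean(axis=2, keepdims=True)  # (C, H, 1): one value per row
    col_strips = x.mean(axis=1, keepdims=True)  # (C, 1, W): one value per column
    return row_strips + col_strips              # broadcasts to (C, H, W)


# Toy single-channel feature map with a narrow vertical strip in column 2,
# standing in for a long, thin structure such as a field boundary.
x = np.zeros((1, 4, 6))
x[0, :, 2] = 1.0

gap = global_average_pool(x)  # one scalar per channel; strip position is lost
sp = strip_pool(x)            # response stays concentrated on column 2

print("global average pooling:", gap.ravel())
print("strip pooling, first row:", sp[0, 0])
```

In this toy case global average pooling reduces the whole map to a single number per channel, whereas the strip-pooled map still peaks along the column that contains the structure, which is the kind of long-range, direction-aware context the abstract refers to.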