Object Detection Algorithm Based on the RFB Module and Attention Mechanism
Abstract:
To address the loss of detection accuracy caused by insufficient correlation among the feature maps extracted by deep convolutional networks, this paper proposes an improved SSD object detection algorithm based on the Receptive Field Block (RFB) and Coordinate Attention (CA). ResNet50, a deep feature extraction network, serves as the backbone, and coordinate attention modules are added to its convolutional layers to capture direction-aware and position-aware information. To make full use of the correlation between feature maps, deconvolution and upsampling are applied during feature extraction and prediction to fuse low-level positional features with high-level semantic information. In addition, RFB modules, which combine multi-scale convolution kernels with dilated convolution, are introduced into the network to strengthen feature extraction by enlarging the receptive field. Experiments show that the algorithm reaches an mAP of 78.08% on the PASCAL VOC 2007 dataset, a significant improvement in detection performance over the conventional SSD algorithm.
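The claim that dilated convolution enlarges the receptive field can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the three branch configurations are assumptions made for this example, not the settings used in the paper. It relies on the standard relation that a kernel of size k with dilation rate d spans k + (k - 1)(d - 1) input positions.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span, in input positions, of a kernel with the given dilation rate."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field of a stack of convolution layers.

    `layers` is a sequence of (kernel_size, stride, dilation) tuples,
    listed from the input side to the output side.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between adjacent outputs
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical multi-branch layout (assumed for illustration): each branch
# pairs an ordinary convolution with a dilated 3x3 convolution.
branches = {
    "1x1 conv, then 3x3 conv with dilation 1": [(1, 1, 1), (3, 1, 1)],
    "3x3 conv, then 3x3 conv with dilation 3": [(3, 1, 1), (3, 1, 3)],
    "5x5 conv, then 3x3 conv with dilation 5": [(5, 1, 1), (3, 1, 5)],
}

for name, layers in branches.items():
    print(f"{name}: receptive field = {receptive_field(layers)}")
```

Under these assumed settings the three branches cover 3, 9, and 15 input positions per side, while the dilated 3x3 layer in each branch keeps the same nine weights. This is the sense in which dilation widens the receptive field without adding parameters.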