Optimizing the YOLOv11 Model: Improving Small Object Detection Performance with a Multi-Scale Attention Mechanism
Abstract:
Small object detection in remote sensing images is an increasingly prominent problem, and traditional object detection methods are limited in how precisely they can localize small objects. To address this, this paper proposes a multi-scale attention optimization method based on the YOLOv11 model. First, the 20 × 20 detection layer that YOLOv11 uses for large objects is removed and a 160 × 160 detection layer for small objects is added, improving small object detection accuracy. Second, the EIoU (Efficient Intersection over Union) loss function replaces the CIoU loss function, addressing CIoU's localization problem on objects with large aspect ratio differences, which accelerates convergence and improves localization accuracy. Finally, spatial attention and channel attention mechanisms are combined to strengthen the model's perception of objects at different scales. Experimental results show that, on multiple remote sensing image datasets, the optimized YOLOv11 model achieves significantly higher precision, recall, and F1 score than the baseline YOLOv11, with stronger robustness and higher detection accuracy on small object detection tasks in particular. These results indicate that the proposed method effectively improves small object detection performance and offers a new solution for remote sensing image analysis.
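To make the loss substitution described in the abstract concrete, the following is a minimal sketch of the commonly cited EIoU formulation: one minus the IoU, plus a normalized center-distance penalty, plus separate width and height penalties normalized by the smallest enclosing box. It is not the paper's implementation. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration.

```python
def eiou_loss(box_p, box_g, eps=1e-9):
    """Illustrative EIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Hypothetical helper, not taken from the paper's code.
    """
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g
    pw, ph = px2 - px1, py2 - py1  # predicted width and height
    gw, gh = gx2 - gx1, gy2 - gy1  # ground-truth width and height

    # Intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Squared center distance, normalized by the enclosing box diagonal.
    dx = (px1 + px2) / 2 - (gx1 + gx2) / 2
    dy = (py1 + py2) / 2 - (gy1 + gy2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height are penalized separately, unlike CIoU, which
    # penalizes a single aspect-ratio consistency term.
    width_term = (pw - gw) ** 2 / (cw * cw + eps)
    height_term = (ph - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

For a square ground-truth box `(0, 0, 10, 10)` and a concentric prediction `(-5, 2.5, 15, 7.5)` with the same area but a 4:1 aspect ratio, the center term is zero while the width and height terms each contribute 0.25. This shows how the loss keeps a direct gradient signal on each side length when aspect ratios differ.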