Research on an Image Retrieval Algorithm Based on a Global Attention Mechanism
Abstract:
Large variations in image scale and high similarity between targets reduce accuracy in image retrieval. To address these problems, this paper proposes an image retrieval algorithm based on multi-feature fusion. A residual network (ResNet50) extracts image features, and a Global Attention Mechanism (GAM) is added to the network. The original features extracted by the network are fused with the features produced by GAM, so that the key parts of an image receive more attention. Experiments show that the proposed algorithm achieves high retrieval accuracy and robustness.
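As a rough illustration of the fusion idea described in the abstract, the following dependency-free Python sketch applies a simplified channel-attention gate to a feature map, fuses the gated features with the original ones, and ranks a gallery by cosine similarity. Everything here is an assumption made for illustration: the function names, the random weights, the toy feature maps standing in for ResNet50 outputs, fusion by element-wise addition, and average pooling. The abstract does not specify these details, and the gate covers only a channel-style attention step, not the full GAM module.

```python
import math
import random


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def channel_gate(feature_map, w1, w2):
    """Simplified channel attention (assumption, not the full GAM module).

    feature_map is a list of spatial positions, each a list of C channel
    values. A two-layer MLP over the channels yields a sigmoid gate that
    rescales each channel at each position.
    """
    attended = []
    for x in feature_map:
        hidden = [max(0.0, h) for h in matvec(w1, x)]  # ReLU
        gate = [1.0 / (1.0 + math.exp(-g)) for g in matvec(w2, hidden)]
        attended.append([xi * gi for xi, gi in zip(x, gate)])
    return attended


def fused_descriptor(feature_map, w1, w2):
    """Fuse original and attended features, then pool and L2-normalize."""
    attended = channel_gate(feature_map, w1, w2)
    # Fusion by element-wise addition is an assumption; concatenation
    # would be another plausible choice.
    fused = [[a + b for a, b in zip(x, y)]
             for x, y in zip(feature_map, attended)]
    pooled = [sum(col) / len(fused) for col in zip(*fused)]  # average pool
    norm = math.sqrt(sum(v * v for v in pooled)) or 1.0
    return [v / norm for v in pooled]


def rank_gallery(query_desc, gallery_descs):
    """Return gallery indices sorted by descending cosine similarity."""
    scores = [sum(q * g for q, g in zip(query_desc, d))
              for d in gallery_descs]
    return sorted(range(len(gallery_descs)),
                  key=lambda i: scores[i], reverse=True)


# Toy data: 4 gallery "images", each with 4 positions and 8 channels.
rng = random.Random(0)
C, HIDDEN, POSITIONS = 8, 2, 4
w1 = [[rng.gauss(0, 0.5) for _ in range(C)] for _ in range(HIDDEN)]
w2 = [[rng.gauss(0, 0.5) for _ in range(HIDDEN)] for _ in range(C)]
gallery_maps = [[[rng.random() for _ in range(C)] for _ in range(POSITIONS)]
                for _ in range(4)]

gallery_descs = [fused_descriptor(m, w1, w2) for m in gallery_maps]
# The query reuses gallery image 2, so index 2 should rank first.
query_desc = fused_descriptor(gallery_maps[2], w1, w2)
ranking = rank_gallery(query_desc, gallery_descs)
print(ranking)
```

In a real pipeline the toy feature maps would be replaced by the output of a ResNet50 backbone and the gate weights would be learned during training; the sketch only shows how fused descriptors could feed a similarity ranking.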