Deep Sparse Gated Transformer Model for Image Deblurring
Abstract:
Transformer-based image deblurring methods have achieved remarkable results. Most existing Transformer-based image restoration methods build their internal modules from a self-attention layer followed by a feed-forward network. To reduce the substantial computational overhead and time cost of this design, this paper proposes a deep sparse gated self-attention solver that fuses spatial and channel features simultaneously. Using Top-k sparse selection and ReLU2 sparse activation, the method converts attention into a deeply sparse form, which effectively eliminates the redundant representations caused by global token interactions and also strengthens channel feature fusion. In addition, a discriminative frequency-domain gating module adaptively preserves and enhances the features that benefit image restoration, further completing spatial feature fusion. A neural network built from these basic modules achieves state-of-the-art results on the GoPro benchmark dataset.
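The combination of Top-k selection and ReLU2 activation can be illustrated with a small, dependency-free sketch. It is not the paper's implementation: it assumes that ReLU2 means squared ReLU, that the activation replaces softmax with no further normalization, and that selection keeps the k largest scaled dot-product scores per query row. The function names `relu_squared` and `topk_relu2_attention` are hypothetical, and the rows may stand for tokens or for channels depending on how the features are laid out.

```python
import math


def relu_squared(x):
    """Squared ReLU: max(x, 0) ** 2 (assumed meaning of "ReLU2")."""
    return max(x, 0.0) ** 2


def topk_relu2_attention(q, k, v, top_k):
    """Illustrative sparse attention on nested lists.

    q: n_q x d queries, k: n_k x d keys, v: n_k x d_v values.
    Returns an n_q x d_v list of output rows.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        # Scaled dot-product score of this query against every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Top-k selection: keep only the indices of the k largest scores.
        keep = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:top_k]
        # Squared ReLU in place of softmax (assumption): kept scores that are
        # negative also become exactly zero, so the row is sparsified twice.
        weights = {j: relu_squared(scores[j]) for j in keep}
        # Weighted sum of the value rows that survived both steps.
        out.append([sum(w * v[j][c] for j, w in weights.items())
                    for c in range(len(v[0]))])
    return out
```

Calling `topk_relu2_attention(q, k, v, top_k=2)` on a query with three keys discards the lowest-scoring key outright, and any kept key with a negative score contributes nothing after the activation. Under these assumptions, that is one way to read "deep sparse": selection and activation each remove entries.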