OALib Journal
ISSN: 2333-9721


An Image Inpainting Model Based on Mask Transformer

DOI: 10.12677/CSA.2022.121010, PP. 83-94

Keywords: Mask, Attention Mechanism, Transformer, Query Set, Similarity Matrix


Abstract:

Existing deep-learning-based image inpainting networks typically use an attention mechanism to fill the region to be repaired with information from intact regions via similarity matching, thereby improving the texture detail of the repaired region. However, existing attention metrics consider only feature texture and lack semantic understanding, so information may be drawn from semantically dissimilar regions. To address this problem, this paper proposes an image inpainting network based on a mask transformer. The mask transformer module differs from a basic transformer layer in two main respects: 1) the feature map is divided by the mask into valid and invalid regions, and a mask attention mechanism is proposed to effectively model the similarity between the region to be repaired and the intact regions; 2) a weighted fusion of the query set and the similarity matrix is proposed to accurately fill information into the region to be repaired. Compared with traditional attention mechanisms, the transformer-based method noticeably improves the texture quality of the repaired result.
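The two ideas in the abstract — restricting attention keys to mask-valid positions, and filling hole positions by a weighted fusion of the similarity matrix with the valid features — can be sketched as follows. This is a minimal illustrative sketch only, not the authors' architecture: the function name, the absence of learned query/key/value projections, and the flat (N, C) feature layout are all assumptions for brevity.

```python
import numpy as np

def masked_attention_fill(features, mask):
    """Sketch of mask attention: hole positions attend only to valid positions.

    features: (N, C) array of per-position feature vectors.
    mask:     (N,) boolean array, True = valid (intact), False = hole.
    """
    q = features            # query set: every position queries the valid region
    k = features[mask]      # keys restricted to valid (intact) positions
    v = features[mask]      # values restricted to valid (intact) positions

    # Similarity matrix between queries and valid keys, scaled as in
    # standard dot-product attention.
    scale = np.sqrt(features.shape[1])
    sim = q @ k.T / scale                         # shape (N, N_valid)

    # Numerically stable softmax over the valid keys only; invalid
    # positions never appear as keys, so they contribute no weight.
    sim = sim - sim.max(axis=1, keepdims=True)
    attn = np.exp(sim)
    attn /= attn.sum(axis=1, keepdims=True)

    # Weighted fusion: each position is a similarity-weighted sum of
    # valid-region features.
    filled = attn @ v

    # Only hole positions are overwritten; intact features are kept.
    out = features.copy()
    out[~mask] = filled[~mask]
    return out
```

In a real transformer layer the queries, keys, and values would pass through learned linear projections and multiple heads; the point here is only the masking pattern: holes draw information exclusively from intact regions, which is what prevents the network from copying from other (unfilled) hole pixels.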

