%0 Journal Article
%T Context Attention Network for Scene Text Recognition
%A 董田荣
%J Software Engineering and Applications
%P 345-353
%@ 2325-2278
%D 2023
%I Hans Publishing
%R 10.12677/SEA.2023.122035
%X Recognizing irregular text in natural scenes is an active research topic in computer vision and remains a challenging task. This paper proposes a simple and effective method for recognizing irregular text. The method uses a thin plate spline (TPS) transformation to convert irregular text into regular text, extracts text features with a ResNet34 that incorporates a spatial multi-scale perception module, and then encodes the text features into contextual features with a Bi-LSTM. The whole model is supervised separately by a context-aware module and a text feature enhancement module. The context-aware module attends to a new feature space formed from the text features and the contextual features, while the text feature enhancement module focuses on individual characters in order to handle text lines that lack contextual semantics. Compared with other text recognition models, the proposed method considerably improves the recognition of irregular text while maintaining recognition performance on regular text. Extensive experiments on commonly used scene text datasets verify the effectiveness of the model for irregular text recognition.
%K Text Recognition
%K Irregular Text
%K Thin Plate Spline
%K Bi-LSTM
%K Multi-Scale Perception
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=64929