|
- 2018
基于词缀的维吾尔谚语识别关键技术研究Keywords: 维吾尔谚语, 谚语词缀, 谚语规则, 词缀覆盖率, 谚语规则库, 谚语语料库, 识别系统Uyghur proverbs, proverbs affix, proverb rules, coverage rate of affix, proverb rule bases, proverb corpus, recognition system Abstract: 在自然语言理解、机器翻译、舆情分析等自然语言处理领域中,维吾尔谚语识别是整个文本实体识别的重要组成部分。为满足维吾尔谚语信息化的需求,本文构建了比较完善的维吾尔谚语语料库。同时,从传统语言学角度对维吾尔谚语的语法、语义结构进行分析,构建了一个由维吾尔谚语功能语类(词缀)组成的、专属维吾尔谚语规则的知识库,并将此知识库与自然语言处理技术相结合,实现一个既能够从文本中识别出维吾尔谚语,又能提供维汉互译等功能的信息软件系统。该系统也为开展计算机理解与处理维吾尔文字奠定了一个崭新的基础。In fields of natural language processing such as natural language understanding, machine translation, and public opinion analysis, Uyghur proverb recognition is an important part of the whole text entity recognition. To meet the need of Uyghur proverb informationization, this paper establishes a relatively complete corpus of Uyghur proverbs. The grammar and semantic structure of Uygur proverbs were analyzed from the perspective of traditional linguistics, and a knowledge base that comprises functional genres (affixes) of Uyghur proverbs and obeys Uyghur proverb rules was constructed. In addition, the knowledge base was combined with natural language processing technologies to realize an information software system that can recognize Uyghur proverbs from text and mutually translate between Chinese and Uyghur language. The system also laid a new foundation for understanding and processing Uyghur language and characters by computer
|