%0 Journal Article
%T A hybrid method for recognizing complex Vietnamese named entities by incorporating entity properties
%A 刘艳超
%A 郭剑毅
%A 余正涛
%A 周兰江
%A 严馨
%A 陈秀琴
%J 智能系统学报
%D 2016
%R 10.11992/tis.201606009
%X Named entity recognition (NER) is a basic task in natural language processing. To address the difficulty of recognizing complex Vietnamese named entities and the resulting low F-measure, this paper proposes a hybrid method for Vietnamese NER that incorporates an entity library. First, local and global features are selected according to the characteristics of the Vietnamese language and its entities, and a maximum entropy model is applied to recognize Vietnamese named entities. Second, Vietnamese named entities are recognized using the named entity rules formulated in this paper. Third, the two sets of results are combined, with the rules taking priority and the statistical model serving as a supplement. Finally, after manual proofreading, correctly labeled entities are added to the entity library, dynamically expanding it and providing richer data and a basis for rule formulation and feature selection. Experiments show that the method effectively combines the strengths of rule-based and statistical approaches so that each offsets the other's weaknesses, and that it clearly improves precision, recall, and F-measure.
%K Vietnamese
%K entity library construction
%K entity recognition
%K maximum entropy
%K rules
%K entity characteristics
%U http://tis.hrbeu.edu.cn/oa/darticle.aspx?type=view&id=20160410