软件学报 (Journal of Software), 2000

Analysis and Improvement of Statistics-Based Chinese Part-of-Speech Tagging
基于统计的汉语词性标注方法的分析与改进

Keywords: part-of-speech tagging, n-gram, corpus, grammatical attribute (词性标注, n元语法, 语料, 语法属性)


Abstract:

This paper studies a popular statistics-based training and tagging method for Chinese text, and analyzes the nonlinear relation between the training set and tagging accuracy in terms of the structure and numerical values of the transition probability matrix and the symbol probability matrix. To make fuller use of the training corpus and achieve higher tagging accuracy, the method is improved in two respects: exploiting other grammatical attributes of words, and strengthening the handling of unknown words. With the improved method, the open test and the closed test yielded overall accuracies of about 96.5% and 96%, respectively.
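
For readers unfamiliar with the two matrices named in the abstract, the following is a minimal sketch of a generic bigram tagger of this family, not the paper's method. It assumes a first-order hidden Markov model with add-one smoothing and Viterbi decoding. The function names, the toy corpus, and the tag labels (`r`, `v`, `n`) are all hypothetical, and the paper's two improvements (extra grammatical attributes and its unknown-word processing) are not reproduced here.

```python
import math
from collections import defaultdict


def train(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions from a tagged corpus."""
    trans = defaultdict(lambda: defaultdict(int))  # trans[prev_tag][tag]
    emit = defaultdict(lambda: defaultdict(int))   # emit[tag][word]
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, vocab


def log_prob(table, given, outcome, n_outcomes):
    """Add-one smoothed log P(outcome | given); an assumed smoothing choice."""
    row = table.get(given, {})
    return math.log((row.get(outcome, 0) + 1) / (sum(row.values()) + n_outcomes))


def viterbi(words, trans, emit, vocab):
    """Return the most probable tag sequence for `words` under the bigram model."""
    tags = sorted(emit)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # +1 reserves probability mass for unknown words
    # best[tag] = (log score of the best path ending in tag, that path)
    best = {"<s>": (0.0, [])}
    for word in words:
        new_best = {}
        for tag in tags:
            score, path = max(
                ((s + log_prob(trans, prev, tag, n_tags), p)
                 for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            new_best[tag] = (score + log_prob(emit, tag, word, n_words), path + [tag])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Toy corpus (hypothetical): r = pronoun, v = verb, n = noun.
toy_corpus = [
    [("我", "r"), ("爱", "v"), ("书", "n")],
    [("他", "r"), ("读", "v"), ("书", "n")],
    [("我", "r"), ("读", "v"), ("报", "n")],
]
trans, emit, vocab = train(toy_corpus)
print(viterbi(["他", "爱", "报"], trans, emit, vocab))
```

In this sketch, `trans` plays the role of the transition probability matrix and `emit` the symbol probability matrix. A word absent from the training data receives the same smoothed emission probability under every tag, so its tag is decided by the transition counts alone; this is the kind of weakness that dedicated unknown-word handling aims to address.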
