


A Double-layer Word Segmentation Combined with Local Ambiguity Word Grid and CRF

Keywords: Local ambiguity word grid, CRF, word segmentation


Abstract:

This paper presents a double-layer model for Chinese word segmentation that combines a Local Ambiguity Word Grid with Conditional Random Fields (CRF). In the lower layer, the Local Ambiguity Word Grid algorithm produces a rough segmentation. In the upper layer, a CRF segments the text again, using the rough result as one of its features. The Local Ambiguity Word Grid algorithm is good at detecting ambiguity during segmentation, while the CRF handles in-vocabulary and out-of-vocabulary words equally well. The hybrid approach is therefore an effective way to address both ambiguity and out-of-vocabulary words. The system is evaluated in the closed test on the MSRA and PKU test sets provided by the SIGHAN 2005 Chinese Language Processing Bakeoff, and label sets of four and six tags are compared. The experiments show that the F-measures on the MSRA and PKU test sets in the closed test reach 97.1% and 95.1%, respectively. In addition, the results of the open test indicate that the model is suitable for practical application.
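The two-layer idea can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the abstract: the Local Ambiguity Word Grid algorithm is not specified here, so forward maximum matching stands in for the lower layer; the six-tag set is assumed to be the commonly used B, B2, B3, M, E, S scheme; and the function names and feature keys (`rough_segment`, `words_to_tags`, `char_features`, `"rough"`) are hypothetical. The sketch only shows how a rough segmentation might be turned into a per-character feature for a CRF, not the authors' actual implementation.

```python
from typing import Dict, List, Set


def rough_segment(text: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Lower layer (stand-in): forward maximum matching against a lexicon."""
    words: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if n == 1 or text[i:i + n] in lexicon:
                words.append(text[i:i + n])
                i += n
                break
    return words


def words_to_tags(words: List[str], six_tag: bool = False) -> List[str]:
    """Map each character to a position tag within its word.

    Four tags: B, M, E, S. Six tags (assumed scheme): B, B2, B3, M, E, S.
    """
    tags: List[str] = []
    for w in words:
        if len(w) == 1:
            tags.append("S")
            continue
        middle: List[str] = []
        for k in range(1, len(w) - 1):
            if six_tag and k == 1:
                middle.append("B2")
            elif six_tag and k == 2:
                middle.append("B3")
            else:
                middle.append("M")
        tags.extend(["B"] + middle + ["E"])
    return tags


def char_features(text: str, rough_tags: List[str]) -> List[Dict[str, str]]:
    """Upper-layer input: per-character features, including the rough tag."""
    feats: List[Dict[str, str]] = []
    for i, ch in enumerate(text):
        feats.append({
            "char": ch,
            "prev": text[i - 1] if i > 0 else "<BOS>",
            "next": text[i + 1] if i + 1 < len(text) else "<EOS>",
            "rough": rough_tags[i],  # lower-layer result used as one feature
        })
    return feats


if __name__ == "__main__":
    lexicon = {"研究", "研究生", "生命", "起源"}  # toy lexicon, illustrative only
    text = "研究生命的起源"
    words = rough_segment(text, lexicon)
    for feat in char_features(text, words_to_tags(words)):
        print(feat)
```

On the toy input, the stand-in lower layer greedily picks "研究生" and leaves "命" stranded, which is the kind of overlapping ambiguity the abstract says the lower layer is meant to detect. Feature dictionaries of this shape could be passed to a CRF toolkit that accepts per-token feature maps, with gold-standard tags as training labels.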

