%0 Journal Article
%T A New Error-driven Learning Approach for Chinese Word Segmentation 一种新的错误驱动学习方法在中文分词中的应用
%A XIA Xin-Song
%A XIAO Jian-Guo
%A 夏新松
%A 肖建国
%J 计算机科学
%D 2006
%I
%X A well-known problem in Chinese word segmentation (CWS) is that there is no unique definition of a word. Different standards may yield different segmentation outputs. It is impractical to develop a separate CWS system for each application or standard, so it is important to adapt the segmentation output of an existing CWS system flexibly to different standards or applications. This paper presents a linguistically enriched transformation-based learning approach that performs CWS adaptation as a postprocessor. Unlike other transformation-based learning methods used in CWS, the approach uses linguistic information and introduces word class and word-internal structure into the rule templates and transformations. The approach is evaluated on four test sets representing four different standards. Its performance is comparable to that of several state-of-the-art approaches that perform Chinese word segmentation based on a single standard.
%K Chinese word segmentation
%K Rule template
%K Word class
%K Word internal structure
%K Transformation-based Learning (TBL)
%K 中文分词
%K 规则模板
%K 词类
%K 词内结构
%K 基于转换的学习(TBL)
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=64A12D73428C8B8DBFB978D04DFEB3C1&aid=25990805045629F0&yid=37904DC365DD7266&vid=27746BCEEE58E9DC&iid=38B194292C032A66&sid=1B97AE5098AEB49C&eid=F260CE035846B3B8&journal_id=1002-137X&journal_name=计算机科学&referenced_num=2&reference_num=12