%0 Journal Article
%T Chinese Chunking with Large Margin Method
%T 基于大间隔方法的汉语组块分析
%A ZHOU Jun-Sheng
%A DAI Xin-Yu
%A CHEN Jia-Jun
%A QU Wei-Guang
%A 周俊生
%A 戴新宇
%A 陈家骏
%A 曲维光
%J 软件学报
%D 2009
%I
%X Chinese chunking plays an important role in natural language processing. This paper presents a large margin method for Chinese chunking based on structural SVMs (support vector machines). First, a sequence labeling model and the formulation of the learning problem are introduced for the Chinese chunking problem, and then the cutting plane algorithm is applied to efficiently approximate the optimal solution of the optimization problem. Finally, an improved F1 loss function is proposed to tackle Chinese chunking. The loss function scales the F1 loss value to the length of the sentence to adjust the margin accordingly, leading to more effective constraint inequalities. Experiments are conducted on the UPENN Chinese Treebank-4 (CTB4), and the Hamming loss function is compared with the improved F1 loss function. The experimental results show that training with the improved F1 loss function achieves higher performance than training with the Hamming loss function. The overall F1 score of Chinese chunking obtained with this approach is 91.61%, which is higher than the performance of state-of-the-art machine learning models such as CRFs (conditional random fields) and SVMs.
%K Chinese chunking
%K large margin
%K discriminative learning
%K loss function
%K 汉语组块分析
%K 大间隔
%K 判别式学习
%K 损失函数
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=66D86E5CD33F368D80666B22EE78D429&yid=DE12191FBD62783C&vid=A04140E723CB732E&iid=E158A972A605785F&sid=971ECAFE8682845B&eid=A766A50385B9FB1F&journal_id=1000-9825&journal_name=软件学报&referenced_num=0&reference_num=17