A Statistics-Based Study of Prepositional Phrase Boundary Identification

pp. 636-640

Keywords: prepositional phrase, support vector machine, maximum entropy, conditional random field


Abstract:

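The experimental corpus is text from the People's Daily (《人民日报》) that has been word-segmented and annotated with part-of-speech tags and prepositional phrases. From it, the 61 prepositions that occur more than 20 times are selected as the objects of study. Three statistical models (support vector machine, maximum entropy, and conditional random field) are applied to prepositional phrase boundary identification. The experimental results show that, of the three models, the conditional random field performs best, reaching a micro-averaged precision of 95.68%.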
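The abstract reports a micro-averaged score over 61 prepositions but does not spell out the computation. The sketch below shows one common reading, in which the counts for all prepositions are pooled before dividing, and contrasts it with a macro average. The function names and the per-preposition counts are hypothetical illustrations, not data or code from the paper, and the paper's exact definition of the metric is an assumption here.

```python
from typing import Dict, Tuple

# Maps a preposition to (correctly identified phrases, phrases the model proposed).
# These pairs are hypothetical and serve only to illustrate the arithmetic.
Counts = Dict[str, Tuple[int, int]]


def micro_average(counts: Counts) -> float:
    """Pool the counts of all prepositions, then divide once."""
    total_correct = sum(correct for correct, _ in counts.values())
    total_proposed = sum(proposed for _, proposed in counts.values())
    if total_proposed == 0:
        raise ValueError("no proposed phrases to score")
    return total_correct / total_proposed


def macro_average(counts: Counts) -> float:
    """Score each preposition separately, then take the unweighted mean."""
    scores = [correct / proposed for correct, proposed in counts.values() if proposed > 0]
    if not scores:
        raise ValueError("no proposed phrases to score")
    return sum(scores) / len(scores)


# Hypothetical counts for two prepositions of very different frequency.
example: Counts = {"在": (90, 100), "对": (8, 10)}
micro = micro_average(example)  # 98 / 110
macro = macro_average(example)  # (0.9 + 0.8) / 2
```

Under this reading, frequent prepositions dominate the micro average, whereas the macro average weights every preposition equally regardless of how often it occurs.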

