Construction Method for BFS-CTC, a Chinese Corpus Annotated with Sentential Semantic Structure

Keywords: Chinese information processing, sentential semantic analysis, sentential semantic structure, semantic annotation, corpus


Abstract:

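Drawing on modern Chinese semantics, a hierarchical model of sentential semantic structure is constructed. Based on this model, a Chinese corpus annotated with sentential semantic structure is built: the Beijing Forest Studio-Chinese Tagged Corpus (BFS-CTC). Annotation and management tools developed in-house are used to quickly annotate each sentential semantic component in the model and the relations by which the components combine, which reduces training effort and annotation cost. BFS-CTC covers 6 sentence-pattern types and contains about 10,000 sentences. It provides lexical and syntactic annotation that conforms to existing specifications, together with sentential semantic structure annotation that follows a custom specification. This makes it easier to compare the lexical, syntactic, and sentential semantic levels side by side, and to use the corpus in an integrated way and for horizontal analysis. BFS-CTC is also highly extensible: other extension libraries and annotation resources can be generated on top of the core annotated corpus.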
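To make the layered annotation idea concrete, the following is a minimal sketch of how one corpus entry with lexical, syntactic, and hierarchical sentential semantic layers might be represented. It is not the BFS-CTC format, which the abstract does not specify: the class names `SemanticNode` and `AnnotatedSentence`, the helper `walk`, the component labels ("sentence", "predicate", "argument"), the tags, and the bracketed parse string are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class SemanticNode:
    """One component in a hypothetical hierarchical sentence-meaning tree."""
    label: str                      # hypothetical component label
    text: str = ""                  # surface text covered by this component
    children: List["SemanticNode"] = field(default_factory=list)

    def depth(self) -> int:
        """Number of levels in the subtree rooted at this node."""
        return 1 + max((c.depth() for c in self.children), default=0)


@dataclass
class AnnotatedSentence:
    """One corpus entry holding three parallel annotation layers."""
    tokens: List[str]               # lexical layer: segmented words
    pos_tags: List[str]             # lexical layer: one tag per token
    syntax: str                     # syntactic layer: bracketed parse (illustrative)
    semantics: SemanticNode         # sentential semantic layer


def walk(node: SemanticNode) -> Iterator[SemanticNode]:
    """Yield every semantic component in pre-order (parent before children)."""
    yield node
    for child in node.children:
        yield from walk(child)


# Illustrative entry for the sentence "我读书" ("I read books").
# All tags and labels below are assumptions, not the corpus's tag set.
sentence = AnnotatedSentence(
    tokens=["我", "读", "书"],
    pos_tags=["r", "v", "n"],
    syntax="[S [NP 我] [VP 读 [NP 书]]]",
    semantics=SemanticNode(
        label="sentence",
        text="我读书",
        children=[
            SemanticNode(label="predicate", text="读"),
            SemanticNode(label="argument", text="我"),
            SemanticNode(label="argument", text="书"),
        ],
    ),
)

if __name__ == "__main__":
    for component in walk(sentence.semantics):
        print(component.label, component.text)
```

Keeping the three layers on the same record is one simple way to support the side-by-side comparison of lexical, syntactic, and sentential semantic annotation that the abstract describes; a real corpus would more likely store these in a dedicated file format.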

