- 2018
A Chinese Place Name Recognition Method Based on Composite Features
Abstract:
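Chinese place name recognition is an important research topic in named entity recognition and is key to raising the level of geographic information system applications. Traditional place name recognition relies mainly on part-of-speech features or place-name element features, so the range of feature types is limited. This paper proposes a Chinese place name recognition method based on composite features. It examines how Chinese place names behave in natural language and designs four syntactic features: type, path, distance, and number. Combining three kinds of features (place-name element features, part-of-speech features, and syntactic features), it trains and recognizes Chinese place names with a conditional random field model. Experiments compare the effect of the composite features on Chinese place name recognition. The results show that composite features effectively improve precision and recall, and that they work particularly well for complex place names.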
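The abstract names the three feature groups but does not define them, so the following is only a minimal sketch of how such per-token composite features might be assembled before being passed to a conditional random field toolkit. Everything in it is an assumption made for illustration: the `Token` structure, the `ELEMENT_SUFFIXES` list, the feature names, and in particular the reading of the four syntactic features as dependency relation type, label path to the root, distance to the head, and number of dependents. It is not the authors' implementation.

```python
# Hypothetical sketch: all names and feature definitions below are assumptions,
# not the definitions used in the paper.
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    word: str
    pos: str   # part-of-speech tag
    head: int  # index of the dependency head, -1 for the root
    rel: str   # dependency relation label to the head


# Assumed (incomplete) list of place-name element suffixes.
ELEMENT_SUFFIXES = ("省", "市", "县", "区", "路", "街", "村")


def path_to_root(tokens: List[Token], i: int) -> List[str]:
    """Collect relation labels from token i up to the root."""
    labels = []
    # Bounded loop so a malformed (cyclic) parse cannot hang the sketch.
    for _ in range(len(tokens)):
        if i == -1:
            break
        labels.append(tokens[i].rel)
        i = tokens[i].head
    return labels


def token_features(tokens: List[Token], i: int) -> Dict[str, object]:
    """Build one feature dict combining element, POS, and syntactic features."""
    tok = tokens[i]
    return {
        "word": tok.word,
        # Part-of-speech feature.
        "pos": tok.pos,
        # Place-name element feature (assumed: suffix lookup).
        "is_element": tok.word.endswith(ELEMENT_SUFFIXES),
        # Four syntactic features (assumed interpretations).
        "syn_type": tok.rel,
        "syn_path": "/".join(path_to_root(tokens, i)),
        "syn_distance": 0 if tok.head == -1 else abs(tok.head - i),
        "syn_count": sum(1 for t in tokens if t.head == i),
    }


def sentence_features(tokens: List[Token]) -> List[Dict[str, object]]:
    """Feature dicts for a whole sentence, one per token."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Toy sentence with a hand-written parse, for illustration only.
sent = [
    Token("我", "r", 1, "SBV"),
    Token("住在", "v", -1, "HED"),
    Token("南京市", "ns", 1, "VOB"),
]
features = sentence_features(sent)
```

A list of feature dicts per sentence is the input shape that several CRF libraries accept, which is why the sketch stops there rather than training a model.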