-  2018 

基于事件分析的Web地震新闻时空信息挖掘研究 Web based extraction of spatiotemporal information of earthquake event by semantic technology

Keywords: Web earthquake news, information mining, event frame, text analysis


Abstract:

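To meet the need for mining earthquake news on the Web, a web crawler is used to collect news text as the research corpus. An improved TF-IDF (Term Frequency-Inverse Document Frequency) algorithm is trained on the corpus, and the feature words with the largest weights are selected to make a preliminary identification of earthquake-related documents. Feature words are then used as the elements that describe an earthquake event, and a knowledge frame for earthquake events is constructed. Candidate event sentences are obtained from the earthquake-related documents by matching the feature words of the frame elements. These sentences are parsed syntactically to summarize the forms and regularities in which earthquake elements appear, and from these the extraction rules are constructed and an extraction algorithm is written. Earthquake event identification and extraction experiments were carried out, and the extraction accuracy was analyzed and evaluated. The results indicate that the method identifies and extracts earthquake events with high accuracy and is a promising approach to mining topical events from the Web.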
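The first two stages of the pipeline described in the abstract (weighting feature words, then matching frame elements against sentences) can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: it uses the textbook TF-IDF formula rather than the authors' improved variant, which the abstract does not specify; documents are assumed to be already tokenized; and the function names and the example frame are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tfidf_weights(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. This is plain TF-IDF, not the
    improved variant mentioned in the abstract (an assumption).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


def top_feature_words(docs, k):
    """Sum the TF-IDF weights over the corpus and keep the k largest."""
    totals = defaultdict(float)
    for doc_weights in tfidf_weights(docs):
        for term, weight in doc_weights.items():
            totals[term] += weight
    ranked = sorted(totals, key=lambda term: totals[term], reverse=True)
    return ranked[:k]


def candidate_sentences(text, frame):
    """Keep sentences that contain a feature word of any frame element.

    frame maps a hypothetical element name (e.g. "magnitude") to its
    feature words. Returns (sentence, matched_elements) pairs.
    """
    sentences = [s.strip() for s in re.split(r"[。!?!?]", text) if s.strip()]
    result = []
    for sentence in sentences:
        matched = [
            element for element, words in frame.items()
            if any(word in sentence for word in words)
        ]
        if matched:
            result.append((sentence, matched))
    return result
```

With a frame such as `{"magnitude": ["级"], "location": ["发生"]}`, `candidate_sentences` keeps only the sentences that mention at least one element; the syntactic analysis and rule-based extraction that the abstract describes would then operate on those candidates and are not covered by this sketch.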

