
The Influence of Part of Speech on Online Topic Detection in News and Microblogs

Keywords: topic detection, part of speech, text features, news, microblog


Abstract:

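Experiments were conducted on two representative corpora, news and microblogs, to determine how different part-of-speech features and their combinations affect topic detection on these two general-purpose web platforms. The results show that, when a single part of speech is selected as the feature, nouns give the best detection results, while named entities greatly reduce the feature dimensionality of clustering without sacrificing accuracy. When combinations of parts of speech are used as features, the combination of nouns or named entities, numerals, time phrases, adjectives, and measure words improves the accuracy of topic detection on news, whereas the combination of nouns or named entities, adjectives, measure words, numerals, and special symbols and URLs achieves better detection results on the microblog corpus.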
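As a rough illustration of part-of-speech feature selection (not the authors' implementation), the sketch below keeps only the words whose tags belong to a chosen set and counts them as clustering features. The tag names (`n`, `m`, `t`, `a`, `q`, `x`, `url`), the helper functions, and the sample document are all hypothetical; the input is assumed to be already segmented and tagged by some external tool.

```python
from collections import Counter

# Hypothetical tag sets mirroring the combinations named in the abstract.
# n = noun or named entity, m = numeral, t = time phrase, a = adjective,
# q = measure word, x = special symbol, url = web address.
NEWS_TAGS = {"n", "m", "t", "a", "q"}
MICROBLOG_TAGS = {"n", "a", "q", "m", "x", "url"}


def select_features(tagged_tokens, keep_tags):
    """Return the words whose part-of-speech tag is in keep_tags."""
    return [word for word, tag in tagged_tokens if tag in keep_tags]


def term_frequencies(tagged_tokens, keep_tags):
    """Count the selected words; the counts can feed a clustering step."""
    return Counter(select_features(tagged_tokens, keep_tags))


# A made-up, pre-tagged document used only for illustration.
doc = [
    ("earthquake", "n"),
    ("strikes", "v"),
    ("coastal", "a"),
    ("city", "n"),
    ("on", "p"),
    ("Monday", "t"),
    ("earthquake", "n"),
]

noun_only = select_features(doc, {"n"})
news_features = term_frequencies(doc, NEWS_TAGS)
```

Restricting `keep_tags` to a single tag such as `{"n"}` corresponds to the single part-of-speech setting; passing a larger set corresponds to the combined-feature setting. In both cases the vocabulary, and hence the feature dimensionality, shrinks relative to using every word.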

