Research on Constructing a Sentiment Lexicon Based on Weibo Emoticons

Keywords: emoticons, sentiment lexicon, corpus, sentiment polarity

Abstract:

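This paper proposes a method for automatically constructing a sentiment lexicon based on Weibo emoticons. A large number of Weibo posts containing emoticons are crawled from the Weibo platform and labeled with a sentiment orientation according to their emoticons, producing a sentiment corpus. The corpus is preprocessed (word segmentation, deduplication, and similar steps), and sentiment words are extracted from the posts using part-of-speech rules. For each sentiment word, the method counts its occurrences in the positive and negative corpora, computes its chi-square statistic as a measure of sentiment intensity, and determines its polarity from its probability of occurrence in positive versus negative posts. The words that result form the sentiment lexicon. The authors present this as a new approach. With a manually annotated sentiment lexicon as the benchmark, experiments show that the method labels sentiment words with an accuracy of about 80%, and that the generated lexicon achieves its best overall F-score, above 82%, when the sentiment word intensity threshold θ is set to 20 or 30.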
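The scoring step described in the abstract can be sketched roughly as follows. The abstract does not give the exact formulas, so this sketch assumes the standard chi-square statistic of a 2x2 contingency table (word present or absent versus positive or negative post), counts each word at most once per post, and compares simple relative frequencies to decide polarity. The function names `chi_square` and `build_lexicon`, and the choice to drop words whose score falls below θ, are illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter


def chi_square(a, b, n_pos, n_neg):
    """Chi-square statistic of a 2x2 table (assumed formula, not from the paper).

    a: positive posts containing the word
    b: negative posts containing the word
    n_pos, n_neg: total number of positive / negative posts
    """
    c = n_pos - a  # positive posts without the word
    d = n_neg - b  # negative posts without the word
    n = n_pos + n_neg
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def build_lexicon(pos_posts, neg_posts, theta):
    """Map each candidate word to (polarity, strength).

    pos_posts / neg_posts: non-empty lists of already segmented posts
    (lists of words). Words scoring below theta are dropped (assumption).
    """
    # Count in how many posts of each class a word appears.
    pos_df = Counter(w for post in pos_posts for w in set(post))
    neg_df = Counter(w for post in neg_posts for w in set(post))
    n_pos, n_neg = len(pos_posts), len(neg_posts)

    lexicon = {}
    for word in set(pos_df) | set(neg_df):
        a, b = pos_df[word], neg_df[word]
        strength = chi_square(a, b, n_pos, n_neg)
        if strength < theta:
            continue
        # Polarity: the class in which the word is relatively more frequent.
        # Equal frequencies give a chi-square of 0, so with theta > 0
        # such words never reach this point.
        polarity = "positive" if a / n_pos > b / n_neg else "negative"
        lexicon[word] = (polarity, strength)
    return lexicon
```

In practice the inputs would be the emoticon-labeled corpora after segmentation and part-of-speech filtering. The thresholds of 20 and 30 reported in the abstract apply to the authors' corpus and scoring, so they would not transfer directly to a toy data set.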
