-  2015 

基于词亲和度的微博词语语义倾向识别算法
Semantic Orientation Identification for Terms From Chinese Micro-blogs Based on Word Affinity Measure

DOI: 10.16337/j.1004-9037.2015.01.013

Keywords: micro-blog, opinionated terms, sentiment analysis, semantic orientation, word affinity measure


Abstract:

Accurately identifying the semantic orientation of terms and building a high-quality sentiment dictionary is important for improving the accuracy of sentiment analysis on micro-blog text. Traditional corpus-based methods are sensitive to the choice of seed words and cannot effectively identify the semantic orientation of low-frequency terms. To address this, an algorithm based on a word affinity measure is proposed to identify the semantic orientation of terms from Chinese micro-blogs. First, candidate words are extracted using part-of-speech combination patterns. Second, micro-blog emoticons are selected as seed words and a word affinity network is built. Then, low-frequency words are expanded using the Tongyici Cilin synonym thesaurus, and the semantic orientation similarity between candidate words and seed words is calculated. Finally, the semantic orientation of each term is determined against a preset threshold. The proposed algorithm is compared with traditional algorithms on a corpus of two million micro-blog posts, and the experimental results show that it outperforms them.
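The abstract does not define the word affinity measure itself, so the pipeline can only be illustrated loosely. The following is a minimal sketch, assuming pointwise mutual information (PMI) over post-level co-occurrence as a stand-in for the paper's affinity measure. All function names, the toy corpus, the emoticon tokens, and the threshold values are hypothetical. A plain dictionary stands in for the synonym thesaurus, and the part-of-speech candidate extraction step is omitted.

```python
import math
from collections import Counter
from itertools import combinations


def build_cooccurrence(posts):
    """Count, per token and per unordered token pair, the posts they appear in."""
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in posts:
        unique = sorted(set(tokens))
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return word_counts, pair_counts


def pmi(a, b, word_counts, pair_counts, n_posts):
    """PMI of two tokens; 0.0 when they never co-occur (assumed stand-in measure)."""
    joint = pair_counts.get(tuple(sorted((a, b))), 0)
    if joint == 0:
        return 0.0
    return math.log2(joint * n_posts / (word_counts[a] * word_counts[b]))


def orientation_score(word, pos_seeds, neg_seeds, word_counts, pair_counts,
                      n_posts, synonyms=None):
    """Average (affinity to positive seeds - affinity to negative seeds)
    over the word and its synonyms, so a rare word borrows evidence."""
    group = [word] + list((synonyms or {}).get(word, []))
    total = 0.0
    for w in group:
        total += sum(pmi(w, s, word_counts, pair_counts, n_posts) for s in pos_seeds)
        total -= sum(pmi(w, s, word_counts, pair_counts, n_posts) for s in neg_seeds)
    return total / len(group)


def classify(score, threshold):
    """Map a score to a label using a symmetric threshold (hypothetical rule)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Toy corpus invented for illustration: each post is a list of tokens,
# with bracketed tokens playing the role of emoticon seed words.
POSTS = [
    ["great", "[smile]"],
    ["great", "[smile]"],
    ["awful", "[angry]"],
    ["awful", "[angry]"],
]
WORD_COUNTS, PAIR_COUNTS = build_cooccurrence(POSTS)
POS_SEEDS = ["[smile]"]
NEG_SEEDS = ["[angry]"]
SYNONYMS = {"superb": ["great"]}  # stand-in for a thesaurus lookup
```

In this sketch a word that never appears in the corpus, such as "superb", can still receive a non-zero score through its synonym entry, which mirrors the low-frequency expansion idea in the abstract without claiming to reproduce it.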
