- 2018
Microblog sentiment analysis method based on a double attention model
Abstract:
Microblog sentiment analysis is the basis for obtaining the opinions of microblog users. Most existing sentiment analysis methods keep deep learning models separate from emotion symbols. To address this, this paper proposes a microblog sentiment analysis method based on a double attention model. The method first uses existing emotional knowledge bases to construct a microblog emotion symbol library containing emotion words, degree adverbs, negation words, microblog emoticons, and common Internet slang. A bidirectional long short-term memory (BiLSTM) network and a fully connected network then encode the microblog text and the emotion symbols it contains, respectively. Next, attention models construct separate semantic representations of the microblog text and of the emotion symbols, and the two are fused into the final semantic representation of the microblog text. Finally, the sentiment classification model is trained on this representation. By combining attention models with emotion symbols, the method strengthens the capture of emotional semantics in microblog text and improves microblog sentiment classification performance. The model was evaluated on the public microblog sentiment evaluation datasets of the Natural Language Processing and Chinese Computing (NLPCC) conference. It achieved the best results on multiple sentiment classification tasks: relative to the best previously known model, the macro-averaged and micro-averaged F1 scores improved by 1.39% and 1.26% on the 2013 dataset and by 2.02% and 2.21% on the 2014 dataset.
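The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes the microblog text and the emotion symbols have already been encoded (by the BiLSTM and the fully connected network, which are not reproduced here) and shows only attention pooling over each set of vectors, fusion, and classification. The abstract does not specify the attention scoring function, the fusion operator, or any dimensions, so the dot-product scoring, the concatenation, the zero-vector fallback for texts without emotion symbols, the random parameters, and all names below are assumptions made for illustration only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weight each vector by softmax(vector . query) and return the weighted sum.

    Dot-product scoring against a query vector is an assumption; the abstract
    does not state which attention form the paper uses.
    """
    dim = len(query)
    if not vectors:
        # Assumed fallback: a text with no emotion symbols gets a zero vector.
        return [0.0] * dim, []
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def classify(features, weight_rows, biases):
    """Linear layer followed by softmax over the sentiment classes."""
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weight_rows, biases)
    ]
    return softmax(logits)


def double_attention_forward(text_states, symbol_states, params):
    """Attend over text and symbol encodings separately, fuse, then classify."""
    text_repr, text_weights = attention_pool(text_states, params["text_query"])
    symbol_repr, symbol_weights = attention_pool(symbol_states, params["symbol_query"])
    fused = text_repr + symbol_repr  # list concatenation (assumed fusion operator)
    probs = classify(fused, params["out_weights"], params["out_biases"])
    return probs, text_weights, symbol_weights


# Toy stand-ins for encoder outputs: 5 text positions, 2 emotion symbols.
HIDDEN, NUM_CLASSES = 4, 3
rng = random.Random(0)


def rand_vec(n):
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


text_states = [rand_vec(HIDDEN) for _ in range(5)]
symbol_states = [rand_vec(HIDDEN) for _ in range(2)]
params = {
    "text_query": rand_vec(HIDDEN),
    "symbol_query": rand_vec(HIDDEN),
    "out_weights": [rand_vec(2 * HIDDEN) for _ in range(NUM_CLASSES)],
    "out_biases": [0.0] * NUM_CLASSES,
}

probs, text_weights, symbol_weights = double_attention_forward(
    text_states, symbol_states, params
)
print("class probabilities:", probs)
print("text attention weights:", text_weights)
print("symbol attention weights:", symbol_weights)
```

In a trained model the two query vectors and the output layer would be learned jointly with the encoders; here they are random, so the printed probabilities carry no meaning beyond showing the data flow.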