%0 Journal Article
%T 面向短文本情感分析的词扩充LDA模型 (A word extend LDA model for short text sentiment)
%A SHEN Ji (沈冀)
%A MA Zhiqiang (马志强)
%A LI Tuya (李图雅)
%A ZHANG Li (张力)
%J 山东大学学报(工学版)
%D 2018
%R 10.6040/j.issn.1672-3961.0.2017.407
%X To address the low accuracy of sentiment polarity classification on short texts, this paper proposes a sentiment analysis model for short texts based on latent Dirichlet allocation (LDA). The model identifies sentiment words in a short text by part of speech and applies constrained word expansion to them, forming an expansion set that raises the co-occurrence frequency among sentiment words. The expansion set is added to the sentiment words already found in the text, which lengthens the short text and lets the model extract sentiment information, turning topic clustering into sentiment-topic clustering. The model was validated on 4,000 short texts labeled with positive or negative sentiment polarity. Its accuracy improved by about 11% (11.8% in the source's English abstract) over the joint sentiment topic model and by about 9.5% over the latent sentiment model, and it also discovered more sentiment words. These results indicate that the model extracts richer sentiment features from short texts and achieves higher accuracy in sentiment polarity classification.
%K short text
%K word extend function
%K latent Dirichlet allocation
%K unsupervised learning
%K sentiment analysis
%K document-topic generative model
%U http://gxbwk.njournal.sdu.edu.cn/CN/10.6040/j.issn.1672-3961.0.2017.407