- 2017
Uyghur Text Sentiment Recognition Based on Different Keyword Extraction Algorithms
Abstract:
This paper studies sentiment recognition for common emotion types such as anger and happiness in Uyghur text, building on a comparison of different keyword extraction methods. Taking into account how emotion is expressed in Uyghur sentences, representative keyword sets are extracted with TextRank, sparse discriminant analysis (SDA), and the sparse support vector machine (Sparse SVM), and these keyword sets are then used for feature extraction and sentiment model construction. An experimental dataset was built by selecting sentences that express anger or happiness from Uyghur-language film and television dialogue, novels, and other texts, and it was used to validate the proposed sentiment analysis method. The experimental results show that the keyword sets extracted by each method classify the sentiment of Uyghur sentences effectively. In particular, the keyword extraction method based on the sparsity analysis of Sparse SVM achieves relatively high classification accuracy with a small keyword set.
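As an illustration of the first of the three extraction methods named in the abstract, the following is a minimal TextRank sketch, not the authors' implementation. The abstract gives no parameter values, so the co-occurrence window, damping factor, iteration count, and the function name `textrank_keywords` are all assumptions, and the English tokens are toy stand-ins for segmented Uyghur words.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by TextRank over an undirected co-occurrence graph.

    All default parameter values are illustrative assumptions.
    """
    # Link every pair of distinct words that occur within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Repeatedly apply the unweighted PageRank update to every word.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Sort by descending score; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


# Toy token sequence (hypothetical data, not from the paper's dataset).
tokens = "happy day happy song angry word happy smile".split()
keywords = textrank_keywords(tokens)
print(keywords)
```

In a pipeline like the one the abstract describes, the returned keyword set would define the vocabulary for feature extraction before a sentiment classifier is trained. TextRank is unsupervised, whereas SDA and Sparse SVM select keywords using the emotion labels.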