%0 Journal Article
%T 短剧用户观感评价情感分析
Sentiment Analysis of the Users of Micro-Dramas
%A 丁文轩
%A 李征宇
%J Hans Journal of Data Mining
%P 82-93
%@ 2163-1468
%D 2025
%I Hans Publishing
%R 10.12677/hjdm.2025.151007
%X 短剧随着时代发展逐渐崛起,成为当今国内外新潮的娱乐载体。本文爬取腾讯短剧品牌十分剧场的短剧用户评价,对该不平衡样本数据进行情感分析,比较多种模型与模型组合的效率与效果。1) 使用Word2vec的连续词袋模型将预处理后的文本转为词向量,构建LSTM/BILSTM模型,两者无效果差别,LSTM所用时间最短;2) 构建TextCNN + LSTM/BILSTM模型,使用TextCNN获取向量特征,通过LSTM/BILSTM学习情感规律,稀少数据的F1-Score提升约10%;3) 构建TextCNN + LSTM + Multi_Head_Attention模型,添加多头注意力机制把握字与字之间的多重联系,耗时增加一倍,稀少数据的F1-Score上限再次提升1%;4) 使用随机删除增强数据会以降低20%的精准率的代价提高10%的召回率;5) 在第3点的基础上在卷积层中添加残差连接,稀少数据的F1-Score上限提高2%;6) 使用BERT/RoBERTa的分词器和模型取代Word2vec与传统RNN,得到的结果对比第5点,提升约为9%/12%,泛化性更强,时间和硬件成本大幅提升,但添加TextCNN、LSTM与多头注意力后,效果反而出现下降。
As micro-dramas rise in popularity at home and abroad, this article crawls user reviews of Tencent's micro-drama brand "Shifen Theater" and performs sentiment analysis on the imbalanced sample data, comparing the efficiency and effectiveness of several models and model combinations. 1) Word2Vec's continuous bag-of-words (CBOW) model converts the preprocessed text into word vectors for LSTM/BiLSTM models; the two perform equivalently, with LSTM taking the least time. 2) A TextCNN + LSTM/BiLSTM model uses TextCNN to extract vector features and LSTM/BiLSTM to learn sentiment patterns, raising the F1-Score on rare classes by about 10%. 3) Adding Multi-Head Attention to TextCNN + LSTM captures the multiple relationships between characters, doubling the runtime and raising the F1-Score ceiling on rare classes by another 1%. 4) Random-deletion data augmentation improves recall by 10% at the cost of a 20% drop in precision. 5) Adding residual connections to the convolutional layers of model 3 raises the F1-Score ceiling on rare classes by 2%. 6) Replacing Word2Vec and the traditional RNNs with BERT/RoBERTa tokenizers and models improves results over model 5 by about 9%/12% and generalizes better, but greatly increases time and hardware costs; moreover, adding TextCNN, LSTM, and Multi-Head Attention on top of BERT/RoBERTa degrades performance instead.
%K 短剧
%K 自然语言处理
%K 情感分析
%K 深度学习
%K Micro-Dramas
%K NLP
%K Sentiment Analysis
%K Deep Learning
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=105824