%0 Journal Article
%T 短剧用户观感评价情感分析
Sentiment Analysis of the Users of Micro-Dramas
%A 丁文轩
%A 李征宇
%J Hans Journal of Data Mining
%P 82-93
%@ 2163-1468
%D 2025
%I Hans Publishing
%R 10.12677/hjdm.2025.151007
%X 短剧随着时代发展逐渐崛起,成为当今国内外新潮的娱乐载体。本文爬取腾讯短剧品牌十分剧场的短剧用户评价,对该不平衡样本数据进行情感分析,比较多种模型与模型组合的效率与效果。1) 使用Word2vec的连续词袋模型将预处理后的文本转为词向量,构建LSTM/BILSTM模型,两者无效果差别,LSTM所用时间最短;2) 构建TextCNN + LSTM/BILSTM模型,使用TextCNN获取向量特征,通过LSTM/BILSTM学习情感规律,稀少数据的F1-Score提升约10%;3) 构建TextCNN + LSTM + Multi_Head_Attention模型,添加多头注意力机制把握字与字之间的多重联系,耗时增加一倍,稀少数据的F1-Score上限再次提升1%;4) 使用随机删除增强数据会以降低20%的精准率的代价提高10%的召回率;5) 在第3点的基础上在卷积层中添加残差连接,稀少数据的F1-Score上限提高2%;6) 使用BERT/RoBERTa的分词器和模型取代Word2vec与传统RNN,得到的结果对比第5点,提升约为9%/12%,泛化性更强,时间和硬件成本大幅提升,但添加TextCNN、LSTM与多头注意力后,效果反而出现下降。
As micro-dramas rise in popularity at home and abroad, this article crawls user reviews of Tencent's micro-drama brand "Shifen Theater" and performs sentiment analysis on the imbalanced sample data, comparing the efficiency and effectiveness of several models and model combinations. 1) Word2Vec's continuous bag-of-words (CBOW) model converts the preprocessed text into word vectors for LSTM/BiLSTM models; the two perform equivalently, with LSTM taking the least time. 2) A TextCNN + LSTM/BiLSTM model uses TextCNN to extract vector features and LSTM/BiLSTM to learn sentiment patterns, raising the F1-Score on rare classes by about 10%. 3) Adding Multi-Head Attention to TextCNN + LSTM captures the multiple relationships between characters, doubling the runtime and raising the F1-Score ceiling on rare classes by another 1%. 4) Random-deletion data augmentation improves recall by 10% at the cost of a 20% drop in precision. 5) Adding residual connections to the convolutional layers of model 3 raises the F1-Score ceiling on rare classes by 2%. 6) Replacing Word2Vec and the traditional RNNs with BERT/RoBERTa tokenizers and models improves results over model 5 by about 9%/12% and generalizes better, but greatly increases time and hardware costs; moreover, adding TextCNN, LSTM, and Multi-Head Attention on top of BERT/RoBERTa degrades performance instead.
%K 短剧
%K 自然语言处理
%K 情感分析
%K 深度学习
%K Micro-Dramas
%K NLP
%K Sentiment Analysis
%K Deep Learning
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=105824