%0 Journal Article
%T Research on Citation Sentiment Classification Based on Large Language Models
%A 孔明辉
%A 赵兰
%J Computer Science and Application
%P 209-219
%@ 2161-881X
%D 2025
%I Hans Publishing
%R 10.12677/csa.2025.151021
%X To address the need to predict citation sentiment polarity in research on scientific literature impact ranking, this paper applies prompt engineering with large language models (zero-shot and few-shot learning) to citation sentiment classification, and evaluates popular LLMs such as Llama and GPT-4o-mini alongside BERT-based deep learning models on this task. Citation sentiment polarity is first predicted via LLM prompt engineering and the results analyzed; these are then compared with the performance of BERT-based deep learning models. Experimental results show that the BERT-based models achieve classification accuracy above 90%, up to 94.31%, with F1 scores above 80%. The zero-shot and few-shot LLM methods lag clearly behind, reaching at most 84.70% accuracy and an F1 score of only 63.65%. Although less accurate than the BERT-based models on this task, the LLM prompt engineering approach shows strong generalization ability and offers a simple, efficient route to rapid task deployment and application.
%K Citation Sentiment Classification
%K LLM
%K Deep Learning
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=106374