An LLM Information Extraction Algorithm Based on Prompt Enhancement
Abstract:
With the development of natural language processing, information extraction has made considerable progress. In practical applications, however, deployment in private domains still faces high barriers: algorithms demand large amounts of annotated data, training is costly, and contextual understanding remains difficult. This paper proposes a prompt-enhanced LLM information extraction algorithm (LLM-IE Based on Prompt Enhance), which recasts text information extraction as text generation and then structurally parses the generated text to produce the extraction results. The method was tested and validated on three self-built datasets covering entities, relations, and events. Under few-shot conditions, prompt enhancement elicits the model's latent extraction capability and approximates the effect of model fine-tuning, while improving both precision and recall over other mainstream information extraction models.
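To make the pipeline described above concrete, the following Python sketch mocks up one plausible reading of it: the prompt carries the task instruction, the target schema, and a few-shot demonstration; the model's generation is then parsed back into structured records. `SCHEMA`, `DEMONSTRATION`, `call_llm`, and the JSON output convention are all illustrative assumptions, not the paper's actual implementation.

```python
import json

# Hypothetical entity schema and few-shot demonstration -- illustrative
# placeholders, not the paper's actual prompt templates or label set.
SCHEMA = ["PERSON", "ORGANIZATION", "LOCATION"]

DEMONSTRATION = (
    "Text: Alice joined Acme Corp in Berlin.\n"
    'Output: [{"type": "PERSON", "span": "Alice"}, '
    '{"type": "ORGANIZATION", "span": "Acme Corp"}, '
    '{"type": "LOCATION", "span": "Berlin"}]'
)

def build_prompt(text: str) -> str:
    """Recast entity extraction as text generation: task instruction,
    target schema, one few-shot demonstration, then the input text."""
    return (
        f"Extract all entities of types {SCHEMA} from the text below. "
        "Answer with a JSON list of objects with keys 'type' and 'span'.\n\n"
        f"{DEMONSTRATION}\n\n"
        f"Text: {text}\nOutput:"
    )

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned generation here."""
    return '[{"type": "PERSON", "span": "Bob"}]'

def extract_entities(text: str) -> list:
    """Generate, then parse the generation back into structured records."""
    raw = call_llm(build_prompt(text))
    try:
        return json.loads(raw)  # structured-parsing step
    except json.JSONDecodeError:
        return []  # the generation was not valid JSON; report nothing

if __name__ == "__main__":
    print(extract_entities("Bob works remotely."))
    # -> [{'type': 'PERSON', 'span': 'Bob'}]
```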