An Open-Domain Question Answering Method for Large Language Models Based on Retrieval-Augmented Generation and Soft Prompt Optimization
Abstract:
To address the limitations of large language models (LLMs) in handling long-tail knowledge in open-domain question answering, this paper proposes SOFTRAG, a novel framework that integrates Retrieval-Augmented Generation (RAG) with soft prompt optimization. The framework aims to improve how efficiently the model exploits low-frequency knowledge and to mitigate the limitations of conventional approaches. The study combines RAG with soft prompt optimization, introducing a Perceiver-based soft prompt adapter to extract key information and employing the LoRAMoE method for parameter-efficient fine-tuning. Evaluated on the PopQA, TriviaQA, PubHealth, and ASQA datasets, SOFTRAG delivers significant gains in accuracy, reasoning precision, and generalization over retrieval-free baselines and conventional RAG methods. Ablation experiments further confirm that the soft prompt adapter, the retrieval module, and the fine-tuning technique each make critical contributions to these gains. The proposed method effectively balances performance against computational cost, substantially improving LLM performance on long-tail knowledge tasks and offering a new optimization perspective for open-domain question answering.
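To make the architecture described in the abstract concrete, the sketch below illustrates in PyTorch the two components it names: a Perceiver-style soft prompt adapter that compresses retrieved-passage embeddings into a short sequence of soft prompt vectors, and a LoRAMoE-style wrapper that adds a routed mixture of low-rank experts to a frozen linear layer for parameter-efficient fine-tuning. This is a minimal illustrative sketch, not the authors' implementation; all class names, dimensions, and hyperparameters are assumptions.

```python
# Minimal illustrative sketch (PyTorch) of the two components named in the
# abstract. NOT the authors' implementation: every name, dimension, and
# hyperparameter below is an assumption made for demonstration only.
import torch
import torch.nn as nn


class PerceiverSoftPromptAdapter(nn.Module):
    """Compresses retrieved-passage token embeddings into a fixed number of
    soft prompt vectors via cross-attention from learned latent queries."""

    def __init__(self, hidden_dim: int = 1024, num_latents: int = 32,
                 num_heads: int = 8, num_layers: int = 2):
        super().__init__()
        # Learned latent queries; their count fixes the soft prompt length.
        self.latents = nn.Parameter(torch.randn(num_latents, hidden_dim) * 0.02)
        self.attn_layers = nn.ModuleList(
            nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
            for _ in range(num_layers))
        self.norms = nn.ModuleList(nn.LayerNorm(hidden_dim) for _ in range(num_layers))

    def forward(self, passage_embeds: torch.Tensor) -> torch.Tensor:
        # passage_embeds: (batch, num_passage_tokens, hidden_dim), e.g. the
        # encoded tokens of the top-k retrieved passages concatenated together.
        batch = passage_embeds.size(0)
        latents = self.latents.unsqueeze(0).expand(batch, -1, -1)
        for attn, norm in zip(self.attn_layers, self.norms):
            # Latent queries attend to the (much longer) passage sequence.
            out, _ = attn(latents, passage_embeds, passage_embeds)
            latents = norm(latents + out)
        # (batch, num_latents, hidden_dim): soft prompts to prepend to the
        # LLM's input embeddings before generation.
        return latents


class LoRAMoELinear(nn.Module):
    """Wraps a frozen linear layer with a routed mixture of low-rank (LoRA)
    experts, in the spirit of LoRAMoE-style parameter-efficient fine-tuning."""

    def __init__(self, base: nn.Linear, num_experts: int = 4,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # keep pretrained weights frozen
            p.requires_grad = False
        self.router = nn.Linear(base.in_features, num_experts)
        self.A = nn.Parameter(torch.randn(num_experts, base.in_features, rank) * 0.02)
        self.B = nn.Parameter(torch.zeros(num_experts, rank, base.out_features))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate = torch.softmax(self.router(x), dim=-1)                    # (..., E)
        # Per-expert low-rank update, then mix according to the router gates.
        delta = torch.einsum('...i,eir,ero->...eo', x, self.A, self.B)  # (..., E, out)
        delta = (gate.unsqueeze(-1) * delta).sum(dim=-2)                # (..., out)
        return self.base(x) + self.scale * delta
```

In such a pipeline, the adapter's output would be prepended to the question's token embeddings before the LoRAMoE-augmented LLM generates an answer; retrieval itself can be any off-the-shelf dense retriever.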