Research on Generative Dialogue System Based on Reinforcement Learning

DOI: 10.12677/HJDM.2023.132018, PP. 185-193

Keywords: Dialogue System, Reinforcement Learning, Diversity Exploration, Response Diversity



An open-domain dialogue model with diverse responses is constructed to address the monotonous replies that dialogue systems tend to produce. This paper proposes a generative dialogue method that combines a bidirectional long short-term memory (Bi-LSTM) network with a reinforcement learning model. First, the corpus is preprocessed with several types of filters so that the dialogue corpus can be explored in diverse ways; second, to increase the diversity of the system's replies, diverse beam search is used as the decoder; finally, in the fine-tuning stage, self-critical sequence training is used to reduce the high variance of the REINFORCE policy gradient. Compared with Srinivasan's method, the proposed method improves BLEU, ROUGE-L, and Perplexity by 10.5%, 9%, and 5% respectively, and shortens model training time by 43%. Because some categories of the corpus contain relatively few samples, the range of dialogue topics remains somewhat limited; even so, the traditional network architecture combined with reinforcement learning effectively enables the dialogue system to produce valuable replies.
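The variance-reduction step in the fine-tuning stage can be illustrated with a minimal sketch of the self-critical sequence training (SCST) objective [12]: the reward of the model's own greedy decode serves as the baseline for the REINFORCE gradient. The function name and the scalar reward/log-probability inputs below are illustrative assumptions, not the paper's implementation.

```python
import math

def self_critical_loss(sample_reward, greedy_reward, sample_log_prob):
    """SCST policy-gradient loss for one sampled reply.

    The greedy decode's reward acts as a baseline, so the advantage
    (sample_reward - greedy_reward) centres the REINFORCE update and
    reduces its variance without training a separate value network.
    """
    advantage = sample_reward - greedy_reward
    # REINFORCE maximizes advantage * log_prob, so we minimize its negation.
    return -advantage * sample_log_prob

log_p = math.log(0.2)  # log-probability of the sampled reply

# A sampled reply scoring above the greedy baseline yields a loss whose
# gradient raises its probability ...
loss_good = self_critical_loss(sample_reward=0.8, greedy_reward=0.5,
                               sample_log_prob=log_p)
# ... while one scoring below the baseline is suppressed.
loss_bad = self_critical_loss(sample_reward=0.3, greedy_reward=0.5,
                              sample_log_prob=log_p)
```

In practice the rewards would come from a sentence-level metric or a learned reward model, and the loss would be averaged over a batch of sampled replies.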


[1]  Li, J., Monroe, W., Ritter, A., Jurafsky, D., Galley, M. and Gao, J. (2016) Deep Reinforcement Learning for Dialogue Generation. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, 1-4 November 2016, 1192-1202.
[2]  Li, J., Galley, M., Brockett, C., Gao, J. and Dolan, B. (2016) A Diversity Promoting Objective Function for Neural Conversation Models. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, 12-17 June 2016, 110-119.
[3]  Srinivasan, V., Santhanam, S. and Shaikh, S. (2019) Natural Language Generation Using Reinforcement Learning with External Rewards. ArXiv Preprint ArXiv: 1911.11404.
[4]  Liu, Y., Zhang, L., Han, W., Zhang, Y. and Tu, K. (2021) Constrained Text Generation with Global Guidance—Case Study on CommonGen. ArXiv Preprint ArXiv: 2103.07170.
[5]  Ive, J., Li, A.M., Miao, Y., et al. (2021) Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation. ArXiv Preprint ArXiv: 2102.11387.
[6]  Srinivasan, V., Santhanam, S. and Shaikh, S. (2019) Natural Language Generation Using Reinforcement Learning with External Rewards. ArXiv Preprint ArXiv: 1911.11404.
[7]  Liu, Q., Chen, Y., Chen, B., et al. (2020) You Impress Me: Dialogue Generation via Mutual Persona Perception. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5-10 July 2020, 1417-1427.
[8]  Vijayakumar, A.K., et al. (2017) Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models. ArXiv Preprint ArXiv: 1610.02424.
[9]  Arulkumaran, K., Deisenroth, M.P., Brundage, M. and Bharath, A.A. (2017) A Brief Survey of Deep Reinforcement Learning. ArXiv Preprint ArXiv: 1708.05866.
[10]  Danescu-Niculescu-Mizil, C. and Lee, L. (2011) Chameleons in Imagined Conversations: A New Approach to Understanding Coordination of Linguistic Style in Dialogs. ArXiv Preprint ArXiv: 1106.3077.
[11]  Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D. and Kavukcuoglu, K. (2016) Asynchronous Methods for Deep Reinforcement Learning. Proceedings of the 33rd International Conference on Machine Learning, New York, 19-24 June 2016, 1928-1937.
[12]  Rennie, S.J., Marcheret, E., Mroueh, Y., Ross, J. and Goel, V. (2017) Self-Critical Sequence Training for Image Captioning. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 21-26 July 2017, 1179-1195.
[13]  Xu, C., Li, P., Wang, W., et al. (2022) COSPLAY: Concept Set Guided Personalized Dialogue Generation Across Both Party Personas. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, 11-15 July 2022, 201-211.
[14]  Cao, Y., Bi, W., Fang, M., Shi, S. and Tao, D. (2022) A Model-Agnostic Data Manipulation Method for Persona-Based Dialogue Generation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, 22-27 May 2022, 7984-8002.
[15]  Papineni, K., Roukos, S., Ward, T. and Zhu, W.-J. (2002) Bleu: A Method for Automatic Evaluation of Machine Translation. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Philadelphia, 7-12 July 2002, 311-318.
[16]  Gao, J. (2020) Research on Diverse Response Generation Methods for Open-Domain Dialogue Systems. Master's Thesis, Soochow University, Suzhou. (In Chinese)
[17]  Wang, J. (2020) Research on Emotional Dialogue Response Generation Algorithms Based on Reinforcement Learning. Master's Thesis, Guilin University of Electronic Technology, Guilin. (In Chinese)

