Journal of Software (软件学报), 2008

Reinforcement Learning Model Based on Regret for Multi-Agent Conflict Games
基于后悔值的多Agent冲突博弈强化学习模型

XIAO Zheng, ZHANG Shi-Yong
肖正,张世永

Keywords: Markov game, reinforcement learning, conflict game, conflict resolution
Markov对策,强化学习,冲突博弈,冲突消解


Abstract:

For conflict games, a rational but conservative action selection method is investigated, namely minimizing the regret function in the worst case. Under the resulting policy, the loss that may be incurred in the future is the lowest, and a Nash equilibrium mixed policy is obtained without information about other agents. Based on regret, a reinforcement learning model and its algorithm for conflict games in complex multi-agent environments are put forward. The model also builds the agents' belief updating process on the concept of cross-entropy distance, which further optimizes the action selection policy for conflict games. Based on the Markov repeated game model, this paper demonstrates the convergence of the algorithm and analyzes the relationship between belief and optimal policy. Additionally, compared with the extended Q-learning algorithm under MMDP (multi-agent Markov decision process), the proposed algorithm dramatically decreases the number of conflicts, enhances coordination among agents, improves system performance, and helps to maintain system stability.
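
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the textbook minimax regret criterion over pure actions, which is the idea behind "minimizing the regret function in the worst case". It is not the paper's learning algorithm and does not compute a mixed policy. The function names, the payoff matrix, and the action labels are all hypothetical, chosen for illustration.

```python
def regret_matrix(payoff):
    """Regret of each own action against each opponent action.

    payoff[a][o] is the agent's payoff when it plays action a and the
    other agents jointly play o (hypothetical layout, not from the paper).
    Regret is the gap to the best payoff achievable against the same o.
    """
    n_opp = len(payoff[0])
    # Best achievable payoff in each column (for each opponent action).
    best = [max(row[o] for row in payoff) for o in range(n_opp)]
    return [[best[o] - row[o] for o in range(n_opp)] for row in payoff]


def minimax_regret_action(payoff):
    """Return (action index, worst-case regret of every action)."""
    regret = regret_matrix(payoff)
    worst = [max(row) for row in regret]  # worst case over opponent actions
    action = min(range(len(worst)), key=worst.__getitem__)
    return action, worst


# Hypothetical conflict game: rows are the agent's actions
# (0 = yield, 1 = share, 2 = push); columns are the opponent's
# actions (0 = yield, 1 = push).
payoff = [
    [2, 1],
    [3, 0],
    [5, -4],
]

action, worst = minimax_regret_action(payoff)
print(action, worst)
```

In this made-up game, "push" has the highest payoff when the opponent yields but also the largest worst-case regret, so the conservative criterion selects the middle action instead. No knowledge of the opponent's actual policy is needed, only the agent's own payoff table.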
