%0 Journal Article %T A New Q Learning Algorithm for Multi-agent Systems
一种新的多智能体Q学习算法 %A GUO Rui %A WU Min %A PENG Jun %A PENG Jiao %A CAO Wei-Hua %A
郭锐 %A 吴敏 %A 彭军 %A 彭姣 %A 曹卫华 %J 自动化学报 %D 2007 %I %X Because other agents are present, the environment of a multi-agent system (MAS) cannot be treated simply as a Markov decision process (MDP). Current reinforcement learning algorithms, which are based on MDPs, must be adapted before they can be applied to MAS. Building on an agent's independent learning ability, this paper proposes a novel Q-learning algorithm for MAS in which an agent learns the action policies of other agents by observing the joint action. The policies of the other agents are expressed as action probability distribution matrices, and a concise yet useful method for updating these matrices is proposed. The full joint probability given by the distribution matrices ensures that the learning agent chooses its optimal action. The convergence and performance of the proposed algorithm are analyzed theoretically. Applied to RoboCup, the algorithm showed high learning efficiency and good generalization ability. Finally, we briefly point out some directions for multi-agent reinforcement learning. %K Multi-agent systems %K reinforcement learning %K Q-learning
多智能体 %K 增强学习 %K Q学习 %U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=E76622685B64B2AA896A7F777B64EB3A&aid=3FB68992DF64597D&yid=A732AF04DDA03BB3&vid=27746BCEEE58E9DC&iid=E158A972A605785F&sid=4C2B9916B58305BE&eid=965F4E89CD0AFC30&journal_id=0254-4156&journal_name=自动化学报&referenced_num=0&reference_num=12