%0 Journal Article
%T Air-ground heterogeneous multi-agent cooperative coverage based on reinforcement learning
%A 张文旭
%A 马磊
%A 贺荟霖
%A 王晓东
%J 智能系统学报
%D 2018
%R 10.11992/tis.201609017
%X Against the background of heterogeneous cooperative tasks between unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs), this paper proposes an air-ground heterogeneous multi-agent cooperative coverage model that exploits the complementary characteristics of UAVs and UGVs to extend and improve dynamic coverage by heterogeneous multi-agent systems. During coverage, the UAV uses its advantages in speed and observation range to guide the actions of the UGV. To account for the agents' partial observability and uncertainty, the coverage scenario is modeled as a decentralized partially observable Markov decision process (DEC-POMDP), and a multi-agent reinforcement learning algorithm is used to complete coverage of the environment. Simulation results show that cooperation between the UAV and the UGV speeds up the team's coverage of the environment, and that the reinforcement learning algorithm also improves the effectiveness of the coverage model.
%K heterogeneous multi-agent system
%K coverage
%K air-ground
%K UAV/UGV
%K DEC-POMDPs
%K reinforcement learning
%U http://tis.hrbeu.edu.cn/oa/darticle.aspx?type=view&id=201609017