%0 Journal Article
%T 基于奇异摄动强化学习的时变系统线性二次零和博弈研究
Singular Perturbation-Based Reinforcement Learning for Time-Varying Linear Quadratic Zero-Sum Games
%A 刘明相
%J Artificial Intelligence and Robotics Research
%P 373-382
%@ 2326-3423
%D 2023
%I Hans Publishing
%R 10.12677/AIRR.2023.124040
%X 本研究探讨了时变系统中的线性二次零和博弈问题,与以往依赖系统模型的方法有所不同。本文提出了一种无模型的强化学习算法,用于寻找纳什均衡解。首先,通过奇异摄动理论,将时变动态博弈问题转化为两个定常系统的博弈问题。接着,利用无模型的强化学习算法,确定这两个定常系统的纳什均衡,进而近似求解了时变系统的纳什均衡解。本文提出的算法框架将为处理基于强化学习的时变系统鲁棒控制问题或信息物理系统的弹性控制问题提供新的研究思路。
This paper addresses the linear quadratic zero-sum game problem for time-varying systems. In contrast to previous methods that rely heavily on system models, it introduces a model-free reinforcement learning algorithm to determine Nash equilibrium solutions. First, singular perturbation theory is used to decompose the time-varying dynamic game problem into two time-invariant dynamic game problems. Then, a model-free reinforcement learning algorithm identifies the Nash equilibria of these two time-invariant systems, thereby approximating the Nash equilibrium solution of the original time-varying system. The proposed algorithmic framework offers a new approach to reinforcement-learning-based robust control of time-varying systems and to resilient control of cyber-physical systems.
%K 强化学习
%K 时变系统
%K 博弈论
%K 线性二次优化
%K Reinforcement Learning
%K Time-Varying Systems
%K Game Theory
%K Linear Quadratic Optimization
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=76298