%0 Journal Article
%T 基于关键状态的扩散模型轨迹规划方法
Key-State-Conditioned Diffusion Models for Trajectory Planning
%A 杜潇含
%A 李烨
%J Software Engineering and Applications
%P 535-549
%@ 2325-2278
%D 2025
%I Hans Publishing
%R 10.12677/sea.2025.143047
%X 在离线强化学习的轨迹规划任务中,传统基于自回归的规划方法因误差逐级累积效应而限制了模型性能。近年来,扩散模型凭借其出色的分布建模能力被引入该领域,以缓解误差累积问题。然而,现有方法在高维动作空间生成长时序轨迹时仍面临性能不足的挑战。为此,本文提出了一种基于关键状态的扩散模型轨迹规划方法,通过提取原始轨迹中的关键状态特征,并结合条件扩散生成模型进行轨迹规划,将传统的自回归式轨迹规划范式转化为基于关键状态的条件生成问题。在确保生成轨迹时序连续性的同时,提升了模型轨迹规划的性能。在D4RL基准测试的Gym-Mujoco、Maze2d、AntMaze和Adroit等多个环境中进行的实验表明,本文方法在轨迹规划性能和算法鲁棒性方面均优于现有方法。
In trajectory planning for offline reinforcement learning, conventional autoregressive planning methods are limited by step-by-step error accumulation. Diffusion models, with their strong distribution modeling capabilities, have recently been introduced to this domain to mitigate error accumulation. However, existing approaches still perform poorly when generating long-horizon trajectories in high-dimensional action spaces. To address this, we propose a key-state-conditioned diffusion model method for trajectory planning. The approach extracts key-state features from the original trajectories and combines them with a conditional diffusion generative model, transforming the traditional autoregressive planning paradigm into a key-state-conditioned generation problem. This improves planning performance while preserving the temporal continuity of the generated trajectories. Experiments on multiple D4RL benchmark environments, including Gym-Mujoco, Maze2d, AntMaze, and Adroit, show that the method outperforms existing approaches in both trajectory planning performance and algorithmic robustness.
%K 离线强化学习
%K 扩散模型
%K 轨迹规划
%K Transformer
%K 变分自编码器
%K Offline Reinforcement Learning
%K Diffusion Model
%K Trajectory Planning
%K Transformer
%K Variational Autoencoder
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=116876