%0 Journal Article
%T Research on 3D-TSP Intelligent Path Planning Algorithm Based on Deep Reinforcement Learning
%A 周鑫
%A 周林华
%J Advances in Applied Mathematics
%P 40-52
%@ 2324-8009
%D 2025
%I Hans Publishing
%R 10.12677/aam.2025.145231
%X With the rapid development of artificial intelligence, deep reinforcement learning has shown great potential for solving complex optimization problems. The three-dimensional traveling salesman problem (3D-TSP), an extension of the classic path planning problem, is more complex and has greater practical value, but solving it in real time in large-scale dynamic environments remains challenging. Traditional methods suffer from high computational complexity, while heuristic algorithms are limited in solution accuracy and generalization ability. This paper proposes an intelligent path planning algorithm based on a graph Transformer and deep reinforcement learning. An encoder captures the positional relationships among nodes in the 3D graph structure and generates embeddings; a decoder combines an attention mechanism with sequential decision-making to construct the path planning policy; and the policy is optimized with the REINFORCE algorithm. Experimental results show that the proposed algorithm significantly outperforms traditional methods and heuristic algorithms in solving time, with the advantage most pronounced on large-scale problems. In addition, the model balances solving efficiency and accuracy by trading off greedy decoding against sampling decoding. The method provides an efficient solution for large-scale dynamic path optimization problems in fields such as drone path planning and intelligent manufacturing, and has significant theoretical and practical value.
%K Three-Dimensional Traveling Salesman Problem
%K Deep Reinforcement Learning
%K Graph Transformer
%K Path Planning
%K Attention Mechanism
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=113849