OALib Journal期刊
ISSN: 2333-9721
费用：99美元

投递稿件

查看量	下载量

相关文章
更多...

Advances in Applied Mathematics 2025

基于深度强化学习的3D-TSP智能路径规划算法研究
Research on 3D-TSP Intelligent Path Planning Algorithm Based on Deep Reinforcement Learning

DOI: 10.12677/aam.2025.145231, PP. 40-52

周鑫, 周林华

Keywords: 三维旅行商问题，深度强化学习，图Transformer，路径规划，注意力机制
Three-Dimensional Traveling Salesman Problem, Deep Reinforcement Learning, Graph Transformer, Path Planning, Attention Mechanism

Full-Text Cite this paper Add to My Lib

Abstract:

随着人工智能技术的快速发展，深度强化学习在复杂优化问题中展现出巨大的潜力。三维旅行商问题作为经典路径规划问题的扩展，具有更高的复杂性和实际应用价值，但其大规模动态环境下的实时求解仍面临挑战。传统方法存在计算复杂度高的问题，而启发式算法则受限于求解精度和泛化能力。本文提出一种基于图Transformer和深度强化学习的智能路径规划算法，通过编码器捕捉三维图结构的节点位置关系生成嵌入表示，结合解码器的注意力机制和序贯决策特性构建路径规划策略，并采用强化学习Reinforce算法优化策略。实验结果表明，所提算法在求解时间上显著优于传统方法和启发式算法，尤其在大规模问题中的优势更为明显。此外，模型通过贪婪解码与采样解码策略的权衡，兼顾了求解效率与精度。该方法为无人机路径规划、智能制造等领域的大规模动态路径优化问题提供了高效解决方案，具有重要的理论与应用价值。
With the rapid development of artificial intelligence technology, deep reinforcement learning has shown great potential in solving complex optimization problems. The three-dimensional traveling salesman problem, as an extension of the classic path planning problem, has higher complexity and practical application value. However, real-time solutions in large-scale dynamic environments still face challenges. Traditional methods suffer from high computational complexity, while heuristic algorithms are limited by solution accuracy and generalization capabilities. This paper proposes an intelligent path planning algorithm based on graph Transformer and deep reinforcement learning. The algorithm captures the positional relationships of nodes in the 3D graph structure through an encoder to generate embeddings, combines the attention mechanism and sequential decision-making characteristics of the decoder to construct a path planning strategy, and optimizes the strategy using the Reinforce learning algorithm. Experimental results show that the proposed algorithm significantly outperforms traditional methods and heuristic algorithms in terms of solving time, especially in large-scale problems. Additionally, the proposed model balances solving efficiency and accuracy through the trade-off between greedy decoding and sampling decoding strategies. This method provides an efficient solution for large-scale dynamic path optimization problems in fields such as drone path planning and intelligent manufacturing, demonstrating significant theoretical and practical value.

References

[1]	Lawler, E.L. and Wood, D.E. (1966) Branch-and-Bound Methods: A Survey. Operations Research, 14, 699-719. https://doi.org/10.1287/opre.14.4.699
[2]	Papadimitriou, C.H. and Steiglitz, K. (1998) Combinatorial Optimization: Algorithms and Complexity. Courier Corporation.
[3]	Bertsekas, D.P. (1995) Dynamic Programming and Optimal Control. Athena Scientific.
[4]	Sniedovich, M. (2010) Dynamic Programming: Foundations and Principles. CRC Press.
[5]	Rezoug, A., Bader-El-Den, M. and Boughaci, D. (2018) Guided Genetic Algorithm for the Multidimensional Knapsack Problem. Memetic Computing, 10, 29-42. https://doi.org/10.1007/s12293-017-0232-7
[6]	Lin, B., Sun, X. and Salous, S. (2016) Solving Travelling Salesman Problem with an Improved Hybrid Genetic Algorithm. Journal of Computer and Communications, 4, 98-106. https://doi.org/10.4236/jcc.2016.415009
[7]	Teoh, E.J., Tang, H. and Tan, K. (2006) A Columnar Competitive Model with Simulated Annealing for Solving Combinatorial Optimization Problems. The 2006 IEEE International Joint Conference on Neural Network Proceedings, Vancouver, 16-21 July 2006, 3254-3259. https://doi.org/10.1109/ijcnn.2006.247320
[8]	van Laarhoven, P.J.M., Aarts, E.H.L. and Lenstra, J.K. (1992) Job Shop Scheduling by Simulated Annealing. Operations Research, 40, 113-125. https://doi.org/10.1287/opre.40.1.113
[9]	Deng, W., Xu, J. and Zhao, H. (2019) An Improved Ant Colony Optimization Algorithm Based on Hybrid Strategies for Scheduling Problem. IEEE Access, 7, 20281-20292. https://doi.org/10.1109/access.2019.2897580
[10]	Ramadhani, T., Hertono, G.F. and Handari, B.D. (2017) An Ant Colony Optimization Algorithm for Solving the Fixed Destination Multi-Depot Multiple Traveling Salesman Problem with Non-Random Parameters. AIP Conference Proceedings, 1862, Article ID: 030123. https://doi.org/10.1063/1.4991227
[11]	Vinyals, O., Fortunato, M. and Jaitly, N. (2015) Pointer Networks. Advances in Neural Information Processing Systems, 28, 1-9.
[12]	Nazari, M., Oroojlooy, A., Snyder, L., et al. (2018) Reinforcement Learning for Solving the Vehicle Routing Problem. Advances in Neural Information Processing Systems, 31, 1-11.
[13]	Kool, W., Van Hoof, H. and Welling, M. (2018) Attention, Learn to Solve Routing Problems! arXiv: 1803.08475.
[14]	Vaswani, A., Shazeer, N., Parmar, N., et al. (2017) Attention Is All You Need. Advances in Neural Information Processing Systems, 30, 1-11.

Full-Text

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133

基于深度强化学习的3D-TSP智能路径规划算法研究Research on 3D-TSP Intelligent Path Planning Algorithm Based on Deep Reinforcement Learning

基于深度强化学习的3D-TSP智能路径规划算法研究
Research on 3D-TSP Intelligent Path Planning Algorithm Based on Deep Reinforcement Learning