Collision-Free Path Planning for a Robotic Manipulator Based on Safe Reinforcement Learning
Abstract:
The reliability of a planned path can be guaranteed by imposing appropriate constraints on the robot's joints. This paper studies reinforcement learning (RL) with kinematic constraints to ensure safety during planning. By combining this with the idea of alternative actions in the design of the RL action space, the feasibility of each action is further guaranteed. To evaluate the performance of the algorithm, path planning was carried out for an industrial manipulator in a ship-welding scenario, in which the end effector of the manipulator successfully moved to a welding start point located in a narrow space. Experimental results show that the proposed method not only guarantees the convergence of training but also ensures the safety and reliability of the task.
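
The abstract does not detail how the constrained action space is realized, but the underlying idea of checking each RL action against kinematic limits and a collision query, and substituting a feasible alternative action when the check fails, can be sketched as follows. Everything in this sketch is a hypothetical illustration under stated assumptions (the 6-DOF joint limits, the in_collision callback, and the shrink-the-step fallback rule), not the paper's actual implementation.

```python
import numpy as np

# Hypothetical, symmetric joint limits for a 6-DOF industrial arm (radians).
JOINT_LOWER = np.deg2rad([-170.0, -120.0, -170.0, -120.0, -170.0, -360.0])
JOINT_UPPER = -JOINT_LOWER

def safe_step(q, action, in_collision, scale=0.5, max_tries=3):
    """Apply an RL action in joint space; if the resulting configuration
    violates the kinematic limits or collides, try progressively smaller
    alternative actions before giving up and holding position."""
    candidate = np.asarray(action, dtype=float)
    for _ in range(max_tries):
        # Clip to the joint limits so the kinematic constraint always holds.
        q_next = np.clip(q + candidate, JOINT_LOWER, JOINT_UPPER)
        if not in_collision(q_next):
            return q_next              # feasible (possibly alternative) action
        candidate = candidate * scale  # shrink the step as the alternative action
    return q                           # no feasible action found: stay in place

# Usage with a trivial collision checker (always free), just to show the call.
q0 = np.zeros(6)
a0 = np.deg2rad([5.0, -3.0, 2.0, 0.0, 1.0, 0.0])
q1 = safe_step(q0, a0, in_collision=lambda q: False)
```

In practice the in_collision callback would query the ship-welding scene geometry; the point of the filter is that the agent only ever executes actions that satisfy the constraints, so safety is enforced during exploration as well as at deployment.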