YOLO-Based Scene Reconstruction in Dynamic Scenes
Abstract:
In real-world dynamic environments, moving objects inevitably interfere with the environmental information acquired by depth sensors. Handling dynamic objects effectively, so that robots can accurately understand their surroundings and complete complex tasks, remains an open problem. This paper proposes a semantic segmentation method that combines an improved ORB-SLAM3 with an improved YOLOv5. The method identifies and removes dynamic features while retaining as many valid features of the static environment as possible, and it uses ORB-SLAM3 to achieve high-precision scene reconstruction and generate dense point cloud maps. Experiments on the TUM RGB-D dataset show that, compared with the original ORB-SLAM3, the proposed method reduces RMSE by an average of 92.04% in high-dynamic scenes and by 19.48% in low-dynamic scenes. The system is particularly robust and accurate in scenes with a high proportion of dynamic objects. In addition, the system's real-time performance is optimized through a lightweight object detection network and an efficient feature filtering strategy, which allows it to run in real time on ordinary hardware platforms. These results provide an efficient and reliable solution to visual SLAM in dynamic environments.
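To make the idea of removing dynamic features concrete, the following is a minimal sketch and not the paper's implementation. It assumes that the detector returns labelled axis-aligned bounding boxes and that any keypoint inside a box of a dynamic class is discarded. The names `DYNAMIC_CLASSES`, `in_box`, and `filter_static_keypoints` are hypothetical, and the choice of `"person"` as the only dynamic class is an assumption. The method described above aims to retain more static features than this simple box-level rule would.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[str, Box]              # (class label, bounding box)

# Assumption: class labels treated as dynamic. The source does not list them.
DYNAMIC_CLASSES = {"person"}


def in_box(pt: Point, box: Box) -> bool:
    """Return True if the point lies inside the box (borders included)."""
    x, y = pt
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def filter_static_keypoints(
    keypoints: List[Point], detections: List[Detection]
) -> List[Point]:
    """Keep only keypoints that fall outside every dynamic-class box."""
    dynamic_boxes = [box for label, box in detections if label in DYNAMIC_CLASSES]
    return [
        pt for pt in keypoints
        if not any(in_box(pt, box) for box in dynamic_boxes)
    ]


# Illustrative data only: three keypoints and two detections.
keypoints = [(10.0, 10.0), (50.0, 60.0), (200.0, 120.0)]
detections = [
    ("person", (40.0, 40.0, 100.0, 100.0)),
    ("chair", (180.0, 100.0, 220.0, 140.0)),
]
static_keypoints = filter_static_keypoints(keypoints, detections)
print(static_keypoints)
```

In this example the keypoint inside the `person` box is dropped, while the keypoint inside the `chair` box is kept because that class is not treated as dynamic. The surviving keypoints would then be passed to the SLAM front end for tracking.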