Monocular visual odometry (VO) is the process of estimating a camera's trajectory from a sequence of consecutive images captured by a single camera. A major factor limiting the accuracy of monocular VO, however, is scale ambiguity. This research proposes an augmentation technique that resolves the scale ambiguity problem of monocular VO by combining the camera images with range measurements taken by an ultra-low-cost laser rangefinder known as the Spike, which is small enough to be mounted on a smartphone. To assess the effectiveness of the proposed technique, two datasets were collected along precisely surveyed tracks, one outdoor and one indoor; the coordinates of both tracks were determined with a total station to serve as ground truth. To calibrate the smartphone's camera, seven images of a checkerboard were taken from different positions and angles and processed with a MATLAB-based camera calibration toolbox. The speeded-up robust features (SURF) method was then used for image feature detection and matching, and the random sample consensus (RANSAC) algorithm was applied to remove outliers among the matched points between sequential images. The relative rotation and translation between frames were computed and then scaled using the Spike range measurements to obtain the metrically scaled trajectory. The scaled trajectory was subsequently used to reconstruct the surrounding scene with the structure-from-motion (SfM) technique. Finally, both the computed camera trajectory and the reconstructed scene were compared with the ground truth. The results show that the proposed technique achieves centimeter-level accuracy in monocular VO scale recovery, which in turn leads to improved mapping accuracy.
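The scale-recovery step described above can be illustrated with a minimal sketch. It assumes that each laser range measurement can be associated with the up-to-scale depth of a triangulated feature point, so the metric scale factor reduces to a one-parameter least-squares fit; the function names and the least-squares formulation are illustrative, not the paper's exact implementation.

```python
import numpy as np

def recover_scale(depths_unscaled, ranges_metric):
    """Estimate the global scale factor lam mapping up-to-scale VO depths
    to metric depths, assuming range_i ~ lam * depth_i for matched points.
    Closed-form least-squares solution: lam = (d . r) / (d . d)."""
    d = np.asarray(depths_unscaled, dtype=float)
    r = np.asarray(ranges_metric, dtype=float)
    return float(d @ r / (d @ d))

def scale_trajectory(positions_unscaled, lam):
    """Apply the recovered scale factor to an up-to-scale camera trajectory."""
    return lam * np.asarray(positions_unscaled, dtype=float)

# Example: three features with up-to-scale depths 1, 2, 4 observed at
# metric ranges 2 m, 4 m, 8 m imply a scale factor of 2.
lam = recover_scale([1.0, 2.0, 4.0], [2.0, 4.0, 8.0])
trajectory = scale_trajectory([[0.0, 0.0], [1.0, 0.0]], lam)
```

With several range–depth pairs, the least-squares form averages out laser and triangulation noise rather than trusting a single measurement.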