In this article, we present a real-time 6DoF egomotion estimation system for indoor environments that uses a wide-angle stereo camera as its only sensor. The stereo camera is hand-held by a person walking at normal speeds of 3–5 km/h. We present the basis for a vision-based system that would assist the navigation of the visually impaired, either by providing information about their current position and orientation or by guiding them to their destination through different sensing modalities. Our sensor combines two types of feature parametrization, inverse depth and 3D, in order to provide orientation and depth information at the same time. Natural landmarks are extracted from the image and stored as 3D or inverse depth points, depending on a depth threshold. This threshold, which is used to switch between the two parametrizations, is computed by means of a non-linearity analysis of the stereo sensor. The main steps of our approach are presented, together with an analysis of the optimal way to calculate the depth threshold. When each landmark is initialized, the normal of the patch surface is computed using the information from the stereo pair. To improve long-term tracking, the patch is warped according to this normal vector. Experimental results in indoor environments and conclusions are presented.
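The switch between the two parametrizations can be illustrated with a minimal sketch. It is not the system's implementation: it assumes a rectified pinhole stereo pair (a wide-angle camera would need distortion correction first), the names `Point3D`, `InverseDepthPoint`, `stereo_depth` and `init_landmark` are hypothetical, and the depth threshold is passed in as a plain parameter rather than derived from the non-linearity analysis. The inverse depth form here is also simplified to a normalized ray plus inverse depth in the camera frame.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Point3D:
    """Landmark stored as Euclidean coordinates in the camera frame (metres)."""
    x: float
    y: float
    z: float


@dataclass
class InverseDepthPoint:
    """Landmark stored as a normalized viewing ray plus inverse depth (1/metres)."""
    ray_x: float
    ray_y: float
    rho: float


def stereo_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth of a point from its disparity in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def init_landmark(
    u: float,
    v: float,
    disparity_px: float,
    focal_px: float,
    cx: float,
    cy: float,
    baseline_m: float,
    depth_threshold_m: float,  # assumed given; the paper derives it from the sensor
) -> Union[Point3D, InverseDepthPoint]:
    """Choose the parametrization of a new landmark from its stereo depth."""
    z = stereo_depth(disparity_px, focal_px, baseline_m)
    # Normalized image coordinates of the pixel (pinhole model, square pixels).
    xn = (u - cx) / focal_px
    yn = (v - cy) / focal_px
    if z <= depth_threshold_m:
        # Near landmark: depth is reliable, so store full 3D coordinates.
        return Point3D(xn * z, yn * z, z)
    # Far landmark: depth is uncertain, so keep the ray and the inverse depth.
    return InverseDepthPoint(xn, yn, 1.0 / z)
```

Calling `init_landmark` with a large disparity yields a `Point3D`, while a small disparity (a distant point) yields an `InverseDepthPoint`; all numeric values a caller supplies, including the threshold, are illustrative choices rather than values from the article.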