Detecting people is a key capability for robots that operate in populated environments. In this paper, we have adopted a hierarchical approach that combines classifiers created using supervised learning in order to identify whether a person is in the view-scope of the robot or not. Our approach makes use of vision, depth and thermal sensors mounted on top of a mobile platform. The set of sensors is set up combining the rich data source offered by a Kinect sensor, which provides vision and depth at low cost, and a thermopile array sensor. Experimental results carried out with a mobile platform in a manufacturing shop floor and in a science museum have shown that the false positive rate achieved using any single cue is drastically reduced. The performance of our algorithm improves other well-known approaches, such as C4 and histogram of oriented gradients (HOG).
References
[1]
Shotton, J.; Fitzgibbon, A.; Cook, M.; Sharp, T.; Finocchio, M.; Moore, R.; Kipma, A.; Blake, A. Real-Time Human Pose Recognition in Parts from a Single Depth Image. Proceedings of the Computer Vision and Pattern Recognition, Colorado Springs, CO, USA, 20–25 June 2011.
[2]
Kinect Sensor. Available online: http://en.wikipedia.org/wiki/Kinect (accessed on 10 June 2013).
[3]
Khoshelham, K.; Elberink, S.O. Accuracy and resolution of kinect depth data for indoor mapping applications. Sensors 2012, 12, 1437–1454.
[4]
Heimann Sensor. Available online: http://www.heimannsensor.com/index.php (accessed on 20 March 2013).
[5]
Bellotto, N.; Hu, H. A bank of unscented Kalman filters for multimodal human perception with mobile service robots. Int. J. Soc. Robot. 2010, 2, 121–136.
[6]
St-Laurent, L.; Prévost, D.; Maldague, X. Thermal Imaging for Enhanced Foreground-background Segmentation. Proceedings of the The 8th Quantitative Infrared Thermography (QIRT) Conference, Padova, Italy, 27–30 June 2006.
[7]
Hofmann, M.; Kaiser, M.; Aliakbarpour, H.; Rigoll, G. Fusion of Multi-Modal Sensors in a Voxel Occupancy Grid for Tracking and Behaviour Analysis. Proceedings of the 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS), Delft, Netherlands, 13–15 April 2011.
[8]
Johnson, M.J.; Bajcsy, P. Integration of Thermal and Visible Imagery for Robust Foreground Detection in Tele-immersive Spaces. Proceedings of the 11th International Conference on Information Fusion, Cologne, Germany, 30 June–3 July 2008.
[9]
Zin, T.T.; Takahashi, H.; Toriu, T.; Hama, H. Fusion of Infrared and Visible Images for Robust Person Detection. Image Fusion, 2011. InTech, Available online: http://www.intechopen.com/articles/show/title/fusion-of-infrared-and-visible-images-for-robust-person-detection.
[10]
Schiele, B. Visual People Detection—Different Models, Comparison and Discussion. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Kobe, Japan, 12–17 May 2009.
[11]
Cielniak, G. People Tracking by Mobile Robots using Thermal and Colour Vision. Ph.D. Thesis, Department of Technology Orebro University, Orebro, Sweden, 2007.
[12]
Wilhelm, T.; B?hme, H.J.; Gross, H.M. Sensor Fusion for Vision and Sonar Based People Tracking on a Mobile Service Robot. Proceedings of the International Workshop on Dynamic Perception, Bochum, Germany, 14–15 November 2002.
[13]
Scheutz, M.; McRaven, J.; Cserey, G. Fast, Reliable, Adaptive, Bimodal People Tracking for Indoor Environments. Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, 28 September–2 October 2004; pp. 1347–1352.
[14]
Martin, C.; Schaffernicht, E.; Scheidig, A.; Gross, H. Multi-modal sensor fusion using a probabilistic aggregation scheme for people detection and tracking. Robot. Auton. Syst. 2006, 54, 721–728.
[15]
Hjelmas, E.; Low, B.K. Face detection: A survey. Comput. Vision Image Underst. 2001, 83, 236–274.
[16]
Yang, M.H.; Kriegman, D.; Ahuja, N. Detecting faces in images: A survey. Trans. Patt. Anal. Mach. Intell. 2002, 24, 34–58.
[17]
Murphy-Chutorian, E.; Trivedi, M.M. Head pose estimation in computer vision: A survey. IEEE Trans. Patt. Anal. Mach. Intell. 2009, 31, 607–626.
[18]
Kruppa, H.; Castrillón-Santana, M.; Schiele, B. Fast and Robust Face Finding via Local Context. Proceedings of the Joint EEE Internacional Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, Nice, France, 11–12 October 2003; pp. 157–164.
[19]
Xia, L.; Chen, C.C.; Aggarwal, J.K. Human Detection Using Depth Information by Kinect. Proceedings of the International Workshop on Human Activity Understanding from 3D Data in Conjunction with CVPR, Colorado Springs, CO, USA, 20–25 June 2011.
[20]
Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. Proceedings of the International Conference on Computer Vision & Pattern Recognition, San Diego, CA, USA, 25–25 June 2005.
[21]
Viola, P.; Jones, M.J.; Snow, D. Detecting Pedestrians Using Patterns of Motion and Appearance. Proceedings of the International Conference on Computer Vision, Nice, France, 14–17 October 2003; pp. 734–741.
[22]
Wu, J.; Geyer, C.; Rehg, J.M. Real-time Human Detection Using Contour Cues. Proceedings of the ICRA'11, Shangai, China, 9–11 May 2011; pp. 860–867.
[23]
Papageorgiou, C.; Poggio, T. A trainable system for object detection. Int. J. Comput. Vision 2000, 38, 15–33.
[24]
Martínez-Otzeta, J.M.; Ibarguren, A.; Ansuategui, A.; Susperregi, L. Laser based people following behaviour in an emergency environment. Lect. Note. Comput. Sci. 2009, 5928, 33–42.
[25]
Susperregi, L.; Martinez-Otzeta, J.M.; Ansuategui, A.; Ibarguren, A.; Sierra, B. RGB-D, laser and thermal sensor fusion for people following in a mobile robot. Int. J. Adv. Robot. Syst. 2013, doi:10.5772/56123.
[26]
Martinez-Mozos, O.; Kurazume, R.; Hasegawa, T. Multi-part people detection using 2D range data. Int. J. Soc. Robot. 2010, 2, 31–40.
[27]
Bellotto, N.; Hu, H. Multisensor Data Fusion for Joint People Tracking and Identification with a Service Robot. Proceedings of the IEEE International Conference on Robotics and Biomimetics ROBIO, Tianjin, China, 14–18 December 2007; pp. 1494–1499.
[28]
Zhu, Y.; Fujimura, K. Bayesian 3D Human Body Pose Tracking from Depth Image Sequences. Proceedings of the ACCV (2), Xi'an, Chinav, 23–27 September 2009; pp. 267–278.
[29]
Gundimada, S.; Asari, V.K.; Gudur, N. Face recognition in multi-sensor images based on a novel modular feature selection technique. Inf. Fusion 2010, 11, 124–132.
[30]
Meis, U.; Oberlander, M.; Ritter, W. Reinforcing the Reliability of Pedestrian Detection in Far-Infrared Sensing. Proceedings of the 2004 IEEE Intelligent Vehicles Symposium, Parma, Italy, 14–17 June 2004; pp. 779–783.
[31]
Li, W.; Zheng, D.; Zhao, T.; Yang, M. An Effective Approach to Pedestrian Detection in Thermal Imagery. Proceedings of the ICNC. IEEE Maui, HI, USA, 30 January–2 February 2012; pp. 325–329.
[32]
Treptow, A.; Cielniak, G.; Duckett, T. Active People Recognition using Thermal and Grey Images on a Mobile Security Robot. Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, Edmonton, Canada, 2–6 August 2005.
[33]
Treptow, A.; Cielniak, G.; Duckett, T. Real-time people tracking for mobile robots using thermal vision. Robot. Auton. Syst. 2006, 54, 729–739.
[34]
Guan, F.; Li, L.Y.; Ge, S.S.; Loh, A.P. Robust human detection and identification by using stereo and thermal images in human-robot interaction. Int. J. Inf. Acquis. 2007, 4, 1–22.
[35]
Correa, M.; Hermosilla, G.; Verschae, R.; del Solar, J.R. Human detection and identification by robots using thermal and visual information in domestic environments. J. Intell. Robot. Syst. 2012, 66, 223–243.
[36]
Arras, K.O.; Martinez-Mozos, O.; Burgard, W. Using Boosted Features for Detection of People in 2D Range Scans. Proceedings of the IEEE International Conference on Robotics and Automation, Roma, Italy, 10–14 April 2007.
[37]
Lai, K.; Bo, L.; Ren, X.; Fox, D. A Large-scale Hierarchical Multi-view RGB-D Object Dataset. Proceedings of the ICRA, Shanghai, China, 9–13 May 2011; pp. 1817–1824.
[38]
Mozos, O.M.; Mizutani, H.; Kurazume, R.; Hasegawa, T. Categorization of Indoor Places Using RGB-D Sensors. Proceedings of the The 8th Joint Workshop on Machine Perception and Robotics, Fukuaka, Japan, 16–17 October 2012.
[39]
Spinello, L.; Arras, K.O. Leveraging RGB-D Data: Adaptive Fusion and Domain Adaptation for Object Detection. Proceedings of the International Conference in Robotics and Automation, St, Paul, MN, USA, 14–18 May 2012.
[40]
Susperregi, L.; Sierra, B.; Martínez-Otzeta, J.M.; Lazkano, E.; Ansuategui, A. A layered learning approach to 3D multimodal people detection using low-cost sensors in a mobile robot. Adv. Intell. Soft Comput. 2012, 153, 27–33.
[41]
Shotton, J.; Fitzgibbon, A.; Cook, M.; Sharp, T.; Finocchio, M.; Moore, R.; Kipman, A.; Blake, A. Real-time Human Pose Recognition in Parts from Single Depth Images. Proceedings of the CVPR, Colorado Springs, CO, USA, 21–23 June 2011.
Cestnik, B. Estimating Probabilities: A Crucial Task in Machine Learning. 6 August 1990; pp. 147–149.
[44]
Sierra, B.; Lazkano, E.; Jauregi, E.; Irigoien, I. Histogram distance-based Bayesian Network structure learning: A supervised classification specific approach. Decis. Support Syst. 2009, 48, 180–190.
[45]
Quinlan, J. C4.5: Programs for Machine Learning; Morgan Kaufmann Publishers: San Francisco, CA, USA, 1993.
[46]
Meyer, D.; Leisch, F.; Hortnik, K. The support vector machine under test. Neurocomputing 2003, 55, 169–186.
[47]
Martínez-Otzeta, J.M.; Sierra, B.; Lazkano, E.; Astigarraga, A. Classifier hierarchy learning by means of genetic algorithms. Patt. Recog. Lett. 2006, 27, 1998–2004.
[48]
Sierra, B.; Serrano, N.; Larra?aga, P.; Plasencia, E.J.; Inza, I.; Jiménez, J.J.; Revuelta, P.; Mora, M.L. Using Bayesian networks in the construction of a bi-level multi-classifier. A case study using intensive care unit patients data. Artif. Intell. Med. 2001, 22, 233–248.
[49]
Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–26 June 2005; pp. 886–893.
[50]
Stone, M. Cross-validatory choice and assessment of statistical prediction. J. R. Stat. Soc. B. 1974, 36, 111–147.
[51]
Spinello, L.; Arras, K.O. People Detection in RGB-D Data. Proceedings of the International Conference on Intelligent Robots and Systems (IROS), San Francisco, CA, USA, 25–30 September 2011.