All Title Author
Keywords Abstract

Sensors  2013 

Combined Hand Gesture — Speech Model for Human Action Recognition

DOI: 10.3390/s131217098

Keywords: hand gesture detection, hand gesture recognition, speech recognition, human behavior

Full-Text   Cite this paper   Add to My Lib


This study proposes a dynamic hand gesture detection technology to effectively detect dynamic hand gesture areas, and a hand gesture recognition technology to improve the dynamic hand gesture recognition rate. Meanwhile, the corresponding relationship between state sequences in hand gesture and speech models is considered by integrating speech recognition technology with a multimodal model, thus improving the accuracy of human behavior recognition. The experimental results proved that the proposed method can effectively improve human behavior recognition accuracy and the feasibility of system applications. Experimental results verified that the multimodal gesture-speech model provided superior accuracy when compared to the single modal versions.


[1]  Calinon, S.; Billard, A. Incremental Learning of Gestures by Imitation in a Humanoid Robot. Proceedings of 2nd ACM/IEEE International Conference on Human-Robot Interaction, Arlington, VA, USA, 9–11 March 2007; pp. 255–262.
[2]  Calinon, S.; Billard, A. A Framework Integrating Statistical and Social Cues to Teach a Humanoid Robot New Skills. Proceedings of IEEE International Conference on Robotics and Automation (ICRA), Workshop on Social Interaction with Intelligent Indoor Robots, Pasadena, CA, USA, 19–23 May 2008.
[3]  Varkonyi-Koczy, A.R.; Tusor, B. Human-computer interaction for smart environment applications using fuzzy hand posture and gesture models. IEEE Trans. Instrum. Meas. 2011, 60, 1505–1514.
[4]  Manresa, C.; Varona, J.; Mas, R.; Perales, F.J. Real-time hand tracking and gesture recognition for human-computer interaction. Electron. Lett. Comput. Vis. Image Anal. 2000, 0, 1–7.
[5]  Dardas, N.H.; Georganas, N.D. Real-time hand gesture detection and recognition using bag-of-features and support vector machine techniques. IEEE Trans. Instrum. Meas. 2011, 60, 3592–3607.
[6]  Ren, Z.; Yuan, J.; Meng, J.; Zhang, Z. Robust part-based hand gesture recognition using kinect sensor. IEEE Trans. Multimed. 2013, 15, 1110–1120.
[7]  Chen, Q.; Georganas, N.D.; Petriu, E.M. Hand gesture recognition using haar-like features and a stochastic context-free grammar. IEEE Trans. Instrum. Meas. 2008, 57, 1562–1571.
[8]  Frolova, D.; Stern, H.; Berman, S. Most probable longest common subsequence for recognition of gesture character input. IEEE Trans. Cybern. 2013, 43, 871–880.
[9]  Zhang, X.; Chen, X.; Li, Y.; Lantz, V.; Wang, K.; Yang, J. A framework for hand gesture recognition based on accelerometer and EMG sensors. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2011, 41, 1064–1076.
[10]  Stern, H.; Edan, Y. Cluster labeling and parameter estimation for the automated setup of a hand-gesture recognition system. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2005, 35, 932–944.
[11]  Xu, R.; Zhou, S.; Li, W.J. MEMS accelerometer based nonspecific-user hand gesture recognition. IEEE Sens. J. 2012, 12, 1166–1173.
[12]  Tang, H.-K.; Feng, Z.-Q. Hand's Skin Detection Based on Ellipse Clustering. Proceedings of IEEE International Symposium on Computer Science and Computational Technology, Shanghai, China, 20–22 December 2008; pp. 758–761.
[13]  Zhang, M.-J.; Wang, W.-Q.; Zheng, Q.-F.; Gao, W. Skin-Color Detection Based on Adaptive Thresholds. Proceedings of the Third International Conference on Image and Graphics, Washington, DC, USA, 18–20 December 2004; pp. 18–20.
[14]  Cai, X.; Jiang, L.; Hao, X.-W.; Meng, X.-X. A New Region Gaussian Background Model for Video Surveillance. Proceedings of IEEE Fourth International Conference on Natural Computation, Jinan, China, 18–20 October 2008; pp. 123–127.
[15]  Wang, S.B. Video Completion Based on Effective Spatial and Temporal Inpainting Techniques. M.Sc. Thesis, Department of Information Engineering, I-Shou University, Kaohsiung, Taiwan, 2008.
[16]  Yoon, H.-S.; Chi, S.-Y. Visual Processing of Rock, Scissors, Paper Game for Human Robot Interaction. Proceedings of International Joint Conference SICE-ICASE, Busan, Korea, 18–21 October 2006; pp. 326–329.
[17]  Han, J.; Award, G.M.; Sutherland, A.; Wu, H. Automatic Skin Segmentation for Gesture Recognition Combining Region and Support Vector Machine Active Learning. Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition, Southampton, UK, 10–12 April 2006; pp. 237–242.
[18]  Hayakawa, H.; Shibata, T. Spatiotemporal Projection of Motion Field Sequence for Generating Feature Vectors in Gesture Perception. Proceedings of IEEE International Symposium on Circuits and Systems, Seattle, WA, USA, 18–21 May 2008; pp. 3526–3529.
[19]  Lin, W.; Sun, M.-T.; Poovandran, R.; Zhang, Z. Human Activity Recognition for Video Surveillance. Proceedings of IEEE International Symposium on Circuits and Systems, Seattle, WA, USA, 18–21 May 2008; pp. 2737–2740.
[20]  Ghimire, D.; Lee, J. Geometric feature-based facial expression recognition in image sequences using multi-class adaboost and support vector machines. Sensors 2013, 13, 7714–7734.
[21]  Tang, X.; Liu, Y.; Lv, C.; Sun, D. Hand motion classification using a multi-channel surface electromyography sensor. Sensors 2012, 12, 1130–1147.
[22]  Henry, P.; Krainin, M.; Herbst, E.; Ren, X.; Fox, D. RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments. Proceedings of the 12th International Symposium on Experimental Robotics, Delhi, India, 18–20 December 2010; pp. 22–25.
[23]  Henry, P.; Krainin, M.; Herbst, E.; Ren, X.; Fox, D. RGB-D Mapping: Using kinect-style depth cameras for dense 3D modeling of indoor environments. Int. J. Robot. Res. 2012, 31, 647–663.
[24]  Yang, X.; Yuan, J.; Thalmann, D. Human-Virtual Human Interaction by Upper Body Gesture Understanding. Proceedings of the 19th ACM Symposium on Virtual Reality Software and Technology, Singapore, 6–9 October 2013; pp. 133–142.
[25]  Shen, X.; Hua, G.; Williams, L.; Wu, Y. Dynamic hand gesture recognition: An exemplar-based approach from motion divergence fields. Image Vis. Comput. 2012, 30, 227–235.
[26]  Chen, F.S.; Fu, C.M.; Huang, C.L. Hand gesture recognition using a real-time tracking method and hidden Markov models. Image Vis. Comput. 2003, 21, 745–758.
[27]  Tham, J.Y.; Ranganath, S.; Ranganath, M.; Kassim, A.A. A Novel unrestricted center-biased diamond search algorithm for block motion estimation. IEEE Trans. Circuits Syst. Video Technol. 1998, 8, 369–377.
[28]  Chai, D.; Bouzerdoum, A. A Bayesian Approach to Skin Color Classification in YCbCr Color Space. Proceedings of TENCON: Intelligent Systems and Technologies for the New Millennium, Renaissance-New World Hotel, Kuala Lumpur, Malaysia, 24–27 September 2000; pp. 421–424.
[29]  Liu, C.-S.; Lin, J.-C.; Yang, N.-C.; Kuo, C.-M. Motion Vector Re-estimation for Trans-coding Using Kalman Filter. Proceedings of the 3rd International Conference on Intelligent Information Hiding and Multimedia Signal Processing, Kaohsiung, Taiwan, 26–28 November 2007; pp. 592–595.
[30]  Kuo, C.-M.; Chung, S.-C.; Shih, P.-Y. Kalman filtering based rate-constrained motion estimation for very low bit rate video coding. IEEE Trans. Circuits Syst. Video Technol. 2006, 16, 3–18.
[31]  Young, S.; Evermann, G.; Gales, M.; Hain, T.; Kershaw, D.; Liu, X.; Moore, G.; Odell, J.; Ollason, D.; Povey, D.; et al. The HTK Book (for HTK version 3.4); Engineering Department; Cambridge University: Cambridge, UK, 2009.


comments powered by Disqus