Affine-Invariant Feature Extraction for Activity Recognition

DOI: 10.1155/2013/215195

Abstract:

We propose an innovative approach for human activity recognition based on affine-invariant shape representation and SVM-based feature classification. In this approach, a compact, computationally efficient affine-invariant representation of action shapes is developed using affine moment invariants. Dynamic affine invariants are derived from the 3D spatiotemporal action volume and from the average image created from that volume, and are then classified by an SVM classifier. On two standard benchmark action datasets (KTH and Weizmann), the approach yields promising results that compare favorably with those previously reported in the literature, while maintaining real-time performance.

1. Introduction

Visual recognition and interpretation of human-induced actions and events are among the most active research areas in the computer vision, pattern recognition, and image understanding communities [1]. Although a great deal of progress has been made in the automatic recognition of human actions during the last two decades, the approaches proposed in the literature remain limited in their ability. Much research is therefore still needed to address the ongoing challenges and to develop more efficient approaches. Good algorithms for human action recognition would benefit a large number of applications, for example, the search and structuring of large video archives, human-computer interaction, video surveillance, gesture recognition, and robot learning and control. In fact, the nonrigid nature of the human body and clothing in video sequences, together with drastic illumination changes, changes in pose, and erratic motion patterns, presents a major challenge to human detection and action recognition.
In addition, while real-time performance is a major concern in computer vision, especially for embedded computer vision systems, the majority of state-of-the-art human action recognition systems employ sophisticated feature extraction and learning techniques, which creates a barrier to real-time operation. This suggests a trade-off between accuracy and real-time performance. The remainder of this paper begins with a brief review of the most relevant literature on human action recognition in Section 2. Then, in Section 3, we describe the details of the proposed method for action recognition. The experimental results corroborating the effectiveness of the proposed method are presented and analyzed in Section 4. Finally, in Section 5, we conclude and
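
As a rough illustration of the affine moment invariants mentioned in the abstract, the sketch below computes one such invariant on the average image of a small action volume. The text above does not say which invariants the authors use, so this is an assumption: it takes the first invariant from Flusser and Suk [27], I1 = (mu20 * mu02 - mu11^2) / mu00^4, and all function names (`central_moment`, `first_affine_invariant`, `average_image`, `shear_rows`) are hypothetical helpers introduced only for this example. The SVM classification stage is not shown.

```python
import numpy as np

def central_moment(img, p, q):
    """Central moment mu_pq of a 2D intensity image (rows = y, columns = x)."""
    img = np.asarray(img, dtype=float)
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xc = (x * img).sum() / m00  # centroid, x coordinate
    yc = (y * img).sum() / m00  # centroid, y coordinate
    return (((x - xc) ** p) * ((y - yc) ** q) * img).sum()

def first_affine_invariant(img):
    """I1 = (mu20 * mu02 - mu11^2) / mu00^4 (assumed choice, after Flusser and Suk)."""
    mu00 = central_moment(img, 0, 0)
    mu20 = central_moment(img, 2, 0)
    mu02 = central_moment(img, 0, 2)
    mu11 = central_moment(img, 1, 1)
    return (mu20 * mu02 - mu11 ** 2) / mu00 ** 4

def average_image(volume):
    """Mean over the time axis of a (T, H, W) stack of frames."""
    return np.asarray(volume, dtype=float).mean(axis=0)

def shear_rows(img, k):
    """Integer horizontal shear x' = x + k*y, which is exact on the pixel grid."""
    h, w = img.shape
    out = np.zeros((h, w + k * (h - 1)), dtype=float)
    for row in range(h):
        out[row, k * row : k * row + w] = img[row]
    return out

# Synthetic stand-in for an action volume: 5 frames of 8 x 6 pixels.
rng = np.random.default_rng(0)
volume = rng.random((5, 8, 6))
avg = average_image(volume)

# The value should agree before and after a shear, up to rounding error.
print(first_affine_invariant(avg))
print(first_affine_invariant(shear_rows(avg, 2)))
```

An integer shear is used for the check because it maps pixels onto pixels without resampling, so the invariance holds up to floating-point error. For general affine warps that require interpolation, the values would agree only approximately.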

References

[1]  S. Sadek, A. Al-Hamadi, B. Michaelis, and U. Sayed, “Recognizing human actions: a fuzzy approach via chord-length shape features,” ISRN Machine Vision, vol. 1, pp. 1–9, 2012.
[2]  A. A. Efros, A. C. Berg, G. Mori, and J. Malik, “Recognizing action at a distance,” in Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV '03), vol. 2, pp. 726–733, October 2003.
[3]  S. Sadek, A. Al-Hamadi, B. Michaelis, and U. Sayed, “Towards robust human action retrieval in video,” in Proceedings of the British Machine Vision Conference (BMVC '10), Aberystwyth, UK, September 2010.
[4]  S. Sadek, A. Al-Hamadi, B. Michaelis, and U. Sayed, “Human activity recognition: a scheme using multiple cues,” in Proceedings of the International Symposium on Visual Computing (ISVC '10), vol. 1, pp. 574–583, Las Vegas, Nev, USA, November 2010.
[5]  S. Sadek, A. Al-Hamadi, M. Elmezain, B. Michaelis, and U. Sayed, “Human activity recognition via temporal moment invariants,” in Proceedings of the 10th IEEE International Symposium on Signal Processing and Information Technology (ISSPIT '10), pp. 79–84, Luxor, Egypt, December 2010.
[6]  S. Sadek, A. Al-Hamadi, B. Michaelis, and U. Sayed, “An action recognition scheme using fuzzy log-polar histogram and temporal self-similarity,” EURASIP Journal on Advances in Signal Processing, vol. 2011, Article ID 540375, 2011.
[7]  R. Cutler and L. S. Davis, “Robust real-time periodic motion detection, analysis, and applications,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 781–796, 2000.
[8]  E. Shechtman and M. Irani, “Space-time behavior based correlation,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), vol. 1, pp. 405–412, June 2005.
[9]  M. D. Rodriguez, J. Ahmed, and M. Shah, “Action MACH: a spatio-temporal maximum average correlation height filter for action recognition,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), June 2008.
[10]  N. Ikizler and D. Forsyth, “Searching video for complex activities with finite state models,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), June 2007.
[11]  D. M. Blei and J. D. Lafferty, “Correlated topic models,” in Advances in Neural Information Processing Systems (NIPS), vol. 18, pp. 147–154, 2006.
[12]  D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” Journal of Machine Learning Research, vol. 3, no. 4-5, pp. 993–1022, 2003.
[13]  T. Hofmann, “Probabilistic latent semantic indexing,” in Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '99), pp. 50–57, 1999.
[14]  S. J. McKenna, Y. Raja, and S. Gong, “Tracking colour objects using adaptive mixture models,” Image and Vision Computing, vol. 17, no. 3-4, pp. 225–231, 1999.
[15]  J. Liu and M. Shah, “Learning human actions via information maximization,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), June 2008.
[16]  Y. Wang and G. Mori, “Max-Margin hidden conditional random fields for human action recognition,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '09), pp. 872–879, June 2009.
[17]  H. Jhuang, T. Serre, L. Wolf, and T. Poggio, “A biologically inspired system for action recognition,” in Proceedings of the 11th IEEE International Conference on Computer Vision (ICCV '07), pp. 257–267, October 2007.
[18]  K. Rapantzikos, Y. Avrithis, and S. Kollias, “Dense saliency-based spatiotemporal feature points for action recognition,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '09), pp. 1454–1461, June 2009.
[19]  P. Dollár, V. Rabaud, G. Cottrell, and S. Belongie, “Behavior recognition via sparse spatio-temporal features,” in Proceedings of the 2nd Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance (VS-PETS '05), pp. 65–72, October 2005.
[20]  Y. Ke, R. Sukthankar, and M. Hebert, “Efficient visual event detection using volumetric features,” in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), pp. 166–173, October 2005.
[21]  A. Fathi and G. Mori, “Action recognition by learning mid-level motion features,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), June 2008.
[22]  M. Bregonzio, S. Gong, and T. Xiang, “Recognising action as clouds of space-time interest points,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '09), pp. 1948–1955, June 2009.
[23]  Z. Zhang, Y. Hu, S. Chan, and L.-T. Chia, “Motion context: a new representation for human action recognition,” in Proceedings of the European Conference on Computer Vision (ECCV '08), vol. 4, pp. 817–829, 2008.
[24]  J. C. Niebles, H. Wang, and L. Fei-Fei, “Unsupervised learning of human action categories using spatial-temporal words,” International Journal of Computer Vision, vol. 79, no. 3, pp. 299–318, 2008.
[25]  A. Kläser, M. Marszałek, and C. Schmid, “A spatiotemporal descriptor based on 3D-gradients,” in Proceedings of the British Machine Vision Conference (BMVC '08), 2008.
[26]  S. Sadek, A. Al-Hamadi, B. Michaelis, and U. Sayed, “Human action recognition via affine moment invariants,” in Proceedings of the 21st International Conference on Pattern Recognition (ICPR '12), pp. 218–221, Tsukuba Science City, Japan, November 2012.
[27]  J. Flusser and T. Suk, “Pattern recognition by affine moment invariants,” Pattern Recognition, vol. 26, no. 1, pp. 167–174, 1993.
[28]  D. Xu and H. Li, “3-D affine moment invariants generated by geometric primitives,” in Proceedings of the 18th International Conference on Pattern Recognition (ICPR '06), pp. 544–547, August 2006.
[29]  S. Sadek, A. Al-Hamadi, B. Michaelis, and U. Sayed, “An SVM approach for activity recognition based on chord-length-function shape features,” in Proceedings of the IEEE International Conference on Image Processing (ICIP '12), pp. 767–770, Orlando, Fla, USA, October 2012.
[30]  V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, NY, USA, 1995.
[31]  C. Schüldt, I. Laptev, and B. Caputo, “Recognizing human actions: a local SVM approach,” in Proceedings of the 17th International Conference on Pattern Recognition (ICPR '04), pp. 32–36, 2004.
[32]  M. Blank, L. Gorelick, E. Shechtman, M. Irani, and R. Basri, “Actions as space-time shapes,” in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), vol. 2, pp. 1395–1402, October 2005.
