3D Human Motion Tracking and Reconstruction Using DCT Matrix Descriptor

DOI: 10.5402/2012/235396


Abstract:

One of the most important issues in human motion analysis is the tracking and 3D reconstruction of human motion, which relies on the positions of anatomical points. These points uniquely define the position and orientation of all anatomical segments. In this work, a new method is proposed for tracking and 3D reconstruction of human motion from the image sequence of a monocular static camera. In this method, 2D tracking is used for 3D reconstruction, and a database of selected frames is used to correct the tracking process. The method utilizes a new image descriptor based on the discrete cosine transform (DCT), which is employed in different stages of the algorithm. The advantage of this descriptor is the ability to select suitable frequency regions for different tasks, which results in efficient tracking and pose-matching algorithms. The tracking and matching algorithms are based on reference descriptor matrices (RDMs), which are updated after each stage based on the frequency regions in the DCT blocks. Finally, 3D reconstruction is performed using Taylor's method. Experimental results show the promise of the algorithm.

1. Introduction

One of the challenging issues in machine vision and computer graphics applications is the modeling and animation of human characters. In particular, body modeling from video sequences is a difficult task that has been widely investigated over the last decade. Nowadays, 3D human models are employed in various applications such as movies, video games, ergonomics, e-commerce, virtual environments, and medicine. 3D scanners [1, 2] and video cameras are two tools that have been used for 3D human model reconstruction. 3D scanners offer limited flexibility and constrain the subject's freedom of movement; in addition, their high cost puts them out of reach for general use. Video cameras are nonintrusive and flexible devices for extracting human motion. However, due to the high number of degrees of freedom of the human body, human motion tracking is a difficult task. In addition, self-occlusion of body segments and their unknown kinematics make human tracking even more challenging. Existing vision-based approaches to human motion analysis may be divided into two groups: model-based and model-free methods [3]. In model-based methods [4–8], an a priori known human model is employed to represent human joints and segments as well as their kinematics. Model-free approaches do not employ a predefined human model for motion analysis; instead, the motion information is derived directly from video sequences.
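To make the two building blocks named above concrete, the following is a minimal sketch, not the authors' implementation: a block-wise DCT descriptor that keeps a selectable frequency region of each block, and Taylor's method [12] for recovering the relative depth of a body segment of known length from the foreshortening of its 2D projection under scaled-orthographic projection. The function names (dct_block_descriptor, taylor_relative_depth), the block size, the choice of the top-left keep x keep coefficients as the "frequency region", and the scale parameter are illustrative assumptions rather than the paper's exact choices.

    import numpy as np
    from scipy.fftpack import dct


    def dct2(block):
        # 2D type-II DCT of a square image block (DCT along rows, then columns).
        return dct(dct(block.T, norm='ortho').T, norm='ortho')


    def dct_block_descriptor(patch, block_size=8, keep=4):
        # Describe a grayscale patch by the DCT coefficients of each
        # non-overlapping block. Keeping the top-left keep x keep (low-frequency)
        # coefficients is an assumed choice of frequency region.
        h, w = patch.shape
        coeffs = []
        for y in range(0, h - block_size + 1, block_size):
            for x in range(0, w - block_size + 1, block_size):
                block = patch[y:y + block_size, x:x + block_size].astype(float)
                coeffs.append(dct2(block)[:keep, :keep].ravel())
        return np.concatenate(coeffs)


    def taylor_relative_depth(p1, p2, segment_length, scale):
        # Taylor's method [12]: under scaled-orthographic projection with scale
        # factor `scale`, a segment of known length L whose endpoints project to
        # p1 and p2 satisfies dZ^2 = L^2 - (du^2 + dv^2) / scale^2.
        # Only |dZ| is recoverable here; the sign of dZ remains ambiguous.
        du, dv = np.asarray(p1, float) - np.asarray(p2, float)
        foreshortening = (du * du + dv * dv) / (scale * scale)
        return np.sqrt(max(segment_length ** 2 - foreshortening, 0.0))

In the spirit of the abstract, descriptors of this form could be compared (for example, by Euclidean distance) against rows of a reference descriptor matrix during tracking and pose matching, while the per-segment depth differences from Taylor's method, chained along the skeleton, yield the 3D pose up to a sign ambiguity for each segment.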

References

[1]  P. Treleaven and J. Wells, “3D body scanning and healthcare applications,” Computer, vol. 40, no. 7, pp. 28–34, 2007.
[2]  P. Kelly, C. O. Conaire, J. Hodgins, and N. E. O’Connor, “Human motion reconstruction using wearable accelerometers,” in Proceedings of the ACM SIGGRAPH / Eurographics Symposium on Computer Animation (SCA '10), Madrid, Spain, 2010.
[3]  R. Poppe, “Vision-based human motion analysis: an overview,” Computer Vision and Image Understanding, vol. 108, no. 1-2, pp. 4–18, 2007.
[4]  Y. K. Wang and K. Y. Cheng, “A two-stage Bayesian network method for 3D human pose estimation from monocular image sequences,” EURASIP Journal on Advances in Signal Processing, vol. 2010, Article ID 761460, 2010.
[5]  C. Barrón and I. A. Kakadiaris, “Estimating anthropometry and pose from a single uncalibrated image,” Computer Vision and Image Understanding, vol. 81, no. 3, pp. 269–284, 2001.
[6]  C. Sminchisescu and B. Triggs, “Kinematic jump processes for monocular 3D human tracking,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '03), pp. 69–76, Rhone-Alpes, France, 2003.
[7]  C. Chen, Y. Zhuang, and J. Xiao, “Towards robust 3D reconstruction of human motion from monocular video,” in Lecture Notes in Computer Science: Advances in Artificial Reality and Tele-Existence, Z. Pan, A. Cheok, M. Haller, R. Lau, H. Saito, and R. Liang, Eds., pp. 594–603, Springer, Berlin, Germany, 2006.
[8]  G. Loy, M. Eriksson, J. Sullivan, and S. Carlsson, “Monocular 3D reconstruction of human motion in long action sequences,” in Lecture Notes in Computer Science: Computer Vision-ECCV, T. Pajdla and J. Matas, Eds., vol. 3024, pp. 442–445, Springer, Berlin, Germany, 2004.
[9]  G. Mori and J. Malik, “Estimating human body configurations using shape context matching,” in Lecture Notes in Computer Science: Computer Vision-ECCV, A. Heyden, G. Sparr, M. Nielsen, and P. Johansen, Eds., pp. 150–180, Springer, Berlin, Germany, 2002.
[10]  A. Kanaujia, C. Sminchisescu, and D. Metaxas, “Semi-supervised hierarchical models for 3D human pose reconstruction,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, Minneapolis, Minn, USA, June 2007.
[11]  R. Rosales and S. Sclaroff, “Learning and synthesizing human body motion and posture,” in Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition, pp. 506–511, Grenoble, France, 2000.
[12]  C. J. Taylor, “Reconstruction of articulated objects from point correspondences in a single uncalibrated image,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '00), pp. 677–684, Hilton Head Island, SC, USA, June 2000.
[13]  G. Mori and J. Malik, “Recovering 3D human body configurations using shape contexts,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 7, pp. 1052–1062, 2006.
[14]  M. J. Park, M. G. Choi, Y. Shinagawa, and S. Y. Shin, “Video-guided motion synthesis using example motions,” ACM Transactions on Graphics, vol. 25, no. 4, pp. 1327–1359, 2006.
[15]  I. Mikic, M. Trivedi, E. Hunter, and P. Cosman, “Articulated body posture estimation from multi-camera voxel data,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '01), pp. 455–460, Kauai, Hawaii, USA, 2001.
[16]  S. Iwasawa, J. Ohya, K. Takahashi, T. Sakaguchi, S. Morishima, and K. Ebihara, “Human body postures from trinocular camera images,” in Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition, pp. 326–331, Grenoble, France, 2000.
[17]  A. Hilton, D. Beresford, T. Gentils, R. Smith, W. Sun, and J. Illingworth, “Whole-body modelling of people from multiview images to populate virtual worlds,” Visual Computer, vol. 16, no. 7, pp. 411–436, 2000.
[18]  R. Plänkers and P. Fua, “Tracking and modeling people in video sequences,” Computer Vision and Image Understanding, vol. 81, no. 3, pp. 285–302, 2001.
[19]  R. Plankers, P. Fua, and N. D'Apuzzo, “Automated body modeling from video sequences,” in Proceedings of the IEEE International Workshop on Modelling People, pp. 45–52, Kerkyra, Greece, 1999.
[20]  G. K. M. Cheung, T. Kanade, J. Y. Bouguet, and M. Holler, “A real time system for robust 3D voxel reconstruction of human motions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 714–720, Hilton Head Island, SC, USA, 2002.
[21]  M. J. Park, M. G. Choi, and S. Y. Shin, “Human motion reconstruction from inter-frame feature correspondences of a single video stream using a motion library,” in Proceedings of the ACM SIGGRAPH Symposium on Computer Animation (SCA '02), pp. 113–120, July 2002.
[22]  D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[23]  G. Qian and F. Guo, “Monocular 3D tracking of articulated human motion in silhouette and pose manifolds,” EURASIP Journal on Image and Video Processing, vol. 2008, Article ID 326896, 2008.
[24]  K. Rohr, “Towards model-based recognition of human movements in image sequences,” CVGIP: Image Understanding, vol. 59, no. 1, pp. 94–115, 1994.
[25]  N. Roodsarabi and A. Behrad, “3D human motion reconstruction using video processing,” in Proceedings of the 3rd International Conference on Image and Signal Processing (ICISP '08), pp. 386–395, Cherbourg-Octeville, France, 2008.
[26]  J. Y. Bouguet, “Pyramidal implementation of the Lucas-Kanade feature tracker: description of the algorithm,” Intel Corporation, Microprocessor Research Labs, OpenCV Documents, 1999.
[27]  F. Remondino and A. Roditakis, “3D reconstruction of human skeleton from single images or monocular video sequences,” in Lecture Notes in Computer Science: Pattern Recognition, B. Michaelis and G. Krell, Eds., vol. 2781, pp. 100–107, Springer, Berlin, Germany, 2003.
[28]  CMU Graphics Lab Motion Capture Database, http://mocap.cs.cmu.edu/.
[29]  G. Mori, S. Belongie, and J. Malik, “Shape contexts enable efficient retrieval of similar shapes,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '01), pp. 723–730, Kauai, Hawaii, USA, December 2001.
