Abstract
In this paper, we present a novel approach to recover a 3D human pose in real time from a single depth image using principal direction analysis (PDA). Human body parts are first recognized from a human depth silhouette via trained random forests (RFs). PDA is then applied to each recognized body part, represented as a set of 3D points, to estimate its principal direction. Finally, a 3D human pose is recovered by mapping the principal direction of each body part onto the corresponding part of a 3D synthetic human model. We perform both quantitative and qualitative evaluations of our proposed 3D human pose recovery methodology. We show that our approach achieves a low average reconstruction error of 7.07 degrees over four key joint angles and performs more reliably on a sequence of unconstrained poses than conventional methods. In addition, our methodology runs at 20 FPS on a standard PC, indicating that the system is suitable for real-time applications. Our 3D pose recovery methodology is applicable to applications ranging from human-computer interaction to human activity recognition.
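The core step of the pipeline, estimating a body part's principal direction from its 3D point set, can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes the principal direction is the leading eigenvector of the covariance matrix of the part's points (a standard PCA-style reading of "principal direction analysis"), and the body-part point cloud here is synthetic.

```python
import numpy as np

def principal_direction(points: np.ndarray) -> np.ndarray:
    """Return the unit principal direction of an (N, 3) point cloud,
    i.e. the eigenvector of the covariance matrix with the largest
    eigenvalue."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -1]                   # leading eigenvector

# Hypothetical "forearm" point cloud: points elongated along the z axis,
# so the recovered principal direction should be close to +/- z.
rng = np.random.default_rng(0)
pts = np.column_stack([rng.normal(0.0, 0.01, 200),
                       rng.normal(0.0, 0.01, 200),
                       rng.normal(0.0, 1.00, 200)])
d = principal_direction(pts)
print(np.abs(d).round(2))
```

In a full system, such a direction would be computed once per recognized body part per frame and then mapped to the joint angles of the synthetic model; the eigen-decomposition of a 3x3 covariance is cheap enough to keep the whole pipeline real time.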
Acknowledgments
This research was supported by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2013-(H0301-13-2001)). This work was also supported by the Industrial Strategic Technology Development Program (10035348, Development of a Cognitive Planning and Learning Model for Mobile Platforms) funded by the Ministry of Knowledge Economy (MKE, Korea).
Dinh, DL., Lim, MJ., Thang, N.D. et al. Real-time 3D human pose recovery from a single depth image using principal direction analysis. Appl Intell 41, 473–486 (2014). https://doi.org/10.1007/s10489-014-0535-z