Automatic Human Activity Segmentation and Labeling in RGBD Videos

  • Conference paper
Intelligent Decision Technologies 2016 (IDT 2016)

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 56)

Abstract

Human activity recognition has become one of the most active research topics in image processing and pattern recognition. Manual analysis of video is labour-intensive, fatiguing, and error-prone. Recognizing human activities from video can lead to improvements in several application fields, such as surveillance systems, human-computer interfaces, sports video analysis, digital shopping assistants, video retrieval, gaming, and health care. This paper aims to recognize an action performed within a sequence of continuous actions recorded with a Kinect sensor, based on the positions of the main skeleton joints. The typical approach uses manually labeled data to perform supervised training. In this paper we propose a method for automatic temporal segmentation that separates the sequence into a set of actions. By measuring the amount of movement that occurs in each joint of the skeleton, we are able to find temporal segments that represent the individual actions. We also propose an automatic labeling method for human actions that applies a clustering algorithm to a subset of the available features.
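The segmentation idea the abstract describes, finding boundaries where the total amount of joint movement drops, can be sketched roughly as below. The function names, the per-frame displacement measure, and the fixed threshold are illustrative assumptions for this sketch, not the authors' actual implementation.

```python
# Hedged sketch: segment a continuous skeleton stream at low-motion
# instants. A skeleton frame is a list of (x, y, z) joint positions.

def motion_energy(frames):
    """Total inter-frame joint displacement per frame.

    Returns one energy value per frame (the first frame gets 0.0),
    computed as the sum of Euclidean distances each joint travelled
    since the previous frame.
    """
    energy = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        e = sum(((cx - px) ** 2 + (cy - py) ** 2 + (cz - pz) ** 2) ** 0.5
                for (px, py, pz), (cx, cy, cz) in zip(prev, cur))
        energy.append(e)
    return energy

def segment_boundaries(energy, threshold):
    """Frame indices where energy falls below the threshold after
    being above it, i.e. candidate pauses between consecutive actions."""
    bounds = []
    for i in range(1, len(energy)):
        if energy[i - 1] >= threshold > energy[i]:
            bounds.append(i)
    return bounds
```

In the paper's pipeline the segments found this way would then be labeled by clustering features extracted from each segment; the fixed threshold here stands in for whatever smoothing or adaptive criterion the authors actually use.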



Author information

Corresponding author

Correspondence to David Jardim.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Jardim, D., Nunes, L., Dias, M.S. (2016). Automatic Human Activity Segmentation and Labeling in RGBD Videos. In: Czarnowski, I., Caballero, A., Howlett, R., Jain, L. (eds) Intelligent Decision Technologies 2016. IDT 2016. Smart Innovation, Systems and Technologies, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-319-39630-9_32

  • DOI: https://doi.org/10.1007/978-3-319-39630-9_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-39629-3

  • Online ISBN: 978-3-319-39630-9

  • eBook Packages: Engineering, Engineering (R0)
