Abstract
Human activity recognition has become one of the most active research topics in image processing and pattern recognition. Manual analysis of video is labour-intensive, fatiguing, and error-prone. Recognizing human activities from video can lead to improvements in several application fields, such as surveillance systems, human-computer interfaces, sports video analysis, digital shopping assistants, video retrieval, gaming, and health care. This paper aims to recognize individual actions within a sequence of continuous actions recorded with a Kinect sensor, based on the positions of the main skeleton joints. The typical approach is to use manually labeled data to perform supervised training. In this paper we propose a method for automatic temporal segmentation that separates the sequence into a set of distinct actions. By measuring the amount of movement that occurs in each joint of the skeleton, we find temporal segments that correspond to individual actions. We also propose an automatic labeling method for human actions, using a clustering algorithm on a subset of the available features.
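The segmentation idea the abstract describes, measuring the amount of movement in each skeleton joint and cutting the sequence where motion drops, can be sketched as follows. The array layout (frames x joints x 3-D coordinates), the smoothing window, and the rest threshold are illustrative assumptions, not the paper's actual features or parameters.

```python
import numpy as np

def motion_energy(skeleton, window=5):
    """Total joint displacement per frame, smoothed with a moving average.

    skeleton: array of shape (T, J, 3) -- T frames, J joints, 3-D positions.
    This layout is a hypothetical stand-in for the Kinect skeleton stream.
    """
    disp = np.linalg.norm(np.diff(skeleton, axis=0), axis=2)  # (T-1, J) per-joint movement
    energy = disp.sum(axis=1)                                 # total movement per frame
    kernel = np.ones(window) / window
    return np.convolve(energy, kernel, mode="same")           # smooth out jitter

def segment_boundaries(energy, rest_thresh):
    """Frames where smoothed motion falls below rest_thresh mark pauses
    between actions; each onset of a low-motion run starts a new segment."""
    low = energy < rest_thresh
    boundaries = []
    prev = False
    for i, flag in enumerate(low):
        if flag and not prev:
            boundaries.append(i)
        prev = flag
    return boundaries

# Synthetic demo: one joint moves during frames 10-24 and 40-54, rests otherwise.
step = np.zeros(60)
step[10:25] = 0.2
step[40:55] = 0.2
skeleton = np.zeros((60, 2, 3))
skeleton[:, 0, 0] = np.cumsum(step)   # x-coordinate of joint 0 drifts during actions

boundaries = segment_boundaries(motion_energy(skeleton), rest_thresh=0.05)
# Detects three low-motion onsets: the initial rest and the pause after each action.
```

Each detected segment could then be summarized by a feature vector (e.g. aggregated joint displacements) and grouped with any clustering algorithm to assign labels automatically, in the spirit of the unsupervised labeling step the abstract mentions.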
© 2016 Springer International Publishing Switzerland
Cite this paper
Jardim, D., Nunes, L., Dias, M.S. (2016). Automatic Human Activity Segmentation and Labeling in RGBD Videos. In: Czarnowski, I., Caballero, A., Howlett, R., Jain, L. (eds) Intelligent Decision Technologies 2016. IDT 2016. Smart Innovation, Systems and Technologies, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-319-39630-9_32
DOI: https://doi.org/10.1007/978-3-319-39630-9_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39629-3
Online ISBN: 978-3-319-39630-9
eBook Packages: Engineering (R0)