
Three-dimensional spatio-temporal trajectory descriptor for human action recognition

  • Short Paper
  • Published in: International Journal of Multimedia Information Retrieval

Abstract

This paper presents a method for action recognition based on trajectory representation. Motion information is obtained by tracking key points through the video sequence with a standard KLT tracker. We propose a new 3D spatio-temporal descriptor based on a histogram of directional derivatives (3D-HODD) to describe the volume extracted around each trajectory. The descriptor captures the local object appearance within the volume effectively and distinctively. The final descriptor is constructed by combining the shape of the trajectories (motion information) with 3D-HODD (appearance information), and a multiclass support vector machine is used to classify the human actions. The proposed framework has been extensively validated on benchmark datasets; the results show that it is robust and achieves higher action recognition accuracy than current methods.
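To make the descriptor pipeline concrete, the following is a minimal NumPy sketch of how a combined motion + appearance descriptor of this kind could be assembled: trajectory shape encoded as normalized frame-to-frame displacements, and an appearance histogram built from directional derivatives of the spatio-temporal volume around the track. The bin count, the gradient-based directional-derivative formulation, and the normalization choices here are illustrative assumptions, not the paper's exact 3D-HODD definition.

```python
import numpy as np

def trajectory_shape(points):
    """Normalized displacement vectors along a tracked trajectory
    (a common trajectory-shape encoding; the paper's exact
    normalization may differ)."""
    pts = np.asarray(points, dtype=float)          # (L, 2) tracked (x, y)
    disp = np.diff(pts, axis=0)                    # frame-to-frame motion
    norm = np.sum(np.linalg.norm(disp, axis=1))    # total trajectory length
    return (disp / norm).ravel() if norm > 0 else disp.ravel()

def hodd_3d(volume, bins=8):
    """Histogram of directional derivatives over a spatio-temporal
    volume shaped (t, y, x): spatial gradient orientation is quantized
    into `bins`, and the 3D derivative magnitude weights each vote."""
    gt, gy, gx = np.gradient(volume.astype(float))     # d/dt, d/dy, d/dx
    mag = np.sqrt(gx**2 + gy**2 + gt**2)               # derivative magnitude
    ang = np.arctan2(gy, gx) % (2 * np.pi)             # spatial orientation
    idx = np.minimum((ang / (2 * np.pi) * bins).astype(int), bins - 1)
    hist = np.bincount(idx.ravel(), weights=mag.ravel(), minlength=bins)
    s = hist.sum()
    return hist / s if s > 0 else hist                 # L1-normalized

# Combine motion (trajectory shape) with appearance (3D-HODD):
rng = np.random.default_rng(0)
vol = rng.random((15, 32, 32))                     # t x h x w volume around a track
track = np.cumsum(rng.normal(size=(15, 2)), axis=0)
descriptor = np.concatenate([trajectory_shape(track), hodd_3d(vol)])
```

In practice the concatenated vectors for all trajectories would be fed to a multiclass SVM (e.g. via LIBSVM, as in the paper) after a suitable encoding step.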



Author information

Correspondence to Sidharth B. Bhorge.


Cite this article

Bhorge, S.B., Manthalkar, R.R. Three-dimensional spatio-temporal trajectory descriptor for human action recognition. Int J Multimed Info Retr 7, 197–205 (2018). https://doi.org/10.1007/s13735-018-0152-4
