Abstract
In this study, we propose a motion recognition algorithm that combines optimized spatiotemporal features with the bag-of-words (BoW) model. First, key points are detected in the input video, and multiple 3D patches are set up in the domain space to strengthen the spatiotemporal characteristics of the key points. The key point sets are then randomly sampled and described with the histogram of oriented gradients (HOG) and the optical flow histogram (OFH), which together yield a composite feature descriptor. Next, a clustering algorithm is applied to build a visual dictionary, which is used to represent the input video samples. Finally, a classifier is trained with a multi-core SVM to complete the motion recognition. The experimental results suggest that the proposed algorithm outperforms most existing motion recognition algorithms in recognition rate and robustness.
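The dictionary and representation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the composite descriptor is formed by concatenating separately normalized HOG and OFH histograms, and that the clustering algorithm is plain k-means; the abstract specifies neither. All function names are hypothetical, and key point detection, 3D patch construction, and SVM training are left out.

```python
import math
import random


def l2_normalize(hist):
    """Scale a histogram to unit Euclidean length (zero vectors stay unchanged)."""
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else list(hist)


def composite_descriptor(hog_hist, ofh_hist):
    """Concatenate normalized HOG and OFH histograms (assumed fusion rule)."""
    return l2_normalize(hog_hist) + l2_normalize(ofh_hist)


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, dictionary):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(dictionary)),
               key=lambda i: squared_distance(descriptor, dictionary[i]))


def build_dictionary(descriptors, k, iterations=20, seed=0):
    """Plain k-means (assumed clustering step); centroids act as visual words."""
    rng = random.Random(seed)
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, centroids)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def encode_video(descriptors, dictionary):
    """Represent one video as a normalized histogram of visual-word counts."""
    counts = [0] * len(dictionary)
    for d in descriptors:
        counts[nearest_word(d, dictionary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Toy usage: four made-up descriptors from one "video", two visual words.
toy = [
    composite_descriptor([4, 1], [3, 0]),
    composite_descriptor([5, 1], [4, 0]),
    composite_descriptor([1, 4], [0, 3]),
    composite_descriptor([1, 5], [0, 4]),
]
words = build_dictionary(toy, k=2)
print(encode_video(toy, words))
```

The resulting histogram is the fixed-length vector that a classifier such as an SVM would take as input, regardless of how many key points a given video produced.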
Cite this article
Huang, T., Ru, SR., Zeng, ZH. et al. Research on motion recognition algorithm based on bag-of-words model. Microsyst Technol 27, 1647–1654 (2021). https://doi.org/10.1007/s00542-019-04462-8
DOI: https://doi.org/10.1007/s00542-019-04462-8