Abstract
Current research on human action recognition in videos mainly focuses on distinguishing actions across different types of videos, that is, coarse recognition. However, these methods may fail to recognize the specific actions of a single object of interest, especially when the video contains multiple moving objects performing different actions. In this paper, we propose a novel method for specific player action recognition in combat sports video. Object tracking with body segmentation is used to generate sub-frame sequences. Action recognition is achieved by training a new three-stream Convolutional Neural Network (CNN) model, whose inputs are the horizontal components of optical flow, a single sub-frame, and the vertical components of optical flow, respectively. Network fusion is applied at both the convolutional and softmax layers. Extensive experiments on real broadcast combat sports videos demonstrate the advantages and effectiveness of the proposed method.
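The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the fusion operators, so channel concatenation at the convolutional level and probability averaging at the softmax level are assumptions. All function names, the bounding-box format, and the toy inputs are hypothetical.

```python
import numpy as np


def crop_sub_frame(frame, box):
    """Crop the tracked player's region from a frame.

    Assumption: box is (x, y, width, height) in pixels, as a tracker might emit.
    """
    x, y, w, h = box
    return frame[y:y + h, x:x + w]


def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - np.max(scores, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def fuse_conv_features(flow_x_maps, rgb_maps, flow_y_maps):
    """Convolutional-level fusion, assumed here to be channel concatenation.

    Each input has shape (channels, height, width) with matching height/width.
    """
    return np.concatenate([flow_x_maps, rgb_maps, flow_y_maps], axis=0)


def fuse_softmax(stream_scores, weights=None):
    """Softmax-level fusion, assumed here to be a (weighted) mean of
    the per-stream class probabilities."""
    probs = np.stack([softmax(s) for s in stream_scores])
    return np.average(probs, axis=0, weights=weights)


# Toy data standing in for a broadcast frame and per-stream outputs.
frame = np.zeros((240, 320, 3), dtype=np.uint8)
sub_frame = crop_sub_frame(frame, (100, 50, 64, 128))

fused_maps = fuse_conv_features(
    np.ones((4, 7, 7)),  # horizontal optical-flow stream
    np.ones((4, 7, 7)),  # single sub-frame (appearance) stream
    np.ones((4, 7, 7)),  # vertical optical-flow stream
)

class_probs = fuse_softmax([
    np.array([2.0, 0.5, 0.1]),  # horizontal-flow stream class scores
    np.array([1.5, 1.0, 0.2]),  # sub-frame stream class scores
    np.array([1.8, 0.3, 0.4]),  # vertical-flow stream class scores
])
predicted_action = int(np.argmax(class_probs))
```

In this sketch the cropped sub-frame is what would feed the appearance stream, while the two optical-flow streams would receive flow components computed over the same region. Other fusion operators, such as summation or a learned convolution, would fit the same interface.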
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant nos. 41606198 and 61301241) and in part by the China Postdoctoral Science Foundation under Grant No. 2015M582140.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Kong, Y., Wei, Z., Wei, Z., Wang, S., Gao, F. (2018). Exploiting Sub-region Deep Features for Specific Action Recognition in Combat Sports Video. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10736. Springer, Cham. https://doi.org/10.1007/978-3-319-77383-4_19
DOI: https://doi.org/10.1007/978-3-319-77383-4_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77382-7
Online ISBN: 978-3-319-77383-4
eBook Packages: Computer Science, Computer Science (R0)