Abstract
A human action recognition method is reported in which pose representation is based on the contour points of the human silhouette, and actions are learned by a strict 3D pyramidal neural network (3DPyraNet) model that builds on convolutional neural networks and the image pyramid concept. 3DPyraNet extracts features from both the spatial and temporal dimensions while preserving a biologically inspired structure, and can thereby capture the motion information encoded in multiple adjacent frames. One advantage outlined for 3DPyraNet is that it maintains the spatial topology of the input image and uses a simple connection scheme with lower computational and memory costs than other neural networks. Encouraging results are reported for recognizing human actions in real-world environments.
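To make the idea of extracting features across adjacent frames concrete, the following is a minimal NumPy sketch of a single layer that sums a weighted 3D receptive field spanning several consecutive frames. It is not the authors' implementation: the function name `pyramidal_3d_layer`, the receptive-field sizes `r` and `d`, the use of one unshared weight per input position, and the `tanh` activation are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def pyramidal_3d_layer(frames, weights, bias, r=3, d=3):
    """Weighted sum over r x r x d receptive fields (stride 1, no padding).

    frames  : (T, H, W) stack of silhouette frames
    weights : (T, H, W) one weight per input position (assumed, not shared)
    bias    : (T-d+1, H-r+1, W-r+1) one bias per output unit
    All names and sizes are hypothetical, chosen only for this sketch.
    """
    weighted = frames * weights              # weight each input position once
    T, H, W = frames.shape
    out = np.zeros((T - d + 1, H - r + 1, W - r + 1))
    # Accumulate every offset inside the receptive field, so each output
    # unit sees d adjacent frames and an r x r spatial neighbourhood.
    for dt in range(d):
        for dy in range(r):
            for dx in range(r):
                out += weighted[dt:dt + out.shape[0],
                                dy:dy + out.shape[1],
                                dx:dx + out.shape[2]]
    return np.tanh(out + bias)               # assumed activation

rng = np.random.default_rng(0)
clip = rng.random((5, 8, 8))                 # 5 frames of 8 x 8 pixels
w = rng.standard_normal(clip.shape) * 0.1
b = np.zeros((3, 6, 6))
features = pyramidal_3d_layer(clip, w, b)
print(features.shape)                        # (3, 6, 6)
```

Because no padding is used, each layer of this kind yields a smaller output than its input while keeping the spatial layout of the frame, which is one way to read the pyramidal, topology-preserving structure described above.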
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ullah, I., Petrosino, A. (2015). A Strict Pyramidal Deep Neural Network for Action Recognition. In: Murino, V., Puppo, E. (eds) Image Analysis and Processing — ICIAP 2015. ICIAP 2015. Lecture Notes in Computer Science(), vol 9279. Springer, Cham. https://doi.org/10.1007/978-3-319-23231-7_22
DOI: https://doi.org/10.1007/978-3-319-23231-7_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23230-0
Online ISBN: 978-3-319-23231-7
eBook Packages: Computer Science, Computer Science (R0)