Action Recognition with Exemplar Based 2.5D Graph Matching

Yao, Bangpeng; Fei-Fei, Li

doi:10.1007/978-3-642-33765-9_13

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7575)

Included in the following conference series:

European Conference on Computer Vision

Abstract

This paper deals with recognizing human actions in still images. We make two key contributions. (1) We propose a novel 2.5D representation of action images that considers both view-independent pose information and rich appearance information. A 2.5D graph of an action image consists of a set of nodes that are key-points of the human body, as well as a set of edges that are spatial relationships between the nodes. Each key-point is represented by view-independent 3D positions and local 2D appearance features. The similarity between two action images can then be measured by matching their corresponding 2.5D graphs. (2) We use an exemplar based action classification approach, where a set of representative images is selected for each action class. The selected images cover large within-action variations and carry discriminative information compared with the other classes. This exemplar based representation of action classes further makes our approach robust to pose variations and occlusions. We test our method on two publicly available datasets and show that it achieves very promising performance.
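
The two ideas in the abstract can be illustrated with a small sketch. It is a simplification under stated assumptions, not the matching function from the paper: it assumes both graphs list the same key-points in the same order, compares edges through pairwise 3D distances, compares nodes through Euclidean distance between appearance vectors, and classifies by the nearest exemplar. The names `graph_distance` and `classify`, the dictionary layout, and the weights `w_pose` and `w_app` are hypothetical.

```python
import numpy as np

def graph_distance(g1, g2, w_pose=1.0, w_app=1.0):
    """Distance between two 2.5D graphs with identical key-point ordering.

    Each graph is a dict (hypothetical layout) with:
      "pos": (n, 3) array of 3D key-point positions
      "app": (n, d) array of local 2D appearance features, one row per key-point
    """
    # Node term: mean appearance difference at corresponding key-points.
    app = np.linalg.norm(g1["app"] - g2["app"], axis=1).mean()
    # Edge term: pairwise 3D distances between key-points. These do not
    # change when the whole pose is rotated or translated.
    e1 = np.linalg.norm(g1["pos"][:, None] - g1["pos"][None], axis=2)
    e2 = np.linalg.norm(g2["pos"][:, None] - g2["pos"][None], axis=2)
    pose = np.abs(e1 - e2).mean()
    return w_pose * pose + w_app * app

def classify(query, exemplars):
    """Return the label of the exemplar graph closest to the query.

    `exemplars` is a list of (label, graph) pairs; how the exemplars are
    selected per class is outside the scope of this sketch.
    """
    label, _ = min(exemplars, key=lambda e: graph_distance(query, e[1]))
    return label
```

Because the edge term depends only on distances between key-points, a rotated and shifted copy of an exemplar pose with unchanged appearance features has a distance of approximately zero to that exemplar, which is one simple way to obtain the view independence the abstract refers to.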

Author information

Authors and Affiliations

Department of Computer Science, Stanford University, USA
Bangpeng Yao & Li Fei-Fei

Editor information

Editors and Affiliations

Microsoft Research Ltd., CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid

About this paper

Cite this paper

Yao, B., Fei-Fei, L. (2012). Action Recognition with Exemplar Based 2.5D Graph Matching. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7575. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33765-9_13

DOI: https://doi.org/10.1007/978-3-642-33765-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33764-2
Online ISBN: 978-3-642-33765-9
eBook Packages: Computer Science, Computer Science (R0)
