Recognizing Interactions in Video

Taj, Murtaza; Cavallaro, Andrea

doi:10.1007/978-3-642-11756-5_2

Part of the book series: Studies in Computational Intelligence (SCI, volume 282)

Summary

Detection and tracking algorithms generate useful information in the form of trajectories, from which the behaviors and interactions of moving objects can be inferred through the analysis of spatio-temporal features. Interactions occur either between a dynamic and a static object, or between multiple dynamic objects. This chapter presents an interaction modeling framework formulated as a state sequence estimation problem using time-series analysis. Bayesian network-based methods and their variants are studied for the analysis of interactions in videos. Techniques such as the Coupled Hidden Markov Model are also discussed for more complex interactions, such as those between multiple dynamic objects. Finally, the interaction modeling is demonstrated on real surveillance and sport sequences.
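
To make "state sequence estimation" concrete, the sketch below runs generic Viterbi decoding on a plain discrete hidden Markov model. It is not taken from the chapter and it is not the Coupled Hidden Markov Model the summary mentions. The state names ("apart", "together"), the observation symbols ("far", "close", standing in for a quantized distance between two tracked objects), and all probabilities are hypothetical values chosen only for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for a discrete HMM."""

    def log(p):
        # Work in log space to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Backtrack from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: two objects are either "apart" or "together",
# and the only observation is whether their measured distance is far or close.
states = ["apart", "together"]
start_p = {"apart": 0.9, "together": 0.1}
trans_p = {
    "apart": {"apart": 0.8, "together": 0.2},
    "together": {"apart": 0.2, "together": 0.8},
}
emit_p = {
    "apart": {"far": 0.9, "close": 0.1},
    "together": {"far": 0.1, "close": 0.9},
}

obs = ["far", "far", "close", "close", "close", "far"]
path = viterbi(obs, states, start_p, trans_p, emit_p)
print(path)
```

In a coupled model, each object would have its own state chain whose transitions also depend on the other chain's previous state; the single-chain decoder above only illustrates the underlying idea of recovering a state sequence from trajectory-derived observations.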

Author information

Authors and Affiliations

Queen Mary University of London
Murtaza Taj & Andrea Cavallaro

Editor information

Editors and Affiliations

TOBB ETÜ Computer Engineering Department, Sögütözü Cad.No:43, Sögütözü, 06560, Ankara, Turkey
Husrev Taha Sencar
Computing, Information Systems and Mathematics, Kingston University London, Penrhyn Road, KT1 2EE, Surrey, UK
Sergio Velastin
Department of Informatics, Aristotle University of Thessaloniki, Box 451, GR-54124, Thessaloniki, Greece
Nikolaos Nikolaidis
France Telecom R&D (Orange Labs) Beijing, Raycom Infotech Park C, 2 Science Institute South Road, Haidian District, 100080, Beijing, China
Shiguo Lian

Cite this chapter

Taj, M., Cavallaro, A. (2010). Recognizing Interactions in Video. In: Sencar, H.T., Velastin, S., Nikolaidis, N., Lian, S. (eds) Intelligent Multimedia Analysis for Security Applications. Studies in Computational Intelligence, vol 282. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11756-5_2

DOI: https://doi.org/10.1007/978-3-642-11756-5_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11754-1
Online ISBN: 978-3-642-11756-5
eBook Packages: Engineering, Engineering (R0)
