
Part of the book series: Studies in Computational Intelligence (SCI, volume 282)

Summary

Detection and tracking algorithms generate useful information in the form of trajectories, from which the behaviors and interactions of moving objects can be inferred through the analysis of spatio-temporal features. Interactions occur either between a dynamic and a static object, or between multiple dynamic objects. This chapter presents an interaction modeling framework formulated as a state sequence estimation problem using time-series analysis. Bayesian network-based methods and their variants are studied for the analysis of interactions in videos. Techniques such as the Coupled Hidden Markov Model are also discussed for more complex interactions, such as those between multiple dynamic objects. Finally, the interaction modeling is demonstrated on real surveillance and sports sequences.
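As a rough illustration of what "state sequence estimation" can mean for trajectory data, the sketch below runs Viterbi decoding on a two-state hidden Markov model. It is not the chapter's method: the state names (`approach`, `together`), the quantized distance observations (`far`, `near`), and all probabilities are hypothetical values chosen only for this example.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely hidden state sequence and its log-probability."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t-1][s]: best predecessor of state s at time t
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            pointers[s] = prev
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
        best.append(scores)
        back.append(pointers)
    # Backtrack from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Hypothetical toy model: two objects either approach each other or stay together.
states = ["approach", "together"]
log_start = logs({"approach": 0.8, "together": 0.2})
log_trans = {
    "approach": logs({"approach": 0.7, "together": 0.3}),
    "together": logs({"approach": 0.1, "together": 0.9}),
}
log_emit = {
    "approach": logs({"far": 0.8, "near": 0.2}),
    "together": logs({"far": 0.1, "near": 0.9}),
}

# Observations: inter-object distance per frame, quantized to two symbols.
observations = ["far", "far", "near", "near"]
path, log_prob = viterbi(observations, states, log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

A coupled model for several dynamic objects would keep one such state chain per object and let each chain's transitions depend on the other chains' previous states. This single-chain sketch omits that coupling.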




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Taj, M., Cavallaro, A. (2010). Recognizing Interactions in Video. In: Sencar, H.T., Velastin, S., Nikolaidis, N., Lian, S. (eds) Intelligent Multimedia Analysis for Security Applications. Studies in Computational Intelligence, vol 282. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11756-5_2


  • DOI: https://doi.org/10.1007/978-3-642-11756-5_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11754-1

  • Online ISBN: 978-3-642-11756-5

  • eBook Packages: Engineering, Engineering (R0)
