Multimodal Content-based Video Retrieval

  • Chapter
Multimedia Retrieval

Part of the book series: Data-Centric Systems and Applications (DCSA)

Abstract

This chapter is a case study showing how important events (highlights) can be automatically detected in video recordings of Formula 1 car racing. Numerous approaches presented in the literature have shown that it is becoming possible to extract interesting events from video. However, the majority of these approaches use individual visual or audio cues. According to the current understanding of human perception, using evidence obtained from different modalities is expected to result in a more robust and accurate perception of video. On the other hand, fusing multimodal evidence is quite challenging, since it has to deal with indications that may contradict each other. In this chapter we deal with three topics, one being the fusion of evidence from different modalities.

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Mihajlović, V., Petković, M., Jonker, W., Blanken, H. (2007). Multimodal Content-based Video Retrieval. In: Blanken, H.M., Blok, H.E., Feng, L., de Vries, A.P. (eds) Multimedia Retrieval. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72895-5_10

  • DOI: https://doi.org/10.1007/978-3-540-72895-5_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72894-8

  • Online ISBN: 978-3-540-72895-5

  • eBook Packages: Computer Science, Computer Science (R0)
