Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2705)

Abstract

This chapter describes the indexing, search, and retrieval of various combinations of audio, video, text, and image media, and the automated content processing that makes them possible. The intent is to provide a framework for data analysis in multimedia digital libraries. The chapter is organized as follows. The introduction briefly distinguishes digital from traditional libraries and touches on the issues specific to searching the content of multimedia libraries. The second section introduces the Informedia Digital Video Library as an example of a multimedia library, including a quick tour of its functionality. The third section discusses the processing of audio and image information as it relates to a multimedia library. Section four illustrates the interplay between audio and video information, using a video information retrieval experiment as an example. Section five discusses the exporting and sharing of metadata in a digital library using MPEG-7. Finally, section six presents one vision of a future digital library, in which all personal memory can be recorded and accessed.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hauptmann, A., Jin, R., Wactlar, H. (2003). Data Analysis for a Multimedia Library. In: Renals, S., Grefenstette, G. (eds) Text- and Speech-Triggered Information Access. Lecture Notes in Computer Science, vol 2705. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45115-0_2

  • DOI: https://doi.org/10.1007/978-3-540-45115-0_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40635-8

  • Online ISBN: 978-3-540-45115-0

  • eBook Packages: Springer Book Archive
