Automated Alignment and Annotation of Audio-Visual Presentations

Jones, Gareth J. F.; Edens, Richard J.

doi:10.1007/3-540-45747-X_21


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2458)

Included in the following conference series:

International Conference on Theory and Practice of Digital Libraries


Abstract

Recordings of audio-visual presentations are a potentially valuable component of digital libraries. These recordings can be archived to enable remote access to audio presentations including lectures and seminars. Recordings of presentations often contain multiple information streams involving visual and audio data. If the full benefit of these recordings is to be realised, these multiple media streams must be properly integrated to enable rapid navigation. This paper describes the application of information retrieval techniques within a system to automatically synchronise an audio soundtrack with electronic slides from a presentation. A novel component of the system is the detection of sections of the presentation unsupported by prepared slides, such as discussion and question answering, and automatic development of keypoint slides for these elements of the presentation.
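
The abstract does not say which retrieval model the system uses, so the following is only an illustrative sketch of the general idea: treat each slide as a document, treat each transcribed segment of the soundtrack as a query, and assign the segment to the best-matching slide. TF-IDF weighting with cosine similarity is assumed here as a stand-in, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `align`) and the `threshold` parameter are hypothetical. Segments whose best score falls below the threshold are marked as unsupported by any slide, which loosely mirrors the detection of discussion and question sections.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document."""
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for counts in counts_per_doc for term in counts)
    # Smoothed IDF (an assumption), so every term keeps a positive weight.
    idf = {t: math.log((n + 1) / (df[t] + 1)) + 1.0 for t in df}
    return [{t: tf * idf[t] for t, tf in counts.items()}
            for counts in counts_per_doc]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def align(segments, slides, threshold=0.1):
    """Map each transcript segment to a slide index, or None if unsupported.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    if not slides:
        return [None] * len(segments)
    vectors = tfidf_vectors(list(slides) + list(segments))
    slide_vecs = vectors[:len(slides)]
    segment_vecs = vectors[len(slides):]
    alignment = []
    for seg in segment_vecs:
        scores = [cosine(seg, slide) for slide in slide_vecs]
        best = max(range(len(slides)), key=scores.__getitem__)
        alignment.append(best if scores[best] >= threshold else None)
    return alignment


slides = [
    "Speech recognition and acoustic models",
    "Information retrieval term weighting",
]
segments = [
    "today we discuss acoustic models for speech recognition",
    "term weighting is central to information retrieval",
    "does anyone have questions about the exam",
]
print(align(segments, slides))  # [0, 1, None]
```

In this toy input the third segment shares no terms with either slide, so it is flagged as unsupported. A real system would work from noisy speech recognition output and would likely need stemming, stop-word handling, and temporal smoothing, none of which are shown here.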



Author information

Authors and Affiliations

Department of Computer Science, University of Exeter, EX4 4PT, Exeter, UK
Gareth J. F. Jones & Richard J. Edens


Editor information

Editors and Affiliations

Department of Information Engineering, University of Padua, Via Gradenigo 6/a, 35131, Padova, Italy
Maristella Agosti
Istituto di Scienza e Tecnologie dell’ Informazione (ISTI-CNR), Area della Ricerca CNR di Pisa, Via G. Moruzzi 1, 56124, Pisa, Italy
Costantino Thanos


About this paper

Cite this paper

Jones, G.J.F., Edens, R.J. (2002). Automated Alignment and Annotation of Audio-Visual Presentations. In: Agosti, M., Thanos, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2002. Lecture Notes in Computer Science, vol 2458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45747-X_21


DOI: https://doi.org/10.1007/3-540-45747-X_21
Published: 13 September 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44178-6
Online ISBN: 978-3-540-45747-3
eBook Packages: Springer Book Archive
