Abstract
This paper describes the design and implementation of a system for efficient storage, indexing, and search in collections of spoken documents that takes advantage of automatic speech recognition. Because the quality of current speech recognizers is not sufficient for many applications, it is necessary to index the ambiguous output of the recognizer, i.e., the acyclic graphs of word hypotheses known as recognition lattices. Standard methods known from text-based systems therefore cannot be applied directly. The paper discusses an optimized indexing system, developed by our group, for efficient search in this large and complex data structure. The search engine works as a server. The meeting browser JFerret, developed within the European AMI project, is used as a client to browse search results.
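The idea of indexing word hypotheses rather than a single transcript can be illustrated with a minimal sketch. It is not the system described in the paper: the names `Hypothesis`, `Hit`, `build_index`, and `search`, the flat list-of-edges lattice representation, and the confidence scores in [0, 1] are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Hypothesis(NamedTuple):
    """One lattice edge: a word hypothesis with timing and a confidence score."""
    word: str
    start: float       # seconds from the start of the recording
    end: float         # seconds from the start of the recording
    confidence: float  # assumed to lie in [0, 1]


class Hit(NamedTuple):
    """One index entry: where a word hypothesis occurs."""
    doc_id: str
    start: float
    end: float
    confidence: float


def build_index(lattices):
    """Build an inverted index: word -> hits sorted by descending confidence."""
    index = defaultdict(list)
    for doc_id, edges in lattices.items():
        for h in edges:
            index[h.word.lower()].append(Hit(doc_id, h.start, h.end, h.confidence))
    for hits in index.values():
        hits.sort(key=lambda hit: hit.confidence, reverse=True)
    return dict(index)


def search(index, word, min_confidence=0.0):
    """Return hits for a single word whose confidence meets the threshold."""
    return [hit for hit in index.get(word.lower(), [])
            if hit.confidence >= min_confidence]


# Hypothetical toy data: two competing hypotheses cover the same time span.
lattices = {
    "meeting-1": [
        Hypothesis("speech", 0.0, 0.4, 0.9),
        Hypothesis("beach", 0.0, 0.4, 0.1),
        Hypothesis("recognition", 0.4, 1.1, 0.8),
    ],
    "meeting-2": [Hypothesis("speech", 3.2, 3.6, 0.6)],
}
index = build_index(lattices)
results = search(index, "speech", min_confidence=0.5)
```

Because every competing hypothesis is indexed, a word that a one-best transcript would miss can still be found, and the confidence threshold lets the caller trade recall against precision.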
This work was partly supported by the European project AMI (Augmented Multi-party Interaction, FP6-506811) and by the Grant Agency of the Czech Republic under project No. 102/05/0278. Pavel Smrž was supported by MŠMT Research Plan MSM 6383917201. The hardware used in this work was partially provided by CESNET under project No. 119/2004.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fapšo, M. et al. (2006). Information Retrieval from Spoken Documents. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2006. Lecture Notes in Computer Science, vol 3878. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11671299_43
DOI: https://doi.org/10.1007/11671299_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32205-4
Online ISBN: 978-3-540-32206-1
eBook Packages: Computer Science (R0)