Developing HMM-Based Recognizers with ESMERALDA

Fink, Gernot A.

doi:10.1007/3-540-48239-3_42

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1692)

Included in the following conference series:

International Workshop on Text, Speech and Dialogue

Abstract

ESMERALDA is an integrated environment for developing speech recognition systems. It provides a powerful selection of methods for building statistical models, together with an efficient incremental recognizer. This paper describes the approaches adopted for estimating mixture densities, Hidden Markov Models, and n-gram language models, as well as the algorithms applied during recognition. Evaluation results on a speaker-independent spontaneous speech recognition task demonstrate the capabilities of ESMERALDA.

Author information

Authors and Affiliations

Faculty of Technology, University of Bielefeld, P.O. Box 100131, 33501, Bielefeld, Germany
Gernot A. Fink

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Faculty of Applied Sciences, University of West Bohemia in Plzeň, Universitní 22, 306 14, Plzeň, Czech Republic
Václav Matousek, Pavel Mautner & Jana Ocelíková
Department of Programming Systems and Communication, Faculty of Informatics, Masaryk University Brno, Botanická 68a, 602 00, Brno, Czech Republic
Petr Sojka

About this paper

Cite this paper

Fink, G.A. (1999). Developing HMM-Based Recognizers with ESMERALDA. In: Matousek, V., Mautner, P., Ocelíková, J., Sojka, P. (eds) Text, Speech and Dialogue. TSD 1999. Lecture Notes in Computer Science, vol 1692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48239-3_42

DOI: https://doi.org/10.1007/3-540-48239-3_42
Published: 01 October 1999
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66494-9
Online ISBN: 978-3-540-48239-0
eBook Packages: Springer Book Archive
