Abstract
ESMERALDA is an integrated environment for the development of speech recognition systems. It provides a powerful selection of methods for building statistical models together with an efficient incremental recognizer. This paper describes the approaches adopted for estimating mixture densities, Hidden Markov Models, and n-gram language models, as well as the algorithms applied during recognition. Evaluation results on a speaker-independent spontaneous speech recognition task demonstrate the capabilities of ESMERALDA.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fink, G.A. (1999). Developing HMM-Based Recognizers with ESMERALDA. In: Matousek, V., Mautner, P., Ocelíková, J., Sojka, P. (eds) Text, Speech and Dialogue. TSD 1999. Lecture Notes in Computer Science, vol 1692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48239-3_42
DOI: https://doi.org/10.1007/3-540-48239-3_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66494-9
Online ISBN: 978-3-540-48239-0
eBook Packages: Springer Book Archive