Abstract
Continuous-density hidden Markov models (HMMs) are a popular approach to modeling sequential data, e.g. in automatic speech recognition (ASR), off-line handwritten text recognition, and bioinformatics. HMMs rely on strong statistical assumptions, notably the arbitrary parametric assumption on the form of the emission probability density functions (pdfs). This chapter proposes a nonparametric HMM based on connectionist estimates of the emission pdfs, featuring a global gradient-ascent training algorithm over the maximum-likelihood criterion. Robustness to noise may be further increased by a soft parameter-grouping technique, namely the introduction of adaptive amplitudes for the activation functions. Applications to ASR tasks are presented and analyzed, evaluating the behavior of the proposed paradigm and allowing for a comparison with standard HMMs with Gaussian mixtures, as well as with other state-of-the-art neural net/HMM hybrids.
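The core idea — neural estimates of the emission pdfs trained by global gradient ascent on the sequence likelihood — can be sketched in miniature. The toy below is an assumption-laden illustration, not the chapter's method: it uses a fixed 2-state HMM, reduces the "connectionist" emission model to one log-linear unit per state (whose exponential output is a positive score, not a properly normalized pdf as in the chapter), and replaces backpropagation through the forward recursion with finite-difference gradients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-state HMM with fixed transition matrix A and initial distribution pi
# (illustrative values, not from the chapter).
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])

# "Connectionist" emission model, reduced to one log-linear unit per state:
# b_i(x) = exp(w_i * x + c_i). The exponential keeps the score positive;
# unlike the chapter's networks, this sketch does not enforce that b_i
# is a normalized density.
params = rng.normal(scale=0.1, size=(2, 2))  # rows hold (w_i, c_i)

def emissions(x, p):
    """Emission scores b_i(x) for both states at scalar observation x."""
    return np.exp(p[:, 0] * x + p[:, 1])

def log_likelihood(obs, p):
    """Sequence log-likelihood via the forward algorithm, rescaling the
    forward variables alpha at each step for numerical stability."""
    alpha = pi * emissions(obs[0], p)
    ll = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for x in obs[1:]:
        alpha = (alpha @ A) * emissions(x, p)
        ll += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return ll

def ml_gradient_step(obs, p, lr=1e-3, eps=1e-5):
    """One global gradient-ascent step on the maximum-likelihood criterion,
    using finite differences in place of backpropagation."""
    grad = np.zeros_like(p)
    for idx in np.ndindex(*p.shape):
        bumped = p.copy()
        bumped[idx] += eps
        grad[idx] = (log_likelihood(obs, bumped) - log_likelihood(obs, p)) / eps
    return p + lr * grad

obs = rng.normal(size=20)             # synthetic observation sequence
before = log_likelihood(obs, params)  # likelihood under initial weights
params = ml_gradient_step(obs, params)
after = log_likelihood(obs, params)   # likelihood after one ascent step
```

A single small step along the likelihood gradient should increase the sequence log-likelihood. In the actual hybrid, the gradient of the forward likelihood is propagated analytically back through the emission networks, so all network weights and the HMM are optimized jointly under the same criterion.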
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Trentin, E. (2003). Nonparametric Hidden Markov Models: Principles and Applications to Speech Recognition. In: Apolloni, B., Marinaro, M., Tagliaferri, R. (eds) Neural Nets. WIRN 2003. Lecture Notes in Computer Science, vol 2859. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45216-4_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20227-1
Online ISBN: 978-3-540-45216-4