Abstract
We present recent advances in silent speech interfaces that recognize continuously spoken speech from electromyographic signals, which capture the movements of the human articulatory muscles at the skin surface. Previous systems were limited to speaker- and session-dependent recognition tasks with small amounts of training and test data. In this article we present speaker-independent and speaker-adaptive training methods, which allow us to use a large corpus of data from many speakers to train acoustic models more reliably. We use the speaker-dependent system as a baseline, carefully tuning the data preprocessing and acoustic modeling. We then compare the performance of speaker-dependent and speaker-independent acoustic models on our corpus and carry out model adaptation experiments.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wand, M., Schultz, T. (2010). Speaker-Adaptive Speech Recognition Based on Surface Electromyography. In: Fred, A., Filipe, J., Gamboa, H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2009. Communications in Computer and Information Science, vol 52. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11721-3_21
DOI: https://doi.org/10.1007/978-3-642-11721-3_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11720-6
Online ISBN: 978-3-642-11721-3
eBook Packages: Computer Science, Computer Science (R0)