Speaker-Adaptive Speech Recognition Based on Surface Electromyography

Wand, Michael; Schultz, Tanja

doi:10.1007/978-3-642-11721-3_21

Part of the book series: Communications in Computer and Information Science (CCIS, volume 52)

Included in the following conference series:

International Joint Conference on Biomedical Engineering Systems and Technologies

Abstract

We present our recent advances in silent speech interfaces that use electromyographic signals, which capture the movements of the human articulatory muscles at the skin surface, to recognize continuously spoken speech. Previous systems were limited to speaker- and session-dependent recognition tasks on small amounts of training and test data. In this article we present speaker-independent and speaker-adaptive training methods that allow us to use a large corpus of data from many speakers to train acoustic models more reliably. We use the speaker-dependent system as a baseline, carefully tuning its data preprocessing and acoustic modeling. On our corpus, we then compare the performance of speaker-dependent and speaker-independent acoustic models and carry out model adaptation experiments.

Author information

Authors and Affiliations

Universität Karlsruhe (TH), Germany
Michael Wand & Tanja Schultz

Editor information

Editors and Affiliations

Instituto de Telecomunicações, IST - Instituto Superior Técnico, Av. Rovisco Pais, 1, 1049-001, Lisbon, Portugal
Ana Fred
Department of Systems and Informatics, Polytechnic Institute of Setúbal – INSTICC, Rua do Vale de Chaves - Estefanilha, 2910-761, Setúbal, Portugal
Joaquim Filipe
Institute of Telecommunications, Av. Rovisco Pais, 1, 1049-001, Lisboa, Portugal
Hugo Gamboa

About this paper

Cite this paper

Wand, M., Schultz, T. (2010). Speaker-Adaptive Speech Recognition Based on Surface Electromyography. In: Fred, A., Filipe, J., Gamboa, H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2009. Communications in Computer and Information Science, vol 52. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11721-3_21

DOI: https://doi.org/10.1007/978-3-642-11721-3_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11720-6
Online ISBN: 978-3-642-11721-3
eBook Packages: Computer Science, Computer Science (R0)
