Influence of Reverberation on Automatic Evaluation of Intelligibility with Prosodic Features

  • Conference paper
Text, Speech, and Dialogue (TSD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9924)

Abstract

Objective analysis of intelligibility using a speech recognizer and prosodic features has previously been performed for close-talking recordings. This study examined whether the same approach also works for reverberated speech. To ensure that only the room acoustics differed, artificial reverberation was applied. Eighty-two patients who had undergone partial laryngectomy read a standardized text, and five experienced raters assessed intelligibility perceptually on a 5-point scale. The best feature subset, determined by Support Vector Regression, consists of the word correctness of a speech recognizer, the average duration of silent pauses, the standard deviation of \(F_0\) over the entire sample, the standard deviation of jitter, and the ratio of the duration of the voiced sections to the duration of the entire recording. A human-machine correlation of r = 0.80 was achieved for the close-talking recordings and r = 0.72 for the worst case among the examined signal qualities. With three additional features, r = 0.80 was also reached for the reverberated scenario.
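
As an illustration of the human-machine correlation reported above, the following minimal sketch averages several raters' scores per speaker and correlates the result with automatically predicted scores. It assumes that r denotes the Pearson correlation coefficient, which the abstract does not state explicitly. The function names `mean_rater_score` and `pearson_r` and all numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean


def mean_rater_score(ratings_per_speaker):
    """Average the perceptual scores of all raters for each speaker."""
    return [mean(ratings) for ratings in ratings_per_speaker]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up example: four speakers, five raters each, 5-point scale.
ratings = [
    [1, 2, 1, 1, 2],
    [3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4],
    [2, 2, 3, 2, 2],
]
# Made-up regression outputs standing in for the automatic evaluation.
machine = [1.6, 3.0, 4.4, 2.5]

human = mean_rater_score(ratings)
r = pearson_r(human, machine)
print(f"human-machine correlation r = {r:.2f}")
```

In a real evaluation, the machine scores would come from a regression model applied to speakers that were not used for training, for example in a cross-validation setup; that part is omitted here.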

Acknowledgments

We would like to thank Dr. Wolfgang Herbordt for his kind support with the software and data for artificial reverberation. Dr. Döllinger’s contribution was supported by Deutsche Krebshilfe grant no. 111332.

Author information

Corresponding author

Correspondence to Tino Haderlein.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Haderlein, T., Döllinger, M., Schützenberger, A., Nöth, E. (2016). Influence of Reverberation on Automatic Evaluation of Intelligibility with Prosodic Features. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science, vol 9924. Springer, Cham. https://doi.org/10.1007/978-3-319-45510-5_53

  • DOI: https://doi.org/10.1007/978-3-319-45510-5_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45509-9

  • Online ISBN: 978-3-319-45510-5

  • eBook Packages: Computer Science, Computer Science (R0)
