Part of the book series: Springer Handbooks (SHB)

Abstract

In this chapter, we provide an overview of methods for speech quality assessment. First, we define the term speech quality and, in Sect. 5.1, outline the main causes of its degradation. We then discuss subjective test methods for quality assessment, focusing on standardized methods. Section 5.3 is dedicated to objective algorithms for quality assessment. We conclude the chapter with a reference table that lists common quality assessment scenarios and the most suitable assessment methods for each.

Abbreviations

ACR: absolute category rating
ANOVA: analysis of variance
BSD: bark spectral distortion
CCR: comparison category rating
CELP: code-excited linear prediction
CF: coherence function
CMOS: comparison mean opinion score
DAM: diagnostic acceptability measure
DCR: degradation category rating
DMOS: degradation mean opinion score
DRT: diagnostic rhyme test
GMM: Gaussian mixture model
HMM: hidden Markov models
HSD: honestly significant difference
II: information index
IS: Itakura-Saito
ITU: International Telecommunication Union
LAR: log-area-ratio
LL: log-likelihood
MNB: measuring normalizing blocks
MNRU: modulated noise reference unit
MOS: mean opinion score
MPI: minimal pairs intelligibility
MRT: modified rhyme test
MSD: minimum significant difference
MUSHRA: multi stimulus test with hidden reference and anchor
NN: neural network
PEAQ: perceptual quality assessment for digital audio
PESQ: perceptual evaluation of speech quality
PLP: perceptual linear prediction
PSQM: perceptual speech quality measure
QoS: quality-of-service
RMSE: root-mean-square error
SD: spectral distortion
SNR: signal-to-noise ratio
SegSNR: segmental SNR

Corresponding authors

Correspondence to Dr. Volodya Grancharov or Prof. W. Bastiaan Kleijn.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Grancharov, V., Kleijn, W. (2008). Speech Quality Assessment. In: Benesty, J., Sondhi, M.M., Huang, Y.A. (eds) Springer Handbook of Speech Processing. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49127-9_5

  • DOI: https://doi.org/10.1007/978-3-540-49127-9_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49125-5

  • Online ISBN: 978-3-540-49127-9

  • eBook Packages: Engineering, Engineering (R0)
