Abstract
In this chapter, we provide an overview of methods for speech quality assessment. First, we define the term speech quality and, in Sect. 5.1, outline the main causes of its degradation. Then, we discuss subjective test methods for quality assessment, with a focus on standardized methods. Section 5.3 is dedicated to objective algorithms for quality assessment. We conclude the chapter with a reference table that lists common quality assessment scenarios and the most suitable assessment methods for each.
Abbreviations
- ACR: absolute category rating
- ANOVA: analysis of variance
- BSD: bark spectral distortion
- CCR: comparison category rating
- CELP: code-excited linear prediction
- CF: coherence function
- CMOS: comparison mean opinion score
- DAM: diagnostic acceptability measure
- DCR: degradation category rating
- DMOS: degradation mean opinion score
- DRT: diagnostic rhyme test
- GMM: Gaussian mixture model
- HMM: hidden Markov models
- HSD: honestly significant difference
- II: information index
- IS: Itakura-Saito
- ITU: International Telecommunication Union
- LAR: log-area-ratio
- LL: log-likelihood
- MNB: measuring normalizing blocks
- MNRU: modulated noise reference unit
- MOS: mean opinion score
- MPI: minimal pairs intelligibility
- MRT: modified rhyme test
- MSD: minimum significant difference
- MUSHRA: multi stimulus test with hidden reference and anchor
- NN: neural network
- PEAQ: perceptual quality assessment for digital audio
- PESQ: perceptual evaluation of speech quality
- PLP: perceptual linear prediction
- PSQM: perceptual speech quality measure
- QoS: quality-of-service
- RMSE: root-mean-square error
- SD: spectral distortion
- SNR: signal-to-noise ratio
- SegSNR: segmental SNR
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Grancharov, V., Kleijn, W. (2008). Speech Quality Assessment. In: Benesty, J., Sondhi, M.M., Huang, Y.A. (eds) Springer Handbook of Speech Processing. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49127-9_5
DOI: https://doi.org/10.1007/978-3-540-49127-9_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49125-5
Online ISBN: 978-3-540-49127-9
eBook Packages: Engineering, Engineering (R0)