Objective Speech Quality Evaluation Using an Adaptive Neuro-Fuzzy Network

Chen, Guo; Parsa, Vijay

doi:10.1007/978-3-540-75398-8_5

Part of the book series: Studies in Computational Intelligence (SCI, volume 83)

A speech quality measure is a valuable assessment tool for the development of speech coding and enhancement techniques. Speech quality is commonly measured in one of two ways: subjectively or objectively. Subjective measures are based on perceptual ratings given by a group of listeners, while objective metrics assess speech quality from extracted physical parameters. Objective metrics that correlate well with subjective ratings are attractive because they are less expensive to administer and give more consistent results. In this work, we investigated a novel non-intrusive speech quality metric based on adaptive neuro-fuzzy network techniques. In the proposed method, a first-order Sugeno-type fuzzy inference system (FIS) objectively estimates the speech quality. The features required by the method are extracted from the perceptual spectral density distribution of the input speech using co-occurrence matrix analysis. The performance of the proposed method was demonstrated through comparisons with ITU-T P.563, the state-of-the-art non-intrusive quality evaluation standard.
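
The two building blocks named in the abstract, co-occurrence matrix analysis and first-order Sugeno inference, can be illustrated with a minimal sketch. It is not the authors' implementation. The small integer grid stands in for a quantized perceptual spectral density distribution, the two texture features (energy and contrast), the Gaussian membership functions, and every rule parameter are assumptions chosen for illustration, and the ANFIS training step that would fit those parameters is omitted.

```python
import math


def cooccurrence_matrix(levels, n_levels, offset=(0, 1)):
    """Normalized co-occurrence matrix of a 2-D grid of integer levels.

    Entry [i][j] is the fraction of cell pairs, separated by `offset`
    (row step, column step), whose levels are i and then j.
    """
    dr, dc = offset
    rows, cols = len(levels), len(levels[0])
    counts = [[0.0] * n_levels for _ in range(n_levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[levels[r][c]][levels[r2][c2]] += 1.0
                total += 1
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Two illustrative features of a normalized co-occurrence matrix."""
    energy = sum(v * v for row in p for v in row)
    contrast = sum(
        (i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row)
    )
    return [energy, contrast]


def gaussian_mf(x, center, sigma):
    """Gaussian membership function (an assumed shape)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_first_order(x, rules):
    """First-order Sugeno inference.

    Each rule is (centers, sigmas, coeffs) with coeffs = [a_1, ..., a_d, bias].
    The rule output is the linear function a . x + bias, and the system
    output is the firing-strength-weighted average of the rule outputs.
    """
    num = 0.0
    den = 0.0
    for centers, sigmas, coeffs in rules:
        firing = 1.0  # product of the per-input membership degrees
        for xi, c, s in zip(x, centers, sigmas):
            firing *= gaussian_mf(xi, c, s)
        consequent = sum(a * xi for a, xi in zip(coeffs, x)) + coeffs[-1]
        num += firing * consequent
        den += firing
    if den == 0.0:
        raise ValueError("no rule fired for this input")
    return num / den


# Usage with hypothetical values: a toy 2-level grid and two made-up rules.
grid = [[0, 0, 1], [1, 1, 0]]
features = texture_features(cooccurrence_matrix(grid, n_levels=2))
rules = [
    ([0.0, 0.0], [1.0, 1.0], [1.0, 0.0, 1.0]),
    ([1.0, 1.0], [1.0, 1.0], [0.0, 2.0, 3.0]),
]
score = sugeno_first_order(features, rules)
```

In a first-order Sugeno system each rule's consequent is a linear function of the inputs rather than a fuzzy set, so the output is a plain weighted average and no defuzzification step is needed. That is also what makes the parameters straightforward to fit with adaptive neuro-fuzzy training.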

Author information

Authors and Affiliations

Department of Electrical & Computer Engineering, University of Western Ontario, London, Ontario, Canada
Guo Chen
Department of Electrical & Computer Engineering and National Centre for Audiology, University of Western Ontario, London, Ontario, Canada
Vijay Parsa

Editor information

Editors and Affiliations

Department of Computer and Information Sciences, Florida A&M University, Tallahassee, FL 32307, USA
Bhanu Prasad
Department of Electronics and Communication Engineering, Indian Institute of Technology Guwahati, Guwahati, India
S. R. Mahadeva Prasanna

Cite this chapter

Chen, G., Parsa, V. (2008). Objective Speech Quality Evaluation Using an Adaptive Neuro-Fuzzy Network. In: Prasad, B., Prasanna, S.R.M. (eds) Speech, Audio, Image and Biomedical Signal Processing using Neural Networks. Studies in Computational Intelligence, vol 83. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75398-8_5

DOI: https://doi.org/10.1007/978-3-540-75398-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75397-1
Online ISBN: 978-3-540-75398-8
eBook Packages: Engineering, Engineering (R0)