Part of the book series: Studies in Computational Intelligence (SCI, volume 83)

A speech quality measure is a valuable assessment tool for the development of speech coding and enhancement techniques. Two approaches are commonly used to measure speech quality: subjective and objective. Subjective measures are based on perceptual ratings by a group of listeners, while objective metrics assess speech quality from extracted physical parameters. Objective metrics that correlate well with subjective ratings are attractive because they are less expensive to administer and give more consistent results. In this work, we investigate a novel non-intrusive speech quality metric based on adaptive neuro-fuzzy network techniques. In the proposed method, a first-order Sugeno-type fuzzy inference system (FIS) is applied to estimate speech quality objectively. The features required by the method are extracted from the perceptual spectral density distribution of the input speech using co-occurrence matrix analysis. The performance of the proposed method is demonstrated through comparisons with the state-of-the-art non-intrusive quality evaluation standard, ITU-T P.563.
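To make the two building blocks named in the abstract concrete, the following is a minimal Python sketch of (a) a co-occurrence matrix computed over a quantized two-dimensional grid and (b) first-order Sugeno inference over a feature vector. It is an illustration under stated assumptions, not the chapter's implementation: the Gaussian membership functions, the product firing strength, the choice of energy and contrast as texture descriptors, and every numeric parameter are hypothetical. In an adaptive neuro-fuzzy setup the rule parameters would be learned from data rated by listeners rather than set by hand.

```python
import math


def cooccurrence_matrix(grid, n_levels, offset=(0, 1)):
    """Normalized counts of level pairs (i, j) separated by `offset` (rows, cols).

    `grid` is a list of equal-length rows of integer levels in [0, n_levels).
    Assumes at least one valid pair exists for the given offset.
    """
    dr, dc = offset
    counts = [[0] * n_levels for _ in range(n_levels)]
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[grid[r][c]][grid[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Two Haralick-style descriptors (an assumed feature set): energy, contrast."""
    n = len(p)
    energy = sum(p[i][j] ** 2 for i in range(n) for j in range(n))
    contrast = sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))
    return [energy, contrast]


def gaussian_mf(x, center, sigma):
    """Gaussian membership function (an assumed shape)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_first_order(x, rules):
    """First-order Sugeno inference.

    Each rule is (centers, sigmas, coeffs). The firing strength is the product
    of per-input memberships; the consequent is linear in the inputs, with the
    last entry of `coeffs` as the constant term. The output is the
    firing-strength-weighted average of the rule consequents.
    """
    num = den = 0.0
    for centers, sigmas, coeffs in rules:
        w = 1.0
        for xi, c, s in zip(x, centers, sigmas):
            w *= gaussian_mf(xi, c, s)
        y = sum(a * xi for a, xi in zip(coeffs, x)) + coeffs[-1]
        num += w * y
        den += w
    return num / den


# Hypothetical usage: a toy 2-level grid and two hand-picked rules.
grid = [[0, 0, 1],
        [1, 1, 0]]
features = texture_features(cooccurrence_matrix(grid, n_levels=2))
rules = [
    ([0.2, 0.4], [0.3, 0.3], [1.0, -0.5, 2.0]),
    ([0.8, 0.9], [0.3, 0.3], [0.5, 0.2, 4.0]),
]
score = sugeno_first_order(features, rules)
```

The weighted-average defuzzification is what makes the Sugeno form convenient for learning: with the membership parameters fixed, the output is linear in the consequent coefficients, so they can be fitted by least squares.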


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Chen, G., Parsa, V. (2008). Objective Speech Quality Evaluation Using an Adaptive Neuro-Fuzzy Network. In: Prasad, B., Prasanna, S.R.M. (eds) Speech, Audio, Image and Biomedical Signal Processing using Neural Networks. Studies in Computational Intelligence, vol 83. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75398-8_5

  • DOI: https://doi.org/10.1007/978-3-540-75398-8_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75397-1

  • Online ISBN: 978-3-540-75398-8

  • eBook Packages: Engineering, Engineering (R0)
