Automatic Recognition System for Dysarthric Speech Based on MFCC’s, PNCC’s, JITTER and SHIMMER Coefficients

  • Conference paper
Advances in Computer Vision (CVC 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 944)

Abstract

The aim of this work is to improve the automatic recognition of dysarthric speech. To this end, we compare two speech parameterization techniques: the recently proposed Power Normalized Cepstral Coefficients (PNCCs) and the Mel-Frequency Cepstral Coefficients (MFCCs). We concatenate several variants of jitter and shimmer with each parameterization to improve an automatic recognition system for dysarthric words. The goal is to assist vulnerable people who have speech problems (dysarthric voice) and to help the doctor make a first diagnosis of the patient’s disease. For this purpose, an automatic recognition system for continuous pathological speech has been developed, based on Hidden Markov Models (HMMs) and the Hidden Markov Model Toolkit (HTK). For our tests, we used the Nemours database, which contains 11 speakers with dysarthric voices.
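
As a rough illustration of the concatenation described in the abstract, the sketch below computes one common variant each of jitter and shimmer and appends them to a matrix of cepstral features. The abstract does not say which variants the authors use or how they are aligned with the frames, so everything here is an assumption: the "local" definitions (mean absolute difference between consecutive cycles, divided by the mean), the hypothetical function names, and the choice to repeat utterance-level values on every frame. The pitch periods, peak amplitudes, and cepstral matrix are taken as given inputs; extracting them is outside the sketch.

```python
import numpy as np


def local_jitter(periods):
    """Mean absolute difference of consecutive pitch periods, relative to the mean period."""
    periods = np.asarray(periods, dtype=float)
    if periods.size < 2:
        return 0.0  # not enough cycles to measure perturbation
    return float(np.mean(np.abs(np.diff(periods))) / np.mean(periods))


def local_shimmer(amplitudes):
    """Mean absolute difference of consecutive peak amplitudes, relative to the mean amplitude."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    if amplitudes.size < 2:
        return 0.0
    return float(np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes))


def append_voice_quality(cepstra, periods, amplitudes):
    """Append jitter and shimmer as two extra columns of a (frames x coefficients) matrix.

    Assumption: one utterance-level value of each measure is repeated on every frame.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    extra = np.array([local_jitter(periods), local_shimmer(amplitudes)])
    return np.hstack([cepstra, np.tile(extra, (cepstra.shape[0], 1))])


# Illustrative values only: four pitch periods (seconds) and four peak amplitudes.
example_periods = [0.010, 0.011, 0.010, 0.011]
example_amplitudes = [1.0, 1.0, 1.0, 1.0]
example_cepstra = np.zeros((5, 13))  # stands in for 5 frames of 13 MFCCs or PNCCs
features = append_voice_quality(example_cepstra, example_periods, example_amplitudes)
```

With these inputs each frame grows from 13 to 15 coefficients; the same function applies unchanged whether the base features are MFCCs or PNCCs.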

Author information

Corresponding author

Correspondence to Brahim-Fares Zaidi.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zaidi, BF., Boudraa, M., Selouani, SA., Addou, D., Yakoub, M.S. (2020). Automatic Recognition System for Dysarthric Speech Based on MFCC’s, PNCC’s, JITTER and SHIMMER Coefficients. In: Arai, K., Kapoor, S. (eds) Advances in Computer Vision. CVC 2019. Advances in Intelligent Systems and Computing, vol 944. Springer, Cham. https://doi.org/10.1007/978-3-030-17798-0_40
