New Sub-band Processing Framework Using Non-linear Predictive Models for Speech Feature Extraction

  • Conference paper
Nonlinear Analyses and Algorithms for Speech Processing (NOLISP 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3817)

Abstract

Speech feature extraction methods are commonly based on time and frequency processing approaches. In this paper, we propose a new framework based on sub-band processing and non-linear prediction. The key idea is to pre-process the speech signal with a filter bank; non-linear predictors are then computed from the resulting sub-band signals. The feature extraction method combines several Neural Predictive Coding (NPC) models. We apply this framework to phoneme classification, and experiments carried out on the NTIMIT database show improved classification rates compared with the full-band approach. The new method is also shown to outperform the traditional Linear Predictive Coding (LPC), Mel-Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) methods.
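The pipeline outlined in the abstract (filter bank first, then one non-linear predictor per sub-band) can be illustrated with a minimal sketch. It is not the authors' NPC model. The two-band moving-average split, the tiny tanh network, the use of output-layer weights as features, and all names and parameter values (`two_band_split`, `train_predictor`, `subband_features`, `order`, `hidden`, `lr`) are assumptions made only for illustration.

```python
import math
import random


def moving_average(x, width):
    """Crude causal low-pass filter: mean of the last `width` samples."""
    out, acc = [], 0.0
    for n, v in enumerate(x):
        acc += v
        if n >= width:
            acc -= x[n - width]
        out.append(acc / min(n + 1, width))
    return out


def two_band_split(x, width=4):
    """Hypothetical 2-band filter bank: a low band and its complement."""
    low = moving_average(x, width)
    high = [a - b for a, b in zip(x, low)]
    return [low, high]


def train_predictor(band, order=4, hidden=3, epochs=50, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh network predicting band[n] from the
    previous `order` samples with plain online gradient descent.
    Returns the output-layer weights and the last epoch's running MSE."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(order)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    frames = [(band[n - order:n], band[n]) for n in range(order, len(band))]
    mse = 0.0
    for _ in range(epochs):
        total = 0.0
        for past, target in frames:
            h = [math.tanh(sum(w * p for w, p in zip(row, past))) for row in w1]
            err = sum(a * b for a, b in zip(w2, h)) - target
            total += err * err
            for j in range(hidden):
                # Gradient through tanh, taken before w2[j] is updated.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(order):
                    w1[j][i] -= lr * grad_h * past[i]
        mse = total / len(frames)
    return w2, mse


def subband_features(x):
    """Concatenate the per-band predictor weights into one feature vector
    (an assumption of this sketch, not a detail taken from the paper)."""
    feats = []
    for band in two_band_split(x):
        w2, _ = train_predictor(band)
        feats.extend(w2)
    return feats


# Synthetic frame standing in for a speech segment.
signal = [math.sin(0.3 * n) + 0.5 * math.sin(2.5 * n) for n in range(200)]
features = subband_features(signal)
print(len(features))  # 2 bands x 3 hidden units = 6 values
```

A real system would presumably use a perceptually motivated filter bank and a properly trained predictor per band; the sketch only shows how sub-band signals and per-band predictors fit together to form a feature vector.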

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chetouani, M., Hussain, A., Gas, B., Zarader, JL. (2006). New Sub-band Processing Framework Using Non-linear Predictive Models for Speech Feature Extraction. In: Faundez-Zanuy, M., Janer, L., Esposito, A., Satue-Villar, A., Roure, J., Espinosa-Duro, V. (eds) Nonlinear Analyses and Algorithms for Speech Processing. NOLISP 2005. Lecture Notes in Computer Science, vol 3817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11613107_25

  • DOI: https://doi.org/10.1007/11613107_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31257-4

  • Online ISBN: 978-3-540-32586-4

  • eBook Packages: Computer Science, Computer Science (R0)
