WFT – Context-Sensitive Speech Signal Representation

  • Conference paper
Intelligent Information Processing and Web Mining

Part of the book series: Advances in Soft Computing ((AINSC,volume 35))

Abstract

Progress in the development of automatic speech recognition (ASR) systems comes, inter alia, from signal representations that are sensitive to increasingly sophisticated features. This paper gives an overview of our investigation of a new context-sensitive speech signal representation based on the wavelet-Fourier transform (WFT) and proposes measures of its quality. The paper is divided into five sections, which introduce, in order: phonetic-acoustic contextuality in speech, the basics of the WFT, the WFT speech signal feature space, feature space quality measures, and finally the conclusions drawn from our results.

Copyright information

© 2006 Springer

About this paper

Cite this paper

Gałka, J., Kępiński, M. (2006). WFT – Context-Sensitive Speech Signal Representation. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 35. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33521-8_10

  • DOI: https://doi.org/10.1007/3-540-33521-8_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33520-7

  • Online ISBN: 978-3-540-33521-4

  • eBook Packages: Engineering, Engineering (R0)
