Analysis of Acoustic Events in Speech Signals Using Bessel Series Expansion

Prakash, Chetana; Gowda, Dhananjaya N.; Gangashetty, Suryakanth V.

doi:10.1007/s00034-013-9596-1

Published: 26 April 2013

Volume 32, pages 2915–2938, (2013)
Circuits, Systems, and Signal Processing

Chetana Prakash¹,
Dhananjaya N. Gowda² &
Suryakanth V. Gangashetty¹

Abstract

In this paper, we propose an approach for the analysis and detection of acoustic events in speech signals using the Bessel series expansion. The acoustic events analyzed are the voice onset time (VOT) and the glottal closure instants (GCIs). The hypothesis is that the Bessel functions with their damped sinusoid-like basis functions are better suited for representing the speech signals than the sinusoidal basis functions used in the conventional Fourier representation. The speech signal is band-pass filtered by choosing the appropriate range of Bessel coefficients to obtain a narrow-band signal, which is decomposed further into amplitude modulated (AM) and frequency modulated (FM) components. The discrete energy separation algorithm (DESA) is used to compute the amplitude envelope (AE) of the narrow-band AM-FM signal. Events such as the consonant and vowel beginnings in an unvoiced stop consonant vowel (SCV) and the GCIs are derived by processing the AE of the signal. The proposed approach for the detection of the VOT using the Bessel expansion is shown to perform better than the conventional Fourier representation. The performance of the proposed GCI detection method using the Bessel series expansion is compared against some of the existing methods for various noise environments and signal-to-noise ratios.
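
The amplitude-envelope step mentioned in the abstract can be illustrated with a minimal sketch of DESA-1, one common variant of the discrete energy separation algorithm built on the Teager-Kaiser energy operator. The abstract does not say which DESA variant the authors use, so the choice of DESA-1, the function names `teager_energy` and `desa1`, the `eps` guard, and the convention of returning 0.0 where the energy is not positive are all assumptions made for illustration. The input is assumed to be an already band-limited (narrow-band) signal; the Bessel-coefficient band-pass step is not shown.

```python
import math


def teager_energy(s, n):
    """Teager-Kaiser energy operator at sample n: s[n]^2 - s[n-1]*s[n+1]."""
    return s[n] * s[n] - s[n - 1] * s[n + 1]


def desa1(x, eps=1e-12):
    """Estimate amplitude envelope and instantaneous frequency with DESA-1.

    Returns (envelope, omega) for samples n = 2 .. len(x) - 3, where omega is
    in radians per sample. Samples where the energy is not positive are
    reported as 0.0 (an illustrative choice, not taken from the paper).
    """
    num = len(x)
    # Backward difference y[n] = x[n] - x[n-1]; y[0] is an unused placeholder.
    y = [0.0] + [x[n] - x[n - 1] for n in range(1, num)]
    envelope, omega = [], []
    for n in range(2, num - 2):
        psi_x = teager_energy(x, n)
        if psi_x <= eps:
            envelope.append(0.0)
            omega.append(0.0)
            continue
        # G(n) = 1 - (Psi[y(n)] + Psi[y(n+1)]) / (4 * Psi[x(n)])
        psi_y = teager_energy(y, n) + teager_energy(y, n + 1)
        g = 1.0 - psi_y / (4.0 * psi_x)
        g = max(-1.0, min(1.0, g))  # keep acos within its domain
        denom = 1.0 - g * g
        omega.append(math.acos(g))
        envelope.append(math.sqrt(psi_x / denom) if denom > eps else 0.0)
    return envelope, omega


# Sanity check on a pure tone with amplitude 2.0 and frequency 0.3 rad/sample.
tone = [2.0 * math.cos(0.3 * n + 0.5) for n in range(200)]
env, om = desa1(tone)
print(round(env[50], 6), round(om[50], 6))
```

For a constant-amplitude sinusoid the estimates should recover the amplitude and frequency almost exactly; for real speech the envelope is noisier and would normally be smoothed before any event detection.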

Acknowledgements

The authors would like to thank the Department of Information Technology (DIT), Government of India, and the Defense Research and Development Organization (DRDO), Government of India, for supporting this activity through sponsored research projects. The second author would also like to thank the Academy of Finland (Finnish Centre of Excellence in Computational Inference Research COIN, 251170) and the European Community's Seventh Framework Programme (FP7/2007–2013), under grant agreement no. 287678 (Simple4All), for supporting his stay in Finland as a postdoctoral researcher.

Author information

Authors and Affiliations

International Institute of Information Technology Hyderabad, Hyderabad, India
Chetana Prakash & Suryakanth V. Gangashetty
Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland
Dhananjaya N. Gowda

Corresponding author

Correspondence to Chetana Prakash.

About this article

Cite this article

Prakash, C., Gowda, D.N. & Gangashetty, S.V. Analysis of Acoustic Events in Speech Signals Using Bessel Series Expansion. Circuits Syst Signal Process 32, 2915–2938 (2013). https://doi.org/10.1007/s00034-013-9596-1

Received: 17 July 2012
Revised: 09 April 2013
Published: 26 April 2013
Issue Date: December 2013
DOI: https://doi.org/10.1007/s00034-013-9596-1
