The IIR Submission to CSLP 2006 Speaker Recognition Evaluation

  • Conference paper
Chinese Spoken Language Processing (ISCSLP 2006)

Abstract

This paper describes the design and implementation of a practical automatic speaker recognition system for the CSLP speaker recognition evaluation (SRE). The system is built upon four subsystems that use speaker information from acoustic spectral features. In addition to the conventional spectral features, a novel temporal discrete cosine transform (TDCT) feature is introduced to capture long-term speech dynamics. The speaker information is modeled using two complementary speaker modeling techniques, namely the Gaussian mixture model (GMM) and the support vector machine (SVM). The resulting subsystems are then integrated at the score level through a multilayer perceptron (MLP) neural network. Evaluation results confirm that the feature selection, classifier design, and fusion strategy are successful, giving rise to an effective speaker recognition system.
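The abstract does not spell out how the TDCT feature is computed. As a rough illustration only, the sketch below assumes a common block-wise formulation: a DCT is applied along the time axis to each cepstral coefficient trajectory within a sliding block of frames, and the first few DCT coefficients are kept. The function names and the values of `block_len`, `n_keep`, and `hop` are hypothetical and are not taken from the paper.

```python
import math


def dct_ii(x, n_keep):
    """Return the first n_keep unnormalized DCT-II coefficients of sequence x."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * k * (2 * t + 1) / (2 * n)) for t in range(n))
        for k in range(n_keep)
    ]


def tdct_features(frames, block_len=8, n_keep=3, hop=1):
    """Sketch of a temporal-DCT feature (parameter values are assumptions).

    frames: list of per-frame cepstral vectors, all of the same length.
    For each block of block_len consecutive frames, a DCT is taken along
    time for every cepstral dimension; the first n_keep coefficients of
    each trajectory are concatenated into one output vector per block.
    """
    if not frames:
        return []
    dim = len(frames[0])
    features = []
    for start in range(0, len(frames) - block_len + 1, hop):
        block = frames[start:start + block_len]
        vec = []
        for d in range(dim):
            trajectory = [frame[d] for frame in block]
            vec.extend(dct_ii(trajectory, n_keep))
        features.append(vec)
    return features
```

With these assumed settings, each output vector summarizes `block_len` frames, so a longer block covers a longer stretch of speech than a single-frame spectral feature does.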




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, KA. et al. (2006). The IIR Submission to CSLP 2006 Speaker Recognition Evaluation. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_52

  • DOI: https://doi.org/10.1007/11939993_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49665-6

  • Online ISBN: 978-3-540-49666-3

  • eBook Packages: Computer Science, Computer Science (R0)
