Client Dependent GMM-SVM Models for Speaker Verification

Le, Quan; Bengio, Samy

doi:10.1007/3-540-44989-2_53

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2714)

Abstract

Generative Gaussian Mixture Models (GMMs) are the dominant approach for modeling speech sequences in text-independent speaker verification because of their scalability, good performance, and ability to handle variable-size sequences. Discriminative models such as Support Vector Machines (SVMs), on the other hand, usually yield better performance on static classification problems and can construct flexible decision boundaries. In this paper, we combine these two complementary models by using SVMs to postprocess the scores produced by the GMMs. A cross-validation method is also used in the baseline system to increase the number of client scores available in the training phase, which improves the results of the SVM models. Experiments carried out on the XM2VTS and PolyVar databases confirm the value of this hybrid approach.
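The idea of postprocessing GMM scores with an SVM can be illustrated with a small sketch. The abstract does not specify the score representation or the SVM configuration, so everything below is an assumption made for illustration: each access is summarized by a pair of average log-likelihoods (client GMM, world GMM), the GMM parameters are fixed by hand rather than trained with EM, the SVM is linear and trained by plain subgradient descent on the hinge loss, and all function names and toy data are hypothetical.

```python
import math

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p -= 0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        m = max(comp_logs)  # log-sum-exp for numerical stability
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)

def score_pair(frames, client_gmm, world_gmm):
    """Fixed-size score vector for a variable-length sequence (assumed 2-D)."""
    return [gmm_avg_loglik(frames, *client_gmm),
            gmm_avg_loglik(frames, *world_gmm)]

def train_linear_svm(X, y, lam=0.01, lr=0.01, steps=2000):
    """Batch subgradient descent on lam/2*|w|^2 + mean hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def decide(frames, client_gmm, world_gmm, w, b):
    """Positive value: accept the claimed identity; negative: reject."""
    s = score_pair(frames, client_gmm, world_gmm)
    return sum(wj * sj for wj, sj in zip(w, s)) + b

# Hand-set toy models (weights, means, variances); 1-D features for brevity.
client_gmm = ([1.0], [[0.0]], [[1.0]])
world_gmm = ([0.5, 0.5], [[-2.0], [2.0]], [[1.0], [1.0]])

# Toy training accesses: client frames lie near 0, impostor frames near +/-2.
client_utts = [[[0.1], [-0.3], [0.2]], [[-0.1], [0.4], [0.0]],
               [[0.3], [-0.2], [-0.4]], [[0.0], [0.2], [-0.1]]]
impostor_utts = [[[2.1], [1.8], [2.3]], [[-2.0], [-1.7], [-2.2]],
                 [[1.9], [2.2], [2.0]], [[-2.1], [-1.9], [-2.4]]]

X = [score_pair(u, client_gmm, world_gmm) for u in client_utts + impostor_utts]
y = [1] * len(client_utts) + [-1] * len(impostor_utts)
w, b = train_linear_svm(X, y)  # one SVM per client in a client-dependent setup
```

The GMMs turn each variable-length sequence into a fixed-size score vector, and the SVM then replaces a fixed threshold on the log-likelihood ratio with a learned decision boundary. Calling `decide(frames, client_gmm, world_gmm, w, b)` on a new access returns a signed value whose sign gives the accept or reject decision.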

The authors would like to thank the Swiss National Science Foundation for supporting this work through the National Center of Competence in Research (NCCR) on “Interactive Multimodal Information Management (IM2)”.

Author information

Authors and Affiliations

IDIAP, P.O. Box 592, CH-1920, Martigny, Switzerland
Quan Le & Samy Bengio

Editor information

Editors and Affiliations

Bogazici University, Bebek, 34342, Istanbul, Turkey
Okyay Kaynak & Ethem Alpaydin
Laboratory of Computer and Information Science, Helsinki University of Technology, P.O.B. 5400, 02015, Finland
Erkki Oja
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
Lei Xu

About this paper

Cite this paper

Le, Q., Bengio, S. (2003). Client Dependent GMM-SVM Models for Speaker Verification. In: Kaynak, O., Alpaydin, E., Oja, E., Xu, L. (eds) Artificial Neural Networks and Neural Information Processing — ICANN/ICONIP 2003. ICANN 2003, ICONIP 2003. Lecture Notes in Computer Science, vol 2714. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44989-2_53

DOI: https://doi.org/10.1007/3-540-44989-2_53
Published: 18 June 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40408-8
Online ISBN: 978-3-540-44989-8
eBook Packages: Springer Book Archive
