Abstract
In this paper, we present a technique for extracting the five main visemes of spoken German from images. Intensity, edges, and line segments are used to locate the lips automatically and to discriminate among the desired viseme classes. A good recognition rate has been achieved across different speakers.
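The abstract mentions locating the lips from intensity and edge cues. As a minimal illustrative sketch (not the authors' actual algorithm), one common approach is to find the horizontal image band with the strongest vertical intensity gradients, since the dark line between the lips produces pronounced edges against the surrounding skin:

```python
# Hypothetical sketch of intensity/edge-based lip localization.
# All function names and parameters here are illustrative assumptions,
# not taken from the paper.

def gradient_magnitude(image):
    """Absolute vertical intensity gradient of a 2-D grayscale image
    given as a list of rows of pixel values."""
    rows, cols = len(image), len(image[0])
    grad = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(cols):
            grad[r][c] = abs(image[r + 1][c] - image[r - 1][c])
    return grad

def locate_lip_band(image, band_height=3):
    """Return the top row index of the band_height-row band whose
    summed edge energy is maximal -- a crude lip-region candidate."""
    grad = gradient_magnitude(image)
    row_energy = [sum(row) for row in grad]
    best_top, best_energy = 0, -1
    for top in range(len(image) - band_height + 1):
        energy = sum(row_energy[top:top + band_height])
        if energy > best_energy:
            best_top, best_energy = top, energy
    return best_top

# Synthetic face patch: bright skin (200) with a dark lip line (40) at rows 5-6.
img = [[200] * 8 for _ in range(10)]
for r in (5, 6):
    img[r] = [40] * 8
print(locate_lip_band(img))
```

In a real system this coarse band would only initialize a finer model (e.g., line segments fitted to the lip contour) before viseme classification.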
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Shdaifat, I., Grigat, RR., Lütgert, S. (2001). Recognition of the German Visemes Using Multiple Feature Matching. In: Radig, B., Florczyk, S. (eds) Pattern Recognition. DAGM 2001. Lecture Notes in Computer Science, vol 2191. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45404-7_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42596-0
Online ISBN: 978-3-540-45404-5
eBook Packages: Springer Book Archive