Automatic Recognition of Spontaneous Emotions in Speech Using Acoustic and Lexical Features

Truong, Khiet P.; Raaijmakers, Stephan

doi:10.1007/978-3-540-85853-9_15

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5237)

Included in the following conference series:

International Workshop on Machine Learning for Multimodal Interaction

Abstract

We developed acoustic and lexical classifiers, based on a boosting algorithm, to assess how separable spontaneous emotional speech is along the arousal and valence dimensions. The speech data were acquired by inviting subjects to play a first-person shooter video game. Our acoustic classifiers performed significantly better than the lexical classifiers on the arousal dimension. On the valence dimension, our lexical classifiers usually outperformed the acoustic classifiers. Finally, feature-level fusion of acoustic and lexical features did not always significantly improve classification performance.
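
As a rough illustration of feature-level fusion with a boosted classifier, the sketch below appends bag-of-words counts to an acoustic feature vector and trains a generic discrete AdaBoost over threshold stumps. It is not the authors' implementation: the boosting variant, the function names (`fuse`, `train_adaboost`, `predict`), the choice of acoustic features (mean pitch, intensity), the vocabulary, and all data values are assumptions made up for this example.

```python
import math

def fuse(acoustic, tokens, vocabulary):
    """Feature-level (early) fusion: append word counts to the acoustic vector."""
    counts = [float(tokens.count(word)) for word in vocabulary]
    return list(acoustic) + counts

def train_adaboost(X, y, rounds=5):
    """Generic discrete AdaBoost with one-feature threshold stumps (labels +1/-1)."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []  # each entry: (alpha, feature index, threshold, polarity)
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for j in range(len(X[0])):
            for thr in sorted({x[j] for x in X}):
                for pol in (1, -1):
                    preds = [pol if x[j] > thr else -pol for x in X]
                    err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, preds)
        err, j, thr, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, thr, pol))
        # Increase the weight of misclassified samples, then renormalise.
        weights = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x[j] > thr else -pol) for a, j, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Made-up toy data: (mean pitch in Hz, intensity in dB), transcript tokens,
# and an arousal label (+1 = high, -1 = low). Purely illustrative values.
vocabulary = ["yes", "no", "damn"]
samples = [
    ([220.0, 75.0], ["damn", "no"], 1),
    ([230.0, 78.0], ["yes"], 1),
    ([120.0, 55.0], ["no"], -1),
    ([110.0, 52.0], ["yes", "yes"], -1),
]
X = [fuse(acoustic, tokens, vocabulary) for acoustic, tokens, _ in samples]
y = [label for _, _, label in samples]
model = train_adaboost(X, y)
```

Training on only the first two columns of `X` (acoustic) or only the last three (lexical) gives the single-modality classifiers that the fused model would be compared against.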

Author information

Authors and Affiliations

TNO Defence, Security and Safety, Soesterberg, The Netherlands
Khiet P. Truong
TNO Information and Communication Technology, Delft, The Netherlands
Stephan Raaijmakers

Editor information

Andrei Popescu-Belis, Rainer Stiefelhagen

About this paper

Cite this paper

Truong, K.P., Raaijmakers, S. (2008). Automatic Recognition of Spontaneous Emotions in Speech Using Acoustic and Lexical Features. In: Popescu-Belis, A., Stiefelhagen, R. (eds) Machine Learning for Multimodal Interaction. MLMI 2008. Lecture Notes in Computer Science, vol 5237. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85853-9_15

DOI: https://doi.org/10.1007/978-3-540-85853-9_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85852-2
Online ISBN: 978-3-540-85853-9
eBook Packages: Computer Science, Computer Science (R0)
