Abstract
We developed acoustic and lexical classifiers, based on a boosting algorithm, to assess how separable spontaneous emotional speech is along the arousal and valence dimensions. The speech data were collected by inviting subjects to play a first-person shooter video game. On the arousal dimension, our acoustic classifiers performed significantly better than the lexical classifiers. On the valence dimension, our lexical classifiers usually outperformed the acoustic classifiers. Finally, feature-level fusion of acoustic and lexical features did not always significantly improve classification performance.
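To illustrate what fusion at the feature level means, the sketch below concatenates an acoustic feature vector with a bag-of-words lexical vector, so that a single classifier sees both at once. It is a minimal illustration under stated assumptions, not the feature set or pipeline used in the paper: the helper names (`build_vocabulary`, `lexical_vector`, `fuse`), the two acoustic values per utterance, and the example transcripts are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def build_vocabulary(transcripts: Sequence[str]) -> Dict[str, int]:
    """Map each distinct lowercase token to a column index (sorted for a stable order)."""
    tokens = sorted({tok for text in transcripts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def lexical_vector(text: str, vocab: Dict[str, int]) -> List[float]:
    """Bag-of-words counts over the vocabulary; tokens outside it are ignored."""
    counts = Counter(text.lower().split())
    vec = [0.0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:
            vec[vocab[tok]] = float(n)
    return vec


def fuse(acoustic: Sequence[float], lexical: Sequence[float]) -> List[float]:
    """Feature-level fusion: concatenate both vectors into one input vector."""
    return list(acoustic) + list(lexical)


# Hypothetical utterances: (acoustic features, transcript). All values are made up.
utterances = [
    ([210.0, 0.8], "oh no no"),
    ([140.0, 0.3], "nice shot"),
]
vocab = build_vocabulary([text for _, text in utterances])
fused = [fuse(ac, lexical_vector(text, vocab)) for ac, text in utterances]
```

Each fused vector could then be passed to any single classifier, such as a boosted ensemble. The alternative, decision-level fusion, would train separate acoustic and lexical classifiers and combine their outputs instead.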
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Truong, K.P., Raaijmakers, S. (2008). Automatic Recognition of Spontaneous Emotions in Speech Using Acoustic and Lexical Features. In: Popescu-Belis, A., Stiefelhagen, R. (eds) Machine Learning for Multimodal Interaction. MLMI 2008. Lecture Notes in Computer Science, vol 5237. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85853-9_15
DOI: https://doi.org/10.1007/978-3-540-85853-9_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85852-2
Online ISBN: 978-3-540-85853-9
eBook Packages: Computer Science (R0)