Stress Detection from Speech Using Spectral Slope Measurements

  • Conference paper
Pervasive Computing Paradigms for Mental Health (FABULOUS 2016, MindCare 2016, IIOT 2015)

Abstract

Automatic detection of emotional stress is an active research area that has recently drawn increasing attention, mainly in computer science, linguistics, and medicine. In this study, stress is detected automatically from speech-derived features. Related studies use features such as overall intensity, MFCCs, the Teager Energy Operator, and pitch. The present study proposes a novel set of features based on the spectral tilt of the glottal source and of the speech signal itself. The proposed features rely on the Probability Density Function of the estimated spectral slopes: they consist of the three most probable slopes of the glottal source and the corresponding three slopes of the speech signal, obtained at the word level. The method is evaluated on the simulated dataset of the SUSAS corpus and achieves a recognition accuracy of 92.06% with the Random Forests classifier.
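The feature construction described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the spectral slopes or their Probability Density Function are estimated, so the following assumes a least-squares line fitted to each frame's log-magnitude spectrum and a fixed-width histogram as a crude density estimate. The function names (`fit_slope`, `most_probable_slopes`, `word_features`) and the `bin_width` parameter are hypothetical and are not taken from the paper.

```python
from collections import Counter


def fit_slope(freqs_khz, mags_db):
    """Least-squares slope (dB per kHz) of a line through one frame's
    log-magnitude spectrum. Assumed estimator, not the paper's."""
    n = len(freqs_khz)
    mean_f = sum(freqs_khz) / n
    mean_m = sum(mags_db) / n
    cov = sum((f - mean_f) * (m - mean_m) for f, m in zip(freqs_khz, mags_db))
    var = sum((f - mean_f) ** 2 for f in freqs_khz)
    return cov / var


def most_probable_slopes(slopes, k=3, bin_width=0.5):
    """Centres of the k most populated histogram bins, most probable first.

    The histogram stands in for the probability density function.
    Fewer than k values are returned if fewer than k bins are occupied.
    """
    counts = Counter(round(s / bin_width) for s in slopes)
    return [idx * bin_width for idx, _ in counts.most_common(k)]


def word_features(glottal_slopes, speech_slopes):
    """Word-level vector: three glottal-source slopes, then three speech slopes."""
    return most_probable_slopes(glottal_slopes) + most_probable_slopes(speech_slopes)


# Example with made-up per-frame slopes for a single word.
glottal = [-10.1] * 5 + [-5.0] * 3 + [-1.2] * 2 + [-7.3]
speech = [-6.0] * 4 + [-3.1] * 3 + [-8.2] * 2 + [-1.0]
features = word_features(glottal, speech)
```

A six-element vector of this kind, computed per word, is the sort of input that could then be passed to a classifier such as Random Forests. The glottal-source slopes would first require an estimate of the glottal waveform, which this sketch takes as given.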

Notes

  1. The glottal source signal is the signal generated at the glottis, which can be either periodic pulses or noise.

Acknowledgments

The authors acknowledge support from the iManageCancer EU project under contract H2020-PHC-26-2014 No. 643529.

Author information

Corresponding author

Correspondence to Olympia Simantiraki.

Copyright information

© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Simantiraki, O., Giannakakis, G., Pampouchidou, A., Tsiknakis, M. (2018). Stress Detection from Speech Using Spectral Slope Measurements. In: Oliver, N., Serino, S., Matic, A., Cipresso, P., Filipovic, N., Gavrilovska, L. (eds) Pervasive Computing Paradigms for Mental Health. FABULOUS MindCare IIOT 2016 2016 2015. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 207. Springer, Cham. https://doi.org/10.1007/978-3-319-74935-8_5

  • DOI: https://doi.org/10.1007/978-3-319-74935-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-74934-1

  • Online ISBN: 978-3-319-74935-8

  • eBook Packages: Computer Science, Computer Science (R0)
