Abstract
In our recent work, we proposed a method for enumerating differences between various expressive categories (communicative functions). This paper suggests a few modifications intended to improve the effect of that approach on both the quality of synthetic expressive speech and listeners' perception of expressivity. The main modifications are a different way of processing the expressive data and of calculating the penalty matrix. A comprehensive evaluation using listening tests and some auxiliary measures was performed.
This work was supported by the European Regional Development Fund (ERDF), project “New Technologies for Information Society” (NTIS), European Centre of Excellence, ED1.1.00/02.0090. Access to the computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the programme “Projects of Large Infrastructure for Research, Development, and Innovations” (LM2010005), is highly appreciated.
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Grůber, M., Matoušek, J. (2013). Improvements in Czech Expressive Speech Synthesis in Limited Domain. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_23
DOI: https://doi.org/10.1007/978-3-319-01931-4_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01930-7
Online ISBN: 978-3-319-01931-4
eBook Packages: Computer Science, Computer Science (R0)