Abstract
This article discusses speech production in dialogue from the perspective of natural language generation, focusing on the selection of appropriate intonation. We argue that in order to assign appropriate intonation contours in speech-producing systems, it is vital to acknowledge the diversity of functions that intonation fulfills and to account for the communicative and immediate contexts as major factors constraining intonation selection. Drawing on arguments from a functional-linguistically motivated natural language generation architecture, we present a model of context-to-speech as an alternative to the traditional text-to-speech and concept-to-speech approaches.
Authors appear in alphabetical order. This work was partially funded by the European Union Programme Copernicus, Project No. 10393 (SPEAK!) under contract with the Darmstadt University of Technology. All authors have been actively involved in the project at various stages, either under employment at the Darmstadt University of Technology or GMD-IPSI.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grote, B., Hagen, E., Stein, A., Teich, E. (1997). Speech production in human-machine dialogue: A natural language generation perspective. In: Maier, E., Mast, M., LuperFoy, S. (eds) Dialogue Processing in Spoken Language Systems. DPSLS 1996. Lecture Notes in Computer Science, vol 1236. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63175-5_38
DOI: https://doi.org/10.1007/3-540-63175-5_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63175-0
Online ISBN: 978-3-540-69206-5
eBook Packages: Springer Book Archive