Background

Heinroth, Tobias; Minker, Wolfgang

doi:10.1007/978-1-4614-5383-3_2

Abstract

In this chapter we discuss the background and related work of ASDM in the context of IEs. Section 2.1 explains the general functioning of an SDS. Section 2.2 explains the underlying idea of an IE, using the IE approaches realised within the ATRACO Project as examples. Section 2.3 describes prior work on spoken and multimodal interaction within IEs. Section 2.4 focuses on a specific component of an SDS, the Spoken Dialogue Manager (SDM). Several approaches to developing this component have been implemented in the past; we give an overview of these directions and illustrate each with an example. Section 2.4 is divided into three parts: the first is dedicated to state-machine-based approaches, the second to stochastic methodologies, and the third to plan-based and Information State-based systems. Section 2.5 presents several approaches to enhancing the performance of the SDM and discusses how they influenced our work. We conclude the chapter in Sect. 2.6 by introducing our own approach.

Notes

1. SOFIA is funded by the European Artemis programme, 2009–2011, http://www.sofia-project.eu

Author information

Authors and Affiliations

Institute of Communications Engineering, University of Ulm, Albert-Einstein-Allee 43, Ulm, Germany
Tobias Heinroth & Wolfgang Minker

About this chapter

Cite this chapter

Heinroth, T., Minker, W. (2013). Background. In: Introducing Spoken Dialogue Systems into Intelligent Environments. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5383-3_2

DOI: https://doi.org/10.1007/978-1-4614-5383-3_2
Published: 13 October 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-5382-6
Online ISBN: 978-1-4614-5383-3
eBook Packages: Engineering, Engineering (R0)
