Abstract

In this chapter we discuss the background and related work of ASDM in the context of IEs. Sect. 2.1 explains the general functioning of an SDS. Sect. 2.2 explains the underlying idea of an IE, with the IE approaches realised within the ATRACO Project serving as examples. In Sect. 2.3 we describe prior work in the field of (spoken and multimodal) interaction within IEs. Section 2.4 focuses on a specific part of an SDS: the Spoken Dialogue Manager (SDM). Several approaches to developing this component have been implemented in the past; we give an overview of them and illustrate each with an example. Section 2.4 is divided into three parts: the first is dedicated to state-machine-based approaches, the second to stochastic methodologies, and the third to plan- and Information State-based systems. In Sect. 2.5 we present several approaches to enhancing the performance of the SDM and discuss how they influenced our work. We conclude the chapter in Sect. 2.6 by introducing our own approach.

Notes

  1. SOFIA is funded by the European Artemis programme, 2009–2011, http://www.sofia-project.eu

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Heinroth, T., Minker, W. (2013). Background. In: Introducing Spoken Dialogue Systems into Intelligent Environments. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5383-3_2

  • DOI: https://doi.org/10.1007/978-1-4614-5383-3_2

  • Published:

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-5382-6

  • Online ISBN: 978-1-4614-5383-3

  • eBook Packages: Engineering, Engineering (R0)
