Vidiam: Corpus-based Development of a Dialogue Manager for Multimodal Question Answering

Interactive Multi-modal Question-Answering

Abstract

This chapter describes the Vidiam project, which developed a dialogue management system for multimodal question answering (QA) dialogues as part of the IMIX project. The approach was data-driven, i.e., corpus-based. Because research on QA dialogue for multimodal information retrieval is still new, no suitable corpora were available on which to base a system. This chapter reports on the collection and analysis of three QA dialogue corpora, involving textual follow-up utterances, multimodal follow-up questions, and speech dialogues. Based on the data, a dialogue act typology was created, which helps translate user utterances into practical interactive QA strategies. The chapter then explains how the dialogue manager and its components (dialogue act recognition, interactive QA strategy handling, reference resolution, and multimodal fusion) were built and evaluated using off-line analysis of the corpus data.

Author information

Corresponding author

Correspondence to Boris van Schooten.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

van Schooten, B., op den Akker, R. (2011). Vidiam: Corpus-based Development of a Dialogue Manager for Multimodal Question Answering. In: van den Bosch, A., Bouma, G. (eds) Interactive Multi-modal Question-Answering. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17525-1_3

  • DOI: https://doi.org/10.1007/978-3-642-17525-1_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17524-4

  • Online ISBN: 978-3-642-17525-1

  • eBook Packages: Computer Science, Computer Science (R0)
