A Perceptual System for Language Game Experiments

Language Grounding in Robots

Abstract

This chapter describes key aspects of a visual perception system that serves as a core component of language game experiments on physical robots. The vision system is responsible for segmenting the continuous flow of incoming visual stimuli into segments and computing a variety of features for each segment. It does so by combining bottom-up processing, which operates on the incoming signal, with top-down processing based on expectations about what was seen before or about objects stored in memory. The chapter consists of two parts. The first is concerned with extracting and maintaining world models of spatial scenes, without any prior knowledge of the possible objects involved. The second deals with the recognition of gestures and actions that establish the joint attention and pragmatic feedback that are an important aspect of language games.

Author information

Corresponding author

Correspondence to Michael Spranger.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Spranger, M., Loetzsch, M., Steels, L. (2012). A Perceptual System for Language Game Experiments. In: Steels, L., Hild, M. (eds) Language Grounding in Robots. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-3064-3_5

  • DOI: https://doi.org/10.1007/978-1-4614-3064-3_5

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4614-3063-6

  • Online ISBN: 978-1-4614-3064-3

  • eBook Packages: Computer Science (R0)
