Simultaneously learning actions and goals from demonstration

Akgun, Baris; Thomaz, Andrea

doi:10.1007/s10514-015-9448-x

Published: 09 July 2015

Autonomous Robots, volume 40, pages 211–227 (2016)

Abstract

Our research aim is to develop interactions and algorithms for learning from naïve human teachers through demonstration. We introduce a novel approach to leverage the goal-oriented nature of human teachers by learning an action model and a goal model simultaneously from the same set of demonstrations. We use robot motion data to learn an action model for executing the skill. We use a generic set of perceptual features to learn a goal model and use it to monitor the executed action model. We evaluate our approach with data from 8 naïve teachers demonstrating two skills to the robot. We show that the goal models in the perceptual feature space are consistent across users and correctly recognize demonstrations in cross-validation tests. We additionally observe that a subset of users were not able to teach a successful action model, whereas all of them were able to teach a mostly successful goal model. When the learned action models were executed on the robot, the success rate was on average 66.25 %, whereas the goal models were on average 90 % correct at deciding on the success or failure of the executed action, which we call monitoring.
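The monitoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: as a stand-in, it fits an independent Gaussian per perceptual feature to the end states of the demonstrations and accepts an executed action when the observed end state is likely enough under that model. The feature vectors, the function names (fit_goal_model, log_likelihood, monitor) and the threshold margin are all hypothetical choices made for illustration.

```python
import math
from statistics import mean, pvariance


def fit_goal_model(demos, min_var=1e-4):
    """Fit one Gaussian per feature to demonstrated end states.

    Stand-in goal model (an assumption, not the paper's method).
    min_var keeps variances away from zero for near-identical demos.
    """
    columns = list(zip(*demos))
    means = [mean(col) for col in columns]
    variances = [max(pvariance(col), min_var) for col in columns]
    return means, variances


def log_likelihood(model, observation):
    """Log-likelihood of an observation under the diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(observation, means, variances)
    )


def monitor(model, threshold, observation):
    """Return True if the executed action is judged a success."""
    return log_likelihood(model, observation) >= threshold


# Hypothetical perceptual features observed at the end of four demonstrations.
demos = [[0.50, 0.10], [0.52, 0.12], [0.48, 0.09], [0.51, 0.11]]
model = fit_goal_model(demos)

# Hypothetical threshold: the least likely demonstration, minus a margin.
threshold = min(log_likelihood(model, d) for d in demos) - 1.0

near_goal = monitor(model, threshold, [0.50, 0.10])
far_from_goal = monitor(model, threshold, [0.10, 0.40])
```

Here an end state close to the demonstrated ones is accepted and a distant one is rejected. The action model plays no part in the decision, which reflects the separation the abstract describes: motion data drives execution, while perceptual features drive monitoring.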

Notes

We only assume that a segmentation for objects is available.
http://pointclouds.org/.
Integrating more robust object segmentation into the perception pipeline is left for future work. Since our experiments in this article are performed on a batch of demonstrations offline, robust online tracking is not our current focus. This will become important in our future work when we want learning to be an incremental online process, and we believe solutions exist for obtaining a robust segmentation for this purpose.
This object based representation is fairly common in robotics.
These details are given for completeness but another model selection procedure can be used as well.
Tractable approximate methods do exist for DBNs, however.
They are mirrored since the participant is standing across the table in one and standing next to the robot in the other.
Participant 1 provided only 2 or 3 keyframes per demonstration, whereas the other participants provided 4–6. As a result, participant 1's goal model was not able to recognize the demonstrations of other users.
The cross-validation tests how similar the demonstrations are but not how the action itself is modelled.
In an interactive scenario, the teacher might realize this and fix it with their follow-up demonstrations.
Fast enough to have a fluid interaction with the user.
Expert in the sense of demonstrations, not necessarily the underlying algorithms.
We can represent cyclic behaviors with the current action model but currently have no means to decide on when to stop the cycle, see Sect. 4.6.

Acknowledgments

This work has been supported by US National Science Foundation CAREER award #0953181, and by US Office of Naval Research grant #N000141410120.

Author information

Authors and Affiliations

College of Computing, Georgia Institute of Technology, Atlanta, GA, USA
Baris Akgun & Andrea Thomaz

Corresponding author

Correspondence to Baris Akgun.

About this article

Cite this article

Akgun, B., Thomaz, A. Simultaneously learning actions and goals from demonstration. Auton Robot 40, 211–227 (2016). https://doi.org/10.1007/s10514-015-9448-x

Received: 27 June 2014
Accepted: 19 June 2015
Published: 09 July 2015
Issue Date: February 2016
DOI: https://doi.org/10.1007/s10514-015-9448-x
