Abstract
Our research aim is to develop interactions and algorithms for learning from naïve human teachers through demonstration. We introduce a novel approach to leverage the goal-oriented nature of human teachers by learning an action model and a goal model simultaneously from the same set of demonstrations. We use robot motion data to learn an action model for executing the skill. We use a generic set of perceptual features to learn a goal model and use it to monitor the executed action model. We evaluate our approach with data from 8 naïve teachers demonstrating two skills to the robot. We show that the goal models in the perceptual feature space are consistent across users and correctly recognize demonstrations in cross-validation tests. We additionally observe that a subset of users were not able to teach a successful action model, whereas all of them were able to teach a mostly successful goal model. When the learned action models were executed on the robot, the success rate was 66.25 % on average. In contrast, the goal models were on average 90 % correct at deciding on the success or failure of the executed action, a process we call monitoring.
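The two figures reported above measure different things: the success rate counts how often an executed action achieved the skill, while monitoring accuracy counts how often the goal model's success/failure decision agreed with the true outcome. The following minimal sketch illustrates the distinction. The function names and the example outcomes are hypothetical and are not the data or code from the experiments.

```python
from typing import Sequence


def success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of executions that actually succeeded."""
    if not outcomes:
        raise ValueError("need at least one execution")
    return sum(outcomes) / len(outcomes)


def monitoring_accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Fraction of executions where the goal model's success/failure
    decision matches the true outcome."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one execution")
    agreements = sum(p == a for p, a in zip(predicted, actual))
    return agreements / len(actual)


# Hypothetical outcomes of 10 executions of a learned action model.
actual = [True, True, False, True, False, True, True, False, True, True]
# Hypothetical goal-model decisions; the third one is a false "success".
predicted = [True, True, True, True, False, True, True, False, True, True]

print(f"success rate: {success_rate(actual):.2f}")
print(f"monitoring accuracy: {monitoring_accuracy(predicted, actual):.2f}")
```

Note that a goal model can monitor well even when the action model often fails, since correctly flagging a failed execution counts as a correct monitoring decision.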
Notes
We only assume that a segmentation for objects is available.
Integrating more robust object segmentation into the perception pipeline is left for future work. Since our experiments in this article are performed on a batch of demonstrations offline, robust online tracking is not our current focus. This will become important in our future work when we want learning to be an incremental online process, and we believe solutions exist for obtaining a robust segmentation for this purpose.
This object based representation is fairly common in robotics.
These details are given for completeness but another model selection procedure can be used as well.
Although tractable approximate methods exist for DBNs.
They are mirrored since the participant is standing across the table in one and standing next to the robot in the other.
Participant 1 provided only 2 or 3 keyframes per demonstration, whereas the other participants provided 4–6. As a result, participant 1's goal model was not able to recognize the demonstrations of other users.
The cross-validation tests how similar the demonstrations are but not how the action itself is modelled.
In an interactive scenario, the teacher might realize this and fix it with their follow-up demonstrations.
Fast enough to have a fluid interaction with the user.
Expert in the sense of demonstrations, not necessarily the underlying algorithms.
We can represent cyclic behaviors with the current action model but currently have no means of deciding when to stop the cycle; see Sect. 4.6.
Acknowledgments
This work has been supported by US National Science Foundation CAREER award #0953181, and by US Office of Naval Research grant #N000141410120.
Cite this article
Akgun, B., Thomaz, A. Simultaneously learning actions and goals from demonstration. Auton Robot 40, 211–227 (2016). https://doi.org/10.1007/s10514-015-9448-x
DOI: https://doi.org/10.1007/s10514-015-9448-x