Knowledge-Enabled Generation of Semantically Annotated Image Sequences of Manipulation Activities from VR Demonstrations

Haidu, Andrei; Zhang, Xiaoyue; Beetz, Michael

doi:10.1007/978-3-030-87156-7_11

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12899)

Included in the following conference series:

International Conference on Computer Vision Systems

Abstract

This work presents a cloud-to-edge framework for collecting and annotating synthetic images from human performances in virtual environments, with the purpose of enabling the training and deployment of robot vision models. The virtual environment provides close-to-reality image data using the state-of-the-art rendering capabilities of game engine technologies. The human performances in the virtual world are fully recorded and segmented into meaningful motion phases of action models from cognitive science. The recorded performances are stored as fully replayable episodes, enabling multi-camera post-processing to acquire fully labeled vision data. The data is represented using KnowRob, acting as an extension of the robot’s knowledge base, which makes it robot-understandable and accessible through its built-in logic-based query language.

Acknowledgements

This work was supported by the DFG as part of CRC #1320 “EASE - Everyday Activity Science and Engineering”. The work was conducted in subproject R5.

Author information

Authors and Affiliations

Institute for Artificial Intelligence, University of Bremen, Bremen, Germany
Andrei Haidu, Xiaoyue Zhang & Michael Beetz

Authors

Andrei Haidu
Xiaoyue Zhang
Michael Beetz

Corresponding author

Correspondence to Andrei Haidu.

Editor information

Editors and Affiliations

TU Wien, Vienna, Austria
Markus Vincze
University of Technology Sydney, Sydney, Australia
Timothy Patten
University of California San Diego, La Jolla, CA, USA
Henrik I Christensen
Technical University of Denmark, Kongens Lyngby, Denmark
Lazaros Nalpantidis
Hong Kong University of Science and Technology, Hong Kong, China
Ming Liu

About this paper

Cite this paper

Haidu, A., Zhang, X., Beetz, M. (2021). Knowledge-Enabled Generation of Semantically Annotated Image Sequences of Manipulation Activities from VR Demonstrations. In: Vincze, M., Patten, T., Christensen, H.I., Nalpantidis, L., Liu, M. (eds) Computer Vision Systems. ICVS 2021. Lecture Notes in Computer Science, vol 12899. Springer, Cham. https://doi.org/10.1007/978-3-030-87156-7_11

DOI: https://doi.org/10.1007/978-3-030-87156-7_11
Published: 19 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87155-0
Online ISBN: 978-3-030-87156-7
eBook Packages: Computer Science, Computer Science (R0)