Abstract
The continued success of deep convolutional neural networks (CNNs) in computer vision can be directly linked to the vast amounts of data and tremendous processing resources available for training such non-linear models. However, the amount of available data varies significantly between tasks. Robotic systems in particular usually rely on small datasets, since producing and annotating the data is highly robot- and task-specific (e.g. grasping) and therefore prohibitively expensive. To address this problem of small datasets in robotic vision, a common recent practice is to reuse features already learned by a CNN on a large-scale task and apply them to a different small-scale one. This transfer learning shows promising results as an alternative, but it nevertheless cannot match the performance of a CNN trained from scratch for the specific task. Thus, many researchers have turned to synthetic datasets for training, since these can be produced easily and cost-effectively. The main shortcoming of existing synthetic datasets is their lack of photorealism, both in background and in lighting. Herein, we propose a framework for generating completely synthetic datasets that include all the types of data that state-of-the-art object recognition and tracking algorithms need for training. In this way, robotic perception can be improved without deploying the robot in time-consuming real-world scenarios.
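To make the notion of a 2.5D training sample concrete, the following is a minimal, self-contained sketch (not the paper's actual pipeline, which uses a photorealistic renderer) of what such a sample contains: a per-pixel depth map of a simple object in front of a background, the segmentation mask that serves as its annotation, and simulated depth-sensor noise. All function names, dimensions, and the Gaussian noise model are illustrative assumptions.

```python
import numpy as np

def render_sphere_depth(width=64, height=64, radius=0.2,
                        center_z=1.0, background_z=2.0):
    """Render an orthographic depth map (in metres) of a sphere in front of
    a flat background, plus the per-pixel mask used as the annotation."""
    # Pixel coordinates mapped to [-0.5, 0.5] in both axes.
    xs = np.linspace(-0.5, 0.5, width)
    ys = np.linspace(-0.5, 0.5, height)
    x, y = np.meshgrid(xs, ys)
    r2 = x ** 2 + y ** 2
    mask = r2 <= radius ** 2                 # annotation: sphere vs. background
    depth = np.full((height, width), background_z)
    # Front surface of the sphere: z = center_z - sqrt(radius^2 - x^2 - y^2).
    depth[mask] = center_z - np.sqrt(radius ** 2 - r2[mask])
    return depth, mask

def add_sensor_noise(depth, sigma=0.002, seed=0):
    """Simulate depth-sensor measurement noise with zero-mean Gaussian
    jitter (sigma in metres) -- a simplifying assumption."""
    rng = np.random.default_rng(seed)
    return depth + rng.normal(0.0, sigma, depth.shape)

depth, mask = render_sphere_depth()
noisy_depth = add_sensor_noise(depth)
```

A real generator would replace the analytic sphere with rendered scenes (e.g. in Blender, with HDRI environment lighting for photorealism), but the output per sample is the same triplet: an image, a depth map, and an annotation.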
Notes
1. An example dataset generated using the proposed framework will be publicly available upon the publication of the paper at hand.
Acknowledgement
This work has been supported by the project “Co-production CeLL performing Human-Robot Collaborative AssEmbly (CoLLaboratE)”, funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 820767.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Peleka, G., Mariolis, I., Tzovaras, D. (2021). Generating 2.5D Photorealistic Synthetic Datasets for Training Machine Vision Algorithms. In: Herrero, Á., Cambra, C., Urda, D., Sedano, J., Quintián, H., Corchado, E. (eds) 15th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2020). SOCO 2020. Advances in Intelligent Systems and Computing, vol 1268. Springer, Cham. https://doi.org/10.1007/978-3-030-57802-2_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57801-5
Online ISBN: 978-3-030-57802-2
eBook Packages: Intelligent Technologies and Robotics (R0)