Online vs. Offline Adaptive Domain Randomization Benchmark

  • Conference paper
Human-Friendly Robotics 2022 (HFR 2022)

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 26)

Abstract

Physics simulators have shown great promise for conveniently learning reinforcement learning policies in safe, unconstrained environments. However, transferring the acquired knowledge to the real world can be challenging due to the reality gap. To this end, several methods have recently been proposed to automatically tune simulator parameters with posterior distributions given real data, for use with domain randomization at training time. These approaches have been shown to work on various robotic tasks under different settings and assumptions. Nevertheless, the existing literature lacks a thorough comparison of adaptive domain randomization methods with respect to transfer performance and real-data efficiency. This work presents an open benchmark for both offline and online methods (SimOpt, BayRn, DROID, DROPO) to investigate current limitations across multiple settings and tasks. We found that online methods are limited by the quality of the policy learned so far, which is used to collect data for the next iteration, while offline methods may sometimes fail when replaying trajectories in simulation with open-loop commands. The code used is publicly available at https://github.com/gabrieletiboni/adr-benchmark.
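The adaptive methods compared in this benchmark differ in their details, but they share a common idea: fit a distribution over simulator dynamics parameters so that simulated rollouts match real data, then randomize over that distribution during policy training. As a rough illustration only (this is a toy sketch, not any of the benchmarked algorithms), the snippet below fits a Gaussian over a single invented "mass" parameter with a cross-entropy-style update; all names and dynamics here are hypothetical.

```python
import random

def simulate(mass, action):
    # Toy one-step "simulator": displacement of a push, scaled by inverse mass.
    return action / mass

def rollout_real(action, true_mass=1.5):
    # Stand-in for a real-world rollout with unknown dynamics.
    return action / true_mass

def adapt_mass_distribution(mean=1.0, std=1.0, iters=20, pop=64, elite=8, seed=0):
    # Cross-entropy-style update: sample candidate masses, rank them by how
    # closely their simulated outcome matches the real one, then refit the
    # Gaussian to the best candidates.
    rng = random.Random(seed)
    action = 1.0
    real_obs = rollout_real(action)
    for _ in range(iters):
        samples = [max(1e-3, rng.gauss(mean, std)) for _ in range(pop)]
        ranked = sorted(samples, key=lambda m: abs(simulate(m, action) - real_obs))
        best = ranked[:elite]
        mean = sum(best) / elite
        std = max(1e-3, (sum((b - mean) ** 2 for b in best) / elite) ** 0.5)
    return mean, std

mean, std = adapt_mass_distribution()
```

In an actual adaptive domain randomization pipeline, each episode of policy training would then sample its dynamics from the fitted distribution; the benchmarked methods differ mainly in whether this fitting happens online (interleaved with policy learning) or offline (from pre-collected real trajectories).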


References

  1. Antonova, R., Cruciani, S., Smith, C., Kragic, D.: Reinforcement learning for pivoting task (2017). arXiv Preprint: arXiv:1703.00472v1

  2. Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., Zaremba, W.: OpenAI Gym (2016). arXiv:1606.01540v1

  3. Chebotar, Y., Handa, A., Makoviychuk, V., Macklin, M., Issac, J., Ratliff, N., Fox, D.: Closing the sim-to-real loop: adapting simulation randomization with real world experience. In: ICRA (2019)

  4. Chen, X., Hu, J., Jin, C., Li, L., Wang, L.: Understanding domain randomization for sim-to-real transfer. In: ICLR (2022)

  5. Ding, Z., Tsai, Y., Lee, W.W., Huang, B.: Sim-to-real transfer for robotic manipulation with tactile sensory. In: IROS (2021)

  6. Finn, C., Zhang, M., Fu, J., Tan, X., McCarthy, Z., Scharff, E., Levine, S.: Guided policy search code implementation (2016). Software available from http://rll.berkeley.edu/gps

  7. Hansen, N.: The CMA Evolution Strategy: A Comparing Review (2006)

  8. James, S., Davison, A., Johns, E.: Transferring end-to-end visuomotor control from simulation to real world for a multi-stage task. In: PMLR, pp. 334–343 (2017)

  9. Kober, J., Bagnell, J.A., Peters, J.: Reinforcement learning in robotics: a survey. Int. J. Robot. Res. 32(11), 1238–1274 (2013)

  10. Mehta, B., Diaz, M., Golemo, F., Pal, C.J., Paull, L.: Active domain randomization. In: CoRL (2020)

  11. Mehta, B., Handa, A., Fox, D., Ramos, F.: A user's guide to calibrating robotics simulators. In: CoRL (2020)

  12. Muratore, F., Eilers, C., Gienger, M., Peters, J.: Data-efficient domain randomization with Bayesian optimization. IEEE Robot. Autom. Lett. 6(2), 911–918 (2021)

  13. Muratore, F., Gienger, M., Peters, J.: Assessing transferability from simulation to reality for reinforcement learning. IEEE TPAMI 43(4), 1172–1183 (2021)

  14. Muratore, F., Gruner, T., Wiese, F., Belousov, B., Gienger, M., Peters, J.: Neural posterior domain randomization. In: Faust, A., Hsu, D., Neumann, G. (eds.) Proceedings of the 5th Conference on Robot Learning and Machine Learning Research, Vol. 164, pp. 1532–1542. PMLR (2022)

  15. Muratore, F., Ramos, F., Turk, G., Yu, W., Gienger, M., Peters, J.: Robot learning from randomized simulations: a review. Front. Robot. AI 9, 799893 (2022)

  16. OpenAI, Akkaya, I., Andrychowicz, M., Chociej, M., Litwin, M., McGrew, B., Petron, A., Paino, A., Plappert, M., Powell, G., Ribas, R., Schneider, J., Tezak, N., Tworek, J., Welinder, P., Weng, L., Yuan, Q., Zaremba, W., Zhang, L.: Solving Rubik’s cube with a robot hand (2019). arXiv Preprint: arXiv:1910.07113v1

  17. Peng, X.B., Andrychowicz, M., Zaremba, W., Abbeel, P.: Sim-to-real transfer of robotic control with dynamics randomization. In: ICRA (2018)

  18. Raffin, A., Hill, A., Gleave, A., Kanervisto, A., Ernestus, M., Dormann, N.: Stable-baselines3: Reliable reinforcement learning implementations. J. Mach. Learn. Res. 22(268), 1–8 (2021). http://jmlr.org/papers/v22/20-1364.html

  19. Rajeswaran, A., Ghotra, S., Ravindran, B., Levine, S.: Epopt: learning robust neural network policies using model ensembles. In: ICLR (2017)

  20. Ramos, F., Possas, R.C., Fox, D.: BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators. In: RSS (2019)

  21. Sadeghi, F., Levine, S.: CAD2RL: real single-image flight without a single real image. In: RSS (2017)

  22. Tan, J., Zhang, T., Coumans, E., Iscen, A., Bai, Y., Hafner, D., Bohez, S., Vanhoucke, V.: Sim-to-real: learning agile locomotion for quadruped robots. In: RSS (2018)

  23. Tiboni, G., Arndt, K., Kyrki, V.: DROPO: sim-to-real transfer with offline domain randomization (2022). arXiv Preprint: arXiv:2201.08434v1

  24. Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. In: IROS (2017)

  25. Tsai, Y., Xu, H., Ding, Z., Zhang, C., Johns, E., Huang, B.: DROID: minimizing the reality gap using single-shot human demonstration. IEEE Robot. Autom. Lett. 6(2), 3168–3175 (2021)

  26. Valassakis, E., Di Palo, N., Johns, E.: Coarse-to-fine for sim-to-real: sub-millimetre precision across wide task spaces. In: IROS (2021)

  27. Vuong, Q., Vikram, S., Su, H., Gao, S., Christensen, H.: How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies? In: ICRA (2019)

  28. Zhao, W., Queralta, J.P., Westerlund, T.: Sim-to-real transfer in deep reinforcement learning for robotics: a survey. In: 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 737–744. IEEE (2020)

Acknowledgments

We acknowledge the computational resources generously provided by HPC@POLITO and by the Aalto Science-IT project.

Author information

Correspondence to Gabriele Tiboni.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Tiboni, G., Arndt, K., Averta, G., Kyrki, V., Tommasi, T. (2023). Online vs. Offline Adaptive Domain Randomization Benchmark. In: Borja, P., Della Santina, C., Peternel, L., Torta, E. (eds) Human-Friendly Robotics 2022. HFR 2022. Springer Proceedings in Advanced Robotics, vol 26. Springer, Cham. https://doi.org/10.1007/978-3-031-22731-8_12
