Assured Deep Multi-Agent Reinforcement Learning for Safe Robotic Systems

  • Conference paper
Agents and Artificial Intelligence (ICAART 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13251)

Abstract

Using multi-agent reinforcement learning to find solutions to complex decision-making problems in shared environments has become standard practice in many scenarios. However, this is not the case in safety-critical scenarios, where the reinforcement learning process, which relies on stochastic mechanisms, could lead to highly unsafe outcomes. To address this issue, we have proposed a novel, safe multi-agent reinforcement learning approach named Assured Multi-Agent Reinforcement Learning (AMARL). Unlike other safe multi-agent reinforcement learning approaches, AMARL utilises quantitative verification, a model checking technique that guarantees agent compliance with safety, performance, and other non-functional requirements, both during and after the learning process. We have previously evaluated AMARL in patrolling domains with various multi-agent reinforcement learning algorithms for both homogeneous and heterogeneous systems. In this work, we extend AMARL through the use of deep multi-agent reinforcement learning. This approach is particularly appropriate for systems in which the rewards are sparse, and hence extends the applicability of AMARL. We evaluate our approach in a new search and collection domain, where it shows promising results in safety and performance compared to algorithms not using AMARL.

Supported by the Defence Science and Technology Laboratory.


Author information

Corresponding author

Correspondence to Joshua Riley.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Riley, J., Calinescu, R., Paterson, C., Kudenko, D., Banks, A. (2022). Assured Deep Multi-Agent Reinforcement Learning for Safe Robotic Systems. In: Rocha, A.P., Steels, L., van den Herik, J. (eds) Agents and Artificial Intelligence. ICAART 2021. Lecture Notes in Computer Science, vol 13251. Springer, Cham. https://doi.org/10.1007/978-3-031-10161-8_8

  • DOI: https://doi.org/10.1007/978-3-031-10161-8_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10160-1

  • Online ISBN: 978-3-031-10161-8

  • eBook Packages: Computer Science, Computer Science (R0)
