Assured Deep Multi-Agent Reinforcement Learning for Safe Robotic Systems

Riley, Joshua; Calinescu, Radu; Paterson, Colin; Kudenko, Daniel; Banks, Alec

doi:10.1007/978-3-031-10161-8_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13251)

Included in the following conference series:

International Conference on Agents and Artificial Intelligence

Abstract

Multi-agent reinforcement learning has become standard practice in many scenarios for solving complex decision-making problems in shared environments. This is not the case in safety-critical scenarios, however, where the stochastic mechanisms used by the learning process could lead to highly unsafe outcomes. To address this issue, we proposed a novel safe multi-agent reinforcement learning approach named Assured Multi-Agent Reinforcement Learning (AMARL). Unlike other safe multi-agent reinforcement learning approaches, AMARL uses quantitative verification, a model checking technique, to guarantee that agents comply with safety, performance, and non-functional requirements both during and after the learning process. We have previously evaluated AMARL in patrolling domains with various multi-agent reinforcement learning algorithms for both homogeneous and heterogeneous systems. In this work, we extend AMARL through the use of deep multi-agent reinforcement learning. This approach is particularly appropriate for systems in which rewards are sparse, and hence extends the applicability of AMARL. We evaluate our approach in a new search and collection domain, where it shows promising results in safety and performance compared with algorithms that do not use AMARL.
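To make the idea of quantitative verification more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical four-state discrete-time Markov chain for a single search-and-collect agent (the state names, transition probabilities, and the 0.15 risk bound are all invented for illustration) and checks a reachability requirement of the form "the probability of eventually reaching a hazard is at most 0.15". Probabilistic model checkers automate this kind of analysis on far larger models; here it is approximated with plain value iteration.

```python
def reach_probability(transitions, targets, max_iterations=10_000, tol=1e-12):
    """Approximate, for every state of a discrete-time Markov chain,
    the probability of eventually reaching any state in `targets`."""
    prob = {s: (1.0 if s in targets else 0.0) for s in transitions}
    for _ in range(max_iterations):
        delta = 0.0
        for state, successors in transitions.items():
            if state in targets:
                continue  # target states keep probability 1
            new_value = sum(p * prob[nxt] for nxt, p in successors.items())
            delta = max(delta, abs(new_value - prob[state]))
            prob[state] = new_value
        if delta < tol:
            break  # values have stopped changing
    return prob


# Hypothetical abstract model (all numbers are illustrative assumptions).
transitions = {
    "start":  {"search": 1.0},
    "search": {"search": 0.5, "done": 0.45, "hazard": 0.05},
    "done":   {"done": 1.0},     # absorbing: task completed
    "hazard": {"hazard": 1.0},   # absorbing: unsafe outcome
}

MAX_RISK = 0.15  # assumed safety requirement: P(eventually hazard) <= 0.15

prob = reach_probability(transitions, targets={"hazard"})
is_safe = prob["start"] <= MAX_RISK
print(f"P(eventually hazard) from start = {prob['start']:.4f}, safe = {is_safe}")
```

In this toy model the risk from `search` satisfies x = 0.5x + 0.05, so the computed value converges to 0.1 and the assumed requirement holds. A verified abstract model of this kind could then be used to constrain what the learning agents are allowed to do, which is the role the abstract attributes to quantitative verification.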

Supported by the Defence Science and Technology Laboratory.

Author information

Authors and Affiliations

University of York, Heslington, York, U.K.
Joshua Riley, Radu Calinescu & Colin Paterson
L3S Research Center, Leibniz Universität Hannover, Hannover, Germany
Daniel Kudenko
Defence Science and Technology Laboratory, Salisbury, U.K.
Alec Banks

Corresponding author

Correspondence to Joshua Riley.

Editor information

Editors and Affiliations

LIACC, University of Porto, Porto, Portugal
Ana Paula Rocha
ICREA, Institute of Evolutionary Biology, Barcelona, Barcelona, Spain
Luc Steels
Leiden University, Leiden, The Netherlands
Jaap van den Herik

About this paper

Cite this paper

Riley, J., Calinescu, R., Paterson, C., Kudenko, D., Banks, A. (2022). Assured Deep Multi-Agent Reinforcement Learning for Safe Robotic Systems. In: Rocha, A.P., Steels, L., van den Herik, J. (eds) Agents and Artificial Intelligence. ICAART 2021. Lecture Notes in Computer Science, vol 13251. Springer, Cham. https://doi.org/10.1007/978-3-031-10161-8_8

DOI: https://doi.org/10.1007/978-3-031-10161-8_8
Published: 19 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10160-1
Online ISBN: 978-3-031-10161-8
eBook Packages: Computer Science, Computer Science (R0)
