
Independent Deep Deterministic Policy Gradient Reinforcement Learning in Cooperative Multiagent Pursuit Games

  • Conference paper
  • In: Artificial Neural Networks and Machine Learning – ICANN 2021 (ICANN 2021)

Abstract

In this paper, we study a fully decentralized multi-agent pursuit problem in an environment without communication. A fully decentralized approach (decentralized training with decentralized execution) offers stronger robustness and scalability than centralized training with decentralized execution (CTDE), the currently popular multi-agent reinforcement learning paradigm. Both centralized training and communication mechanisms require a large amount of information exchange between agents, a strong assumption that is difficult to satisfy in practice. However, traditional fully decentralized multi-agent reinforcement learning methods (e.g., IQL) struggle to converge stably because the other agents' strategies change dynamically. We therefore extend the actor-critic architecture to an actor-critic-N framework and, on this basis, propose the Potential-Field-Guided Deep Deterministic Policy Gradient (PGDDPG) method. Each agent uses a unified artificial potential field to guide its policy updates, which reduces the uncertainty of multi-agent decision making in complex, dynamically changing environments and allows PGDDPG to converge quickly and stably. Finally, through pursuit experiments in MPE and CARLA, we show that our method achieves a higher success rate and more stable performance than DDPG and MADDPG.

S. Zhou and W. Ren—Contributed equally to this work.
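
The abstract does not spell out how the unified artificial potential field enters the learning update. One common construction, consistent with the potential-based reward shaping of Ng et al. [11] and Harutyunyan et al. [4] in the reference list below, adds a shaping term F(s, s') = γΦ(s') − Φ(s) to the environment reward, where Φ is a potential over states. The Python sketch below illustrates that pattern for a single pursuer; it is a minimal sketch under these assumptions, and the names attractive_potential and shaped_reward, the coefficient k_att, and the discount value are illustrative rather than the authors' exact formulation.

    import numpy as np

    GAMMA = 0.95  # discount factor; illustrative value, not taken from the paper

    def attractive_potential(pursuer_pos, evader_pos, k_att=1.0):
        # Illustrative attractive potential: increases (toward zero) as the
        # pursuer closes on the evader. A full artificial potential field
        # would also add repulsive terms for obstacles and teammates.
        diff = np.asarray(pursuer_pos) - np.asarray(evader_pos)
        return -k_att * np.linalg.norm(diff)

    def shaped_reward(env_reward, s, s_next, evader, evader_next):
        # Potential-based shaping in the sense of Ng et al. [11]:
        # F(s, s') = gamma * Phi(s') - Phi(s). Adding F to the environment
        # reward preserves the optimal policy while giving each independent
        # learner a dense, unified guidance signal.
        phi = attractive_potential(s, evader)
        phi_next = attractive_potential(s_next, evader_next)
        return env_reward + GAMMA * phi_next - phi

    # Example: the pursuer moves from (0, 0) to (1, 0) while the evader stays
    # at (3, 0); the shaping term is positive (0.95 * (-2.0) + 3.0 = 1.1)
    # because the pursuer closed the distance.
    r = shaped_reward(0.0, (0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (3.0, 0.0))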


Notes

  1. https://github.com/zhoushiyang12/pgddpg.git.

References

  1. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: CVPR (July 2017)


  2. Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., Koltun, V.: CARLA: an open urban driving simulator. In: Proceedings of the 1st Annual Conference on Robot Learning, pp. 1–16 (2017)


  3. Hao, X., Wang, W., Hao, J., Yang, Y.: Independent generative adversarial self-imitation learning in cooperative multiagent systems. arXiv preprint arXiv:1909.11468 (2019)

  4. Harutyunyan, A., Devlin, S., Vrancx, P., Nowé, A.: Expressing arbitrary reward functions as potential-based advice. In: AAAI, pp. 2652–2658 (2015)


  5. Hernandez-Leal, P., Kartal, B., Taylor, M.E.: A survey and critique of multiagent deep reinforcement learning. Auton. Agents Multi-agent Syst. 33(6), 750–797 (2019). https://doi.org/10.1007/s10458-019-09421-1


  6. Jaques, N., et al.: Social influence as intrinsic motivation for multi-agent deep reinforcement learning. In: ICML, pp. 3040–3049. PMLR (2019)


  7. Jiang, J., Lu, Z.: Learning attentional communication for multi-agent cooperation. arXiv preprint arXiv:1805.07733 (2018)

  8. LaValle, S.M.: Planning Algorithms. Cambridge University Press (2006)


  9. Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)

  10. Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., Mordatch, I.: Multi-agent actor-critic for mixed cooperative-competitive environments. In: NIPS (2017)


  11. Ng, A.Y., Harada, D., Russell, S.: Policy invariance under reward transformations: theory and application to reward shaping. In: ICML, vol. 99, pp. 278–287 (1999)


  12. Pathak, D., Agrawal, P., Efros, A.A., Darrell, T.: Curiosity-driven exploration by self-supervised prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 16–17 (2017)


  13. Peng, P., et al.: Multiagent bidirectionally-coordinated nets: emergence of human-level coordination in learning to play StarCraft combat games. arXiv preprint arXiv:1703.10069 (2017)

  14. Radmanesh, M., Kumar, M., Guentert, P.H., Sarim, M.: Overview of path-planning and obstacle avoidance algorithms for UAVs: a comparative study. Unmanned Syst. 6(02), 95–118 (2018)


  15. Schaul, T., Quan, J., Antonoglou, I., Silver, D.: Prioritized experience replay. arXiv preprint arXiv:1511.05952 (2015)

  16. Siddiqui, R.: Path planning using potential field algorithm. Medium (July 2018)


  17. Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., Riedmiller, M.: Deterministic policy gradient algorithms. In: ICML (2014)

  18. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)


  19. Xie, L., et al.: Learning with stochastic guidance for navigation. arXiv preprint arXiv:1811.10756 (2018)




Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhou, S., Ren, W., Ren, X., Wang, Y., Yi, X. (2021). Independent Deep Deterministic Policy Gradient Reinforcement Learning in Cooperative Multiagent Pursuit Games. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science, vol 12894. Springer, Cham. https://doi.org/10.1007/978-3-030-86380-7_51


  • DOI: https://doi.org/10.1007/978-3-030-86380-7_51


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86379-1

  • Online ISBN: 978-3-030-86380-7

  • eBook Packages: Computer Science (R0)
