Improved Deep Deterministic Policy Gradient Algorithm Based on Prioritized Sampling

  • Conference paper
  • In: Proceedings of 2018 Chinese Intelligent Systems Conference

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 528)

Abstract

Deep reinforcement learning tends to suffer from low sampling efficiency, and prioritized sampling can improve this efficiency to a certain extent. This paper applies prioritized sampling to the deep deterministic policy gradient (DDPG) algorithm and proposes a small-sample sorting method to address the high computational complexity of the conventional prioritized sampling algorithm. Simulation experiments show that the improved DDPG algorithm achieves higher sampling efficiency and better training performance.
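
The full text is not reproduced above, so the exact small-sample sorting procedure is not shown here. As a rough illustration of the idea the abstract describes, the sketch below assumes one plausible reading: instead of maintaining priorities over the entire replay buffer (as in standard prioritized experience replay), draw a small random candidate set, sort only that set by stored TD-error magnitude, and hand the highest-priority transitions to the DDPG critic/actor update. Every name here (SmallSampleReplayBuffer, candidate_size, and so on) is a hypothetical illustration, not the authors' implementation.

```python
import random
from collections import deque

import numpy as np


class SmallSampleReplayBuffer:
    """Illustrative replay buffer: prioritized sampling via small-sample sorting.

    Assumption about the paper's method: rather than keeping a sum-tree over the
    whole buffer, draw a small random candidate set, sort it by |TD error|, and
    return the top transitions as the minibatch.
    """

    def __init__(self, capacity=100_000, candidate_size=256):
        self.buffer = deque(maxlen=capacity)
        self.candidate_size = candidate_size

    def add(self, state, action, reward, next_state, done, td_error=1.0):
        # New transitions get a default priority so they are replayed at least once.
        self.buffer.append((abs(td_error), (state, action, reward, next_state, done)))

    def sample(self, batch_size=64):
        # 1) Draw a small random candidate set (cost independent of buffer size).
        k = min(self.candidate_size, len(self.buffer))
        candidates = random.sample(range(len(self.buffer)), k)
        # 2) Sort only the candidates by priority (|TD error|), descending.
        candidates.sort(key=lambda i: self.buffer[i][0], reverse=True)
        # 3) Take the top-priority transitions as the training minibatch.
        chosen = candidates[:batch_size]
        batch = [self.buffer[i][1] for i in chosen]
        states, actions, rewards, next_states, dones = map(np.asarray, zip(*batch))
        return chosen, states, actions, rewards, next_states, dones

    def update_priorities(self, indices, td_errors):
        # After a DDPG critic update, refresh the stored |TD error| of the replayed
        # transitions. Indices assume no appends happened between sample and update.
        for i, err in zip(indices, td_errors):
            _, transition = self.buffer[i]
            self.buffer[i] = (abs(float(err)), transition)
```

Under this reading, sorting a fixed-size candidate set costs O(k log k) per minibatch regardless of buffer size, which is where the saving over a full prioritized sweep of the buffer would come from.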

Author information

Corresponding author

Correspondence to Kai Xiong.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, H., Xiong, K., Bai, J. (2019). Improved Deep Deterministic Policy Gradient Algorithm Based on Prioritized Sampling. In: Jia, Y., Du, J., Zhang, W. (eds) Proceedings of 2018 Chinese Intelligent Systems Conference. Lecture Notes in Electrical Engineering, vol 528. Springer, Singapore. https://doi.org/10.1007/978-981-13-2288-4_21
