Improved Deep Deterministic Policy Gradient Algorithm Based on Prioritized Sampling

  • Conference paper
  • In: Proceedings of 2018 Chinese Intelligent Systems Conference

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 528)

Abstract

Deep reinforcement learning tends to suffer from low sampling efficiency, and prioritized sampling can improve this efficiency to a certain extent. This paper applies prioritized sampling to the deep deterministic policy gradient (DDPG) algorithm and proposes a small-sample sorting method to address the high computational complexity of the conventional prioritized sampling algorithm. Simulation experiments show that the improved DDPG algorithm achieves higher sampling efficiency and better training performance.
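
The full text is not reproduced above, so the exact small-sample sorting procedure is not shown here. As a rough illustration of the idea the abstract describes, the sketch below assumes one plausible reading: instead of maintaining priorities over the entire replay buffer (as in standard prioritized experience replay), draw a small random candidate set, sort only that set by stored TD-error magnitude, and hand the highest-priority transitions to the DDPG critic/actor update. Every name here (SmallSampleReplayBuffer, candidate_size, and so on) is a hypothetical illustration, not the authors' implementation.

```python
import random
from collections import deque

import numpy as np


class SmallSampleReplayBuffer:
    """Illustrative replay buffer: prioritized sampling via small-sample sorting.

    Assumption about the paper's method: rather than keeping a sum-tree over the
    whole buffer, draw a small random candidate set, sort it by |TD error|, and
    return the top transitions as the minibatch.
    """

    def __init__(self, capacity=100_000, candidate_size=256):
        self.buffer = deque(maxlen=capacity)
        self.candidate_size = candidate_size

    def add(self, state, action, reward, next_state, done, td_error=1.0):
        # New transitions get a default priority so they are replayed at least once.
        self.buffer.append((abs(td_error), (state, action, reward, next_state, done)))

    def sample(self, batch_size=64):
        # 1) Draw a small random candidate set (cost independent of buffer size).
        k = min(self.candidate_size, len(self.buffer))
        candidates = random.sample(range(len(self.buffer)), k)
        # 2) Sort only the candidates by priority (|TD error|), descending.
        candidates.sort(key=lambda i: self.buffer[i][0], reverse=True)
        # 3) Take the top-priority transitions as the training minibatch.
        chosen = candidates[:batch_size]
        batch = [self.buffer[i][1] for i in chosen]
        states, actions, rewards, next_states, dones = map(np.asarray, zip(*batch))
        return chosen, states, actions, rewards, next_states, dones

    def update_priorities(self, indices, td_errors):
        # After a DDPG critic update, refresh the stored |TD error| of the replayed
        # transitions. Indices assume no appends happened between sample and update.
        for i, err in zip(indices, td_errors):
            _, transition = self.buffer[i]
            self.buffer[i] = (abs(float(err)), transition)
```

Under this reading, sorting a fixed-size candidate set costs O(k log k) per minibatch regardless of buffer size, which is where the saving over a full prioritized sweep of the buffer would come from.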

Author information

Corresponding author

Correspondence to Kai Xiong.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, H., Xiong, K., Bai, J. (2019). Improved Deep Deterministic Policy Gradient Algorithm Based on Prioritized Sampling. In: Jia, Y., Du, J., Zhang, W. (eds) Proceedings of 2018 Chinese Intelligent Systems Conference. Lecture Notes in Electrical Engineering, vol 528. Springer, Singapore. https://doi.org/10.1007/978-981-13-2288-4_21
