Off-Policy Neuro-Optimal Control for Unknown Complex-Valued Nonlinear Systems

Adaptive Dynamic Programming: Single and Multiple Controllers

Part of the book series: Studies in Systems, Decision and Control (SSDC, volume 166)

Abstract

This chapter develops an optimal control method for unknown complex-valued nonlinear systems. Policy iteration (PI) is used to solve the Hamilton–Jacobi–Bellman (HJB) equation. Off-policy learning allows the iterative performance index and the iterative control to be obtained even when the system dynamics are completely unknown. Critic and action networks approximate the iterative performance index and the iterative control, respectively, and thereby carry out policy evaluation and policy improvement. Asymptotic stability of the closed-loop system and convergence of the iterative performance index function are proven. Using the Lyapunov technique, the weight errors are proven to be uniformly ultimately bounded (UUB). A simulation study demonstrates the effectiveness of the proposed optimal control method.
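The alternation between policy evaluation and policy improvement can be illustrated on a toy problem. The sketch below is not the chapter's off-policy neural-network method: it assumes a known scalar linear system dz/dt = a z + b u with complex a and b, the cost ∫ (q|z|² + r|u|²) dt, and a value function V(z) = p|z|² with real p. Under those assumptions both steps have closed forms. The names `policy_iteration` and `riccati_solution` are hypothetical and chosen only for this illustration.

```python
import math


def policy_iteration(a, b, q, r, k0, iters=20):
    """Model-based PI for dz/dt = a*z + b*u with feedback u = -k*z.

    Assumes a, b, k0 are complex scalars, q > 0, r > 0, and that k0 is
    stabilizing, i.e. Re(a - b*k0) < 0. Illustrative sketch only.
    """
    k = k0
    history = []
    for _ in range(iters):
        closed_loop = (a - b * k).real
        if closed_loop >= 0:
            raise ValueError("gain is not stabilizing")
        # Policy evaluation: solve 2*p*Re(a - b*k) + q + r*|k|^2 = 0 for p.
        p = (q + r * abs(k) ** 2) / (-2.0 * closed_loop)
        # Policy improvement: minimize the Hamiltonian over u.
        k = p * b.conjugate() / r
        history.append(p)
    return p, k, history


def riccati_solution(a, b, q, r):
    """Positive root of 2*p*Re(a) - p^2*|b|^2/r + q = 0, used as a check."""
    ar = a.real
    return r * (ar + math.sqrt(ar ** 2 + abs(b) ** 2 * q / r)) / abs(b) ** 2


if __name__ == "__main__":
    a, b = 0.5 + 2j, 1 - 1j           # open-loop unstable example (assumed values)
    p, k, history = policy_iteration(a, b, q=1.0, r=1.0, k0=2 + 2j)
    print("p from policy iteration:", p)
    print("p from Riccati equation:", riccati_solution(a, b, 1.0, 1.0))
    print("final gain k:", k)
```

In this toy setting the sequence of p values is non-increasing and approaches the Riccati solution, which mirrors on a small scale the convergence of the iterative performance index stated in the abstract. The off-policy scheme in the chapter replaces the model-based evaluation step with data collected from the system.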

Corresponding author

Correspondence to Ruizhuo Song.

Copyright information

© 2019 Science Press, Beijing and Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Song, R., Wei, Q., Li, Q. (2019). Off-Policy Neuro-Optimal Control for Unknown Complex-Valued Nonlinear Systems. In: Adaptive Dynamic Programming: Single and Multiple Controllers. Studies in Systems, Decision and Control, vol 166. Springer, Singapore. https://doi.org/10.1007/978-981-13-1712-5_7
