Data-Based Optimal Tracking Control of Nonaffine Nonlinear Discrete-Time Systems

Luo, Biao; Liu, Derong; Huang, Tingwen; Li, Chao

doi:10.1007/978-3-319-46681-1_68

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9950)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

This paper considers the optimal tracking control problem for nonaffine nonlinear discrete-time systems. The problem relies on the solution of the so-called tracking Hamilton-Jacobi-Bellman equation, which is extremely difficult to solve even for simple systems. To overcome this difficulty, a data-based Q-learning algorithm is proposed that learns the optimal tracking control policy from data of the practical system. To implement the algorithm, a critic-only neural network structure is developed, in which only a critic neural network is required to estimate the Q-function and a least-squares scheme is employed to update the network weights.
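
The idea of a critic-only Q-learning loop with a least-squares weight update can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details are not given in the abstract. Everything specific here is an assumption made for illustration: a scalar linear tracking-error model (`toy_step`, chosen so that the result can be checked against a Riccati recursion, whereas the paper targets nonaffine nonlinear systems), a quadratic basis standing in for the critic neural network, an undiscounted quadratic cost, and a policy-iteration style of Q-learning.

```python
import numpy as np

# Hypothetical toy tracking-error dynamics (NOT from the paper): e_next = A*e + B*u.
A, B = 0.9, 0.5

def toy_step(e, u):
    return A * e + B * u

def features(e, u):
    # Assumed quadratic critic basis: Q(e, u) ~ w[0]*e^2 + w[1]*e*u + w[2]*u^2
    return np.stack([e * e, e * u, u * u], axis=-1)

def collect_data(step, n_samples=200, seed=0):
    # Sample (error, control) pairs and record the observed next error.
    rng = np.random.default_rng(seed)
    e = rng.uniform(-1.0, 1.0, n_samples)
    u = rng.uniform(-1.0, 1.0, n_samples)
    return e, u, step(e, u)

def q_learning_ls(e, u, e_next, q=1.0, r=1.0, n_iter=20):
    # Stage cost for each sample: q*e^2 + r*u^2.
    cost = q * e**2 + r * u**2
    # Initial policy u = -gain*e with gain 0; this assumes the
    # uncontrolled system is stable, as it is for the toy model.
    gain = 0.0
    w = np.zeros(3)
    phi = features(e, u)
    for _ in range(n_iter):
        # Policy evaluation: fit w so that Q(e,u) - Q(e', pi(e')) = cost
        # holds in the least-squares sense over all samples.
        phi_next = features(e_next, -gain * e_next)
        w, *_ = np.linalg.lstsq(phi - phi_next, cost, rcond=None)
        # Policy improvement: minimising the quadratic critic over u
        # gives u = -(w[1] / (2*w[2])) * e.
        gain = w[1] / (2.0 * w[2])
    return w, gain

e, u, e_next = collect_data(toy_step)
weights, gain = q_learning_ls(e, u, e_next)
print("critic weights:", weights, "feedback gain:", gain)
```

The learning loop never uses `A` or `B` directly; it only sees the sampled triples `(e, u, e_next)`, which is the sense in which the scheme is data-based. For a genuinely nonaffine system the quadratic basis and the closed-form minimisation over `u` would no longer be exact and would need to be replaced.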

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 61233001, 61273140, 61304086, 61374105, 61503377, 61533017, and U1501251, in part by the Early Career Development Award of SKLMCCS, and in part by the NPRP grant #NPRP 7-1482-1-278 from the Qatar National Research Fund (a member of Qatar Foundation).

Author information

Authors and Affiliations

The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Biao Luo & Chao Li
School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, 100083, China
Derong Liu
Texas A&M University at Qatar, PO Box 23874, Doha, Qatar
Tingwen Huang

Corresponding author

Correspondence to Biao Luo.

Editor information

Editors and Affiliations

The University of Tokyo, Tokyo, Japan
Akira Hirose
Kobe University, Kobe, Japan
Seiichi Ozawa
Okinawa Institute of Science and Technology Graduate University, Onna, Japan
Kenji Doya
Nara Institute of Science and Technology, Ikoma, Japan
Kazushi Ikeda
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Chinese Academy of Sciences, Beijing, China
Derong Liu

Cite this paper

Luo, B., Liu, D., Huang, T., Li, C. (2016). Data-Based Optimal Tracking Control of Nonaffine Nonlinear Discrete-Time Systems. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9950. Springer, Cham. https://doi.org/10.1007/978-3-319-46681-1_68

DOI: https://doi.org/10.1007/978-3-319-46681-1_68
Published: 30 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46680-4
Online ISBN: 978-3-319-46681-1
eBook Packages: Computer Science, Computer Science (R0)
