Abstract
The optimal tracking control problem of nonaffine nonlinear discrete-time systems is considered in this paper. The problem relies on the solution of the so-called tracking Hamilton-Jacobi-Bellman equation, which is extremely difficult to solve even for simple systems. To overcome this difficulty, a data-based Q-learning algorithm is proposed that learns the optimal tracking control policy from data of the practical system. For implementation, a critic-only neural network structure is developed, in which only a critic neural network is required to estimate the Q-function and a least-squares scheme is employed to update the network weights.
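The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea (a critic-only Q-function approximator whose weights are fitted by least squares from measured transitions), not the authors' algorithm. Everything specific in it is an assumption made for illustration: the scalar linear plant `A`, `B`, the constant reference, the discount factor `GAMMA`, the cost weights, the quadratic basis standing in for the critic network, and the policy-iteration loop. A linear plant is chosen only because the quadratic critic is then exact; the paper itself addresses nonaffine nonlinear systems.

```python
import random

# Hypothetical scalar plant and constant reference (assumptions for illustration only).
A, B = 0.8, 1.0          # x_{k+1} = A*x_k + B*u_k, used here only to generate data
GAMMA = 0.9              # discount factor (assumed)
Q_W, R_W = 1.0, 1.0      # weights on tracking error and control effort (assumed)


def features(x, r, u):
    """Quadratic critic basis: Q(x, r, u) is approximated by w . features."""
    return [x * x, x * r, x * u, r * r, r * u, u * u]


def greedy_gains(w):
    """Minimise the quadratic critic over u, giving u = -(kx*x + kr*r)."""
    return w[2] / (2.0 * w[5]), w[4] / (2.0 * w[5])


def solve(M, v):
    """Solve M y = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y


def evaluate_policy(data, kx, kr):
    """Least-squares fit of w from Q(z_k) = cost_k + GAMMA * Q(z_{k+1})."""
    n = 6
    M = [[0.0] * n for _ in range(n)]
    v = [0.0] * n
    for x, r, u, x_next in data:
        u_next = -(kx * x_next + kr * r)      # action of the policy being evaluated
        phi = features(x, r, u)
        phi_next = features(x_next, r, u_next)
        d = [p - GAMMA * pn for p, pn in zip(phi, phi_next)]
        cost = Q_W * (x - r) ** 2 + R_W * u * u
        for i in range(n):                    # accumulate the normal equations
            v[i] += d[i] * cost
            for j in range(n):
                M[i][j] += d[i] * d[j]
    return solve(M, v)


def learn(num_samples=200, iterations=15, seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(num_samples):
        x, r, u = (rng.uniform(-1, 1) for _ in range(3))
        data.append((x, r, u, A * x + B * u))  # the learner sees only these transitions
    kx, kr = 0.0, 0.0                          # initial policy u = 0
    for _ in range(iterations):
        w = evaluate_policy(data, kx, kr)      # critic update by least squares
        kx, kr = greedy_gains(w)               # policy improvement from the critic
    return kx, kr, w


kx, kr, w = learn()
print(f"learned policy: u = -({kx:.4f})*x - ({kr:.4f})*r")
```

The learner never uses `A` or `B` directly; they appear only when the transition data are generated, which mirrors the data-based setting the abstract describes. No separate actor is trained, because the control is read off from the critic weights by minimising over `u`.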
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grants 61233001, 61273140, 61304086, 61374105, 61503377, 61533017, and U1501251, in part by the Early Career Development Award of SKLMCCS and in part by the NPRP grant #NPRP 7-1482-1-278 from the Qatar National Research Fund (a member of Qatar Foundation).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Luo, B., Liu, D., Huang, T., Li, C. (2016). Data-Based Optimal Tracking Control of Nonaffine Nonlinear Discrete-Time Systems. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9950. Springer, Cham. https://doi.org/10.1007/978-3-319-46681-1_68
DOI: https://doi.org/10.1007/978-3-319-46681-1_68
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46680-4
Online ISBN: 978-3-319-46681-1
eBook Packages: Computer Science, Computer Science (R0)