Abstract
This paper investigates a novel optimal self-learning sequential battery control scheme for smart home energy systems. The optimal battery control law is obtained iteratively using the iterative adaptive dynamic programming (ADP) technique. To account for the power constraints of the battery, a new non-quadratic performance index function is established. It guarantees that the iterative control law never exceeds the maximum charging/discharging power of the battery, which helps extend the battery's service life. Simulation results illustrate the performance of the presented method.
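The abstract does not state the non-quadratic performance index itself. As a rough illustration of how such an index can bound the control, the sketch below assumes the tanh-based functional commonly used in the constrained-input ADP literature (for example by Abu-Khalaf and Lewis), W(u) = 2 r ū ∫₀ᵘ artanh(v/ū) dv, for a scalar battery power u with limit ū. The function names, the weight `r`, and the scalar `costate_term` (standing in for the gradient of the next-step value function with respect to u) are hypothetical and are not taken from the paper.

```python
import math


def nonquadratic_cost(u, u_max, r=1.0):
    """Closed form of W(u) = 2*r*u_max * integral_0^u artanh(v/u_max) dv.

    Assumed form (not confirmed by the abstract); defined only for |u| < u_max.
    """
    if abs(u) >= u_max:
        raise ValueError("cost is only defined for |u| < u_max")
    ratio = u / u_max
    # Antiderivative of artanh(v/a) is v*artanh(v/a) + (a/2)*ln(1 - v^2/a^2).
    return (2.0 * r * u_max * u * math.atanh(ratio)
            + r * u_max ** 2 * math.log1p(-ratio ** 2))


def bounded_control(costate_term, u_max, r=1.0):
    """Stationary point of W(u) + costate_term * u.

    Setting dW/du + costate_term = 0 gives a tanh expression, so the
    result always stays within [-u_max, u_max].
    """
    return -u_max * math.tanh(costate_term / (2.0 * r * u_max))


if __name__ == "__main__":
    u_max = 3.0  # hypothetical charging/discharging power limit
    for c in (-100.0, -2.0, 0.0, 2.0, 100.0):
        u = bounded_control(c, u_max)
        print(f"costate_term={c:8.1f}  control={u: .4f}")
```

Under these assumptions the saturation comes from the cost function rather than from clipping the control afterwards: however large the costate term grows, the minimizing control stays within the power limit.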
This work was supported in part by the National Natural Science Foundation of China under Grants 61233001, 61273140, 61304086, 61374105, 61503377, 61503379, 61304079, 61533017, and U1501251.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Wei, Q., Liu, D. (2016). Optimal Constrained Neuro-Dynamic Programming Based Self-learning Battery Management in Microgrids. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9949. Springer, Cham. https://doi.org/10.1007/978-3-319-46675-0_22
DOI: https://doi.org/10.1007/978-3-319-46675-0_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46674-3
Online ISBN: 978-3-319-46675-0
eBook Packages: Computer Science, Computer Science (R0)