Summary
The standard approach to stochastic control is dynamic programming. In our recent research, we proposed an alternative approach based on a direct comparison of the performance of any two policies. This approach has a number of advantages: the results can be derived in a simple and intuitive way; the approach applies in the same way to different optimization problems, including finite- and infinite-horizon problems, discounted and average performance criteria, and discrete-time discrete-state as well as continuous-time continuous-state models; and it can be generalized to some non-standard problems where dynamic programming fails. The approach also links stochastic control to perturbation analysis, reinforcement learning, and other research subjects in optimization, which may stimulate new research directions.
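To make "direct comparison" concrete, here is a minimal sketch of the performance-difference formula on which such an approach rests, stated for the discrete-time, finite-state, average-reward case; the notation below (policies $(P,f)$ and $(P',f')$, steady-state distributions $\pi$, $\pi'$, potential $g$) is assumed for illustration and is not taken verbatim from this chapter. With $e$ the all-ones vector, the potential $g$ of policy $(P,f)$ solves the Poisson equation

\[
(I - P)\,g = f - \eta e, \qquad \eta = \pi f,
\]

and the average rewards of any two policies satisfy

\[
\eta' - \eta = \pi' \bigl[ (f' - f) + (P' - P)\, g \bigr].
\]

Because $\pi' \ge 0$ componentwise, any policy $(P',f')$ that makes the bracketed term componentwise nonnegative performs at least as well as $(P,f)$; this kind of comparison-based policy improvement, obtained without a dynamic-programming recursion, is the style of argument the direct-comparison approach builds on.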
Supported in part by a grant from Hong Kong UGC.
Tribute to Chris Byrnes and Anders Lindquist.
Cite this chapter
Cao, X.R. (2010). Dynamic Programming or Direct Comparison? In: Hu, X., Jonsson, U., Wahlberg, B., Ghosh, B. (eds.) Three Decades of Progress in Control Sciences. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11278-2_5
Print ISBN: 978-3-642-11277-5
Online ISBN: 978-3-642-11278-2