Summary
The standard approach to stochastic control is dynamic programming. In our recent research, we proposed an alternative approach based on a direct comparison of the performance of any two policies. This approach has a number of advantages: the results can be derived in a simple and intuitive way; the approach applies in the same way to different optimization problems, including finite- and infinite-horizon problems, discounted and average performance criteria, and discrete-time discrete-state as well as continuous-time continuous-state models; and it can be generalized to some non-standard problems where dynamic programming fails. The approach also links stochastic control to perturbation analysis, reinforcement learning, and other research subjects in optimization, which may stimulate new research directions.
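To make "direct comparison" concrete, here is a minimal sketch of the performance-difference formula on which such an approach rests, stated for the discrete-time, finite-state, average-reward case; the notation below (policies $(P,f)$ and $(P',f')$, steady-state distributions $\pi$, $\pi'$, potential $g$) is assumed for illustration and is not taken verbatim from this chapter. With $e$ the all-ones vector, the potential $g$ of policy $(P,f)$ solves the Poisson equation

\[
(I - P)\,g = f - \eta e, \qquad \eta = \pi f,
\]

and the average rewards of any two policies satisfy

\[
\eta' - \eta = \pi' \bigl[ (f' - f) + (P' - P)\, g \bigr].
\]

Because $\pi' \ge 0$ componentwise, any policy $(P',f')$ that makes the bracketed term componentwise nonnegative performs at least as well as $(P,f)$; this kind of comparison-based policy improvement, obtained without a dynamic-programming recursion, is the style of argument the direct-comparison approach builds on.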
Supported in part by a grant from Hong Kong UGC.
Tribute to Chris Byrnes and Anders Lindquist.
Cite this chapter
Cao, X.R. (2010). Dynamic Programming or Direct Comparison? In: Hu, X., Jonsson, U., Wahlberg, B., Ghosh, B. (eds.) Three Decades of Progress in Control Sciences. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11278-2_5
Print ISBN: 978-3-642-11277-5
Online ISBN: 978-3-642-11278-2