
On Variability of Optimal Policies in Markov Decision Processes

  • Chapter
Data Analysis and Decision Support

Abstract

Both the total reward criterion and the average reward criterion commonly used in Markov decision processes lead to an optimal policy that maximizes the associated expected value. The paper reviews these standard approaches and studies the distribution functions obtained by applying an optimal policy. In particular, an efficient extrapolation method is proposed, which results from the control of Markov decision models with an absorbing set.
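The distinction the abstract draws, between maximizing the expected reward and the distribution of the reward actually realized, can be illustrated with a minimal sketch. All states, actions, rewards, and transition probabilities below are hypothetical and not taken from the paper: value iteration finds the policy that is optimal for the expected total discounted reward criterion, and Monte Carlo simulation then exposes the spread of the discounted reward under that same optimal policy.

```python
import random

# A tiny finite MDP (hypothetical example, not from the paper):
# two states, two actions.  P[s][a] is a list of (next_state, prob);
# R[s][a] is the immediate reward for action a in state s.
P = {0: {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.2), (1, 0.8)]},
     1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]}}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 2.0, 1: 0.5}}
beta = 0.9  # discount factor

# Value iteration for the expected total discounted reward criterion.
V = {0: 0.0, 1: 0.0}
for _ in range(500):
    V = {s: max(R[s][a] + beta * sum(p * V[t] for t, p in P[s][a])
                for a in P[s])
         for s in P}
policy = {s: max(P[s], key=lambda a: R[s][a]
                 + beta * sum(p * V[t] for t, p in P[s][a]))
          for s in P}

# The optimal policy maximizes the *expected* value V[s]; the realized
# discounted reward of a single run is still random.  Simulate it.
def episode(s=0, horizon=200):
    total, disc = 0.0, 1.0
    for _ in range(horizon):
        a = policy[s]
        total += disc * R[s][a]
        disc *= beta
        u, acc = random.random(), 0.0
        for t, p in P[s][a]:
            acc += p
            if u <= acc:
                s = t
                break
    return total

random.seed(0)
samples = [episode() for _ in range(2000)]
mean = sum(samples) / len(samples)            # close to V[0]
spread = max(samples) - min(samples)          # variability hidden by the mean
```

The sample mean agrees with the computed value V[0], while the spread of the samples shows the variability that the expected-value criterion alone does not capture; closed-form distribution functions are rarely available, which motivates the paper's interest in efficient computational methods.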




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Waldmann, K.-H. (2005). On Variability of Optimal Policies in Markov Decision Processes. In: Baier, D., Decker, R., Schmidt-Thieme, L. (eds) Data Analysis and Decision Support. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28397-8_20
