
A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion

Published in: Journal of Optimization Theory and Applications

Abstract

This note deals with Markov decision chains evolving on a denumerable state space. Under standard continuity-compactness requirements, an explicit example is provided to show that, with respect to a strong sample-path average reward criterion, the Lyapunov function condition does not ensure the existence of an optimal stationary policy.
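For context, the criterion in question can be stated as follows. This is a hedged sketch in illustrative notation (the symbols $X_t$, $A_t$, $R$, $f^*$, $g^*$, and $S$ are not taken from the paper itself): a stationary policy $f^*$ is sample-path average optimal with value $g^*$ when the long-run average reward along almost every trajectory attains $g^*$.

```latex
% Illustrative formulation of sample-path average reward optimality;
% notation is assumed, not the paper's own.
% X_t = state, A_t = action, R = one-step reward, S = state space.
\mathbb{P}^{f^*}_x\!\left[
  \liminf_{n\to\infty} \frac{1}{n} \sum_{t=0}^{n-1} R(X_t, A_t)
  \;\ge\; g^*
\right] = 1, \qquad x \in S,
% while under any policy the sample-path average cannot exceed g^*
% on a set of positive probability.
```

The counterexample of the paper shows that, even under a Lyapunov function condition guaranteeing stability, no stationary policy need satisfy a requirement of this strong almost-sure form.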



Acknowledgements

This work was supported in part by the PSF Organization under Grant No. 012/300/02, and by CONACYT (México) and ASCR (Czech Republic) under Grant No. 171396.

The authors are grateful to the editor for helpful suggestions.

Author information

Correspondence to Rolando Cavazos-Cadena.


About this article

Cite this article

Cavazos-Cadena, R., Montes-de-Oca, R. & Sladký, K. A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion. J Optim Theory Appl 163, 674–684 (2014). https://doi.org/10.1007/s10957-013-0474-6
