Abstract
This paper studies two-person zero-sum semi-Markov games with a possibly unbounded payoff function under a discounted payoff criterion. Assuming that the distribution H of the holding times is unknown to one of the players, we combine suitable methods of statistical estimation of H with control procedures to construct an asymptotically discount optimal pair of strategies.
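To illustrate how an unknown holding-time distribution enters a discounted criterion, the following is a minimal sketch and not necessarily the estimator used in the paper. It assumes continuous discounting at a hypothetical rate `alpha`, so that a holding time T contributes the factor E[exp(-alpha T)], and it replaces the unknown H by the empirical distribution of observed holding times. The function name, the exponential test distribution, and all parameter values are assumptions made only for this example.

```python
import math
import random


def estimate_discount_factor(holding_times, alpha):
    """Plug-in estimate of E[exp(-alpha * T)] from observed holding times.

    Uses the empirical distribution in place of the unknown distribution H.
    """
    if not holding_times:
        raise ValueError("need at least one observed holding time")
    return sum(math.exp(-alpha * t) for t in holding_times) / len(holding_times)


# Hypothetical setting: holding times are exponential with rate lam,
# which the estimating player does not know.
rng = random.Random(0)
alpha, lam = 0.1, 2.0
samples = [rng.expovariate(lam) for _ in range(20000)]

beta_hat = estimate_discount_factor(samples, alpha)
# For Exp(lam), E[exp(-alpha * T)] equals lam / (lam + alpha).
beta_true = lam / (lam + alpha)
print(beta_hat, beta_true)
```

As more holding times are observed, the estimate approaches the true factor, which is the sense in which a strategy built on the estimate can only be expected to be optimal asymptotically.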
Work supported partially by Consejo Nacional de Ciencia y Tecnología (CONACyT) under Grant 46633-F.
Minjárez-Sosa, J.A., Luque-Vásquez, F. Two Person Zero-Sum Semi-Markov Games with Unknown Holding Times Distribution on One Side: A Discounted Payoff Criterion. Appl Math Optim 57, 289–305 (2008). https://doi.org/10.1007/s00245-007-9016-7
DOI: https://doi.org/10.1007/s00245-007-9016-7