Abstract
Multiobjective reinforcement learning algorithms extend reinforcement learning techniques to problems with multiple conflicting objectives. This paper discusses the advantages gained from applying stochastic policies to multiobjective tasks and examines a particular form of stochastic policy known as a mixture policy. Two methods are proposed for deriving mixture policies for episodic multiobjective tasks from deterministic base policies found via scalarised reinforcement learning. It is shown that these approaches are an efficient means of identifying solutions which offer a better match to the user's preferences than can be achieved by methods based strictly on deterministic policies.
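The core idea behind a mixture policy can be illustrated with a minimal sketch: at the start of each episode, one deterministic base policy is sampled according to fixed mixture weights and followed for the whole episode, so the expected multiobjective return is a convex combination of the base policies' returns. The names below (`MixturePolicy`, `expected_return`, and the policy/weight arguments) are illustrative assumptions, not the paper's API.

```python
import random

class MixturePolicy:
    """Episodic mixture over deterministic base policies (illustrative sketch)."""

    def __init__(self, base_policies, weights, rng=None):
        assert len(base_policies) == len(weights)
        assert abs(sum(weights) - 1.0) < 1e-9, "mixture weights must sum to 1"
        self.base_policies = base_policies
        self.weights = weights
        self.rng = rng or random.Random()
        self.active = None

    def start_episode(self):
        # Sample one base policy and commit to it for the entire episode.
        self.active = self.rng.choices(self.base_policies,
                                       weights=self.weights, k=1)[0]

    def act(self, state):
        return self.active(state)


def expected_return(base_returns, weights):
    """Convex combination of the base policies' multiobjective return vectors."""
    n_objectives = len(base_returns[0])
    return tuple(sum(w * r[i] for w, r in zip(weights, base_returns))
                 for i in range(n_objectives))
```

For example, mixing two base policies whose return vectors are (10, 0) and (0, 10) with equal weights yields an expected return of (5, 5), a point that no single deterministic policy on a non-convex Pareto front may be able to reach.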
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vamplew, P., Dazeley, R., Barker, E., Kelarev, A. (2009). Constructing Stochastic Mixture Policies for Episodic Multiobjective Reinforcement Learning Tasks. In: Nicholson, A., Li, X. (eds.) AI 2009: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 5866. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10439-8_35
DOI: https://doi.org/10.1007/978-3-642-10439-8_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10438-1
Online ISBN: 978-3-642-10439-8