Abstract
Reinforcement learning (RL) algorithms attempt to assign credit for rewards to the actions that contributed to them. Thus far, credit assignment has been done in one of two ways: uniformly, or with a discounting model that assigns exponentially more credit to recent actions. This paper demonstrates an alternative approach to temporal credit assignment that takes advantage of exact or approximate prior information about correct credit assignment. Infinite impulse response (IIR) filters are used to model this credit assignment information, generalising exponentially discounting eligibility traces to arbitrary credit assignment models. The approach can be applied to any RL algorithm that employs an eligibility trace. The use of IIR credit assignment filters is explored with both the GPOMDP policy-gradient algorithm and the Sarsa(λ) temporal-difference algorithm. A drop in the bias and variance of value and gradient estimates is demonstrated, resulting in faster convergence to better policies.
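To make the generalisation concrete, the sketch below (a minimal Python illustration, not code from the paper; the function names and the coefficient vectors a and b are hypothetical) shows that the standard exponentially discounting eligibility trace is a first-order IIR filter, and that an IIR trace with arbitrary feedback coefficients a and feedforward coefficients b can encode prior credit assignment knowledge such as a known reward delay.

```python
from collections import deque

def exponential_trace(xs, decay):
    """Standard eligibility trace e_t = decay * e_{t-1} + x_t,
    i.e. a first-order IIR filter with decay = gamma * lambda."""
    e, trace = 0.0, []
    for x in xs:
        e = decay * e + x
        trace.append(e)
    return trace

def iir_trace(xs, a, b):
    """Generalised eligibility trace as an IIR filter:
        e_t = sum_i a[i] * e_{t-1-i} + sum_j b[j] * x_{t-j}.
    The feedback coefficients a and feedforward coefficients b
    encode prior knowledge of how credit spreads over past actions."""
    past_e = deque([0.0] * len(a), maxlen=len(a))  # e_{t-1}, e_{t-2}, ...
    past_x = deque([0.0] * len(b), maxlen=len(b))  # x_t, x_{t-1}, ...
    trace = []
    for x in xs:
        past_x.appendleft(x)
        e = sum(ai * ei for ai, ei in zip(a, past_e)) \
            + sum(bj * xj for bj, xj in zip(b, past_x))
        past_e.appendleft(e)
        trace.append(e)
    return trace

xs = [1.0, 0.0, 0.0, 0.0, 0.0]
print(exponential_trace(xs, 0.9))                  # [1.0, 0.9, 0.81, ...]
print(iir_trace(xs, a=[0.9], b=[1.0]))             # identical: same filter
print(iir_trace(xs, a=[0.0], b=[0, 0, 0, 1.0]))    # [0, 0, 0, 1.0, 0]
```

Setting a = [gamma * lambda] and b = [1.0] recovers the usual discounting trace exactly, while a pure-delay filter such as b = [0, 0, 0, 1.0] concentrates all credit on the action taken exactly three steps before the reward, illustrating how exact prior knowledge of credit assignment can be expressed.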
Keywords
- Impulse Response
- Reinforcement Learning
- Discount Factor
- Markov Decision Process
- Partially Observable Markov Decision Process