Fuzzy reinforcement learning and dynamic programming

Berenji, Hamid R.

doi:10.1007/3-540-58409-9_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 847)

Included in the following conference series:

International Workshop on Fuzzy Logic in Artificial Intelligence

Abstract

In this paper, we develop a new algorithm called Fuzzy Q-Learning (or FQ-Learning), which extends Watkins' Q-Learning method. It can be used for decision processes in which the goals and/or the constraints, but not necessarily the system under control, are fuzzy in nature. An example of a fuzzy constraint is: “the weight of object A must not be substantially heavier than w”, where w is a specified weight. Similarly, an example of a fuzzy goal is: “the robot must be in the vicinity of door k”. We show that FQ-Learning provides an alternative solution to this problem that is simpler than Bellman and Zadeh's fuzzy dynamic programming approach. We apply the algorithm to a multistage decision-making problem.
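
For orientation, the fuzzy dynamic programming baseline named in the abstract can be sketched as follows. This is not the FQ-Learning algorithm of the paper; it is a minimal illustration of the standard Bellman-Zadeh backward recursion, in which the grade of a decision is the minimum of the constraint and goal memberships and the best action maximizes that minimum. The three-state system, the transition table, and all membership values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

State = int
Action = str


def fuzzy_dp(
    states: List[State],
    actions: List[Action],
    step: Dict[Tuple[State, Action], State],
    mu_constraint: List[Dict[Action, float]],
    mu_goal: Dict[State, float],
) -> Tuple[List[Dict[State, float]], List[Dict[State, Action]]]:
    """Backward recursion mu_t(x) = max_u min(mu_C_t(u), mu_{t+1}(f(x, u)))."""
    horizon = len(mu_constraint)
    # value[horizon] is the fuzzy goal on the final state.
    value: List[Dict[State, float]] = [dict() for _ in range(horizon)]
    value.append(dict(mu_goal))
    policy: List[Dict[State, Action]] = [dict() for _ in range(horizon)]
    for t in reversed(range(horizon)):
        for x in states:
            best_action, best_grade = actions[0], -1.0
            for u in actions:
                # Fuzzy intersection (min) of the constraint on u and the
                # attainable goal grade from the successor state.
                grade = min(mu_constraint[t][u], value[t + 1][step[(x, u)]])
                if grade > best_grade:
                    best_action, best_grade = u, grade
            value[t][x] = best_grade
            policy[t][x] = best_action
    return value, policy


# Hypothetical deterministic system with three states and two actions.
states = [0, 1, 2]
actions = ["a", "b"]
step = {
    (0, "a"): 0, (0, "b"): 1,
    (1, "a"): 2, (1, "b"): 0,
    (2, "a"): 2, (2, "b"): 1,
}
# Hypothetical fuzzy constraints on the action at each of two stages.
mu_constraint = [{"a": 0.7, "b": 1.0}, {"a": 1.0, "b": 0.6}]
# Hypothetical fuzzy goal on the final state.
mu_goal = {0: 0.3, 1: 1.0, 2: 0.8}

value, policy = fuzzy_dp(states, actions, step, mu_constraint, mu_goal)
print(value[0])   # {0: 0.8, 1: 0.7, 2: 0.8}
print(policy[0])  # {0: 'b', 1: 'a', 2: 'b'}
```

The recursion requires the transition table and all memberships to be known in advance and sweeps every state at every stage. A Q-Learning style method instead learns action values from sampled transitions, which is the kind of simplification the abstract refers to.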

References

A.G. Barto, S. Bradtke, and S. Singh. Learning to act using real-time dynamic programming. Submitted to AI Journal special issue on Computational Theories of Interaction and Agency, 1993.
A.G. Barto, R.S. Sutton, and C.W. Anderson. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, 13:834–846, 1983.
R. Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.
R.E. Bellman and L.A. Zadeh. Decision-making in a fuzzy environment. Management Science, 17(4):B-141–B-164, 1970.
H.R. Berenji and P. Khedkar. Learning and tuning fuzzy logic controllers through reinforcements. IEEE Transactions on Neural Networks, 3(5), 1992.
H.R. Berenji, Y. Jani, R.N. Lea, P. Khedkar, A. Malkani, and J. Hoblit. Space shuttle attitude control by fuzzy logic and reinforcement learning. In Second IEEE International Conference on Fuzzy Systems, San Francisco, CA, March 1993.
L.J. Lin. Programming robots using reinforcement learning and teaching. In Proceedings of the Ninth National Conference on Artificial Intelligence, 1991.
A. Moore and C. Atkeson. Prioritized sweeping: Reinforcement learning with less data and less real time. Machine Learning, to appear.
R.S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9–44, 1988.
R.S. Sutton. Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning, 1990.
G. Tesauro. Practical issues in temporal difference learning. Machine Learning, 8:257–277, 1992.
G. Tesauro. TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6(2):215–219, 1994.
C. Watkins and P. Dayan. Q-learning. Machine Learning, 8:279–292, 1992.
C.J.C.H. Watkins. Learning with Delayed Rewards. PhD thesis, Cambridge University, Psychology Department, 1989.

Author information

Authors and Affiliations

Intelligent Inference Systems Corp., Artificial Intelligence Research Branch, MS: 269-2, NASA Ames Research Center, Mountain View, CA 94035
Hamid R. Berenji

Editor information

Anca L. Ralescu

About this paper

Cite this paper

Berenji, H.R. (1994). Fuzzy reinforcement learning and dynamic programming. In: Ralescu, A.L. (eds) Fuzzy Logic in Artificial Intelligence. FLAI 1993. Lecture Notes in Computer Science, vol 847. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58409-9_1

DOI: https://doi.org/10.1007/3-540-58409-9_1
Published: 02 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58409-4
Online ISBN: 978-3-540-48780-7
eBook Packages: Springer Book Archive
