Abstract
Dyna is a single-agent architectural framework that integrates learning, planning, and reacting. Well-known instantiations of Dyna are Dyna-AC and Dyna-Q. Here, a multiagent extension of Dyna-Q is presented. This extension, called M-Dyna-Q, constitutes a novel coordination framework that bridges the gap between plan-based and reactive coordination in multiagent systems. The paper summarizes the key features of Dyna, describes M-Dyna-Q in detail, provides experimental results, and carefully discusses the benefits and limitations of this framework.
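To make the three ingredients named above concrete, the following is a minimal sketch of single-agent tabular Dyna-Q on a toy corridor task. The environment, state space, and all parameter values are illustrative assumptions, not taken from the paper; only the overall loop structure (act, learn from real experience, then plan over simulated experience drawn from a learned model) follows the standard Dyna-Q scheme.

```python
import random

# Toy deterministic corridor (an assumed example task): states 0..5,
# actions -1/+1, reward 1 on reaching the goal state 5.
N_STATES, GOAL = 6, 5
ACTIONS = (-1, 1)

def step(s, a):
    """Environment dynamics: move within the corridor; reward only at the goal."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def dyna_q(episodes=30, planning_steps=10, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    model = {}  # (s, a) -> (s', r): the learned world model
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # (1) reacting: epsilon-greedy action from the current Q-values
            a = (rng.choice(ACTIONS) if rng.random() < eps
                 else max(ACTIONS, key=lambda x: Q[(s, x)]))
            s2, r, done = step(s, a)
            # (2) learning: one-step Q-learning update from real experience
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, x)] for x in ACTIONS)
                                  - Q[(s, a)])
            model[(s, a)] = (s2, r)
            # (3) planning: replay simulated experience drawn from the model
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                ps2, pr = model[(ps, pa)]
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, x)] for x in ACTIONS)
                                        - Q[(ps, pa)])
            s = s2
    return Q

Q = dyna_q()
# After training, the greedy policy should point toward the goal
# from every non-goal state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
```

The planning loop is what distinguishes Dyna-Q from plain Q-learning: each real transition is stored in the model and then replayed repeatedly, so value information propagates without additional environment interaction. M-Dyna-Q lifts this scheme to multiple agents, as described in the paper.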
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Weiß, G. (2001). An Architectural Framework for Integrated Multiagent Planning, Reacting, and Learning. In: Castelfranchi, C., Lespérance, Y. (eds) Intelligent Agents VII Agent Theories Architectures and Languages. ATAL 2000. Lecture Notes in Computer Science(), vol 1986. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44631-1_22
DOI: https://doi.org/10.1007/3-540-44631-1_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42422-2
Online ISBN: 978-3-540-44631-6