Part of the book series: Studies in Computational Intelligence (SCI, volume 603)

Abstract

While the previous chapter describes the general problem investigated in this book, this chapter presents the mathematical model used to analyze that problem, along with the existing algorithms that our approach builds upon. The chapter then grounds the general ad hoc teamwork problem in a number of domains that the remainder of the book uses to evaluate the proposed approach. Using the dimensions described in Sect. 2.3, we analyze these domains as well as the teammates that the ad hoc agent may encounter. Informally, we find that similar algorithms are effective on problems with similar values along these dimensions, although we do not use these values for further algorithm design or selection in this book.
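
As a hedged illustration (not code from the book): the existing algorithms this chapter reviews include bandit-based Monte-Carlo planning (UCT; Kocsis and Szepesvári 2006), whose core is the UCB1 action-selection rule (Auer et al. 2002). The Python sketch below renders UCB1 minimally; all names, parameters, and the usage example are illustrative assumptions, not the book's implementation.

    import math
    import random

    def ucb1_select(counts, values, c=math.sqrt(2)):
        """Return the index of the arm maximizing mean value plus an
        exploration bonus: values[i] + c * sqrt(ln(total) / counts[i])."""
        # Try every arm once before applying the UCB formula.
        for i, n in enumerate(counts):
            if n == 0:
                return i
        total = sum(counts)
        return max(range(len(counts)),
                   key=lambda i: values[i]
                   + c * math.sqrt(math.log(total) / counts[i]))

    # Illustrative usage: three arms with hypothetical Bernoulli payoffs.
    probs = [0.2, 0.5, 0.8]
    counts, values = [0, 0, 0], [0.0, 0.0, 0.0]
    for _ in range(1000):
        i = ucb1_select(counts, values)
        r = 1.0 if random.random() < probs[i] else 0.0
        counts[i] += 1
        values[i] += (r - values[i]) / counts[i]  # incremental mean update

In UCT, this same rule selects actions at each node of a Monte-Carlo search tree, balancing exploitation of high-value actions against exploration of rarely tried ones.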

Notes

  1. http://www.cs.utexas.edu/~larg/index.php/Ad_Hoc_Teamwork:_Pursuit.

  2. http://sourceforge.net/projects/sserver/.

  3. http://www.cs.utexas.edu/~AustinVilla/sim/halffieldoffense/.

  4. http://www.cs.utexas.edu/~larg/index.php/Ad_Hoc_Teamwork:_HFO.

  5. http://www.socsim.robocup.org/files/2D/.

Author information

Correspondence to Samuel Barrett.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Barrett, S. (2015). Background. In: Making Friends on the Fly: Advances in Ad Hoc Teamwork. Studies in Computational Intelligence, vol 603. Springer, Cham. https://doi.org/10.1007/978-3-319-18069-4_3

  • DOI: https://doi.org/10.1007/978-3-319-18069-4_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18068-7

  • Online ISBN: 978-3-319-18069-4

  • eBook Packages: Engineering (R0)
