Part of the book series: Studies in Computational Intelligence (SCI, volume 603)

Abstract

While the previous chapter describes the general problem investigated in this book, this chapter presents the mathematical model used to analyze that problem, along with the existing algorithms that our approach builds upon. The chapter then grounds the general ad hoc teamwork problem in a number of domains that the remainder of the book uses to evaluate the proposed approach. Using the dimensions described in Sect. 2.3, we analyze these domains as well as the teammates that the ad hoc agent may encounter. Informally, we find that similar algorithms are effective on problems with similar values along these dimensions, although we do not use these values for further algorithm design or selection in this book.
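
As a hedged illustration (not code from the book): the existing algorithms this chapter reviews include bandit-based Monte-Carlo planning (UCT; Kocsis and Szepesvári 2006), whose core is the UCB1 action-selection rule (Auer et al. 2002). The Python sketch below renders UCB1 minimally; all names, parameters, and the usage example are illustrative assumptions, not the book's implementation.

    import math
    import random

    def ucb1_select(counts, values, c=math.sqrt(2)):
        """Return the index of the arm maximizing mean value plus an
        exploration bonus: values[i] + c * sqrt(ln(total) / counts[i])."""
        # Try every arm once before applying the UCB formula.
        for i, n in enumerate(counts):
            if n == 0:
                return i
        total = sum(counts)
        return max(range(len(counts)),
                   key=lambda i: values[i]
                   + c * math.sqrt(math.log(total) / counts[i]))

    # Illustrative usage: three arms with hypothetical Bernoulli payoffs.
    probs = [0.2, 0.5, 0.8]
    counts, values = [0, 0, 0], [0.0, 0.0, 0.0]
    for _ in range(1000):
        i = ucb1_select(counts, values)
        r = 1.0 if random.random() < probs[i] else 0.0
        counts[i] += 1
        values[i] += (r - values[i]) / counts[i]  # incremental mean update

In UCT, this same rule selects actions at each node of a Monte-Carlo search tree, balancing exploitation of high-value actions against exploration of rarely tried ones.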

Notes

  1. http://www.cs.utexas.edu/~larg/index.php/Ad_Hoc_Teamwork:_Pursuit.

  2. http://sourceforge.net/projects/sserver/.

  3. http://www.cs.utexas.edu/~AustinVilla/sim/halffieldoffense/.

  4. http://www.cs.utexas.edu/~larg/index.php/Ad_Hoc_Teamwork:_HFO.

  5. http://www.socsim.robocup.org/files/2D/.

Author information

Correspondence to Samuel Barrett.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Barrett, S. (2015). Background. In: Making Friends on the Fly: Advances in Ad Hoc Teamwork. Studies in Computational Intelligence, vol 603. Springer, Cham. https://doi.org/10.1007/978-3-319-18069-4_3

  • DOI: https://doi.org/10.1007/978-3-319-18069-4_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18068-7

  • Online ISBN: 978-3-319-18069-4

  • eBook Packages: Engineering (R0)
