
On Universal Transfer Learning

  • Conference paper
Algorithmic Learning Theory (ALT 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4754)


Abstract

In transfer learning, the aim is to solve new learning tasks with fewer examples by exploiting information gained from solving related tasks. Existing transfer learning methods have been used successfully in practice, and PAC analyses of these methods have been developed. But the key notion of relatedness between tasks has not yet been defined clearly, which makes it difficult to understand, let alone answer, questions that naturally arise in transfer learning, such as how much information to transfer, when to transfer it, and how to transfer it across tasks. In this paper we look at transfer learning from the perspective of Algorithmic Information Theory and formally solve these problems in the same sense that Solomonoff Induction solves the problem of inductive inference. We define universal measures of relatedness between tasks, and use these measures to develop universally optimal Bayesian transfer learning methods.
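
The relatedness measures defined in the paper are based on conditional Kolmogorov complexity, which is uncomputable; in applied work such quantities are typically approximated with real compressors. The sketch below illustrates that general idea only, and is not the paper's own method: it uses zlib as a stand-in compressor and the normalized compression distance (NCD) of Li et al. as a computable proxy for the information distance between two tasks, here represented as hypothetical byte strings.

    import zlib

    def compressed_len(x: bytes) -> int:
        """Compressed length of x: a computable stand-in for Kolmogorov complexity K(x)."""
        return len(zlib.compress(x, 9))

    def ncd(p: bytes, q: bytes) -> float:
        """Normalized compression distance between two task descriptions.

        Approximates the (uncomputable) normalized information distance
        max(K(p|q), K(q|p)) / max(K(p), K(q)) by replacing K with a real
        compressor. Values near 0 suggest closely related tasks; values
        near 1 suggest unrelated ones.
        """
        cp, cq, cpq = compressed_len(p), compressed_len(q), compressed_len(p + q)
        return (cpq - min(cp, cq)) / max(cp, cq)

    # Hypothetical task descriptions: training data serialized as bytes.
    task_a = b"0101010101010101" * 32
    task_b = b"0101010101010111" * 32   # shares most regularities with task_a
    task_c = bytes(range(256)) * 2      # little structure in common with task_a

    print(ncd(task_a, task_b))  # small: the tasks are closely related
    print(ncd(task_a, task_c))  # larger: little shared structure

Under the paper's Bayesian scheme, a measure of this kind would be used to weight prior belief in hypotheses for a new task according to how related the new task is to previously solved ones.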






Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mahmud, M.M.H. (2007). On Universal Transfer Learning. In: Hutter, M., Servedio, R.A., Takimoto, E. (eds) Algorithmic Learning Theory. ALT 2007. Lecture Notes in Computer Science (LNAI), vol 4754. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75225-7_14


  • DOI: https://doi.org/10.1007/978-3-540-75225-7_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75224-0

  • Online ISBN: 978-3-540-75225-7

  • eBook Packages: Computer Science, Computer Science (R0)
