
Adaptive Task Sampling for Meta-learning

  • Conference paper
  • Proceedings: Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12363)

Abstract

Meta-learning methods have been extensively studied and applied in computer vision, especially for few-shot classification tasks. The key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time by randomly sampling classes from the meta-training data to construct few-shot tasks for episodic training. While a rich line of work focuses solely on how to extract meta-knowledge across tasks, we address the complementary problem of how to generate informative tasks. We argue that randomly sampled tasks can be sub-optimal and uninformative to the meta-learner (e.g., the task of classifying “dog” from “laptop” is often trivial). In this paper, we propose an adaptive task sampling method to improve the generalization performance. Unlike instance-based sampling, task-based sampling is much more challenging because the task is defined only implicitly in each episode. We therefore propose a greedy class-pair-based sampling method, which selects difficult tasks according to class-pair potentials. We evaluate our adaptive task sampling method on two few-shot classification benchmarks, and it achieves consistent improvements across different feature backbones, meta-learning algorithms, and datasets.
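
The abstract describes the sampling strategy only at a high level. As a rough illustration of the idea (a minimal sketch, not the authors' exact algorithm), the Python snippet below greedily builds an N-way task from a matrix of class-pair potentials, where a larger potential is assumed to indicate that two classes are harder to tell apart; all function and variable names here are illustrative assumptions, not taken from the paper.

    import numpy as np

    def greedy_class_pair_sampling(potentials, n_way, rng=None):
        """Greedily select an n_way subset of classes with high pairwise potential.

        potentials: (C, C) symmetric matrix; potentials[i, j] is assumed to score
        how difficult (and hence informative) it is to discriminate class i from
        class j. How these potentials are computed and updated is not described
        in the abstract, so it is not modeled here.
        """
        rng = np.random.default_rng() if rng is None else rng
        num_classes = potentials.shape[0]
        # Seed the task with one class pair, drawn in proportion to its potential.
        pair_probs = np.triu(potentials, k=1).ravel()
        pair_probs = pair_probs / pair_probs.sum()
        first = int(rng.choice(num_classes * num_classes, p=pair_probs))
        selected = [first // num_classes, first % num_classes]
        # Greedily add the class whose summed potential to the chosen set is largest.
        while len(selected) < n_way:
            remaining = [c for c in range(num_classes) if c not in selected]
            scores = [potentials[c, selected].sum() for c in remaining]
            selected.append(remaining[int(np.argmax(scores))])
        return selected

    # Example: sample one 5-way task from 64 meta-training classes with random potentials.
    C = 64
    pot = np.random.rand(C, C)
    pot = (pot + pot.T) / 2
    np.fill_diagonal(pot, 0)
    task_classes = greedy_class_pair_sampling(pot, n_way=5)

Each sampled class subset would then be turned into a few-shot episode (support and query examples per class) in the usual episodic-training pipeline; that part is standard and omitted above.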

C. Liu and Z. Wang contributed equally and completed most of this work while at the School of Information Systems, Singapore Management University (SMU). Steven C.H. Hoi is currently with Salesforce Research Asia and on leave from SMU.



Acknowledgment

This research is supported by the National Research Foundation, Singapore under its AI Singapore Programme (AISG Award No: AISG-RP-2018-001). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of the National Research Foundation, Singapore.

Author information


Corresponding author

Correspondence to Chenghao Liu.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 362 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Liu, C., Wang, Z., Sahoo, D., Fang, Y., Zhang, K., Hoi, S.C.H. (2020). Adaptive Task Sampling for Meta-learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12363. Springer, Cham. https://doi.org/10.1007/978-3-030-58523-5_44

  • DOI: https://doi.org/10.1007/978-3-030-58523-5_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58522-8

  • Online ISBN: 978-3-030-58523-5

  • eBook Packages: Computer Science, Computer Science (R0)
