Asymmetric graph based zero shot learning

Multimedia Tools and Applications

Abstract

Zero-shot learning (ZSL) has attracted considerable attention because it can recognize unseen categories after training only on samples from seen categories. Existing work has largely focused on learning a projection between the semantic space and the feature space, which has driven substantial progress in ZSL. However, simply establishing a projection often suffers from the visual-semantic ambiguity problem and the hubness problem. Specifically, visual patterns and semantic concepts often cannot be properly matched, which leads to inaccurate recognition results. To this end, we propose a novel ZSL model, Asymmetric Graph-based Zero Shot Learning (AGZSL), which simultaneously preserves the class-level semantic manifold and the instance-level visual manifold in a latent space. In addition, to make the model more discriminative, we constrain the latent space to be orthogonal, meaning that the projected visual features and semantic embeddings in the latent space are orthogonal when they belong to different categories. We evaluate our approach on four benchmark datasets under both the standard zero-shot setting and the more realistic generalized zero-shot learning (GZSL) setting, and the results show that AGZSL significantly improves performance compared with state-of-the-art methods.
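The orthogonality idea in the abstract can be illustrated with a small sketch. The abstract does not give the actual AGZSL objective, so the code below is not the authors' formulation. It assumes one simple way to express the constraint: penalize the squared inner product between each projected visual feature and the projected semantic embedding of every class it does not belong to. The function name `cross_class_orthogonality_penalty` and the arrays `Z`, `labels`, and `P` are hypothetical names introduced for illustration.

```python
import numpy as np

def cross_class_orthogonality_penalty(Z, labels, P):
    """Illustrative penalty (an assumption, not the paper's objective).

    Z:      (n, d) visual features already projected into the latent space
    labels: (n,)   integer class index of each sample
    P:      (c, d) class semantic embeddings projected into the same space

    Returns the sum of squared inner products between every sample and the
    embeddings of all classes other than its own. The value is zero exactly
    when each sample is orthogonal to every other-class embedding.
    """
    sims = Z @ P.T                                   # (n, c) inner products
    mask = np.ones_like(sims, dtype=bool)            # start with all pairs
    mask[np.arange(len(labels)), labels] = False     # drop same-class pairs
    return float(np.sum(sims[mask] ** 2))
```

With orthonormal class embeddings and samples that coincide with their own class embedding, the penalty vanishes; a sample that leaks onto another class's direction makes it positive. In a full model such a term would presumably be combined with the manifold-preserving terms, but how they are weighted is not stated in the abstract.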

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61872187).

Author information

Corresponding author

Correspondence to Haofeng Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Y., Zhang, H., Zhang, Z. et al. Asymmetric graph based zero shot learning. Multimed Tools Appl 79, 33689–33710 (2020). https://doi.org/10.1007/s11042-019-7689-y

  • DOI: https://doi.org/10.1007/s11042-019-7689-y
