Label-Smooth Learning for Fine-Grained Visual Categorization

Mo, Xianjie; Wei, Tingting; Zhang, Hengmin; Huang, Qiong; Luo, Wei

doi:10.1007/978-3-030-41404-7_2

Xianjie Mo¹², Tingting Wei¹², Hengmin Zhang¹³, Qiong Huang¹² & Wei Luo¹² (ORCID: orcid.org/0000-0003-1431-4134)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12046)

Included in the following conference series:

Asian Conference on Pattern Recognition

Abstract

Fine-Grained Visual Categorization (FGVC) is challenging due to the high similarity among categories and the large within-category variance. Existing work tackles this problem by designing self-localization modules in an end-to-end DCNN to learn semantic part features. However, the efficiency of this strategy decreases significantly as the number of categories increases, because more parts are needed to offset the impact of the additional categories. In this paper, we propose a label-smooth learning method that improves a model’s applicability to large numbers of categories by maximizing its prediction diversity. Based on the similarity among fine-grained categories, a KL divergence between the uniform and prediction distributions is established to reduce the model’s confidence on the ground-truth category while raising its confidence on similar categories. By minimizing this divergence, information from similar categories is exploited for model learning, thus diminishing the effects caused by the growing number of categories. Experiments on five benchmark datasets, four mid-scale (CUB-200-2011, Stanford Dogs, Stanford Cars, and FGVC-Aircraft) and one large-scale (NABirds), show a clear advantage of the proposed label-smooth learning and demonstrate its comparable or state-of-the-art performance. Code is available at https://github.com/Cedric-Mo/LS-for-FGVC.
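
The abstract does not give the exact loss, so the following is only a minimal sketch of the general idea, assuming the common label-smoothing form in which a cross-entropy term is combined with a KL divergence from the uniform distribution to the model's prediction. The function names, the direction of the KL term, and the weight `epsilon` are illustrative assumptions, not taken from the paper or its released code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_uniform_to_prediction(probs):
    """KL(u || p), where u is uniform over len(probs) classes.

    The direction u -> p is an assumption made for this sketch.
    """
    u = 1.0 / len(probs)
    return sum(u * math.log(u / p) for p in probs)


def label_smooth_loss(logits, target, epsilon=0.1):
    """(1 - epsilon) * cross-entropy + epsilon * KL(u || p).

    epsilon is a hypothetical smoothing weight, not a value from the paper.
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[target])
    kl_term = kl_uniform_to_prediction(probs)
    return (1.0 - epsilon) * cross_entropy + epsilon * kl_term
```

Under these assumptions the KL term is zero when the prediction is exactly uniform and grows as the prediction becomes more peaked, so minimizing it lowers confidence on the ground-truth category and shifts probability mass toward the other categories. Setting `epsilon=0` recovers plain cross-entropy.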

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61702197 and in part by the Natural Science Foundation of Guangdong Province under Grant 2017A030310261.

Author information

Authors and Affiliations

College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510642, GD, People’s Republic of China
Xianjie Mo, Tingting Wei, Qiong Huang & Wei Luo
Key Laboratory of Advanced Control and Optimization for Chemical Processes of Ministry of Education, East China University of Science and Technology, Shanghai, 200237, People’s Republic of China
Hengmin Zhang

Corresponding author

Correspondence to Wei Luo.

Editor information

Editors and Affiliations

University of Malaya, Kuala Lumpur, Malaysia
Shivakumara Palaiahnakote
Consiglio Nazionale delle Ricerche, ICAR, Naples, Italy
Gabriella Sanniti di Baja
Chinese Academy of Sciences, Beijing, China
Liang Wang
Auckland University of Technology, Auckland, New Zealand
Wei Qi Yan

Cite this paper

Mo, X., Wei, T., Zhang, H., Huang, Q., Luo, W. (2020). Label-Smooth Learning for Fine-Grained Visual Categorization. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol 12046. Springer, Cham. https://doi.org/10.1007/978-3-030-41404-7_2

DOI: https://doi.org/10.1007/978-3-030-41404-7_2
Published: 23 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41403-0
Online ISBN: 978-3-030-41404-7
eBook Packages: Computer Science, Computer Science (R0)
