Scene Recognition via Bi-enhanced Knowledge Space Learning

Zhang, Jin; Bao, Bing-Kun; Xu, Changsheng

doi:10.1007/978-981-13-9190-3_23

Jin Zhang¹⁰,
Bing-Kun Bao^11,12 &
Changsheng Xu^10,12

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1013))

Included in the following conference series:

International Computer Symposium

1324 Accesses

Abstract

Scene recognition is one of the hallmark tasks in computer vision, as it provides rich information beyond object recognition and action recognition. It is easy to accept that scene images from the same class always include the same essential objects and relations, for example, scene images of “wedding” usually have bridegroom and bride next to him. Following this observation, we introduce a novel idea to boost the accuracy of scene recognition by mining essential scene sub-graph and learning a bi-enhanced knowledge space. The essential scene sub-graph describes the essential objects and their relations for each scene class. The learned knowledge space is bi-enhanced by global representation on the entire image and local representation on the corresponding essential scene sub-graph. Experimental results on the constructed dataset called Scene 30 demonstrate the effectiveness of our proposed method.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 89.00; Price excludes VAT (USA)

Softcover Book: USD 119.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
MATH Google Scholar
Chen, P.H., Lin, C.J., Schölkopf, B.: A tutorial on-support vector machines. Appl. Stoch. Models Bus. Ind. 21(2), 111–136 (2005)
Article MathSciNet Google Scholar
Cheng, X., Lu, J., Feng, J., Yuan, B., Zhou, J.: Scene recognition with objectness. Pattern Recogn. 74, 474–487 (2018)
Article Google Scholar
Chollet, F., et al.: Keras (2015). https://github.com/keras-team/keras
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detectionwith discriminatively trained part-based models. PAMI 32(9), 1627–1645 (2010)
Article Google Scholar
Geman, S., Graffigne, C.: Markov random field image models and their applications to computer vision. In: Proceedings of the International Congress of Mathematicians, vol. 1, p. 2 (1986)
Google Scholar
Gong, Y., Wang, L., Guo, R., Lazebnik, S.: Multi-scale orderless pooling of deep convolutional activation features. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8695, pp. 392–407. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10584-0_26
Chapter Google Scholar
Herranz, L., Jiang, S., Li, X.: Scene recognition with CNNs: objects, scales and dataset bias. In: CVPR, pp. 571–579 (2016)
Google Scholar
Huang, S., Xu, Z., Tao, D., Zhang, Y.: Part-stacked CNN for fine-grained visual categorization. In: CVPR, pp. 1173–1182 (2016)
Google Scholar
Krishna, R., et al.: Visual genome: connecting language and vision using crowdsourced dense image annotations. IJCV 123(1), 32–73 (2017)
Article MathSciNet Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Google Scholar
Bao, B.-K., Zhu, G., Shen, J., Yan, S.: Robust image analysis with sparse representation on quantized visual features. IEEE Trans. Image Process. 22(3), 860–871 (2013)
Article MathSciNet Google Scholar
Li, L.J., Su, H., Fei-Fei, L., Xing, E.P.: Object bank: A high-level image representation for scene classification and semantic feature sparsification. In: Advances in Neural Information Processing Systems, pp. 1378–1386 (2010)
Google Scholar
Margolin, R., Zelnik-Manor, L., Tal, A.: OTC: a novel local descriptor for scene classification. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8695, pp. 377–391. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10584-0_25
Chapter Google Scholar
Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representationof the spatial envelope. IJCV 42(3), 145–175 (2001)
Article Google Scholar
Parizi, S.N., Oberlin, J.G., Felzenszwalb, P.F.: Reconfigurable models for scene recognition. In: CVPR 2012, pp. 2775–2782. IEEE (2012)
Google Scholar
Russakovsky, O., et al.: Imagenet large scale visual recognition challenge. IJCV 115(3), 211–252 (2015)
Article MathSciNet Google Scholar
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Stamp, M., Professor, A.: A revealing introduction to hidden Markov models. IEEE ASSP Magruine 1(24), 258–261 (2004)
Google Scholar
Sudderth, E.B., Torralba, A., Freeman, W.T., Willsky, A.S.: Learning hierarchical models of scenes, objects, and parts. In: ICCV 2005, vol. 2, pp. 1331–1338. IEEE (2005)
Google Scholar
Wang, Z., Wang, L., Wang, Y., Zhang, B., Qiao, Y.: Weakly supervised patchnets: describing and aggregating local patches for scene recognition. TIP 26(4), 2028–2041 (2017)
MathSciNet MATH Google Scholar
Wu, J., Rehg, J.M.: Centrist: a visual descriptor for scene categorization. PAMI 33(8), 1489–1501 (2011)
Article Google Scholar
Xie, G.S., Zhang, X.Y., Yan, S., Liu, C.L.: Hybrid CNN and dictionary-based models for scene recognition and domain adaptation. IEEE Trans. Circuits Syst. Video Technol. 27(6), 1263–1274 (2017)
Article Google Scholar
Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. PAMI 40, 1452–1464 (2017)
Article Google Scholar
Bao, B.-K., Liu, G., Changsheng, X., Yan, S.: Inductive robust principal component analysis. IEEE Trans. Image Process. 21(8), 3794–3800 (2012)
Article MathSciNet Google Scholar
Bao, B.-K., Min, W., Li, T., Changsheng, X.: Joint local and global consistency on interdocument and interword relationships for co-clustering. IEEE Trans. Cybern. 45(1), 15–28 (2015)
Article Google Scholar
Min, W., Bao, B.-K., Mei, S., Zhu, Y., Rui, Y., Jiang, S.: You are what you eat: exploring rich recipe information for cross-region food analysis. IEEE Trans. Multimed. 20(4), 950–964 (2018)
Article Google Scholar
Bao, B.-K., Changsheng, X., Min, W., Hossain, M.S.: Cross-platform emerging topic detection and elaboration from multimedia streams. TOMCCAP 11(4), 54 (2015)
Article Google Scholar
Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: Advances in Neural Information Processing Systems, pp. 487–495 (2014)
Google Scholar

Download references

Acknowledgement

This work is supported by the National Key Research & Development Plan of China (No. 2017YFB1002800), by the National Natural Science Foundation of China under Grant 61872424, 61572503, 61720106006, 61432019, and by NUPTSF (No. NY218001), also supported by the Key Research Program of Frontier Sciences, CAS, Grant NO. QYZDJ-SSW-JSC039, and the K.C. Wong Education Foundation.

Author information

Authors and Affiliations

Hefei University of Technology, Hefei, China
Jin Zhang & Changsheng Xu
Nanjing University of Posts and Telecommunications, Nanjing, China
Bing-Kun Bao
National Lab of Pattern Recognition, Institute of Automation, Beijing, China
Bing-Kun Bao & Changsheng Xu

Authors

Jin Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Bing-Kun Bao
View author publications
You can also search for this author in PubMed Google Scholar
Changsheng Xu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Bing-Kun Bao .

Editor information

Editors and Affiliations

National Yunlin University of Science and Technology, Douliu, Taiwan
Chuan-Yu Chang
National Yunlin University of Science and Technology, Douliu, Taiwan
Chien-Chou Lin
Southern Taiwan University of Science and Technology, Tainan, Taiwan
Horng-Horng Lin

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhang, J., Bao, BK., Xu, C. (2019). Scene Recognition via Bi-enhanced Knowledge Space Learning. In: Chang, CY., Lin, CC., Lin, HH. (eds) New Trends in Computer Technologies and Applications. ICS 2018. Communications in Computer and Information Science, vol 1013. Springer, Singapore. https://doi.org/10.1007/978-981-13-9190-3_23

Download citation

DOI: https://doi.org/10.1007/978-981-13-9190-3_23
Published: 11 July 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9189-7
Online ISBN: 978-981-13-9190-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics