
Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 1153))

Abstract

Most successful approaches to category-level image classification apply the Bag-of-Words (BoW) model. To incorporate spatial image information into the BoW model, Spatial Pyramids (SPs) are used. However, spatial pyramids are rigid and rely on predefined grid configurations. As a consequence, they often fail to match the underlying spatial structure of images from different categories, which may reduce classification accuracy.

This study addresses the problem with a selective Spatial Pyramid (SP) representation that automatically learns the most appropriate shape for each category. The proposed approach builds the image representation by inferring the image's constituent geometrical parts, so the representation retains descriptive spatial information and yields a structural description of the image. Large-scale experiments on the Pascal VOC2007 and Caltech101 datasets show that SPs obtained by selective search outperform standard SPs.
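
For orientation, the standard fixed-grid spatial pyramid that the abstract contrasts against can be sketched as below. This is a minimal illustration, not the chapter's selective method: it assumes keypoints have already been quantized to visual-word indices, uses grids of 2^l x 2^l cells at level l, and omits the per-level weights used in spatial pyramid matching. The function name and its parameters are hypothetical.

```python
import numpy as np

def spatial_pyramid_histogram(xy, words, vocab_size, image_size, levels=2):
    """Concatenate BoW histograms over a fixed grid pyramid (illustrative only).

    xy:         (N, 2) array of keypoint (x, y) positions in pixels.
    words:      (N,) array of visual-word indices in [0, vocab_size).
    image_size: (width, height) of the image.
    levels:     highest pyramid level; level l uses a 2^l x 2^l grid.
    """
    xy = np.asarray(xy, dtype=float)
    words = np.asarray(words, dtype=int)
    width, height = image_size
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level
        # Map each keypoint to a grid cell; clip so points on the far edge stay inside.
        cx = np.clip((xy[:, 0] / width * cells).astype(int), 0, cells - 1)
        cy = np.clip((xy[:, 1] / height * cells).astype(int), 0, cells - 1)
        cell_id = cy * cells + cx
        # One histogram per cell, built with a single bincount over (cell, word) pairs.
        flat = np.bincount(cell_id * vocab_size + words,
                           minlength=cells * cells * vocab_size)
        parts.append(flat)
    hist = np.concatenate(parts).astype(float)
    total = hist.sum()
    # L1-normalize so images with different keypoint counts are comparable.
    return hist / total if total > 0 else hist

# Usage with synthetic keypoints (random placeholders, not real image data).
rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(50, 2))
words = rng.integers(0, 8, size=50)
h = spatial_pyramid_histogram(xy, words, vocab_size=8, image_size=(100, 100))
print(h.shape)  # (168,) = 8 words x (1 + 4 + 16) cells
```

Because the cell boundaries depend only on the image size and the level, every category receives the same partition. That is the rigidity the abstract refers to.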




Corresponding author

Correspondence to Noha Elfiky.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Elfiky, N. (2020). A Novel Spatial Layout Representation for Object Recognition. In: Hassanien, AE., Azar, A., Gaber, T., Oliva, D., Tolba, F. (eds) Proceedings of the International Conference on Artificial Intelligence and Computer Vision (AICV2020). AICV 2020. Advances in Intelligent Systems and Computing, vol 1153. Springer, Cham. https://doi.org/10.1007/978-3-030-44289-7_52
