Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12353)

Abstract

This paper tackles a 2D architecture vectorization problem, whose task is to infer an outdoor building architecture as a 2D planar graph from a single RGB image. We provide a new benchmark with ground-truth annotations for 2,001 complex buildings across the cities of Atlanta, Paris, and Las Vegas. We also propose a novel algorithm utilizing 1) convolutional neural networks (CNNs) that detect geometric primitives and infer their relationships and 2) integer programming (IP) that assembles the information into a 2D planar graph. While a trivial task for human vision, inferring a graph structure with an arbitrary topology is still an open problem for computer vision. Qualitative and quantitative evaluations demonstrate that our algorithm makes significant improvements over the current state of the art, towards an intelligent system at the level of human perception. We will share code and data.

Notes

  1. Rooms are regions in their problem and can be detected easily. Our regions are roof segments, which are much less distinguishable.

  2. In short, a corner is declared correct if a ground-truth corner exists within a certain distance of it. An edge is declared correct if both of its corners are declared correct. A region is declared correct if there exists a ground-truth region with which its IOU exceeds 0.7. Our only change is to tighten the distance tolerance for corner detection from 10 pixels to 8 pixels.
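
The correctness criteria in note 2 can be sketched in a few lines of Python. This is a minimal illustration that follows the note's wording literally, not the authors' evaluation code. The function names, the representation of regions as sets of pixel coordinates, and the inclusive reading of "within a certain distance" are all assumptions made for the example. Only the 8-pixel tolerance and the 0.7 IOU threshold come from the note.

```python
import math

CORNER_TOLERANCE_PX = 8.0   # distance tolerance stated in note 2
REGION_IOU_THRESHOLD = 0.7  # IOU threshold stated in note 2


def corner_is_correct(corner, gt_corners, tol=CORNER_TOLERANCE_PX):
    """True if some ground-truth corner lies within `tol` pixels.

    Assumption: "within" is read as an inclusive bound (<= tol).
    """
    return any(math.dist(corner, gt) <= tol for gt in gt_corners)


def edge_is_correct(edge, gt_corners, tol=CORNER_TOLERANCE_PX):
    """True if both endpoints of the edge are correct corners.

    This follows the note's short description only; it does not check
    that the matched ground-truth corners share a ground-truth edge.
    """
    a, b = edge
    return (corner_is_correct(a, gt_corners, tol)
            and corner_is_correct(b, gt_corners, tol))


def iou(region_a, region_b):
    """Intersection over union of two regions given as sets of (x, y) pixels."""
    union = len(region_a | region_b)
    return len(region_a & region_b) / union if union else 0.0


def region_is_correct(region, gt_regions, thresh=REGION_IOU_THRESHOLD):
    """True if some ground-truth region overlaps with IOU above `thresh`."""
    return any(iou(region, gt) > thresh for gt in gt_regions)


def rect_pixels(x0, y0, x1, y1):
    """Hypothetical helper: pixel set of an axis-aligned rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}
```

For instance, with ground-truth corners at (0, 0), (100, 0), and (100, 50), a predicted corner at (3, 4) is 5 pixels from (0, 0) and counts as correct, while one at (10, 10) does not. A predicted 10 x 8 region inside a 10 x 10 ground-truth region has an IOU of 0.8 and passes the threshold.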

Acknowledgement

This research is partially supported by NSERC Discovery Grants, NSERC Discovery Grants Accelerator Supplements, and DND/NSERC Discovery Grant Supplement. This research is also supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior/Interior Business Center (DOI/IBC) contract number D17PC00288. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI/IBC, or the U.S. Government.

Author information

Corresponding authors

Correspondence to Nelson Nauata or Yasutaka Furukawa.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 17335 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Nauata, N., Furukawa, Y. (2020). Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12353. Springer, Cham. https://doi.org/10.1007/978-3-030-58598-3_42

  • DOI: https://doi.org/10.1007/978-3-030-58598-3_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58597-6

  • Online ISBN: 978-3-030-58598-3

  • eBook Packages: Computer Science (R0)
