Stereo-Based vs. Monocular 6-DoF Pose Estimation Using Point Features: A Quantitative Comparison

  • Conference paper
Autonome Mobile Systeme 2009

Part of the book series: Informatik aktuell ((INFORMAT))

Abstract

In recent years, object recognition and localization based on correspondences of local point features in 2D views have become very popular in the robotics community. For grasping and manipulation with robotic systems, accurate 6-DoF pose estimation of the object of interest is additionally required. There are two substantially different approaches to computing a 6-DoF pose: monocular and stereo-based. In this paper, we show the theoretical and practical drawbacks and limits of monocular approaches based on 2D-3D correspondences. We then present our stereo-based approach and compare its results to those of the conventional monocular approach in an experimental evaluation. As will be shown, our stereo-based approach is superior in terms of robustness and accuracy, with only little additional computational effort.
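
One general reason depth is harder to recover from a single view can be illustrated with a first-order pinhole-camera sketch. This is not the paper's method or data: the focal length, object size, baseline, and working distance below are assumed values chosen only for illustration, and the function names are hypothetical. The sketch compares how a one-pixel measurement error changes the depth inferred from the projected size of an object (monocular) and from the disparity between two views (stereo).

```python
def depth_from_size(f_px, size_m, size_px):
    """Monocular cue: depth from the projected size of an object of known size."""
    return f_px * size_m / size_px


def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Stereo cue: depth from the disparity of a point between two rectified views."""
    return f_px * baseline_m / disparity_px


def depth_error(depth_fn, f_px, length_m, true_depth_m, pixel_error):
    """Absolute depth error when the pixel measurement is too small by pixel_error."""
    exact_px = f_px * length_m / true_depth_m  # noise-free measurement
    return abs(depth_fn(f_px, length_m, exact_px - pixel_error) - true_depth_m)


# Assumed, illustrative values (not taken from the paper).
F_PX = 500.0          # focal length in pixels
OBJECT_SIZE_M = 0.05  # visible extent of the object in metres
BASELINE_M = 0.10     # distance between the two cameras in metres
DEPTH_M = 0.75        # true distance to the object in metres

mono_err = depth_error(depth_from_size, F_PX, OBJECT_SIZE_M, DEPTH_M, 1.0)
stereo_err = depth_error(depth_from_disparity, F_PX, BASELINE_M, DEPTH_M, 1.0)

print(f"monocular depth error for 1 px: {mono_err * 1000:.1f} mm")
print(f"stereo depth error for 1 px:    {stereo_err * 1000:.1f} mm")
```

Under these assumptions, both errors grow roughly with the square of the distance, and the stereo error is smaller whenever the baseline exceeds the visible extent of the object. A full 6-DoF comparison would also have to account for orientation error and feature matching, which this sketch ignores.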

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Azad, P., Asfour, T., Dillmann, R. (2009). Stereo-Based vs. Monocular 6-DoF Pose Estimation Using Point Features: A Quantitative Comparison. In: Dillmann, R., Beyerer, J., Stiller, C., Zöllner, J.M., Gindele, T. (eds) Autonome Mobile Systeme 2009. Informatik aktuell. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10284-4_6
