Object Recognition in the Geometric Era: A Retrospective

Mundy, Joseph L.

doi:10.1007/11957959_1

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4170)


Abstract

Recent advances in object recognition have emphasized the integration of intensity-derived features, such as affine patches, with associated geometric constraints, leading to impressive performance in complex scenes. Over the four previous decades, the central paradigm of recognition was based on formal geometric object descriptions, with a focus on the properties of such descriptions under perspective image formation. This paper will review the key advances of the geometric era and investigate the underlying causes of the movement away from formal geometry and prior models towards the use of statistical learning methods based on appearance features.


Author information

Authors and Affiliations

Division of Engineering, Brown University, Providence, Rhode Island
Joseph L. Mundy

Editor information

Editors and Affiliations

Département d’Informatique, Ecole Normale Supérieure, P.O. Box, Paris, France
Jean Ponce
Carnegie Mellon University, Pittsburgh, USA
Martial Hebert
GRAVIR-INRIA, 655 avenue de l’Europe, P.O. Box, 38330, Montbonnot, France
Cordelia Schmid
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this chapter

Cite this chapter

Mundy, J.L. (2006). Object Recognition in the Geometric Era: A Retrospective. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_1

DOI: https://doi.org/10.1007/11957959_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science, Computer Science (R0)
