Retrieval of Multiple Instances of Objects in Videos

Bursuc, Andrei; Zaharia, Titus; Prêteux, Françoise

doi:10.1007/978-3-642-27355-1_34

Andrei Bursuc, Titus Zaharia & Françoise Prêteux

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7131)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

This paper addresses the retrieval of different instances of an object of interest within a given video document or a video database. The approach relies on a semi-global image representation built from an over-segmentation of the image frames. An aggregation mechanism then groups a set of sub-regions into an object similar to the query, under a global similarity criterion. Two strategies are proposed. The first is a greedy, dynamic region construction method. The second is based on simulated annealing and aims to determine a global optimum. Experimental results show promising performance, with object detection rates of up to 79%.
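
As a rough illustration of the two aggregation strategies named in the abstract, the following Python sketch selects a subset of over-segmented regions whose pooled descriptor best matches a query. It is not the authors' implementation. The histogram descriptor, the L1 distance, the single-flip annealing move, the geometric cooling schedule, and all names (`greedy_select`, `anneal_select`, `steps`, `t0`, `cooling`, `seed`) are assumptions made for this example, and spatial adjacency between regions is ignored for brevity.

```python
import math
import random


def aggregate(regions, selected):
    """Pool the raw histograms of the selected regions and normalize to unit mass."""
    total = [0.0] * len(regions[0])
    for i in selected:
        for b, value in enumerate(regions[i]):
            total[b] += value
    mass = sum(total)
    return [v / mass for v in total] if mass > 0 else total


def distance(regions, selected, query):
    """Assumed global criterion: L1 distance between pooled and query histograms."""
    if not selected:
        return float("inf")
    pooled = aggregate(regions, selected)
    return sum(abs(p - q) for p, q in zip(pooled, query))


def greedy_select(regions, query):
    """Greedy construction: add the region that most reduces the distance, stop when none does."""
    selected, best = set(), float("inf")
    while True:
        candidates = [(distance(regions, selected | {i}, query), i)
                      for i in range(len(regions)) if i not in selected]
        if not candidates:
            break
        d, i = min(candidates)
        if d >= best:  # no remaining region improves the match
            break
        selected.add(i)
        best = d
    return selected, best


def anneal_select(regions, query, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over region subsets with a Metropolis acceptance rule."""
    rng = random.Random(seed)
    current = {rng.randrange(len(regions))}
    cur_d = distance(regions, current, query)
    best, best_d = set(current), cur_d
    t = t0
    for _ in range(steps):
        # Move: toggle the membership of one randomly chosen region.
        candidate = current ^ {rng.randrange(len(regions))}
        if candidate:  # skip the empty selection
            cand_d = distance(regions, candidate, query)
            delta = cand_d - cur_d
            # Always accept improvements; accept worse moves with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, cur_d = candidate, cand_d
                if cur_d < best_d:
                    best, best_d = set(current), cur_d
        t *= cooling  # geometric cooling (an assumed schedule)
    return best, best_d


# Toy data: four regions described by 3-bin histograms, and a normalized query.
toy_regions = [[4, 0, 0], [0, 4, 0], [0, 0, 4], [2, 2, 0]]
toy_query = [0.5, 0.5, 0.0]
print(greedy_select(toy_regions, toy_query))
print(anneal_select(toy_regions, toy_query))
```

The greedy routine stops at the first point where no single added region improves the match, so it can settle in a local optimum. The annealing routine occasionally accepts worse selections while the temperature is high, which is the usual way such a search tries to approach a global optimum.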


Author information

Authors and Affiliations

ARTEMIS Department, UMR CNRS 8145 MAP5, Institut Télécom, Télécom SudParis, 9 rue Charles Fourier, 91011, Evry Cedex, France
Andrei Bursuc & Titus Zaharia
Alcatel-Lucent Bell Labs France, route de Villejust, 91620, Nozay, France
Andrei Bursuc
Mines ParisTech, 60, Boulevard Saint-Michel, 75272, Paris Cedex, France
Françoise Prêteux

Editor information

Editors and Affiliations

Institute of Information Technology, Alpen-Adria-Universität Klagenfurt, Universitätsstr. 65-67, 9020, Klagenfurt, Austria
Klaus Schoeffmann
EURECOM, 2229 Route des Crêtes, BP 193, 06904, Sophia Antipolis Cedex, France
Bernard Merialdo
School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave, 15213-3890, Pittsburgh, PA, USA
Alexander G. Hauptmann
Department of Computer Science, City University of Hong Kong, Tat Chee Ave, Kowloon, Hong Kong
Chong-Wah Ngo
Department of Electronic and Electrical Engineering, University College London, Roberts Building, Torrington Place, WC1E 7JE, London, UK
Yiannis Andreopoulos
Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstrasse 9-11 188/2, 1040, Vienna, Austria
Christian Breiteneder

Cite this paper

Bursuc, A., Zaharia, T., Prêteux, F. (2012). Retrieval of Multiple Instances of Objects in Videos. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, CW., Andreopoulos, Y., Breiteneder, C. (eds) Advances in Multimedia Modeling. MMM 2012. Lecture Notes in Computer Science, vol 7131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27355-1_34

DOI: https://doi.org/10.1007/978-3-642-27355-1_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27354-4
Online ISBN: 978-3-642-27355-1
eBook Packages: Computer Science, Computer Science (R0)
