Compact Video Description for Copy Detection with Precise Temporal Alignment

Douze, Matthijs; Jégou, Hervé; Schmid, Cordelia; Pérez, Patrick

doi:10.1007/978-3-642-15549-9_38

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6311)

Included in the following conference series:

European Conference on Computer Vision

Abstract

This paper introduces a very compact yet discriminative video description, which allows example-based search in a large number of frames corresponding to thousands of hours of video. Our description extracts one descriptor per indexed video frame by aggregating a set of local descriptors. These frame descriptors are encoded using a time-aware hierarchical indexing structure. A modified temporal Hough voting scheme is used to rank the retrieved database videos and estimate segments in them that match the query. If we use a dense temporal description of the videos, matched video segments are localized with excellent precision.
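To make the temporal voting step concrete, here is a minimal sketch of plain temporal Hough voting, not the modified scheme the paper uses. It assumes frame-level matches are already available as tuples of query time, database video id, database time and similarity score; the function name `temporal_hough_vote`, the `bin_width` parameter and the output fields are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def temporal_hough_vote(matches, bin_width=1.0):
    """Rank database videos by voting on the time offset of frame matches.

    matches: iterable of (q_time, video_id, db_time, score) tuples
             (an assumed input format, not taken from the paper).
    Returns one hypothesis per video, sorted by decreasing score.
    """
    # Accumulate votes per (video, quantized offset) cell.
    bins = defaultdict(lambda: {"score": 0.0, "q_times": []})
    for q_time, video_id, db_time, score in matches:
        # Frames of a true copy share roughly the same offset db_time - q_time.
        offset_bin = round((db_time - q_time) / bin_width)
        cell = bins[(video_id, offset_bin)]
        cell["score"] += score
        cell["q_times"].append(q_time)

    # Keep the strongest offset hypothesis for each video.
    best = {}
    for (video_id, offset_bin), cell in bins.items():
        if video_id not in best or cell["score"] > best[video_id]["score"]:
            best[video_id] = {
                "video_id": video_id,
                "offset": offset_bin * bin_width,
                "score": cell["score"],
                # Query-side extent of the frames that voted for this offset.
                "query_segment": (min(cell["q_times"]), max(cell["q_times"])),
            }
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)


# Toy input: video "a" matches consistently at an offset near 10 s,
# video "b" produces two unrelated matches.
matches = [
    (0.0, "a", 10.0, 1.0),
    (1.0, "a", 11.0, 1.0),
    (2.0, "a", 12.1, 1.0),
    (0.0, "b", 5.0, 0.5),
    (1.0, "b", 30.0, 0.5),
]
ranking = temporal_hough_vote(matches)
```

In this sketch the matched segment is simply the span of query times that voted for the winning offset. With a denser temporal sampling of frames, a smaller `bin_width` would give a finer offset estimate.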

Experimental results on the Trecvid 2008 copy detection task and a set of 38000 videos from YouTube show that our method offers an excellent trade-off between search accuracy, efficiency and memory usage.

Author information

Authors and Affiliations

INRIA Grenoble, France
Matthijs Douze & Cordelia Schmid
INRIA Rennes, France
Hervé Jégou
Technicolor Rennes, France
Patrick Pérez

Editor information

Editors and Affiliations

GRASP Laboratory, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Kostas Daniilidis
School of Electrical and Computer Engineering, National Technical University of Athens, 15773, Athens, Greece
Petros Maragos
Department of Applied Mathematics, Ecole Centrale de Paris, Grande Voie des Vignes, 92295, Chatenay-Malabry, France
Nikos Paragios

About this paper

Cite this paper

Douze, M., Jégou, H., Schmid, C., Pérez, P. (2010). Compact Video Description for Copy Detection with Precise Temporal Alignment. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6311. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15549-9_38

DOI: https://doi.org/10.1007/978-3-642-15549-9_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15548-2
Online ISBN: 978-3-642-15549-9
eBook Packages: Computer Science, Computer Science (R0)
