Video Temporal Alignment for Object Viewpoint

Papazoglou, Anestis; Del Pero, Luca; Ferrari, Vittorio

doi:10.1007/978-3-319-54190-7_17


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10114)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

We address the problem of temporally aligning semantically similar videos, for example two videos of cars on different tracks. We present an alignment method that establishes frame-to-frame correspondences such that the two cars are seen from a similar viewpoint (e.g. facing right), while also being temporally smooth and visually pleasing. Unlike previous works, we do not assume that the videos show the same scripted sequence of events. We compare against three alternative methods, including the popular DTW algorithm, on a new dataset of realistic videos collected from the internet. We perform a comprehensive evaluation using a novel protocol that includes both quantitative measures and a user study on visual pleasingness.
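For readers unfamiliar with the DTW baseline mentioned in the abstract, the following is a minimal sketch of classic dynamic time warping over a pairwise frame-cost matrix. It is not the alignment method proposed in the paper. The function name `dtw_align` and the use of a per-frame viewpoint angle as the frame descriptor are assumptions made purely for illustration.

```python
from math import inf


def dtw_align(cost):
    """Classic DTW over a cost matrix (illustrative sketch, not the paper's method).

    cost[i][j] is the dissimilarity between frame i of video A and
    frame j of video B. Returns (total_cost, path), where path is a
    monotonic list of (i, j) frame correspondences.
    """
    n, m = len(cost), len(cost[0])
    # acc[i][j] = cheapest cost of aligning the first i frames of A
    # with the first j frames of B (1-based, with an inf border).
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
            acc[i][j] = cost[i - 1][j - 1] + best_prev

    # Backtrack from the last pair of frames to the first pair.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    path.reverse()
    return acc[n][m], path
```

Given a hypothetical viewpoint angle per frame, a cost matrix can be built as `cost = [[abs(a - b) for b in angles_b] for a in angles_a]` and passed to `dtw_align`. DTW forces the path to start at the first frames and end at the last frames of both videos, and it matches frames in a fixed temporal order, so it is best suited to videos that follow the same sequence of events.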

Author information

Authors and Affiliations

University of Edinburgh, Edinburgh, Scotland
Anestis Papazoglou, Luca Del Pero & Vittorio Ferrari
Blippar, London, UK
Luca Del Pero

Corresponding author

Correspondence to Anestis Papazoglou.

Editor information

Editors and Affiliations

National Tsing Hua University, Hsinchu, Taiwan
Shang-Hong Lai
Graz University of Technology, Graz, Austria
Vincent Lepetit
Drexel University, Philadelphia, Pennsylvania, USA
Ko Nishino
The University of Tokyo, Tokyo, Japan
Yoichi Sato

About this paper

Cite this paper

Papazoglou, A., Del Pero, L., Ferrari, V. (2017). Video Temporal Alignment for Object Viewpoint. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10114. Springer, Cham. https://doi.org/10.1007/978-3-319-54190-7_17

DOI: https://doi.org/10.1007/978-3-319-54190-7_17
Published: 12 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54189-1
Online ISBN: 978-3-319-54190-7
eBook Packages: Computer Science, Computer Science (R0)