Video Temporal Alignment for Object Viewpoint

  • Conference paper
Computer Vision – ACCV 2016 (ACCV 2016)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10114)

Abstract

We address the problem of temporally aligning semantically similar videos, for example two videos of cars on different tracks. We present an alignment method that establishes frame-to-frame correspondences such that the two cars are seen from a similar viewpoint (e.g. facing right), while also being temporally smooth and visually pleasing. Unlike previous works, we do not assume that the videos show the same scripted sequence of events. We compare against three alternative methods, including the popular DTW algorithm, on a new dataset of realistic videos collected from the internet. We perform a comprehensive evaluation using a novel protocol that includes both quantitative measures and a user study on visual pleasingness.
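The DTW baseline named in the abstract can be illustrated with a minimal sketch of classic dynamic time warping over per-frame feature vectors. This is not the alignment method proposed in the paper. The function name `dtw_align`, the Euclidean frame cost, and the toy one-dimensional features are assumptions made purely for illustration.

```python
import math


def frame_cost(fa, fb):
    """Euclidean distance between two per-frame feature vectors (assumed cost)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))


def dtw_align(feats_a, feats_b):
    """Classic DTW: return (total_cost, path) aligning two feature sequences.

    path is a list of (i, j) pairs meaning frame i of video A is matched
    to frame j of video B. Both sequences must be non-empty.
    """
    n, m = len(feats_a), len(feats_b)
    # acc[i][j] = minimal cost of aligning the first i frames of A
    # with the first j frames of B.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
            acc[i][j] = frame_cost(feats_a[i - 1], feats_b[j - 1]) + step

    # Backtrack from the last pair of frames to the first pair.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # advance both videos
            (acc[i - 1][j], i - 1, j),          # advance video A only
            (acc[i][j - 1], i, j - 1),          # advance video B only
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n][m], path


# Toy example: scalar "viewpoint" features, with video B lingering on
# its first and last viewpoints.
video_a = [[0.0], [1.0], [2.0], [3.0]]
video_b = [[0.0], [0.0], [1.0], [2.0], [3.0], [3.0]]
total, path = dtw_align(video_a, video_b)
```

DTW forces the alignment to cover both videos from first to last frame in monotonic order, which fits videos that follow the same sequence of events. The abstract states that the proposed method does not make that assumption.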

Author information

Correspondence to Anestis Papazoglou.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Papazoglou, A., Del Pero, L., Ferrari, V. (2017). Video Temporal Alignment for Object Viewpoint. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10114. Springer, Cham. https://doi.org/10.1007/978-3-319-54190-7_17

  • DOI: https://doi.org/10.1007/978-3-319-54190-7_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54189-1

  • Online ISBN: 978-3-319-54190-7

  • eBook Packages: Computer Science, Computer Science (R0)
