A Phase-Based Approach for Caption Detection in Videos

  • Conference paper
Computer Vision – ACCV 2012 (ACCV 2012)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7725)

Abstract

Captions in videos are closely related to the video content, so automatic caption detection supports video content analysis and content-based retrieval. This paper proposes a novel phase-based approach for detecting static captions. The algorithm consists of two stages: candidate caption region detection and candidate caption region refinement. First, candidate caption regions are extracted from a caption saliency map, which is generated mainly by phase-only Fourier synthesis. Second, the candidate regions are refined using text region shape features. Experimental comparisons with existing methods show that the proposed approach performs better.
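
The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the saliency map is post-processed, which threshold is used, or which shape features drive the refinement. The sketch below assumes a plain phase-only reconstruction (the Fourier spectrum divided by its magnitude), a hypothetical mean-plus-k-standard-deviations threshold, and a hypothetical width-to-height ratio test as the only shape feature. All function names and parameters are illustrative.

```python
import numpy as np


def phase_only_saliency(gray):
    """Rebuild a grayscale frame from its Fourier phase alone.

    Discarding the magnitude flattens the spectrum, which suppresses
    smooth or regular background and keeps localized structure.
    Returns the squared reconstruction, scaled to [0, 1].
    """
    spectrum = np.fft.fft2(np.asarray(gray, dtype=np.float64))
    magnitude = np.abs(spectrum)
    # Unit-magnitude spectrum; the epsilon avoids division by zero.
    phase_only = spectrum / (magnitude + 1e-12)
    recon = np.fft.ifft2(phase_only)
    saliency = np.abs(recon) ** 2
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency


def candidate_mask(saliency, k=3.0):
    """Assumed thresholding rule (not from the paper): mean + k * std."""
    return saliency > saliency.mean() + k * saliency.std()


def bounding_box(mask):
    """Inclusive (top, bottom, left, right) of the True pixels, or None."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1])


def looks_like_caption(box, min_aspect=2.0):
    """Assumed shape feature (not from the paper): wide, short regions."""
    top, bottom, left, right = box
    height = bottom - top + 1
    width = right - left + 1
    return width / height >= min_aspect
```

In this sketch a frame would pass through `phase_only_saliency`, then `candidate_mask`, and each resulting region's box would be kept or discarded by `looks_like_caption`. A real pipeline would label connected components rather than take one box for the whole mask, and would likely smooth the saliency map before thresholding; both steps are left out here to keep the example short.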

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wen, S., Song, Y., Zhang, Y., Yu, Y. (2013). A Phase-Based Approach for Caption Detection in Videos. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37444-9_32

  • DOI: https://doi.org/10.1007/978-3-642-37444-9_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37443-2

  • Online ISBN: 978-3-642-37444-9

  • eBook Packages: Computer Science, Computer Science (R0)
