A Phase-Based Approach for Caption Detection in Videos

Wen, Shu; Song, Yonghong; Zhang, Yuanlin; Yu, Yu

doi:10.1007/978-3-642-37444-9_32

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7725)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Captions in videos are closely related to the video content, so automatic caption detection supports video content analysis and content-based retrieval. In this paper, we propose a novel phase-based approach for detecting static captions. The algorithm consists of two stages: candidate caption region detection and candidate caption region refinement. First, candidate caption regions are extracted from a caption saliency map, which is generated mainly by phase-only Fourier synthesis. Second, the candidate regions are refined using text region shape features. Experimental comparisons with existing methods show that the proposed approach performs better.
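The first stage can be illustrated with a minimal sketch of phase-only Fourier synthesis: take the 2-D Fourier transform of a grayscale frame, discard the magnitude so that every frequency has unit amplitude, and transform back. Regular background structure tends to be suppressed, while small irregular structures stand out. This is an illustration under stated assumptions, not the authors' implementation: the function names, the `eps` guard, and the mean-plus-`k`-standard-deviations threshold are all hypothetical, and the abstract does not say how the saliency map is post-processed or which shape features drive the refinement stage, so that stage is not sketched. A naive DFT from the standard library is used to keep the example dependency-free; a real pipeline would use an FFT library.

```python
import cmath
import math


def dft(seq, inverse=False):
    """Naive 1-D discrete Fourier transform, O(n^2); adequate for tiny demos."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            x * cmath.exp(sign * 2j * math.pi * k * t / n)
            for t, x in enumerate(seq)
        )
        out.append(total / n if inverse else total)
    return out


def dft2(grid, inverse=False):
    """2-D DFT: transform every row, then every column."""
    rows = [dft(row, inverse) for row in grid]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def phase_only_saliency(gray, eps=1e-12):
    """Saliency map from phase-only Fourier synthesis.

    gray: 2-D list of floats (a grayscale frame).
    eps:  hypothetical guard against division by zero magnitude.
    """
    spectrum = dft2(gray)
    # Keep the phase, force every magnitude to (approximately) one.
    unit = [[v / (abs(v) + eps) for v in row] for row in spectrum]
    recon = dft2(unit, inverse=True)
    # Squared magnitude of the reconstruction serves as saliency.
    return [[abs(v) ** 2 for v in row] for row in recon]


def candidate_mask(saliency, k=2.0):
    """Assumed thresholding rule: keep pixels above mean + k * std."""
    flat = [v for row in saliency for v in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((v - mean) ** 2 for v in flat) / len(flat))
    threshold = mean + k * std
    return [[v > threshold for v in row] for row in saliency]


if __name__ == "__main__":
    # Flat background with one bright pixel standing in for caption detail.
    frame = [[0.5] * 8 for _ in range(8)]
    frame[3][5] = 1.0
    mask = candidate_mask(phase_only_saliency(frame))
    print([(r, c) for r in range(8) for c in range(8) if mask[r][c]])
```

On the toy frame, the flat background contributes only to the zero-frequency term, so normalizing the magnitudes leaves a reconstruction concentrated at the single bright pixel.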

Author information

Authors and Affiliations

Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, No.28, Xianning West Road, Xi’an, Shaanxi, P.R. China
Shu Wen, Yonghong Song, Yuanlin Zhang & Yu Yu

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, 151-744, Seoul, Korea
Kyoung Mu Lee
Microsoft Research Asia, No. 5, Danling st., Haidian district, 100080, Beijing, P.R. China
Yasuyuki Matsushita
School of Interactive Computing, Georgia Institute of Technology, 801 Atlantic Drive, CCB 315, 30332, Atlanta, GA, USA
James M. Rehg
Institute of Automation, National Laboratory of Pattern Recognition, Chinese Academy of Sciences, Zhong Quan Cun East Road 95, Haidian District, 100 190, Beijing, P.R. China
Zhanyi Hu

Cite this paper

Wen, S., Song, Y., Zhang, Y., Yu, Y. (2013). A Phase-Based Approach for Caption Detection in Videos. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37444-9_32

DOI: https://doi.org/10.1007/978-3-642-37444-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37443-2
Online ISBN: 978-3-642-37444-9
eBook Packages: Computer Science, Computer Science (R0)
