Multi-script and Multi-oriented Text Localization from Scene Images

  • Conference paper
Camera-Based Document Analysis and Recognition (CBDAR 2011)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7139)

Abstract

This paper describes a new method for localizing color text in generic scene images that contain text in different scripts and at arbitrary orientations. A representative set of colors is first identified from edge information and used to initialize an unsupervised clustering algorithm. Text components are then identified in each color layer using a combination of a support vector machine and a neural network classifier trained on a set of low-level features derived from geometric, boundary, stroke and gradient information. Experiments on camera-captured images with varying fonts, sizes and colors, irregular layouts, non-uniform illumination and multiple scripts illustrate the robustness of the method. The proposed method yields a precision of 0.8 and a recall of 0.86 on a database of 100 images. The method is also compared with others in the literature using the ICDAR 2003 robust reading competition dataset.
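To make the first stage of the abstract concrete, the following is a minimal sketch of edge-seeded color clustering, not the authors' implementation. It assumes that colors sampled at edge pixels are greedily reduced to a few representatives, which then initialize a k-means-style clustering of all pixels into color layers. The function names, the `merge_radius` threshold and the toy RGB values are hypothetical; the abstract does not specify the actual clustering algorithm or its parameters.

```python
from math import dist


def pick_seed_colors(edge_colors, merge_radius):
    """Greedily keep an edge color only if it is far from every seed kept so far.

    merge_radius is a hypothetical threshold in RGB units (an assumption).
    """
    seeds = []
    for color in edge_colors:
        if all(dist(color, seed) > merge_radius for seed in seeds):
            seeds.append(tuple(float(c) for c in color))
    return seeds


def nearest(color, centers):
    """Index of the cluster center closest to the given color."""
    return min(range(len(centers)), key=lambda k: dist(color, centers[k]))


def cluster_colors(pixels, seeds, iterations=5):
    """Seeded k-means: the seeds fix both the number of clusters and their start.

    Requires at least one seed.
    """
    centers = list(seeds)
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in pixels]
        for k in range(len(centers)):
            members = [p for p, label in zip(pixels, labels) if label == k]
            if members:  # leave a center unchanged if no pixel was assigned to it
                centers[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    # Final assignment against the updated centers.
    return [nearest(p, centers) for p in pixels], centers


# Toy data (hypothetical): reddish text pixels on a near-white background.
pixels = [(250, 250, 250), (245, 248, 252), (200, 20, 30),
          (210, 25, 28), (255, 255, 255), (205, 18, 35)]
edge_colors = [(250, 250, 250), (200, 20, 30), (245, 248, 252), (210, 25, 28)]

seeds = pick_seed_colors(edge_colors, merge_radius=60.0)
labels, centers = cluster_colors(pixels, seeds)

# One "color layer" per cluster: the indices of the pixels assigned to it.
layers = {k: [i for i, label in enumerate(labels) if label == k]
          for k in range(len(centers))}
print(layers)
```

In this sketch each color layer would then be passed to a component classifier; that stage is omitted here because the abstract gives no feature definitions or classifier settings.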

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kasar, T., Ramakrishnan, A.G. (2012). Multi-script and Multi-oriented Text Localization from Scene Images. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2011. Lecture Notes in Computer Science, vol 7139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29364-1_1

  • DOI: https://doi.org/10.1007/978-3-642-29364-1_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29363-4

  • Online ISBN: 978-3-642-29364-1

  • eBook Packages: Computer Science, Computer Science (R0)
