Multi-script and Multi-oriented Text Localization from Scene Images

Kasar, Thotreingam; Ramakrishnan, Angarai G.

doi:10.1007/978-3-642-29364-1_1

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7139)

Included in the following conference series:

International Workshop on Camera-Based Document Analysis and Recognition

Abstract

This paper describes a new method of color text localization from generic scene images containing text of different scripts and with arbitrary orientations. A representative set of colors is first identified using the edge information to initiate an unsupervised clustering algorithm. Text components are identified from each color layer using a combination of a support vector machine and a neural network classifier trained on a set of low-level features derived from the geometric, boundary, stroke and gradient information. Experiments on camera-captured images that contain variable fonts, size, color, irregular layout, non-uniform illumination and multiple scripts illustrate the robustness of the method. The proposed method yields precision and recall of 0.8 and 0.86 respectively on a database of 100 images. The method is also compared with others in the literature using the ICDAR 2003 robust reading competition dataset.
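
The color-layer step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the representative colors have already been sampled along edges and are passed in as seeds, and it uses a plain seeded k-means as a stand-in for the unsupervised clustering, since the abstract does not name the actual algorithm. The function name `cluster_colors`, the iteration count, and the sample pixels are hypothetical.

```python
from math import dist

def cluster_colors(pixels, seeds, iterations=10):
    """Seeded k-means over RGB tuples (an assumed stand-in for the
    paper's clustering). Returns (centers, labels)."""
    centers = [tuple(float(c) for c in s) for s in seeds]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assign each pixel to the nearest current color center.
        labels = [
            min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in pixels
        ]
        # Move each center to the mean color of its assigned pixels.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, labels

# Hypothetical data: two reddish and two bluish pixels, with seed colors
# standing in for colors sampled at edge locations.
pixels = [(250, 0, 0), (240, 10, 5), (0, 0, 250), (5, 5, 240)]
seeds = [(255, 0, 0), (0, 0, 255)]
centers, labels = cluster_colors(pixels, seeds)

# Each label value defines one color layer: the pixels sharing that label.
layers = {k: [p for p, lab in zip(pixels, labels) if lab == k]
          for k in range(len(centers))}
```

In this sketch each color layer is simply the set of pixels sharing a label; the classification of text components within a layer (the support vector machine and neural network stage) is not modeled here.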

Author information

Authors and Affiliations

Medical Intelligence and Language Engineering Laboratory, Department of Electrical Engineering, Indian Institute of Science, Bangalore, India, 560012
Thotreingam Kasar & Angarai G. Ramakrishnan

Editor information

Editors and Affiliations

Graduate School of Engineering, Dept. of Computer Science and Intelligent Systems, Osaka Prefecture University, 1-1 Gakuencho, Naka Sakai, 599-8531, Osaka, Japan
Masakazu Iwamura
German Research Center for Artificial Intelligence, Multimedia Analysis and Data Mining Competence Center, Trippstadter Str. 122, 67663, Kaiserslautern, Germany
Faisal Shafait

Cite this paper

Kasar, T., Ramakrishnan, A.G. (2012). Multi-script and Multi-oriented Text Localization from Scene Images. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2011. Lecture Notes in Computer Science, vol 7139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29364-1_1

DOI: https://doi.org/10.1007/978-3-642-29364-1_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29363-4
Online ISBN: 978-3-642-29364-1
eBook Packages: Computer Science, Computer Science (R0)
