Text-line-up: Don’t Worry About the Caret

Adak, Chandranath; Chaudhuri, Bidyut B.; Lin, Chin-Teng; Blumenstein, Michael

doi:10.1007/978-3-030-86334-0_14

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12823)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

In a freestyle handwritten text-line, words are sometimes inserted using a caret symbol (\(\wedge\)) for corrections or annotations. Such insertions disrupt the reading sequence of words. In this paper, we aim to line up the words of a text-line so that the result can assist the OCR engine. Previous text-line segmentation techniques in the literature have scarcely addressed this issue. Here, the task is formulated as a path planning problem, and a novel multi-agent hierarchical reinforcement learning-based architecture is proposed to solve it. No linguistic knowledge is used. The proposed architecture was evaluated on English and Bengali offline handwriting, which yielded some interesting results.

Acknowledgment

All the people who contributed to generating the database are gratefully acknowledged. The authors also heartily thank all the consulted linguistic and handwriting experts.

Author information

Authors and Affiliations

JIS Institute of Advanced Studies and Research, JIS University, 700091, Kolkata, India
Chandranath Adak
Australian AI Institute, University of Technology Sydney, Ultimo, 2007, Australia
Chandranath Adak, Chin-Teng Lin & Michael Blumenstein
Techno India University, 700091, Kolkata, India
Bidyut B. Chaudhuri
CVPR Unit, Indian Statistical Institute, 700108, Kolkata, India
Bidyut B. Chaudhuri

Corresponding author

Correspondence to Chandranath Adak.

Editor information

Editors and Affiliations

Universitat Autònoma de Barcelona, Barcelona, Spain
Josep Lladós
Lehigh University, Bethlehem, PA, USA
Daniel Lopresti
Kyushu University, Fukuoka-shi, Japan
Seiichi Uchida

About this paper

Cite this paper

Adak, C., Chaudhuri, B.B., Lin, CT., Blumenstein, M. (2021). Text-line-up: Don’t Worry About the Caret. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12823. Springer, Cham. https://doi.org/10.1007/978-3-030-86334-0_14

DOI: https://doi.org/10.1007/978-3-030-86334-0_14
Published: 02 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86333-3
Online ISBN: 978-3-030-86334-0
eBook Packages: Computer Science, Computer Science (R0)
