A Preprocessing Algorithm to Increase OCR Performance on Application Processor-Centric FPGA Architectures

Crovato, César; Torok, Delfim; Heidrich, Regina; de Cerqueira, Bernardo; Velho, Eduardo

doi:10.1007/978-3-319-39601-9_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9677)

Included in the following conference series:

International Conference on Smart Homes and Health Telematics

Abstract

The aim of this research is to build a fully automatic preprocessing algorithm capable of binarizing and dewarping digitized documents, embedded in an application processor-centric Field Programmable Gate Array (FPGA), in order to develop an autonomous voice scanner for blind and visually impaired people. The overall purpose of this work is to give blind people the ability to listen to books without further assistance. It is part of a larger project, called “The Vocalizer Project”, which emerged from a demand by Brazil’s Ministry of Culture and Education for use in schools and public libraries, and which is aimed at making cities more inclusive and intelligent. Furthermore, it is intended to give blind and visually impaired people access to the vast body of existing bibliographic material.

Acknowledgement

We would like to thank FINEP and CNPq for their financial support. Special thanks go to the Pináculo company and to the multidisciplinary team that worked through the stages of this project.

Author information

Authors and Affiliations

Institute of Technology and Exact Sciences, Feevale University, Novo Hamburgo, Brazil
César Crovato & Delfim Torok
Feevale University, Novo Hamburgo, Brazil
Regina Heidrich
Scientific Improvement Researcher, Feevale University, Novo Hamburgo, Brazil
Bernardo de Cerqueira & Eduardo Velho

Corresponding author

Correspondence to Bernardo de Cerqueira.

Editor information

Editors and Affiliations

Iowa State University, Ames, Iowa, USA
Carl K. Chang
University of Bologna, Bologna, Italy
Lorenzo Chiari
The University of Massachusetts, Lowell, Massachusetts, USA
Yu Cao
Huazhong Univ. of Science and Technology, Wuhan, China
Hai Jin
Institut Mines Télécom Paris/CNRS, Paris, France
Mounir Mokhtari
Institut Mines Télécom, Paris, France
Hamdi Aloulou

About this paper

Cite this paper

Crovato, C., Torok, D., Heidrich, R., de Cerqueira, B., Velho, E. (2016). A Preprocessing Algorithm to Increase OCR Performance on Application Processor-Centric FPGA Architectures. In: Chang, C., Chiari, L., Cao, Y., Jin, H., Mokhtari, M., Aloulou, H. (eds) Inclusive Smart Cities and Digital Health. ICOST 2016. Lecture Notes in Computer Science, vol 9677. Springer, Cham. https://doi.org/10.1007/978-3-319-39601-9_3

DOI: https://doi.org/10.1007/978-3-319-39601-9_3
Published: 21 May 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39600-2
Online ISBN: 978-3-319-39601-9
eBook Packages: Computer Science, Computer Science (R0)
