A Preprocessing Algorithm to Increase OCR Performance on Application Processor-Centric FPGA Architectures

  • Conference paper
Inclusive Smart Cities and Digital Health (ICOST 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9677)

Abstract

The aim of this research is to build a fully automatic preprocessing algorithm capable of binarizing and dewarping digitized documents, embedded in an application processor-centric Field Programmable Gate Array (FPGA), in order to develop an autonomous voice scanner for blind and visually impaired people. The main purpose of this work is to give blind people the ability to listen to books without further assistance. It is part of a larger project, called “The Vocalizer Project”, which arose from a demand by Brazil’s Ministry of Culture and Education for use in schools and public libraries, and it is aimed at making cities more inclusive and intelligent. Furthermore, it is intended to give blind and visually impaired people access to the vast body of existing bibliographic material.
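
To make the binarization step concrete, the following is a minimal sketch of global thresholding with Otsu's method in plain Python. It is an illustration only: the abstract does not say which binarization method the authors use, and the function names `otsu_threshold` and `binarize`, the flat list of 8-bit gray levels, and the 0 = ink / 1 = background convention are all assumptions made here. Dewarping and the FPGA implementation are not covered.

```python
def otsu_threshold(pixels):
    """Return the 8-bit gray level that maximizes between-class variance.

    `pixels` is a flat sequence of integers in the range 0..255
    (an assumed input format, not one taken from the paper).
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with gray level <= t
    sum0 = 0  # sum of gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break             # no pixels left in the light class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map gray levels to 0 (ink, <= threshold) or 1 (background, > threshold)."""
    return [1 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    # Tiny synthetic "page": three dark ink pixels and three light paper pixels.
    page = [10, 12, 11, 200, 210, 205]
    t = otsu_threshold(page)
    print(t, binarize(page, t))
```

A single global threshold like this tends to struggle with uneven illumination, which is common in photographed book pages; locally adaptive thresholding is the usual alternative in that case.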

Acknowledgement

We would like to thank FINEP and CNPq for their financial support. We also give special acknowledgment to the Pináculo company and to the multidisciplinary team that worked through the stages of this project.

Author information

Corresponding author

Correspondence to Bernardo de Cerqueira.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Crovato, C., Torok, D., Heidrich, R., de Cerqueira, B., Velho, E. (2016). A Preprocessing Algorithm to Increase OCR Performance on Application Processor-Centric FPGA Architectures. In: Chang, C., Chiari, L., Cao, Y., Jin, H., Mokhtari, M., Aloulou, H. (eds) Inclusive Smart Cities and Digital Health. ICOST 2016. Lecture Notes in Computer Science, vol 9677. Springer, Cham. https://doi.org/10.1007/978-3-319-39601-9_3

  • DOI: https://doi.org/10.1007/978-3-319-39601-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-39600-2

  • Online ISBN: 978-3-319-39601-9

  • eBook Packages: Computer Science, Computer Science (R0)
