Abstract
Image warping caused by scanning, photocopying or photographing a document is a common problem in the field of document processing and understanding. Distortion within the text documents impairs OCRability and thus strongly decreases the usability of the results. This is one of the major obstacles for automating the process of digitizing printed documents.
In this paper we present a novel algorithm that corrects document image warping based on the detection of distorted text lines. The proposed solution is used in a recent project on digitizing old, poor-quality manuscripts. The algorithm is compared to other published approaches. Experiments with various document samples are also presented, together with the resulting improvements in the text recognition rate achieved by a commercial OCR engine.
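The abstract does not describe the algorithm's steps, so the following is only a toy illustration of the general idea of straightening an image from a detected text line, not the authors' method. It assumes a binary image (nested lists, 1 = ink) containing a single text line, and the function names `detect_baseline` and `dewarp_single_line` are hypothetical.

```python
def detect_baseline(image):
    """Return, per column, the row of the lowest ink pixel, or None if the column is blank."""
    height, width = len(image), len(image[0])
    baseline = []
    for x in range(width):
        ink_rows = [y for y in range(height) if image[y][x]]
        baseline.append(max(ink_rows) if ink_rows else None)
    return baseline


def dewarp_single_line(image):
    """Shift each column down so that the detected baseline becomes a horizontal row.

    Assumption: the image holds exactly one text line, so a single
    per-column vertical shift is enough to straighten it.
    """
    height, width = len(image), len(image[0])
    baseline = detect_baseline(image)
    known = [b for b in baseline if b is not None]
    output = [[0] * width for _ in range(height)]
    if not known:
        return output  # nothing to straighten in a blank image
    # Align onto the lowest baseline row, so shifts are never negative
    # and no ink pixel is pushed outside the image.
    target = max(known)
    for x, b in enumerate(baseline):
        if b is None:
            continue
        shift = target - b
        for y in range(height):
            if image[y][x]:
                output[y + shift][x] = 1
    return output


# A three-column "text line" whose middle column sags by one row.
warped = [
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
]
print(dewarp_single_line(warped))
# → [[0, 0, 0], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
```

A realistic system would need to segment multiple text lines, fit smooth curves to them, and interpolate the correction between lines; none of that is attempted here.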
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mischke, L., Luther, W. (2005). Document Image De-warping Based on Detection of Distorted Text Lines. In: Roli, F., Vitulano, S. (eds) Image Analysis and Processing – ICIAP 2005. ICIAP 2005. Lecture Notes in Computer Science, vol 3617. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11553595_131
DOI: https://doi.org/10.1007/11553595_131
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28869-5
Online ISBN: 978-3-540-31866-8
eBook Packages: Computer Science, Computer Science (R0)