Abstract
Convolutional Neural Networks have consistently shown good performance in Computer Vision and Handwritten Text Recognition tasks. This paper proposes the use of these models for document image binarization. The main idea is to classify each pixel of the image as foreground or background from a sliding window centered at the pixel to be classified. An experimental analysis of the effect of sensitive parameters is presented, and some working topologies are proposed, using two corpora with very different properties: DIBCO and Santgall.
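The per-pixel, sliding-window formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `extract_window`, `binarize`, and `toy_classifier` are hypothetical, the border clamping is an assumed padding choice, and the classifier is a simple local-mean rule standing in for the trained convolutional network.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image, values in [0, 1]


def extract_window(image: Image, row: int, col: int, half: int) -> Image:
    """Return the (2*half+1)-square window centered at (row, col).

    Coordinates outside the image are clamped to the nearest border pixel
    (an assumed padding choice, not taken from the paper).
    """
    height, width = len(image), len(image[0])
    return [
        [
            image[min(max(r, 0), height - 1)][min(max(c, 0), width - 1)]
            for c in range(col - half, col + half + 1)
        ]
        for r in range(row - half, row + half + 1)
    ]


def binarize(image: Image, classify: Callable[[Image], int], half: int = 1) -> List[List[int]]:
    """Label every pixel by applying `classify` to the window centered on it."""
    return [
        [classify(extract_window(image, r, c, half)) for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def toy_classifier(window: Image) -> int:
    """Hypothetical stand-in for a trained CNN: 1 = foreground (ink), 0 = background.

    Marks the center pixel as ink when it is clearly darker than the window mean.
    """
    half = len(window) // 2
    values = [v for line in window for v in line]
    mean = sum(values) / len(values)
    return 1 if window[half][half] < mean - 0.1 else 0


# A tiny synthetic "page": light background with two dark ink pixels.
page = [
    [0.9, 0.9, 0.9, 0.9],
    [0.9, 0.1, 0.1, 0.9],
    [0.9, 0.9, 0.9, 0.9],
]
mask = binarize(page, toy_classifier, half=1)
```

In this sketch the window size (`half`) plays the role of one of the sensitive parameters the abstract mentions; swapping `toy_classifier` for a trained model that accepts the same window would leave `binarize` unchanged.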
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Pastor-Pellicer, J., España-Boquera, S., Zamora-Martínez, F., Afzal, M.Z., Castro-Bleda, M. (2015). Insights on the Use of Convolutional Neural Networks for Document Image Binarization. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2015. Lecture Notes in Computer Science, vol 9095. Springer, Cham. https://doi.org/10.1007/978-3-319-19222-2_10
DOI: https://doi.org/10.1007/978-3-319-19222-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19221-5
Online ISBN: 978-3-319-19222-2
eBook Packages: Computer Science (R0)