
1 Introduction

Analysis of binary images is still among the most popular applications of machine vision, both in industry and in other computer vision tasks where the shape of objects plays the dominant role. Although in some industrial applications with controlled lighting conditions, as well as in the analysis of high quality scanned documents, classical global thresholding algorithms, such as the well-known Otsu method [24], may be sufficient, for unevenly illuminated objects or degraded document images even the use of more advanced adaptive methods might be challenging in some cases. For some of the popular adaptive methods, e.g. those proposed by Niblack [18] or Sauvola [29], the obtained results may be far from expectations, especially in outdoor scenarios. On the other hand, more sophisticated methods may be troublesome to implement in embedded systems and devices with low computing performance.

Typical areas of application where the quality of binary images obtained from natural images is important include Optical Character Recognition (OCR), Optical Mark Recognition (OMR), recognition of QR codes, self-localization, terrain exploration and path following in the autonomous navigation of vehicles and mobile robots, video monitoring and inspection, etc. Nevertheless, due to the lack of image and video datasets containing both natural and ground truth images other than document images, a widely accepted approach to the performance evaluation of image binarization methods is the use of the datasets provided yearly by the organizers of the Document Image Binarization COmpetitions (DIBCO), taking place during two major conferences, namely the International Conference on Document Analysis and Recognition (ICDAR) and the International Conference on Frontiers in Handwriting Recognition (ICFHR).

Although these datasets contain images with more and more challenging distortions each year, another interesting possibility is the additional verification of the proposed methods on the images included in the Bickley Diary dataset [5], containing 92 photocopies of individual pages from a diary written ca. 100 years ago by the wife of one of the first missionaries in Malaysia, Bishop George H. Bickley. Since the distortions in this dataset are related not only to the overall noise caused by photocopying, but also to discoloration and water stains, as well as differences in ink contrast across the years, it may be considered even more challenging than the DIBCO datasets [26]. To ensure a reliable verification of the advantages of the method proposed in this paper, all currently available DIBCO datasets together with the Bickley Diary database have been used.

Although many different approaches to image binarization have been presented over the years, including adaptive methods proposed e.g. by Bradley [2], Feng [6], Niblack [18], Sauvola [29] or Wolf [35], and their modifications [28, 30], the computational effort required by each newly developed algorithm usually increases. Good examples are the applications of local features with the use of Gaussian Mixture Models [17] or the use of deep neural networks [32], where multiple processing stages are necessary. Comparisons of popular methods and their overviews can be found in recent survey papers and books [3, 31]. Many methods require additional background removal, median filtering or morphological processing, as well as a time-consuming training process in the case of the recently popular deep convolutional neural networks. Therefore, our motivation is to increase the performance of some classical methods by means of efficient image preprocessing rather than to compete with sophisticated state-of-the-art methods and solutions based on deep learning, considering also the time-quality efficiency challenges [14].

2 Modelling the Histograms of Distorted Images

2.1 Generalized Gaussian Distribution and Gaussian Mixture Model

The application areas of the Generalized Gaussian Distribution (GGD) cover a wide range of signal and image processing methods based on the estimation of various models, including e.g. tangential wavelet coefficients used to compress three-dimensional triangular mesh data [12] or the generation of augmented quaternion random variables with the GGD [7]. Some other popular applications are related to no-reference image quality assessment (IQA) based on the natural scene statistics (NSS) model, used to describe certain regular statistical properties of natural images [38], as well as image segmentation [33] and the approximation of an atmospheric point spread function (APSF) kernel [34].

One of the main advantages of the GGD is that it covers other popular distributions, namely the Gaussian and Laplacian distributions, the uniform distribution, an impulse function, as well as some other special cases [9, 10]. Estimation of its parameters is possible using various methods [37]. Its extension to the multidimensional case [25] and to complex-valued variables [19] is also possible.

The probability density function of the GGD can be expressed as [4]:

$$\begin{aligned} f(x)=\frac{\lambda \cdot p}{2 \cdot \varGamma \left( \frac{1}{p}\right) }e^{-[\lambda \cdot |x|]^{p}}, \end{aligned}$$
(1)

where p denotes the shape parameter, \(\varGamma (z)=\int _{0}^{\infty }t^{z-1}e^{-t}dt, z>0\) [23] and \(\lambda \) is the scale parameter related to the standard deviation \(\sigma \) of the distribution by \(\lambda (p,\sigma )=\frac{1}{\sigma }\left[ \frac{\varGamma (\frac{3}{p})}{\varGamma (\frac{1}{p})}\right] ^{\frac{1}{2}}\). The choice of the parameter \(p=1\) corresponds to the Laplacian distribution, whereas \(p=2\) is typical for the Gaussian distribution. When \(p \rightarrow \infty \), the GGD density function approaches a uniform distribution, and for \(p \rightarrow 0\), f(x) becomes an impulse function.
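A minimal sketch of Eq. (1) and the \(\lambda (p,\sigma )\) relation is given below; the function names are illustrative and the SciPy-based check of the Gaussian special case (\(p=2\)) is only a sanity test, not part of the described method.

```python
import numpy as np
from scipy.special import gamma

def ggd_lambda(p, sigma):
    """Scale parameter lambda(p, sigma) from the relation given in the text."""
    return (1.0 / sigma) * np.sqrt(gamma(3.0 / p) / gamma(1.0 / p))

def ggd_pdf(x, p, sigma):
    """Probability density of the zero-mean GGD, Eq. (1)."""
    lam = ggd_lambda(p, sigma)
    return lam * p / (2.0 * gamma(1.0 / p)) * np.exp(-(lam * np.abs(x)) ** p)

# p = 2 reduces to the Gaussian case, p = 1 to the Laplacian case.
x = np.linspace(-4, 4, 9)
print(np.allclose(ggd_pdf(x, 2.0, 1.0),
                  np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)))  # True
```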

A Gaussian Mixture Model (GMM) consists of Gaussian distribution components defined by their locations \(\mu \) and standard deviations \(\sigma \), together with a vector of mixing proportions. Since the main purpose of this investigation is image binarization, the application of two Gaussian components is considered, as only two classes of pixels are assumed. In other words, it is assumed that only two clusters are present in the image, consisting of pixels representing text and background respectively, and, in the ideal case, each cluster is represented by a single Gaussian component. Having computed the parameters of the GMM with two Gaussian components, the initial threshold should be located between the locations \(\mu \) of both distributions and may be calculated in several ways, e.g. as the intersection point of the two determined curves.

The GMM parameters can be determined using the iterative Expectation-Maximization (EM) algorithm, which alternates between two steps until convergence is achieved. The expectation step estimates the expected class membership for each observation, and the maximization step optimizes the parameters of the probability distributions using maximum likelihood.
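The sketch below illustrates the two-component fit and the intersection-point threshold; scikit-learn's EM-based GaussianMixture is used here purely for illustration (the estimator actually used in the paper is the fast approximated moment based method [8]), and the function name is an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from scipy.stats import norm

def gmm2_threshold(grey_image):
    """Fit a 2-component GMM to pixel intensities and return the intersection
    of the two weighted Gaussian curves located between their means."""
    samples = grey_image.reshape(-1, 1).astype(float)
    gmm = GaussianMixture(n_components=2, covariance_type='full',
                          random_state=0).fit(samples)
    mu = gmm.means_.ravel()
    sigma = np.sqrt(gmm.covariances_.ravel())
    w = gmm.weights_.ravel()
    # Scan the intensity axis between the two means for the crossing point
    # of the weighted component densities.
    lo, hi = np.sort(mu)
    x = np.linspace(lo, hi, 1000)
    diff = w[0] * norm.pdf(x, mu[0], sigma[0]) - w[1] * norm.pdf(x, mu[1], sigma[1])
    crossing = np.where(np.diff(np.sign(diff)) != 0)[0]
    return x[crossing[0]] if crossing.size else 0.5 * (lo + hi)
```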

2.2 General Assumptions for Natural Images

Natural images representing old handwritten or machine-printed documents contain some specific distortions resulting from the gradual degradation of the original manuscripts or printings over the years. Visible imperfections, such as faded and low contrast ink, as well as the presence of noise and stains, influence the histogram of the image. Hence, assuming the analysis of greyscale images, more intermediate grey levels may be observed, similarly as for binary images corrupted by Gaussian noise, as illustrated in Fig. 1 for the sample image no. 7 from the DIBCO2017 dataset. This similarity is especially well visible when two Gaussian distributions are used to model the histogram of the ground truth (GT) binary image corrupted by Gaussian noise.
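A short sketch of this assumption, in the spirit of Fig. 1, is given below: a {0, 255} ground truth image is corrupted by Gaussian noise and its 256-bin histogram is computed for comparison with the histogram of the degraded greyscale document. The function name and noise level are illustrative assumptions.

```python
import numpy as np

def noisy_binary_histogram(gt_binary, sigma=30.0, seed=0):
    """Corrupt a {0, 255} ground truth image with Gaussian noise and return
    its 256-bin histogram."""
    rng = np.random.default_rng(seed)
    noisy = gt_binary.astype(float) + rng.normal(0.0, sigma, gt_binary.shape)
    noisy = np.clip(noisy, 0, 255).astype(np.uint8)
    hist, _ = np.histogram(noisy, bins=256, range=(0, 256))
    return hist
```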

Fig. 1. Illustration of the general assumption of the proposed approach based on the similarity of histograms: (a) sample greyscale document image, (b) ground truth binary image, (c) binary image corrupted by Gaussian noise, (d)-(f) histograms of the respective images.

Fig. 2. Exemplary results of the approximation of the histogram of the sample image from the DIBCO2017 dataset with the obtained thresholds: (a) using the GGD, (b) for two Gaussian distribution components of the GMM.

Therefore, the approximation of histograms by the GMM with two Gaussian components should be useful for the initial thresholding step, eliminating most of the background information. It can be conducted by choosing a threshold between the two peaks of the approximated histogram defined by their location parameters \(\mu \). Another possible approach, investigated in one of our previous papers [11], is the use of a single Gaussian distribution or the GGD, being its extended version. Nevertheless, the choice of its location parameter \(\mu \) as the initial threshold leads to the elimination of less background information. The comparison of the parameters of the GGD and the GMM with two Gaussian components (further referred to as GMM2), obtained for the sample image no. 7 from DIBCO2017 with a typical bimodal histogram, is shown in Fig. 2, where the threshold selected for the GMM is marked as the intersection point of both Gaussian curves.

Fig. 3. Exemplary results obtained for image no. 8 from the DIBCO2018 dataset: (a) original image, (b) ground truth image, (c) histogram and threshold obtained using the GGD, (d) histogram and threshold for the GMM2, (e) and (f) respective thresholding results obtained for the two methods.

For some images the GMM2 components may be located closer to each other and therefore the choice of an appropriate threshold may be more troublesome. In such situations the solution proposed in the paper [11] may be insufficient, whereas the application of the GMM2 makes it possible to remove the background information more effectively. Some exemplary results, obtained for sample image no. 8 from the more challenging DIBCO2018 dataset, are presented in Fig. 3, where the greater ability to remove unnecessary background can be clearly observed for the GMM2. Images obtained in this way may be subjected to further binarization steps.

3 Proposed Method

3.1 Improved Two-Step Binarization Algorithm

Taking advantage of the similarity between the histograms of degraded document images converted to greyscale and binary GT images corrupted by Gaussian noise, chosen as the most widespread type of noise in practical applications, the first step of the proposed algorithm is the calculation of the parameters of the Gaussian distributions. Since the universality of the proposed approach is of great importance, three possible models are used: a single Gaussian distribution, the GGD and the GMM2. Using the most relevant parameters, \(\mu \) and \(\sigma \), several variants of possible thresholds \(X_{thr}\) have been tested, including:

  • location parameter \(\mu _{GGD}\) of the GGD (originally proposed in [11]),

  • location parameter \(\mu _G\) of the single Gaussian distribution,

  • location parameter \(\mu _G\) lowered by Gaussian standard deviation \(\sigma _G\),

  • intersection of two GMM2 curves (thr), as shown in Fig. 2b,

  • the greater of the two GMM2 location parameters \(\mu _{GMMmax}\),

  • weighted average of the two GMM2 locations: \(\mu _{GMM01} \cdot w_{01} + \mu _{GMM02} \cdot w_{02}\),

  • weighted average of the two GMM2 locations lowered by the respective standard deviations: \((\mu _{GMM01} - \sigma _{01}) \cdot w_{01} + (\mu _{GMM02} - \sigma _{02}) \cdot w_{02}\),

  • the minimum of the above threshold values.

The weighting coefficients \(w_{01}\) and \(w_{02}\) have been determined during the calculation of the GMM and normalized so that \(w_{01} + w_{02} = 1\). To avoid the necessity of using sophisticated estimators based on maximum likelihood, moments, entropy matching or global convergence [27], the values of the four GGD parameters, i.e. the shape parameter p, the location parameter \(\mu \), the scale parameter \(\lambda \) and the standard deviation \(\sigma \), as well as the parameters of the GMM2, have been determined using the fast approximated method based on the standardized moment, described in the paper [8].
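A sketch of the candidate thresholds listed above is shown below, assuming that the GGD, single Gaussian and GMM2 parameters have already been estimated; the function and key names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def candidate_thresholds(mu_ggd, sigma_ggd, mu_g, sigma_g,
                         mu_gmm, sigma_gmm, w_gmm, thr_intersection):
    """Return the threshold variants X_thr examined in the first stage."""
    mu1, mu2 = mu_gmm
    s1, s2 = sigma_gmm
    w1, w2 = np.asarray(w_gmm, dtype=float) / np.sum(w_gmm)  # w1 + w2 = 1
    variants = {
        'ggd_mu': mu_ggd,                          # originally proposed in [11]
        'gauss_mu': mu_g,
        'gauss_mu_minus_sigma': mu_g - sigma_g,
        'gmm2_intersection': thr_intersection,     # as in Fig. 2b
        'gmm2_mu_max': max(mu1, mu2),
        'gmm2_weighted_mu': mu1 * w1 + mu2 * w2,
        'gmm2_weighted_mu_minus_sigma': (mu1 - s1) * w1 + (mu2 - s2) * w2,
    }
    variants['minimum'] = min(variants.values())
    return variants
```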

Additionally, all the above parameters have also been calculated after filtering the 256-bin image histograms with a 5-element median filter used to remove peaks. However, the results obtained using this approach have been worse for all databases, and slightly higher binarization accuracy has been observed only for a few images. Therefore, all further experiments have been conducted using the original histograms without the additional time-consuming filtering.

To improve text readability, the determined threshold \(X_{thr}\) is used instead of the maximum intensity value in the classical normalization of pixel intensity levels, applied only to the intensity levels not exceeding \(X_{thr}\), as

$$\begin{aligned} Y(i,j) = \left| \frac{(X(i,j) - X_{min})\cdot 255}{X_{thr} - X_{min}}\right| , \end{aligned}$$
(2)

where \( 0 \le X_{min}< X_{thr} < X_{max} \le 255 \) and

  • \( X_{thr} \) is the upper threshold determined during the proposed preprocessing,

  • \( X_{min} \) is the minimum intensity of all image pixels,

  • \( X_{max} \) is the maximum intensity of all image pixels,

  • X(i, j) is the intensity level of the input pixel at coordinates (i, j),

  • Y(i, j) is the intensity level of the output pixel at coordinates (i, j).

Assuming the presence of dark text on a brighter background, in order to partially remove the bright background data, which usually contain distortions not influencing the text information, the intensity values of all pixels brighter than \(X_{thr}\) are set to 255 independently of formula (2).
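A minimal sketch of this first-stage operation is given below: the stretching of Eq. (2) is applied to intensities up to \(X_{thr}\) and the remaining pixels are clipped to 255. An 8-bit greyscale input is assumed and the function name is illustrative.

```python
import numpy as np

def stretch_below_threshold(x, x_thr):
    """Normalize intensities below x_thr to the full 0-255 range (Eq. 2) and
    set the remaining (background) pixels to 255."""
    x = x.astype(float)
    x_min = x.min()
    y = np.abs((x - x_min) * 255.0 / (x_thr - x_min))
    y[x > x_thr] = 255.0
    return np.clip(np.round(y), 0, 255).astype(np.uint8)
```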

As a result, the brightness range is limited from \( \langle X_{min} \; ; \; X_{max} \rangle \) to \( \langle X_{min} \; ; \; X_{thr} \rangle \) with additional normalization to the range \( \langle 0 \; ; \; 255 \rangle \), which increases the dynamic range for images with overexposure or visibly low ink contrast regardless of the selected upper threshold. Finally, an intermediate image with partially eliminated background is obtained, which serves as the input for other classical global or adaptive binarization methods. Since such images are better balanced in terms of text and background information, it is assumed that the finally obtained thresholds should be closer to expectations in comparison with those achieved by the same methods without the proposed preprocessing.

3.2 Acceleration of Calculations Using the Monte Carlo Method

The idea of the Monte Carlo method is based on a significant decrease of the number of analysed pixels while preserving the statistical properties of the image histogram. According to the law of large numbers and the central limit theorem, in such a statistical experiment the sequence of successive approximations of the estimated value converges to the sought solution. Therefore, using a pseudo-random number generator with a uniform distribution, the number of analysed pixels can be limited, decreasing the computational burden [21].

To avoid the necessity of using two independent generators to draw the coordinates of pixels, the image is initially reshaped into a one-dimensional vector V containing the intensities of all \(M \times N\) pixels. Applying a pseudo-random number generator with good statistical properties and a uniform distribution, n independent draws of positions in the vector V are conducted. To build an estimate of the simplified histogram, the number of randomly drawn pixels (k) for each intensity level is counted and the estimate is calculated according to

$$\begin{aligned} \hat{L}_{MC} = \frac{k}{n} \cdot M \cdot N, \end{aligned}$$
(3)

where k denotes the number of randomly chosen pixels of the given intensity, n is the total number of draws and \(M \times N\) determines the image size. In some applications a random choice of pixels can also be made in parallel to increase the computational speed.
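The sketch below implements the estimate of Eq. (3): the image is flattened into the vector V, n positions are drawn uniformly, and the per-level counts are rescaled back to the full image size. The function name is an illustrative assumption.

```python
import numpy as np

def monte_carlo_histogram(grey_image, n, seed=None):
    """Estimate the 256-bin histogram from n uniformly drawn pixel positions."""
    rng = np.random.default_rng(seed)
    v = grey_image.ravel()                    # one-dimensional vector V
    idx = rng.integers(0, v.size, size=n)     # n independent draws
    k, _ = np.histogram(v[idx], bins=256, range=(0, 256))
    return k / n * v.size                     # L_hat_MC = k / n * M * N
```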

Analysing the convergence of the method [11], the estimation error can be determined as

$$\begin{aligned} \varepsilon _\alpha = \frac{u_\alpha }{\sqrt{n}} \cdot \sqrt{\frac{K}{M \cdot N} \cdot \left( 1 - \frac{K}{M \cdot N}\right) }, \end{aligned}$$
(4)

where K is the total number of pixels with a given intensity and \(u_\alpha \) represents the two-sided critical value. Nevertheless, the influence of even relatively high values of the above estimation error (calculated for the histogram) on the determined binarization thresholds is marginal.
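A small numeric illustration of Eq. (4) follows; the confidence level (95%, i.e. \(u_\alpha \approx 1.96\)), the image size and the intensity share are illustrative assumptions, not values from the paper.

```python
import math

def mc_estimation_error(n, K, M, N, u_alpha=1.96):
    """Estimation error of Eq. (4) for an intensity level covering K pixels."""
    p = K / (M * N)
    return u_alpha / math.sqrt(n) * math.sqrt(p * (1.0 - p))

# e.g. 5% of a 1000 x 800 image drawn (n = 40000), level covering 10% of pixels:
print(mc_estimation_error(n=40000, K=80000, M=1000, N=800))  # about 0.003
```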

The estimated simplified histogram obtained in this way may be successfully used as the input data for histogram based global thresholding methods [13, 22]; however, its use for adaptive thresholding would require the division of images into regions. Nevertheless, the direct application of this approach to typical adaptive methods based on the analysis of the local neighbourhood of each pixel, such as Bradley [2], Niblack [18] or Sauvola [29], would be troublesome.

In the proposed approach the Monte Carlo method is applied to reduce the computational effort of the first step of the algorithm. Due to the use of the simplified histogram, it is possible to estimate the initial upper threshold \(X_{thr}\) using a significantly reduced number of samples. To limit the influence of potentially imbalanced intensities of the randomly chosen pixels for a small number of draws, the Monte Carlo experiment may be repeated and the median of the determined thresholds selected as the result. To verify the stability of this approach, experiments have been conducted with the use of 2.5%, 5%, 7.5%, 10%, 12.5% and 15% of the total number of pixels in the consecutive images from all available DIBCO datasets, as well as for all 92 images from the Bickley Diary database. For the lower percentages, median values from 3, 5, 7 and 9 Monte Carlo experiments have been chosen, although it should be noted that, from the computational point of view, e.g. the use of 9 draws for 5% of the pixels can be treated as equivalent to a random choice of 45% of all pixels (not counting the time necessary for the selection of the median value).
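A sketch of the repeated experiment with the median threshold is shown below; `threshold_fn` stands for any of the first-stage threshold estimators operating on a histogram (an assumption, since the paper does not fix a single variant here), and the function name is illustrative.

```python
import numpy as np

def median_mc_threshold(grey_image, threshold_fn, fraction=0.05, repeats=9, seed=0):
    """Repeat the Monte Carlo histogram estimate and return the median of the
    thresholds computed by threshold_fn from the simplified histograms."""
    rng = np.random.default_rng(seed)
    v = grey_image.ravel()
    n = int(fraction * v.size)
    thresholds = []
    for _ in range(repeats):
        idx = rng.integers(0, v.size, size=n)
        hist, _ = np.histogram(v[idx], bins=256, range=(0, 256))
        thresholds.append(threshold_fn(hist / n * v.size))
    return float(np.median(thresholds))
```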

The second stage of the proposed approach, which relies on some previously proposed binarization methods, does not use the Monte Carlo method, although for some of the global methods it might be possible as well and should be considered in further research. Nevertheless, in the first experiment it has been assumed that n is equal to the total number of pixels (\(M \times N\)); however, as described further, it may be significantly reduced by applying the Monte Carlo method without affecting the accuracy of binarization.

4 Experimental Results

The verification of the proposed approach has been made using 208 images: 116 images from the 9 available DIBCO datasets (2009 to 2018), converted to greyscale according to the popular ITU-R Recommendation BT.601, and 92 monochrome images from the Bickley Diary dataset. All the calculations have been made for full images, as well as for a limited number of samples, applying the Monte Carlo method with repetitions and the median choice, as stated above. During the first stage of the algorithm, all threshold variants listed in Sect. 3.1 have been examined. The images with partially eliminated background information obtained in this way have been subjected to further binarization in the second stage, using popular thresholding methods, such as: a fixed threshold (0.5 of the intensity range), Bernsen [1] (also with the local Gaussian window), Bradley [2], Otsu [24], Sauvola [29] and Wolf [35].
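A sketch of such a two-stage pipeline is given below, using the BT.601 luma weights for the greyscale conversion and scikit-image's threshold_otsu and threshold_sauvola as stand-ins for the second-stage methods; this is not the authors' implementation, and the function names are assumptions.

```python
import numpy as np
from skimage.filters import threshold_otsu, threshold_sauvola

def to_grey_bt601(rgb):
    """ITU-R BT.601 luma conversion."""
    return (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1]
            + 0.114 * rgb[..., 2]).astype(np.uint8)

def two_stage_binarization(rgb, x_thr, method='otsu'):
    """First stage: Eq. (2) stretching below x_thr; second stage: a classical
    thresholding method, with text (dark) pixels returned as True."""
    grey = to_grey_bt601(rgb).astype(float)
    pre = np.clip((grey - grey.min()) * 255.0 / (x_thr - grey.min()), 0, 255)
    pre[grey > x_thr] = 255.0
    pre = pre.astype(np.uint8)
    if method == 'otsu':
        return pre <= threshold_otsu(pre)                  # global second stage
    return pre <= threshold_sauvola(pre, window_size=25)   # adaptive second stage
```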

Finally, the obtained results have been compared with the direct use of the above mentioned methods without the proposed preprocessing. To make a reliable comparison of the final binarization results, according to widely accepted methodologies [20], some typical metrics based on the counting of true positive (TP), true negative (TN), false positive (FP) and false negative (FN) pixels, such as Precision, Recall, F-Measure, Specificity and Accuracy, have been calculated, assuming the pixels representing text as “ones” and the background pixels as “zeros”. Additionally, some other metrics, such as PSNR, Distance Reciprocal Distortion (DRD) [15] and the Misclassification Penalty Metric (MPM) [36], have also been computed.
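A minimal sketch of the counting-based metrics is given below, with text pixels treated as “ones”; DRD and MPM are omitted for brevity, no zero-division guards are included, and the PSNR uses a maximum value of 1 for binary images, which is an assumption of this sketch.

```python
import numpy as np

def binarization_metrics(pred, gt):
    """Precision, Recall, F-Measure, Specificity, Accuracy and PSNR for binary
    images given as boolean arrays (True = text)."""
    tp = np.sum(pred & gt)
    tn = np.sum(~pred & ~gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / pred.size
    mse = np.mean((pred.astype(float) - gt.astype(float)) ** 2)
    psnr = 10 * np.log10(1.0 / mse) if mse > 0 else float('inf')
    return dict(precision=precision, recall=recall, f_measure=f_measure,
                specificity=specificity, accuracy=accuracy, psnr=psnr)
```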

Although the values of the estimated parameters may differ slightly for each independent execution of the Monte Carlo method, especially for a low number of drawn samples (n), the overall influence of the number of randomly drawn pixels on the final binarization accuracy is unnoticeable, even when using 2.5% of the pixels with 3-fold drawing and the choice of the median threshold in the first stage. Hence, only the results obtained for full images, considered easier for potential recalculation, are presented in this paper, although the same results have been achieved applying the Monte Carlo method for almost all images. To avoid the presentation of all metrics based on the number of TP, TN, FP and FN pixels, we have focused on accuracy and PSNR, as well as some alternative metrics, namely DRD and MPM.

Fig. 4. Comparison of the average accuracy, PSNR, DRD and MPM values for 208 images using popular binarization methods with and without the proposed preprocessing.

The results of experiments conducted for all DIBCO datasets and Bickley Diary database are illustrated in Fig. 4. Better results are represented by higher accuracy and PSNR, but lower DRD and MPM values.

As may be observed, the influence of the preprocessing for the Sauvola and Wolf methods is marginal; however, its application to some other methods, including the simplest fixed thresholding and the classical global Otsu method, leads to a significant improvement of all average metrics presented in Fig. 4. Analysing the accuracy, PSNR and DRD values, the use of the proposed preprocessing for the Otsu method, as well as for the adaptive Bradley and Bernsen algorithms, leads to better results than those achieved by the Sauvola and Wolf algorithms, even though the former methods applied directly have led to much worse binarization results. In almost all cases the application of the proposed preprocessing based on the weighted average of the two GMM2 locations lowered by the respective standard deviations, \((\mu _{GMM01} - \sigma _{01}) \cdot w_{01} + (\mu _{GMM02} - \sigma _{02}) \cdot w_{02}\), leads to better results, also in comparison with the previously proposed method [11] using the location parameter \(\mu _{GGD}\) of the GGD.

Fig. 5. Comparison of exemplary final binarization results: (a) and (b) without preprocessing, (c) and (d) with the use of \(X_{thr} = \mu _{GGD}\) [11], (e) and (f) with the use of the proposed preprocessing with \(X_{thr} = (\mu _{GMM01} - \sigma _{01}) \cdot w_{01} + (\mu _{GMM02} - \sigma _{02}) \cdot w_{02}\), for Bernsen (left images) and Bradley (right images) thresholding.

Nevertheless, considering the results indicated as “best”, some minor exceptions occur, especially for the MPM results, where the results obtained for some other variants of the possible thresholds \(X_{thr}\) are slightly better. For the fixed threshold (0.5) the best accuracy, PSNR and DRD may be obtained applying \(X_{thr} = max(\mu _{GMM01},\mu _{GMM02})\), whereas the use of \(X_{thr} = \mu _{GGD} - \sigma _{GGD}\) leads to the best DRD and MPM values for the Otsu and Bradley methods, as well as the best DRD for Wolf and all metrics for Bernsen thresholding. The use of the weighted average of the two GMM2 locations (without lowering) slightly improves the MPM results for the Wolf method and the DRD for Sauvola, as well as the PSNR and accuracy for both of them. Nonetheless, as can be seen in Fig. 4, the differences between the results for the “best” variants and the most universal method, utilizing the formula \(X_{thr} = (\mu _{GMM01} - \sigma _{01}) \cdot w_{01} + (\mu _{GMM02} - \sigma _{02}) \cdot w_{02}\), are relatively small.

The best overall accuracy, equal to 0.9336, and PSNR = 13.1614 have been achieved by applying the proposed method followed by Bradley thresholding. The same popular adaptive method, implemented e.g. as the adaptthresh function in the MATLAB environment, with the GGD based preprocessing [11] leads to noticeably worse results (ACC = 0.9279 and PSNR = 12.5857), whereas its direct application (without preprocessing) gives an accuracy equal to 0.9187 and PSNR = 12.1072.

A visual comparison of the final binarization results for sample image no. 8 from the challenging DIBCO2018 dataset is shown in Fig. 5, where the advantages of the proposed approach, also over the GGD based preprocessing [11], are clearly visible, especially for the Bernsen method shown in the left part. Nevertheless, improvements can also be noticed for Bradley thresholding.

5 Summary and Future Work

The experimental results presented in the paper confirm the usefulness of the preprocessing of degraded document images based on histogram modelling using the GGD and GMM based methods. The combination of the proposed approach with some well-known thresholding algorithms makes it possible to enhance the binarization results significantly, making them comparable with those of more sophisticated methods. Even though the final results may be outperformed by some other methods, e.g. utilizing deep learning [32], the presented approach is relatively fast, also due to the use of the Monte Carlo method, and does not require a long training process with many images. The proposed approach may be easily combined with methods proposed by other researchers, although, considering the results achieved for the Sauvola and Wolf methods, the achieved improvements may be smaller.

Directions of our future research include an attempt at a further simplification of the histogram modelling step, as well as the combination of the proposed preprocessing with statistical methods [13] and some region based thresholding methods [16], which are usually much faster than typical adaptive methods requiring the analysis of the local neighbourhood of each pixel. Considering potential applications in robotics, related to the real-time analysis of natural images, our efforts will be oriented towards a further acceleration of the image processing operations preceding the final binarization step.