Effective and fast binarization method for combined degradation on ancient documents

Khairun Saddami; Khairul Munadi; Yuwaldi Away; Fitri Arnia

doi:10.1016/j.heliyon.2019.e02613

. 2019 Oct 22;5(10):e02613. doi: 10.1016/j.heliyon.2019.e02613

Effective and fast binarization method for combined degradation on ancient documents

Khairun Saddami ^a, Khairul Munadi ^a,^b, Yuwaldi Away ^a,^b, Fitri Arnia ^a,^b,^⁎

PMCID: PMC6820306 PMID: 31687493

Abstract

Document image binarization is a challenging task because of combined degradation in a document. In this study, a new binarization method is proposed for binarizing an ancient document with combined degradation. The proposed method comprises the following four stages: histogram analysis, contrast enhancement, local adaptive thresholding, and artifact removal. In histogram analysis, a new approach is applied to establish a uniform background. Next, the image contrast is enhanced using a new contrast enhancement, and then the document is binarized using a novel local adaptive thresholding. Artifacts from the binarization process are removed in the artifact removal stage. Finally, an experiment is conducted using one private and four public datasets and by simulating the proposed method with and without contrast enhancement. The results showed that the proposed method is faster and more effective compared to other state-of-the-art procedures for binarizing ancient documents.

Keywords: Computer science, Local adaptive thresholding, Uniform histogram, Document image binarization, Degradation combination

Computer science; Local adaptive thresholding; Uniform histogram; Document image binarization; Degradation combination

1. Introduction

Studying degraded document images is an important aspect of a character recognition (CR) system. Note that the CR of a poorly degraded document is a challenging task. Moreover, the study becomes more challenging when different types of noise are present in degraded documents. Note that several types of degradation, such as faint text, water-spilling, ink-bleed, fox, and non-uniform illumination, are found in ancient documents. Documents with poor retention process and the digitalization process could lead to degradation in document quality. However, a document can not only suffer from a single degradation type but also from any combinations of degradation types. Fig. 1 shows ancient documents with combinations of degradation types. The document in Fig. 1a suffers from a combination of ink-bleed through, faint text, and yellowing background. The document in Fig. 1b suffers from a combination of faint text, low brightness, water spilling, and non-uniform illumination. Finally, the document in Fig. 1c suffers from a combination of low contrast, faint text, and uneven object. Note that binarization aims to extract significant information from a document's image. Identifying a method that can quickly and accurately binarize an image has received considerable interest from researchers. Moreover, several binarization techniques have been performed for resolving severe and complex degradation. These include traditional thresholding methods such as those reported by Otsu [1], Niblack [2], and Sauvola [3] or hybrid methods such as those reported by Su [4] and Nafchi [5]. A thresholding based approach is the most straightforward binarization technique; however, it fails for binarizing documents that have severe and complex degradation. Some modified thresholding methods such as Bataineh [6] and iNICK [7] were proposed to enhance the binarization performance but issues still remain.

Examples of ancient documents suffering from combinations of degradation types. (a) Document suffering from a combination of ink-bleed through, faint text, and yellowing background. (b) Ancient Jawi document suffering from a combination of faint text, low brightness, water-spilling, and non-uniform illumination. (c) Document suffering from a combination of low contrast, faint text, and uneven object.

Hybrid methods undergo several stages to binarize an image. Several binarization methods, such as Gatos [8] and Lu [9], use background estimation as a preprocessing step to remove the non-uniform background of the document; however, iterative processing is required to estimate the appropriate background. Su et al. [4] proposed contrast construction to detect high contrast text and text boundary in a document; however, constrat construction encounters problem when show-through degradation exists. Moreover, an energy minimization-based binarization method was proposed [10] that solves severe complex background using Laplacian energy minimization. Nevertheless, a high computational cost is incurred when processing documents with a complex background.

Ramirez-Ortegon et al. proposed an edge-based binarization method using a smoother pixel transition, i.e. transition pixel [11]. Lelore and Bouchara improved the transition pixel method by reducing the number of existing parameters using two threshold values [12]. The primary drawback of the edge-based technique is that it cannot distinguish between the text and the degradation contour such as hard water-spilling noise, furthermore, most hybrid methods incur a high computational cost for complex degradation.

Recently, binarization was performed using deep learning (DL); however, DL has two primary drawbacks: long processing time and ground-truth (GT) image availability. Processing time is vital when binarization becomes part of a framework such as optical character recognition (OCR). However, to produce a reasonable result, training the DL model using the GT of the target dataset is necessary [13].

To summarize, existing techniques still have multiple drawbacks in terms both of performance and computational cost [14]. Furthermore, combinations of degradation types on a single document still pose an issue in binarizing ancient documents [5], which need to be addressed because old documents rarely suffer from only one degradation type. Preliminary studies have been conducted to binarize ancient Jawi documents. An ancient Jawi document is a form of Southeast Asian heritage that contains more information and knowledge compared to other heritages such as tombstone and calligraphic wall. As shown in Fig. 1b, many of these documents suffer from severe combined degradation. Existing methods, including the document image binarization contest (DIBCO) competition winner have poor performances in binarizing documents with such combined degradation.

In this study, our objective is to improve the performance of state-of-the-art methods for binarizing documents with severe combined degradation, particularly for ancient Jawi documents, and to propose a new binarization method for handling combined degradation on ancient documents based on histogram uniformity.

The proposed method comprises a new approach for improving degraded background to be uniform using histogram analysis; a new technique of improving contrast quality using histogram shifting and stretching; a novel local adaptive thresholding using a combination of local and global mean value; and an artifact removal stage.

The rest of the paper is organized as follows. In Section 2, we present the proposed method. In Section 3, we discuss the experimental setup. In Section 4, we present the results and discussion. Finally, in Section 5, we provide the conclusion of this study.

2. Proposed method

In this section, a novel binarization method for binarizing ancient documents is proposed, which aims to address severe combined degradation in ancient documents. Fig. 2 shows the pipeline of the proposed method comprising four stages. The first stage is the histogram analysis of the document image, whereas the second stage is image contrast enhancement using contrast modification. Because the image contrast represents the background and the degradation dispersion of the image, there is a need to enhance the image contrast. The third stage is a thresholding process for extracting texts from the background. Finally, the fourth stage is artifact removal to remove unwanted small parts (artifacts) in the image.

2.1. Histogram analysis

Many pre-processing procedures have been proposed for improving binarization performance such as discrete cosine transform [15], background estimation [9], and contrast construction [16]. In our proposed method, we performed histogram analysis to estimate how a pixel is deployed from an image. According to Moghaddam and Cheriet, the image of an ancient document can be assumed as a combination of several parts (layers) [17], while Rabeux et al., assumed that an ancient document can be divided into three layers, i.e. text, noise, and background [18]. In our proposed method, we assume that the histogram area of an ancient document is divided into five or nine layers, which we refer to as “segments” depending on the image condition. Generally, we divide the histogram into five segments: the first segment is the pixel that represents the text, the second segment is probably text, the third segment is an ambiguous area, the fourth segment represents probably background, and the last segment represents the background pixel. However, if the pixel is spread on the segment near the background, we divide the histogram into nine segments to anticipate pixels that should be considered text but are present in the background segment. We performed experiments on all documents in the dataset by dividing the histogram into seven and nine segments; moreover, we identified that the most appropriate number of segments is nine. We assigned the first and the last segments of the histogram as the text and the mean value of the image, respectively. Note that each segment was separated by a threshold value, and the number of the segment was determined by Eq. (1):

N_{a} = {\begin{matrix} 9 & if μ (x, y) \geq K_{l b} A N D \\ M o (x, y) \geq K_{l b} A N D \tilde{μ} (x, y) \geq K_{l b} \\ 5 & otherwise \end{matrix}

(1)

where $N_{a}$ is the number of the segment, $μ (x, y)$ , $M o (x, y)$ and $\tilde{μ} (x, y)$ are the mean, mode, and median values of the image intensity, respectively and $K_{l b}$ is a factor for detecting background condition.

We conducted experiments on document images whose histograms are dominated by the background pixels. We found that the most appropriate value for $K_{l b}$ in Eq. (1) is 192, which was determined by performing experiments on different types of standard testing documents whose pixel values dominated the background area. The threshold value of the first segment ( $F_{s e g}$ ) and the last segment ( $L_{s e g}$ ) are determined as $F_{s e g} = \frac{I_{m a x} - I_{m i n}}{N_{a}} + I_{m i n}$ and $L_{s e g} = I_{m a x} - \frac{I_{m a x} - I_{m i n}}{N_{a}}$ , respectively, where $I_{m a x}$ and $I_{m i n}$ are the maximum and minimum image intensities, respectively. The first step of histogram analysis is to assign the image pixel into the decided segment based on Eq. (2):

I^{'} (x, y) = {\begin{matrix} 0 & if f (x, y) in F_{s e g} \\ μ (x, y) & if f (x, y) in L_{s e g} \\ f (x, y) & otherwise \end{matrix}

(2)

where $f (x, y)$ is the original document image and $I^{'} (x, y)$ is the segmented image. Then, to achieve a uniform intensity, we assigned the rest of the pixels according to Eq. (3):

I_{f i n a l}^{'} (x, y) = {\begin{matrix} μ (x, y) & if I^{'} (x, y) \geq μ (x, y) - \frac{σ}{2} A N D \\ I^{'} (x, y) < μ (x, y) + \frac{σ}{2} \\ I^{'} (x, y) & otherwise \end{matrix}

(3)

where $I_{f i n a l}^{'} (x, y)$ is the final segmented image obtained after histogram analysis and σ is the image standard deviation. By assigning the mean value of the pixel in the range of $- \frac{σ}{2}$ to $+ \frac{σ}{2}$ , this process results in a histogram with a uniform middle segment, which makes binarization easier. Fig. 3 shows histogram analysis by performing Eqs. (2) and (3). In Fig. 3, the dark area of the document is set to be uniform using Eq. (2); however in Fig. 3b the bright area of the document is set to be uniform using Eq. (3).

Illustration of histogram analysis for (a) a document with combination of degradation types (b) after applying Eq. (2) and (c) after applying Eq. (3).

2.2. Contrast enhancement

The next step of the proposed method is to enhance the contrast quality of an image; the procedure is described by Eq. (4):

I_{c o n} (x, y) = {\begin{matrix} A & if μ (x, y) \geq T_{c} A N D \\ M o (x, y) \geq T_{c} \\ B + \frac{3}{2} \times \sqrt{3 \times σ} & otherwise \end{matrix}

(4)

where $I_{c o n} (x, y)$ is the result obtained after contrast enhancement, $T_{c}$ is a threshold value for separating the contrast enhancement procedure, and A is described by Eq. (5). If $μ (x, y) \geq T_{c}$ and $M o (x, y) \geq T_{c}$ , we stretched the histogram pixel $\geq T_{c}$ to the left side to obtain a uniform contrast for the image result. Based on our experiments, we found that $T_{c} = 190$ leads to the best contrast quality.

A = {\begin{matrix} B + \frac{3}{2} \times \sqrt{3 \times σ} & if \frac{N P t}{N P b} > 0.5 \\ I_{f i n a l}^{'} (x, y) & otherwise \end{matrix}

(5)

where NPt is the number of pixels in the first segment, Nbt is the number of pixels in the last segment, and $B = \frac{255}{I_{m a x} - I_{m i n}} \times I_{f i n a l}^{'} (x, y) - I_{m i n}$ [19]. Based on Eq. (5), if the ratio of the number of pixels in the text segment and the number of pixels in the background segment is ≥0.5, the image histogram needs to be stretched to make it uniform. After contrast enhancement, the document image is binarized in the next stage.

2.3. Local adaptive thresholding

To extract text from a document, we proposed a novel local adaptive thresholding to handle the remaining non-uniform illumination part not fixed by the previous stages. The new local threshold is given in Eq. (6):

T = m_{a d a p t i v e} + k \sqrt{σ + β \frac{m_{w i n d o w}^{2}}{σ}}

(6)

where σ is the image standard deviation, β is an adaptive factor set in the range of 0–30, and $m_{a d a p t i v e}$ is an adaptive mean value that can be defined as follows:

m_{a d a p t i v e} = \frac{m_{g l o b a l} + m_{w i n d o w}}{2}

(7)

where $m_{g l o b a l}$ is the mean value of the image and $m_{w i n d o w}$ is the mean value of the local window. Based on our experiments, we found that the most appropriate local window is 21 x 21 pixels; moreover, k is defined as follows: [7]

k = - \frac{σ}{(255 - \frac{3}{2} σ)}

(8)

while σ is the standard deviation of the image window. After binarization, the unwanted object is removed in the removing artifact stage.

2.4. Removing artifacts

Most thresholding methods detect artifacts along the foreground. The artifacts could affect the binarization performance because they are recognized as false negatives (unexpected black pixels). In the removing artifact stage, we applied a connected component approach to remove the artifacts with ≤25 connected pixels. This value of artifact size is selected based on our experiments specific to the ancient Jawi and DIBCO dataset; other artifact sizes can be selected for other documents.

3. Experimental setup

We conducted experiments on five datasets. Four are public dataset, i.e. DIBCO 2013 [20], HDIBCO 2014 [21], HDIBCO 2016 [22], and PHIBD 2013 [23], while one is a private Jawi dataset (Jawi1-Jawi7), which is a collection of ancient Jawi documents with severe degradation. One of the challenges for analyzing ancient Jawi documents is that they usually suffer from combinations of degradations such as faint text, water-spilling, dark, and non-uniform illumination background as shown in Fig. 1b. The implementation algorithm of the proposed method is described as follows:

Step 1: Obtain a document image.

Step 2: Convert the image to grayscale.

Step 3: Analyze the image histogram using Eqs. (1), (2), and (3).

Step 4: Enhance the image contrast using Eq. (4).

Step 5: Filter the image using median filtering.

Step 6: Binarize the image using Eq. (6).

Step 7: Remove existing artifacts.

Step 8: Result of the binarization process.

The proposed method was compared with state-of-the-art binarization methods including the winner of the DIBCO competition. In the experiment, we compared the Proposed I method (the proposed method without contrast enhancement step) and the Proposed II method (the proposed method with contrast enhancement step) with the Multi-Grid-based Sauvola (MG-Sauvola) method [24]; Lu's method (Lu), the winner of the DIBCO 2009 competition [9]; Su's method (Su), the winner of the handwritten (DIBCO) 2010 competition [4]; FAIR method (FAIR), the extended version of the DIBCO 2011 winner [12]; Howe method (Howe), the winner of the HDIBCO 2012 competition [10]; the winner of the HDIBCO 2016 (WH16) [22]; the Ramirez-Ortegon (Ramirez) method [11], [25]; the Bataineh method [6] (our implementation); and the phase-based method [5] (our implementation).

In the experiment, we evaluated (1) the quality of the binarized image and (2) the running time. The quality of the binarized image was measured qualitatively (visual) and then quantitatively using evaluation metric from the DIBCO competition including F-measure (FM), pseudo-F-measure (FMps), peak signal-to-noise ratio (PSNR), misclassification penalty metric (MPM), and distance reciprocal distortion (DRD) [26], [27], [28]. The source implementation of these evaluation metrics is available in Ref. [29].

F-measure (FM) is the harmonic mean of recall and precision, which is defined as follows:

F - m e a s u r e = 2 \times \frac{R C \times P R}{R C + P R}

(9)

where recall $(RC) = \frac{T P}{T P + F N}$ and precision $(PR) = \frac{T P}{T P + F P}$ .

FMps is based on the skeleton of both binarized and GT image and is defined as follows:

F M p s = 2 \times \frac{R C p s \times P R}{R C p s + P R}

(10)

while pseudo-recall (RCps) is defined as follows:

R C p s = \frac{\sum_{x = 1, y = 1}^{x = M, y = N} S G (x, y) . B (x, y)}{\sum_{x = 1, y = 1}^{x = M, y = N} S G (x, y)}

(11)

where SG is the skeleton of the GT image and B is the skeleton of the binarized image. PSNR measures the similarity between the GT and resulting images such that high PSNR value represents good binarization result. It is defined as follows:

P S N R = 10 \log (\frac{C^{2}}{M S E})

(12)

where C is the difference between the foreground and the background and MSE is defined as: $M S E = \frac{1}{M N} \sum_{x = 1}^{M} \sum_{y = 1}^{N} {(I (x, y) - I {(x, y)}^{'})}^{2}$ . Note that MPM evaluates the object around the GT boundary image such that a low MPM value represents good binarization result. It is defined as follows:

M P M = \frac{\sum_{i = 1}^{F_{N}} d_{F N}^{i} + \sum_{i = 1}^{F_{P}} d_{F P}^{i}}{2 D}

(13)

DRD evaluates the visual distortion of a binary image. It is defined as follows:

D R D = \frac{\sum_{k = 1}^{S} D R D_{k}}{N U B N}

(14)

where NUBN is the number of non-uniform 8 x 8 blocks in the GT image, and $D R D_{k}$ is defined as: $D R D_{k} = \sum_{i = - 2}^{2} \sum_{j = - 2}^{2} G T_{k} (i, j) - B_{k} (x, y) W_{N M} (i, j)$ .

4. Results and discussion

4.1. Qualitative result

Figs. 4, 5, and 6 show binarization result of degraded ancient documents. Fig. 4 compares qualitative results for image PR6 from the DIBCO 2013 dataset. PR6 is a document image that suffers from hard ink-bleed, faint text, and yellowing background degradations. We determined that the Proposed I and Proposed II methods produced better visual binarization results compared to other methods. The MG-Sauvola method produced results similar to the proposed method, while other methods produced poor binarization results. Fig. 5 compares qualitative results for image HW6 from the HDIBCO 2014 dataset. HW6 is a document image that suffers from low contrast, faint text, and uneven object degradations. We determined that the proposed method, the Howe method, and the HDIBCO 2016 winner produced better results compared to other methods. However, the MG-Sauvola, Lu, Bataineh, Su, and Ramirez methods were not able to extract the text properly.

Binarization result of the image in Fig. 1a using the following methods: (a) GT; (b) Proposed I; (c) Proposed II; (d)MG-Sauvola; (e) Lu; (f) Bataineh; (g) Su; (h) Howe; (i) Ramirez; (j) FAIR; (k) Nafchi; and (l) WH16.

Binarization result of the image in Fig. 1c using the following methods: (a) GT; (b) Proposed I; (c) Proposed II; (d) MG-Sauvola; (e) Lu; (f) Bataineh; (g) Su; (h) Howe; (i) Ramirez; (j) FAIR; (k) Nafchi; and (l) WH16.

Binarization result of the image in Fig. 1b using the following methods: (a) GT; (b) Proposed I; (c) Proposed II; (d) MG-Sauvola; (e) Lu; (f) Bataineh; (g) Su; (h) Howe; (i) Ramirez; (j) FAIR; (k) Nafchi; and (l) WH16.

Fig. 6 compares qualitative results for a Jawi image. We found that the Proposed II method produced a binarized image that was most similar to the GT image. The Proposed I and Su methods produced result comparable to the Proposed II method but they extracted more artifacts. The Ramirez, the winner of HDIBCO 2016, and FAIR methods produced poor results and failed to extract several parts of the text. Hence, we can conclude that the proposed method had oustanding performance for different combined degradations such as water-spilling, faint character, and non-uniform intensity.

These results confirm that the proposed method can handle several types of combined degradations. In particular, histogram analysis successfully changed the histogram of the image into a uniform distribution, while local adaptive thresholding successfully changed pixels in a non-text segment to be uniform and binarized. The histogram segmented the text appropriately, such that only pixels in the first segment were assigned as text.

4.2. Quantitative result

Table 1, Table 2, Table 3, Table 4, Table 5 show quantitative results for the DIBCO 2013, HDIBCO 2014, HDIBCO 2016, PHIBD, and private Jawi datasets, respectively. From Table 1, the Proposed I method obtained FM, FMps, PSNR, DRD, and MPM values of 89.73, 93.89, 18.94, 3.50, and 1.57, respectively; while the Proposed II method obtained 89.41, 94.15, 18.87, 3.51, and 1.45, respectively. The results show that the Proposed II method obtained the best values for FMps and MPM, while the Howe method obtained better value for FM, PSNR, NRM, and DRD. Table 2 shows that the proposed method had slightly lower performance than the Howe method and the winner of HDIBCO 2016. Table 3 shows that the proposed method obtained the best values for FM, PSNR, DRD, and MPM; however, the Ramirez method obtained the best value for FMps. Table 4 shows that the Proposed I method obtained FM, FMps, PSNR, DRD, and MPM values of 91.47, 93.00, 19.64, 2.85, and 2.08 respectively. Moreover, the table shows that the Proposed I method obtained the best value for DRD, while the Ramirez method obtained better value for FM, FMps, PSNR, and MPM. Table 5 shows that the proposed methods obtained better results in term of all evaluation metrics. The Proposed I method obtained FM, FMps, PSNR, and DRD values of 91.52, 93.54, 15.77, and 3.26, respectively, which were the highest among all methods, while the Proposed II method obtained an MPM value of 2,95, which was the highest among all methods.

Table 1.

Comparison of the performance of different methods on the DIBCO 2013 dataset.

Methods	FM	FMps	PSNR	DRD	MPM
MG-Sauvola	84.10	89.33	17.89	4.85	1.98
Lu	87.08	88.03	18.75	4.27	3.2
Bataineh	77.81	81.27	15.21	15.09	18.33
Howe	91.34	91.79	21.29	3.18	3.55
Su	87.70	88.15	19.59	4.21	3.02
Ramirez	90.43	92.94	19.32	3.91	3.32
FAIR	90.78	91.47	20.54	3.59	3.35
Nafchi	90.41	90.99	19.44	3.47	2.08
WH16	91.26	91.82	21.21	3.18	3.53
Proposed I	89.73	93.89	18.94	3.50	1.57
Proposed II	89.41	94.15	18.87	3.51	1.45

Open in a new tab

Table 2.

Comparison of the performance of different methods on the HDIBCO 2014 dataset.

Methods	FM	FMps	PSNR	DRD	MPM
MG-Sauvola	87.70	90.90	18.04	4.04	0.72
Lu	91.08	91.64	19.71	3.08	0.96
Bataineh	87.32	89.02	17.75	4.57	2.32
Howe	96.49	97.38	22.24	1.08	0.33
Su	94.38	95.94	20.31	1.95	0.33
Ramirez	92.26	94.38	19.72	2.61	0.36
FAIR	96.14	96.73	21.88	1.25	0.29
Nafchi	93.35	96.05	19.45	2.19	–
WH16	96.38	97.39	22.11	1.07	0.29
Proposed I	93.54	95.70	20.25	2.01	0.90
Proposed II	93.11	96.03	19.51	2.07	0.77

Open in a new tab

Table 3.

Comparison of the performance of different methods on the HDIBCO 2016 dataset.

Methods	FM	FMps	PSNR	DRD	MPM
MG-Sauvola	87.54	90.74	17.96	4.74	4.80
Lu	84.44	92.04	17.33	0.12	2.29
Bataineh	82.08	84.08	15.47	10.00	7.92
Howe	87.47	92.28	18.05	5.35	9.30
Su	84.75	88.94	17.64	5.64	4.61
Ramirez	88.23	92.73	18.44	4.17	3.58
FAIR	88.50	92.51	18.31	4.27	3.32
Nafchi	88.11	91.17	18.00	4.38	–
WH16	87.61	92.40	18.11	5.21	8.70
Proposed I	91.06	92.36	19.29	3.38	1.86
Proposed II	90.78	92.43	19.13	3.46	1.96

Open in a new tab

Table 4.

Comparison of the performance of different methods on the PHIBD dataset.

Methods	FM	FMps	PSNR	DRD	MPM
MG-Sauvola	87.21	89.09	18.72	9.20	5.21
Lu	87.95	91.07	18.35	4.61	0.96
Bataineh	82.82	84.53	16.67	12.64	9.17
Howe	90.97	92.91	19.29	3.03	1.94
Su	88.21	88.82	18.27	5.44	2.65
Ramirez	93.11	94.69	20.27	3.07	1.74
FAIR	71.38	72.39	14.73	11.25	2.34
Nafchi	92.26	94.00	20.15	4.08	–
WH16	89.84	91.56	19.09	3.15	2.39
Proposed I	91.47	93.00	19.64	2.85	2.08
Proposed II	91.39	92.88	19.60	2.94	2.08

Open in a new tab

Table 5.

Comparison of the performance of different methods on the Jawi dataset.

Methods	FM	FMps	PSNR	DRD	MPM
MG-Sauvola	90.31	92.50	15.35	4.98	8.80
Lu	85.17	88.13	13.81	6.17	3.70
Bataineh	87.04	90.95	14.37	7.15	8.14
Howe	83.43	84.88	13.03	7.21	7.75
Su	85.98	84.78	14.04	7.27	6.32
Ramirez	77.55	79.22	13.96	9.75	7.15
FAIR	80.89	80.90	11.69	9.00	9.97
Nafchi	85.80	89.25	13.64	7.20	15.86
WH16	80.10	81.63	12.84	7.48	14.98
Proposed I	91.52	93.54	15.77	3.26	3.12
Proposed II	91.08	93.44	15.63	3.34	2.95

Open in a new tab

Table 6 shows the average results of different methods across all experimental datasets. The table shows that the proposed methods had better performances than the other methods; in particular, they were more effective and efficient in handling different type of degradations in a degraded document. The Proposed I method performed better than other methods in terms of the average results. However, the Howe method obtained a PSNR value similar to the Proposed I method. Moreover, the Proposed I method achieved better performance than the Proposed II method. Nevertheless, the Proposed II method still achieved better performance than the other methods. For example, for FM, the Proposed I method obtained a value of 91.46, which was the best result, while the Proposed II method obtained a value of 91.03 which was the second best result. Furthermore, the Howe method, which was obtained better performance among the comparison methods, only obtained a value of 89.94 that was worse than the proposed methods. Another example is for DRD, where the Proposed I method obtained a value of 3.00, which was the best result, while the Proposed II method obtained a value of 3.05, which was the second best result. Moreover, the Howe method, which was obtained better performance among the comparison methods, only obtained a value of 3.97 that was worse than the proposed methods. Hence, we can conclude that the proposed methods showed superior performances than the other methods.

Table 6.

Comparison of the average results of different methods across the datasets.

Methods	FM	FMps	PSNR	DRD	MPM
MG-Sauvola	87.37	90.51	17.59	5.56	4.30
Lu	86.93	90.85	17.50	4.61	1.95
Bataineh	83.41	85.97	15.89	9.89	9.18
Howe	89.94	91.85	18.78	3.97	4.57
Su	88.20	89.32	17.97	4.90	3.39
Ramirez	88.32	90.79	18.34	4.70	3.23
FAIR	85.54	86.80	17.43	5.87	3.85
Nafchi	88.85	90.77	17.98	4.23	3.41
WH16	89.46	91.37	18.66	4.12	4.42
Proposed I	91.46	93.70	18.78	3.00	1.91
Proposed II	91.03	93.63	18.49	3.05	1.87

Open in a new tab

4.3. Running time

In the experiment, we calculated the processing time of the proposed methods and the Bataineh, Su, Howe, and Nafchi methods. The experiment was conducted on quad-core Intel Core i5 with a 2.40 GHz processor and a 4.0 GB RAM. We estimated the computational speed processing on three datasets: the DIBCO, PHIBD, and Jawi datasets, and presented the result as a cumulative value in Table 7. The proposed I method obtained a computational processing time of 16.91 ms, which was the fastest, while the Howe method obtained 171.52 ms, which was the slowest. From Table 7, the Bataineh and Su methods had similar speed as the proposed method but they had lower quantitative performances than the proposed method. The Nafchi and Howe methods had speeds ten times slower than the proposed method, although they had similar quantitative performances as the proposed method. Hence, we can conclude that the proposed method was more efficient and effective in binarizing severely degraded ancient documents, particularly for documents that suffer from combinations of degradations.

Table 7.

Comparison of the running times of different binarization methods on the DIBCO, PHIBD, and Jawi datasets.

Dataset	DIBCO	PHIBD	Jawi	Average
Bataineh	59.72	99.28	10.33	56.44
Su	27.51	26.22	3.04	18.92
Nafchi	209.77	189.51	20.55	139.94
Howe	250.02	239.14	25.39	171.52
Proposed I	24,97	23.30	2.49	16.91
Proposed II	26,75	24.96	2.67	18.12

Open in a new tab

5. Conclusion

In this study, we proposed a novel binarization method that handles severe combined degradations in ancient documents. The proposed method comprised the following four stages: histogram analysis, contrast enhancement, local adaptive thresholding, and artifact removal. The proposed method was tested on four public i.e., the DIBCO 2013, HDIBCO 2014, HDIBCO 2016, and PHIBD datasets, and one private dataset, i.e., the Jawi dataset. The average results confirmed that the proposed method had outstanding performances in binarizing single documents suffering from several degradation types. The performances improved compared to methods derived from winners of various binarization competitions. Moreover, the processing time of the proposed method was faster than those of four other benchmarking methods. This fast processing time was crucial when the method is combined with other applications such as OCR or GT creator.

Declarations

Author contribution statement

Khairun Saddami: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.

Khairul Munadi, Yuwaldi Away: Conceived and designed the experiments; Wrote the paper.

Fitri Arnia: Conceived and designed the experiments; Contributed reagents, materials, analysis tools or data; Wrote the paper.

Funding statement

This work was funded by the Ministry of Research, Technology and Higher Education of the Republic of Indonesia under the PMDSU Grant No. 62/UN11.2/PP/SP3/2018, and the Doctoral Dissertation schemes Grant No. 2/UN11.2/PP/SP3/2019.

Competing interest statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

References

1.Otsu N. A threshold selection method from gray-level histograms. Automatica. 1975;11(285–296):23–27. [Google Scholar]
2.Niblack W. Strandberg Publishing Company; 1985. An Introduction to Digital Image Processing. [Google Scholar]
3.Sauvola J., Pietikäinen M. Adaptive document image binarization. Pattern Recognit. 2000;33(2):225–236. [Google Scholar]
4.Su B., Lu S., Tan C.L. Proceedings of the 9th IAPR International Workshop on Document Analysis Systems. ACM; 2010. Binarization of historical document images using the local maximum and minimum; pp. 159–166. [Google Scholar]
5.Nafchi H.Z., Moghaddam R.F., Cheriet M. Phase-based binarization of ancient document images: model and applications. IEEE Trans. Image Process. 2014;23(7):2916–2930. doi: 10.1109/TIP.2014.2322451. [DOI] [PubMed] [Google Scholar]
6.Bataineh B., Abdullah S.N.H.S., Omar K. An adaptive local binarization method for document images based on a novel thresholding method and dynamic windows. Pattern Recognit. Lett. 2011;32(14):1805–1813. [Google Scholar]
7.Saddami K., Munadi K., Muchallil S., Arnia F. Improved thresholding method for enhancing Jawi binarization performance. 14th International Conference on Document Analysis and Recognition, 2017; ICDAR'17; IEEE; 2017. pp. 1108–1113. [Google Scholar]
8.Gatos B., Pratikakis I., Perantonis S.J. Adaptive degraded document image binarization. Pattern Recognit. 2006;39(3):317–327. [Google Scholar]
9.Lu S., Su B., Tan C.L. Document image binarization using background estimation and stroke edges. Int. J. Doc. Anal. Recognit. (IJDAR) 2010;13(4):303–314. [Google Scholar]
10.Howe N.R. Document binarization with automatic parameter tuning. Int. J. Doc. Anal. Recognit. (IJDAR) 2013;16(3):247–258. [Google Scholar]
11.Ramírez-Ortegón M.A., Tapia E., Ramírez-Ramírez L.L., Rojas R., Cuevas E. Transition pixel: a concept for binarization based on edge detection and gray-intensity histograms. Pattern Recognit. 2010;43(4):1233–1243. [Google Scholar]
12.Lelore T., Bouchara F. Fair: a fast algorithm for document image restoration. IEEE Trans. Pattern Anal. Mach. Intell. 2013;35(8):2039–2048. doi: 10.1109/TPAMI.2013.63. [DOI] [PubMed] [Google Scholar]
13.Westphal F. Blekinge Institute of Technology; 2018. Efficient Document Image Binarization Using Heterogeneous Computing and Interactive Machine Learning. Ph.D. thesis, PhD thesis. [Google Scholar]
14.Mitianoudis N., Papamarkos N. Document image binarization using local features and Gaussian mixture modeling. Image Vis. Comput. 2015;38:33–51. [Google Scholar]
15.Arnia F., Munadi K., Muchallil S., Fardian F. Improvement of binarization performance by applying dct as pre-processing procedure. 2014 6th International Symposium on Communications, Control and Signal Processing; ISCCSP; IEEE; 2014. pp. 128–132. [Google Scholar]
16.Su B., Lu S., Tan C.L. Robust document image binarization technique for degraded document images. IEEE Trans. Image Process. 2013;22(4):1408–1417. doi: 10.1109/TIP.2012.2231089. [DOI] [PubMed] [Google Scholar]
17.Moghaddam R.F., Cheriet M. Low quality document image modeling and enhancement. Int. J. Doc. Anal. Recognit. 2009;11(4):183–201. [Google Scholar]
18.Rabeux V., Journet N., Vialard A., Domenger J.-P. Quality evaluation of degraded document images for binarization result prediction. Int. J. Doc. Anal. Recognit. (IJDAR) 2014;17(2):125–137. [Google Scholar]
19.Rosenfeld A. Academic Press; 1976. Digital Picture Processing. [Google Scholar]
20.Pratikakis I., Gatos B., Ntirogiannis K. ICDAR 2013 Document Image Binarization Contest (DIBCO 2013). 2013 12th International Conference on Document Analysis and Recognition; ICDAR; IEEE; 2013. pp. 1471–1476. [Google Scholar]
21.Ntirogiannis K., Gatos B., Pratikakis I. ICFHR2014 competition on handwritten document image binarization (H-DIBCO 2014). 2014 14th International Conference on Frontiers in Handwriting Recognition; ICFHR; IEEE; 2014. pp. 809–813. [Google Scholar]
22.Pratikakis I., Zagoris K., Barlas G., Gatos B. ICFHR2016 handwritten document image binarization contest (H-DIBCO 2016). 2016 15th International Conference on Frontiers in Handwriting Recognition; ICFHR; IEEE; 2016. pp. 619–623. [Google Scholar]
23.Ayatollahi S.M., Nafchi H.Z. Persian heritage image binarization competition (PHIBC 2012). 2013 First Iranian Conference on Pattern Recognition and Image Analysis; PRIA; IEEE; 2013. pp. 1–4. [Google Scholar]
24.Moghaddam R.F., Cheriet M. A multi-scale framework for adaptive binarization of degraded document images. Pattern Recognit. 2010;43(6):2186–2198. [Google Scholar]
25.Ramírez-Ortegón M.A., Märgner V., Cuevas E., Rojas R. An optimization for binarization methods by removing binary artifacts. Pattern Recognit. Lett. 2013;34(11):1299–1306. [Google Scholar]
26.Gatos B., Ntirogiannis K., Pratikakis I. ICDAR 2009 document image binarization contest (DIBCO 2009). 2009. ICDAR'09. 10th International Conference on Document Analysis and Recognition; IEEE; 2009. pp. 1375–1382. [Google Scholar]
27.Pratikakis I., Gatos B., Ntirogiannis K. H-DIBCO 2010-handwritten document image binarization competition. 2010 International Conference on Frontiers in Handwriting Recognition; ICFHR; IEEE; 2010. pp. 727–732. [Google Scholar]
28.Pratikakis I., Gatos B., Ntirogiannis K. ICDAR 2011 document image binarization contest (DIBCO 2011). 11th International Conference on Document Analysis and Recognition, 2011; ICDAR'11; IEEE; 2011. pp. 1506–1510. [Google Scholar]
29.Moghaddam R.F., Nafchi H.Z. Objective evaluation of binarization methods, MATLAB Central File Exchange. http://www.mathworks.com/matlabcentral/fileexchange/27652

[br0010] 1.Otsu N. A threshold selection method from gray-level histograms. Automatica. 1975;11(285–296):23–27. [Google Scholar]

[br0020] 2.Niblack W. Strandberg Publishing Company; 1985. An Introduction to Digital Image Processing. [Google Scholar]

[br0030] 3.Sauvola J., Pietikäinen M. Adaptive document image binarization. Pattern Recognit. 2000;33(2):225–236. [Google Scholar]

[br0040] 4.Su B., Lu S., Tan C.L. Proceedings of the 9th IAPR International Workshop on Document Analysis Systems. ACM; 2010. Binarization of historical document images using the local maximum and minimum; pp. 159–166. [Google Scholar]

[br0050] 5.Nafchi H.Z., Moghaddam R.F., Cheriet M. Phase-based binarization of ancient document images: model and applications. IEEE Trans. Image Process. 2014;23(7):2916–2930. doi: 10.1109/TIP.2014.2322451. [DOI] [PubMed] [Google Scholar]

[br0060] 6.Bataineh B., Abdullah S.N.H.S., Omar K. An adaptive local binarization method for document images based on a novel thresholding method and dynamic windows. Pattern Recognit. Lett. 2011;32(14):1805–1813. [Google Scholar]

[br0070] 7.Saddami K., Munadi K., Muchallil S., Arnia F. Improved thresholding method for enhancing Jawi binarization performance. 14th International Conference on Document Analysis and Recognition, 2017; ICDAR'17; IEEE; 2017. pp. 1108–1113. [Google Scholar]

[br0080] 8.Gatos B., Pratikakis I., Perantonis S.J. Adaptive degraded document image binarization. Pattern Recognit. 2006;39(3):317–327. [Google Scholar]

[br0090] 9.Lu S., Su B., Tan C.L. Document image binarization using background estimation and stroke edges. Int. J. Doc. Anal. Recognit. (IJDAR) 2010;13(4):303–314. [Google Scholar]

[br0100] 10.Howe N.R. Document binarization with automatic parameter tuning. Int. J. Doc. Anal. Recognit. (IJDAR) 2013;16(3):247–258. [Google Scholar]

[br0110] 11.Ramírez-Ortegón M.A., Tapia E., Ramírez-Ramírez L.L., Rojas R., Cuevas E. Transition pixel: a concept for binarization based on edge detection and gray-intensity histograms. Pattern Recognit. 2010;43(4):1233–1243. [Google Scholar]

[br0120] 12.Lelore T., Bouchara F. Fair: a fast algorithm for document image restoration. IEEE Trans. Pattern Anal. Mach. Intell. 2013;35(8):2039–2048. doi: 10.1109/TPAMI.2013.63. [DOI] [PubMed] [Google Scholar]

[br0130] 13.Westphal F. Blekinge Institute of Technology; 2018. Efficient Document Image Binarization Using Heterogeneous Computing and Interactive Machine Learning. Ph.D. thesis, PhD thesis. [Google Scholar]

[br0140] 14.Mitianoudis N., Papamarkos N. Document image binarization using local features and Gaussian mixture modeling. Image Vis. Comput. 2015;38:33–51. [Google Scholar]

[br0150] 15.Arnia F., Munadi K., Muchallil S., Fardian F. Improvement of binarization performance by applying dct as pre-processing procedure. 2014 6th International Symposium on Communications, Control and Signal Processing; ISCCSP; IEEE; 2014. pp. 128–132. [Google Scholar]

[br0160] 16.Su B., Lu S., Tan C.L. Robust document image binarization technique for degraded document images. IEEE Trans. Image Process. 2013;22(4):1408–1417. doi: 10.1109/TIP.2012.2231089. [DOI] [PubMed] [Google Scholar]

[br0170] 17.Moghaddam R.F., Cheriet M. Low quality document image modeling and enhancement. Int. J. Doc. Anal. Recognit. 2009;11(4):183–201. [Google Scholar]

[br0180] 18.Rabeux V., Journet N., Vialard A., Domenger J.-P. Quality evaluation of degraded document images for binarization result prediction. Int. J. Doc. Anal. Recognit. (IJDAR) 2014;17(2):125–137. [Google Scholar]

[br0190] 19.Rosenfeld A. Academic Press; 1976. Digital Picture Processing. [Google Scholar]

[br0200] 20.Pratikakis I., Gatos B., Ntirogiannis K. ICDAR 2013 Document Image Binarization Contest (DIBCO 2013). 2013 12th International Conference on Document Analysis and Recognition; ICDAR; IEEE; 2013. pp. 1471–1476. [Google Scholar]

[br0210] 21.Ntirogiannis K., Gatos B., Pratikakis I. ICFHR2014 competition on handwritten document image binarization (H-DIBCO 2014). 2014 14th International Conference on Frontiers in Handwriting Recognition; ICFHR; IEEE; 2014. pp. 809–813. [Google Scholar]

[br0220] 22.Pratikakis I., Zagoris K., Barlas G., Gatos B. ICFHR2016 handwritten document image binarization contest (H-DIBCO 2016). 2016 15th International Conference on Frontiers in Handwriting Recognition; ICFHR; IEEE; 2016. pp. 619–623. [Google Scholar]

[br0230] 23.Ayatollahi S.M., Nafchi H.Z. Persian heritage image binarization competition (PHIBC 2012). 2013 First Iranian Conference on Pattern Recognition and Image Analysis; PRIA; IEEE; 2013. pp. 1–4. [Google Scholar]

[br0240] 24.Moghaddam R.F., Cheriet M. A multi-scale framework for adaptive binarization of degraded document images. Pattern Recognit. 2010;43(6):2186–2198. [Google Scholar]

[br0250] 25.Ramírez-Ortegón M.A., Märgner V., Cuevas E., Rojas R. An optimization for binarization methods by removing binary artifacts. Pattern Recognit. Lett. 2013;34(11):1299–1306. [Google Scholar]

[br0260] 26.Gatos B., Ntirogiannis K., Pratikakis I. ICDAR 2009 document image binarization contest (DIBCO 2009). 2009. ICDAR'09. 10th International Conference on Document Analysis and Recognition; IEEE; 2009. pp. 1375–1382. [Google Scholar]

[br0270] 27.Pratikakis I., Gatos B., Ntirogiannis K. H-DIBCO 2010-handwritten document image binarization competition. 2010 International Conference on Frontiers in Handwriting Recognition; ICFHR; IEEE; 2010. pp. 727–732. [Google Scholar]

[br0280] 28.Pratikakis I., Gatos B., Ntirogiannis K. ICDAR 2011 document image binarization contest (DIBCO 2011). 11th International Conference on Document Analysis and Recognition, 2011; ICDAR'11; IEEE; 2011. pp. 1506–1510. [Google Scholar]

[br0290] 29.Moghaddam R.F., Nafchi H.Z. Objective evaluation of binarization methods, MATLAB Central File Exchange. http://www.mathworks.com/matlabcentral/fileexchange/27652

PERMALINK

Effective and fast binarization method for combined degradation on ancient documents

Khairun Saddami

Khairul Munadi

Yuwaldi Away

Fitri Arnia

Abstract

1. Introduction

Figure 1.

2. Proposed method

Figure 2.

2.1. Histogram analysis

Figure 3.

2.2. Contrast enhancement

2.3. Local adaptive thresholding

2.4. Removing artifacts

3. Experimental setup

4. Results and discussion

4.1. Qualitative result

Figure 4.

Figure 5.

Figure 6.

4.2. Quantitative result

Table 1.

Table 2.

Table 3.

Table 4.

Table 5.

Table 6.

4.3. Running time

Table 7.

5. Conclusion

Declarations

Author contribution statement

Funding statement

Competing interest statement

Additional information

References

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases