Abstract
This paper presents a novel method for segmentation of white blood cells (WBCs) in peripheral blood and bone marrow images under different lighting conditions through mean shift clustering, color space conversion, and nucleus mark watershed operation (NMWO). The proposed method focuses on obtaining seed points. First, color space transformation and image enhancement techniques are used to obtain nucleus groups as inside seeds. Second, mean shift clustering, selection of the C channel component in the CMYK model, and illumination intensity adjustment are employed to acquire WBCs as outside seeds. Third, the seeds and NMWO are employed to precisely determine WBCs and solve the cell adhesion problem. Morphological operations are further used to improve segmentation accuracy. Experimental results demonstrate that the algorithm exhibits higher segmentation accuracy and robustness compared with traditional methods.
Keywords: segment, white blood cells, peripheral blood and bone marrow, different lights, mean shift clustering, C channel component, nucleus mark watershed operation, morphological operations
1. Introduction
White blood cells (WBCs) in peripheral blood and bone marrow play a significant role in the auxiliary diagnosis of various diseases, such as AIDS, leukemia, and other blood-related diseases. The WBC count, also known as the differential blood count (DBC), is an indicator of certain diseases. In DBC, medical experts count 100 or 200 WBCs on stained blood smear slides and accordingly compute the percentage occurrence of each type of WBC [1]. Traditional counting methods that involve the use of a microscope are time-consuming, complicated, tedious, and prone to errors. Meanwhile, automatic recognition methods utilize a flow cytometry apparatus [2] and a blood cell analyzer [3]. These tools are mainly employed for routine blood examination rather than blood cell detection. For the clinical diagnosis of blood diseases, however, experts still rely on patient blood smears and a microscope to observe the shape of blood cells. As such, development of an automatic cell recognition system based on image processing and pattern recognition technology to replace manual recognition and counting has become a current trend.
Blood contains different cell lines, the most important of which are WBCs, platelets, and red blood cells (RBCs) [4]. WBCs, which are also called immune cells, help the body to fight infection and foreign matter. Inflammation in the body or other blood diseases can cause changes in the percentage and total number of WBCs. Collected image samples contain both WBCs and RBCs, which complicates the processing and selection of WBCs. In this regard, the WBC segmentation algorithm should work accurately on both peripheral blood and leukemic cells in image processing for precise diagnosis of blood diseases.
Several approaches have been developed for WBC segmentation. These methods are usually based on color space and mathematical morphology operations. Putzu et al. [5,6] proposed a method based on the cyan, magenta, yellow, and key plate (CMYK) color space to separate WBCs because these cells lack the Y component. This method solved the problem of distinguishing similarities between the cytoplasm and the background. Putzu et al. [5] also used the component in the CIE Lab color space to obtain the nucleus in a leukemic blood image. The hue, saturation, and intensity (HSI) model [7,8,9] is commonly used among the numerous color models. Dividing the nucleus only in a peripheral blood image is easy and rapid and could yield improved segmentation accuracy. Lim et al. [9] employed a combination of the HSI color space, watershed technique [10,11], and other morphology operations to obtain WBCs in leukocytes and blast cells. The nucleus of WBCs has been segmented using techniques that allow contrast enhancement on grayscale images for noise elimination [12]. These methods are simple but incapable of accurately segmenting the nucleus of WBCs when the gray value of the nucleus is close to that of the cytoplasm in acute myelocytic leukemia (AML) blood images.
Several researchers have combined the CIE Lab color space with the K-means clustering algorithm for cell division [13,14,15]. The K-means is an algorithm based on color pixel values, with the Euclidean distance as the similarity measure. The algorithm relies excessively on initialization data and other parameters. The clustering result is closely related to the number and shape of the target data. When the color of the cytoplasm strikingly differs from that of the nucleus, this strategy cannot precisely obtain all WBCs in several image conditions.
Rezatofighi et al. [16] segmented the nucleus of five types of WBCs in peripheral blood via a novel method based on Gram-Schmidt orthogonalization to amplify the desired color vectors and weaken undesired ones; in this method, the nucleus boundary was employed as the initial contour of the snake to track the boundaries of WBCs. Ko et al. [17] proposed a WBC segmentation algorithm with stepwise merging rules through mean shift clustering and boundary removal with a gradient vector flow snake. These algorithms are highly effective and accurate in distinguishing cells. However, these approaches require a significant amount of time and cannot address the problem of overlapping WBCs.
Previous studies mainly used active contour models [18,19,20] and mathematical morphology [10,11,21,22,23] to segment overlapping WBCs. However, excessive segmentation, low accuracy of cell division, and other challenges remain and must be overcome.
The present study mainly aims to develop a new algorithm based on color space, mean shift clustering, illumination adjustment, and nucleus mark watershed operation (NMWO) for segmentation of overlapping WBCs in peripheral blood and AML blood images. Morphological operations such as morphological reconstruction, open-and-close operations, morphological denoising, and cell centroid connection are employed to improve segmentation accuracy.
The remainder of the paper is organized as follows: Section 2 presents a brief introduction to the theories of color space, mean shift clustering, and watershed transform. Section 3 introduces the proposed framework for WBC segmentation in five stages. Section 4 depicts and discusses the experimental results. Finally, Section 5 summarizes the conclusions of this study.
2. Technological Background
Segmentation of WBCs is the most crucial step in hematological image analysis. In clinical blood analysis, medical experts generally use the color features and morphology information of blood cells to distinguish cell types. The proposed method employs different color spaces to identify WBCs, as well as mean shift clustering, C channel component selection, illumination intensity adjustment, and image enhancement techniques to obtain a complete set of WBCs. Watershed transformation is also applied to address the cell adhesion problem.
2.1. Different Color Spaces
The original stained blood smear image is represented in the RGB color space. An RGB image is an M × N × 3 array of color pixels. Each pixel at a specific spatial location has red (R), green (G), and blue (B) components, the three primary colors that are superimposed at different levels to produce different colors. The RGB color space is one of the most widely used color systems.
The HSI color model consists of three components, hue (H), saturation (S), and intensity (I). This model has a color description consistent with that of humans. The HSI space is more appropriate than the RGB space for WBC segmentation because of its low correlation with image processing.
The CMYK color model is a subtractive model used in color printing and describing the printing process. CMYK refers to the four inks used in color printing; C denotes cyan, M refers to magenta, Y denotes yellow, and K depicts the key plate (black) [6].
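As an illustration of how a C (cyan) value can be derived from an RGB pixel, the sketch below uses the common device-independent RGB-to-CMYK conversion. This formula is an assumption; the paper does not state which conversion was used, and the function name `rgb_to_cmyk` is hypothetical.

```python
def rgb_to_cmyk(r, g, b):
    """Convert one 8-bit RGB pixel to CMYK fractions in [0, 1].

    Assumption: the simple profile-free conversion; a printing
    workflow would normally use an ICC colour profile instead.
    """
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
    k = 1.0 - max(rn, gn, bn)          # key (black) component
    if k == 1.0:                       # pure black: avoid division by zero
        return 0.0, 0.0, 0.0, 1.0
    c = (1.0 - rn - k) / (1.0 - k)     # cyan
    m = (1.0 - gn - k) / (1.0 - k)     # magenta
    y = (1.0 - bn - k) / (1.0 - k)     # yellow
    return c, m, y, k


# A bluish-purple pixel, loosely resembling a stained nucleus.
c, m, y, k = rgb_to_cmyk(100, 50, 150)
```

Under this conversion the C value depends only on how far the red channel falls below the brightest channel, so bluish and purplish stained regions tend to receive larger C values than reddish ones.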
2.2. Illumination Intensity Adjustment
In practice, color information and morphological information are used for WBC recognition. Color information plays an important role in image segmentation. Different illumination can produce dissimilarities in image color. However, illumination is difficult to standardize across labs. Therefore, the effect of light intensity should be eliminated during image segmentation.
The methods of color feature extraction mainly include color histogram, chromaticity histogram, and color constancy [24]. A chromaticity histogram not only has the advantages of a color histogram but also can eliminate the influence of light intensity on image color. Color constancy can eliminate the influence of light intensity and is robust to illumination-induced color changes; however, the corresponding calculation procedure is complex. In this paper, the proposed method transforms the image from the RGB space to the red-green (rg) chroma space to eliminate the influence of illumination intensity.
The rg chroma space can eliminate the influence of light intensity on color. The color space conversion from the RGB space to the rg chroma space is shown in the following equation:

$$r = \frac{R}{R + G + B}, \qquad g = \frac{G}{R + G + B}, \qquad b = \frac{B}{R + G + B} \qquad (1)$$

where $(R, G, B)$ are the three channel components of the original RGB image, and $(r, g, b)$ are the three channel components in the rg chroma space. Figure 1a shows the original RGB image under different light conditions, whereas Figure 1b shows the same image in the rg chroma space.
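The intensity invariance claimed for the rg chroma space can be checked on a single pixel. The sketch below assumes Equation (1) is the standard rg chromaticity normalization, in which each channel is divided by R + G + B; the function name `rgb_to_rg_chroma` and the neutral fallback for black pixels are illustrative choices, not part of the original method.

```python
def rgb_to_rg_chroma(r, g, b):
    """Normalise an RGB pixel by its channel sum (rg chromaticity)."""
    total = r + g + b
    if total == 0:
        # Assumption: a black pixel carries no colour information,
        # so it is mapped to the neutral point.
        return (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
    return (r / total, g / total, b / total)


dim = rgb_to_rg_chroma(40, 80, 120)      # pixel under weak light
bright = rgb_to_rg_chroma(80, 160, 240)  # same colour, doubled intensity
```

Both calls return the same chromaticity, because scaling all three channels by a common factor cancels in the ratio.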
2.3. Mean Shift Clustering
Mean shift clustering is a gradient ascent method used to determine the local highest density of a data set by using mean shifts. Although the procedure was initially described decades ago [25], it was unpopular in the vision community until its potential uses for feature space analysis and optimization were understood [26,27,28]. The non-parametric nature of mean shift makes this method an effective tool to discover arbitrarily shaped clusters in the data.
Given n sample points xi, i = 1, …, n, in the d-dimensional space Rd, the basic form of the mean shift vector at point x is defined as:

$$M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x) \qquad (2)$$

where Sh is a high-dimensional spherical region of radius h, that is, the set of points y that satisfy:

$$S_h(x) = \left\{\, y : (y - x)^{T} (y - x) \le h^{2} \,\right\} \qquad (3)$$

and k is the number of sample points xi that fall within the region Sh.
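The basic mean shift vector can be sketched in a few lines. The code below is a minimal illustration that assumes a flat (uniform) kernel over a sphere of radius h; the function names `mean_shift_vector` and `seek_mode` are hypothetical and this is not the authors' implementation.

```python
import math


def mean_shift_vector(x, points, h):
    """Average offset from x to the sample points within radius h of x."""
    neighbours = [p for p in points if math.dist(p, x) <= h]
    if not neighbours:
        return [0.0] * len(x)
    k = len(neighbours)  # number of samples that fall inside the sphere
    return [sum(p[d] - x[d] for p in neighbours) / k for d in range(len(x))]


def seek_mode(x, points, h, tol=1e-6, max_iter=100):
    """Move x along successive mean shift vectors until it stops moving."""
    x = list(x)
    for _ in range(max_iter):
        shift = mean_shift_vector(x, points, h)
        x = [xi + si for xi, si in zip(x, shift)]
        if math.sqrt(sum(s * s for s in shift)) < tol:
            break
    return x


# Two well-separated groups of samples; each start converges to its own mode.
samples = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
mode_a = seek_mode((0.2, 0.1), samples, h=2.0)
mode_b = seek_mode((10.9, 10.8), samples, h=2.0)
```

Points whose trajectories end at the same mode are assigned to the same cluster, which is why no cluster count needs to be given in advance.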
If the sample points xi are drawn from a probability density function, the gradient of a non-zero probability density points in the direction in which the density increases most. Thus, on average, the samples within Sh are more likely to lie along the gradient direction of the probability density.
In the mean shift method, final clustering is affected by two factors, namely, the bandwidth of the spatial neighborhood and the bandwidth of the pixel color. The following rules are defined for the xi points that fall in the area of Sh.
When the colors of pixels x and xi are compared, the closer the colors, the higher the probability density. When the positions of pixels x and xi are compared, the shorter the distance between x and xi, the higher the probability density. Thus, the probability density is the product of these two rules.
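By substituting the kernel function into Equation (2), the result can be expressed with the joint kernel in Equation (4):

$$K_{h_s,h_r}(x) = \frac{C}{h_s^{2} h_r^{2}}\, k\!\left(\left\|\frac{x^{s}}{h_s}\right\|^{2}\right) k\!\left(\left\|\frac{x^{r}}{h_r}\right\|^{2}\right) \qquad (4)$$

where $K_{h_s,h_r}$ is the kernel function, introduced to cope with the curse of dimensionality; $h_s$ represents the distance bandwidth; $h_r$ reflects the color bandwidth; $x^{s}$ indicates the spatial location information; $x^{r}$ shows the color information; and $C$ denotes the unit density.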
2.4. Watershed Transformation
Watershed transformation was originally proposed by Beucher et al. [29] and improved via rapid implementation methods established by Vincent and Soille [30]. This transformation is traditionally classified as a region-based segmentation approach. Arslan et al. [11] also used marker functions to improve the performance of watershed transformation.
In geography, a watershed is a ridge, and catchment basins lie on either side of the ridge, each with its own water flow. In gray-scale image processing, the watershed transform treats local extremum regions as catchment basins and the basin boundaries as ridge lines to segment the image. Oversegmentation caused by an excess of local extremum regions is a well-known limitation of watershed transformation. However, this problem does not significantly influence the proposed WBC adhesion segmentation scheme because precise nucleus groups are regarded as the local extremum regions.
3. Scheme and Methods
This paper presents novel insights into WBC segmentation by obtaining cell seeds and separating adhesive cells in peripheral blood and bone marrow images under different lighting conditions. The developed algorithm is mainly divided into four phases. The first phase aims to obtain the nucleus and inside seeds from the rg and HSI color spaces. The second phase focuses on obtaining WBCs as outside seeds through mean shift clustering operations, extraction of the C channel component in CMYK model, illumination intensity adjustment, and image enhancement techniques. The third phase intends to solve cell adhesion by using NMWO. The last phase involves post-processing techniques to precisely obtain the nucleus and WBCs. A morphological operation is applied in all phases to improve segmentation accuracy.
In contrast to traditional algorithms [11,14,31], the proposed method is suitable for bone marrow and peripheral blood images and employs local region color information clustering, illumination intensity adjustment, extraction of the C channel component in the CMYK model, and adaptive threshold techniques. The proposed method exhibits robustness for segmentation of various types of WBCs, even under conditions with similar cytoplasm and background. This method also solves the cell adhesion problem with satisfactory performance.
Figure 2 shows the proposed framework of the segmentation scheme. As shown in the block diagram, the system has four main phases, the details of which are explained in the following subsections.
3.1. Morphology Operation
Reconstruction operation is a morphological transformation method that requires a marker binary image, a mask binary image, and a structural element. The marker image should be a subset of the mask image.
In morphological reconstruction, the marker binary image denotes the starting point of the transformation, the mask binary image constrains the transformation process, and the structural element defines the connectivity. Reconstruction of the mask binary image from the marker binary image is defined by an iterative process. Reconstruction can restore the shape of the image. The accuracy of reconstruction depends on the shape and the similarity among structural elements.
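The iterative process can be sketched as repeated dilation of the marker, clipped by the mask at every step, until nothing changes. The code below is a minimal pure-Python illustration on nested lists of 0/1 values; the function name `reconstruct` and the 4-neighbour structural element are assumptions made for the example.

```python
def reconstruct(marker, mask):
    """Binary reconstruction by dilation of `marker` under `mask` (4-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    offsets = ((-1, 0), (1, 0), (0, -1), (0, 1))
    # The marker is clipped so that it is a subset of the mask.
    current = {(r, c) for r in range(rows) for c in range(cols)
               if marker[r][c] and mask[r][c]}
    frontier = set(current)
    while frontier:                      # stop when a dilation adds nothing
        grown = set()
        for r, c in frontier:
            for dr, dc in offsets:
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and mask[nr][nc] and (nr, nc) not in current:
                    grown.add((nr, nc))
        current |= grown
        frontier = grown
    return [[1 if (r, c) in current else 0 for c in range(cols)]
            for r in range(rows)]


mask = [[1, 1, 0, 0, 0],
        [1, 1, 0, 1, 1],
        [0, 0, 0, 1, 1]]
marker = [[0, 0, 0, 0, 0],
          [0, 1, 0, 0, 0],
          [0, 0, 0, 0, 0]]
restored = reconstruct(marker, mask)
```

Only the mask region that contains a marker pixel is restored; the unmarked region on the right is discarded, which is how a nucleus marker can be used to remove impurities from a WBC mask.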
In morphology denoising, common algorithms for extracting connected regions are applied with 4-connectivity. Regions with an area less than S1 are removed from the 4-connected regions of the nucleus binary image, whereas regions with an area less than S2 are removed from the 4-connected regions of the individual WBC binary image. In this study, this method is referred to as morphology denoising.
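A minimal sketch of this denoising step follows: each 4-connected region is collected with a breadth-first search and kept only if its area reaches the threshold. The function name `remove_small_regions` is hypothetical, and the threshold argument stands in for S1 or S2.

```python
from collections import deque


def remove_small_regions(binary, min_area):
    """Drop every 4-connected region whose pixel count is below min_area."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not binary[r0][c0] or seen[r0][c0]:
                continue
            # Collect one connected region by breadth-first search.
            region, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and binary[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(region) >= min_area:   # keep regions that are large enough
                for r, c in region:
                    out[r][c] = 1
    return out


noisy = [[1, 1, 0, 0, 1],
         [1, 1, 0, 0, 0],
         [0, 0, 0, 1, 0]]
clean = remove_small_regions(noisy, min_area=3)
```

In the example the 2 × 2 block survives, while the two isolated pixels, which play the role of platelets or noise, are removed.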
During morphological erosion and dilation, the former removes small isolated features; breaks apart thin, adjoining regions in a feature; and reduces the size of solid objects by “eroding” them at the boundaries. Dilation joins the broken lines to form a contour, which delineates the region of interest.
3.2. Phase I
This phase mainly aims to obtain the nucleus in a peripheral blood image and the nucleus groups used to locate WBCs.
Inside seeds, which are also called nucleus groups, can be used to determine the correct number of WBCs and applied as the local extremum region in the process of watershed transformation to avoid oversegmentation. Therefore, accurate acquisition of inside seeds is crucial.
When color information is considered, the nucleus in the g component of the rg chroma space presents lower pixel values than the other components. In the HSI color space, the nucleus in the S component demonstrates higher pixel values than the other components.
Let the rg chroma image be obtained by Equation (1). The normalized matrices Ig and Is are derived from the separated g component in the rg chroma space and the S component in the HSI color space, respectively. The ranges of g and S can be normalized by:

$$I_g = \frac{g - g_{\min}}{g_{\max} - g_{\min}}, \qquad I_s = \frac{S - S_{\min}}{S_{\max} - S_{\min}} \qquad (5)$$

where $g_{\max}$, $g_{\min}$, $S_{\max}$, and $S_{\min}$ are the maximum value of g, the minimum value of g, the maximum value of S, and the minimum value of S, respectively. Figure 3 shows the normalized matrices Ig and Is, as well as their histograms.
The enhanced image (IE) can be expressed as follows:
(6) |
The histogram in Figure 3 shows that the nucleus contains a small gray level in and a large gray level in , whereas the background and RBCs present low gray levels in and high gray levels in . Figure 3 shows that , , and . where , , , , , and are the gray levels of nucleus in , nucleus in , RBCs in , RBCs in , background in , and background in , respectively.
Adaptive global threshold segmentation proposed by Otsu is applied to IE to obtain the inside seed binary image. The mathematical morphology method is used to eliminate platelets or other impurities in the binary image. The nucleus in a peripheral blood image can also be obtained through the same method, which produced accurate results in [8]. For parts of the bone marrow blood cell images, the difference between the tint of the cytoplasm and that of the nucleus is too small to be precisely distinguished. This method can also be used to segment WBCs in several AML blood images. Regarding texture, the method can also produce good recognition results in an automatic recognition system.
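Otsu's method selects the gray level that maximizes the between-class variance of the two groups it separates. The sketch below is a generic pure-Python version operating on a flat list of integer gray levels; it is an illustration of the technique, not the authors' code, and the function name `otsu_threshold` is hypothetical.

```python
def otsu_threshold(values, levels=256):
    """Return the gray level t that best separates values <= t from values > t."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with gray level <= t
    sum0 = 0    # sum of the gray levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue                     # one class is empty: no valid split
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy bimodal data: a dark group and a bright group.
pixels = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 40
t = otsu_threshold(pixels)
binary = [1 if v > t else 0 for v in pixels]
```

Because the threshold is derived from the histogram of each image, it adapts to images captured under different lighting without a manually tuned constant.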
Nucleus groups are difficult to obtain because of segmented granulocytes with multiple cores. A centroid-connected operation is proposed in this study to convert multiple cores into a single core. The area and distance between multiple cores have a certain range. We assume the existence of two cores, and , where and are the areas of and , respectively, and is the minimum distance between and . If < < , < < , and < , then multiple cores exist in the inside seed binary image. In addition, , , and are the thresholds determined by the area range of a single core and the general spacing of multiple cores. We assume that and are the centroid coordinates of and , respectively. The two coordinates are adopted to perform centroid-connected operation and obtain the following nucleus groups:
(7) |
In Equation (7), is the fluctuation range of the centroid. , , , and are the minimum value of , the maximum value of , the minimum value of , and the maximum value of , respectively. 0, , 0, and are the minimum value of the X-axis, the maximum value of the X-axis, the minimum value of the Y-axis, and the maximum value of the Y-axis, respectively. Figure 4 presents the process of obtaining inside seeds.
We obtain the maximum and minimum values to prevent the boundaries of the centroid from exceeding the range of the axes. A rectangular shape is constructed using the four vertices: , , , and . Within this rectangular shape, the pixel value is 1, and the value of the others pixels is 0. Finally, the multiple cores are connected as shown in Figure 4c,d.
3.3. Phase II
The outside seed is a binary image that contains WBCs and certain impurities. The outside seed plays an important role in segmenting adhesive cells and obtaining WBCs accurately. In this phase, two strategies are used to obtain WBCs, namely, mean shift clustering and WBC enhancement. The combination of these two strategies confers two advantages for obtaining outside seeds: the operations strengthen the weak areas of WBCs and overcome the lighting effects on image quality to some extent.
3.3.1. Mean Shift Clustering
Mean shift clustering is an algorithm used to smooth an image based on color and distance in image segmentation. The mean shift procedure is applied to cluster data points whose gradient ascent trajectories lead to the same mode. Figure 5 shows the original RGB image of four types of WBCs. Figure 6 and Figure 7 show the C component and the histogram of the C component in the CMYK color space, respectively. Selection of the C component to obtain WBCs in the CMYK color space is a suitable technique for mean shift clustering based on color and space for two reasons. First, the brightness of the nucleus is stronger than that of the cytoplasm, whereas the brightness of the cytoplasm is stronger than that of RBCs and the background. Second, cyan is a mixture of green and blue. The cyan domain, which is relatively small in the visible spectrum at wavelengths of 480 nm to 510 nm, presents identical color brightness. By placing the color image in a 3D feature space, where one component represents the C component and two other components represent the x and y spatial coordinates of the pixel, the image can be segmented through 3D mean shift clustering. This technique is highly effective for clustering statistics iteration. For clustering, the visual image derived from the segmentation method is combined with the color and space information.
The kernel function used in this study is the simplest one, the uniform distribution. After clustering with an appropriate kernel distance bandwidth (hs) and color bandwidth (hr), the C component image is shown in Figure 8.
As shown in Table 1, the width range of the WBCs is 26 to 74. Generally, the width of most WBCs is around the mid-value of 50. In the mean shift clustering algorithm, iterations are conducted four times by default to determine a mode in one square with a side length of (2hs + 1). In theory, hs = 3 or hs = 5 is a suitable kernel distance bandwidth.
Table 1.
Cell Types | Area | Height | Width | Roundness |
---|---|---|---|---|
segmented neutrophil | 708~1797 | 28~53 | 30~48 | 1.49~2.31 |
staff neutrophil | 939~2236 | 32~58 | 37~59 | 1.57~2.54 |
lymphocyte | 460~2468 | 26~65 | 27~61 | 1.52~2.34 |
monocyte | 1777~3367 | 49~70 | 47~74 | 1.51~2.48 |
eosinophil | 942~2216 | 30~55 | 32~55 | 1.67~3.09 |
basophil | 943~1932 | 36~63 | 31~51 | 1.51~3.08 |
AML WBCs | 553~3041 | 27~60 | 24~67 | 1.47~3.03 |
Figure 6 and Figure 7 show that the RBCs and background exhibit smaller intensity values than the nucleus and cytoplasm, with hr = 3 as the color bandwidth. Figure 9 shows the clustering results with different bandwidths. As shown in Table 1 and Table 2, we assume that the area of WBCs is N times the area of the nucleus. After mean shift clustering, we set the segmentation threshold as follows to convert the image intensity to the range 0–1:
(8) |
where is the area of nucleus, and is the area of the whole image. shows the gray value frequency in the C component, is the summation of gray value frequency, and is the gray value from 0 to . (peripheral blood images, N = 3; bone marrow images, N = 1.5).
Table 2.
Cell Types | Area | Height | Width | Distance between Two Cores |
---|---|---|---|---|
nucleus of segmented neutrophil | 124~872 | 10~39 | 10~39 | 0~ |
nucleus of staff neutrophil | 418~1029 | 12~52 | 13~50 | |
nucleus of lymphocyte | 437~1018 | 23~44 | 23~39 | |
nucleus of monocyte | 970~1640 | 33~61 | 32~53 | |
nucleus of eosinophil | 426~2157 | 31~53 | 32~53 | |
nucleus of basophil | 858~1716 | 31~47 | 30~50 |
From the adaptive threshold segmentation for the C component after clustering, we obtain WBCs, as shown in Figure 10.
3.3.2. WBC Enhancement
When the components of the rg chroma image are considered, WBCs in the component present smaller pixel values (Figure 11a) than the other components, whereas WBCs in the component (Figure 11b) demonstrate larger pixel values than the other components. After removing the light intensity, the enhanced image can be defined by:
(9) |
Figure 11c shows the enhanced image. Adaptive threshold segmentation for the enhanced image can be used to obtain WBCs with some impurities, as shown in Figure 11d.
The adaptive segmentation threshold is set as follows:
(10) |
where is the area of the nucleus, is the area of the entire image, is the area of RBCs and the background, is the gray value frequency of the enhanced image, is the summation of the gray value frequencies and i is the gray value from 0 to .
3.3.3. Obtaining the WBC Region
The two WBC binary images segmented by the mean shift clustering operations and the WBC enhancement operations are added before binarization and morphology denoising. The required WBC binary images are determined and presented in Figure 12 and Figure 13.
3.4. Phase III
Watershed segmentation is a mathematical method based on the theory of topology for morphology segmentation. NMWO is a watershed algorithm marked by the nucleus. If we assume that one WBC has one nucleus and the nucleus is not an adhesive type, the nucleus can be regarded as a local extremum region to mark WBCs. The watershed ridge line of the local extremum region can serve as the boundary of WBCs to separate adhesive WBCs. When the nucleus is adhesive or WBCs and RBCs are of the adhesive type, a second watershed transformation process is required to separate adhesive cells.
If we assume the presence of target , then denotes the roundness value and shows the area of target . If < and > , then the target is the adhesive cells in this study. and are the roundness and area thresholds, respectively. The overall description of the NMWO algorithm is described as follows:
-
Step 1:
Obtain and modify inside seeds and outside seeds by using the mean shift algorithm and the morphology operation.
-
Step 2:
Determine whether cell adhesion occurs or not in . If yes, proceed to the following steps; if no, end.
-
Step 3:
Generate the map of distances, named , from the black pixel to the white pixels of the inside seeds.
-
Step 4:
Apply the watershed algorithm to . The watershed ridge line shown in can be used to obtain separating blood cell images ( ).
-
Step 5:
Determine whether cell adhesion occurs or not in . If yes, do the following steps; if no, end.
-
Step 6:
Obtain the local extremum region [21] on each adhesion target individually. The local extremum is set according to cell size. Perform adaptive iterative erosion on each adhesion target individually until the number of targets increases or does not merely disappear.
-
Step 7:
Apply the watershed algorithm to one by one to obtain the watershed ridge lines. The watershed ridge line displayed in can obtain the separating blood cell images ( ).
-
Step 8:
End.
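The first watershed pass (Steps 3 and 4) can be sketched as follows: build a distance map from the inside seeds, then flood it from the seeds so that each pixel of the outside seed receives the label of the nucleus that reaches it first. This is a simplified illustration under several assumptions: city-block distance, 4-connectivity, priority flooding with no explicit ridge-line pixels, and hypothetical function names. The adhesion test and the second, erosion-based pass are omitted.

```python
import heapq
from collections import deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity


def distance_to_seeds(markers):
    """City-block distance from every pixel to the nearest seed (markers > 0)."""
    rows, cols = len(markers), len(markers[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if markers[r][c] > 0:
                dist[r][c] = 0
                queue.append((r, c))
    while queue:  # multi-source breadth-first search
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def marker_watershed(elevation, markers, mask):
    """Flood `elevation` from labelled seeds; only pixels inside `mask` get a label."""
    rows, cols = len(elevation), len(elevation[0])
    labels = [[markers[r][c] if mask[r][c] else 0 for c in range(cols)]
              for r in range(rows)]
    heap, order = [], 0  # `order` breaks ties between equal elevations
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] > 0:
                heapq.heappush(heap, (elevation[r][c], order, r, c))
                order += 1
    while heap:  # always expand the lowest pixel reached so far
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and mask[nr][nc] and labels[nr][nc] == 0):
                labels[nr][nc] = labels[r][c]
                heapq.heappush(heap, (elevation[nr][nc], order, nr, nc))
                order += 1
    return labels


# Two touching cells that appear as one blob, with one nucleus seed each.
mask = [[1] * 9 for _ in range(3)]
markers = [[0] * 9 for _ in range(3)]
markers[1][1] = 1
markers[1][7] = 2
labels = marker_watershed(distance_to_seeds(markers), markers, mask)
```

In this toy case the blob is split near its middle column, where the two floods meet. Because exactly one label grows from each seed, the number of resulting regions equals the number of nucleus markers, which is the property that keeps marker-based watershed from oversegmenting.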
Figure 14b shows the preliminary binary image of WBCs. Most WBCs are adhesive. As shown in Figure 14d, all adhesive cells are separated after applying the NMWO segmentation algorithm. Multiple sets of successful segmentation results show that the adhesion segmentation method is highly effective.
3.5. Post-Processing
Post-processing is required to accurately obtain nucleus and WBCs. Applying “logic and” between the second watershed transformation binary figure ( ) and the nucleus can address nucleus adhesion. When is a mask and the nucleus is a marker binary image, morphological reconstruction for and nucleus can remove impurities that adhered in . The final result shows the separation of WBCs in the binary image.
4. Experimental Results
4.1. Data Set
Two sets of blood smear images were used in this study. Dataset 1 was collected from the Second Affiliated Hospital of Shandong University. Blood smears were created by the Wright staining method. A ternary Olympus microscope with an oil lens of 100× magnification and CCD was used to collect data images. The 24-bit RGB images were obtained under different lighting conditions for three types of smears (normal peripheral blood, M3 bone marrow, and M5 bone marrow) sampled from more than 10 individuals. A total of 306 images with 774 WBCs were captured with a resolution of 2080 × 1542 pixels. In dataset 2, 108 images were downloaded from the ALL-IDB1 dataset [32]. These JPEG images were captured in the RGB format with three resolutions of 2592 × 1944 pixels, 1712 × 1368 pixels, and 1226 × 652 pixels.
4.2. Morphology Preference
Before image processing, the original image was resized to 0.2 times its original size to improve the efficiency of the segmentation system.
Most WBCs are round and convex in shape, and the roundness and area of individual cells present a certain range. However, the shapes and roundness appear checkered and non-convex when multiple cells form larger regional areas or myeloplast lesions. After repairing the binary image of WBCs by morphological operations, chain code tracking is used to obtain the morphological characteristic parameters of WBCs.
The main morphological characteristic parameters are area, perimeter, length, width, and roundness. We divided WBCs into seven cell types, namely, segmented neutrophil, staff neutrophil, lymphocyte, monocyte, eosinophil, basophil, and AML WBCs. More than 30 WBCs are involved in the parameter statistics for every cell type.
The main morphological characteristic parameters of WBCs and the nucleus in dataset 1 are shown in Table 1 and Table 2, respectively. In this study, most parameters were designed based on the morphological characteristic parameters in Table 1 and Table 2:
-
(1)
S1 and S2. As shown in Table 1 and Table 2, the minimum area of the nucleus is 124 in segmented neutrophils, and the minimum area of WBCs is 460. Thus, we select S1 = 100 and S2 = 350 as the area thresholds to eliminate platelets and noise in morphological denoising.
-
(2)
: Fluctuation range of the centroid, which may not be located in the cores. We set a fluctuation range around the centroid to connect multiple cores into one nucleus and comprehensively consider the length and width of WBCs, as well as the distances of the multiple cores. = 5 is a suitable fluctuation range.
-
(3)
: The minimum distance threshold of multiple cores. We calculated the range of distances between multiple cores, which is from 0 to . = is a suitable distance threshold.
-
(4)
and : Area thresholds of one core. The range of the area is 124 to 872 for one core. The area whose value is within the range of 100 to 950 can be one of the judgment conditions to distinguish segmental neutrophils. Thus, = 100 and = 950.
-
(5)
and . and are the area and roundness threshold of the adhesion target, respectively. In most cases, the area of the adhesive target is higher than 3500, whereas the roundness of the adhesive target is higher than 3.00. Thus, = 3500 and = 3.00.
-
(6)
. is the local extremum in the second watershed transformation. is designed based on WBC size. The minimum length of WBCs is 24 in terms of AML WBCs shown in Table 1. is a little less than , Thus, we selected as a suitable local extremum.
The morphological characteristic parameters of WBCs and the nucleus are adjusted according to the image resolution in pixels. We also calculate the morphological characteristic parameters of dataset 2 by using a method similar to that used for dataset 1.
4.3. Segmentation Evaluation
Identification of bone marrow leukocytes can provide a preliminary and auxiliary diagnosis of leukemia. To date, the recognition rate of bone marrow cells by several senior leukemia experts is only 60% to 70%. Degenerative bone marrow cells and other cell types affect the distinction of leukemia in a computer recognition system. Therefore, cells with characteristics similar to WBCs, such as degenerative bone marrow cells and metarubricytes, were regarded as WBCs in this study.
A total of 414 images, 952 individual WBCs, and 598 adhesive WBCs were used from the dataset to complete the segmentation experiments. The dataset contained 260 peripheral blood WBCs, which consisted of 45 basophils, 31 eosinophils, 88 neutrophils, 46 lymphocytes, and 50 monocytes in the peripheral blood images, as well as 514 AML WBCs, and approximately 776 ALL-IDB1 WBCs in the bone marrow images. We compared our algorithm with three methods, namely, color-based clustering [14], color and shape transformation [11], and simple linear iterative clustering (SLIC) [31].
The visual results for WBC segmentation are presented in Figure 15, Figure 16, Figure 17, Figure 18 and Figure 19. These visual results are consistent with the quantitative results presented in Table 3, Table 4 and Table 5.
Table 3.
Method | TP | Over-segmentation | Under-segmentation | FP | FN |
---|---|---|---|---|---|
Proposed algorithm | 1481 | 19 | 44 | 10 | 69 |
Color clustering | 1332 | 9 | 511 | >800 | 218 |
Color and shape transformation | 1407 | 25 | 73 | 35 | 143 |
SLIC | 1327 | 286 | 57 | >100 | 223 |
Table 4.
Metric | Proposed Algorithm | Color Clustering | Color and Shape Transformation | SLIC |
---|---|---|---|---|
Precision (P) | 99% | <62.5% | 97.6% | <93% |
Recall (R) | 95.5% | 86% | 90.8% | 85.6% |
F1 | 97% | <72.4% | 94.1% | <89.1% |
Table 5.
Dataset | Cell type | Proposed Algorithm | Color Clustering | Color and Shape Transformation | SLIC |
---|---|---|---|---|---|
Dataset 1 | Lymphocyte | 43 | 33 | 39 | 38 |
Dataset 1 | Monocyte | 49 | 35 | 45 | 40 |
Dataset 1 | Eosinophil | 29 | 22 | 30 | 23 |
Dataset 1 | Basophil | 44 | 33 | 42 | 35 |
Dataset 1 | Neutrophil | 83 | 57 | 61 | 49 |
Dataset 1 | AML WBCs | 472 | 301 | 435 | 368 |
Dataset 2 | ALL-IDB1 WBCs | 698 | 331 | 657 | 431 |
In [14], the image is transformed into the Lab color space, and the image pixels are then clustered by running K-means on the a and b channels. The a and b values of each pixel are plotted in the feature space and clustered into three groups. The cluster that contains circular objects is selected to represent WBCs. Small connected objects are removed in a post-processing step.
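The clustering step of this baseline can be illustrated with a small, dependency-free K-means (Lloyd's algorithm) on (a, b) pairs. This is a sketch under stated assumptions, not the implementation of [14]: the function name, the deterministic initialization, and the sample (a, b) values are hypothetical, the Lab conversion is assumed to have been done beforehand, and the selection of the cluster containing circular objects is not shown.

```python
import math


def kmeans(points, k, max_iter=100):
    """Cluster 2-D points with Lloyd's algorithm; returns (labels, centroids)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # ASSUMPTION: deterministic initialization from evenly spaced input points.
    # Practical implementations usually use random restarts or k-means++.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical (a, b) values for three pixel populations.
pixels_ab = [
    (1.0, 0.5), (0.5, 1.0), (0.8, 0.2),            # background-like
    (20.0, 10.0), (21.0, 9.5), (19.5, 10.5),       # RBC-like
    (30.0, -40.0), (31.0, -41.0), (29.0, -39.5),   # WBC-like
]
labels, centroids = kmeans(pixels_ab, k=3)
```

Because the clustering uses color alone, pixels of one cell whose nucleus and cytoplasm differ strongly in a or b can end up in different clusters.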
Color clustering implemented for cell segmentation performs effectively on bone marrow images because of the color similarity between the cytoplasm and nucleus in these images. However, for a few peripheral blood images, the pixel values of the cytoplasm and nucleus in the a or b components are easily distinguished, as shown in Figure 15 and Figure 19.
In [11], an intensity map is obtained by color transformation in the first stage, and a coarse mask is obtained by applying Otsu's method to the intensity map. The boundaries are then refined by applying active contours without edges [33], which evolve a contour by minimizing an energy function on the intensity map. The mask produced by the active contours is delimited by the coarse mask and gives the WBC binary mask. The problem of cell adhesion is solved by a watershed transform based on the distance transformation [11].
The experimental results of [11] showed that the method is suitable for bone marrow images in which WBCs considerably differ from the background. Two discussion-worthy points can be inferred. First, the intensity map cannot describe the characteristics of all types of WBCs. Second, the active contour model cannot deal with images of inhomogeneous intensity.
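The coarse-mask step relies on Otsu's method, which picks the threshold that maximizes the between-class variance of the intensity histogram. The sketch below is a plain-Python illustration of that standard algorithm, not the code of [11]; the function name and the sample intensities are hypothetical, and whether WBCs lie above or below the threshold depends on how the intensity map is defined.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    `values` must be integers in [0, levels). Pixels <= t form one class
    and pixels > t form the other.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of intensities of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2.
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Hypothetical bimodal intensity map: a dark and a bright population.
intensity = [10] * 50 + [12] * 50 + [200] * 50 + [205] * 50
t = otsu_threshold(intensity)
coarse_mask = [v > t for v in intensity]  # ASSUMPTION: foreground is brighter
```

A single global threshold of this kind is sensitive to uneven illumination, which is one reason a refinement stage follows it in [11].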
Comparison of NMWO with marker-function watershed transformation [11] showed that the local extremum region can be obtained depending on the shape of the adhesive cells in the binary image. When the adhesive cells appear to be in pieces, an adaptive iteration method combined with NMWO used as the watershed operation can achieve more accurate segmentation results than the marker-function watershed transformation. In this study, the use of the nucleus and adaptive iteration of WBCs as the catchment basin for segmentation of adhesive cells in complex bone marrow images achieved more accurate results than other methods.
The brightness of the nucleus is generally higher than that of the cytoplasm, RBCs, and background in the peripheral blood and bone marrow images. In our experiments, we combined spatial and color similarities through mean shift clustering to obtain WBCs and achieve high accuracy for every type of WBC. The K-means algorithm clusters on color alone and therefore differs from mean shift clustering. When different cell types appear in the same image, the K-means algorithm, in contrast to mean shift clustering, cannot adapt to large changes in the pixel intensity of different types of WBCs. SLIC is another segmentation method that differs from mean shift clustering in terms of color and space. In the preprocessing stage, SLIC is suitable for obtaining image boundaries, but not for WBC segmentation.
In this study, the proposed algorithm based on color features and adaptive thresholds achieved accurate results for WBC segmentation. Two segmentation strategies were performed to overcome common problems in segmenting weak areas, such as the cytoplasm. The proposed algorithm is considered robust for different types of WBCs under different lighting conditions.
Most nuclei did not show adhesion, and each WBC contained one nucleus group. We applied nucleus groups as the local extremum regions, and the marker-based watershed transformation [21] was used to segment overlapping WBCs. Thus, the complexity of leukocyte adhesion is considerably reduced. When the nuclei showed adhesion, we applied the second watershed transformation based on the adaptive iteration method to the overlapping WBCs. The application of NMWO to segment adhesive cells achieved more accurate results than other methods for complex bone marrow images.
The performance of the algorithms was also evaluated based on F-measure [34]. TP shows the number of correct WBC segmentations, FP reflects the number of RBCs and other impurities, and FN denotes the number of leaking and broken WBCs. Table 3 and Table 4 show the comparison of the overall performance in terms of the F-measure of the four algorithms, whereas Table 5 enumerates the correct segmentation of different cell types in the two datasets.
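The relation between the counts in Table 3 and the scores in Table 4 can be checked with the standard definitions P = TP / (TP + FP), R = TP / (TP + FN) and F1 = 2PR / (P + R). The sketch below assumes these standard definitions, since the formula from [34] is not restated in this section; the class and function names are hypothetical, and only rows with exact FP counts in Table 3 are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int  # correctly segmented WBCs
    fp: int  # RBCs and other impurities
    fn: int  # leaking and broken WBCs


def precision(c):
    return c.tp / (c.tp + c.fp)


def recall(c):
    return c.tp / (c.tp + c.fn)


def f1_score(c):
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r)


# Counts taken from Table 3.
proposed = Counts(tp=1481, fp=10, fn=69)
color_shape = Counts(tp=1407, fp=35, fn=143)

for name, c in [("proposed", proposed), ("color and shape", color_shape)]:
    print(f"{name}: P={precision(c):.1%} R={recall(c):.1%} F1={f1_score(c):.1%}")
```

Under these definitions the computed values agree with Table 4 up to rounding. For color clustering and SLIC, Table 3 gives only lower bounds on FP, which is why Table 4 reports upper bounds on P and F1 for those methods.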
For both datasets, the proposed method exhibits satisfactory performance for WBC segmentation. The experimental results are presented in Table 3, Table 4 and Table 5. Table 3 and Table 4 show that the proposed algorithm achieves better performance than other traditional methods, such as color clustering [14], color and shape transformation [11], and SLIC [31]. Table 5 further demonstrates that the proposed method outperforms other traditional methods for the private and public datasets. Figure 15, Figure 16, Figure 17, Figure 18 and Figure 19 present the visual comparison between our method and other traditional methods, with the proposed method generating more accurate segmentation results.
The proposed method achieves high performance for WBC segmentation in peripheral blood and bone marrow images. Adhesion segmentation accuracy is a second advantage of the proposed method over traditional methods, as shown in Table 3 and in Figure 14, Figure 16, Figure 17 and Figure 18. However, the proposed algorithm has a few disadvantages. For example, the watershed ridge line does not match the boundaries of cells in a few images. If the degree of cell adhesion and the morphology in the image are very complex, other algorithms with reasonable and effective morphological characteristics should be used to solve this complicated problem in future work.
5. Conclusions
This paper presents novel insights into WBC segmentation by obtaining cell seeds and separating adhesive cells in peripheral blood and bone marrow images under different light conditions. Color space conversion, mean shift clustering, and illumination intensity adjustment were applied to obtain WBCs as outside seeds. Nucleus enhancement and centroid connection were also employed to obtain inside seeds in different color spaces. Morphological operations and NMWO were subsequently applied to WBC adhesion segmentation. The proposed method has a reasonable processing time and provides accurate results.
The proposed method exhibits high robustness and satisfactory performance for WBC segmentation in peripheral blood and bone marrow images. Adhesion segmentation accuracy is the second advantage of the proposed method over traditional methods.
Adhesion segmentation for the diagnosis of complicated abnormal cells in bone marrow diseases remains a great challenge. In addition, distinguishing between ALL and AML or identifying M3 and other AML diseases via an automatic identification system, as mandated by the current WHO standard, requires further research.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (No. 61475085), the Science and Technology Development Plans of Shandong Province (No. 2012GGE27073 and 2014GSF118142), the Science and Technology Development Plans of Jinan City and the Fundamental Research Funds of Shandong University (No. 2015JC038).
Author Contributions
Zhi Liu and Jing Liu contributed equally to this paper. Zhi Liu, Jing Liu and Xiaoyan Xiao wrote the paper. Zhi Liu and Chengyun Zheng designed the experiments. Hui Yuan and Jun Chang analyzed the data. Jing Liu and Xiaomei Li performed the experiments.
Conflicts of Interest
The authors declare no conflict of interest.
References
- 1.Wang S.T., Wang M. A New Detection algorithm (NDA) based on Fuzzy Cellular Neural Networks for White Blood Cell Detection. IEEE Trans. Inf. Technol. Biomed. 2006;10:5–10. doi: 10.1109/titb.2005.855545.
- 2.Altendorf E., Zebert D., Holl M., Yager P. Differential blood cell counts obtained using a microchannel based flow cytometer; Proceedings of the 1997 International Conference on Solid State Sensors and Actuators; Chicago, IL, USA. 16–19 June 1997; pp. 531–534.
- 3.Hirschfeld T. Blood Cell Analyzer. No. 3,819,270. U.S. Patent. 1974 June 25;
- 4.Alberts B., Johnson A., Lewis J., Raff M., Roberts K., Walter P. Molecular Biology of the Cell. 3rd ed. W. H. Freeman; New York, NY, USA: 1994.
- 5.Putzu L., di Ruberto C. White blood cells identification and counting from microscopic blood images; Proceedings of the WASET International Conference on Bioinformatics, Computational Biology and Biomedical Engineering; Guangzhou, China. 1 November 2013; pp. 268–275.
- 6.Zhang C., Xiao X., Li X. White Blood Cell Segmentation by Color-Space-Based K-Means Clustering. Sensors. 2014;14:16128–16147. doi: 10.3390/s140916128.
- 7.Duan J., Yu L. A WBC segmentation method based on HSI color space; Proceedings of the 4th IEEE International Conference on Broadband Network and Multimedia Technology (IC-BNMT); Shenzhen, China. 28–30 October 2011; pp. 629–632.
- 8.Huang D.C., Hung K.D. Leukocyte nucleus segmentation and recognition in color blood-smear images; Proceedings of the IEEE International Instrumentation and Measurement Technology Conference (I2MTC); Graz, Austria. 13–16 May 2012; pp. 171–176.
- 9.Lim H.N., Mashor M.Y., Hassan R. White blood cell segmentation for acute leukemia bone marrow images; Proceedings of the 2012 IEEE International Conference on Biomedical Engineering (ICoBE); Penang, Malaysia. 27–28 February 2012; pp. 357–361.
- 10.Roerdink J.B., Meijster A. The watershed transform: Definitions, algorithms and parallelization strategies. Fundam. Inform. 2000;41:187–228.
- 11.Arslan S., Ozyurek E., Gunduz-Demir C. A color and shape based algorithm for segmentation of white blood cells in peripheral blood and bone marrow images. Cytom. A. 2014;85:480–490. doi: 10.1002/cyto.a.22457.
- 12.Mohamed M., Far B., Guaily A. An efficient technique for white blood cells nuclei automatic segmentation; Proceedings of the 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC); Seoul, Korea. 14–17 October 2012; pp. 220–225.
- 13.Laosai J., Chamnongthai K. Acute leukemia classification by using SVM and K-Means clustering; Proceedings of the 2014 IEEE International Electrical Engineering Congress (IEECON); Chonburi, Thailand. 19–21 March 2014; pp. 1–4.
- 14.Salem N.M. Segmentation of white blood cells from microscopic images using K-means clustering; Proceedings of the 31st IEEE National Radio Science Conference (NRSC); Cairo, Egypt. 28–30 April 2014; pp. 371–376.
- 15.Gautam A., Bhadauria H.S. White blood nucleus extraction using K-Mean clustering and mathematical morphing; Proceedings of the 5th IEEE International Conference on The Next Generation Information Technology Summit; Noida, India. 25–26 September 2014; pp. 549–554.
- 16.Rezatofighi S.H., Soltanian-Zadeh H. Automatic recognition of five types of white blood cells in peripheral blood. Comput. Med. Imaging Graph. 2011;35:333–343. doi: 10.1016/j.compmedimag.2011.01.003.
- 17.Ko B.C., Gim J.W., Nam J.Y. Automatic white blood cell segmentation using stepwise merging rules and gradient vector flow snake. Micron. 2011;42:695–705. doi: 10.1016/j.micron.2011.03.009.
- 18.Zimmer C., Labruyere E., Meas-Yedid V., Guillen N., Olivo-Marin J.C. Segmentation and tracking of migrating cells in videomicroscopy with parametric active contours: A tool for cell-based drug testing. IEEE Trans. Med. Imaging. 2002;21:1212–1221. doi: 10.1109/TMI.2002.806292.
- 19.Zhang B., Zimmer C., Olivo-Marin J.C. Tracking fluorescent cells with coupled geometric active contours; Proceedings of the 2004 IEEE International Symposium on Biomedical Imaging: Nano to Macro; Arlington, VA, USA. 15–18 April 2004; pp. 476–479.
- 20.Xiong G., Zhou X., Ji L. Automated segmentation of Drosophila RNAi fluorescence cellular images using deformable models. IEEE Trans. Circuits Syst. 2006;53:2415–2424. doi: 10.1109/TCSI.2006.884461.
- 21.Gonzalez R.C., Woods R.E., Eddins S.L. Digital Image Processing Using MATLAB. Pearson Education; Indianapolis, IN, USA: 2004.
- 22.Osowski S., Markiewicz T., Marianska B., Moszczynski L. Feature generation for the cell image recognition of myelogenous leukemia; Proceedings of the 12th IEEE Conference on Signal Processing; Vienna, Austria. 6–10 September 2004; pp. 753–756.
- 23.Ping Z.J.H. A Survey for the Arithmetic of Overlapped Cell Segmentation. Comput. Digit. Eng. 2008;6:53–56.
- 24.Funt B.V., Finlayson G.D. Color constant color indexing. IEEE Trans. Pattern Anal. Mach. Intell. 1995;17:522–529. doi: 10.1109/34.391390.
- 25.Fukunaga K., Hostetler L. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans. Inf. Theory. 1975;21:32–40. doi: 10.1109/TIT.1975.1055330.
- 26.Wand M.P., Jones M.C. Kernel Smoothing. CRC Press; Boca Raton, FL, USA: 1994.
- 27.Cheng Y. Mean shift, mode seeking, and clustering. IEEE Trans. Pattern Anal. Mach. Intell. 1995;17:790–799. doi: 10.1109/34.400568.
- 28.Comaniciu D., Meer P. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002;24:603–619. doi: 10.1109/34.1000236.
- 29.Beucher S. Watersheds of functions and picture segmentation; Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP); Paris, France. 3–5 May 1982; pp. 1928–1931.
- 30.Vincent L., Soille P. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. Pattern Anal. Mach. Intell. 1991;13:583–598. doi: 10.1109/34.87344.
- 31.Achanta R., Shaji A., Smith K., Lucchi A., Fua P., Susstrunk S. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012;34:2274–2282. doi: 10.1109/TPAMI.2012.120.
- 32.ALL-IDB Website. [(accessed on 23 July 2015)]. Available online: http://www.dti.unimi.it/fscotti/all.
- 33.Chan T.F., Vese L.A. Active contours without edges. IEEE Trans. Image Process. 2001;10:266–277. doi: 10.1109/83.902291.
- 34.Han J., Kamber M. Data Mining Concepts and Techniques. 3rd ed. Machine Press; Beijing, China: 2012.