Abstract.
Multicolor fluorescence in situ hybridization (M-FISH) is a multichannel imaging technique for rapid detection of chromosomal abnormalities. Segmenting chromosomes from M-FISH images is a critical and challenging step toward better chromosome classification. Recently, several fuzzy C-means (FCM) clustering-based methods have been proposed for M-FISH image segmentation or classification, e.g., adaptive fuzzy C-means (AFCM) and improved AFCM (IAFCM), but most of these methods use imaging information from only one channel, which limits their accuracy. To improve segmentation accuracy and robustness, we propose an FCM clustering-based method, denoted by spatial- and spectral-FCM. Our method has the following advantages: (1) it is able to exploit information from neighboring pixels (spatial information) to reduce the noise and (2) it can incorporate pixel information across different channels simultaneously (spectral information) into the model. We evaluated the performance of our method by comparing it with other FCM-based methods in terms of both accuracy and false-positive detection rate on synthetic, hybrid, and real images. The comparisons on 36 M-FISH images show that, by incorporating both spatial and spectral information from M-FISH images, our proposed method achieves higher segmentation accuracy and a lower false-positive ratio than the conventional FCM and IAFCM methods.
Keywords: image segmentation, multicolor fluorescence in situ hybridization images, sparsity, spatial and spectral fuzzy C-means cluster, total variation
1. Introduction
Both numerical and structural chromosomal abnormalities, such as duplications, translocations, inversions, and deletions, are important factors associated with complex diseases. For example, Down syndrome is caused by having three copies of chromosome 21 (Ref. 1). The detection of chromosomal abnormalities has been used for prenatal and postnatal diagnostics and for cancer cytogenetics research. Multicolor fluorescence in situ hybridization (M-FISH) is a powerful tool for simultaneous visualization of chromosomal abnormalities in a single cell by labeling chromosomes with different fluorophores, where the Boolean combinational labeling strategy is used. The number of combinations of $N$ fluorophores is $2^N - 1$; therefore, five different fluorophores ($2^5 - 1 = 31$ combinations) are sufficient for differentiating 24 types of chromosomes (22 autosomes and two sex chromosomes). Figure 1 shows an image set with six-channel M-FISH images from a single metaphase cell. In Figs. 1(a)–1(e), the chromosomes labeled with different fluorophores are acquired by an epifluorescence microscope equipped with a filter cube [including far red (F), spectrum green (G), spectrum aqua (A), spectrum red (R), and spectrum gold (Y)]. The other image, Fig. 1(f), is the 4′,6-diamidino-2-phenylindole (DAPI) channel, which is labeled as D. All chromosomes are labeled in the DAPI channel, so they are all visible in image D.
Fig. 1.
A demonstration of an M-FISH image set with six-channel M-FISH images from a single metaphase cell. In images (a)–(e), different subsets of chromosomes are labeled with different fluorophores and are acquired by an epifluorescence microscope equipped with the filter cubes [far red (F), spectrum green (G), spectrum aqua (A), spectrum red (R), and spectrum gold (Y), respectively]. The other image (f) is the DAPI channel, which labels all chromosomes in the image.
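The Boolean combinatorial labeling count in the Introduction can be checked with a short enumeration. The sketch below is illustrative only: the function name `boolean_label_codes` is hypothetical, and the fluorophore letters are simply the channel abbreviations used in the text.

```python
from itertools import combinations

# Channel abbreviations as listed in the text (F, G, A, R, Y).
FLUOROPHORES = ["F", "G", "A", "R", "Y"]


def boolean_label_codes(fluorophores):
    """Return every nonempty subset of the fluorophores as a tuple."""
    codes = []
    for size in range(1, len(fluorophores) + 1):
        codes.extend(combinations(fluorophores, size))
    return codes


codes = boolean_label_codes(FLUOROPHORES)
n_codes = len(codes)        # equals 2**5 - 1
n_chromosome_types = 24     # 22 autosomes plus two sex chromosomes
print(n_codes, n_codes >= n_chromosome_types)  # prints: 31 True
```

With four fluorophores the same enumeration gives only 15 codes, which is why five are needed to separate 24 chromosome types.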
Accurate segmentation of M-FISH images has a significant impact on subsequent chromosomal classification and hence on clinical diagnosis. In our previous works on chromosome classification, including Bayesian classifier,2 sparse representation-based classifier,3 regularized multinomial logistic classifier,4 structural sparse representation classifier,5,6 and patch-based tensor decomposition classifier,7 several preprocessing steps8,9 were applied to facilitate the analysis of M-FISH images. Among them, image segmentation is a critical step and can significantly improve classification accuracy.9,10 Segmentation is also important because M-FISH uses five fluorescence channels to code different chromosomes. With accurate segmentation of the chromosomes in each channel, the chromosomes can be better distinguished by combining the information from all channels. Therefore, accurate segmentation is critical for M-FISH image analysis. By segmentation, a mask is generated to separate chromosome regions from the background. However, the quality of acquired M-FISH images is usually poor due to various factors,11 such as uneven intensity between imaging channels, inhomogeneous intensity within a channel, and spectral overlap.8 It has been a daunting challenge to accurately segment chromosomal images under such conditions.
A number of machine learning methods have been proposed for image segmentation or classification, e.g., supervised method,12 semisupervised method,13 unsupervised method,14–16 and hybrid method.17 Since M-FISH images are sensitive to experimental platforms and are largely heterogeneous across subjects (e.g., cells), unsupervised methods (e.g., clustering) are usually applied to image segmentation or classification. The FCM-based clustering method, as one of the unsupervised techniques, has been widely used in the fields of geology, astronomy, medical imaging, and multimedia.18–21 To reduce the effect of noise and intensity inhomogeneities in the image, several modified FCM approaches have been proposed by considering the spatial smoothness of neighboring pixel intensities in the image.22,23 For example, Tolias and Panas24 modified the prototype vectors as functions of the pixel location in the image for segmentation. Ahmed et al.25 modified FCM to use the immediate neighborhood when assigning the fuzzy membership of a pixel in the segmentation of MRI data. Chuang et al.26 incorporated the spatial information into the membership function to make the clustering less sensitive to noise, which yielded a more homogeneous segmented region. Pham and Prince27 developed AFCM to improve the objective function by adding the first and second order regularizations to the estimated cluster centroids to enable the estimated membership functions to be spatially smooth. In our recent work, we developed an improved adaptive fuzzy C-means (IAFCM) algorithm by adding a gain field to regularize the cluster center and showed better performance over other methods in M-FISH image segmentation.9 Despite their effectiveness, these methods mainly focus on clustering a single-channel image and ignore the multispectral information in M-FISH images. The information from multiple channels is complementary and can be used to further improve the segmentation.
Segmentation of multispectral images has been studied in several fields, such as remote sensing, monitoring, and chemometrics. Tran et al.28 presented an interesting tutorial on multispectral image segmentation. The work demonstrated the benefit of utilizing spectral information in multivariate image segmentation. As suggested by the study, partitional clustering techniques (e.g., FCM) can be the best option for segmentation given the known or easily estimated number of clusters in the image. Recently, Saïd et al.29 extracted speeded-up robust features from multispectral face images to be weighted in an FCM cost function to improve the segmentation. He et al.30 imposed a total variation (TV) regularization on the membership function in FCM to reduce the noise effect in hyperspectral imaging segmentation. In addition, some works have incorporated both spatial and spectral information for multispectral image segmentation. For example, Paclík et al.31 applied clustering to spectral and spatial information separately and combined both posterior probabilities based on the product rule. Li et al.32 also applied a two-step strategy to estimate the maximum combined posterior probability by integrating spectral and spatial prior information.
The previous works have shown the advantages of utilizing both spatial and spectral information in image segmentation; however, little or no work has been performed for the M-FISH imaging application. In this paper, we propose an improved fuzzy C-means clustering model, namely spatial- and spectral-FCM (ssFCM), to account for both spatial and spectral information simultaneously, with the use of TV and row-wise sparse constraints, respectively. The TV constraint uses gradient information to smooth the cluster by considering that the neighboring pixels share similar membership values. The row-wise sparse constraint ensures that the membership functions of the neighboring pixels share similar patterns across different channels. This new method is expected to deliver better results because it incorporates both the spectral information across multichannels and spatial information within a neighborhood simultaneously.
The rest of the paper is organized as follows. In Sec. 2, we first review the conventional FCM clustering method and then propose an improved model by incorporating both spatial and spectral information. In Sec. 3, we present the experimental results on both simulated and real M-FISH images. Finally, we conclude the paper with a brief discussion.
2. Method
2.1. FCM Clustering Method
The FCM clustering proposed by Refs. 15 and 16 is an unsupervised learning technique to organize the data into two or more clusters. FCM clustering can be obtained by the following optimization:
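$$\min_{U,\,C} \; J = \sum_{i=1}^{N} \sum_{k=1}^{K} u_{ik}^{m} \, \lVert x_i - c_k \rVert^{2} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_{ik} = 1, \tag{1}$$

where $x_i$ is the intensity value of the $i$'th pixel from an image with size $p$ by $q$, $i = 1, \ldots, N$, $N = p \times q$, and $c_k$ denotes the center of the $k$'th cluster, $k = 1, \ldots, K$. $K$ is set to 2 in this work, referring to background and chromosome. $u_{ik}$ is the membership of $x_i$ in the $k$'th cluster and $m$ is the weight on each membership to determine the degree of fuzziness. A higher value of membership $u_{ik}$ indicates a higher degree of probability for the $i$'th pixel to belong to the $k$'th cluster and vice versa. Therefore, a pixel will be assigned to a particular cluster if it has the highest membership value corresponding to the cluster. The optimization problem in Eq. (1) can be solved with the Lagrangian formulation as follows:

$$L = \sum_{i=1}^{N} \sum_{k=1}^{K} u_{ik}^{m} \, \lVert x_i - c_k \rVert^{2} + \sum_{i=1}^{N} \lambda_i \left( 1 - \sum_{k=1}^{K} u_{ik} \right), \tag{2}$$

where $\lambda_i$ is the Lagrange multiplier. The solution of $u_{ik}$ and $c_k$ can be derived by

$$\frac{\partial L}{\partial u_{ik}} = 0, \qquad \frac{\partial L}{\partial c_k} = 0.$$

As a result, we can approximate the membership and the cluster centers iteratively with

$$u_{ik} = \frac{1}{\sum_{j=1}^{K} \left( \dfrac{\lVert x_i - c_k \rVert}{\lVert x_i - c_j \rVert} \right)^{2/(m-1)}}, \qquad c_k = \frac{\sum_{i=1}^{N} u_{ik}^{m} \, x_i}{\sum_{i=1}^{N} u_{ik}^{m}}. \tag{3}$$

The iteration will stop when $\lVert U^{(t+1)} - U^{(t)} \rVert < \varepsilon$, where $\varepsilon$ is a threshold for the termination and $t$ is the iteration step. Figure 2(a) shows the schematic diagram of the conventional FCM model. With FCM, each pixel (e.g., A) will be assigned a membership value for each of the two clusters, i.e., the chromosome cluster and the background cluster. By comparing the two resulting membership matrices, one per cluster, we can therefore assign all the pixels to different clusters.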
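The alternating updates of conventional FCM can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `fcm`, the random initialization, the tolerance `eps`, and the maximum-absolute-change stopping rule are all assumptions made for the example.

```python
import numpy as np


def fcm(x, n_clusters=2, m=2.0, eps=1e-5, max_iter=200, seed=0):
    """Conventional fuzzy C-means on a 1-D array of pixel intensities.

    Returns (U, c): memberships of shape (N, K) and centers of shape (K,).
    """
    x = np.asarray(x, dtype=float).ravel()
    rng = np.random.default_rng(seed)
    U = rng.random((x.size, n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(max_iter):
        Um = U ** m
        # Center update: membership-weighted mean of the intensities.
        c = (Um * x[:, None]).sum(axis=0) / Um.sum(axis=0)
        # Membership update from squared distances to each center.
        d2 = (x[:, None] - c[None, :]) ** 2 + 1e-12
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < eps
        U = U_new
        if converged:
            break
    return U, c


# Toy data: 500 "background" pixels near 0 and 500 "chromosome" pixels near 100.
rng = np.random.default_rng(1)
pixels = np.concatenate([rng.normal(0, 5, 500), rng.normal(100, 5, 500)])
U, c = fcm(pixels)
labels = U.argmax(axis=1)    # assign each pixel to its highest membership
```

Each pixel is labeled by its largest membership, which mirrors the assignment rule described in the text; which column corresponds to "chromosome" depends on the initialization.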
Fig. 2.
(a) The comparison of the conventional FCM model and (b) our newly proposed FCM model. In the conventional FCM model, the membership functions of pixel assigned to 2 classes in the ’th channel, denoted by (, ; ), are estimated individually. However, in the proposed model, the membership functions of neighboring pixels (i.e., B-E), denoted by are also accounted for, estimating in the ’th channel. TVs of membership with each of other neighboring points (e.g., ) are used as the constraint in the model to account for spatial neighborhood information.
2.2. Proposed FCM Clustering Algorithm Incorporating Both Spatial and Spectral Information
Despite the success of FCM clustering-based methods in imaging segmentation or classification, many of them overlook the spatial and/or spectral information among pixels in multidimensional images (e.g., M-FISH image). Both spatial and spectral information have been widely used for noise reduction,33 hyperspectral or color imaging classification and segmentation,32,34 and magnetic resonance imaging segmentation.30,35–37 Instead of analyzing each pixel individually, spatially or spectrally neighboring pixels can be used together to improve image analysis because these neighboring pixels will share similar characteristics, e.g., having similar intensities. Therefore, it is promising to incorporate this information into the FCM model.
Specifically, our newly proposed FCM clustering model is formulated as follows:
| (4) |
where is the tuning parameter. denotes the number of channels, which equals 6 for M-FISH images. The function is regularized by the row-wise sparsity. is the membership matrix with . is the membership value indicating the probability that the ’th pixel in the ’th channel image is assigned to the ’th cluster, and each column indicates the membership values for all pixels in the ’th channel image to be assigned to the ’th cluster.
The linear operator in Eq. (4) consists of difference operators that calculate the membership difference of neighboring pixels horizontally, vertically, and diagonally. Let denote the matrix defined by and choose , where is the identity matrix and the operator denotes the Kronecker tensor product of matrices and . and indicate the number of rows and columns, respectively, of the image matrix in each channel with . is the difference operator to compute the difference of membership between a pixel and its neighboring pixels horizontally in the image. Similarly, let and be the difference operators in the vertical and diagonal directions, respectively. We can combine three difference operators as , where each row in represents interchannel membership values. These difference operators are actually used to define row-wise anisotropic TV constraints.38,39 The use of the TV can remove unwanted variations among neighboring pixels while preserving important details such as edges.40 To incorporate spectral information, we enforce a row-wise sparse penalty on matrix . Row-wise sparse penalization on a matrix $A$ is defined by $\lVert A \rVert_{p,q} = \left( \sum_{i} \left( \sum_{j} \lvert a_{ij} \rvert^{p} \right)^{q/p} \right)^{1/q}$, where $A$ is an $n \times d$ matrix and $p$ and $q$ are parameters for defining the $\ell_{p,q}$ norm of matrix $A$. In the ssFCM model, we use the $\ell_{2,1}$ norm ($p = 2$, $q = 1$), which is the sum of the Euclidean norms of the rows of the matrix. The $\ell_{2,1}$ norm is nondifferentiable on each row of the matrix only when all elements in the row are zeros, resulting in a row-sparse solution matrix (the element values in the majority of rows are zeros), as introduced in Refs. 41 and 42. Combined with TV to calculate the membership difference of neighboring pixels in each channel, the additional row-sparse penalty is imposed on the membership difference across all channels, enforcing that the membership differences among neighboring pixels are close to zero across spectral channels.
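To make the row-wise TV penalty concrete, the sketch below builds horizontal, vertical, and diagonal first-difference operators with Kronecker products and evaluates the sum of row-wise Euclidean norms of the differenced membership matrix. It is only an illustration under stated assumptions: the names `shift_pair`, `difference_operator`, and `l21_norm` are hypothetical, the image is vectorized column by column, and boundary pixels without a neighbor are simply dropped, which may differ from the operator used in the paper.

```python
import numpy as np


def shift_pair(n):
    """Return (S_next, S_here): (n-1) x n selectors of v[1:] and v[:-1]."""
    return np.eye(n, k=1)[:-1], np.eye(n)[:-1]


def difference_operator(p, q):
    """Stack horizontal, vertical, and diagonal first differences for a
    p x q image vectorized column by column (order='F')."""
    Sp_next, Sp_here = shift_pair(p)
    Sq_next, Sq_here = shift_pair(q)
    D_h = np.kron(Sq_next - Sq_here, np.eye(p))                  # X[i, j+1] - X[i, j]
    D_v = np.kron(np.eye(q), Sp_next - Sp_here)                  # X[i+1, j] - X[i, j]
    D_d = np.kron(Sq_next, Sp_next) - np.kron(Sq_here, Sp_here)  # X[i+1, j+1] - X[i, j]
    return np.vstack([D_h, D_v, D_d])


def l21_norm(A):
    """Sum of the Euclidean norms of the rows of A."""
    return np.linalg.norm(A, axis=1).sum()


# Toy membership maps: a vertical edge shared by two channels.
p, q = 4, 5
step = np.zeros((p, q))
step[:, 3:] = 1.0
U = np.column_stack([step.ravel(order="F"), step.ravel(order="F")])
D = difference_operator(p, q)
penalty = l21_norm(D @ U)    # only rows that cross the edge contribute
```

Because each row of `D @ U` collects the same neighbor difference across all channels, penalizing row norms favors edges that appear at the same pixel locations in every channel, which is the intuition behind combining TV with row sparsity.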
The proposed ssFCM model is illustrated in Fig. 2(b). The input is a multichannel image matrix with each image collected from one channel, e.g., six images in the M-FISH collected from six channels, respectively. For each pixel in one-channel image (e.g., pixel in the first channel), we will explore the spatial neighboring pixels of in the same channel (e.g., pixels ,) and the spectrally neighboring pixels from the other five channels (e.g., , ). Each pixel in each channel image is given a membership value to the chromosome (i.e., ) and to the background cluster (i.e., ), respectively, where indicates the pixel in each image (e.g., ) and is the index of the channel (i.e., ). Totally, we have 12 membership matrices to be estimated, which is combined as , where is the number of pixels in each channel image. As discussed before, we enforce a TV combined with a row-wise sparse constraint on the membership matrix to utilize both spatial and spectral information. This way we will ensure that (1) the neighboring pixels will have similar membership functions and (2) the clustering patterns of neighboring pixels (e.g., A,…,E) will be similar across different channels. The constraints will have the potential to reduce noise using both spatial and spectral information of each pixel.
The alternating direction method of multipliers (ADMM)43 can be used to solve Eq. (4). Specifically, we solve the following optimization problem:
| (5) |
which can be solved with the following cost function:
| (6) |
where is the Lagrange multiplier and is the parameter for controlling the row sparsity in matrix , i.e., the proportion of nonzero rows. The fuzziness is set to 2. , where is an identity matrix and is an operator to reformulate a matrix to a vector by columns; is the augmented Lagrangian parameter to approximate , by ; and is the scaled dual variable. The parameter is updated by iterations, as suggested in Ref. 43, to improve the convergence and make the solution be less dependent on the initial value. The solution of Eq. (6) is derived by
| (7) |
| (8) |
where , , . In Eq. (8), will be solved by the constraint , as in Eq. (3). Table 1 shows the algorithm of solving Eq. (6) via ADMM. It uses a decomposition-coordination procedure, in which the solutions to small local subproblems are coordinated to find the solution to a large global problem. In the algorithm, at each iteration, the parameters , , , and are updated sequentially with the other parameters fixed. The iteration continues until a stopping criterion is satisfied.
Table 1.
The application of the ADMM algorithm to the proposed FCM clustering model.
| Algorithm: the solution of the proposed FCM clustering via ADMM algorithm |
|---|
| Input: the original image , linear operator , , |
| Output: Matrix of membership functions, matrix of cluster centers |
| Initialization: , , |
| While stopping criteria false do |
| Update center : |
| Update : |
| Update : |
| Update : |
| End while |
| Return: and |
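In ADMM schemes that split off a row-sparse auxiliary variable, the subproblem for that variable typically has a closed-form solution known as row-wise soft thresholding (group shrinkage). The sketch below shows that standard operator only; it is not claimed to be the exact update in Table 1, and the names `row_soft_threshold` and `tau` are hypothetical (in a scaled ADMM form, `tau` would usually be the sparsity weight divided by the augmented Lagrangian parameter).

```python
import numpy as np


def row_soft_threshold(V, tau):
    """Proximal operator of tau * (sum of row Euclidean norms).

    Rows with norm <= tau become exactly zero; longer rows are shrunk
    toward zero by tau while keeping their direction.
    """
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * V


V = np.array([[3.0, 4.0],     # norm 5: shrunk to norm 4
              [0.1, 0.1],     # norm below tau: zeroed
              [0.0, 0.0]])    # already zero
Z = row_soft_threshold(V, tau=1.0)
```

Rows that are zeroed correspond to neighbor differences that are suppressed in all channels at once, which is how the row-sparse penalty couples the spectral channels.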
3. Application of the Proposed FCM to both Simulated and Real M-FISH Images
3.1. Results on Synthetic Image Sets
Two three-channel synthetic image sets with , which have the intensity values of 0 and 100, were generated. The first image set (Fig. 3) consists of four squares belonging to three different classes: the first class contains the pixels with an intensity value of 100 in the first channel and 0 in the second channel; the second class consists of pixels with an intensity of 0 in the first channel and 100 in the second channel; the remaining pixels (background pixels) are clustered as the third class with an intensity of 0 in both channels. The third channel (DAPI channel) has the intensity value of 100 for the pixels from the first two classes and 0 for background pixels. The ground truth image is used to evaluate the performance. Gaussian noise with standard deviation $\sigma$ was added to the three channels, as shown in Figs. 3(a)–3(c). The FCM, IAFCM, our proposed ssFCM-DAPI only (i.e., only the DAPI channel is segmented by the ssFCM model), and ssFCM (i.e., all channels are used for segmentation) models were then applied, respectively. All programs were run on a Linux system-based CCS cluster at Tulane,44 and the average computational time was around 200 s. The segmentation results of the four models are shown in Figs. 3(d)–3(g), respectively. In addition, they were further assessed by the correct ratio (CR) and false ratio (FR) metrics used in our previous work.9
Fig. 3.
The segmentation of the synthetic image set consisting of four squares with Gaussian noise () added. (a)–(c) show the images from the first, second, and third channels, respectively. (d)–(g) are the segmentation results by FCM model, IAFCM, ssFCM-DAPI only model, and ssFCM model, respectively.
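A three-channel synthetic set of the kind described above can be generated as follows. This is a sketch under explicit assumptions: the image size (64 by 64), the square positions, the split of two squares per class, and the default noise level are all hypothetical choices for illustration, and `make_synthetic_set` is not a function from the paper.

```python
import numpy as np


def make_synthetic_set(size=64, sigma=30.0, seed=0):
    """Return (noisy, labels): a (3, size, size) image stack and a label map.

    Labels: 0 = background, 1 = class visible in channel 1,
    2 = class visible in channel 2. Channel 3 plays the role of DAPI.
    """
    labels = np.zeros((size, size), dtype=int)
    s = size // 4                       # square side (assumed layout)
    a, b = s // 2, size - s - s // 2    # top-left corners of the squares
    labels[a:a + s, a:a + s] = 1
    labels[a:a + s, b:b + s] = 2
    labels[b:b + s, a:a + s] = 2
    labels[b:b + s, b:b + s] = 1
    clean = np.stack([
        100.0 * (labels == 1),          # channel 1
        100.0 * (labels == 2),          # channel 2
        100.0 * (labels > 0),           # channel 3: all objects visible
    ])
    rng = np.random.default_rng(seed)
    noisy = clean + rng.normal(0.0, sigma, clean.shape)
    return noisy, labels


noisy, labels = make_synthetic_set(sigma=30.0)
```

Sweeping `sigma` over several values reproduces the kind of noise-level comparison reported in the tables, although the specific levels used in the paper are not assumed here.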
The proposed ssFCM model shows a higher correct ratio (CR) with a lower false ratio (FR), whereas the IAFCM model shows a lower CR with a higher FR compared with the other methods. The comparison of the four methods in the segmentation of synthetic images with different noise levels is listed in Table 2. The results demonstrate that the ssFCM model is more accurate and more robust to noise than the other three models.
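For readers who want to score a binary segmentation mask, ratios of this kind might be computed as in the sketch below. The definitions are assumptions, not taken from the source: here CR is the fraction of true chromosome pixels labeled as chromosome, and FR is the number of background pixels labeled as chromosome divided by the number of true chromosome pixels. The function name `correct_and_false_ratio` is hypothetical.

```python
import numpy as np


def correct_and_false_ratio(pred, truth):
    """Return (CR, FR) for boolean masks where True means chromosome.

    Assumed definitions: both ratios are normalized by the number of
    true chromosome pixels.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    n_chrom = truth.sum()
    cr = (pred & truth).sum() / n_chrom     # correctly detected chromosome pixels
    fr = (pred & ~truth).sum() / n_chrom    # background pixels marked as chromosome
    return float(cr), float(fr)


truth = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
pred = np.array([1, 1, 1, 0, 1, 0, 0, 0], dtype=bool)
cr, fr = correct_and_false_ratio(pred, truth)
print(cr, fr)  # prints: 0.75 0.25
```

If the original work normalizes FR by the number of background pixels instead, only the denominator of `fr` would change.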
Table 2.
A comparison of the segmentation in terms of CR/FR using FCM, IAFCM, ssFCM-DAPI only, and the ssFCM models on the first synthetic image set with three different noise levels.
| Noise levels | FCM (CR/FR) | IAFCM (CR/FR) | ssFCM-DAPI only (CR/FR) | ssFCM (CR/FR) |
|---|---|---|---|---|
| | 0.9933/0.006 | 0.9711/0.2913 | 1/0.0002 | 1/0 |
| | 0.8975/0.1075 | 0.7057/0.4351 | 1/0.0008 | 1/0 |
| | 0.7927/0.2047 | 0.6335/0.4652 | 0.9950/0.0088 | 0.9974/0.0026 |
The second synthetic image set includes two concentric annuli and three different classes: small annulus, big annulus, and background. The first two channels contain the small and the big annulus, respectively, while the third channel contains both of them. Figures 4(a)–4(c) show the second synthetic image set with Gaussian noise. The four models (i.e., FCM, IAFCM, ssFCM-DAPI only, and ssFCM) were then applied to the noisy image set, and the segmentation results are presented in Figs. 4(d)–4(g) for comparison with the ground truth image. In Fig. 4, the segmentation of the proposed ssFCM is better than that of the other three models in terms of both CR and FR. Table 3 summarizes the segmentation results of the FCM, IAFCM, ssFCM-DAPI only, and ssFCM models in terms of CR and FR. The results show that the ssFCM model performs better than the others, with higher CR and lower FR at all noise levels.
Fig. 4.
The segmentation of the synthetic image set consisting of two annuluses with Gaussian noise (). (a)–(c) The images from the first, second, and third channels, respectively. (d)–(g) The segmentation results using FCM model, IAFCM, ssFCM-DAPI only model, and ssFCM model, respectively.
Table 3.
A comparison of the segmentation of FCM, IAFCM, ssFCM-DAPI only, and the ssFCM models on the second synthetic image set consisting of two annuluses with three different noise levels.
| Noise levels | FCM (CR/FR) | IAFCM (CR/FR) | ssFCM-DAPI only (CR/FR) | ssFCM (CR/FR) |
|---|---|---|---|---|
| | 0.994/0.0047 | 0.995/0.0076 | 1/0.0002 | 1/0 |
| | 0.9239/0.1464 | 0.96147/0.3028 | 0.9944/0.0047 | 0.9974/0.0033 |
| | 0.8512/0.2777 | 0.8341/0.4088 | 0.9551/0.0939 | 0.9896/0.0324 |
3.2. Results with Hybrid Simulation Images
First, two hybrid simulation datasets were generated from real images. The simulated image sets correspond to five different classes and four different channels. Each of the first three channels contains a subset of the chromosomes, while the fourth channel is the DAPI image showing all chromosomes. The ground truth image is also simulated, with each pixel in the chromosome labeled correctly, and is used to verify the accuracy of image segmentation. For simulated dataset I, the background is labeled 0 and the chromosomal region is labeled 255; for simulated dataset II, the background is labeled 0 and the intensity level of the chromosomal region is the same as in the real chromosomal image. To test the robustness of different models, Gaussian noise with 30 different levels ($\sigma$ from 0 to 150) is added. Figure 5 shows the simulated image datasets mixed with Gaussian noise and the segmentation results of different FCM models. Figures 5(a)–5(d) are the first, second, third, and fourth channels of simulated image dataset I, respectively. Figures 5(i)–5(l) are the first, second, third, and fourth channels of simulated image dataset II, respectively. Figures 5(e)–5(h) are the segmentation results of applying the FCM, IAFCM, ssFCM-DAPI only, and ssFCM models to simulated image dataset I, respectively, and Figs. 5(m)–5(p) are the segmentation results of applying the same four models to simulated image dataset II, respectively. The results show that the proposed ssFCM model gives the highest segmentation accuracy, which demonstrates better robustness to the noise than the FCM and IAFCM models.
Fig. 5.
The simulated image datasets I and II and segmentation results of different FCM models. (a)–(d) The images from the first to the fourth channel in simulated image set I, respectively. (e)–(h) The segmentation results of simulated image dataset I using FCM model, IAFCM, ssFCM-DAPI only model, and ssFCM model, respectively. (i)–(l) The images from the first channel to the fourth channel in simulated image set II, respectively. (m)–(p) The segmentation results of simulated image dataset II using FCM model, IAFCM, ssFCM-DAPI only model, and ssFCM model, respectively.
Figure 6 compares the segmentation results of the four FCM-based models at different levels of noise, in terms of CR and FR. The noise follows the normal distribution with 30 levels ($\sigma$ varying from 0 to 150). In Fig. 6, FCM and IAFCM were tested only on the DAPI channel as we did in Ref. 6. To evaluate the benefit of using spectral information, we also applied our FCM method to all channels (ssFCM). The results show that our ssFCM models (ssFCM-DAPI only and ssFCM) largely outperform the other two FCM methods with higher CR and much lower FR, especially at higher noise levels, which indicates that our methods are more robust than the other two FCM methods. At low noise levels, ssFCM and ssFCM-DAPI only show no significant difference. However, as the noise level grows, ssFCM has a higher CR than ssFCM-DAPI only, while the FR remains similar in both methods, which shows the benefit of incorporating multichannel information to reduce the noise effect.
Fig. 6.
The comparisons of (a) CR and (b) FR using different models on the simulated image set I by changing the noise level from 0 to 30 (corresponding $\sigma$ is from 0 to 150).
In simulation set II, the chromosomal intensities are taken from the real chromosomal images. Figures 5(i)–5(l) show the four channels of simulated image dataset II, and Figs. 5(m)–5(p) show the segmentation results of applying the FCM, IAFCM, ssFCM-DAPI only, and ssFCM models to this dataset, respectively. The same levels of Gaussian noise as in set I ($\sigma$ from 0 to 150) are added to test model robustness, and the resulting CR and FR are shown in Fig. 7. It can be seen that the results are similar to those in set I, and our ssFCM model generally gives the highest CR and lowest FR compared with the other models. Therefore, both simulations indicate that the ssFCM method can take advantage of both spatial information to improve segmentation accuracy and multichannel spectral information to reduce the noise effect.
Fig. 7.
The comparisons of (a) CR and (b) FR using different segmentation models on the simulated image set II by changing the noise level from 0 to 30 (corresponding $\sigma$ is from 0 to 150).
3.3. Results with Real M-FISH Images
Besides the simulated datasets, we also applied the proposed FCM model to real M-FISH images with six different channels from Advanced Digital Imaging Research (League City, Texas).45 The images of the first five channels were labeled with different fluorophores, and the sixth channel was the DAPI image. The ground truth image, labeled by an experienced cytogeneticist, was also provided. The results of testing on real M-FISH images are shown in Fig. 8. The CR of the proposed ssFCM models is the highest at almost all noise levels ($\sigma$ from 0 to 150). As the noise level increases, the proposed ssFCM models deliver a higher CR and a lower FR than the other models. Therefore, the newly proposed FCM model is more robust to the noise than the FCM and IAFCM models.
Fig. 8.
The comparison results of (a) CR and (b) FR using different segmentation models on the real M-FISH images by changing the noise level from 0 to 30 (corresponding $\sigma$ is from 0 to 150).
All models have been applied to the images of whole cells for further comparison. Figure 9 shows an example of an M-FISH dataset and the segmentation results as well as the ground truth. It can be seen that the conventional FCM model leaves more noise around the edges of chromosomes, and the IAFCM model leaves even more noise on the background, than our ssFCM-based models (ssFCM-DAPI only and ssFCM). Furthermore, the differences between the ground truth and the results of the four models are shown in Fig. 10, where the blue, red, and cyan indicate the pixels correctly segmented into chromosomes and background, background pixels wrongly segmented into chromosomes, and chromosome pixels wrongly segmented into background, respectively. In addition, Fig. 11 shows a quantitative analysis of the segmentation results in terms of CR and FR with 15 levels of noise ($\sigma$ from 0 to 150). It is worth noting that the CRs of the two ssFCM-based models are generally comparable to each other and higher than those of the FCM and IAFCM models. The FRs of the ssFCM-based models are also lower than those of the other two models under high noise levels. Overall, the results indicate that the proposed ssFCM models have improved accuracy and greater robustness than the FCM and IAFCM models because both spatial and spectral information is incorporated into the model.
Fig. 9.
An example of M-FISH dataset and the comparison of segmentation results by different FCM methods. (a)–(f) Six channels of M-FISH dataset, (g) the ground truth used for model evaluation, (h)–(k) the segmentation results of FCM model, IAFCM, ssFCM-DAPI only model, and ssFCM model, respectively.
Fig. 10.
(a) The difference between the ground truth and the segmentation results by FCM, (b) IAFCM, (c) ssFCM-DAPI only model, and (d) ssFCM models, respectively, where the blue, red, and cyan indicate the pixels correctly segmented into chromosomes and background, background pixels wrongly segmented into chromosomes, and chromosome pixels wrongly segmented into background, respectively.
Fig. 11.
The segmentation results [mean values of (a) CR and (b) FR] on six cells with 15 noise levels (corresponding σ is from 0 to 150) using different FCM methods.
4. Conclusion and Discussion
In this paper, we propose an improved fuzzy C-means clustering algorithm for chromosome segmentation, namely ssFCM, by taking advantage of both the multichannel spectral information of M-FISH imaging and the spatial information within neighboring pixels in each channel. The model is designed by imposing both TV and row-wise sparse constraints on the membership matrix in the FCM model. The model is then applied to the segmentation of chromosome images from the M-FISH image set. To evaluate the performance, we compare the proposed ssFCM models with the conventional FCM and the IAFCM on synthetic, hybrid, and real M-FISH images, respectively. Table 4 summarizes all tested images and the quantitative performance of the different models. The segmentation results show that (1) the proposed FCM model outperforms the other two models (i.e., FCM and IAFCM) in terms of both CR and FR on the synthetic and hybrid image sets and (2) the proposed model is generally better than the other two models on real M-FISH images, e.g., having lower FR and being more robust to the noise. Nevertheless, further work can be conducted by modeling the noise with a Gaussian mixture distribution. Also, more datasets should be tested to show the reproducibility of the ssFCM model. More accurate segmentation of M-FISH images can improve the subsequent classification of chromosomes and the detection of chromosomal abnormalities, which will, in turn, translate into improved diagnosis of genetic diseases and cancers.
Table 4.
The summary of testing on hybrid simulation and real M-FISH images and quantitative performance using different models.
| Image sets | Noise level | Method | CR | FR |
|---|---|---|---|---|
| Simulation set I (four images) | 30 | FCM | | |
| | | IAFCM | | |
| | | ssFCM-DAPI only | | |
| | | ssFCM | | |
| Simulation set II (four images) | 30 | FCM | | |
| | | IAFCM | | |
| | | ssFCM-DAPI only | | |
| | | ssFCM | | |
| Real M-FISH images (36 images) | 15 | FCM | | |
| | | IAFCM | | |
| | | ssFCM-DAPI only | | |
| | | ssFCM | | |
Acknowledgments
This work has been supported by NIH R01 GM109068, R01 MH104680 and R01 MH107354, and NSF EPSCoR program (1539067).
Biographies
Jingyao Li received her BS degree in measurement and control technology and instrumentation in 2006 from Hebei University of Technology, China, her MS degree in biomedical engineering in 2009 from Tianjin University, China, and her PhD in biomedical engineering from Tulane University, United States, in 2015. Her current research interests include medical image analysis, sparse representation modeling, and machine learning.
Dongdong Lin received his BS degree in biomedical engineering from Chongqing University in China in 2007 and his PhD in biomedical engineering from Tulane University, in 2015. He is currently a postdoctoral researcher in the Mind Research Network Institute. His primary research is on mathematical models for integrative analysis of high-dimensional biological data and multimodal medical imaging to deepen the insights of the pathophysiology underlying complex disorders.
Yu-Ping Wang received his BS degree in applied mathematics from Tianjin University, China, in 1990, and his MS degree in computational mathematics and PhD in communications and electronic systems from Xi’an Jiaotong University, China, in 1993 and 1996, respectively. He is currently a professor in the Department of Biomedical Engineering and Global Biostatistics and Data Sciences at Tulane University. His research interests include signal processing and machine learning with applications to biomedical imaging and bioinformatics.
Disclosures
The authors declare that they have no competing interests.