Author manuscript; available in PMC: 2010 Dec 8.
Published in final edited form as: Proc IEEE Southwest Symp Image Anal Interpret. 2010 May 23;2010:93–96. doi: 10.1109/SSIAI.2010.5483911

A Clustering Algorithm for Liver Lesion Segmentation of Diffusion-Weighted MR Images

Abhinav K. Jha, Jeffrey J. Rodríguez, Renu M. Stephen, Alison T. Stopeck
PMCID: PMC2998770  NIHMSID: NIHMS217658  PMID: 21151837

Abstract

In diffusion-weighted magnetic resonance imaging, accurate segmentation of liver lesions in the diffusion-weighted images is required for computation of the apparent diffusion coefficient (ADC) of the lesion, a parameter that serves as an indicator of lesion response to therapy. However, the segmentation problem is challenging due to low SNR, fuzzy boundaries, and speckle and motion artifacts. We propose a clustering algorithm that incorporates spatial information and a geometric constraint to address these challenges. We show that our algorithm provides improved accuracy compared to existing segmentation algorithms.

I. Introduction

Diffusion-weighted magnetic resonance imaging (DWMRI) is sensitive to the microscopic random motion of molecules in a fluid [1] and therefore measures the mobility of water in tissues. This mobility is parameterized by the apparent diffusion coefficient (ADC), a parameter that has been shown to be an indicator of lesion response to therapy, both clinically and pre-clinically [2]. However, accurate computation of the ADC value requires that the segmentation of the lesion in the diffusion-weighted (DW) images be precise.

Manual segmentation of these lesions is difficult, time-consuming and error-prone. This is due to various factors, such as irregular and fuzzy lesion boundaries [3], low signal-to-noise ratio (SNR) in DW images [4], wide intensity and morphological variation of the lesions, and the large number of images on which the diffusion analyst must perform lesion segmentation to determine an ADC value. Moreover, ADCs estimated using manual segmentations are not reproducible, since the manual segmentation differs from person to person, and even from one time to another for the same person. These factors motivate the design of an automated segmentation algorithm for lesion segmentation. However, for the reasons mentioned above, the design of an automated or semi-automated segmentation algorithm for this task is challenging.

In [3], the authors propose a snakes-based algorithm for this segmentation task. However, the algorithm is not adaptable to different magnetic-diffusion-gradient values (b values), and it fails to segment the lesion when the contrast is low. In this paper, we present a statistical approach to this problem. In DWMRI, the images are corrupted by Rician noise [5], [6], which, in high signal-to-noise ratio (SNR) regions (SNR > 2 dB), can be approximated as Gaussian noise [6]. In a DW image containing a lesion, the lesion and the regions surrounding it have high SNR. Also, except for noise, each of these regions has an almost constant-valued intensity profile. Therefore, we model the image as a finite Gaussian mixture (FGM). However, the FGM model does not take spatial information into account, because all the pixel intensities are treated as independent samples drawn from a population. To account for spatial information, we use the approach suggested by Geman et al. [7], in which each region in the image is modeled as a Markov random field (MRF) and characterized by a Gibbs distribution. Therefore, each pixel has a certain a priori probability of belonging to a particular region, depending on the regions to which its neighbors belong. By combining this prior with the probability of a pixel's intensity under each distribution in the FGM model using the maximum a posteriori (MAP) criterion, we can determine the region to which a given pixel belongs.

Similar approaches based on the above concept have previously been used for image segmentation [8], [9]. However, our segmentation task presents some additional issues. First, the number of regions present in an image slice is variable and not known in advance. Second, the lesion may be surrounded by regions of very similar intensity, which reduces the contrast between the lesion and the background. Moreover, the algorithm must be computationally inexpensive. With these constraints, we designed an algorithm that models the image as an FGM and the region process as an MRF, uses a histogram-based approach to handle the variable number of classes, and applies an additional geometric constraint, motivated by the work of Kupinski et al. [10], to increase the contrast between the lesion and the background. In the results section, we evaluate the accuracy of our algorithm and compare it with some of the existing segmentation techniques in the literature.

II. Methods

A. Image Acquisition

In the current study, DWMRI is being used to monitor the therapeutic response in breast cancer patients with metastatic liver lesions. Diffusion-weighted single-shot echo-planar imaging (DW-SSEPI) is performed at b values of 0, 150, 300 and 450 s/mm2. Imaging parameters for the DW-SSEPI images are as follows: TE = 91.1 ms, TR = 6 s, 250 kHz receiver bandwidth and 6 mm slice thickness. Radial DWMRI image pairs (b = 0 and 150, b = 0 and 300, and b = 0 and 450 s/mm2, respectively) are collected, where each pair is acquired within a single 24 s breath-hold. Each image is 256 × 256 voxels, with each voxel measuring 1 mm × 1 mm × 6 mm. The highest gray level seen in the images is around 800. Each patient is imaged at baseline and on days 4, 11 and 39 following the commencement of cytotoxic therapy.

B. Algorithm Initialization

In radial DW images of the patient, apart from the lesion, there are other organs such as the liver, kidneys, arteries and the spinal cord. Our objective is to segment only the lesion in the image and not the other regions. Therefore, the diffusion analyst demarcates a rectangular region of interest (bounding box) that contains the lesion, and marks a particular pixel in this box as the seed pixel. The criteria to be satisfied while demarcating the rectangular region are that the lesion should be approximately at its center, the lesion should not touch its boundary, and the rectangle's dimensions should be approximately proportional to those of the lesion. The bounding box and the seed pixel need to be marked on only one image for a particular lesion in a patient; the algorithm can then process all the lesion slices at that b value, and at all other b values. Restricting processing to this region of interest has the added advantage that the algorithm does not spend computational time segmenting regions of the image that are of no interest to the analyst. From this point on, the term "image" refers to the region of interest only, unless otherwise stated.

C. Clustering in FGM Model

Let us consider a two-dimensional DW MRI slice in which a rectangular region of interest containing the lesion has been demarcated. We consider this region of interest as our initial input image for all further operations, and therefore the coordinate system is defined with respect to this region of interest. Let i denote a pixel's spatial location and let y_i denote the pixel intensity at that location. Our objective is to assign to each pixel in the image a class label x_i, where x_i takes values in the set {1, 2, ..., L}, L being the number of classes. Let x be the random vector that denotes the class labels of all the pixels, and let y denote the vector of intensities of all the pixels. Since the image pixel intensities can be characterized by a mixture of Gaussian distributions, it is assumed that the pixel intensity y_i, given the class label x_i = l, follows a Gaussian distribution with parameters θ_l = (μ_l, σ_l). Therefore,

p(y_i \mid x_i = l) = \frac{1}{\sqrt{2\pi\sigma_l^2}} \exp\left(-\frac{(y_i - \mu_l)^2}{2\sigma_l^2}\right). (1)

Let there be L classes in the selected region of interest. In the first step, we determine the maximum and minimum pixel intensities in the image, and uniformly divide this range of intensity values into L classes. Each pixel is classified into one of the L classes based on its intensity. We then compute the initial mean intensity and the initial intensity variance of each class, which we denote by μ_l^0 and (σ_l^0)^2, respectively, for the lth class. The mean and variance are determined by computing the sample mean and sample variance of the intensities initially assigned to each class.
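As a concrete illustration of this initialization step, the following Python/NumPy sketch bins the intensity range uniformly and computes the initial class statistics; the function and variable names are ours, not taken from the paper:

import numpy as np

def initialize_classes(image, num_classes):
    """Uniformly bin the intensity range into num_classes classes and
    return the initial label map, class means, and class variances."""
    y = image.astype(float)
    lo, hi = y.min(), y.max()
    # Equal-width intensity bins spanning [min, max].
    edges = np.linspace(lo, hi, num_classes + 1)
    labels = np.clip(np.digitize(y, edges[1:-1]), 0, num_classes - 1)
    means = np.zeros(num_classes)
    variances = np.ones(num_classes)
    for l in range(num_classes):
        vals = y[labels == l]
        if vals.size > 0:
            means[l] = vals.mean()
            variances[l] = max(vals.var(), 1e-6)  # guard against zero variance
    return labels, means, variances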

Using (1) and the initial mean and variance values, we could reclassify each pixel as belonging to the class under which its intensity has the maximum probability, recompute the mean and variance of each class, and repeat this process for a certain number of iterations to obtain a segmentation. However, the issue with this approach is that each pixel is classified based solely on its intensity. In reality, there may be pixels whose intensities differ markedly from those of the other pixels in the same region. The intensities of those pixels might then not belong to the same Gaussian distribution as the intensities of most other pixels in that region, and such outlier pixels will be classified as part of a different region. Likewise, pixels with similar intensities lying in different regions of the image will be classified as belonging to the same region. These issues arise because this method is essentially a histogram-based intensity clustering method, not a region-based segmentation method. To improve the algorithm, it needs to take spatial information into account, which we do using an MRF model.
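For concreteness, the intensity-only clustering described in this paragraph (whose shortcomings motivate the MRF model below) could be sketched as follows, reusing the initialize_classes helper from the previous sketch; this is our illustration of the naive approach, not the paper's final algorithm:

import numpy as np

def intensity_only_clustering(image, num_classes, num_iters=10):
    """Iteratively reassign each pixel to the Gaussian class maximizing (1),
    then re-estimate the class means and variances.  Spatial information is
    ignored, which is the limitation discussed in the text."""
    y = image.astype(float)
    labels, means, variances = initialize_classes(image, num_classes)
    for _ in range(num_iters):
        # Log of (1) for every pixel and every class, shape (L, H, W).
        loglik = np.stack([
            -0.5 * np.log(2 * np.pi * variances[l])
            - (y - means[l]) ** 2 / (2 * variances[l])
            for l in range(num_classes)
        ])
        labels = np.argmax(loglik, axis=0)
        for l in range(num_classes):
            vals = y[labels == l]
            if vals.size > 0:
                means[l] = vals.mean()
                variances[l] = max(vals.var(), 1e-6)
    return labels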

D. MRF Model

To incorporate spatial information into the segmentation task, we observe the classes to which the neighbors of a particular pixel belong, and let this information influence the probability of that pixel belonging to each class. This is achieved by modeling the region process as an MRF, so that if N_i is the neighborhood of the pixel at location i, then

p(x_i \mid x_j, \text{ all } j \neq i) = p(x_i \mid x_j, \text{ all } j \in N_i). (2)

The neighborhood that we consider for the ith pixel consists of the eight nearest pixels around that pixel. By the Hammersley-Clifford theorem, the density of x is given by a Gibbs distribution. To define the Gibbs distribution, we first need to define a clique. A clique C is a set of points that are neighbors of each other, and V_C(x) denotes the potential of clique C. To define the clique potentials, we follow the same model as in [8]. The density of x therefore takes the form

p(x) = \frac{1}{Z} \exp\left(-\sum_{C} V_C(x)\right), (3)

where Z is a normalizing constant and the sum is computed over all possible cliques in the bounding box.
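A minimal sketch of the two-point clique potentials over the 8-pixel neighborhood, following the model of [8] in which a pairwise clique contributes −β when the two labels agree and +β when they differ; the parameter β and the function name are our own illustrative choices:

import numpy as np

# 8-connected neighborhood offsets used for the two-point cliques.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                    ( 0, -1),          ( 0, 1),
                    ( 1, -1), ( 1, 0), ( 1, 1)]

def clique_potential_sum(labels, i, j, candidate_label, beta=0.5):
    """Sum of two-point clique potentials involving pixel (i, j) when it is
    assigned candidate_label: -beta for each neighbor with the same label,
    +beta for each neighbor with a different label (as in [8])."""
    h, w = labels.shape
    total = 0.0
    for di, dj in NEIGHBOR_OFFSETS:
        ni, nj = i + di, j + dj
        if 0 <= ni < h and 0 <= nj < w:
            total += -beta if labels[ni, nj] == candidate_label else beta
    return total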

E. MAP Estimation of the Class Label

Using (1), we can compute p(y_i | x_i = l). Using (3), we can obtain the a priori probability that the class label vector takes a certain value x, i.e., p(x). To find the class label for each pixel, the ideal approach would be to estimate the vector of pixel labels, x̂, that has the maximum probability of occurrence given the intensities of all the pixels, y. Therefore, we must find the x̂ that maximizes p(x | y). To evaluate this quantity, we use Bayes' theorem. We also assume that, given x, the y_i values are conditionally independent. Therefore, x̂ can be derived to be [11]

\hat{x} = \arg\max_{x} \prod_i \left(\frac{1}{\sqrt{2\pi\sigma_{x_i}^2}}\right) \times \exp\left(-\sum_i \frac{(y_i - \mu_{x_i})^2}{2\sigma_{x_i}^2} - \sum_{C} V_C(x)\right). (4)

This method is referred to as finding the MAP estimate. However, equation (4) is computationally infeasible to solve. Therefore, we instead maximize the conditional density of each pixel's label x_i, given the intensity y_i of that pixel and the current values of the class labels x_j at all the neighboring points. This is the iterated conditional modes (ICM) approach proposed by Besag [12]. Mathematically, for every pixel, we want to find the value x_i = l̂ for which the probability p(x_i | y_i, x_j, j ∈ N_i) is maximum, i.e.,

\hat{l} = \arg\max_{x_i \in \{1,\ldots,L\}} p(x_i \mid y_i, x_j, j \in N_i) = \arg\max_{x_i \in \{1,\ldots,L\}} \frac{1}{\sqrt{2\pi\sigma_{x_i}^2}} \exp\left(-\frac{(y_i - \mu_{x_i})^2}{2\sigma_{x_i}^2} - \sum_{C} V_C(x)\right), (5)

where now the sum is computed over all cliques that include pixel i. This simplifies to finding the x_i = l̂ that satisfies the following equation [11]:

\hat{l} = \arg\min_{x_i \in \{1,\ldots,L\}} \left(\log(\sigma_{x_i}) + \frac{(y_i - \mu_{x_i})^2}{2\sigma_{x_i}^2} + \sum_{C : x_i \in C} V_C(x)\right). (6)

We then classify the pixel as part of class l̂ by updating its class label. We perform this operation for all the pixels, visiting them in raster-scan order, and then update the mean and variance of each region. These steps are repeated for a fixed number of iterations.
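The ICM sweep described above might be sketched as follows, reusing the initialize_classes and clique_potential_sum helpers from the earlier sketches; the number of iterations and β are illustrative values, not ones reported in the paper:

import numpy as np

def icm_segmentation(image, num_classes, num_iters=10, beta=0.5):
    """Iterated conditional modes: for each pixel (visited in raster-scan
    order), pick the label minimizing (6); then refresh class statistics."""
    y = image.astype(float)
    labels, means, variances = initialize_classes(image, num_classes)
    h, w = y.shape
    for _ in range(num_iters):
        for i in range(h):
            for j in range(w):
                costs = [
                    np.log(np.sqrt(variances[l]))
                    + (y[i, j] - means[l]) ** 2 / (2 * variances[l])
                    + clique_potential_sum(labels, i, j, l, beta)
                    for l in range(num_classes)
                ]
                labels[i, j] = int(np.argmin(costs))
        for l in range(num_classes):
            vals = y[labels == l]
            if vals.size > 0:
                means[l] = vals.mean()
                variances[l] = max(vals.var(), 1e-6)
    return labels, means, variances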

F. Region Growing

After all the pixels have been assigned their class labels, we need to determine the class label of the lesion region in order to perform region growing of the lesion. The class label of the seed pixel (marked by the diffusion analyst in the first step) could be taken as the label of the lesion region; however, due to noise, the seed pixel may be classified as part of some other class. Therefore, to determine the class label of the lesion, we consider a 3 × 3 box around the seed pixel and find the class to which the largest number of pixels in this box belong.

Having determined the class label of the lesion, we perform region growing starting from the pixels in the above-mentioned 3 × 3 box. The region growing stops when pixels of other classes are encountered, and the lesion is thereby segmented in the image. This segmentation approach suffers from two major issues, for which we suggest solutions next.
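A sketch of this seed-label selection and region-growing step is given below; the 4-connected growth used here is our assumption, since the paper does not state the connectivity of the region growing:

import numpy as np
from collections import deque

def segment_lesion(labels, seed):
    """Determine the lesion label by majority vote in the 3x3 box around the
    seed pixel, then grow the lesion region from that box; growth stops at
    pixels carrying a different class label."""
    h, w = labels.shape
    si, sj = seed
    box = labels[max(si - 1, 0):si + 2, max(sj - 1, 0):sj + 2]
    lesion_label = np.bincount(box.ravel()).argmax()  # majority vote

    mask = np.zeros_like(labels, dtype=bool)
    queue = deque((i, j)
                  for i in range(max(si - 1, 0), min(si + 2, h))
                  for j in range(max(sj - 1, 0), min(sj + 2, w))
                  if labels[i, j] == lesion_label)
    for i, j in queue:
        mask[i, j] = True
    while queue:
        i, j = queue.popleft()
        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # 4-connected growth
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w and not mask[ni, nj] \
                    and labels[ni, nj] == lesion_label:
                mask[ni, nj] = True
                queue.append((ni, nj))
    return mask, lesion_label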

G. Incorporating Geometric Constraint

The first issue with this approach is that the lesion is sometimes surrounded by many pixels of similar intensity. In that case, the above model does not lead to correct segmentation results, because neither the intensity criterion nor the neighborhood criterion clearly distinguishes the lesion as a separate region. We therefore use a geometric constraint to achieve accurate segmentation.

Lesions tend to be visually compact, which implies that their shape is typically convex. To incorporate this information into our algorithm, after running the segmentation we determine whether any pixels on the boundary of the rectangular region of interest belong to the same class as the lesion. When the rectangular region of interest was demarcated by the diffusion analyst, a requirement was that the lesion's boundary not touch the boundary of the rectangle. Assuming that the user properly selected the rectangular region, if the number of pixels on the rectangle's boundary that belong to the same class as the lesion exceeds a threshold, it implies that the lesion is surrounded by pixels of similar intensity and that the segmentation algorithm did not segment the lesion correctly. In that case, we multiply the region of interest by a two-dimensional Gaussian function centered at the seed pixel. We have found that a Gaussian function works well for these lesions, although other functions could be used for other lesions.

Multiplying the region of interest by this Gaussian function suppresses the intensities of pixels that are distant from the seed pixel. Since the lesion is brighter than the rest of the image, the contrast between the lesion and the rest of the image increases. The segmentation algorithm is then applied to this modified image. This process is repeated until the number of pixels on the boundary of the rectangular bounding box that belong to the same class as the lesion falls below a threshold.
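The following sketch illustrates this geometric constraint, reusing icm_segmentation and segment_lesion from the earlier sketches; the Gaussian widths, the boundary-pixel threshold, and the maximum number of rounds are illustrative assumptions, since the paper does not report these values:

import numpy as np

def apply_geometric_constraint(roi, labels, lesion_label, seed,
                               boundary_threshold=5, sigma_frac=0.5,
                               max_rounds=5):
    """If too many ROI boundary pixels share the lesion's label, multiply the
    ROI by a 2-D Gaussian centered at the seed (suppressing distant
    intensities) and re-segment; repeat until the boundary count drops below
    the threshold."""
    h, w = roi.shape
    si, sj = seed
    image = roi.astype(float)
    for _ in range(max_rounds):
        boundary = np.concatenate([labels[0, :], labels[-1, :],
                                   labels[:, 0], labels[:, -1]])
        if np.count_nonzero(boundary == lesion_label) < boundary_threshold:
            break
        ii, jj = np.mgrid[0:h, 0:w]
        sigma_i, sigma_j = sigma_frac * h, sigma_frac * w
        gauss = np.exp(-((ii - si) ** 2 / (2 * sigma_i ** 2)
                         + (jj - sj) ** 2 / (2 * sigma_j ** 2)))
        image = image * gauss
        labels, _, _ = icm_segmentation(image, num_classes=len(np.unique(labels)))
        _, lesion_label = segment_lesion(labels, seed)
    return labels, lesion_label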

H. Determining Number of Classes

Another issue with the approach described above is that it requires knowing, in advance, the number of classes present in the image. This is not known, because the lesion is not at a fixed location in the body. To solve this issue, the algorithm hypothesizes the number of classes over a range of values. For each hypothesis, the algorithm is executed and the parameters of the FGM model are computed. Using these parameters, an estimated pdf of the image is determined [11]. The normalized histogram of the image, which we denote by p_true(y), and the estimated pdf p_est(y) are then compared using the Kullback-Leibler (K-L) divergence [13]:

D_{KL}(p_{est} \| p_{true}) = \sum_{y} p_{est}(y) \log\frac{p_{est}(y)}{p_{true}(y)}. (7)

The number of classes in the image is taken to be the hypothesized value of L for which the K-L divergence is smallest. Using this value of L, the image is segmented.
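A sketch of this model-selection step is shown below; the candidate range of class counts and the use of per-class pixel fractions as the FGM mixing weights are our assumptions, and icm_segmentation is reused from the earlier sketch:

import numpy as np

def choose_num_classes(image, candidate_range=range(2, 7)):
    """Run the segmentation for each hypothesized number of classes, build
    the estimated FGM pdf from the fitted parameters, and keep the hypothesis
    with the smallest K-L divergence (7) to the normalized histogram."""
    y = image.astype(float).ravel()
    bins = np.arange(y.min(), y.max() + 2)               # unit-width bins
    hist, _ = np.histogram(y, bins=bins, density=True)   # p_true(y)
    centers = 0.5 * (bins[:-1] + bins[1:])
    best_L, best_div = None, np.inf
    for L in candidate_range:
        labels, means, variances = icm_segmentation(image, L)
        # Mixing weights approximated by the fraction of pixels per class.
        weights = np.array([(labels == l).mean() for l in range(L)])
        p_est = sum(weights[l] / np.sqrt(2 * np.pi * variances[l])
                    * np.exp(-(centers - means[l]) ** 2 / (2 * variances[l]))
                    for l in range(L))
        eps = 1e-12
        div = np.sum(p_est * np.log((p_est + eps) / (hist + eps)))
        if div < best_div:
            best_L, best_div = L, div
    return best_L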

III. Experiments and Results

We compare our algorithm with a snakes-based algorithm [3] (an algorithm designed specifically for liver lesion segmentation in DW images), an expectation-maximization (EM) based algorithm [9], and a maximum-likelihood estimation (MLE) based algorithm [10]. The MLE- and EM-based algorithms were modified to perform segmentation only on the bounding box rather than the entire image; this modification was made to improve their performance. A careful manual segmentation of the lesions was performed on our dataset, which consisted of images acquired at multiple b values and from different patients; this manual segmentation was taken as the ground truth for all subsequent analysis. The average lesion width was 20 mm. The MLE-based, EM-based, snakes-based and our clustering algorithms were then run on these images to segment the lesions. We validated our algorithm using conventional measures from the literature for evaluating automated segmentation algorithms in MR images, namely the similarity index [14], the Tanimoto coefficient [15], the true positive fraction (TPF) and the true negative fraction (TNF) [3] (a sketch of these metric computations follows Table I). Table I compares these measures for all the algorithms, computed over all the images. The similarity index, Tanimoto coefficient and TPF are highest for our algorithm, indicating that it is the most accurate. We also observe that the similarity index for our algorithm is greater than 0.7, which, according to [14], signifies that the algorithm is performing very well.

TABLE I.

PERFORMANCE OF THE SEGMENTATION ALGORITHMS

Method        Similarity index   Tanimoto coeff.   TPF      TNF

Our method    0.7353             0.6094            0.7548   0.9990
MLE           0.5380             0.4294            0.5591   0.9993
EM            0.5817             0.4373            0.6104   0.9986
Snakes        0.3068             0.2146            0.2856   0.9993
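The evaluation measures in Table I can be computed from the automated and manual lesion masks as follows; this sketch uses the standard definitions of the Dice similarity index, Tanimoto (Jaccard) coefficient, TPF and TNF, and assumes both masks contain at least one lesion pixel:

import numpy as np

def evaluation_metrics(auto_mask, manual_mask):
    """Similarity (Dice) index, Tanimoto (Jaccard) coefficient, true positive
    fraction, and true negative fraction of an automated segmentation
    against the manual ground truth."""
    a = auto_mask.astype(bool)
    m = manual_mask.astype(bool)
    tp = np.count_nonzero(a & m)
    fp = np.count_nonzero(a & ~m)
    fn = np.count_nonzero(~a & m)
    tn = np.count_nonzero(~a & ~m)
    similarity = 2 * tp / (2 * tp + fp + fn)
    tanimoto = tp / (tp + fp + fn)
    tpf = tp / (tp + fn)
    tnf = tn / (tn + fp)
    return similarity, tanimoto, tpf, tnf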

Fig. 1 shows the segmentation results using our algorithm on a lesion that is surrounded by many pixels of similar intensity. The images have been cropped and zoomed, so that the segmentation result is more clearly visible.

Fig. 1. Results using our segmentation algorithm for a lesion that is small and surrounded by many pixels of similar intensity. Panels (a), (b) and (c) show the lesion slices acquired at b values of 0, 150 and 450 s/mm2, respectively.

IV. Conclusions

Accurate segmentation of lesions in liver images is a challenging task, both for a human analyst and for any automated segmentation algorithm. We propose a solution to this problem using a statistical approach, in which we model the image intensities with an FGM, use the Markov property to take spatial information into account, use a histogram-based measure to determine the number of classes present in the image, and apply a geometric constraint to achieve accurate segmentation. The results show that our algorithm performs better than the existing automated segmentation algorithms in the literature for this particular problem. Moreover, the similarity index being greater than 0.7 indicates that the algorithm's output agrees very well with manual segmentations. We also note that the images were not pre-registered across different b values before segmentation; a pre-registration step could further improve the algorithm's performance and utility. The algorithm is computationally inexpensive and has been applied to a large number of images containing lesions. We are currently evaluating our algorithm in terms of how well it aids the task of ADC estimation and how reproducible its segmentation results are.

References

[1] Bammer R. Basic principles of diffusion-weighted imaging. Eur. J. Radiol. 2003;45:169–184. doi: 10.1016/s0720-048x(02)00303-0.
[2] Theilmann RJ, et al. Changes in water mobility measured by diffusion MRI predict response of metastatic breast cancer to chemotherapy. Neoplasia. 2004;6(6):831–837. doi: 10.1593/neo.03343.
[3] Krishnamurthy C. Automated lesion segmentation and tracking in echo-planar diffusion-weighted liver MRI. Master's thesis, Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ, 2004.
[4] Heiland S, Dietrich O, Sartor K. Diffusion-weighted imaging of the brain: comparison of stimulated- and spin-echo echo-planar sequences. Neuroradiology. 2001;43(6):442–447. doi: 10.1007/s002340000537.
[5] Macovski A. Noise in MRI. Magn. Reson. Med. 1996;36(3):494–497. doi: 10.1002/mrm.1910360327.
[6] Gudbjartsson H, Patz S. The Rician distribution of noisy MRI data. Magn. Reson. Med. 1995;34:910–914. doi: 10.1002/mrm.1910340618.
[7] Geman S, Geman D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 1984;6(6):721–741. doi: 10.1109/tpami.1984.4767596.
[8] Pappas T. An adaptive clustering algorithm for image segmentation. IEEE Trans. Signal Process. 1992;40(4):901–914.
[9] Zhang Y, Brady M, Smith S. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans. Med. Imag. 2001;20(1):45–57. doi: 10.1109/42.906424.
[10] Kupinski M, Giger M. Automated seeded lesion segmentation on digital mammograms. IEEE Trans. Med. Imag. 1998;17(4):510–517. doi: 10.1109/42.730396.
[11] Jha AK. ADC estimation in diffusion-weighted images. Master's thesis, Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ, 2009.
[12] Besag J. On the statistical analysis of dirty pictures. J. Roy. Stat. Soc. B. 1986;48(3):259–302.
[13] Kullback S, Leibler RA. On information and sufficiency. Ann. Math. Stat. 1951;22(1):79–86.
[14] Zijdenbos A, et al. Morphometric analysis of white matter lesions in MR images: method and validation. IEEE Trans. Med. Imag. 1994;13(4):716–724. doi: 10.1109/42.363096.
[15] Jimenez-Alaniz J, Medina-Banuelos V, Yanez-Suarez O. Data-driven brain MRI segmentation supported on edge confidence and a priori tissue information. IEEE Trans. Med. Imag. 2006;25(1):74–83. doi: 10.1109/TMI.2005.860999.
