Abstract
Image segmentation plays an essential role in many medical applications. Low SNR conditions and various artifacts make its automation challenging. To achieve robust and accurate segmentation results, a good approach is to introduce proper shape priors. In this study, we present a unified variational segmentation framework that regularizes the target shape with a level-set based sparse composite prior. When the variational problem is solved with a block minimization/descent scheme, the regularizing impact of the sparse composite prior can be observed to adjust to the most recent shape estimate, and may be interpreted as a “dynamic” shape prior, yet without compromising convergence thanks to the unified energy framework. The proposed method was applied to segment the corpus callosum from 2D MR images and the liver from 3D CT volumes. Its performance was evaluated using the Dice Similarity Coefficient and the Hausdorff distance, and compared with two benchmark level-set based segmentation methods. The proposed method achieved statistically significantly higher accuracy in both experiments and avoided faulty inclusion/exclusion of surrounding structures with similar intensities, as opposed to the benchmark methods.
Keywords: variational segmentation, sparse composite shape prior, level set
1. Introduction
Image segmentation is a ubiquitous challenge in medical imaging analysis, especially when the noise level is high and/or observations are subject to partial occlusion due to signal voids. To achieve more robust and accurate segmentation results, shape priors need to be incorporated.
One typical approach is to encourage the target segmentation to be close to a single given template [1, 2, 3, 4, 5], often a “mean” representation generated from a training set. However, such an approach could introduce large bias, especially when the ground truth differs significantly from most shapes in the training set. Another approach for modeling shape priors is to impose a statistical distribution on the training set [6, 7, 8] and to approximate the shape manifold as a linear space. These methods have achieved some success but are limited by the questionable presumptions of a linear shape space and a Gaussian distribution of shapes [9, 10, 11, 12]. Further investigations of nonlinear shape modeling approaches follow the general rationale of local analysis, and typical approaches include manifold learning [13], kernel density estimation [10, 11], and kernel PCA [14]. More recently, the development of compressed sensing techniques has inspired the refinement of intermediate segmentation results with a sparse linear combination of training shapes, represented as a set of regularly placed points along the shape contour [15, 16].
In this study, we present a unified variational segmentation framework with a level-set based sparse composite shape prior, which reflects the perspective of a local linear approximation to the globally nonlinear shape manifold. When a specific block minimization/descent scheme is applied to solve the optimization problem, the target shape can be observed to be pushed towards an evolving sparse shape superposition at each iteration, which can be perceived as a dynamically evolving shape prior. We use a level-set based shape representation that avoids building point correspondences and permits flexible shape interpolation/extrapolation. Formulating the segmentation problem in a unified energy minimization framework and solving it with a descent scheme guarantees numerical stability.
2. Method
2.1. Level-set based shape representation and variational energy formulation
The level set method, originally introduced by Osher et al. [17], is a flexible framework for curve representation. A contour C is represented implicitly as the zero level set of a signed distance function ϕ, which is one dimension higher. The level set approach provides a continuous representation of the curve on a fixed regular grid, avoiding the effort to maintain point correspondence and distribution regularity as with point clouds [18]. More importantly, linearly combining level set functions yields a natural interpolation/extrapolation behavior of the resulting zero level set contour. Perhaps counterintuitively, this property is not shared by the commonly used characteristic function representation, whose linear combination provides a staircase addition, leading to piecewise shape results as the thresholding value varies. To see this, consider the simplest scenario of a convex combination, which one would want to correspond to an interpolation of the underlying shapes. Let two shapes define the boundaries of regions Ω1 and Ω2, respectively. The convex combination of the characteristic functions, f(η) = η1{x ∈ Ω1} + (1 − η)1{x ∈ Ω2}, is piecewise constant:
$$
f(x) =
\begin{cases}
1, & x \in \Omega_1 \cap \Omega_2, \\
\eta, & x \in \Omega_1 \setminus \Omega_2, \\
1 - \eta, & x \in \Omega_2 \setminus \Omega_1, \\
0, & \text{otherwise.}
\end{cases}
$$
Without loss of generality, assume η ≤ 1/2. Thresholding this “staircase” function at ξ, i.e. taking the region {x : f(x) > ξ}, will generate “interpolating” shapes of the form:
$$
\{x : f(x) > \xi\} =
\begin{cases}
\Omega_1 \cup \Omega_2, & 0 \le \xi < \eta, \\
\Omega_2, & \eta \le \xi < 1 - \eta, \\
\Omega_1 \cap \Omega_2, & 1 - \eta \le \xi < 1.
\end{cases}
$$
Only these three shapes are attainable, whatever the values of η and ξ.
On the other hand, linear combination of shapes with signed distance function representation can be conceptualized as a diffusion process starting with the given shapes with different velocity governed by the weight parameter η. Figure 1 exemplifies this phenomenon by combining a circle Ω1 and a square Ω2, with a simplistic scenario Ω1 ⊂ Ω2.
Figure 1.
Shapes resulting from linearly combining shapes represented with characteristic and level set functions: (a) with characteristic functions, the resulting shapes are piecewise, and in this special case of Ω1 ⊂ Ω2 correspond to either ∂Ω1 or ∂Ω2. (b) As the weight η varies, the convex combination of the level set functions yields natural (smooth) interpolation behavior of the resulting shapes represented by the zero level set.
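The contrast between the two representations can be checked numerically. The following is a minimal sketch, not the authors' code: it takes a 1D cross-section through the common center of the circle and the square (so each shape becomes an interval), assumes signed distance functions that are positive inside, and uses hypothetical values for the half-widths, the threshold ξ = 0.5 and the grid.

```python
def sdf_interval(x, half_width):
    # Signed distance to [-half_width, half_width]; "positive inside" is an assumed convention.
    return half_width - abs(x)


def indicator_interval(x, half_width):
    # Characteristic function of the same interval.
    return 1.0 if abs(x) <= half_width else 0.0


def half_width_above(values, xs, level):
    # Largest |x| on the grid where the function exceeds `level` (0.0 if nowhere).
    inside = [abs(x) for x, v in zip(xs, values) if v > level]
    return max(inside) if inside else 0.0


xs = [i / 100.0 for i in range(-300, 301)]  # grid on [-3, 3] with spacing 0.01
r1, r2 = 1.0, 2.0                           # hypothetical half-widths, Omega1 inside Omega2
xi = 0.5                                    # hypothetical threshold for the characteristic functions
etas = [0.0, 0.25, 0.5, 0.75, 1.0]

level_set_widths = []
indicator_widths = []
for eta in etas:
    phi = [eta * sdf_interval(x, r1) + (1 - eta) * sdf_interval(x, r2) for x in xs]
    f = [eta * indicator_interval(x, r1) + (1 - eta) * indicator_interval(x, r2) for x in xs]
    level_set_widths.append(half_width_above(phi, xs, 0.0))
    indicator_widths.append(half_width_above(f, xs, xi))

for eta, a, b in zip(etas, level_set_widths, indicator_widths):
    print(f"eta={eta:.2f}  level-set half-width={a:.2f}  indicator half-width={b:.2f}")
```

In this sketch the level-set half-width moves gradually between the two shapes as η changes, whereas the thresholded characteristic combination only ever returns one of the two original half-widths.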
The typical variational level set segmentation formulation has the following form:
$$E(\phi) = E_{\text{fidelity}}(\phi) + \lambda\, E_{\text{reg}}(\phi) \tag{1}$$
where Efidelity is the data fidelity that depends on image appearance, Ereg regularizes the geometrical properties of the segmentation, and λ is the balancing parameter. The specific formulations for Efidelity and Ereg are provided in the subsequent sections.
2.2. Fidelity metric and likelihood function
As a generalization to the classic Chan-Vese level set segmentation method [19], we model intensity distributions with Gaussian mixtures [20, 21]. The intensity distributions for the foreground (Ωin) and background (Ωout) voxels are modeled as:
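$$p_{\text{in}}(I(x) \mid \theta_{\text{in}}) = \sum_{j=1}^{n_{\text{in}}} w_{\text{in},j}\, \mathcal{N}\big(I(x);\, \mu_{\text{in},j}, \sigma_{\text{in},j}\big), \qquad p_{\text{out}}(I(x) \mid \theta_{\text{out}}) = \sum_{j=1}^{n_{\text{out}}} w_{\text{out},j}\, \mathcal{N}\big(I(x);\, \mu_{\text{out},j}, \sigma_{\text{out},j}\big) \tag{2}$$
where I(x) is the image intensity at voxel x, N(·; μ, σ) denotes the Gaussian density with mean μ and standard deviation σ, nin, nout represent the number of Gaussian components for the foreground and background respectively, and θin = {win,j, μin,j, σin,j}, θout = {wout,j, μout,j, σout,j} represent the weight and parameters for the jth Gaussian component in the corresponding partition. Efidelity is constructed as the negative logarithm of the likelihood:
$$E_{\text{fidelity}} = -\int_{\Omega_{\text{in}}} \log p_{\text{in}}(I(x) \mid \theta_{\text{in}})\, dx \; - \int_{\Omega_{\text{out}}} \log p_{\text{out}}(I(x) \mid \theta_{\text{out}})\, dx \tag{3}$$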
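As a rough illustration of a Gaussian-mixture fidelity term, the sketch below evaluates the negative log-likelihood of a discrete set of intensities for a given foreground/background labeling. It is an assumption-laden toy, not the authors' implementation: the function names, the `(weights, means, sigmas)` parameter layout and the small `eps` guard are all hypothetical, and the integral is replaced by a sum over samples.

```python
import math


def gmm_pdf(value, weights, means, sigmas):
    # Density of a 1D Gaussian mixture evaluated at `value`.
    return sum(
        w * math.exp(-0.5 * ((value - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, sigmas)
    )


def fidelity_energy(intensities, inside_mask, theta_in, theta_out, eps=1e-12):
    # Negative log-likelihood: foreground samples are scored with theta_in,
    # background samples with theta_out. Each theta is (weights, means, sigmas).
    energy = 0.0
    for value, inside in zip(intensities, inside_mask):
        theta = theta_in if inside else theta_out
        energy -= math.log(gmm_pdf(value, *theta) + eps)  # eps avoids log(0)
    return energy


# Toy data (hypothetical): foreground near 1.0, background near 0.0 and 3.0.
intensities = [0.9, 1.0, 1.1, 0.0, 0.1, 3.0, 3.1]
theta_in = ([1.0], [1.0], [0.2])
theta_out = ([0.5, 0.5], [0.0, 3.0], [0.2, 0.2])
good_mask = [True, True, True, False, False, False, False]
bad_mask = [not flag for flag in good_mask]

print(fidelity_energy(intensities, good_mask, theta_in, theta_out))
print(fidelity_energy(intensities, bad_mask, theta_in, theta_out))
```

A labeling that agrees with the intensity models yields a lower energy than the inverted labeling, which is what drives the contour in a likelihood-based fidelity term.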
2.3. Regularization energy
Our design of Ereg consists of two parts:
| (4) |
The first part of Ereg regularizes the contour length and attracts the contour toward areas of high image gradient by means of an edge indicator function. Eshape regularizes the shape with corresponding priors. Shape regularization is the focus of this work and is elaborated in Section 2.4.
2.4. Shape regularization
2.4.1. Sparse Composite Shape Prior (SCSP)
We first construct a shape library D = [ψ1, ψ2, ..., ψm], where ψi is a training shape represented by its level set function. All training shapes are aligned to an arbitrarily chosen center ψ0 via rigid registration. Shape regularization is designed so that the target shape is encouraged to be close to a sparse linear combination of shapes from the library:
| (5) |
In practice, we relax the l0 constraint to a weighted l1 regularization, where the weighting further concentrates the coefficients on library shapes that are closer to ϕ:
| (6) |
where C is a diagonal matrix with element . In the above formulation, Dw can be considered an implicit representation of a shape prior whose sparsity (in w) is enforced with the reweighted l1 norm.
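To make the weighted l1 idea concrete, the sketch below solves a problem of the form min over w of 0.5‖ϕ − Dw‖² + γ Σ c_i |w_i| with plain iterative soft-thresholding (ISTA). Several things here are assumptions rather than the authors' method: the 0.5 factor, the choice c_i = ‖ϕ − ψ_i‖ for the diagonal weights, the use of ISTA as the solver, and the step size 1/‖D‖_F² (a conservative bound). Shapes are passed as flat lists of level-set values.

```python
def soft_threshold(value, threshold):
    # Proximal operator of threshold * |.|
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_l1_ista(shapes, phi, gamma, n_iter=500):
    # shapes: list of m library shapes, each a flat list of n level-set values.
    # phi: current level-set estimate as a flat list of n values.
    m, n = len(shapes), len(phi)
    # Hypothetical weights: library shapes far from phi are penalized more.
    c = [sum((p - s) ** 2 for p, s in zip(phi, shape)) ** 0.5 for shape in shapes]
    # 1 / ||D||_F^2 never exceeds 1 / (largest eigenvalue of D^T D), so the step is safe.
    step = 1.0 / sum(v * v for shape in shapes for v in shape)
    w = [0.0] * m
    for _ in range(n_iter):
        recon = [sum(w[i] * shapes[i][k] for i in range(m)) for k in range(n)]
        resid = [r - p for r, p in zip(recon, phi)]
        grad = [sum(shapes[i][k] * resid[k] for k in range(n)) for i in range(m)]
        w = [soft_threshold(w[i] - step * grad[i], step * gamma * c[i]) for i in range(m)]
    return w


# Toy usage with a trivially orthonormal "library" (illustrative values only).
library = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
phi_demo = [0.9, 0.05, 0.0]
w_demo = weighted_l1_ista(library, phi_demo, gamma=0.1)
print(w_demo)
```

In this toy case the coefficient of the entry closest to `phi_demo` survives with a small shrinkage, while the distant entries are set exactly to zero, which is the concentration effect the weighting is meant to encourage.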
With the introduced fidelity and regularization energy functions, the energy model with SCSP reads:
| (7) |
where Hε is a numerical approximation of the Heaviside function [22]. We solve the above optimization problem by alternately minimizing w.r.t. {θin, θout} and w, and descending w.r.t. ϕ. Algorithm 1 presents the detailed block energy minimization scheme. It can be observed that the block minimization with respect to w generates an effective dynamic prior ψk+1, which is subsequently used to guide the variational update of the level set function ϕ.
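For readers unfamiliar with smoothed Heaviside functions, one commonly used form from the level-set literature is sketched below. The text does not state which approximation or which bandwidth ε was used, so both the sine-based formula and the default `eps=1.5` are assumptions made for illustration.

```python
import math


def heaviside_eps(z, eps=1.5):
    # Smoothed Heaviside: 0 below -eps, 1 above +eps, and a smooth ramp in between.
    if z < -eps:
        return 0.0
    if z > eps:
        return 1.0
    return 0.5 * (1.0 + z / eps + math.sin(math.pi * z / eps) / math.pi)


# Evaluate on a few level-set values (illustrative).
samples = [-3.0, -1.5, -0.75, 0.0, 0.75, 1.5, 3.0]
print([round(heaviside_eps(z), 4) for z in samples])
```

With a positive-inside convention, Hε(ϕ) acts as a soft indicator of the foreground, so region integrals over Ωin and Ωout can be written as integrals over the whole image domain weighted by Hε(ϕ) and 1 − Hε(ϕ).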
3. Experiments and Results
We assessed the performance of the proposed method by applying it to: (1) 2D corpus callosum segmentation from MR images, and (2) 3D liver segmentation from CT volumes. Given the ground truth contours, we evaluated the performance using the Dice Similarity Coefficient (DSC) and Hausdorff distance to quantify the segmentation accuracy.
3.1. Benchmark methods
We compared the proposed method with two benchmark methods:
3.2. Implementation details
1. Initialization
To achieve a warm start, we initialized the segmentation by choosing an arbitrary image Ir and its corresponding shape ψr from the training set as reference, and registered Ir to the target image It to estimate a transformation T. The initial segmentation for It was constructed as ψr ◦ T. For a fair comparison, the same initialization was used for all benchmark methods.
2. Parameter settings
The numbers of Gaussian components for the foreground (Nf) and background (Nb) were determined empirically by examining the intensity distributions of the training images. We set Nf = 1 and Nb = 2 in the corpus callosum segmentation task, and Nf = 2 and Nb = 3 in the liver segmentation experiment. The common shape prior regularization weight λ and curve smoothness regularization weight β were set to λ = 0.01 and β = 0.1, with time step Δt = 1. The l1 regularization coefficient γ in the proposed SCSP was set to 0.01.
3. Quantitative evaluation
The quantitative evaluation of the segmentation accuracy was based on DSC and Hausdorff distance. Specifically, DSC is defined as DSC = 2|Cseg ∩ Ctruth| / (|Cseg| + |Ctruth|), where Cseg and Ctruth are the segmented regions from the achieved and ground truth segmentation, respectively. The Hausdorff distance is defined as H(A, B) = max(h(A, B), h(B, A)), where h(A, B) = max_{a ∈ A} min_{b ∈ B} ‖a − b‖, with A and B being the contours from the achieved and ground truth segmentation, respectively.
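Both metrics are straightforward to compute on discrete data. The sketch below is a brute-force illustration under stated assumptions: regions are given as collections of pixel coordinates, contours as non-empty lists of points, distances are Euclidean in index units (multiply by the voxel size to obtain mm), and the function names are hypothetical.

```python
import math


def dice(seg, truth):
    # DSC = 2 |seg ∩ truth| / (|seg| + |truth|) on sets of pixel coordinates.
    seg, truth = set(seg), set(truth)
    if not seg and not truth:
        return 1.0  # convention chosen here for two empty regions
    return 2.0 * len(seg & truth) / (len(seg) + len(truth))


def directed_hausdorff(a, b):
    # h(A, B): the largest distance from a point of A to its nearest point in B.
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    # Symmetric Hausdorff distance H(A, B) = max(h(A, B), h(B, A)).
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy example (illustrative coordinates only).
seg_region = [(0, 0), (0, 1), (1, 0), (1, 1)]
truth_region = [(0, 1), (1, 1), (0, 2), (1, 2)]
contour_a = [(0, 0), (1, 0)]
contour_b = [(0, 0), (1, 3)]

print(dice(seg_region, truth_region))
print(hausdorff(contour_a, contour_b))
```

The brute-force Hausdorff computation scales with the product of the contour sizes, which is acceptable for 2D contours but may call for a spatial index or a distance transform on large 3D surfaces.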
3.3. Experiment 1: Corpus callosum segmentation from MR brain images
Segmentation of the corpus callosum in midsagittal sections is important to neurocognitive research: the size and shape of the corpus callosum have been shown to correlate with sex, age and neurodegenerative diseases [25]. The segmentation is challenging because the corpus callosum exhibits large shape variations between subjects, and neighboring structures share similar intensity values with the region of interest.
The test dataset contains 100 volumetric brain MR images from different subjects, with an image size of 256×256×128 and a voxel size of 1×1×1 mm3. The middle (66th) slice is the location of interest for segmenting the corpus callosum. Manual segmentation of that specific slice is available and was used as the ground truth in our experiment.
The corpus callosum shape library was constructed using the manual segmentations from 50 slices, as shown in Figure 2. The proposed method was applied to the remaining 50 middle slices. Example segmentation results are illustrated in Figure 3, showing accurately segmented corpus callosum and reasonably constructed shape priors. The DSC performance is reported in Figure 5. The statistics of DSC and Hausdorff distance are summarized in Table 1. The benchmark method CV tends to fail when an exterior region shares similar intensity with the interior of the corpus callosum, as shown in Figure 4(a). The benchmark method CV-SSP has difficulty segmenting corpus callosum structures that deviate substantially from the central shape, as illustrated in Figure 4(e). The sparse composite shape regularization gives the proposed approach superior performance over both the CV and CV-SSP benchmark methods. Results of the paired t-tests in Table 2 indicate that the proposed method performs significantly better (p-value < 0.1) than the benchmark methods in terms of both DSC and Hausdorff distance.
Figure 2.
Example corpus callosum shape library represented in signed distance functions and corresponding zero level sets (red).
Figure 3.
Left column: segmentation results (green); middle column: constructed shape prior overlaid with its zero contour (blue) and the ground truth contour (red); right column: the final weighting coefficients distribution.
Figure 5.
Comparison of DSC histograms from corpus callosum segmentation results.
Table 1.
Statistics of DSC and Hausdorff distance from corpus callosum segmentation results.
| Metric | Method | Mean | S.D. | Median |
|---|---|---|---|---|
| DSC | CV | 0.91 | 0.04 | 0.92 |
| DSC | CV-SSP | 0.93 | 0.05 | 0.94 |
| DSC | SCSP | 0.95 | 0.02 | 0.95 |
| Hausdorff (mm) | CV | 7.86 | 6.27 | 7.00 |
| Hausdorff (mm) | CV-SSP | 3.38 | 1.76 | 3.00 |
| Hausdorff (mm) | SCSP | 2.81 | 1.23 | 2.83 |
Figure 4.
Comparison of corpus callosum segmentation results (red).
Table 2.
P-values from paired t-tests on DSC and Hausdorff distance from corpus callosum segmentation results.
| Metric | Method | CV | CV-SSP | SCSP |
|---|---|---|---|---|
| DSC | CV | - | - | - |
| DSC | CV-SSP | 0.01 | - | - |
| DSC | SCSP | 8.42e-7 | 0.02 | - |
| Hausdorff | CV | - | - | - |
| Hausdorff | CV-SSP | 4.46e-6 | - | - |
| Hausdorff | SCSP | 2.15e-7 | 0.07 | - |
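For reference, the statistic behind a paired t-test can be sketched as follows. This is a generic illustration with made-up scores, not a reproduction of the reported values; the function name is hypothetical. The p-value then follows from a Student t distribution with n − 1 degrees of freedom, which a statistics package can supply (for example `scipy.stats.ttest_rel`).

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    # Paired t-test statistic on per-case differences, plus its degrees of freedom.
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long sequences with at least two cases")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    n = len(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical per-case scores for two methods on the same three test cases.
t_value, dof = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(t_value, dof)
```

Pairing matters here because both methods are evaluated on the same test images, so the per-case differences remove the between-subject variability. The sketch raises a division error if all differences are identical.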
3.4. Experiment 2: Liver segmentation from CT images
The liver boundaries were manually segmented in 19 volumetric abdominal CT scans from different patients, with an image size of 128 × 128 × 64 and a voxel size of 2.36 × 2.36 × 3 mm3. Given the small size of the dataset, we employed a leave-one-out strategy to test the efficacy of the proposed scheme, picking one image volume as the test case and using the rest for shape library construction. A typical liver library is shown in Figure 6. Example segmentation results are compared in Figure 7, where the proposed method with SCSP yields more accurate results than the benchmark methods. The DSC distribution of each method is compared in Figure 8. The statistics of DSC and Hausdorff distance are summarized in Table 3. Paired t-test results in Table 4 show the statistical significance of the performance difference.
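The leave-one-out protocol can be sketched as a simple index generator. The function name is hypothetical, and the segmentation call itself is left out because it depends on the full pipeline.

```python
def leave_one_out(n_items):
    # Yield (test_index, library_indices) pairs, holding out one item at a time.
    for test_index in range(n_items):
        library = [i for i in range(n_items) if i != test_index]
        yield test_index, library


splits = list(leave_one_out(19))  # 19 CT volumes, as in this experiment
for test_index, library in splits[:2]:
    print(test_index, len(library))
```

Each volume serves as the test case exactly once, and its own manual segmentation never enters the shape library used to segment it.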
Figure 6.
Example elements in the constructed liver shape library.
Figure 7.
Comparison of liver segmentation results: the ground truth (red), actual segmentation results from different approaches (green).
Figure 8.
Comparison of DSC histograms from liver segmentation results.
Table 3.
Statistics of DSC and Hausdorff distance from liver segmentation results.
| Metric | Method | Mean | S.D. | Median |
|---|---|---|---|---|
| DSC | CV | 0.84 | 0.07 | 0.87 |
| DSC | CV-SSP | 0.86 | 0.04 | 0.87 |
| DSC | SCSP | 0.90 | 0.03 | 0.91 |
| Hausdorff (mm) | CV | 43.6 | 19.6 | 38.5 |
| Hausdorff (mm) | CV-SSP | 30.0 | 15.3 | 23.9 |
| Hausdorff (mm) | SCSP | 22.2 | 6.2 | 20.6 |
Table 4.
P-values from paired t-tests on DSC and Hausdorff distance from liver segmentation results.
| Metric | Method | CV | CV-SSP | SCSP |
|---|---|---|---|---|
| DSC | CV | - | - | - |
| DSC | CV-SSP | 0.38 | - | - |
| DSC | SCSP | 2.30e-3 | 7.10e-4 | - |
| Hausdorff | CV | - | - | - |
| Hausdorff | CV-SSP | 0.02 | - | - |
| Hausdorff | SCSP | 6.15e-5 | 0.05 | - |
4. Discussion and conclusion
We have presented a unified variational segmentation framework that regularizes the target shape with a level-set based sparse composite shape prior. In both the corpus callosum and liver segmentation tasks, the proposed method achieved high segmentation accuracy and showed its advantage over the benchmark methods. Paired t-tests demonstrated the statistical significance of this advantage.
Variational segmentation methods are usually sensitive to initialization, especially when driven by edge-based fidelity alone [26]. The initialization step in our method, which consists of registering an arbitrary training image to the target image and propagating the training contour via the estimated transformation, can be considered a very crude and fast single-atlas based segmentation step. This “preprocessing” yields a decent initialization for the subsequent variational evolution. In addition, the use of Gaussian mixtures to model the intensity distributions accounts for more global and regional information, which provides additional drive to the shape update even when the local gradient is weak. In practice, we have observed that the proposed method exhibits strong robustness to initial conditions.
It should be noted that the set of signed distance functions is not closed under addition [10]. However, this does not prevent the zero level set of linear combinations from providing the proper shape interpolation/extrapolation behavior in most situations, especially when the shapes to be combined are close enough to start with. This observation can be argued again from the perspective of a local linear (tangent space) approximation of a nonlinear shape manifold. The localness of this operation is usually achieved with a sparse regression setup for automatic support selection. In this work, this feature is further enhanced with the proposed reweighting scheme modulating the l1 regularization.
This work shares a similar principle with the works on sparse shape composition [15, 16], in that shapes are considered to reside on a nonlinear manifold that can be locally represented by a low dimensional linear structure. Our work is distinct in its unified variational framework with a single optimization, as opposed to the sequential injection of shape priors as refinement steps [15, 16]. Compared to the successive refinement scheme, the single optimization provides better convergence behavior and numerical stability, particularly near the final solution. Another distinction lies in the technical aspect of shape representation. By representing shapes implicitly as level set functions, our method avoids constructing explicit point clouds, which may involve careful interventions for landmark placement and selection. The Eulerian nature of the level set method eliminates the effort of maintaining point regularity during the contour evolution. In all fairness, the point cloud representation offers a computational speed benefit when a well-executed numerical package is available, yet such a resource is generally unavailable. Therefore, one needs to be aware of the demands associated with the different technical decisions.
ALGORITHM 1.
Block minimization scheme
| while |Ek − Ek−1| > tol do |
| –Minimization w.r.t. {θin, θout}: |
| –Minimization w.r.t. w: |
| –Variational descent w.r.t. ϕ: |
| end while |
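The control flow of such a block scheme can be sketched generically as below. This is not the authors' implementation: the three update callbacks, the toy scalar energy and the default tolerance are all hypothetical, and the loop keeps iterating while the energy change exceeds the tolerance.

```python
def block_descent(energy, update_theta, update_w, descend_phi,
                  theta, w, phi, tol=1e-12, max_iter=500):
    # Alternate the three block updates until the energy change drops below tol.
    history = [energy(theta, w, phi)]
    for _ in range(max_iter):
        theta = update_theta(w, phi)      # block minimization w.r.t. intensity parameters
        w = update_w(theta, phi)          # block minimization w.r.t. sparse coefficients
        phi = descend_phi(theta, w, phi)  # one descent step w.r.t. the level set function
        history.append(energy(theta, w, phi))
        if abs(history[-1] - history[-2]) <= tol:
            break
    return theta, w, phi, history


# Toy scalar energy standing in for the real functional (purely illustrative).
def toy_energy(theta, w, phi):
    return (phi - w) ** 2 + (w - theta) ** 2 + (theta - 1.0) ** 2


theta_hat, w_hat, phi_hat, history = block_descent(
    toy_energy,
    update_theta=lambda w, phi: 0.5 * (w + 1.0),           # exact minimizer in theta
    update_w=lambda theta, phi: 0.5 * (phi + theta),       # exact minimizer in w
    descend_phi=lambda theta, w, phi: phi - 0.25 * 2.0 * (phi - w),  # gradient step, size 0.25
    theta=0.0, w=0.0, phi=0.0,
)
print(len(history) - 1, phi_hat, history[-1])
```

Because every block update either minimizes or decreases the same energy, the recorded energy sequence is non-increasing, which is the property that makes a tolerance on the energy change a sensible stopping rule.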
Acknowledgement
This work is supported in part by NIH grant R01 CA159471-01. We also acknowledge the financial and administrative support from Radiation Oncology Department, UCLA.
References
- 1. Chan Tony, Zhu Wei. Level set based shape prior segmentation. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005). Vol. 2. IEEE; 2005. pp. 1164–1170.
- 2. Cremers Daniel, Sochen Nir, Schnörr Christoph. Towards recognition-based variational segmentation using shape priors and dynamic labeling. Scale Space Methods in Computer Vision. Springer; 2003. pp. 388–400.
- 3. Chen Yunmei, Tagare Hemant D, Thiruvenkadam Sheshadri, Huang Feng, Wilson David, Gopinath Kaundinya S, Briggs Richard W, Geiser Edward A. Using prior shapes in geometric active contours in a variational framework. International Journal of Computer Vision. 2002;50(3):315–328.
- 4. Cremers Daniel, Soatto Stefano. A pseudo-distance for shape priors in level set segmentation. 2nd IEEE Workshop on Variational, Geometric and Level Set Methods in Computer Vision. 2003:169–176.
- 5. Rousson Mikael, Paragios Nikos. Shape priors for level set representations. Computer Vision – ECCV 2002. Springer; 2002. pp. 78–92.
- 6. Leventon Michael E, Grimson W Eric L, Faugeras Olivier. Statistical shape influence in geodesic active contours. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2000. Vol. 1. IEEE; 2000. pp. 316–323.
- 7. Tsai Andy, Yezzi Anthony Jr, Wells William, Tempany Clare, Tucker Dewey, Fan Ayres, Grimson W Eric, Willsky Alan. A shape-based approach to the segmentation of medical imagery using level sets. IEEE Transactions on Medical Imaging. 2003;22(2):137–154. doi: 10.1109/TMI.2002.808355.
- 8. Cootes Timothy F, Taylor Christopher J. A mixture model for representing shape variation. Image and Vision Computing. 1999;17(8):567–573.
- 9. Rousson Mikael, Cremers Daniel. Efficient kernel density estimation of shape and intensity priors for level set segmentation. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2005. Springer; 2005. pp. 757–764.
- 10. Cremers Daniel, Osher Stanley J, Soatto Stefano. Kernel density estimation and intrinsic alignment for shape priors in level set segmentation. International Journal of Computer Vision. 2006;69(3):335–351.
- 11. Kim Junmo, Çetin Müjdat, Willsky Alan S. Nonparametric shape priors for active contour-based image segmentation. Signal Processing. 2007;87(12):3021–3044.
- 12. Chen Siqi, Radke Richard J. Level set segmentation with both shape and intensity priors. 2009 IEEE 12th International Conference on Computer Vision. IEEE; 2009. pp. 763–770.
- 13. Etyngier Patrick, Segonne Florent, Keriven Renaud. Shape priors using manifold learning techniques. IEEE 11th International Conference on Computer Vision (ICCV 2007). IEEE; 2007. pp. 1–8.
- 14. Dambreville Samuel, Rathi Yogesh, Tannenbaum Allen. A framework for image segmentation using shape models and kernel space shape priors. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2008;30(8):1385–1399. doi: 10.1109/TPAMI.2007.70774.
- 15. Zhang Shaoting, Zhan Yiqiang, Dewan Maneesh, Huang Junzhou, Metaxas Dimitris N, Zhou Xiang Sean. Towards robust and effective shape modeling: Sparse shape composition. Medical Image Analysis. 2012;16(1):265–277. doi: 10.1016/j.media.2011.08.004.
- 16. Zhang Shaoting, Zhan Yiqiang, Metaxas Dimitris N. Deformable segmentation via sparse representation and dictionary learning. Medical Image Analysis. 2012;16:1385–1396. doi: 10.1016/j.media.2012.07.007.
- 17. Osher Stanley, Sethian James A. Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi formulations. Journal of Computational Physics. 1988;79(1):12–49.
- 18. Kass Michael, Witkin Andrew, Terzopoulos Demetri. Snakes: Active contour models. International Journal of Computer Vision. 1988;1(4):321–331.
- 19. Chan Tony, Vese Luminita A. Active contours without edges. IEEE Transactions on Image Processing. 2001;10(2):266–277. doi: 10.1109/83.902291.
- 20. Verma Nishant, Muralidhar Gautam S, Bovik Alan C, Cowperthwaite Matthew C, Markey Mia K. Model-driven, probabilistic level set based segmentation of magnetic resonance images of the brain. 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE; 2011. pp. 2821–2824.
- 21. Xie Zhenping, Wang Shitong, Hu Dewen. New insight at level set & Gaussian mixture model for natural image segmentation. Signal, Image and Video Processing. 2013;7(3):521–536.
- 22. Fedkiw Ronald, Osher Stanley. Level set methods and dynamic implicit surfaces. Springer; 2003.
- 23. Dempster Arthur P, Laird Nan M, Rubin Donald B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. 1977;39(1):1–38.
- 24. Boyd Stephen, Parikh Neal, Chu Eric, Peleato Borja, Eckstein Jonathan. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning. 2011;3(1):1–122.
- 25. Lundervold Arvid, Duta Nicolae, Taxt Torfinn, Jain Anil K. Model-guided segmentation of corpus callosum in MR images. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999. Vol. 1. IEEE; 1999.
- 26. Cremers Daniel, Rousson Mikael, Deriche Rachid. A review of statistical approaches to level set segmentation: integrating color, texture, motion and shape. International Journal of Computer Vision. 2007;72(2):195–215.








