Abstract
Standard image based segmentation approaches perform poorly when there is little or no contrast along boundaries of different regions. In such cases, segmentation is largely performed manually using prior knowledge of the shape and relative location of the underlying structures combined with partially discernible boundaries. We present an automated approach guided by covariant shape deformations of neighboring structures, which is an additional source of prior information. Captured by a shape atlas, these deformations are transformed into a statistical model using the logistic function. Structure boundaries, anatomical labels, and image inhomogeneities are estimated simultaneously within an Expectation-Maximization formulation of the maximum a posteriori probability estimation problem. We demonstrate the approach on 20 brain magnetic resonance images showing superior performance, particularly in cases where purely image based methods fail.
1 Introduction
To better understand brain diseases, many neuroscientists analyze medical images for cortical and subcortical structures that seem to be influenced by the disease [1]. The analysis is based on segmentations of the structures of interest, often performed by human experts. However, this manual process is not only expensive but also suffers from limited inter- and intra-observer reliability [2]. In this paper, we describe an automatic method, which accurately segments these structures by considering anatomical shape constraints and image artifacts of Magnetic Resonance (MR) images.
The detection of substructures is difficult as many of them are defined by partially discernible boundaries, such as in the case of the boundary between thalamus and white matter [3]. However, the ventricles, which lie above the thalamus, are more easily identified. In order for the ventricles to guide the boundary detection between the thalamus and the white matter, automatic segmentation algorithms use spatial priors [4–6]. These spatial priors capture the spatial relationship between structures such as the fact that the ventricles are above the thalamus. This is one example in which neighboring structures are of great utility for segmentation purposes.
These types of priors are often characterized by soft boundaries representing the large spatial variability of a structure within a population. Deformable models offer an alternative as they capture the shape and permissible modes of variation within a population. In contrast to spatial priors on tissue labels, segmentation methods based on deformable models are guided by structure specific boundary conditions such as the length of the boundary in relation to others.
The work of this paper is motivated by the class of deformable model-based approaches called active contour methods [7–9], in which the shape of an anatomical structure is represented as a level set function in a higher dimensional space. Similarly, our method defines anatomical shape constraints using signed distance maps in combination with the modes of variation of a Principal Component Analysis (PCA) [13]. While active contour methods were originally motivated by physical models [10], many methods are based on a Bayesian framework [11, 12, 14], which we chose for our algorithm. A Bayesian framework allows us to explicitly model the image inhomogeneities of MR images in order to segment large data sets without manual intervention.
The optimal solution within our framework is defined by a Maximum A posteriori Probability (MAP) estimation problem with incomplete data. From the MAP estimation problem we derive an instance of the Expectation Maximization algorithm (EM). The main contribution of the current work is that while we represent the shape variations through an implicit low-dimensional PCA, we additionally derive from this an explicit space-conditioned probability model by way of the logistic function. When combined with image-coupling and other terms in our Bayesian framework, the mechanism is able to identify shapes that are not restricted to the low-dimensional PCA space.
In contrast to other EM implementations [11, 14, 15], our method explicitly models the boundary via the shape model. Consequently, we achieve smooth segmentations without underestimating fine structures, a common problem in EM implementations [15]. To demonstrate the capabilities of our approach, we segment 20 sets of MR images into the major tissue classes as well as subcortical structures. The reliability of our approach is determined by the correspondence of the automatic segmentations to expert manual ones.
2 Deriving a Unified Framework for Image Inhomogeneity Correction, Shape Modeling, and Segmentation
The accuracy of outlining structures with indistinct boundaries in MR images significantly depends on properly modeling the boundary of the structure as well as estimating the inhomogeneities in the image. In this section, we develop a unified framework that performs segmentation, shape detection, and inhomogeneity correction simultaneously.
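Without additional assumptions, it is difficult to extract the inhomogeneities $\mathcal{B}$ and the shape parameters $\mathcal{S}$ from the MR images $I$ due to their complex dependencies. However, this problem is greatly simplified when formulated as an incomplete data problem via EM. Within this framework, we define the following MAP estimation problem:
$$(\hat{\mathcal{S}}, \hat{\mathcal{B}}) = \arg\max_{\mathcal{S}, \mathcal{B}} \log P(\mathcal{S}, \mathcal{B} \mid I) \tag{1}$$
In general, this results in a system of equations for which there is no analytical solution. We introduce the labelmap $\mathcal{T}$, which assigns each voxel in the image to an anatomical structure. If $\mathcal{T}$ is known, it eases the estimation of $\mathcal{B}$ and $\mathcal{S}$ based on $I$. In our problem, the labelmap is unknown so that the instance of the EM algorithm iteratively determines the solution of Equation (1) [16]. At each iteration, the method improves the estimates $(\mathcal{S}', \mathcal{B}')$ through
$$(\mathcal{S}', \mathcal{B}') \leftarrow \arg\max_{\mathcal{S}, \mathcal{B}} E_{\mathcal{T} \mid I, \mathcal{S}', \mathcal{B}'}\left(\log P(\mathcal{S}, \mathcal{B}, \mathcal{T} \mid I)\right) \tag{2}$$
The expected value is defined as $E_{\mathcal{T} \mid I, \mathcal{S}', \mathcal{B}'}(f(\mathcal{T})) \triangleq \sum_{\mathcal{T}} P(\mathcal{T} \mid I, \mathcal{S}', \mathcal{B}')\, f(\mathcal{T})$.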
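In our case, Equation (2) is a less complicated MAP problem than Equation (1). However, we would like to further simplify this update rule as it depends on both the shape $\mathcal{S}$ and the inhomogeneities $\mathcal{B}$. To split Equation (2) into two separate MAP problems, we first rephrase Equation (2) by simply applying Bayes' rule and dropping terms that do not depend on $(\mathcal{S}, \mathcal{B})$:
$$(\mathcal{S}', \mathcal{B}') \leftarrow \arg\max_{\mathcal{S}, \mathcal{B}} E_{\mathcal{T} \mid I, \mathcal{S}', \mathcal{B}'}\left(\log P(I \mid \mathcal{T}, \mathcal{S}, \mathcal{B}) + \log P(\mathcal{T} \mid \mathcal{S}, \mathcal{B})\right) + \log P(\mathcal{S}, \mathcal{B}) \tag{3}$$
The optimization procedure decomposes nicely as a consequence of the following independence assumptions: First, we assume independence of $I$ with respect to $\mathcal{S}$ conditioned on $\mathcal{T}$ and $\mathcal{B}$ because our model characterizes each anatomical structure by a stationary intensity distribution [11, 14]. Next, we assume independence of $\mathcal{T}$ with respect to $\mathcal{B}$ conditioned on $\mathcal{S}$, as the image inhomogeneities do not influence the shape of a structure. Finally, we assume independence of $\mathcal{B}$ with respect to $\mathcal{S}$ and that the two conditional probabilities $P(I \mid \mathcal{T}, \mathcal{B})$ and $P(\mathcal{T} \mid \mathcal{S})$ are defined by the product of the corresponding conditional probabilities over all the voxels in the image space. Thus, Equation (3) simplifies to
$$(\mathcal{S}', \mathcal{B}') \leftarrow \arg\max_{\mathcal{S}, \mathcal{B}} \sum_x E_{\mathcal{T} \mid I, \mathcal{S}', \mathcal{B}'}\left(\log P(I_x \mid \mathcal{T}_x, \mathcal{B}_x) + \log P(\mathcal{T}_x \mid \mathcal{S})\right) + \log P(\mathcal{S}) + \log P(\mathcal{B}) \tag{4}$$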
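The labelmap $\mathcal{T}$ is composed of the indicator random vectors $\mathcal{T}_x$, where $x$ represents a voxel on the image grid. Each $\mathcal{T}_x$ takes one of the values $e_a$, where the vector $e_a$ is zero at every position but $a$, where its value is one. For example, if $\mathcal{T}_x = e_a$ then voxel $x$ is assigned to the structure $a$. We now define the E-Step of our EM implementation as
$$\mathcal{W}_x(a) \triangleq E_{\mathcal{T} \mid I, \mathcal{S}', \mathcal{B}'}\left(\mathcal{T}_x(a)\right) = P(\mathcal{T}_x = e_a \mid I, \mathcal{S}', \mathcal{B}')$$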
If we assume that is independent of ℬ then
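$$\mathcal{W}_x(a) = \frac{P(I_x \mid \mathcal{T}_x = e_a, \mathcal{B}'_x)\, P(\mathcal{T}_x = e_a \mid \mathcal{S}')}{\sum_{a'} P(I_x \mid \mathcal{T}_x = e_{a'}, \mathcal{B}'_x)\, P(\mathcal{T}_x = e_{a'} \mid \mathcal{S}')} \tag{5}$$
and Equation (4) reduces to
$$(\mathcal{S}', \mathcal{B}') \leftarrow \arg\max_{\mathcal{S}, \mathcal{B}} \sum_x \sum_a \mathcal{W}_x(a) \left(\log P(I_x \mid \mathcal{T}_x = e_a, \mathcal{B}_x) + \log P(\mathcal{T}_x = e_a \mid \mathcal{S})\right) + \log P(\mathcal{S}) + \log P(\mathcal{B})$$
Now, the M-Step solves the following two separate MAP problems
$$\mathcal{S}' \leftarrow \arg\max_{\mathcal{S}} \sum_x \sum_a \mathcal{W}_x(a) \log P(\mathcal{T}_x = e_a \mid \mathcal{S}) + \log P(\mathcal{S}) \tag{6}$$
$$\mathcal{B}' \leftarrow \arg\max_{\mathcal{B}} \sum_x \sum_a \mathcal{W}_x(a) \log P(I_x \mid \mathcal{T}_x = e_a, \mathcal{B}_x) + \log P(\mathcal{B}) \tag{7}$$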
A variety of closed-form solutions for Equation (7) have been proposed in the literature such as by [14] and [11]. The remainder of this paper therefore focuses on Equation (6).
In summary, we find a local maximum of the difficult MAP estimation problem of Equation (1) by solving the simpler Equation (2), derived from an EM formulation. Based on independence assumptions, our instance of the EM algorithm iterates between the E-Step, which calculates the weights $\mathcal{W}$ via Equation (5), and the M-Step, which solves the MAP problems of Equation (6) and Equation (7).
3 Logistic Maps for Shape Probabilities
The solution of Equation (6) greatly depends on the shape representation that defines the space of $\mathcal{S}$ and the probabilities that define the relationship of the attributes within our model. This section gives an example for a derivation of this equation. Before we do so, we briefly review the shape representation defined by the signed distance map.
Note, while we adopt a PCA representation of shape information, the final estimate is not restricted to the PCA parameterization of shape. This is facilitated by the use of the logistic function as described in Section 3.2. Consequently, our model captures a broader class of shapes than those methods that are restricted to the PCA model.
3.1 Shape Representation
As mentioned, the results of level set methods [7, 8, 17] using a PCA model on signed distance maps inspired us to introduce shape constraints in an EM framework. We follow the suggestion by Tsai et al. [7], who apply PCA to all structures simultaneously to capture the covariation between structures. We initially model the shapes of all structures of interest by the distance map $\mathcal{D}$. $\mathcal{D}(x)$ is a vector of dimension equal to the number of structures of interest. It defines the distance of voxel $x$ to the boundary of each structure. Positive values are assigned to voxels within the boundary of the object, while negative values indicate voxels outside the object.
We first turn a set of manual segmentations into signed distance maps and then apply PCA to the maps in order to determine the modes of variation of each structure. The resulting shape model is represented by the eigenvector (modes of variation) matrix $U$, the eigenvalue matrix $\Lambda$, and the mean distance map $\bar{\mathcal{D}}$, where $\bar{\mathcal{D}}_a$ is the mean distance map of the anatomical structure $a$. To reduce the computational complexity of the EM implementation, $U$ and $\Lambda$ are only defined by the first $K$ eigenvectors. In our case, $K$ is chosen so that the retained eigenvalues account for 99% of the eigenvalues' energy, which corresponds to the first five eigenvectors.
The shapes in a specific image are described by the expansion coefficients of the eigenvector representation, which are the shape parameters $\mathcal{S}$. $\mathcal{S}$ relates to the distance maps by $\mathcal{D}_{\mathcal{S}} = \bar{\mathcal{D}} + U \mathcal{S}$, where $\mathcal{D}_{\mathcal{S}}$ captures the distance maps of all structures of interest. We refer to the distance map of a specific structure $a$ defined by shape $\mathcal{S}$ as $\mathcal{D}_{\mathcal{S},a} = \bar{\mathcal{D}}_a + U_a \mathcal{S}$, where $U_a$ are the entries in $U$ corresponding to structure $a$. This type of shape representation is only appropriate for defining local shape deformations as the space defined by signed distance maps is not a linear vector space. Thus, $\mathcal{D}_{\mathcal{S}}$ is a local approximation to the manifold of distance maps.
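The construction of the shape atlas described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the helper names `signed_distance` and `shape_pca` are hypothetical, the toy data are 2D discs rather than 3D manual segmentations, and only a single structure is modeled (for several structures, the per-structure maps would be concatenated before the PCA).

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def signed_distance(mask):
    """Signed Euclidean distance map: positive inside the object, negative outside."""
    mask = np.asarray(mask, dtype=bool)
    inside = distance_transform_edt(mask)     # distance to nearest background voxel
    outside = distance_transform_edt(~mask)   # distance to nearest object voxel
    return inside - outside


def shape_pca(masks, energy=0.99):
    """PCA over flattened signed distance maps (hypothetical helper).

    Returns the mean map, the first K modes of variation (as columns), and
    their eigenvalues, where K is the smallest number of modes whose
    eigenvalues cover the requested fraction of the total energy.
    """
    maps = np.stack([signed_distance(m).ravel() for m in masks])
    mean = maps.mean(axis=0)
    centered = maps - mean
    # The SVD of the centered data yields the modes without forming the
    # full voxel-by-voxel covariance matrix.
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = sing ** 2 / (len(masks) - 1)
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, energy) + 1)
    return mean, vt[:k].T, eigvals[:k]


# Toy training set: discs of different radii on a 16 x 16 grid.
yy, xx = np.mgrid[:16, :16]
masks = [(yy - 8) ** 2 + (xx - 8) ** 2 <= r * r for r in (3, 4, 5, 6)]
mean, modes, eigvals = shape_pca(masks)

# Shape parameters of one subject and the distance map they encode
# (mean map plus modes times coefficients).
coeffs = modes.T @ (signed_distance(masks[0]).ravel() - mean)
reconstructed = mean + modes @ coeffs
```

The zero level set of `reconstructed` approximates the boundary of the training shape; because signed distance maps do not form a linear space, the reconstruction is in general only approximately a distance map.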
We end this brief description of the shape model by defining the prior over the shape parameters as
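$$P(\mathcal{S}) = \frac{1}{\sqrt{(2\pi)^K |\Lambda|}} \exp\left(-\frac{1}{2} \mathcal{S}^t \Lambda^{-1} \mathcal{S}\right) \tag{8}$$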
which is based on the hidden Gaussian assumption in PCA.
3.2 Estimating the Shape
In this section, we define the relationship of the unknown labelmap $\mathcal{T}$ and the shape parameter $\mathcal{S}$ captured by the conditional probability $P(\mathcal{T} \mid \mathcal{S})$ of Equation (6). The task is not straightforward because, unlike active contour methods, we also model the unknown labelmap $\mathcal{T}$ and the image inhomogeneities $\mathcal{B}$ explicitly. The shape $\mathcal{S}$ captures global characteristics of structures, while $\mathcal{T}$ and $\mathcal{B}$ characterize local properties. Motivated by the need to combine global and local information, we describe the use of the logistic function of the distance transform. The logistic function provides an implicit representation of the shape and an explicit space-conditioned probability model.
As mentioned previously, our model captures the relationship between the shape parameters $\mathcal{S}$ (which correspond to a signed distance map) and the labelmap $\mathcal{T}$ through the conditional probability $P(\mathcal{T} \mid \mathcal{S})$. Since the random variable $\mathcal{T}_x$ is discrete, we define the conditional probability in terms of a generic shape function $\mathcal{A}(\cdot,\cdot)$ as
$$P(\mathcal{T}_x = e_a \mid \mathcal{S}) \triangleq \mathcal{A}_a(\mathcal{S}, x)$$
Given the motivation above, a natural choice for this formulation is the logistic function
$$\mathcal{A}_a(\mathcal{S}, x) \triangleq \frac{1}{1 + e^{-c_a \mathcal{D}_{\mathcal{S},a}(x)}}$$
which maps the distance map to the range (0,1). For example, if $\mathcal{D}_{\mathcal{S},a}(x)$ is positive, then the voxel is inside the object and $\mathcal{A}_a(\mathcal{S}, x) > 0.5$. The variations within $\mathcal{A}_a$ depend on $c_a$, which captures the certainty of the method with respect to the shape model. Uncertainty about the shape model is represented by a relatively small $c_a$. This results in a shallow slope of the spatial distribution (see Figure 1), which allows greater mobility of the boundary. Large $c_a$ define spatial priors with steep slopes, which tend to fix the position of the boundary of a structure.
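The effect of the slope parameter on the spatial distribution can be illustrated with a short Python sketch. The function name `logistic_shape_prior` and the numeric values are chosen purely for illustration and are not taken from the paper.

```python
import numpy as np


def logistic_shape_prior(distance, c):
    """Logistic function of a signed distance map.

    `distance` is positive inside the structure and negative outside;
    `c` is the slope parameter expressing confidence in the shape model.
    """
    return 1.0 / (1.0 + np.exp(-c * np.asarray(distance, dtype=float)))


# Signed distances of three voxels: outside, on the boundary, inside.
distances = np.array([-2.0, 0.0, 2.0])
uncertain = logistic_shape_prior(distances, c=0.5)   # shallow slope
confident = logistic_shape_prior(distances, c=5.0)   # steep slope
```

With the small slope the probabilities stay relatively close to 0.5 on both sides of the boundary, so image evidence can still move it; with the large slope they saturate quickly, which effectively pins the boundary to the zero level set of the distance map.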
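The probability of the segmentation conditioned on the shape is now defined as
$$P(\mathcal{T} \mid \mathcal{S}) = \prod_x \prod_a \left(\frac{1}{1 + e^{-c_a \mathcal{D}_{\mathcal{S},a}(x)}}\right)^{\mathcal{T}_x(a)} \tag{9}$$
so that the MAP estimation problem of Equation (6) transforms to
$$\mathcal{S}' \leftarrow \arg\max_{\mathcal{S}} \sum_x \sum_a \mathcal{W}_x(a) \log \frac{1}{1 + e^{-c_a \mathcal{D}_{\mathcal{S},a}(x)}} - \frac{1}{2} \mathcal{S}^t \Lambda^{-1} \mathcal{S} \tag{10}$$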
Determining a closed form solution to this estimation problem is generally very difficult so that we approximate its solution using Powell’s method [18].
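To make the shape update concrete, the following Python sketch minimizes a toy version of the negative log-posterior (weighted logistic terms plus the Gaussian prior on the shape parameters) with the Powell implementation in SciPy. Everything in it is an assumption made for illustration: the 1D grid, the single mode of variation, the values of the slope and the eigenvalue, and the treatment of the background as a second label whose distance map is the negative of the structure's map.

```python
import numpy as np
from scipy.optimize import minimize

# Toy 1D setting: the structure is an interval centered at 0.
x = np.linspace(-10.0, 10.0, 201)
mean_map = 4.0 - np.abs(x)     # mean signed distance map (half-width 4)
mode = np.ones_like(x)         # one mode of variation: grow or shrink the interval
eigenvalue = 4.0               # variance of the shape parameter under the prior
slope = 2.0                    # slope of the logistic function

# Weights from a hypothetical E-Step: the subject's structure has half-width 6.
w_structure = (np.abs(x) < 6.0).astype(float)
w_background = 1.0 - w_structure


def neg_log_posterior(s):
    """Negative of the shape objective for a single shape parameter."""
    s = np.atleast_1d(s)
    d = mean_map + mode * s[0]
    # -log(1 / (1 + exp(-c d))) = log(1 + exp(-c d)) = logaddexp(0, -c d)
    data = np.sum(w_structure * np.logaddexp(0.0, -slope * d)
                  + w_background * np.logaddexp(0.0, slope * d))
    prior = 0.5 * s[0] ** 2 / eigenvalue
    return data + prior


result = minimize(neg_log_posterior, x0=np.zeros(1), method="Powell")
estimate = float(np.atleast_1d(result.x)[0])
```

In this toy problem the estimate ends up close to 2, the offset that moves the mean half-width of 4 to the subject's half-width of 6, pulled slightly toward zero by the prior. Powell's method needs only function evaluations, which is convenient when the gradient with respect to the shape parameters is awkward to derive.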
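In summary, the parameters $\mathcal{S}$ are seen within the context of a shape atlas created by PCA on signed distance maps. We relate the shape model to the EM algorithm of the previous section by defining $P(\mathcal{T} \mid \mathcal{S})$ of Equation (6) as a composition of logistic functions on distance maps. The E-Step of the EM algorithm calculates the weights $\mathcal{W}$ based on the shape parameters $\mathcal{S}'$, intensity $I$, image inhomogeneities $\mathcal{B}'$, and voxel $x$:
$$\mathcal{W}_x(a) \leftarrow \frac{P(I_x \mid \mathcal{T}_x = e_a, \mathcal{B}'_x) \left(1 + e^{-c_a \mathcal{D}_{\mathcal{S}',a}(x)}\right)^{-1}}{\sum_{a'} P(I_x \mid \mathcal{T}_x = e_{a'}, \mathcal{B}'_x) \left(1 + e^{-c_{a'} \mathcal{D}_{\mathcal{S}',a'}(x)}\right)^{-1}}$$
The distribution of $P(I_x \mid \mathcal{T}_x = e_a, \mathcal{B}_x)$ depends on the underlying image inhomogeneity model, which is an ongoing discussion [11, 14]. We choose the model by Wells et al. [11] that defines $P(I_x \mid \mathcal{T}_x = e_a, \mathcal{B}_x)$ by the Gaussian distribution $\mathcal{N}(I_x - \mathcal{B}_x; \mu_a, \sigma_a^2)$, where $(\mu_a, \sigma_a^2)$ capture the mean and variance of the intensity distribution of the structure $a$.
The M-Step updates the estimates of the inhomogeneities and shape based on the weights $\mathcal{W}$. The update rule of $\mathcal{B}'$ (Equation (7)) reduces to a system of linear equations and is solved in closed form [11]. The shape $\mathcal{S}'$ is updated according to Equation (10), for which a solution is found via Powell's method [18].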
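The E-Step described above amounts to a voxel-wise application of Bayes' rule. The Python sketch below is a simplified illustration under stated assumptions: the function name `e_step_weights`, the array layout, and the toy numbers are hypothetical, the Gaussian intensity model with an additive inhomogeneity follows the description of Wells et al. [11] given in the text, and the shape probabilities are passed in as a precomputed array.

```python
import numpy as np


def e_step_weights(image, bias, means, variances, shape_prior):
    """Posterior weight of each structure at each voxel.

    image, bias : arrays of shape (n_voxels,)
    means, variances : arrays of shape (n_structures,)
    shape_prior : array of shape (n_voxels, n_structures), for example
                  logistic functions of the current distance maps
    """
    residual = image[:, None] - bias[:, None] - means[None, :]
    likelihood = (np.exp(-0.5 * residual ** 2 / variances)
                  / np.sqrt(2.0 * np.pi * variances))
    joint = likelihood * shape_prior
    # Normalize over structures so that the weights at each voxel sum to one.
    return joint / joint.sum(axis=1, keepdims=True)


# Three voxels, two structures with mean intensities 1 and 5.
image = np.array([1.0, 5.0, 3.0])
bias = np.zeros(3)
means = np.array([1.0, 5.0])
variances = np.array([1.0, 1.0])
shape_prior = np.full((3, 2), 0.5)
weights = e_step_weights(image, bias, means, variances, shape_prior)
```

With a flat shape prior the weights are driven entirely by the intensities: the first two voxels are assigned almost exclusively to the structure whose mean they match, while the third voxel, halfway between the two means, is split evenly. A non-flat shape prior would break such ties, which is the role the shape model plays along weakly visible boundaries.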
4 Validation
This section compares the accuracy of our new method with (EM-Shape) and without shape modeling (EM-NoShape). Both methods segment 20 test cases into the three brain tissue classes: white matter, grey matter, and cerebrospinal fluid. As in Figure 2, the right (pink) and left ventricle (turquoise) are extracted from the cerebrospinal fluid, and the grey matter is further parcellated into right (red) and left (purple) thalamus, and right (green) and left caudate (blue). We determine the accuracy of the approaches by comparing the automatic segmentations of the thalamus and the caudate to manual ones, which we view as ground-truth.
With respect to EM-Shape, the atlas of Section 3.1 represents the shape of the thalamus, caudate, and the ventricles. The three brain tissue classes are excluded from the dynamic shape model as their spatial distributions are defined by the spatial atlas of [15] and not Equation (9). The model of EM-NoShape represents all anatomical structures by the spatial atlas.
We focus on the thalamus and caudate as they are challenging structures to segment. Purely intensity based segmentation methods, such as EM without spatial priors, cannot outline these structures because part of the boundary is invisible on MR images. Consequently, EM relies heavily on the prior information. In addition, the two structures are characterized by very different shapes (see Figure 2). While the right and left thalamus are shaped like an oval with a hook attached to it, the caudate is defined by long, thin horns wrapped around the ventricles. The segmentation methods also segment the ventricles because they are clearly visible on MR images. This structure further constrains the space of possible solutions for EM-Shape as all structures of interest have to be in proper proportion to each other.
To measure the quality of the automatically generated results, we compare them to the manual segmentations using the volume overlap measure DICE [19]. The graph in Figure 2 shows the average DICE measures and standard error for the two methods with respect to the thalamus and caudate. For the thalamus, EM-Shape achieves a higher average score (88.4 ± 1.0%; mean DICE score ± standard error) than EM-NoShape (87.3 ± 1.2%). The impact of the shape model on the segmentation results is even more apparent in the case of the caudate, where EM-Shape (84.9 ± 0.8%) is significantly better than EM-NoShape (82.7 ± 1.2%). The greater accuracy of EM-Shape is attributed to the shape atlas, which better captures the subject specific bending of the horn shaped caudate than the spatial atlas.
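For reference, the DICE overlap between an automatic and a manual segmentation is twice the volume of their intersection divided by the sum of their volumes. A minimal Python sketch follows; the function name `dice` and the toy masks are illustrative, and returning 1.0 for two empty masks is a convention assumed here.

```python
import numpy as np


def dice(segmentation, reference):
    """DICE overlap of two binary masks: 2 |A and B| / (|A| + |B|)."""
    a = np.asarray(segmentation, dtype=bool)
    b = np.asarray(reference, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / total


automatic = np.array([[1, 1, 0, 0]])
manual = np.array([[0, 1, 1, 0]])
score = dice(automatic, manual)   # one shared voxel, two voxels per mask
```

The score is 1 for identical masks and 0 for disjoint ones. Because it normalizes by the structure volumes, errors of a few voxels lower the score more for thin structures such as the caudate than for compact ones.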
The initial DICE score of EM-Shape is generally lower than that of EM-NoShape because the shape model misrepresents the patient specific structures. For example, Figure 3 shows the outcome of EM-Shape after every fifth iteration. Initially, the segmentation is noisy, which indicates discrepancy between the initial shape model defined by the mean shape and the patient specific shape. With each iteration, the arch of the caudate widens and the segmentations get smoother. After 20 iterations the method converges to a solution that generally outperforms EM-NoShape.
As mentioned, it is difficult to determine the exact shape of a structure with weakly visible boundaries. From the MR images, the size of the oval and the position of the hook of the thalamus are often not clearly defined. The top-left image of Figure 4 shows an example of such a scenario. The segmentations are the results of the two automatic segmentation methods where black indicates the outline of the human expert. In this example, EM-NoShape underestimates the hook of the thalamus, which we found to be true throughout this experiment. EM-Shape can better cope with this problem as the shape model adds global constraints to the local analysis of the intensities. An example of a global constraint is the explicit definition of shape dependencies across anatomical structures. This causes the shape of the thalamus to be proportional to that of the easily segmentable ventricles. This improves the accuracy of EM-Shape as it further constrains the space of possible segmentations.
The other structure of interest in this experiment is the caudate. The structure is adjacent to the putamen, another subcortical structure with an identical intensity distribution. In the MR image of the middle column of Figure 4, the putamen is located toward the outside of the image. Neither the intensity pattern nor the spatial prior can properly separate these two structures, as indicated by the noisy segmentations of EM-NoShape. The outliers visible in EM-NoShape violate the shape constraints of EM-Shape as the boundary has to satisfy the conditions set by the ventricles and the thalamus.
For both structures, EM-NoShape did not adequately segment the ends of the structure. In the right column of Figure 4, EM-NoShape underestimates the tip of the caudate. The opposite is true for the thalamus where EM-NoShape overestimates the ends. Again, spatial and intensity distributions do not allow discrimination between anatomical structures in this area. In summary, on the 20 test cases our shape based method EM-Shape achieved higher average DICE scores for both the thalamus and the caudate than EM-NoShape, which uses a spatial atlas instead of a shape atlas.
5 Summary and Conclusions
We presented a statistical framework for the segmentation of anatomical structures in MR images. The framework is guided by, but not restricted to, the low-dimensional PCA shape model as the shape representation is turned into a space-conditioned probability model using the logistic function. The approach is especially well suited for structures with weakly visible boundaries as it simultaneously estimates the image inhomogeneities, explicitly models the boundaries through a deformable shape model, and segments the MR images into anatomical structures. Our approach was validated by automatically segmenting 20 test cases and comparing the results to a similar EM implementation without shape priors. In general, our new method performs much better. The improvement is primarily due to explicit modelling of the shape constraints along the boundary of anatomical structures.
Acknowledgments
This investigation was supported by the NIH grants K02 MH-01110, R01 MH-50747, R01-NS051826-01, P41 RR-13218, U24 RR021382, and U54-EB-005149. We would also like to thank Bryan Russel and Polina Golland for their helpful comments.
References
1. Shenton M, Kikinis R, Jolesz F, Pollak S, LeMay M, Wible C, Hokama H, Martin J, Metcalf D, Coleman M, McCarley R. Left temporal lobe abnormalities in schizophrenia and thought disorder: A quantitative MRI study. New England Journal of Medicine. 1992;327:604–612. doi: 10.1056/NEJM199208273270905.
2. Kikinis R, Shenton ME, Gerig G, Martin J, Anderson M, Metcalf D, Guttmann C, McCarley RW, Lorensen W, Cline H, Jolesz FA. Routine quantitative analysis of brain and cerebrospinal fluid spaces with MR imaging. Journal of Magnetic Resonance Imaging. 1992;2(6):619–629. doi: 10.1002/jmri.1880020603.
3. Pohl K, Fisher J, Levitt J, Shenton M, Kikinis R, Grimson W, Wells W. A unifying approach to registration, segmentation, and intensity correction. MICCAI. 2005. doi: 10.1007/11566465_39.
4. Collins D, Zijdenbos A, Baaré W, Evans A. Animal+insect: Improved cortical structure segmentation. IPMI. 1999;1613.
5. Leventon M, Grimson W, Faugeras O. Statistical shape influence in geodesic active contours. CVPR. 2000:1316–1323.
6. Fischl B, van der Kouwe A, Destrieux C, Halgren E, Ségonne F, Salat D, Busa E, Seidman L, Goldstein J, Kennedy D, Caviness V, Makris N, Rosen B, Dale A. Automatically parcellating the human cerebral cortex. Cerebral Cortex. 2004;14:11–22. doi: 10.1093/cercor/bhg087.
7. Tsai A, Yezzi A, Wells W, Tempany C, Tucker D, Fan A, Grimson W, Willsky A. A shape-based approach to the segmentation of medical imagery using level sets. TMI. 2003;22(2):137–154. doi: 10.1109/TMI.2002.808355.
8. Leventon ME. Statistical Models in Medical Image Analysis. PhD thesis, Massachusetts Institute of Technology. 2000.
9. Yang J, Duncan JS. Joint prior models of neighboring objects for 3D image segmentation. CVPR. 2004:314–319. doi: 10.1109/CVPR.2004.1315048.
10. Mumford D, Shah J. Boundary detection by minimizing functionals. CVPR. 1985:22–26.
11. Wells W, Grimson W, Kikinis R, Jolesz F. Adaptive segmentation of MRI data. TMI. 1996;15:429–442. doi: 10.1109/42.511747.
12. Wyatt PP, Noble JA. MAP MRF joint segmentation and registration. MICCAI. 2002:580–587. doi: 10.1016/s1361-8415(03)00067-7.
13. Cootes T, Hill A, Taylor C, Haslam J. The use of active shape models for locating structures in medical imaging. Image and Vision Computing. 1994;12(6):335–366.
14. Van Leemput K, Maes F, Vandermeulen D, Suetens P. Automated model-based bias field correction of MR images of the brain. TMI. 1999;18(10):885–895. doi: 10.1109/42.811268.
15. Pohl K, Bouix S, Kikinis R, Grimson W. Anatomical guided segmentation with non-stationary tissue class distributions in an expectation-maximization framework. ISBI. 2004:81–84. doi: 10.1109/ISBI.2004.1398479.
16. McLachlan GJ, Krishnan T. The EM Algorithm and Extensions. John Wiley and Sons, Inc; 1997.
17. Yang J, Staib LH, Duncan JS. Neighbor-constrained segmentation with level set based 3D deformable models. TMI. 2004;23(8):940–948. doi: 10.1109/TMI.2004.830802.
18. Press W, Flannery B, Teukolsky S, Vetterling W. Numerical Recipes in C: The Art of Scientific Computing. 2 ed. Cambridge University Press; 1992.
19. Dice LR. Measure of the amount of ecological association between species. Ecology. 1945;26:297–302.