Abstract
Radiotherapy planning requires accurate delineations of the critical structures. To avoid manual contouring, atlas-based segmentation can be used to get automatic delineations. However, the results strongly depend on the chosen atlas, especially for the head and neck region where the anatomical variability is high. To address this problem, atlases adapted to the patient’s anatomy may allow for a better registration, and have already been shown to improve segmentation accuracy. However, building such atlases requires a criterion for selecting, from a database, the images that are the most similar to the patient. Moreover, the inter-expert variability of manual contouring may be high, and may therefore bias the segmentation if only one image is selected for each region. To tackle these issues, we present an original method to design a piecewise most similar atlas. Given a query image, we propose an efficient criterion to select, for each anatomical region, the K most similar images from a database by considering local volume variations possibly induced by the tumor. Then, we present a new approach to combine the K images selected for each region into a piecewise most similar template. Our results obtained with 105 CT images of the head and neck show that our method reduces the over-segmentation seen with an average atlas while being robust to inter-expert manual segmentation variability.
Keywords: Algorithms; Computer Simulation; Head/radiography; Humans; Imaging, Three-Dimensional/methods; Models, Anatomic; Neck/radiography; Pattern Recognition, Automated/methods; Radiographic Image Enhancement/methods; Radiographic Image Interpretation, Computer-Assisted/methods; Reproducibility of Results; Sensitivity and Specificity; Subtraction Technique; Tomography, X-Ray Computed/methods
1. Introduction
The purpose of radiotherapy planning is to optimize the dose received by the tumor while controlling the dose on the surrounding Organs At Risk (OARs). This requires the accurate delineation of the Clinical Target Volume (CTV) and the OARs. In clinical routine, this task is often performed manually, which is tedious and prone to inter-expert variability. To ease this task, atlas-based segmentation may be used to get automatic delineations, and has shown satisfactory results for the brain [1] and promising results for the head and neck region [2].
In the head and neck, the anatomical variability among patients is high, mainly due to corpulence and neck flexion. Previous studies showed that an average atlas has difficulty coping with this high variability, and may result in over-segmentation for some structures [2]. Using an atlas that is specifically adapted to the anatomy of the patient to be delineated may help to improve the registration quality, and therefore the accuracy of the segmentation. To this end, one solution is to compute population-specific atlases, for example by clustering the database into homogeneous sub-groups [3] and computing an average atlas for each sub-group. To be even more specific to the patient (and not only to a given population), other approaches [4, 5] have been developed to consider each manually delineated image of a database as a potential atlas, and to select the most appropriate one for each new query image to segment. By extension, and to enhance robustness, it has been proposed to select several of the most appropriate images, register them independently to the patient and combine the segmentation results [6]. All these approaches raise two questions: how to select the most appropriate images for a given patient and how to fuse them.
The selection criterion must be able to account for the anatomical variability in the database (various corpulence, neck flexion, various tumor size and grade), and it must be fast enough to be used in clinical routine. Selection criteria based on meta-information (e.g. age [6]) have been used, but they are not suitable when the anatomical variability is independent of simple meta-information. Therefore, criteria based on intensities [6, 4] have been proposed. However, our database is composed of pathological images, which may corrupt intensity-based criteria. Commowick et al. proposed to estimate the amount of deformation needed to warp each image onto the patient image, using the average atlas to reduce computation time [5]. This criterion is computationally attractive but it still requires inverting and composing many deformation fields. Our first contribution is to propose an efficient selection criterion based on the degree of contraction and dilation of the structures. This criterion is well-suited for our case as it may account for the local volume variations caused by the tumor.
Regardless of the nature of the selection criterion, it may be applied globally on the images [6, 5], or locally in order to cope with the local changes of each region [7–10, 4]. Because of the high anatomical variability and as our database is composed of pathological images, a local selection seems more appropriate to consider the local impact of the tumor on the surrounding anatomical structures.
Once the most appropriate images have been selected for each region of interest, the fusion step has to be performed. In [9], a framework was proposed to build a piecewise most similar atlas from a set of images selected on predefined regions. This showed an improvement in segmentation accuracy with respect to an average atlas. However, it was restricted to the selection of a single image for each region, which makes it more sensitive to the selection step (e.g. outliers may exist in the selection process). Moreover, it may also be sensitive to the relatively high inter-expert variability in the head and neck region. Our second contribution is then to provide a framework to combine Kl selected images for each region Rl into one template for segmentation, taking into account the relative values of the selection criterion to weight each selected image accordingly.
We illustrate the capacities of our framework with 105 CT images of the head and neck region, showing its ability to reduce the over-segmentation seen with an average atlas while being less sensitive to inter-expert segmentation variability than a piecewise atlas computed using only one image per region.
2. Method
We present a new method to design an atlas that is locally adapted, on predefined regions, to the patient P to be delineated. We assume that a database of N manually delineated images {Ij}j∈[1…N] is available. Moreover, we suppose that an average atlas M has been built from this database. The average atlas construction provides for each image Ij a transformation warping it on M. We denote by TIj←M the non-linear part of the transformation that allows Ij to be resampled on M, and by JIj←M the corresponding image of the Jacobian determinant values.
2.1. Efficient Local Selection of the Most Similar Images through Volume Variation Estimation
We wish to select among the images {Ij}j∈[1…N] the ones that are the most similar to the query patient P on predefined regions {Rl}l∈[1…L]. The regions Rl are defined once and for all on the average atlas M. Typically, one may define them as a dilation of the anatomical structures of interest. For a given region Rl in M, we define our criterion as a comparison of the average degree of contraction/dilation when deforming Ij on M and when deforming P on M. To do this, we first average on Rl the logarithms of the determinants of the Jacobian matrices for each non-linear deformation TIj←M, as described below:
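(1)  $\bar{J}_{R_l}(I_j \leftarrow M) = \frac{1}{|R_l|} \sum_{x \in R_l} \log\big(J_{I_j \leftarrow M}(x)\big)$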
In the same way, after registering M and P, we can estimate J̄Rl (P ← M) from TP←M. Then, the images {Ij}j∈[1…N] can be ranked from the most similar to the least similar to the patient P on Rl according to the distance dRl (Ij, P) = ||J̄Rl (P ← M) − J̄Rl (Ij ← M)||. This criterion is well-suited for the local selection of the most similar images. Our images indeed present tumors of various sizes and grades that can induce local volume variations of the CTV and of the surrounding OARs. Moreover, it is very efficient as the J̄Rl (Ij ← M) are pre-computed. It only requires performing one non-linear registration between P and M and computing J̄Rl (P ← M). By comparison, other methods either require multiple registrations [3, 4] or many inversions and compositions of deformation fields [5].
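As an illustration only, the selection criterion can be sketched in a few lines of Python. The Jacobian determinant maps are assumed here to be flattened to plain lists of positive values, and the region Rl to be a list of voxel indices in the atlas M; the function names and toy values are hypothetical and not taken from the original implementation.

```python
import math

def mean_log_jacobian(jacobian, region):
    """Average of log(det J) over the voxel indices listed in `region`."""
    return sum(math.log(jacobian[v]) for v in region) / len(region)

def rank_by_similarity(patient_value, database_values):
    """Sort database indices from most to least similar to the patient."""
    distances = [abs(patient_value - v) for v in database_values]
    order = sorted(range(len(distances)), key=lambda j: distances[j])
    return order, distances

# Toy data (hypothetical): flattened Jacobian determinant maps J_{Ij<-M}.
region = [1, 2, 3]                  # voxel indices of R_l, defined on M
jac_db = [
    [1.0, 1.0, 1.0, 1.0, 1.0],      # I_0: no volume change in R_l
    [1.0, 2.0, 2.0, 2.0, 1.0],      # I_1: dilation in R_l
    [1.0, 0.5, 0.5, 0.5, 1.0],      # I_2: contraction in R_l
]
jac_patient = [1.0, 1.8, 2.0, 2.2, 1.0]   # J_{P<-M} after one registration

# The database values can be computed once, offline.
db_values = [mean_log_jacobian(j, region) for j in jac_db]
p_value = mean_log_jacobian(jac_patient, region)

order, distances = rank_by_similarity(p_value, db_values)
print(order)  # database indices, most similar first
```

In this toy case the image with a dilation in Rl is ranked first, since the patient also shows a dilation of comparable magnitude on that region.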
2.2. Construction of a Piecewise Most Similar Atlas Incorporating Selection Weights
For each region Rl, the Kl images of the database having the lowest distances dRl (Ij, P) are selected to build the piecewise most similar atlas and are denoted {Ĩl,n}n∈[1…Kl]. Further, we associate each image Ĩl,n with a selection weight αl,n, based on dRl (Ĩl,n, P), that reflects its relative degree of similarity to P on Rl. To compute αl,n, we used a Gaussian kernel, i.e. αl,n = Gμ,σ(dRl (Ĩl,n, P)), as it strongly penalizes very large distances. The Gaussian can be centered either on zero, or on the minimum distance found for the region Rl (we chose the second solution). The standard deviation σ controls the rejection of images with a large distance and was computed from the whole distribution of distances on Rl. The weights are then normalized for each region, so that for each l, $\sum_{n=1}^{K_l} \bar{\alpha}_{l,n} = 1$, where ᾱl,n denotes the normalized selection weight. In addition, we also consider spatial weights to allow a smooth transition when interpolating between the regions Rl in the construction of the piecewise atlas. The spatial weight of the region Rl at location x is defined as wl(x) = 1/(1+βdist(x,Rl)) where dist(x,Rl) refers to the minimal distance to Rl at location x. It is then normalized into w̄l(x) so that $\sum_{l=1}^{L} \bar{w}_l(x) = 1$.
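The two sets of weights can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not specify how σ is derived from the distribution of distances, so the population standard deviation is used here as one possible choice, and the value of β and all distances are hypothetical.

```python
import math
import statistics

def selection_weights(distances, sigma):
    """Gaussian weights centred on the minimum distance, normalised to sum to 1."""
    mu = min(distances)
    raw = [math.exp(-((d - mu) ** 2) / (2.0 * sigma ** 2)) for d in distances]
    total = sum(raw)
    return [r / total for r in raw]

def spatial_weights(dist_to_regions, beta):
    """Normalised w_l(x) = 1 / (1 + beta * dist(x, R_l)) at a single voxel x."""
    raw = [1.0 / (1.0 + beta * d) for d in dist_to_regions]
    total = sum(raw)
    return [r / total for r in raw]

# Distances d_Rl(Ij, P) of all database images on one region (hypothetical values).
all_distances = [0.1, 0.2, 0.4, 0.9, 1.5]

# Assumption: sigma is the population standard deviation of all distances on R_l.
sigma = statistics.pstdev(all_distances)

# Keep the K_l = 3 images with the lowest distances and weight them.
selected = sorted(all_distances)[:3]
alpha = selection_weights(selected, sigma)

# Spatial weights at a voxel lying inside region 0 and 4 units away from region 1.
w = spatial_weights([0.0, 4.0], beta=0.5)
```

With the Gaussian centred on the minimum distance, the most similar image always receives the largest weight, and the remaining weights decrease monotonically with the distance.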
Construction of the Piecewise Most Similar Image
The construction process may be seen as a classical atlas construction [11] where the images have varying weights depending on the spatial location of each voxel (w̄l(x)) and on the selection distances (ᾱl,n). We iterate over the following steps (M̃0 = M):
1. Register the images Ĩl,n on the current reference M̃k. This step provides affine transformations AĨl,n←M̃k and non-linear transformations TĨl,n←M̃k.
2. Compute the new average image Mk+1 by interpolating the intensities of the warped Ĩl,n using the two sets of weights w̄l,k(x) and ᾱl,n.
3. Compute an average diffeomorphism T̄k from the TĨl,n←M̃k and the weights.
4. Apply $\bar{T}_k^{-1}$ to Mk+1 to get the new reference $\tilde{M}_{k+1}$.
5. Update the regions of interest by applying $\bar{T}_k^{-1}$ to Rl,k: $R_{l,k+1} = R_{l,k} \circ \bar{T}_k^{-1}$, and update the spatial weights w̄l,k+1(x) accordingly.
This process is similar to [9]. However, it is much more general as it allows the combination of several images for each region Rl. This is achieved by the following equations for steps 2 and 3. First, the intensities are interpolated by:
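(2)  $M_{k+1}(x) = \sum_{l=1}^{L} \bar{w}_{l,k}(x) \sum_{n=1}^{K_l} \bar{\alpha}_{l,n} \big(\tilde{I}_{l,n} \circ A_{\tilde{I}_{l,n} \leftarrow \tilde{M}_k} \circ T_{\tilde{I}_{l,n} \leftarrow \tilde{M}_k}\big)(x)$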
The inner term (sum over n) computes a weighted average of the selected images for a region Rl, while the outer term uses the spatial weights to combine the contributions from each region Rl. Similarly, in step 3, we compute an average polydiffeomorphism T̄k using the Log-Euclidean framework [12] (see footnote 1). This framework ensures that the result remains on the manifold of diffeomorphisms and leads to an autonomous Ordinary Differential Equation that can be easily integrated: $\dot{x}(t) = \sum_{l=1}^{L} \bar{w}_{l,k}(x(t)) \sum_{n=1}^{K_l} \bar{\alpha}_{l,n} \log\big(T_{\tilde{I}_{l,n} \leftarrow \tilde{M}_k}\big)(x(t))$.
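The intensity interpolation of step 2 can be sketched as a doubly weighted average. In this minimal illustration, the selected images are assumed to be already resampled on the current reference and flattened to lists of voxel intensities; the data layout and function name are hypothetical.

```python
def fuse_intensities(warped, alpha, w):
    """
    Doubly weighted average of warped images.

    warped[l][n][x]: intensity at voxel x of the n-th image selected for
                     region l, already resampled on the current reference
    alpha[l][n]:     normalised selection weights (sum to 1 over n)
    w[l][x]:         normalised spatial weights (sum to 1 over l)
    """
    n_vox = len(w[0])
    fused = [0.0] * n_vox
    for l, images in enumerate(warped):
        for x in range(n_vox):
            # Inner term: weighted average of the images selected for region l.
            regional = sum(a * img[x] for a, img in zip(alpha[l], images))
            # Outer term: spatial blending of the regional averages.
            fused[x] += w[l][x] * regional
    return fused

# Toy example (hypothetical): 2 regions, 2 selected images each, 3 voxels.
warped = [
    [[100.0, 100.0, 100.0], [200.0, 200.0, 200.0]],   # region 0
    [[0.0, 0.0, 0.0], [50.0, 50.0, 50.0]],            # region 1
]
alpha = [[0.75, 0.25], [0.5, 0.5]]
w = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]

fused = fuse_intensities(warped, alpha, w)
print(fused)  # [125.0, 75.0, 25.0]
```

The middle voxel, equally influenced by both regions, takes the mean of the two regional averages, which illustrates the smooth transition provided by the spatial weights.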
Construction of the Associated Segmentation
After building the piecewise most similar template, we need to compute its associated segmentation from the delineations of the selected images. The images of our database have been delineated for a clinical purpose, and the contours of some structures are missing in some images. To deal with this difficulty, we chose to define one region Rl for each anatomical structure in the construction of the template image.
The construction of the associated segmentation is then achieved in two steps. First, we compute a probability map for each structure independently using the selected manual segmentations and the selection weights ᾱl,n. Then, we assign each voxel of the template image to the structure that has the highest probability.
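These two steps can be sketched as follows, assuming binary masks already warped onto the template and flattened to lists. The text does not say how background voxels are handled, so the threshold below which a voxel is left unlabelled is an assumption of this sketch, as are the function name and the toy data.

```python
def fuse_labels(masks, alpha, background_threshold=0.5):
    """
    masks[s][n][x]: binary manual segmentation of structure s from the n-th
                    selected image, warped onto the template (1 = inside)
    alpha[s][n]:    normalised selection weights for structure s
    Returns one label per voxel (0 = background, s + 1 = structure s)
    and the per-structure probability maps.
    """
    n_struct = len(masks)
    n_vox = len(masks[0][0])
    # Step 1: one weighted probability map per structure, computed independently.
    probs = [
        [sum(a * m[x] for a, m in zip(alpha[s], masks[s])) for x in range(n_vox)]
        for s in range(n_struct)
    ]
    # Step 2: assign each voxel to the structure with the highest probability.
    labels = []
    for x in range(n_vox):
        best = max(range(n_struct), key=lambda s: probs[s][x])
        # Assumption: voxels with a low maximal probability stay background.
        labels.append(best + 1 if probs[best][x] >= background_threshold else 0)
    return labels, probs

# Toy example (hypothetical): 2 structures, 2 selected images each, 5 voxels.
masks = [
    [[1, 1, 0, 0, 0], [1, 1, 1, 0, 0]],   # structure 0
    [[0, 0, 1, 1, 0], [0, 1, 1, 1, 0]],   # structure 1
]
alpha = [[0.5, 0.5], [0.75, 0.25]]

labels, probs = fuse_labels(masks, alpha)
print(labels)  # [1, 1, 2, 2, 0]
```

Computing each probability map independently means that a structure missing from some database images only affects its own map, which matches the choice of one region per anatomical structure.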
3. Evaluation
We evaluated the proposed framework with N = 105 CT images of the head and neck region. On these images, the CTVs and OARs were manually delineated following the guidelines in [13]. The structures involved are the lymph node levels II, III and IV (CTVs), the parotids, the spinal cord, and the brainstem (OARs). We performed a Leave-One-Out analysis, each patient being successively excluded from the database and delineated with each of the following three atlases built from the N − 1 remaining images: (1) AVE: average atlas built as in [2], (2) PW_1: piecewise most similar atlas built with Kl = 1 image for each region, and (3) PW_10: piecewise most similar atlas built with Kl = 10 images for each region. As the registration algorithm, we used the framework described in [2].
3.1. Qualitative Results
Fig. 1 shows the three different atlases (b,c,d) computed for a given patient (a) whose neck flexion is above average. The spinal cord contours show that the average atlas (b) and the piecewise atlas PW_1 (c) both have a relatively low neck flexion, whereas the neck flexion of PW_10 (d) looks much more similar to that of the patient (see arrows). When registering head and neck images, a different neck flexion between the atlas and the patient is a common issue, often leading to registration errors and low segmentation accuracy. Therefore, our method’s ability to provide a correct neck flexion may increase segmentation quality.
Fig. 2 illustrates some qualitative segmentation results on the parotids and on the lymph node levels III-IV using the three atlases. Compared to the manual contours (a), the automatic contours provided by the average atlas (AVE) (b) are too large, which was already observed in [2]. As mentioned in [9], PW_1 (image (c)) reduces the over-segmentation. However, it was built from only one image for each region, and it is therefore likely to be biased by the inter-expert variability of delineation. The two small arrows on image (c) show the influence of local specificities of the selected segmentations on each region. Moreover, by construction, PW_1 segmentations can present some discontinuities. For instance, the large arrow on image (c) shows lymph node levels III and IV that are not connected, which is anatomically inconsistent. The automatic contours obtained with PW_10 (image (d)) are much less dependent on the inter-expert variability as 10 segmentations were fused for each structure. Moreover, the obtained contours are closer to the manual contours than those from both AVE (b) and PW_1 (c), which should shorten the correction time for the clinician.
3.2. Quantitative Results
We now compare the performance of the three atlases AVE, PW_1 and PW_10 in terms of segmentation accuracy. To this end, sensitivity and specificity were averaged for each structure over all the Leave-One-Out tests. The results are presented in Fig. 3. First, as observed in [9], PW_1 shows an improvement of the specificity with respect to AVE, which is related to the reduction of the over-segmentation. However, this improvement is achieved at the expense of the sensitivity. With PW_10, the specificity is even higher than with PW_1 and the decrease in sensitivity is smaller. For all structures, we also performed paired t-tests on the Dice values for each pair of methods. Whereas PW_1 has significantly lower Dice than AVE and PW_10 (P < 0.05), the differences between AVE and PW_10 are not statistically significant (P > 0.05), illustrating that the overall overlap is similar while PW_10 significantly reduces the over-segmentation. Therefore PW_10 combines the advantages of both PW_1 (avoiding over-segmentation) and AVE (avoiding errors due to inter-expert variability).
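For reference, the three overlap measures used above can be computed from binary masks as in the sketch below. It assumes flattened, non-empty masks containing both foreground and background voxels, and it counts true negatives over all supplied voxels; the text does not state over which volume specificity was evaluated, so that choice is an assumption, as are the toy masks.

```python
def overlap_metrics(auto, manual):
    """Sensitivity, specificity and Dice between two binary masks (flattened)."""
    tp = sum(1 for a, m in zip(auto, manual) if a and m)
    fp = sum(1 for a, m in zip(auto, manual) if a and not m)
    fn = sum(1 for a, m in zip(auto, manual) if not a and m)
    tn = sum(1 for a, m in zip(auto, manual) if not a and not m)
    sensitivity = tp / (tp + fn)            # fraction of the manual contour recovered
    specificity = tn / (tn + fp)            # fraction of the background preserved
    dice = 2.0 * tp / (2.0 * tp + fp + fn)  # overall overlap
    return sensitivity, specificity, dice

# Toy masks (hypothetical): 1 = inside the structure.
manual = [0, 0, 1, 1, 1, 0, 0, 0]
auto = [0, 1, 1, 1, 0, 0, 0, 0]

sensitivity, specificity, dice = overlap_metrics(auto, manual)
```

Over-segmentation increases the number of false positives, which lowers specificity while leaving sensitivity unchanged; this is why the two measures are reported separately in addition to Dice.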
4. Conclusion
We presented a new approach to build a piecewise most similar atlas to the patient. We first introduced an efficient criterion to select among a database the images that are the most similar to the patient for each region. This criterion is well adapted to model the impact of the tumor on the CTVs and the OARs as it is based on the local degree of contraction/dilation. Then, we presented a novel approach to build from the selected images a piecewise atlas and its associated segmentation. We applied our algorithm with 105 CT images of the head and neck region. The proposed approach was compared to other atlas-based approaches (single average atlas and piecewise most similar atlas built from a single image per region). We showed that our approach combines the advantages of both techniques. It indeed enables reducing the over-segmentation observed with the average atlas, and it is less dependent on the inter-expert segmentation variability than the piecewise atlas built from a single image per region.
The number Kl of images selected for each region plays an important role, as does the standard deviation σ of the Gaussian in the selection weights. Here we arbitrarily used Kl = 10, mainly for computational reasons, but we will study the influence of these two parameters to find the best trade-off between the average atlas (Kl = N, infinite σ) and the method proposed in [9] (Kl = 1). Future work will also include a separate evaluation of the selection criterion and the piecewise atlas construction method. Finally, we will assess our framework on different groups of patients, e.g. on corpulent patients or patients with high neck flexion for which the average atlas provides low segmentation accuracy.
Acknowledgments
This work was undertaken in the framework of the MAESTRO project (IP CE503564) funded by the European Commission, and was also partially funded by ANRT. The authors gratefully acknowledge Pr. V. Grégoire for the manually delineated database and for his expertise.
Footnotes
1. The deformations in the head and neck region are close enough to the identity, ensuring that the computed logarithms are correct, as specified by Arsigny et al. [12].
References
1. Bondiau PY, Malandain G, Chanalet S, et al. Atlas-based automatic segmentation of MR images: validation study on the brainstem in radiotherapy context. IJROBP. 2005;61(1):289–98. doi: 10.1016/j.ijrobp.2004.08.055.
2. Commowick O, Grégoire V, Malandain G. Atlas-based delineation of lymph node levels in head and neck computed tomography images. Rad Oncol. 2008;87(2):281–289. doi: 10.1016/j.radonc.2008.01.018.
3. Blezek DJ, Miller JV. Atlas stratification. MedIA. 2007;11(5):443–57. doi: 10.1016/j.media.2007.07.001.
4. Wu M, Rosano C, Lopez-Garcia P, et al. Optimum template selection for atlas-based segmentation. Neuroimage. 2007;34(4):1612–8. doi: 10.1016/j.neuroimage.2006.07.050.
5. Commowick O, Malandain G. Efficient selection of the most similar image in a database for critical structures segmentation. Proc MICCAI’07, Part II Volume 4792 of LNCS. 2007:203–210. doi: 10.1007/978-3-540-75759-7_25.
6. Aljabar P, Heckemann RA, et al. Multi-atlas based segmentation of brain images: atlas selection and its effect on accuracy. Neuroimage. 2009;46(3):726–38. doi: 10.1016/j.neuroimage.2009.02.018.
7. Isgum I, Staring M, Rutten A, et al. Multi-atlas-based segmentation with local decision fusion–application to cardiac and aortic segmentation in CT scans. IEEE TMI. 2009;28(7):1000–10. doi: 10.1109/TMI.2008.2011480.
8. van Rikxoort EM, Isgum I, et al. Adaptive local multi-atlas segmentation: application to heart segmentation in chest CT scans. MedIA. 2010;14(1):39–49. doi: 10.1016/j.media.2009.10.001.
9. Commowick O, Warfield SK, Malandain G. Using Frankenstein’s creature paradigm to build a patient specific atlas. Proc MICCAI’09, Part II Volume 5762 of LNCS. 2009:993–1000. doi: 10.1007/978-3-642-04271-3_120.
10. Shi F, Yap PT, Fan Y, et al. Construction of multi-region-multi-reference atlases for neonatal brain MRI segmentation. Neuroimage. 2010;51(2):684–93. doi: 10.1016/j.neuroimage.2010.02.025.
11. Guimond A, Meunier J, Thirion JP. Average brain models: A convergence study. CVIU. 2000;77(2):192–210.
12. Arsigny V, Commowick O, Pennec X, Ayache N. A log-euclidean framework for statistics on diffeomorphisms. Proc MICCAI’06 Volume 4190 of LNCS. 2006:924–931. doi: 10.1007/11866565_113.
13. Grégoire V, Levendag P, Ang KK, et al. CT-based delineation of lymph node levels and related CTVs in the node-negative neck: DAHANCA, EORTC, GORTEC, NCIC, RTOG consensus guidelines. Rad Oncol. 2003;69(3):227–236. doi: 10.1016/j.radonc.2003.09.011.