Medical Physics. 2008 Mar 10;35(4):1251–1260. doi: 10.1118/1.2839120

Quantifying the accuracy of automated structure segmentation in 4D CT images using a deformable image registration algorithm

Krishni Wijesooriya 1,a), E Weiss 2, V Dill 2, L Dong 3, R Mohan 3, S Joshi 4, P J Keall 5
PMCID: PMC2811553  PMID: 18491517

Abstract

Four-dimensional (4D) radiotherapy is the explicit inclusion of the temporal changes in anatomy during the imaging, planning, and delivery of radiotherapy. One key component of 4D radiotherapy planning is the ability to automatically (“auto”) create contours on all of the respiratory phase computed tomography (CT) datasets comprising a 4D CT scan, based on contours manually drawn on one CT image set from one phase. A tool that can be used to automatically propagate manually drawn contours to CT scans of other respiratory phases is deformable image registration. The purpose of the current study was to geometrically quantify the difference between automatically generated contours and manually drawn contours. 4D CT data sets of 13 patients, each consisting of ten three-dimensional CT image sets acquired at different respiratory phases, were used for this study. Tumor and normal tissue structures [gross tumor volume (GTV), esophagus, right lung, left lung, heart, and cord] were manually drawn on each respiratory phase of each patient. Large deformation diffeomorphic image registration was performed to map each CT set from the peak-inhale respiration phase to the CT image sets corresponding to subsequent respiration phases. The calculated displacement vector fields were used to automatically deform the contours manually drawn on the inhale phase to the other respiratory phase CT image sets. The code was interfaced to a treatment planning system to view the resulting images and to obtain the volumetric, displacement, and surface congruence information; 692 automatically generated structures were compared with 692 manually drawn structures. The auto- and manual methods showed similar trends, with a smaller difference observed between the GTVs than other structures. The auto-contoured structures agree with the manually drawn structures, especially in the case of the GTV, to within published inter-observer variations.
For the GTV, fractional volumes agree to within 0.2±0.1, center of mass displacements agree to within 0.5±1.5 mm, and agreement of surface congruence is 0.0±1.1 mm. The surface congruence between automatic and manual contours for the GTV, heart, left lung, right lung and esophagus was less than 5 mm in 99%, 94%, 94%, 91% and 89%, respectively. Careful assessment of the performance of automatic algorithms is needed in the presence of 4D CT artifacts.

Keywords: automatic segmentation, deformable image registration, image artifacts

INTRODUCTION

The advent of four-dimensional (4D) thoracic computed tomography (CT) scans that create separate CT images at discrete phases of the respiratory cycle1, 2, 3, 4, 5, 6, 7 allows improved beam delivery precision and increased dose conformation.8, 9 When precise geometric knowledge of the target volume is available, 4D radiation therapy planning becomes practical. The number of image sets comprising 4D CT scans involved in 4D treatment planning brings a new challenge to the physicians who need to manually delineate the organs. The problem is twofold: (1) it is very time consuming, and (2) it can lead to intra-observer contouring variations. To reduce the workload of contouring multiple target volumes, different institutions are looking into automating the contouring of all or multiple phases of the respiratory cycle. Various techniques have been investigated for automatic contouring of volumes: maximum intensity projections,10 model-based and interactive auto contouring with an adaptation of deformable models,11, 12, 13 and deformable organisms for automatic segmentation.14 Many groups15, 16, 17, 18, 19, 20 are using deformable image registration models to transfer structures and∕or dose between image sets. Many deformable image registration algorithms are available today, among them fast free-form deformable registration via calculus of variations,21 the “demons” algorithm for deformable image registration,20 voxel similarity image registration,22 multiresolution B-spline image registration,23, 24 and finite element analysis image registration.25, 26 In our study, large deformation diffeomorphic image registration15, 27, 28 was incorporated into our treatment planning system, which allowed us to automatically propagate structures manually drawn on a single CT phase to the other phases comprising the 4D CT scan.

The purpose of the current study was to geometrically quantify the difference between automatically generated contours and manually drawn contours, to determine if auto-contouring methods can be used with confidence in 4D radiotherapy planning.

MATERIALS AND METHODS

This section consists of the following four subsections: 4D CT image acquisition and segmentation, deformable image registration, procedure for auto contouring, and extracting qualitative and quantitative information for the comparison of manual versus automated contours. The details of each subsection are explained below. A diagram of the study scheme is shown in Fig. 1.

Figure 1.


This flow chart shows the work flow of this analysis. Contours were manually drawn by a physician in all phases, for all the volumes in every patient (the manual inhale phase contour is the contour in the inhale phase image and the bottom contour of the bottom exhale phase image; the manual exhale phase contour is the contour on the top exhale phase image). Automatic mapping from the inhale phase to all the other respiratory phases was performed using a deformable image registration algorithm (auto contouring). The resulting automatic contour for the exhale phase is the top contour of the bottom exhale phase image. In this study, we compared the manually drawn contours, such as the contour on the top exhale image, with the automatically drawn contours, such as the top contour on the bottom exhale image.

4D CT image acquisition and segmentation

The 4D CT images from 13 patients used for this study were acquired by M.D. Anderson Cancer Center under IRB approval. Each 4D CT image set consisted of ten three-dimensional (3D) image sets at discrete respiratory phases. Each CT image set consisted of 56–87 slices, each 0.25 cm thick, with a field of view of 50 cm and 512×512 axial resolution. A detailed discussion of how the 4D datasets were acquired and sorted can be found in Ref. 7.

On the CT images of all respiratory phases, the following anatomy was defined manually under the supervision of an experienced physician: gross tumor volume (GTV), right lung, left lung, heart, esophagus, and the spinal cord. Lungs (and GTVs in lungs) were delineated in a lung window [window width 600 HU (Hounsfield units), window level −300 HU]; all other structures, including mediastinal GTVs, were delineated in the thorax window (window width 400 HU, window level 35 HU). A total of 770 structures were manually segmented. Further details on the manual segmentation can be found in Weiss et al.29

Deformable image registration

The deformable image registration algorithm employed in this study (large deformation diffeomorphic image registration) was developed at the University of North Carolina. This algorithm was used to map, with a one-to-one correspondence, all points in the CT image from one respiratory phase (peak inhale) to the corresponding points on the other respiratory phases, accommodating the anatomic deformation caused by respiration.

In this algorithm, a time-indexed transformation, h(x,t): Ω→Ω, maps the coordinate space of each of the respiratory phases of the 4D CT, where Ω is the coordinate space of the peak-inhale CT image. A tissue voxel is considered as a single point in the analysis. By minimizing an energy term subject to an appropriate regularity condition, the algorithm finds h(x,t), the deformation field that brings one image into correspondence with the other:

$$E(h) = \int_{V} \left( I_P(x) - I_T(h(x,t)) \right)^2 \, dx + E_{\mathrm{reg}}(h).$$

Here, IP(x) and IT(x) are the voxel intensities of the two images, and Ereg is the regularity term that quantifies how severely h deforms the image, accommodating large deformations. A time parameter t is introduced, and the function h(x,t) is defined such that h(x,0)=x and h(x,tfinal) is the desired deformation field that aligns IP and IT; h is constructed as an integral of a time-varying velocity field v(x,t) forward in time. Ereg is defined as

$$E_{\mathrm{reg}}(h) = \int_{V,t} \left\| L_{\mathrm{reg}}\, v(x,t) \right\|^2 \, dx \, dt.$$

It has been shown that, with proper conditions on Lreg, this method produces a diffeomorphism (i.e., a mapping that is differentiable with a differentiable inverse). Physically, this means that each position x in the planning image corresponds to a unique position in the treatment image and that no tearing of tissue occurs. This will lead to the same results irrespective of warping image B to image A or image A to image B.
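The construction of h by integrating a velocity field forward in time can be illustrated with a one-dimensional toy example. This is only a sketch under stated assumptions: the velocity function, the forward-Euler scheme, the step count, and all names are hypothetical and are not taken from the registration code used in the study. It shows that starting from h(x,0)=x and following a smooth velocity field keeps the ordering of the points, which is the one-dimensional analogue of a mapping without folding or tearing.

```python
import math

def velocity(x, t):
    """Hypothetical smooth velocity field on [0, 1]; it vanishes at both ends."""
    return 0.3 * math.sin(math.pi * x)

def integrate_flow(points, t_final=1.0, n_steps=100):
    """Forward-Euler integration of dh/dt = v(h, t), starting from h(x, 0) = x."""
    dt = t_final / n_steps
    h = list(points)
    for step in range(n_steps):
        t = step * dt
        h = [hx + dt * velocity(hx, t) for hx in h]
    return h

# Regular grid of 21 points on [0, 1]
grid = [i / 20 for i in range(21)]
h_final = integrate_flow(grid)

# The ordering of the points is preserved, so the 1D map has no folding
is_monotone = all(a < b for a, b in zip(h_final, h_final[1:]))
print(is_monotone)
```

Small time steps matter here: each Euler step is itself an order-preserving map only when the step size times the velocity gradient stays well below one.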

The velocity field is computed at each iteration from the differential equation

$$\left( I_P(x) - I_T(h(x,t)) \right) \nabla I_T(h(x,t)) = L\, v(x,t),$$

where L is motivated by the Navier–Stokes equations for compressible fluid flow with negligible inertia. The fluid has a non-physical property that it resists compression (and dilation) inelastically, so that volume can be permanently added or removed in response to image forces.

$$L v = \alpha \nabla^2 v + \beta \nabla (\nabla \cdot v) + \gamma v.$$

At each point, the force is along the direction of greatest change in image intensity of IT(h(x,t)). The γ term ensures that L is a positive definite differential operator, and therefore invertible. One can find detailed information regarding this algorithm in Refs. 15, 27, 28, 30.

This algorithm uses a multiscale approach that initializes the fine (voxel) scale registration with the up-sampled correspondence computed at a coarser scale level. The finer scale levels only need to account for the residual from coarser scale levels and are expected to require far fewer iterations to converge. For this study the coarse, medium, and fine iteration numbers were set to 500, 25, and 1. For a single minimization with these iteration numbers on an image pair of 512×512×64 bits, the program typically takes about 12–15 min for the image set of one respiratory phase.
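The idea of initializing a finer level with the up-sampled result of a coarser level can be sketched in one dimension. The function below is a hypothetical illustration and not the study's implementation: it assumes linear interpolation, a refinement factor of two, and displacements expressed in voxel units, so that values are rescaled by the same factor when moving to the finer grid.

```python
def upsample_displacement(coarse, factor=2):
    """Linearly interpolate a 1D displacement field (voxel units) onto a grid
    `factor` times finer, rescaling displacements to fine-grid voxel units.
    Assumes at least two coarse samples."""
    n_fine = (len(coarse) - 1) * factor + 1
    fine = []
    for j in range(n_fine):
        pos = j / factor                      # position in coarse index units
        i0 = min(int(pos), len(coarse) - 2)   # left neighbour on the coarse grid
        w = pos - i0                          # interpolation weight of the right neighbour
        value = (1.0 - w) * coarse[i0] + w * coarse[i0 + 1]
        fine.append(factor * value)
    return fine

# Hypothetical coarse-level displacements, used to start the next finer level
coarse_field = [0.0, 1.0, 0.5]
fine_init = upsample_displacement(coarse_field)
print(fine_init)
```

Starting the finer level from such an initial guess leaves only the residual misalignment to be resolved there, which is consistent with the small iteration counts quoted for the medium and fine levels.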

The output of the deformable registration is a 3D vector field at the resolution of the CT image. A vector originates from the center of the CT voxel in the source image, and points to some position in space in the target image which is decoupled from the center of the target image voxels. Thus if, e.g., an organ is compressed in the target relative to the source image, the vector’s density would appear higher in that region of the target image.

Procedure for auto-contouring

Having computed the transformation h that maps the peak-inhale CT onto each of the respiratory phases of the 4D CT, the segmentation for each of the remaining respiratory phases is accomplished by applying the transformation to the physician-drawn contours on the peak-inhale CT. The transformation h thus quantifies the organ motion and the fine-featured organ shape changes that occur during the breathing cycle. Such a 3D transformation vector matrix, h, is superimposed on the inhale phase image in Fig. 2. The top left and top right images of Fig. 2 are the inhale and exhale phase images, while the bottom left image shows the transformation vector matrix superimposed on the inhale phase image. The bottom right image gives a 3D view of the vector field. Each source voxel has a vector drawn showing the required displacement in 3D. The color code shows the magnitude of the displacement, orange representing larger and blue representing smaller displacements.

Figure 2.


Visualization of a transformation vector matrix. The top left and top right images are the inhale and exhale phase images, while the bottom left image shows the transformation vector matrix superimposed on the inhale phase image. The bottom right image gives a 3D view of the vector field. Each voxel has a vector drawn showing the required displacement in 3D. The length of the arrow shows the magnitude of the displacement.

The deformable image registration code was interfaced to a commercial treatment planning system (Pinnacle v7.7, Philips Medical Systems, Milpitas, CA) using the scripting capabilities available within the planning system. The 3D structures can be defined in a 3D triangular mesh format. Each vertex of the triangular mesh manually segmented on the inhale phase was displaced by the interpolated vector for that point as computed by the deformable image registration algorithm. The 3D deformation vector points from the center of the target voxel to some 3D point in the source image. The deformation at points on the mesh of the target image is determined by a three-way interpolation.
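The vertex displacement step can be sketched as follows. The sketch assumes that the "three-way interpolation" is trilinear interpolation between the eight voxel centers surrounding a vertex, which the text does not state explicitly; the data layout, function names, and the toy field are hypothetical and unrelated to the Pinnacle scripting interface.

```python
def trilinear(field, x, y, z):
    """Interpolate a vector field stored as field[i][j][k] = (dx, dy, dz),
    sampled at integer voxel coordinates, at the continuous point (x, y, z)."""
    nx, ny, nz = len(field), len(field[0]), len(field[0][0])
    # Index of the lower corner of the enclosing cell, clamped to the grid
    i = min(max(int(x), 0), nx - 2)
    j = min(max(int(y), 0), ny - 2)
    k = min(max(int(z), 0), nz - 2)
    fx, fy, fz = x - i, y - j, z - k
    out = [0.0, 0.0, 0.0]
    for di, wx in ((0, 1 - fx), (1, fx)):
        for dj, wy in ((0, 1 - fy), (1, fy)):
            for dk, wz in ((0, 1 - fz), (1, fz)):
                vec = field[i + di][j + dj][k + dk]
                for c in range(3):
                    out[c] += wx * wy * wz * vec[c]
    return tuple(out)

def deform_vertices(vertices, field):
    """Displace each mesh vertex by the interpolated vector at its position."""
    moved = []
    for (x, y, z) in vertices:
        dx, dy, dz = trilinear(field, x, y, z)
        moved.append((x + dx, y + dy, z + dz))
    return moved

# Toy 2x2x2 field: the z displacement grows linearly with x (hypothetical data)
field = [[[(0.0, 0.0, float(i)) for k in range(2)] for j in range(2)]
         for i in range(2)]
mesh_vertices = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (1.0, 0.0, 1.0)]
deformed = deform_vertices(mesh_vertices, field)
print(deformed)
```

Because only the vertices move and the triangle connectivity is unchanged, the propagated structure keeps the same mesh topology as the manually drawn one.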

Extracting information for the comparison of manual versus automated contours

An individual structure analysis was performed on the manually contoured volumes, which characterized the data sample under study. This enabled us to see the range of motion of the GTV, diaphragm, etc., as well as the changes in volumes from phase to phase. These data have already been published in Ref. 29. A code was written in IDL (Interactive Data Language, ITT) to extract the qualitative and quantitative information regarding the comparison of automatic and manual contours. The steps of the analysis are described below.

Qualitative analysis of the agreement of manual and auto contours

(a) A physician evaluated the agreement between the manually and automatically drawn contour sets by going through all contours and all slices of each patient in the expiration phase. With end inspiration as the reference phase in this study, the expiration phase was expected to show the largest differences between manual and auto contours. The clinical acceptability was rated according to the following score: (i) manual correction necessary in <10% of all contours of the respective structure, (ii) corrections in 10%–25% of all contours, (iii) corrections in 26%–50% of all contours, and (iv) corrections in >50% of all contours. In addition, the direction of major discrepancies between manual and auto contours was analyzed.

(b) As a measure for clinical usefulness of auto contours, intensity-modulated radiotherapy plans were generated for both auto and manual contours in mid expiration and end expiration. Plans were compared for their coverage of manual contours which were used as a benchmark representing the current contouring standard. To estimate the impact of contour differences on the planning results, the following arbitrarily chosen DVH (dose volume histogram) parameters were analyzed: differences in PTV D98%⩾1 Gy, differences in lungs V20 Gy⩾2%, differences in heart V40 Gy⩾2%, differences in esophagus PRV (planning organ at risk volume) V55 Gy⩾2%, differences in cord PRV D1%⩾1 Gy.

The quantitative analysis was divided into several sections

(a) Volume comparison: Volumes of automated and manual contours were compared for each patient, structure, and phase.

(b) Center of mass (COM) comparison: COM coordinates in three dimensions for the manual and auto contours were compared for each patient, structure, and phase.

(c) Surface congruence analysis: A more detailed comparison of the two volumes of interest was performed using a surface congruence analysis, in the following steps. First, radial rays were drawn through the triangular mesh surfaces of the volumes, with the COM of the manually contoured volume as the origin of the coordinate system. Spherical polar coordinates r,θ,φ were used, with a non-equi-spaced grid of θ,φ to compensate for the extreme solid angle changes near the poles. The radial distance is the distance between the origin and the intersection of the ray with the surface of the object furthest from the origin (only a single surface for a given volume is considered in the case of multiple volumes). For each point θ,φ, the difference in r between the manually and automatically drawn contours was obtained. The normal distance between the two subplanes was taken as the radial difference of the volumes at any given grid point. The difference between the two surfaces was sampled along 20 polar angles θ and 20 azimuthal angles φ, giving a total of 400 data points for each pair of structures. The systematic uncertainty of the full procedure, i.e., from the creation of the meshes in the treatment planning system (TPS) to the calculation of r in the equi-spaced grid of θ and φ, was estimated to be 0.25 mm using a sphere of known radius in the TPS. Of the 770 manually segmented structures, 78 were in the inhale phase and were used to generate automatic contours. Thus 692 manual and automatic structures were compared (the heart was not segmented for patient seven due to the limited field of view of the scan). For the normal anatomy segmented, the complete structures were not always within the field of view.
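The sampling in step (c) can be sketched as below. To keep the example short, each surface is represented by a hypothetical function r(θ,φ) that returns the radial distance from the COM of the manual volume; in the actual analysis this value would come from intersecting rays with the triangular meshes. The particular non-equi-spaced grid (equal steps in cos θ, so that every sample covers the same solid angle) is an assumption, since the text does not specify the spacing used.

```python
import math

def angular_grid(n_theta=20, n_phi=20):
    """Directions with equal steps in cos(theta), so each sample covers the
    same solid angle (one possible non-equi-spaced theta grid; an assumption)."""
    thetas = [math.acos(1 - 2 * (i + 0.5) / n_theta) for i in range(n_theta)]
    phis = [2 * math.pi * (j + 0.5) / n_phi for j in range(n_phi)]
    return [(th, ph) for th in thetas for ph in phis]

def surface_congruence(r_manual, r_auto, tolerance_mm=5.0):
    """Radial differences (auto minus manual) on the grid, with summary values."""
    diffs = [r_auto(th, ph) - r_manual(th, ph) for th, ph in angular_grid()]
    mean = sum(diffs) / len(diffs)
    std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))
    within = sum(abs(d) <= tolerance_mm for d in diffs) / len(diffs)
    return diffs, mean, std, within

# Hypothetical surfaces (mm): a 30 mm sphere for the manual contour and a
# shape elongated by up to 2 mm along the polar axis for the auto contour
manual = lambda th, ph: 30.0
auto = lambda th, ph: 30.0 + 2.0 * math.cos(th) ** 2

diffs, mean_diff, std_diff, frac_within = surface_congruence(manual, auto)
print(len(diffs), round(mean_diff, 3), round(std_diff, 3), frac_within)
```

The 20×20 grid gives the 400 samples per structure pair mentioned above, and the fraction within the tolerance corresponds to the kind of percentage reported for the 5 mm limit.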

RESULTS

Qualitative comparison of manual versus automated contours

Qualitative comparison between the manually and automatically drawn contours was performed using visual inspection and the results of treatment planning based on manual and auto contours.

Acceptance rate

The clinical acceptance rate, expressed as the need for manual correction from the physician’s view, is shown in Table 1. The need for manual correction varied between organs and was largest for GTVs, heart, and esophagus. The required contour modifications, particularly for GTVs and esophagus, had no spatial preference. Most changes for lungs and heart were at the inferior borders of these structures, where most of the respiration-related motion is observed. Part of the required modifications at the inferior edges of these structures were deletions of automatically contoured slices, resulting from an exaggeration of these contours in expiration phases. The magnitude of the required corrections varied between the analyzed structures. Corrections in the axial plane were mostly less than 2.5 mm except for GTV and esophagus. GTV corrections in the axial plane were on average always less than 5 mm, and for three patients they were less than 2.5 mm. Corrections in the superior-inferior direction for most organs affected several slices, e.g., in the form of contour deletions at the inferior borders, and were therefore frequently larger than 1 cm. Corrections for the GTV in the superior-inferior direction were on average less than 5 mm. However, three patients required corrections exceeding 5 mm.

Table 1.

Physician evaluation of the required rate and direction of manual corrections of auto contours.

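N = 13 patients. Entries give the number of patients for whom manual correction was necessary in the stated percentage of contours; a hyphen marks a cell left empty in the source table.
Structure | <10% | 10%–25% | 26%–50% | >50% | Major direction of discrepancy
GTV | 1 | 1 | 8 | 3 | All directions
Lung right | 6 | 6 | 1 | - | Inferior
Lung left | 6 | 7 | - | - | Inferior
Heart | 2 | 2 | 8 | - | Inferior
Esophagus | 1 | 6 | 5 | 1 | All directions
Spinal cord | 13 | - | - | - | Transversal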

Results from plan comparison

The impact of contour differences on the planning results is given in Ref. 31. The dosimetric impact of contour differences was low: among the 12 of the 13 patients of the present study who were included in the planning study, variations larger than 1 Gy or 2% (depending on the respective DVH parameter) were found for only seven of all normal tissue structures in end expiration. Differences in PTV D98%⩾1 Gy were found in five of the 12 patients of the planning study. Therefore visual inspection is particularly important for the PTV and for normal tissue structures close to the PTV.

The results of manual versus automated contouring comparison for a single patient, using axial, sagittal and coronal views from the TPS, are shown in Fig. 3. Agreement is observed in all six volumes between the manually (colorwash contours) and automatically drawn contours (colored contours other than red). Notice that even the diaphragm, which has a large displacement going from inhale to exhale, is very well reproduced in this patient. Agreement of manual (shown in yellow colorwash) and automatic GTV contours (shown in black contour) shows that this deformable image registration algorithm is capable of reproducing volumes that are not only different in size but also quite deformed in shape.

Figure 3.


Manual vs automated contouring results for a single patient. Axial, sagittal, and coronal views of an exhale phase image are shown with the manually drawn exhale phase contours in colorwash. Manually drawn inhale phase contours are also superimposed on the exhale phase image; these are the contours shifted from the colorwash ones. The most prominent shifts are for the lungs, GTV, and heart. Note the very good agreement of the colorwash contours with the auto contours propagated from inhale to exhale, especially for the GTV, lungs, and heart.

Figure 4 shows a comparison of six manually drawn volumes with the automatically drawn counterparts in mesh format for visual inspection. This is accomplished by IDL software. The left hand side of Fig. 4 is a comparison of manually drawn contours of inhale and exhale phase volumes. The clear distinction of the displacement of GTV, lung, heart, and esophagus volumes is noted from inhale to exhale, while the cord has no respiration related motion in intra-fraction radiotherapy. The right hand side of Fig. 4 is a comparison of manually and automatically drawn contours for the exhale phase. Agreement of automatically drawn and manually drawn volumes for the GTV (nicely mixed), both lungs (outlines of the volumes agree very well), esophagus, cord, and heart is evident. Auto contouring requires the coverage of full organ volumes by 3D CT images for all respiratory phases. Artifacts might therefore arise in automatically contoured volumes, if the field of view is not large enough to cover the full organ volume in both image sets used for manual and automated contouring. Such an effect is seen in the most inferior slice of the heart, and lungs. Largest discrepancies were most frequently observed between manual and auto contours in these most inferior slices.

Figure 4.


Left: Manually contoured volumes of interest for inhale and exhale phases. Right: Manually and automatically segmented volumes of interest for exhale.

Quantitative comparison of manual versus automated contours

Individual structure information from this study, which gives the absolute position information, and the structure comparison that gives the relative information of manual contours for different phases with respect to the inhale phase, for each patient and each volume, are given in another reference by the same group.29 The structure comparison of automated versus manual contours, for each patient, volume, and phase, is divided into three categories. The observables used are volume, COM coordinates in three dimensions, and surface congruence.

Volume comparison

Differences of fractional volumes (volume as a fraction of the inhale phase volume) between automatically and manually drawn contours for the GTV are shown in Fig. 5. The plot shows the fractional volume difference as a function of respiratory phase for 13 patients. The worst case, the patient whose images had the most artifacts, is shown with red open triangles. The fractional volume difference shows no preference for either positive or negative values. Automatic contouring gave volumes that vary more smoothly with phase than manual contouring did. In general, automatic contouring and manual contouring are in agreement for all the volumes considered. Fractional volumes agree to within 0.2±0.1 for the GTV case.
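The fractional volume difference plotted against phase can be computed as in the following minimal sketch. The volumes are invented numbers for illustration only, and the function names are hypothetical; the first entry of each list is taken to be the inhale phase, where the automatic and manual contours coincide because the automatic contours are propagated from the manual inhale contour.

```python
def fractional_volumes(volumes_by_phase):
    """Volumes expressed as a fraction of the first (inhale) phase volume."""
    reference = volumes_by_phase[0]
    return [v / reference for v in volumes_by_phase]

def fractional_volume_difference(auto_volumes, manual_volumes):
    """Auto minus manual fractional volume, phase by phase."""
    auto_frac = fractional_volumes(auto_volumes)
    manual_frac = fractional_volumes(manual_volumes)
    return [a - m for a, m in zip(auto_frac, manual_frac)]

# Hypothetical GTV volumes in cm^3; index 0 is the inhale phase
manual_cm3 = [20.0, 19.0, 18.0, 17.0]
auto_cm3 = [20.0, 19.5, 18.5, 18.0]

diff = fractional_volume_difference(auto_cm3, manual_cm3)
print([round(d, 3) for d in diff])
```

Normalizing by the inhale volume removes the dependence on tumor size, so patients with very different GTV volumes can share one plot.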

Figure 5.


Fractional volume (volume as a fraction of the inhale phase volume) differences between manually and automatically drawn contours for the GTV. The plot shows the differences of fractional volume as a function of respiratory phase for 13 patients.

Center of mass comparison

Table 2 summarizes the differences of COM displacements (with respect to the COM of the inhale phase) between manual and automated contours for each patient, averaged over nine phases, for the GTV, cord, heart, left lung, right lung, and esophagus. We also noted the non-smooth behavior of the manually contoured results, and the non-symmetric nature of the COM displacements around the mid phase point (expiration). Except for the first patient, who had image artifacts that will be discussed in a separate section, COM displacements agree to within 0.5±1.5 mm for the GTV case. This is an average displacement difference, as opposed to a mean of the absolute value differences, as shown by the equation below:

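$$\left. \mathrm{Average}\left( [\mathrm{COM}_{\mathrm{Auto}}]_{T_i} - [\mathrm{COM}_{\mathrm{Manual}}]_{T_i} \right) \right|_{i=1,9;\ \mathrm{patNo.}=1,13}.$$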
Table 2.

Difference of COM displacements for manually and automatically contoured volumes for the 13 patients for GTV, cord, heart, left lung, right lung, and esophagus averaged over ten phases. All units are in millimeters.

Patient | GTV | Cord | Heart | Left lung | Right lung | Esophagus
1 | 2.8 | 0.1 | 0.2 | 3.7 | 5.2 | 8.3
2 | −0.6 | 0.0 | 0.7 | 0.5 | 0.6 | 1.4
3 | 1.1 | 0.6 | 1.7 | 0.2 | 3.2 | 0.1
4 | 2.5 | 0.1 | 1.2 | 1.1 | 0.8 | −0.3
5 | −0.1 | −0.1 | 1.3 | −0.1 | −0.1 | 1.4
6 | 2.0 | 0.2 | 1.7 | 0.3 | −0.2 | 1.7
7 | −1.3 | 0.2 | n/a | 0.2 | 0.4 | 0.6
8 | −0.1 | 0.0 | −0.9 | 2.1 | 0.1 | −0.5
9 | −0.7 | 0.2 | 1.2 | 4.3 | 4.4 | 1.2
10 | −0.8 | 0.3 | −0.2 | 2.6 | 0.5 | 3.3
11 | 1.9 | 0.8 | 0.1 | 3.3 | 3.1 | 1.0
12 | −1.6 | 0.2 | −1.4 | 3.9 | 0.5 | 1.7
13 | 0.9 | 0.2 | 0.5 | −2.1 | 1.7 | 4.8

Averaging the signed differences in this way identifies whether the differences are biased, while the standard deviation quoted as the error bar shows that there is no large variance in the individual results. COM differences were largest for the esophagus, which is understandable as it is often difficult to discern on CT scans, particularly in the lower thorax region.
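The distinction between a signed average and a mean of absolute differences can be checked on the GTV column of Table 2. The sketch below uses the 13 tabulated values as listed; the choice of the sample (n−1) standard deviation and the idea of aggregating the per-patient values in this way are assumptions made for illustration, not a description of how the quoted 0.5±1.5 mm was obtained.

```python
import statistics

# GTV column of Table 2 (mm), patients 1 to 13, as tabulated
gtv_com_diff_mm = [2.8, -0.6, 1.1, 2.5, -0.1, 2.0, -1.3,
                   -0.1, -0.7, -0.8, 1.9, -1.6, 0.9]

# Signed mean: positive and negative differences cancel, exposing any bias
signed_mean = statistics.mean(gtv_com_diff_mm)
# Sample standard deviation (n - 1 in the denominator; an assumption)
sample_std = statistics.stdev(gtv_com_diff_mm)
# Mean of absolute differences: never smaller than |signed mean|
mean_abs = statistics.mean(abs(d) for d in gtv_com_diff_mm)

print(round(signed_mean, 1), round(sample_std, 1), round(mean_abs, 1))
```

For these values the signed mean and sample standard deviation round to 0.5 mm and 1.5 mm, while the mean absolute difference is noticeably larger, which illustrates why the two measures should not be confused.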

Surface congruence analysis

Results of the surface congruence analysis for two volumes are shown in Fig. 6. The top left plot is for a manually contoured heart and the top right is for an automatically contoured heart for the same phase. The bottom left plot is for a manually contoured GTV and the bottom right is for an automatically contoured GTV for the same phase. Both θ and φ are binned into 15° bins for plotting purposes. The radial difference between manual and automatic surfaces gives the surface congruence over the full surface. Mean values of the surface congruence data accumulated over all phases, for each volume and each patient, are shown in Fig. 7 and summarized in Table 3. Each volume for each patient is shifted by 0.2 along the x axis for clarity. The standard deviation (STD) of the data sampled over all phases is shown as the error bar of each point. For the GTV, agreement of surface congruence is 0.0±0.2 mm. Note that the STD is better than 2 mm for all the patients in this case. The percentage of events that agree to better than 5 mm for the two surfaces, accumulated over all phases, is shown in Table 4. Again, in the case of the GTV this number is very close to 100% (98.8% with a STD of 0.7%). This limit of 5 mm is based on the literature on setup (systematic and random on thorax ∼2.5 mm),32 motion (on thorax ∼2.5 mm),33 and contouring (observer variation: mean 2.1 mm, 90% confidence interval of 3.7 mm)17 errors.
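The text does not say how the three literature values were combined into the 5 mm limit. Purely as an illustration, and assuming the three error sources are independent and added in quadrature (an assumption, not something stated in the source), the combination can be evaluated with either the mean or the 90% confidence value for contouring. All names below are hypothetical.

```python
import math

def quadrature_sum(*components_mm):
    """Root-sum-of-squares combination of independent error components."""
    return math.sqrt(sum(c ** 2 for c in components_mm))

setup_mm = 2.5          # setup error on thorax, from the text
motion_mm = 2.5         # motion error on thorax, from the text
contour_mean_mm = 2.1   # observer variation, mean
contour_90ci_mm = 3.7   # observer variation, 90% confidence interval

combined_with_mean = quadrature_sum(setup_mm, motion_mm, contour_mean_mm)
combined_with_ci = quadrature_sum(setup_mm, motion_mm, contour_90ci_mm)
print(round(combined_with_mean, 2), round(combined_with_ci, 2))
```

Under this assumption the two combinations bracket the 5 mm limit from below and just above, so the limit is at least of the same order as the combined literature uncertainties.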

Figure 6.


Results from the surface congruence code. Top left plot is for a manually contoured heart and the top right is for an automatically contoured heart for the same phase. Bottom left plot is for a manually contoured GTV and the bottom right is for an automatically contoured GTV for the same phase; x, y, and z axis correspond to θ, φ, and radius r in spherical polar coordinates, respectively. Both θ and φ are binned to 15° bins.

Figure 7.


Surface congruence results—mean values of the surface congruence results for each volume for the 13 patients analyzed. Each volume per each patient is shifted by 0.2 in x axis for clarity. Each point is a cumulative of all the phases.

Table 3.

Mean values of the surface congruence results for each volume for the 13 patients studied. Each point is a cumulative of all ten phases. All units are in millimeters.

Patient | GTV | Heart | Left lung | Right lung | Esophagus
1 | 0.1±1.4 | −0.9±2.7 | −0.8±5.3 | −0.6±3.5 | −0.7±2.1
2 | 0.2±0.9 | 0.0±1.8 | −0.0±2.3 | −0.4±3.8 | 0.2±3.2
3 | −0.1±0.9 | −0.9±2.2 | 0.0±2.1 | −0.5±2.4 | 0.0±2.6
4 | 0.0±1.1 | 0.1±1.8 | −0.1±2.4 | −0.4±4.0 | 0.1±3.2
5 | −0.2±1.2 | 0.2±1.4 | 0.0±1.0 | −0.4±4.4 | 0.1±2.3
6 | 0.2±1.0 | −0.4±1.4 | −0.1±2.3 | −0.1±1.6 | −0.7±3.5
7 | 0.1±1.2 | n/a | −0.1±3.6 | −0.5±4.4 | −0.4±5.8
8 | −0.1±0.7 | −0.5±1.7 | −0.0±1.8 | 0.3±2.2 | −0.0±2.6
9 | −0.2±1.0 | −0.3±1.7 | −0.1±3.0 | −0.5±3.5 | −0.2±4.7
10 | −0.2±0.9 | −0.4±1.4 | −0.4±3.1 | −0.3±5.1 | −0.3±3.2
11 | 0.3±1.4 | −1.1±2.8 | 0.0±3.5 | −0.1±3.5 | 0.1±2.1
12 | 0.0±1.2 | −0.0±1.9 | −0.1±2.0 | −0.1±3.5 | −0.1±2.1
13 | −0.3±1.3 | −0.3±1.1 | −0.2±2.8 | −0.0±3.1 | −0.2±2.2
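The GTV column of Table 3 can be summarized in two ways, which helps when comparing the different figures quoted for GTV surface congruence. The sketch below is illustrative only: it treats the spread of the per-patient means and the typical within-patient STD as two separate quantities, and whether the quoted values were derived this way is an assumption.

```python
import statistics

# GTV column of Table 3: (mean, STD) in mm for patients 1 to 13, as tabulated
gtv_surface_mm = [(0.1, 1.4), (0.2, 0.9), (-0.1, 0.9), (0.0, 1.1), (-0.2, 1.2),
                  (0.2, 1.0), (0.1, 1.2), (-0.1, 0.7), (-0.2, 1.0), (-0.2, 0.9),
                  (0.3, 1.4), (0.0, 1.2), (-0.3, 1.3)]

means = [m for m, _ in gtv_surface_mm]
stds = [s for _, s in gtv_surface_mm]

mean_of_means = statistics.mean(means)             # overall bias across patients
spread_of_means = statistics.stdev(means)          # patient-to-patient spread
typical_within_patient_std = statistics.mean(stds) # average per-patient STD

print(round(mean_of_means, 2), round(spread_of_means, 1),
      round(typical_within_patient_std, 1))
```

With the tabulated values, the spread of the patient means rounds to 0.2 mm and the average within-patient STD rounds to 1.1 mm, and every per-patient STD is below 2 mm.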
Table 4.

Percentage of events that agree within 5 mm for the two surfaces, for each volume for the 13 patients studied. Each point field includes all ten phases for each patient.

Patient | GTV | Heart | Left lung | Right lung | Esophagus
1 | 97.8 | 88.4 | 82.0 | 85.2 | 88.5
2 | 99.1 | 97.2 | 95.3 | 90.3 | 94.2
3 | 99.4 | 89.6 | 95.6 | 93.7 | 87.4
4 | 99.2 | 95.9 | 95.1 | 89.2 | 90.2
5 | 98.0 | 97.2 | 99.1 | 90.5 | 86.6
6 | 98.3 | 94.3 | 94.8 | 96.4 | 85.3
7 | 98.9 | n/a | 92.9 | 91.1 | 88.8
8 | 99.7 | 95.0 | 95.4 | 91.6 | 94.6
9 | 99.3 | 95.6 | 94.0 | 99.0 | 99.0
10 | 99.7 | 96.5 | 94.0 | 87.5 | 79.9
11 | 98.8 | 85.5 | 91.5 | 91.2 | 91.3
12 | 98.4 | 95.5 | 95.9 | 90.0 | 92.5
13 | 97.8 | 98.5 | 93.3 | 94.5 | 83.5
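The per-structure percentages can be aggregated from Table 4 as in the sketch below. It assumes a simple unweighted mean over patients, with the heart of patient 7 skipped because it was not segmented; the dictionary layout and names are hypothetical, and the study may have aggregated the data differently (for example over individual surface points).

```python
import statistics

# Table 4 (% of surface points agreeing within 5 mm); None = not segmented
table4 = {
    "GTV": [97.8, 99.1, 99.4, 99.2, 98.0, 98.3, 98.9,
            99.7, 99.3, 99.7, 98.8, 98.4, 97.8],
    "Heart": [88.4, 97.2, 89.6, 95.9, 97.2, 94.3, None,
              95.0, 95.6, 96.5, 85.5, 95.5, 98.5],
    "Left lung": [82.0, 95.3, 95.6, 95.1, 99.1, 94.8, 92.9,
                  95.4, 94.0, 94.0, 91.5, 95.9, 93.3],
    "Right lung": [85.2, 90.3, 93.7, 89.2, 90.5, 96.4, 91.1,
                   91.6, 99.0, 87.5, 91.2, 90.0, 94.5],
    "Esophagus": [88.5, 94.2, 87.4, 90.2, 86.6, 85.3, 88.8,
                  94.6, 99.0, 79.9, 91.3, 92.5, 83.5],
}

# Unweighted mean over the patients for whom the structure was segmented
column_means = {
    name: statistics.mean(v for v in values if v is not None)
    for name, values in table4.items()
}
gtv_std = statistics.stdev(table4["GTV"])

for name, value in column_means.items():
    print(name, round(value, 1))
print("GTV STD", round(gtv_std, 1))
```

The unweighted column means come out close to the rounded percentages quoted in the text, and the GTV column reproduces the 98.8% with a STD of 0.7%.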

DISCUSSION

This work compares auto contouring using a deformable image registration algorithm with manually contoured data for 692 volumes from 13 patients. The difference of surface congruence between automatic and manual contours for the GTV, heart, left lung, right lung, and esophagus was less than 5 mm in 99%, 94%, 94%, 91%, and 89%, respectively. Results given in Table 2 show that, except for patient 1 who has extensive image artifacts in the target image, mean differences of COM displacements for auto contours and manual contours for GTV, heart, left lung, right lung, esophagus, cord range between 1.6, 1.5, 3.4, 1.5, 2.0, and 0.4 mm, respectively. The physician analysis, although detecting a need for manual correction in a substantial number of contours, also noticed mostly minor changes in the transverse plane. GTVs and esophagus were affected most frequently, which is probably due to the low contrast in the mediastinum. Larger discrepancies were observed in the lower thorax. Factors influencing the accuracy of auto contour generation in the inferior thorax are described below. The dosimetric effects of contour discrepancies were small. The results of this analysis show that, in the absence of image artifacts, automatic contouring based on large deformation diffeomorphic registration is a very helpful and in general reliable way to reduce the manual contouring workload for 4D CT data. Review and, if necessary, manual modification of GTV and esophagus contours in particular is, however, required. The 4D CT is a less challenging registration problem than other problems, as the image sets are of the same modality and are acquired at very close temporal separation. Furthermore, all image sets of a given patient are already in rigid alignment of the skeletal anatomy.
Compared to manual contouring, auto contouring in this study resulted in less phase-to-phase respiration-related COM and surface variations, and produced smooth functions of phase, thereby leading to predictability of organ motion and deformation in the respiratory cycle.

Pevsner et al.17 used the same deformable image registration algorithm to automatically contour volumes from end exhale to end inhale. Their results are for the GTVs of six patients for a single phase (exhale) comparison. Our study covers six volumes including the GTV for 13 patients and ten phases. In defining the region of interest (ROI) to obtain the 3D vector matrix, our study used a full ROI that covers the full thorax, which enabled us to use a single vector matrix to warp all the contours of the thorax, while in the study by Pevsner et al. the ROI was defined to accommodate only the GTV. This has advantages as well as disadvantages. Using a smaller ROI eliminates the issue of propagating problematic CT artifacts, if the artifact is not in the volume one is interested in auto contouring. On the other hand, when one has to perform 3D treatment plans, one wishes to have all the volumes of interest auto contoured within a minimum amount of time, with little disk space, and with minimal human intervention. Our way of defining the ROI accomplishes all the latter criteria. Pevsner et al. quote a value of 2.6 mm for the mean manual versus auto contour surface difference for the GTV over all six patients. This is comparable to the mean value of 1.3 mm in our study of 13 patients when only the end exhale phase is considered. Given their findings of inter-observer differences of mean 2.1 mm, the results are within inter-observer variations.

Wang et al.20 used an accelerated “demons” algorithm for deformable image registration to auto contour volumes in the pelvis and thorax. Their study performed a qualitative comparison between manual and auto contours for a single tumor in the thorax on three CT slices, and reasonable agreement between the automatically and manually contoured GTV could be observed in all three dimensions for these three slices. Our work, in contrast, qualitatively and quantitatively compares approximately 600 volumes in the thoracic region. Wang et al. quote a 6 min registration time for a 215×215×64 bit image pair, while ours has a corresponding time for a 512×512×64 bit image pair.

Ragan et al.19 used an interactive deformable model algorithm to automatically contour a single patient’s 4D CT thorax dataset that consisted of eight phases. Triangulations based on the contours generated on each phase were deformed to the CT data set of the succeeding phase to generate the contours on that phase. Deformation was propagated through the eight phases, and the contours obtained on the end-inspiration data set were compared with the manual contours of five structures drawn in a single phase. No quantitative information is given in that article, and it does not address the automation of the GTV. From the volumes shown, the lungs, which have the best contrast, show the best agreement with the manually drawn contours, although agreement is poor near the bronchi. In the case of the heart, automatically drawn contours tend to overestimate the true heart, sometimes by several centimeters. The poorest agreement is shown for the esophagus, with almost all the axial slices having no overlap between the automated and manual volumes. From visual comparison with our auto contouring results, we noted very good agreement between manual and auto contours in our study, particularly for the lungs and heart. Discrepancies for both organs were mostly found at the inferior border.

Coselmon et al.18 used a mutual information based image registration tool using thin-plate splines driven by the selection of 30 control points. Inhale and exhale CTs were obtained in 11 patients using a breath-hold technique. The inhale CT was warped to obtain the contours at exhale. That manuscript looks at the agreement of manual and automated contours using the locations of vascular and bronchial bifurcations. Only the lungs were studied, and in the single CT slice shown, good agreement could be observed between the manual and automated contours. Differences of 3 mm could be observed between the manual and automatic contours for the landmark points studied in both the AP and IS directions.

Automatic contouring is directly related to the quality of the image registration. Even though we used the end-inspiration phase contours to propagate to all the other phases, we have tested the hypothesis that the selection of the initial dataset is arbitrary. We have tested not only the image registration algorithm but the full auto-contouring process using phantom images shifted by known amounts, and have shown that contour propagation from image A to image B and back to image A is consistently reproducible. This suggests that there are no hysteresis effects. The only criterion that matters for the image registration is the quality of the images, that is, how free they are of image artifacts. It is worth noting that the quality of the automatic contours shows no additional degradation for the T0–T5 registration, where we expect the maximum motion. This can be confirmed by comparing our results with those of Pevsner et al., which cover only the T0–T5 case going from end exhale to end inhale, as stated before. The two results are in very good agreement for the surface difference.
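
The A to B to A reproducibility check described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the procedure or code used in this study: contours are assumed to be lists of (x, y, z) points in mm, the forward and backward displacement fields are assumed to be available as functions returning a displacement vector for a given point, and all names are hypothetical. For a phantom shifted by a known amount, the forward field is that shift and the backward field is its negative.

```python
from math import sqrt


def propagate(points, displacement):
    """Move each contour point by the displacement evaluated at that point."""
    return [
        tuple(c + d for c, d in zip(p, displacement(p)))
        for p in points
    ]


def round_trip_residual(points, forward, backward):
    """Largest distance (mm) between original points and their A->B->A images."""
    on_b = propagate(points, forward)
    back_on_a = propagate(on_b, backward)
    return max(
        sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
        for p, q in zip(points, back_on_a)
    )


# Hypothetical phantom test: a known rigid shift and its exact inverse.
known_shift = (2.0, -1.5, 10.0)


def forward(point):
    return known_shift


def backward(point):
    return tuple(-s for s in known_shift)


contour_a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 10.0, 5.0)]
print(round_trip_residual(contour_a, forward, backward))
```

A residual close to zero indicates that the propagation is reproducible for that pair of images; with registration-derived fields the residual would generally be small but nonzero.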

The quality of auto contours is affected by several limitations inherent in manual contours. In the surface congruence analysis, since the goal is to look for radial differences between the two volumes, a particular radial vector can sometimes go through multiple surfaces. An example would be GTVs that comprise multiple volumes (e.g., the primary tumor and affected lymph nodes) [Fig. 8a]. Another example is the lungs, where the bronchi reach into the lung tissue and therefore produce lung margins that are closer to the COM than the regular outer lung contour abutting the mediastinum and the thoracic wall. As a result, some radial vectors drawn from the COM of the lung intersect two surfaces of the same volume, as shown in Fig. 8b. Similar examples are the spinal cord and the esophagus, where physiological thoracic kyphosis (the curvature of the object) makes the radial vector cut through two or sometimes three surfaces of the same volume. In this analysis we have limited our investigations to the largest vector in the case of multiple surfaces, and to the largest volume in the case of multiple volumes.

Figure 8.

Issues to be careful about when performing a 360° full surface congruence analysis. (a) Shows the case of multiple surfaces. Radial vector passes through multiple GTV volumes. (b) Is a case where the radial vector from the COM passes through several lung surfaces before reaching the farthest lung border at the posterior thoracic wall.
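
The multiple-surface issue illustrated in Fig. 8, and the choice of keeping only the largest vector, can be sketched in Python as follows. This is a simplified 2D illustration under stated assumptions, not the analysis code of this study: the structure is assumed to be a binary mask stored as a list of rows, the ray is sampled at a fixed step in pixel units, and the function names are hypothetical.

```python
from math import cos, sin, radians, floor


def exit_distances(mask, origin, angle_deg, step=0.25):
    """Distances (pixels) from origin at which a ray leaves the structure.

    mask is a list of rows of 0/1 values; origin is (row, col).
    More than one entry means the ray crosses several surfaces.
    """
    rows, cols = len(mask), len(mask[0])
    d_row, d_col = sin(radians(angle_deg)), cos(radians(angle_deg))
    exits = []
    was_inside = False
    r = 0.0
    while True:
        # Nearest pixel to the current sample point along the ray.
        row = floor(origin[0] + r * d_row + 0.5)
        col = floor(origin[1] + r * d_col + 0.5)
        if not (0 <= row < rows and 0 <= col < cols):
            if was_inside:
                exits.append(r)  # structure touches the image border
            break
        inside = bool(mask[row][col])
        if was_inside and not inside:
            exits.append(r)
        was_inside = inside
        r += step
    return exits


def farthest_exit(mask, origin, angle_deg, step=0.25):
    """Keep only the largest radial vector when several surfaces are crossed."""
    exits = exit_distances(mask, origin, angle_deg, step)
    return max(exits) if exits else None


# Hypothetical one-row mask with two separate segments along the ray.
mask = [[1, 1, 1, 0, 0, 1, 1, 0, 0]]
print(exit_distances(mask, (0, 0), 0.0))
print(farthest_exit(mask, (0, 0), 0.0))
```

The same idea extends to 3D by sampling along rays defined by two angles; the step size bounds how precisely each surface crossing is located.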

Since auto contouring by comparison of image intensities relies on image quality, image artifacts and partially scanned volumes challenge auto contouring. Some examples of situations requiring careful assessment are shown in Fig. 9.

Figure 9.

Some examples of where caution is needed when segmenting images for 4D radiotherapy planning. Image (a) has right diaphragm split, image (b) shows an error in manual contouring (black) for the heart, image (c) corresponds to FOV not covering the diaphragm fully in inspiration (LHS), while in expiration (RHS) large parts of the diaphragm are covered by the FOV.

Figure 9a shows a case where the left lung diaphragm was imaged incorrectly. This happens most often when the patient’s breathing becomes irregular and the retrospective sorting software is unable to correct for it. If this happens only in one particular phase other than the end inhale phase, auto contouring will perform poorly only in that phase. The most unfortunate scenario would be to have this type of image artifact on the end inhale phase image, which would propagate this artifact to the auto contours of all the other phases. Furthermore, if one uses a single ROI to obtain the transformation 3D vector matrix, this type of artifact can affect not only the lung volumes but other volumes of interest as well. Such an example is patient 1.

In addition to producing auto contours and reducing the manual delineation effort, deformable image registration can also be used as a tool for quality assurance. As the example in Fig. 9b shows, this analysis was able to detect deficiencies in manual contouring 1.6% of the time.

Figure 9c shows an image where the thorax was not imaged completely in all phases due to respiratory motion. This means that in some of the phases during inspiration, the most inferior slices, containing the diaphragm and the heart, are not in the field of view. The effect of this can be seen in Fig. 4 (right side). This also affects auto contouring. As one can see from the right-hand plot of Fig. 4, which shows the most inferior slice, the heart and the left lung are not being reproduced. This suggests that complete datasets of all of the structures of interest in all the phases are necessary to obtain optimal auto-contouring results.

CONCLUSIONS

A process for automatic contouring based on deformable image registration of respiratory CT phases from a 4D CT scan has been developed and coupled with a commercial treatment planning system. This system was used to automatically contour volumes of 13 patients. For each patient, based on the physician-drawn contours for the inhale phase, auto contouring was performed to propagate the structures to the nine subsequent respiratory phases. Automatically drawn contours were compared qualitatively and quantitatively with manually drawn contours at each phase, for each volume, and per patient. This work is, to our knowledge, the largest such study, involving the comparison of 692 structures. The auto-contoured structures generally agree with the manually drawn structures to within published inter-observer variability, especially in the case of the GTV. The auto-contoured structures are more consistent in trajectory and volume, and the comparison also highlighted some large variations in the manually drawn contours. Careful assessment is needed in the presence of 4D CT artifacts. As a final note, we would like to point out that this algorithm still needs to be tested for CT to MR registration, as well as for inter-fraction image registration.

ACKNOWLEDGMENTS

We would like to thank Devon Murphy for editing this manuscript, and Dr. Mirek Fatyga for assistance visualizing the deformation vector fields. This work was supported by NCI R01 93626 and Varian Medical Systems.

References

  1. Ford E. C. et al., “Respiration-correlated spiral CT: A method of measuring respiratory-induced anatomic motion for radiation treatment planning,” Med. Phys. 30(1), 88–97 (2003). doi: 10.1118/1.1531177
  2. Keall P. J. et al., “Acquiring 4D thoracic CT scans using a multislice helical method,” Phys. Med. Biol. 49(10), 2053–2067 (2004). doi: 10.1088/0031-9155/49/10/015
  3. Low D. A. et al., “A method for the reconstruction of four-dimensional synchronized CT scans acquired during free breathing,” Med. Phys. 30(6), 1254–1263 (2003). doi: 10.1118/1.1576230
  4. Vedam S. S. et al., “Acquiring a four-dimensional computed tomography dataset using an external respiratory signal,” Phys. Med. Biol. 48(1), 45–62 (2003). doi: 10.1088/0031-9155/48/1/304
  5. Mageras G. S. et al., “Measurement of lung tumor motion using respiration-correlated CT,” Int. J. Radiat. Oncol., Biol., Phys. 60(3), 933–941 (2004). doi: 10.1016/j.ijrobp.2004.06.021
  6. Rietzel E. et al., “Moving targets: Detection and tracking of internal organ motion for treatment planning and patient set-up,” Radiother. Oncol. 73, Suppl. 2, S68–S72 (2004). doi: 10.1016/S0167-8140(04)80018-5
  7. Rietzel E., Pan T., and Chen G. T., “Four-dimensional computed tomography: Image formation and clinical protocol,” Med. Phys. 32(4), 874–889 (2005). doi: 10.1118/1.1869852
  8. Underberg R. W. et al., “Benefit of respiration-gated stereotactic radiotherapy for stage I lung cancer: An analysis of 4DCT datasets,” Int. J. Radiat. Oncol., Biol., Phys. 62(2), 554–560 (2005). doi: 10.1016/j.ijrobp.2005.01.032
  9. Underberg R. W. et al., “Four-dimensional CT scans for treatment planning in stereotactic radiotherapy for stage I lung cancer,” Int. J. Radiat. Oncol., Biol., Phys. 60(4), 1283–1290 (2004). doi: 10.1016/j.ijrobp.2004.07.665
  10. Underberg R. W. et al., “Use of maximum intensity projections (MIP) for target volume generation in 4DCT scans for lung cancer,” Int. J. Radiat. Oncol., Biol., Phys. 63(1), 253–260 (2005). doi: 10.1016/j.ijrobp.2005.05.045
  11. Kaus M. R. et al., “Automated 3-D PDM construction from segmented images using deformable models,” IEEE Trans. Med. Imaging 22(8), 1005–1013 (2003). doi: 10.1109/TMI.2003.815864
  12. Kaus M. R. et al., “Automated segmentation of the left ventricle in cardiac MRI,” Med. Image Anal. 8(3), 245–254 (2004).
  13. Pekar V., McNutt T. R., and Kaus M. R., “Automated model-based organ delineation for radiotherapy planning in prostatic region,” Int. J. Radiat. Oncol., Biol., Phys. 60(3), 973–980 (2004). doi: 10.1016/j.ijrobp.2004.06.004
  14. McInerney T. et al., “Deformable organisms for automatic medical image analysis,” Med. Image Anal. 6(3), 251–266 (2002).
  15. Foskey M. et al., “Large deformation three-dimensional image registration in image-guided radiation therapy,” Phys. Med. Biol. 50(24), 5869–5892 (2005). doi: 10.1088/0031-9155/50/24/008
  16. Rietzel E. et al., “Four-dimensional image-based treatment planning: Target volume segmentation and dose calculation in the presence of respiratory motion,” Int. J. Radiat. Oncol., Biol., Phys. 61(5), 1535–1550 (2005). doi: 10.1016/j.ijrobp.2004.11.037
  17. Pevsner A. et al., “Evaluation of an automated deformable image matching method for quantifying lung motion in respiration-correlated CT images,” Med. Phys. 33(2), 369–376 (2006). doi: 10.1118/1.2161408
  18. Coselmon M. M. et al., “Mutual information based CT registration of the lung at exhale and inhale breathing states using thin-plate splines,” Med. Phys. 31(11), 2942–2948 (2004). doi: 10.1118/1.1803671
  19. Ragan D. et al., “Semiautomated four-dimensional computed tomography segmentation using deformable models,” Med. Phys. 32(7), 2254–2261 (2005). doi: 10.1118/1.1929207
  20. Wang H. et al., “Validation of an accelerated ‘demons’ algorithm for deformable image registration in radiation therapy,” Phys. Med. Biol. 50(12), 2887–2905 (2005). doi: 10.1088/0031-9155/50/12/011
  21. Lu W. et al., “Fast free-form deformable registration via calculus of variations,” Phys. Med. Biol. 49(14), 3067–3087 (2004). doi: 10.1088/0031-9155/49/14/003
  22. Shekhar R. et al., “High-speed registration of three- and four-dimensional medical images by using voxel similarity,” Radiographics 23(6), 1673–1681 (2003).
  23. Ledesma-Carbayo M. J. et al., “Spatio-temporal nonrigid registration for ultrasound cardiac motion estimation,” IEEE Trans. Med. Imaging 24(9), 1113–1126 (2005).
  24. Kybic J. et al., “Fast multipole acceleration of the MEG/EEG boundary element method,” Phys. Med. Biol. 50(19), 4695–4710 (2005). doi: 10.1088/0031-9155/50/19/018
  25. Brock K. K. et al., “Accuracy of finite element model-based multi-organ deformable image registration,” Med. Phys. 32(6), 1647–1659 (2005). doi: 10.1118/1.1915012
  26. Kaus M. R. et al., “Assessment of a model-based deformable image registration approach for radiation therapy planning,” Int. J. Radiat. Oncol., Biol., Phys. 68(2), 572–580 (2007).
  27. Christensen G. E., Rabbitt R. D., and Miller M. I., “3D brain mapping using a deformable neuroanatomy,” Phys. Med. Biol. 39(3), 609–618 (1994). doi: 10.1088/0031-9155/39/3/022
  28. Christensen G. E., Joshi S. C., and Miller M. I., “Volumetric transformation of brain anatomy,” IEEE Trans. Med. Imaging 16(6), 864–877 (1997). doi: 10.1109/42.650882
  29. Weiss E. et al., “Tumor and normal tissue motion in the thorax during respiration: Analysis of volumetric and positional variations using 4D CT,” Int. J. Radiat. Oncol., Biol., Phys. 67(1), 296–307 (2007).
  30. Davis B. C. et al., “Automatic segmentation of intra-treatment CT images for adaptive radiation therapy of the prostate,” Med. Image Comput. Comput. Assist. Interv., Vol. 8, Pt. 1, pp. 442–450 (unpublished).
  31. Weiss E., Wijesooriya K., Ramakrishnan V., and Keall P. J., “Comparison of intensity-modulated radiotherapy planning based on manual and automatically generated contours using deformable image registration in four-dimensional computed tomography of lung cancer patients,” Int. J. Radiat. Oncol., Biol., Phys. 70(2), 572–581 (2008).
  32. Hurkmans C. W. et al., “Set-up verification using portal imaging; review of current clinical practice,” Radiother. Oncol. 58(2), 105–120 (2001). doi: 10.1016/S0167-8140(00)00260-7
  33. Ekberg L. et al., “What margins should be added to the clinical target volume in radiotherapy treatment planning for lung cancer,” Radiother. Oncol. 48(1), 71–77 (1998). doi: 10.1016/S0167-8140(98)00046-2

Articles from Medical Physics are provided here courtesy of American Association of Physicists in Medicine
