Abstract
Purpose: To present a new approach to the problem of estimating errors in deformable image registration (DIR) applied to sequential phases of a 4DCT data set.
Methods: A set of displacement vector fields (DVFs) is generated by registering a sequence of 4DCT phases. The DVFs are assumed to display anatomical movement, with the addition of errors due to the imaging and registration processes. The positions of physical landmarks in each CT phase are measured as ground truth for the physical movement in the DVF. Principal component analysis of the DVFs and the landmarks is used to identify and separate the eigenmodes of physical movement from the error eigenmodes. By subtracting the physical modes from the principal components of the DVFs, the registration errors are exposed and reconstructed as DIR error maps. The method is demonstrated via a simple numerical model of 4DCT DVFs that combines breathing movement with simulated maps of spatially correlated DIR errors.
Results: The principal components of the simulated DVFs were observed to share the basic properties of principal components for actual 4DCT data. The simulated error maps were accurately recovered by the estimation method.
Conclusions: Deformable image registration errors can have complex spatial distributions. Consequently, point-by-point landmark validation can give unrepresentative results that do not accurately reflect the registration uncertainties away from the landmarks. The authors are developing a method for mapping the complete spatial distribution of DIR errors using only a small number of ground truth validation landmarks.
Keywords: deformable image registration, registration validation, breathing motion, 4DCT, principal components analysis
INTRODUCTION
The motion of the treatment target and other structures presents complications for external beam radiotherapy treatment. Movement relative to the planning anatomy can occur day-by-day or, in the case of respiration, can proceed continuously during each treatment fraction. Respiration-correlated CTs (4DCTs) record anatomical motion due to breathing at a number of discrete time points (phases) in the breathing cycle.1, 2, 3 Deformable image registration (DIR) detects and quantifies the displacement of anatomical features from one phase (i.e., time bin) to the next.4 It can also map the anatomical configuration at any particular breathing phase back to some reference phase. The anatomical displacement maps produced by image registration are commonly referred to as displacement vector fields (DVFs) because they display the displacement of anatomical elements from one position to another in a sequence of images. The DVFs can be used to track the accumulation of dose over the breathing cycle4 and to construct statistical models of where any particular anatomical element might be during treatment for use in probabilistic planning.5, 6 However, the accuracy of the DVFs directly impacts the accuracy of these applications.
At the present time, 4DCT has relatively coarse time resolution and requires the blending of motion data from several breathing cycles to fill each time bin. As a result, any particular 4DCT distorts the true thoracic movement with motion and binning artifacts. Furthermore, DIR algorithms are not perfect.7 Therefore, a DVF obtained by registering pairs of images will always have two components: One that corresponds to the actual physical displacements between the two images and a second, unphysical component due to imperfections in the imaging and registration processes. The unphysical component is the DVF error and mathematically can be represented by a vector field defined at every voxel and added to the “real” (error-free) displacement vectors. Our present study is aimed at the problem of determining voxel-by-voxel the complete registration error field throughout the imaged volume.
If the registration errors fluctuate randomly from one voxel to another and from one DVF to the next (e.g., as white noise), then we can sample them by locating a set of landmarks that measure the true physical motion in the DVF at specific points and subtracting their displacement from the DIR displacement vector at each landmark location, leaving the error component.8 However, if the errors do not fluctuate randomly, but have patterns of spatial coherence in each DVF, then it is not trivial to extend point landmark validation elsewhere in the anatomy. For example, a DVF might show good accuracy at or near selected landmarks while having substantially greater errors elsewhere. It has therefore been suggested that a large set of landmarks (>1000) is necessary to provide an adequate picture of DVF uncertainties throughout the anatomical volume.9 However, we hypothesize that if one models mathematically the spatial correlations in the error map, it is not necessary to have a dense sampling of landmarks. We describe here a new method to determine complete DVF error maps from 4DCT registrations using a small set (<100) of ground truth validation landmarks.
Why is it proper to assume that (in general) DVF errors are spatially correlated? The deformation model of any DIR algorithm will provide only an imperfect representation of the anatomy movement. The difference between the actual motion and the model will translate into a registration error component that changes smoothly throughout the image, i.e., that has strong spatial coherence from one DVF to the next. Local fluctuations of the anatomical motion might be overly smoothed by DIR, generating a second spatially structured component of registration error. Artifacts in the image will introduce a third source of spatial error in the DVF, and so on. Each of these error components will have spatial coherence but will be approximately independent of the others.
We can use to our advantage the fact that the DIR errors are coherent by employing principal components analysis (PCA). It can be shown mathematically that if the voxel-by-voxel displacement vector amplitudes in a set of N DVFs all vary in proportion to a single scaling function (e.g., the amplitude of a breathing waveform) from one sample to the next, PCA will reveal a single eigenmode and a single nonzero eigenvalue that describes the variance. On the other hand, if the displacement vectors vary with complete randomness, voxel-by-voxel, from one DVF to the next, PCA will produce an infinite set of eigenmodes and eigenvalues. In the latter situation, the spectrum of eigenvalues will be flat, showing that all of the modes are indistinguishable. If the DVFs are a combination of several independent sources of spatially and temporally correlated movement and errors, then PCA will produce a set of statistically uncorrelated eigenmodes with a spectrum of eigenvalues that decrease in amplitude in proportion to the amount of variance contributed by each mode. Some of these modes will be associated primarily with breathing; other modes will be primarily due to DIR errors. In particular, correlated errors will have a decreasing eigenvalue spectrum, while random errors (e.g., white noise) will produce a flat spectrum extending effectively to infinity. Thus we can use PCA to identify and characterize the coherence of the error component of the DVF.
In this paper, we describe the concept of our method and report tests of its feasibility via numerical simulations based on a simple 4DCT DVF model. We used the simulations to test approximations in our equations and to analyze the fidelity with which the method can recover known DIR error maps. We also examine the impact of the number and location of landmarks and the number of 4DCT phases on the efficacy of the method. The simulation model was based on the characteristics of breathing observed in 4DCT and DVF data obtained from a publicly available database.10
METHOD AND MATERIALS
Assume that we have a 4DCT for a patient, reconstructed in N+1 breathing phases. Then one phase (e.g., end-inhale) can be designated as the reference phase, to be used as the target image for deformable registration. Deformable registration of each of the remaining breathing phases to the initial reference phase will generate a set of N DVFs that approximates voxel-by-voxel the anatomical motion. The DVFs point from each voxel in the target reference CT to the location where the associated anatomical elements have moved in each of the subsequent phases. These DVFs contain errors from 4DCT artifacts and underlying DIR inaccuracies. Assume that we also have a set of manually delineated pointlike landmarks in the CTs that track the anatomical displacements at discrete points. Setting aside for now the effects of observer errors associated with the landmarks, we can use them both as a ground truth to establish the DVF errors at those points and as a measure of respiratory motion.
For actual breathing, the anatomical movement is highly correlated in time and space, but DIR errors add unphysical variances to the apparent motion in the DVFs. We use PCA to identify and separate the true underlying modes of motion from the contributions of DIR errors to the overall DVF variance.
Each DVF becomes a feature vector for PCA. We combine the feature vectors from all of the breathing phases into an ensemble called the training set. To decorrelate the motion and error variances, we compute the principal components of the training set. Thus, we obtain a matrix of PCA eigenvectors and eigenvalues that characterize the training set, rank ordered according to how much variance in the data each accounts for. The variance due to actual physical motion is very coherent in time and space and therefore contributes mainly to the first few (most significant) eigenmodes, while the variances due to imaging artifacts and DIR errors are not so coherent (especially in time) and thus are distributed over a much larger number of components, dominating in the higher level modes. In consequence, PCA can be used as a filter to separate physical from unphysical components in the DVF. Similar observations have been made by other authors interested in 4DCT breathing analysis.11, 12
To identify the true physical breathing modes in the DVFs, we find the principal components of the validation landmark data for the 4DCT and we use this information to evaluate the contributions of the DIR errors in each of the DVFs principal modes. We can then reconstruct a feature vector in principal component space that contains only the variances due to DVF errors. The feature vector can then be turned back into a DVF error map that shows the entire spatial distribution of DVF errors rather than just the errors at the validation points.
Computational details of the DVF error mapping procedure
Suppose that we have N DVFs, each of them with M voxels. We arrange the DVFs into N feature vectors {vn},n=1,…,N, each with M components, to form the training set. We subtract from each feature vector the mean ⟨v⟩ of the training set and then we arrange the N zero-mean column vectors in the M×N matrix V. The covariance matrix of the training data is then
$$C = \frac{1}{N}\, V V^{T} \tag{1}$$
The M principal components (eigenvectors) ui of the training set and their associated eigenvalues λi are solutions of the eigenvalue problem
$$C\, u_i = \lambda_i u_i, \qquad i = 1, \ldots, M \tag{2}$$
We collect the eigenvectors ui into an M×M matrix U of M column vectors.
Principal components analysis uses the M orthonormal eigenvectors ui (the principal components) to rotate the zero-mean feature vectors vi−⟨v⟩ into a new set of vectors pi that have most of the variance of the data vectors concentrated in the first few vector components
$$p_i = U^{T}\left(v_i - \langle v \rangle\right) \tag{3}$$
In this discussion we will call the pi the principal coefficient (PC) vectors. Each element pji of pi is maximally uncorrelated from all the others and has an associated eigenvalue λj that equals the variance of that component over the set of N feature vectors.13
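The covariance, eigenvalue, and rotation steps described above can be sketched numerically as follows. This is a minimal illustration on a toy training set, not the authors' implementation; the function name `pca_rotate`, the 1/N normalization of the covariance, and the array sizes are assumptions made for the example.

```python
import numpy as np

def pca_rotate(features):
    """features: M x N array, one flattened DVF (feature vector) per column."""
    n_samples = features.shape[1]
    mean = features.mean(axis=1, keepdims=True)   # <v>, shape (M, 1)
    V = features - mean                           # zero-mean training matrix
    C = V @ V.T / n_samples                       # M x M covariance (1/N assumed)
    eigvals, U = np.linalg.eigh(C)                # symmetric solver, ascending order
    order = np.argsort(eigvals)[::-1]             # rank modes by decreasing variance
    eigvals, U = eigvals[order], U[:, order]
    P = U.T @ V                                   # column i is the PC vector p_i
    return mean, U, eigvals, P

rng = np.random.default_rng(0)
toy = rng.normal(size=(6, 4))                     # M = 6 "voxels", N = 4 "phases"
mean, U, eigvals, P = pca_rotate(toy)
```

In this toy case the variance of each row of `P` over the N samples equals the corresponding eigenvalue, which is the property stated for the elements of the PC vectors.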
If we write each DVF feature vector vi as the sum of a signal vector si (representing the error-free physical motion) and an error vector ei,
$$v_i = s_i + e_i,$$
then from PCA of the DVF we get U and pi, such that
$$e_i = U\left[p_i - U^{T} s_i\right] + \langle v \rangle \tag{4}$$
Equation 4 is an exact expression for the error map ei; if we could determine UTsi then we would recover the error exactly. We do not know how to estimate the signal component of the DVF directly. However, the form of Eq. 4 suggests that we can identify the term UTsi as being approximately equal to the principal coefficients psig.i of the signal. Notice that the signal mean is not subtracted before rotation, but appears in a separate term in Eq. 4. This is because we cannot separate ⟨s⟩ from ⟨e⟩ in our DVF data to get their separate means.
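The statement that the error is recovered exactly when the rotated signal is known can be checked on synthetic data. The sketch below assumes the rearranged form e_i = U(p_i − Uᵀs_i) + ⟨v⟩, which follows from p_i = Uᵀ(v_i − ⟨v⟩) when U is a complete orthonormal basis; the signal and error arrays are random placeholders, not breathing data.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 8, 5                                    # toy sizes, chosen for illustration
s = rng.normal(size=(M, N))                    # hypothetical error-free signal vectors
e = 0.1 * rng.normal(size=(M, N))              # hypothetical error vectors
v = s + e                                      # observed DVF feature vectors

v_mean = v.mean(axis=1, keepdims=True)         # <v> = <s> + <e>, not separable in practice
C = (v - v_mean) @ (v - v_mean).T / N
_, U = np.linalg.eigh(C)                       # full M x M orthonormal eigenvector basis
p = U.T @ (v - v_mean)                         # principal coefficient vectors

# Signal is rotated without subtracting its mean; the DVF mean enters as a separate term.
e_rec = U @ (p - U.T @ s) + v_mean
```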
The point now is to find a way to estimate the principal signal coefficients psig.i, then subtract them out to get the DVF errors from Eq. 4. First we assume that the signal is a quasiperiodic movement of anatomy from one breathing phase sample to the next and further assume that it can be completely described by the amplitude and relative phase of each anatomical voxel’s displacement vector. We show in Appendix A that under these conditions, the covariance matrix of the breathing signal DVFs is determined mainly by the breathing cycle and that principal component analysis will arrive at only a few nonzero modes.
Let us assume that the validation landmarks are accurate samples of the signal. We can locate them in the target reference image, calculate the displacement vectors that point to their new locations in each of the subsequent CT phases, organize the landmark DVFs into feature vectors li, and compute their PCA eigenvectors Uland and eigenvalues λland,j. Then we get the landmark principal coefficients πland.i (without subtracting the mean)
$$\pi_{\mathrm{land},i} = U_{\mathrm{land}}^{T}\, l_i$$
If the landmarks are a good representative sample of the moving anatomy, then they will have the same principal modes as the complete signal vector and thus each landmark principal coefficient πland.ji will be proportional to the corresponding signal principal coefficient to within a scaling term that accounts for the different lengths of the landmark and signal feature vectors. This assumption is argued mathematically in Appendix A. Because the principal coefficient vectors psig and πland are just rotations of the feature vectors s and l they have the same corresponding lengths. Thus we can estimate psig as
$$p_{\mathrm{sig},i} \approx \frac{\lVert s_i \rVert}{\lVert l_i \rVert}\, \pi_{\mathrm{land},i} \tag{5}$$
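We do not know ∥s∥, but we can assume that the approximate equality
$$\frac{\lVert s_i \rVert}{\lVert l_i \rVert} \approx \frac{\lVert v_i \rVert}{\lVert l_{\mathrm{DVF},i} \rVert}$$
is true, where lDVF is the DVF vector (with errors included) at the landmark points. Then our estimate of the DVF error map for the ith phase is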
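$$\hat{e}_i = U\left[p_i - \frac{\lVert v_i \rVert}{\lVert l_{\mathrm{DVF},i} \rVert}\, \pi_{\mathrm{land},i}\right] + \langle v \rangle \tag{6}$$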
The remainder of this proof of concept paper is devoted to a validation via a numerical simulation of our approximate solution given in Eq. 6.
Notice that we do not associate the first few eigenvalues and eigenvectors entirely with breathing and the higher order modes entirely with DIR error, and then simply divide the PCA spectrum into physical and nonphysical parts. Instead we recognize that all of the modes are a superposition of movement and error, then estimate and subtract the movement modes to reveal the residual error modes.
To compute the principal components of the DVF and the landmark feature vectors, we note that the number of samples N (i.e., 4DCT phases) is much less than the dimensionality M of the vectors. This is a case of the high dimensionality problem. In this case there are only N nonzero eigenvectors. We show in Appendix B how PCA can be done in this case.14
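One common way to handle the case N ≪ M is to diagonalize the small N×N matrix VᵀV and lift its eigenvectors back to M dimensions. The sketch below illustrates that general technique; it is not taken from Appendix B, and the function name, tolerance, and 1/N normalization are assumptions.

```python
import numpy as np

def pca_small_sample(features, tol=1e-10):
    """PCA for M >> N using the N x N matrix V^T V (illustrative sketch)."""
    n_samples = features.shape[1]
    mean = features.mean(axis=1, keepdims=True)
    V = features - mean
    G = V.T @ V / n_samples                       # N x N instead of M x M
    lam, W = np.linalg.eigh(G)
    order = np.argsort(lam)[::-1]
    lam, W = lam[order], W[:, order]
    keep = lam > tol * lam[0]                     # drop numerically zero modes
    lam, W = lam[keep], W[:, keep]
    U = V @ W / np.sqrt(n_samples * lam)          # unit-norm M-dimensional eigenvectors
    return mean, U, lam

rng = np.random.default_rng(2)
dvfs = rng.normal(size=(5000, 9))                 # 5000 voxels, 9 DVFs (toy values)
mean, U, lam = pca_small_sample(dvfs)
```

Because the mean is subtracted here, the toy example retains N − 1 = 8 modes; the nonzero eigenvalues are the same as those of the full M×M covariance matrix.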
Notice that each element of $[V^{T}V]_{ji} = \sum_{k=1}^{M} v_{kj}\, v_{ki}$ [see Eq. B1] is independent of the ordering of the feature vector components (i.e., voxels). Thus the eigenvalue spectrum is independent of how we organize the DVF elements into the feature vector (as long as we do it consistently in all of the sample feature vectors). We will show the significance of this below.
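The ordering independence can be verified directly: applying the same permutation of voxels to every feature vector leaves VᵀV, and hence its eigenvalues, unchanged. The data below are random placeholders used only to illustrate the point.

```python
import numpy as np

rng = np.random.default_rng(3)
V = rng.normal(size=(200, 9))                 # toy training set: 200 voxels, 9 samples
V -= V.mean(axis=1, keepdims=True)            # zero-mean feature vectors

perm = rng.permutation(V.shape[0])            # one fixed reordering of the voxels
V_perm = V[perm, :]                           # applied identically to every sample

spec = np.linalg.eigvalsh(V.T @ V)            # eigenvalue spectrum, original ordering
spec_perm = np.linalg.eigvalsh(V_perm.T @ V_perm)
```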
Numerical simulations of the error estimation procedure
We developed and studied our algorithm using numerical model simulations. Our model was designed to resemble the behavior of actual 4DCT DVFs. To make a plausible model, we evaluated the principal modes of breathing in a real 4DCT. We used the point-validated pixel-based (POPI) data set from the website maintained at www.creatis.insa-lyon.fr.10 This data set consists of a 4D fan-beam CT in which clinicians have located 40 point landmarks in each of the ten reconstructed CT breathing phases. From the same database, we also obtained two DVF sets resulting from the deformable registration of the POPI 4DCT using both the demons algorithm15 and a parametric B-spline algorithm.16 Each set of DVFs comes validated for accuracy at each of the point landmarks. Besides the 4DCT and the sequential DVFs for both registration methods, the website also provides the public with the set of validation landmark coordinates. To supplement the POPI validation landmarks, we measured our own independent set of landmarks in each phase of the 4DCT. We performed PCA on the POPI DVFs and on the landmarks to observe the principal components (i.e., eigenmodes) of an actual 4DCT and then used the characteristics of the principal components to construct our numerical model. Before describing our numerical model and tests, we will summarize these pertinent DVF eigenmode characteristics.
The POPI 4DCT consists of ten CT image phases. Deformable registration of these CTs will generate nine DVFs. Hence, the data sample (i.e., the PCA training set) associated with the breathing motion of this single patient consists of a set of nine observations.
We prepared our PCA feature vectors in the following way:
(1) We used the DVFs created by registering all the breathing phases to phase 1 (full inhale), as provided by the POPI model.
(2) We segmented out the lung tissue on the phase 1 CT image, thus generating a “mask” for the lung voxels (Fig. 1).
(3) We further selected from the mask only one lung lobe (right).
(4) We selected one slice only in the lung mask.
(5) In all the DVFs, we selected the voxels of the lung mask and we considered only the vectors originating in those voxels.
(6) For all the vectors selected by the above procedure we considered only the z component.
Figure 1.
The mask used to isolate the voxels of interest in the POPI data set for PCA analysis. (a) 3D view. (b) Transversal slice. (c) Only voxels in the transversal slices of the right lung contributed to the feature vectors.
By selecting our feature vectors in the way described in the above, we sampled the breathing modes mainly along the z axis. Considering separately the x, y, and z components of the breathing movement greatly simplifies our PCA problem and suffices for a proof of concept application.
Principal component analysis of these feature vectors generated eight significant PC eigenvectors. In Fig. 2 we show a plot of the variances (eigenvalues) associated with these eigenvectors. The first three modes carry more than 96% of the total variance. We can compute a DVF “signature” for each PC mode in the following way:
(1) We compute the PC eigenvectors and principal coefficient vectors [Eqs. 2 and 3].
(2) We use the matrix U [Eq. 3] with only the ith PC eigenvector to rotate the principal coefficient vectors back into DVF space, which projects out the ith eigenmode of the DVF.
(3) For every voxel, we combine the absolute values of the variation of the projected eigenmode across the different phases, obtaining a map of the variability associated with the ith PC eigenmode across the breathing phases. We call this map a “PC signature of the ith mode.”
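The three steps above can be sketched as follows. The text does not specify how the absolute values are combined, so this example assumes a simple sum over phases; the function name `pc_signature` and the toy data are likewise hypothetical.

```python
import numpy as np

def pc_signature(features, i):
    """Signature map of mode i (0-based); 'combine' is read here as a sum of |values|."""
    V = features - features.mean(axis=1, keepdims=True)
    lam, U = np.linalg.eigh(V @ V.T / V.shape[1])
    U = U[:, np.argsort(lam)[::-1]]               # strongest mode first
    P = U.T @ V                                   # principal coefficient vectors
    mode = np.outer(U[:, i], P[i, :])             # mode i rotated back to DVF space (M x N)
    return np.abs(mode).sum(axis=1)               # one variability value per voxel

rng = np.random.default_rng(4)
toy_dvfs = rng.normal(size=(50, 9))               # 50 voxels, 9 phases (toy values)
sig0 = pc_signature(toy_dvfs, 0)
```

For a real slice, the returned vector would be written back into the lung-mask voxel positions to display the signature as an image.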
Figure 2.
The eigenvalue spectra for (a) the demons DVFs and (b) the parametric B-spline DVFs.
We would expect the PC signatures of the modes associated with real physical breathing motion to be approximately the same for the two different sets of DVFs. In Figs. 3 and 4 we show the signature maps for the first three modes and one example of the signature maps for the higher order modes. From the figures it can be seen that the two sets of signatures show equivalent, easily recognizable features for the first two PC modes. The third mode contains more of the registration error components and already starts to show significant differences between the two sets of DVFs. Finally, the higher order modes, as illustrated by mode 7 in Fig. 4, have basically no recognizable shared features in the two DVFs. As the DVFs describe the same movement of the same anatomy, we conclude that the higher PC eigenmodes mainly contain noise.
Figure 3.
The PC signatures for the first two PC modes. In the right column we show the first two modes in the demons method DVFs; in the left we show the same two modes in the B-splines DVFs.
Figure 4.
The PC signatures for modes 3 and 7. In the right column we show the two modes in the demons method DVFs; in the left we show the same two modes in the B-splines DVFs.
Figure 6.
The figure with a zoom on amplitude.
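Suppose that there are regions in the lung where the anatomy moves approximately in proportion to a simple amplitude scaling function (i.e., the breathing amplitude), such that
$$v(\mathbf{x}, t) = \alpha(t)\, v_0(\mathbf{x}) \tag{7}$$
where α(t) is the amplitude function and v0(x) is a time-independent displacement pattern.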
In these regions, we would expect that the ratio of displacement vectors from one phase to the next would be approximately uniform. We demonstrate this for the POPI data by taking the ratios, voxel-by-voxel, of “consecutive” DVF phases (e.g., the ratio of the DVF connecting the 4DCT phases 1–5 to the DVF connecting the 4DCT phases 1–4), as shown in Figs. 5 and 6. One noticeable feature in most of these ratios is the presence of large patches in which the ratio of the z components has an almost constant value. This quasiuniformity of the successive DVF ratios across the slice indicates that the movement between two successive breathing phases can be well approximated as an amplitude scaling, as in Eq. 7, and that all the voxels within a uniform patch are moving in phase.
Figure 5.
Example of the ratio of the z components of two POPI DVFs for the voxels in one CT slice as shown in Fig. 1c. For this particular figure we considered the ratios of the DVF between phases 1 and 5 and between phases 1 and 4.
If the sequence of DVFs was strictly governed by a relation such as Eq. 7, with all voxels moving in phase, then the covariance matrix would be proportional to the outer product of one feature vector and its transpose, which has rank 1. There would then be only one principal mode (see Appendix A). If the voxels move with different phases and the amplitude function α(t) is a sine or cosine function, then the covariance matrix can be separated into a sine and cosine component, each with rank 1; the complete matrix then has rank 2 and there are two principal modes that are exactly 90° out of phase (see Appendix A). As the amplitude function becomes more complex, then voxel-by-voxel phase differences cause higher order modes to appear. Because normal breathing is regular and periodic, this supports our expectation that a DVF dominated by breathing motion should have only a few strong modes and that higher order modes will be mainly due to noise.
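The rank argument can be illustrated with a one-dimensional toy field. In the sketch below the spatial pattern, the sinusoidal amplitude function, and the phase offset of 0.7 rad are arbitrary choices for demonstration; `n_modes` is a hypothetical helper that counts nonzero PCA eigenvalues.

```python
import numpy as np

def n_modes(V, rel_tol=1e-10):
    """Count the nonzero PCA eigenvalues of a matrix with one sample per column."""
    V = V - V.mean(axis=1, keepdims=True)
    lam = np.linalg.eigvalsh(V.T @ V)
    return int(np.sum(lam > rel_tol * lam.max()))

t = np.linspace(0.0, 4.0 * np.pi, 40, endpoint=False)   # 40 samples over two cycles
x = np.linspace(0.0, 1.0, 100)
v0 = np.sin(np.pi * x)                                   # spatial amplitude pattern

# All voxels in phase: the field is v0(x) * alpha(t), an outer product of rank 1.
in_phase = np.outer(v0, np.sin(t))

# Two relative phases: sin(t + phi) splits into sine and cosine parts, giving rank 2.
phi = np.where(x < 0.5, 0.0, 0.7)
two_phase = v0[:, None] * np.sin(t[None, :] + phi[:, None])
```

Here `n_modes(in_phase)` gives one mode and `n_modes(two_phase)` gives two, matching the rank 1 and rank 2 cases described in the text.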
Figure 7 shows the eigenvalue spectra of the POPI landmarks and our own landmarks. This shows that random subsamples of vector components from a training set of feature vectors will have the same eigenvalue spectrum as the original training set. This is a necessary condition for our assumption that the DVF breathing signal PCA spectrum can be approximated by the landmark samples. Figure 8 shows the phase-to-phase temporal variation of the first two principal coefficients of the landmarks. Notice how the landmark principal coefficients track the breathing cycle but with a relative phase shift of approximately 90°. This conforms to our expectations based on the above discussion.
Figure 7.
The eigenvalue spectra for subsets of the POPI point landmarks and for our manually selected landmarks.
Figure 8.
The time dependence of the two strongest principal components of the DVFs computed from the POPI data.
In order to test our error reconstruction method, we constructed a simple 2D numerical simulation to emulate as much as possible the POPI 4DCT and the associated DVFs. Our typical model consists of a 128×128 pixel square DVF describing motion in the (x,y) plane. We assumed that during breathing, the displacement of each anatomical point was completely described by its relative amplitude and phase. (NB: The relative movement phases in the breathing model are not the same as the phase bins used to describe the individual time steps of a 4DCT.) We made a reference DVF that varied spatially as a simple half-sine function in x and y. The DVF amplitude at each pixel was then varied in time in proportion to a measured (i.e., nonsinusoidal) breathing signal to produce a temporal sequence of anywhere from N=5–200 DVFs spanning multiple breathing cycles. This way we could simulate a temporal sequence of thoracic DVFs from a 4DCT of arbitrary duration, which allowed us to see the effects of sample size N on the eigenmode and DVF error analysis. To simulate relative phase differences within the DVF, we shifted the relative breathing phases of individual pixels according to various spatial patterns. As the results will show, this model emulates the essential features of the breathing eigenmodes observed in the POPI 4DCT.
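A minimal sketch of the in-phase version of such a model is given below. The breathing waveform used here (a raised sine power) is a hypothetical stand-in for the measured signal, and the function name, period, and sample count are assumptions for illustration only.

```python
import numpy as np

def simulate_dvfs(n_samples, size=128, period=10.0):
    """In-phase toy DVFs: half-sine spatial pattern scaled by a breathing amplitude."""
    x = (np.arange(size) + 0.5) / size
    ref = np.outer(np.sin(np.pi * x), np.sin(np.pi * x))   # half-sine in x and y
    t = np.arange(n_samples)
    alpha = np.sin(np.pi * t / period) ** 4                # periodic, nonsinusoidal stand-in
    dvfs = ref[None, :, :] * alpha[:, None, None]          # shape (n_samples, size, size)
    return dvfs, alpha

dvfs, alpha = simulate_dvfs(20)
features = dvfs.reshape(dvfs.shape[0], -1).T               # M x N feature matrix
```

Relative phase differences could be introduced by giving each pixel its own time offset in `alpha` before forming the product.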
We modeled four different relative phase scenarios: (1) All the pixels in the DVF moved in phase; (2) alternating pixels had a phase shift; (3) pixels were randomly assigned one of three different phases; and (4) the (x,y) plane was divided in three sectors and each sector moved with a different phase. These DVFs represented the signal feature vectors [i.e., si in Eq. 4]. We randomly selected anywhere from 5 to 1000 pixels from the signal vector as ground truth landmarks to fill the landmark feature vectors. Then we generated simulated maps of spatially correlated errors. The error maps were created by generating a white-noise image with a Gaussian noise distribution of width σ (expressed as a percent fraction of the mean image intensity), taking its Fourier transform, multiplying the real and imaginary Fourier coefficients by the noise power spectrum (NPS) in Eq. 8,
| (8) |
and then taking the inverse Fourier transform to recover a spatial noise image. The error map created in this way had the NPS spectrum of Eq. 8. The factor λ determined the correlation length (i.e., the degree of spatial correlation) of the errors in each image. The error map was then added to the simulated breathing DVF. The process was repeated phase-by-phase, each time generating a new random 2D noise image.
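The Fourier filtering procedure can be sketched as follows. The NPS of Eq. 8 is not reproduced here, so the code substitutes an assumed Gaussian low-pass spectrum as a placeholder; the function name and the parameters `sigma` and `corr_len` are hypothetical.

```python
import numpy as np

def correlated_noise(size, sigma, corr_len, rng):
    """White Gaussian noise shaped in Fourier space by an ASSUMED Gaussian spectrum."""
    white = rng.normal(0.0, sigma, size=(size, size))
    fx = np.fft.fftfreq(size)                              # frequencies in cycles per pixel
    f2 = fx[:, None] ** 2 + fx[None, :] ** 2
    nps = np.exp(-((2.0 * np.pi * corr_len) ** 2) * f2 / 2.0)   # placeholder for Eq. 8
    shaped = np.fft.ifft2(np.fft.fft2(white) * nps)        # scales real and imaginary parts alike
    return shaped.real                                     # imaginary part is round-off only

rng = np.random.default_rng(5)
err = correlated_noise(128, sigma=1.0, corr_len=4.0, rng=rng)

# Correlation between horizontally adjacent pixels, a quick check of spatial coherence.
neighbor_corr = np.corrcoef(err[:, :-1].ravel(), err[:, 1:].ravel())[0, 1]
```

Calling the function once per phase with the same generator yields a new, independent error map for each DVF, as in the procedure described above.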
The error maps were spatially structured but varied randomly in amplitude from one DVF to the next. The error amplitudes were generated independently of the DVF amplitude in a given voxel. On average, the error magnitude was 10%–15% of the maximum DVF amplitude. The N noisy DVFs made up our set of feature vectors. We then applied our technique of PCA error reconstruction to the simulated data and estimated the error maps for comparison to the known errors. To evaluate the similarity in the spatial structure of the estimated and actual error maps, we calculated the normalized cross-correlation between the two maps.
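For reference, a zero-lag normalized cross-correlation between two maps can be computed as in the sketch below. The exact definition used in the study is not given, so the mean-subtracted form here is an assumption, as are the function name and test maps.

```python
import numpy as np

def normalized_cross_correlation(a, b):
    """Zero-lag, mean-subtracted normalized cross-correlation of two equal-size maps."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

rng = np.random.default_rng(6)
true_map = rng.normal(size=(128, 128))
scaled = 0.8 * true_map                     # same pattern, amplitude reduced by 20%
ncc_scaled = normalized_cross_correlation(true_map, scaled)
ncc_flipped = normalized_cross_correlation(true_map, -true_map)
```

With this definition the measure is insensitive to a uniform amplitude scaling, so it assesses agreement in spatial structure rather than in error magnitude.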
RESULTS
When PCA is applied to our hypothetical breathing model (the simulated DVFs without errors), it identifies only as many principal modes as there are breathing phases present in the motion. When all anatomical points move in one phase there is exactly one eigenmode (as explained above); when they move with two relative phases there are two eigenmodes; etc. The higher modes are the combined effect of multiple phases and a nonsinusoidal breathing waveform. Each new mode adds progressively less variance to the cumulative motion, so that even if there are more than three or four modes present, the higher order modes are negligible. (Recall that in this context, “phase” refers to the relative motion of different voxels in the DVF, not to the time sequence of the DVFs.) When noise is added, additional higher order modes appear in the eigenvalue spectrum. We see this in Fig. 9, which shows the eigenvalue spectra of the simulated DVFs, with and without added noise, for three relative phases of breathing motion. The eigenvalue spectrum of our simulated DVFs shares similar characteristics with the POPI spectrum.
Figure 9.
The eigenvalue spectra for the simulated three-phase breathing motion and for the DVF with errors. The landmark spectrum is identical to the breathing spectrum.
When one uses the PCA eigenvector matrix to rotate the temporal sequence of DVF feature vectors into a sequence of principal coefficient vectors pi, via Eq. 3, the components of each pi decrease in amplitude in proportion to the amount of residual variance that they describe. Each component of p fluctuates in time with the breathing amplitude. Figure 10 shows this for the simulated case of two modes associated with two phases of alternating pixels. The two principal components follow the breathing amplitude and are approximately 90° out of phase with each other, as expected from the discussion above. This is what we also saw in the PCA of the POPI data (cf. Fig. 8).
Figure 10.
The two nonzero principal components of a DVF moving with two breathing phases, versus time, together with the breathing amplitude.
As we remarked earlier while discussing Eq. B1, the eigenvalue spectrum is independent of the ordering of the DVF voxels in the PCA feature vector. This means that it is independent of the spatial pattern of phase variations. We saw numerically that this is the case: Arranging the three phases in our model randomly or by sector in the (x,y) plane gave the same result. This leads us to suspect that true breathing motion should have a comparatively simple PCA spectrum with only a few dominant modes associated mainly with phase differences. Any additional modes in the DVFs will be due to image noise and artifacts rather than physical motion. Again, this is what we saw in the POPI data. This is the basis for our hypothesis that PCA can be used as a filter to separate motion from noise.
To test the importance of sample size N on the convergence of the principal component solution, we downsampled the DVFs to 450 and then to 200 pixels and calculated the eigenvalues of the simulated data with three phases while increasing the number of DVF samples from N=5 to N=200. Figure 11 shows the first eigenvalue for each size vector versus training set size N. In both cases the eigenvalue approached a reasonably stable value once there were about N≥10 samples in the training set. For the subsampled DVFs, the error map converged to an accurate representation of the actual errors when there were approximately 20 or more landmarks.
Figure 11.
Convergence of the first eigenvalue of 10×10 and 15×15 pixel 2D DVFs as the number of training samples increases.
We then made a matrix of calculations of the estimated 128×128 pixel error map while varying the sample size from N=5 to N=100 and the number of landmarks from 5 to 1000. For each test case in the matrix, we calculated the normalized cross-correlation of the estimated and actual error maps for all breathing phases. Figure 12a shows one of the simulated DVFs with the added errors. The magnitude of the displacement at each pixel is given by the color scale. Figure 12b shows the actual map of errors in Fig. 12a; Fig. 12c is the map estimated from 50 samples and 40 landmarks using our PCA method. The cross-correlation between these two maps is 0.959. Table 1 summarizes the cross-correlations for the entire matrix of test cases while Fig. 13 displays the results graphically. We observed that, for 40 or more landmarks and 15 or more time samples, our estimated error map consistently showed a strong spatial correlation with the known error map (C>0.90), while the error amplitude was generally underestimated by 10%–30%. We also observed that the number of landmarks required for good accuracy increased very slowly (from 20 to 40) when the size of the DVF increased from 200 to 16 384 pixels.
Figure 12.
(a) The DVF with errors. (b) The actual error map. (c) The estimated error map. The color scale for the error maps is centered with yellow at zero to show the sign of the errors and is magnified six times relative to the DVF displacements.
Table 1.
Correlation coefficients between true and estimated error maps for simulation data.
| Number of samples (rows) / Number of landmarks (columns) | 1000 | 50 | 40 | 30 | 20 | 15 | 10 | 5 |
|---|---|---|---|---|---|---|---|---|
| 100 | 0.965 249 | 0.970 613 | 0.971 227 | 0.882 481 | 0.845 844 | 0.805 168 | 0.693 698 | 0.554 951 |
| 50 | 0.954 346 | 0.958 393 | 0.958 901 | 0.959 96 | 0.960 524 | 0.904 191 | 0.844 078 | 0.730 723 |
| 40 | 0.952 898 | 0.955 26 | 0.955 717 | 0.956 624 | 0.957 441 | 0.958 186 | 0.862 409 | 0.783 023 |
| 30 | 0.946 151 | 0.948 507 | 0.949 305 | 0.949 683 | 0.950 201 | 0.950 848 | 0.947 608 | 0.725 655 |
| 20 | 0.931 103 | 0.931 153 | 0.931 782 | 0.932 206 | 0.933 414 | 0.933 758 | 0.933 944 | 0.802 656 |
| 15 | 0.896 587 | 0.895 343 | 0.895 296 | 0.8968 | 0.896 428 | 0.896 163 | 0.896 609 | 0.845 068 |
| 10 | 0.880 022 | 0.882 914 | 0.882 637 | 0.883 397 | 0.881 115 | 0.881 791 | 0.881 509 | 0.8737 |
| 5 | 0.652 757 | 0.650 382 | 0.666 681 | 0.669 438 | 0.664 782 | 0.655 753 | 0.654 319 | 0.650 606 |
Figure 13.
The cross-correlation of the estimated and actual error maps as a function of the number of breathing phases (i.e., DVF samples) and the number of validation landmarks.
DISCUSSION
The validation of deformable image registration results must contend with limited ground truth data and registration errors that can have complex spatial distributions. Consequently, point-by-point landmark validation can give unrepresentative results that do not realistically reflect the global pattern of registration errors.17 We have tested a new concept for DIR error evaluation, which we hope will allow us to map the complete spatial distribution of DVF errors, when only a small number of validation landmarks are available. The new method applies PCA to the landmark movement to estimate the fundamental eigenmodes of breathing and then subtracts these modes from the principal modes of each complete DVF to expose the underlying errors. This makes possible a new and more realistic picture of deformable image registration errors than is customarily obtained via simple point landmark validation.
Our method requires a set of DVFs generated by a particular DIR method and assumes that the DVFs record anatomical motion that has spatial coherence and some degree of continuity from one DVF to the next. A good example of that (and the problem that motivated us to develop the method) is a set of DVFs obtained by registering the phases of a 4DCT. Using a simple numerical simulation of such a data set, for which we could define an absolute ground truth for the entire spatial error map, we have shown that our method is capable of recovering a reasonable approximation of the known errors. In particular, we have shown that it can accurately recover the spatial structure in the error map.
Principal components analysis is a logical solution to the problem of analyzing spatially correlated data because it projects the data onto a set of orthogonal eigenvectors that maximally decorrelate each projected mode of variance from all of the others. This allows us to identify the error modes and the breathing modes regardless of their patterns of spatial coherence.
It might be tempting to simplify the problem by assuming that the first few principal DVF modes are associated entirely with breathing while the noise and registration errors are associated entirely with the higher order modes (cf. Ref. 11). If this were the case, then we could simply cut the eigenvalue spectrum at the appropriate point and reconstruct the errors from the higher modes. Then we would avoid the need to estimate the breathing components from the landmarks. However, as Eq. 5 shows, each principal coefficient combines contributions from breathing and registration errors, i.e., although the modes are statistically decorrelated, they are not mathematically separated. [When we attempted to reconstruct the error map just from the higher modes without following the methodology of Eq. 6, the resulting estimated error map was very inaccurate.]
Within our simplifying assumptions, we have found that our method gives a robust reconstruction of the error map when there are ten or more DVF samples (i.e., breathing phases). This makes it practical for a typical 4DCT of 10–25 phases.18 We also found that the method gave good results with as few as 20–40 landmarks. Both of these results were very nearly independent of the number of pixels in the DVF, which offers some assurance that the reconstruction of a full 3D volumetric map will not require substantially more time bins or landmarks. However, it is important to ensure that the landmarks provide a reasonable sampling of the breathing modes. For example, if the landmarks cluster in one part of the thorax, but the breathing modes are asymmetrically distributed in the thorax, the landmarks might provide not only an incomplete picture of the DVF error distribution but also an incomplete picture of respiratory motion. By making DVF difference maps as in Figs. 5 and 6, one can select landmarks such that there are at least a few in each of the regions associated with a principal breathing mode.
Our method is in an early stage of development. So far, we have only used numerical simulations of 4DCT DIR results to test the general principles of the method and to validate our basic assumptions leading up to Eq. 6. These simulations were designed to be simple approximations to the basic behavior one would expect in 4DCT DVFs, so that we could perform basic proof of concept tests of the potential merit of the method. We used the POPI data to show that the basic properties of our simulation, as well as the basic properties of its PCA eigenmodes, are roughly consistent with a real 4DCT. The simulation was a long way from a realistic model of human respiratory motion as it is captured in a 4DCT, but it did show that our method has the basic capability of recovering spatially correlated error maps.
We are far from a complete validation of this technique. Our simulation assumed that the landmarks were a perfect ground truth when in reality they will have observer uncertainties that will introduce their own errors into the analysis. Already in our PCA analysis of the landmark movement, we found different eigenvalue spectra for different sets of landmarks, each with tails that are quite likely associated with observer inaccuracies. We must now analyze how those observer errors might confound our DIR error map. Furthermore, our simulation of a sequence of 4DCT DVFs was very simplistic when compared to real breathing motion. The next stage of testing will require physically realistic 4DCT and DVF data, which we intend to acquire with a 3D deformable imaging phantom rather than via more complicated simulation models. Realistic tests must also investigate the effects of irregularities in breathing (e.g., hiccups) that could affect the PCA breathing spectrum. Finally, it will be important to enlarge the PCA training set. A single 4DCT typically has only 10–15 phases, which is at the limit for which our method can work effectively. We can enlarge our training set in two ways: Either by obtaining multiple 4DCTs for a single patient, or by developing the means to assimilate 4DCTs from a cohort of patients. In both cases, we will try to exploit the assumption that breathing motion has the basic elements of regularity for individuals as well as across a population. However, because the use of DIR in the radiation therapy clinic will often involve patients with lung disease, the development of a 4DCT PCA training set across a cohort of patients must deal with the likelihood of substantially different breathing behaviors for some cases.
SUMMARY
We have presented here a proof of concept study for an improved method to analyze deformable image registration errors when the only ground truth is provided by a small set of measured landmarks. The method explicitly accounts for the spatial coherence of registration errors and is thus more physically realistic than an analysis that assumes that the errors are randomly distributed.
ACKNOWLEDGMENTS
This work was supported in part by NIH Grant No. P01-CA116602.
APPENDIX A: THE PRINCIPAL MODES OF QUASIPERIODIC BREATHING
Assume that voxel-by-voxel breathing motion can be completely described by the amplitude and temporal phase of the DVF vector field. In the simplest possible situation, suppose that our signal in Eq. 4 is a DVF for which every voxel moves in phase in proportion to some time-dependent breathing amplitude function $a(t)$. Let the error component be zero, so that $V$ just contains the true DVF breathing signal. Let our $N$ time samples of the DVF be taken at the breathing amplitude steps $a(t_j)=a_j$. If $\mathbf{v}_0$ is the feature vector representing the DVF at the reference time $t=0$, then each DVF feature vector at later time $t_j$ is given by
$$\mathbf{v}_j = a_j\,\mathbf{v}_0 .$$
If we write the initial DVF $\mathbf{v}_0$ and the breathing amplitude samples $a_j$ as column vectors, then our matrix of DVF feature vectors from Eq. 2 can be written as
$$V = \mathbf{v}_0\,\mathbf{a}^T .$$
From this it follows that
$$VV^T = \mathbf{v}_0\,\mathbf{a}^T\mathbf{a}\,\mathbf{v}_0^T = (\mathbf{a}^T\mathbf{a})\,\mathbf{v}_0\mathbf{v}_0^T .$$
Similarly,
$$V^TV = \mathbf{a}\,\mathbf{v}_0^T\mathbf{v}_0\,\mathbf{a}^T = |\mathbf{v}_0|^2\,\mathbf{a}\mathbf{a}^T . \tag{A1}$$
The column vectors $\mathbf{v}_0$ and $\mathbf{a}$ have rank 1.
Thus $VV^T$ and $V^TV$ have rank 1 and Eqs. 2 and B1 have exactly one nonzero eigenvalue and eigenvector. From Eqs. A1 and B1, we see that all of the covariance in the DVF, and thus in the PCA solution, comes from the time dependence of $\mathbf{a}$. Also, from Eqs. A1 and B1, it is apparent that the one nonzero PCA eigenvalue depends only on the magnitude of the reference DVF feature vector. This feature vector could contain all of the voxels or only selected landmarks. Thus, for this simplest of all cases, we hypothesize that Eq. 5 will hold.
More realistically, suppose that the breathing amplitude $a(t_j)$ is approximately periodic but each voxel $i$ can move with its own relative phase $\phi_i$. For example, approximate the breathing as a sine wave
As above, we write
as initial DVF vector amplitudes, let
and write
We observe that
and conclude that $\mathrm{rank}(V)=2$ for this case. If so, then $\mathrm{rank}(VV^T)=2$. This predicts that for a sinusoidal breathing amplitude and any arbitrary combination of voxel-by-voxel phase differences, the DVF will have exactly two nonzero eigenvalues and eigenvectors. If one sets up a numerical computation of this situation, then that is in fact what PCA shows.
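One way to write out the sinusoidal case explicitly is sketched below. The notation is an assumption made for illustration: $v_{0i}$ is the amplitude of voxel $i$, $\omega$ is the breathing angular frequency, and $\mathbf{c}$ and $\mathbf{s}$ are names chosen here for the two spatial vectors; $\mathbf{a}$ and $\mathbf{b}$ denote the two vectors of time samples.

```latex
\begin{align}
v_{ij} &= v_{0i}\,\sin(\omega t_j + \phi_i)
        = v_{0i}\cos\phi_i\,\sin\omega t_j + v_{0i}\sin\phi_i\,\cos\omega t_j, \\
c_i &= v_{0i}\cos\phi_i, \qquad s_i = v_{0i}\sin\phi_i, \qquad
a_j = \sin\omega t_j, \qquad b_j = \cos\omega t_j, \\
V &= \mathbf{c}\,\mathbf{a}^T + \mathbf{s}\,\mathbf{b}^T
   = \begin{pmatrix} \mathbf{c} & \mathbf{s} \end{pmatrix}
     \begin{pmatrix} \mathbf{a}^T \\ \mathbf{b}^T \end{pmatrix}, \\
V^T V &= |\mathbf{c}|^2\,\mathbf{a}\mathbf{a}^T + |\mathbf{s}|^2\,\mathbf{b}\mathbf{b}^T
       + (\mathbf{c}\cdot\mathbf{s})\left(\mathbf{a}\mathbf{b}^T + \mathbf{b}\mathbf{a}^T\right).
\end{align}
```

In this form $V$ is the product of an $M\times 2$ and a $2\times N$ matrix, so its rank is at most 2. The rank equals 2 whenever $\mathbf{c}$ and $\mathbf{s}$ are linearly independent (the voxel phases are not all equal modulo $\pi$) and the sampled sine and cosine are linearly independent.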
For this case we now have
| (A2) |
Again the covariance is due to the time variation of the DVF due to the breathing amplitude represented by a and b. The DVF signal (landmark) reference vector enters as a vector magnitude. However, if we are only going to use selected landmarks to estimate our signal modes, we see from Eq. A2 that they should be representative of the different phases. From this, Eq. 5 arises naturally. We rely on the numerical simulation to validate these assumptions.
If the breathing function is more complicated than a simple sine wave, then more modes will appear in the principal components of the DVF, but the PCA spectrum will still be dominated by just two or three modes for all but the most irregular breathing patterns. That is in fact what we see in the POPI data.
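The mode counts discussed in this appendix can be reproduced with a small numerical sketch. The amplitudes, phases, sample count, and the 10% second harmonic are arbitrary illustrative choices, and the eigenvalues are taken from V^T V without mean subtraction, following the appendix.

```python
import numpy as np

def nonzero_eigenvalue_count(V, rel_tol=1e-10):
    """Number of eigenvalues of V^T V above rel_tol times the largest one."""
    evals = np.linalg.eigvalsh(V.T @ V)
    return int(np.sum(evals > rel_tol * evals.max()))

rng = np.random.default_rng(2)
M, N = 300, 20                                  # voxels, time samples
t = np.arange(N) / N                            # one breathing cycle
omega = 2.0 * np.pi
v0 = rng.uniform(0.5, 2.0, size=M)              # per-voxel amplitudes
phi = rng.uniform(0.0, 2.0 * np.pi, size=M)     # per-voxel phases
psi = rng.uniform(0.0, 2.0 * np.pi, size=M)     # phases of the second harmonic

# Case 1: every voxel moves in phase, V = v0 a^T.
V_in_phase = np.outer(v0, np.sin(omega * t))
# Case 2: sinusoidal breathing with an independent phase for each voxel.
V_phased = v0[:, None] * np.sin(omega * t[None, :] + phi[:, None])
# Case 3: case 2 plus a weak second harmonic (a less regular waveform).
V_harmonic = V_phased + 0.1 * v0[:, None] * np.sin(2 * omega * t[None, :] + psi[:, None])

counts = [nonzero_eigenvalue_count(V) for V in (V_in_phase, V_phased, V_harmonic)]
evals = np.linalg.eigvalsh(V_harmonic.T @ V_harmonic)[::-1]
top_two_fraction = evals[:2].sum() / evals.sum()
print(counts, top_two_fraction)
```

The in-phase and phased cases give one and two nonzero modes respectively. Adding the harmonic introduces further modes, but the first two still carry nearly all of the variance.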
APPENDIX B: PRINCIPAL COMPONENTS CALCULATION FOR HIGH-DIMENSIONALITY DATA
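Multiply both sides of Eq. 2 by $V^T$:
$$V^TVV^T\mathbf{u}_i = \lambda_i V^T\mathbf{u}_i .$$
Let $\mathbf{w}_i=V^T\mathbf{u}_i$ so that we get the new eigenvalue equation for the N×N matrix $V^TV$,
$$V^TV\,\mathbf{w}_i = \lambda_i\,\mathbf{w}_i , \tag{B1}$$
which has the same first N−1 eigenvalues as the original M×M matrix $VV^T$ (which has M−N+1 additional eigenvalues, all equal to zero). To get the eigenvectors of Eq. 2, multiply by $V$:
$$VV^T\,(V\mathbf{w}_i) = \lambda_i\,(V\mathbf{w}_i),$$
from which we see that $V\mathbf{w}_i=\mathbf{u}_i$ to within a normalization factor. To complete the derivation we normalize
$$\mathbf{u}_i = \frac{V\mathbf{w}_i}{\lVert V\mathbf{w}_i\rVert} . \tag{B2}$$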
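A minimal numerical sketch of this calculation follows. The function name, the random test data, and the relative tolerance used to discard zero eigenvalues are assumptions for illustration; the normalization divides each vector $V\mathbf{w}_i$ by its own length.

```python
import numpy as np

def pca_via_small_matrix(V, rel_tol=1e-10):
    """Nonzero eigenvalues and unit eigenvectors of V V^T from the N x N problem.

    V is M x N with M >> N. The eigenvectors w_i of V^T V are mapped to
    u_i = V w_i / ||V w_i||, so the M x M matrix V V^T is never formed.
    """
    evals, W = np.linalg.eigh(V.T @ V)        # ascending order
    order = np.argsort(evals)[::-1]
    evals, W = evals[order], W[:, order]
    keep = evals > rel_tol * evals[0]         # drop numerically zero modes
    evals, W = evals[keep], W[:, keep]
    U = V @ W
    U /= np.linalg.norm(U, axis=0, keepdims=True)
    return evals, U

rng = np.random.default_rng(3)
M, N = 500, 8
V = rng.normal(size=(M, N))
V -= V.mean(axis=1, keepdims=True)   # centering leaves at most N-1 nonzero modes

evals, U = pca_via_small_matrix(V)
# Check the original eigenvalue problem, V V^T u_i = lambda_i u_i.
residual = np.linalg.norm(V @ (V.T @ U) - U * evals)
print(len(evals), residual)
```

For a 128×128 DVF with two components per pixel, this replaces a 32 768×32 768 eigenproblem with one whose size equals the number of samples.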
The authors report no conflicts of interest in conducting the research.
References
1. Mageras G. S., “Respiration correlated CT techniques for gated treatment of lung cancer,” Radiother. Oncol. 64 (Supplement 1), S75–S75 (2002). doi:10.1016/S0167-8140(02)82539-7
2. Low D. A. et al., “A method for the reconstruction of four-dimensional synchronized CT scans acquired during free breathing,” Med. Phys. 30(6), 1254–1263 (2003). doi:10.1118/1.1576230
3. Keall P. G. et al., “Acquiring 4D thoracic CT scans using a multislice helical method,” Phys. Med. Biol. 49(10), 2053–2067 (2004). doi:10.1088/0031-9155/49/10/015
4. Orban de Xivry J. et al., “Tumour delineation and cumulative dose computation in radiotherapy based on deformable registration of respiratory correlated CT images of lung cancer patients,” Radiother. Oncol. 85, 232–238 (2007). doi:10.1016/j.radonc.2007.08.012
5. Birkner M. et al., “Adapting inverse planning to patient and organ geometrical variation: Algorithm and implementation,” Med. Phys. 30, 2822–2831 (2003). doi:10.1118/1.1610751
6. Zhang P., Hugo G. D., and Yan D., “Planning study comparison of real-time target tracking and four-dimensional inverse planning for managing patient respiratory motion,” Int. J. Radiat. Oncol., Biol., Phys. 72(4), 1221–1227 (2008). doi:10.1016/j.ijrobp.2008.07.025
7. Brock K. K. et al., “Results of a multi-institution deformable registration accuracy study (MIDRAS),” Int. J. Radiat. Oncol., Biol., Phys. 76(2), 583–596 (2010).
8. Boldea V., Sharp G. C., Jiang S. B., and Sarrut D., “4D-CT lung motion estimation with deformable registration; quantification of motion nonlinearity and hysteresis,” Med. Phys. 35(3), 1008–1018 (2008). doi:10.1118/1.2839103
9. Castillo R. et al., “A framework for evaluation of deformable image registration spatial accuracy using large landmark point sets,” Phys. Med. Biol. 54, 1849–1870 (2009). doi:10.1088/0031-9155/54/7/001
10. Vandemeulebroucke J., Sarrut D., and Clarysse P., “The POPI-model, a point-validated pixel-based breathing thorax model,” in the Proceedings of the 15th International Conference on the Use of Computers in Radiation Therapy (ICCR), Toronto, Canada, 2007.
11. Zhang Q., Pevsner A., Hertanto A., Hu Y. C., Rosenzweig K. E., Ling C. C., and Mageras G. S., “A patient-specific respiratory model of anatomical motion for radiation treatment planning,” Med. Phys. 34(12), 4772–4781 (2007). doi:10.1118/1.2804576
12. Klinder T., Lorenz C., and Ostermann J., Free-Breathing Intra- and Inter-Subject Respiratory Motion Capturing, Modeling, and Prediction (SPIE Medical Imaging, Orlando, FL, 2009).
13. Kittler J. and Young P. C., “A new approach to feature selection based on the Karhunen-Loeve expansion,” Pattern Recogn. 5, 335–352 (1973). doi:10.1016/0031-3203(73)90025-3
14. Bishop C. M., Pattern Recognition and Machine Learning (Springer, New York, 2006).
15. Pennec X., Cachier P., and Ayache N., “Understanding the demon’s algorithm: 3D non-rigid registration by gradient descent,” in Medical Image Computing and Computer-Assisted Intervention MICCAI ’99, Cambridge, United Kingdom, edited by Taylor C. and Colchester A. (Springer-Verlag, Berlin, 1999); [Lect. Notes Comput. Sci. 1679, 597–605 (2009)].
16. Kybic J. and Unser M., “Fast parametric elastic image registration,” IEEE Trans. Image Process. 12(11), 1427–1442 (2003). doi:10.1109/TIP.2003.813139
17. Kabus S., Klinder T., Murphy K., van Ginneken B., Lorenz C., and Pluim J. P. W., “Evaluation of 4D-CT lung registration,” in Proceedings of MICCAI 2009, 2009, edited by Yang G. Z. et al., pp. 747–754.
18. Keall P. J., Mageras G. S., Balter J. M., Emery R. S., Forster K. M., Jiang S. B., Kapatoes J. M., Low D. A., Murphy M. J., Murray B. E., Ramsey C. R., Van Herk M. B., Vedam S. S., Wong J. W., and Yorke E., “The management of respiratory motion in radiation oncology report of AAPM Task Group 76,” Med. Phys. 33, 3874–3900 (2006). doi:10.1118/1.2349696













