Abstract
Lung nodule volumetry is used for nodule diagnosis, as well as for monitoring tumor response to therapy. Volume measurement precision and accuracy depend on a number of factors, including image-acquisition and reconstruction parameters, nodule characteristics, and the performance of algorithms for nodule segmentation and volume estimation. The purpose of this article is to provide a review of published studies relevant to the computed tomographic (CT) volumetric analysis of lung nodules. A number of underexamined areas of research regarding volumetric accuracy are identified, including the measurement of nonsolid nodules, the effects of pitch and section overlap, and the effect of respiratory motion. The need for public databases of phantom scans, as well as of clinical data, is discussed. The review points to the need for continued research to examine volumetric accuracy as a function of a multitude of interrelated variables involved in the assessment of lung nodules. Understanding and quantifying the sources of volumetric measurement error in the assessment of lung nodules with CT would be a first step toward the development of methods to minimize that error through system improvements and to correctly account for any remaining error.
© RSNA, 2009
Lung nodule measurements made with computed tomography (CT) are used in clinical practice to assess size change estimated from serial scans obtained over time to predict the likelihood of malignancy (1) and to monitor the response of tumor to treatment (2). Size measurements need to be accurate and consistent to enable assessment of nodule change in a short time interval. The time interval depends on the specific clinical circumstance; some lung cancers, particularly adenocarcinomas, are more aggressive than others and may spread outside the thorax and become systemically disseminated, even when the primary tumor is small. Nodules as small as 5 mm (about 65 mm³ in volume) may require short-term follow-up in as little as 3–6 months. A nodule of that size will have doubled in volume when it measures 6.3 mm in diameter. Such small changes in size are difficult to recognize visually, particularly when nodules are irregular in shape. Similarly, short-term knowledge of tumor response is needed to make patient-specific therapy decisions to give the best possible clinical outcome.
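The size figures quoted above follow from the geometry of a sphere. As a minimal sketch, assuming an ideal spherical nodule (clinical nodules are often irregular) and using hypothetical helper names, the arithmetic can be checked as follows:

```python
import math

def sphere_volume_mm3(diameter_mm: float) -> float:
    """Volume of an ideal sphere, V = (pi / 6) * d**3, in cubic millimeters."""
    return math.pi / 6.0 * diameter_mm ** 3

def diameter_after_volume_ratio(diameter_mm: float, volume_ratio: float) -> float:
    """Diameter of a sphere after its volume is multiplied by volume_ratio."""
    return diameter_mm * volume_ratio ** (1.0 / 3.0)

v5 = sphere_volume_mm3(5.0)                        # about 65 mm^3
d_doubled = diameter_after_volume_ratio(5.0, 2.0)  # about 6.3 mm
print(f"{v5:.1f} mm^3, {d_doubled:.2f} mm")
```

A doubling in volume therefore corresponds to a diameter increase of only about 26%, which is why such changes are hard to see on a single section.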
Currently, nodule size is typically evaluated by comparing the maximum diameter of a nodule on serial scans in accordance with the Response Evaluation Criteria in Solid Tumors (RECIST) (2,3), which place nodule response in one of four categories: complete response, partial response, stable disease, and disease progression. RECIST is an update to the 1979 World Health Organization method (4), which relied on two-dimensional measures achieved by multiplying a tumor's maximum diameter in the transverse plane by its largest perpendicular diameter on the same image.
While RECIST has been promoted as a simple and practical one-dimensional measurement approach that provides more reproducible results than the World Health Organization method, both criteria suffer from several limitations (2), including the assumptions that tumor size changes in a symmetric fashion, that tumor volume is simply related to a planar measurement, and that four discrete categories of volume change are sufficient to quantify disease response or progression. In actuality, tumors do not necessarily grow symmetrically; different portions may grow at different rates (5). Furthermore, there is substantial variability in the RECIST measures within an observer and between different observers as they interpret the displayed data and interact with the measurement software tools (6–8). This variability diminishes the statistical power of clinical trials designed to determine whether a therapy truly affects tumor growth, which in turn results in an extension of the time needed for patient participation in the trial and/or an increase in the number of patients required to make this determination. Thus, there has been substantial discourse regarding the need for other measurement approaches that could improve the assessment of change in tumor size.
Since the introduction of the first commercial helical CT scanner in 1990 (9), helical CT has been substantially improved, reaching the point where systems are capable of scanning large anatomic volumes with high axial resolution (<1.0 mm) in a single breath hold. These improvements have led to the development of three-dimensional methods for nodule volumetry, with the aim of more accurate and consistent tumor measurement and, therefore, better determination of temporal change in a shorter interval of time. One of the first applications of volumetric analysis was the study by Yankelevitz et al (5) for estimating the growth rate of small nodules. Since then, a number of studies have focused on issues related to the volumetric assessment of lung nodules with CT. The key issue in volumetric analysis is the precision and accuracy of volumetric measurements, which depend on a number of interrelated factors, including scan acquisition and reconstruction parameters, nodule characteristics, and the measurement techniques used in the volume estimation process.
The purpose of this article is to provide a review of findings from published studies relevant to the volumetric CT analysis of lung nodules. The review will focus on the sources and extent of error in the volume-based estimation of nodule size. Volumetric error will be discussed in terms of both accuracy (or bias: a systematic error that contributes to the difference between test results and an accepted reference value) and precision (or variance: the closeness of agreement among test results obtained given prescribed conditions).
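The distinction between bias and variance can be made concrete with repeated measurements of an object of known volume. The following is a minimal sketch, assuming percent error as the metric; the function name and the measurement values are hypothetical and are not taken from any study reviewed here.

```python
from statistics import mean, stdev

def bias_and_precision(measured_mm3, true_mm3):
    """Return (percent bias, percent standard deviation) of repeated measurements."""
    percent_errors = [100.0 * (m - true_mm3) / true_mm3 for m in measured_mm3]
    # Bias: mean signed error. Precision: spread of the errors around that mean.
    return mean(percent_errors), stdev(percent_errors)

# Hypothetical repeated measurements of a phantom nodule of known volume.
true_volume = 65.4
measurements = [70.1, 72.3, 68.9, 71.5, 69.8]
bias, spread = bias_and_precision(measurements, true_volume)
print(f"bias = {bias:+.1f}%, precision (SD) = {spread:.1f}%")
```

In this made-up example every measurement overestimates the true volume, so the bias is positive even though the measurements agree closely with one another.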
An evaluation of the variance of volumetric methods on clinical nodules (ie, nodules seen at clinical CT examinations) requires repeat scans on stable or nongrowing nodules. However, it is insufficient to simply include serial images of patients deemed to be clinically “stable” because they do not necessarily have zero change in tumor size. According to the RECIST, nodules in the stable disease category could be up to 20% larger or 30% smaller in diameter on repeat scans (3).
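To see how wide the stable disease category is in volumetric terms, the diameter limits can be converted to volume limits. This is a sketch under the assumption of a spherical nodule that changes size uniformly; the function name is hypothetical.

```python
def volume_change_percent(diameter_change_percent: float) -> float:
    """Percent volume change of a sphere for a given percent diameter change."""
    ratio = 1.0 + diameter_change_percent / 100.0
    # Volume scales with the cube of the diameter.
    return 100.0 * (ratio ** 3 - 1.0)

upper = volume_change_percent(+20.0)  # diameter 20% larger
lower = volume_change_percent(-30.0)  # diameter 30% smaller
print(f"{upper:+.1f}% to {lower:+.1f}% in volume")
```

Under that assumption, a nodule classified as stable could be roughly 73% larger or 66% smaller in volume, which is far from zero change.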
One way to obtain nodules that are truly unchanged in size for testing volumetric measurements is to perform repeat CT acquisitions in a patient after only a few minutes. However, patient exposure considerations generally preclude the collection of this type of data on a large scale. Even if such studies were conducted, the data provide information only on the variability of the volume estimate but not on its bias, since the true volume of the underlying nodule is still unknown. Obtaining ground truth on the size of lung nodules is a difficult task because of the difficulty in obtaining pathologic truth and the uncertainty associated with the use of observer output as truth (10).
Alternatively, phantom studies incorporating synthetic lung nodules have been employed in research studies to provide a framework incorporating known truth, thus allowing both bias and variance analysis of volume measurement error. Phantom studies have been employed in thoracic CT since the early work of Zerhouni et al (11) to standardize nodule attenuation measurements. Results from studies performed with both phantom (5,12–21) and clinical (5,7,8,13,15,17,18,22–33) data on the volumetric assessment of lung nodules with CT will be reviewed and discussed in this article.
EFFECT OF VOLUME MEASUREMENT METHOD
A number of volumetric measurement methods have been reported in the literature, while others are available as commercial or public domain software. The choice and use of a measurement method can influence the precision and accuracy of volumetric analysis. It is difficult to compare performance results across methods since most are evaluated with different data sets and the use of different performance metrics and different observers. Moreover, some methods are designed for volume measurements of synthetic nodules in phantom experiments, while others are designed for clinical data, which have a wider range of nodule shapes, sizes, attenuation values, and surrounding structures.
Volume measurement methods in the context of this review fall mainly into two categories: semiautomated and manually derived. Semiautomated algorithms are typically initiated by defining a region of interest around a nodule or by a user-provided point inside the nodule area. Depending on the application, segmentation algorithms are then employed to delineate nodules from the surrounding lung parenchyma and neighboring structures such as attached vasculature and pleural surfaces. Manually derived methods require users to interactively delineate nodule boundaries, typically in a section-by-section fashion; this is followed by an application of three-dimensional software to merge the two-dimensional boundaries into a volume. The estimate of nodule volume is then based on the total number of voxels within the segmented region. The majority of volume measurement methods use voxel counting; therefore, these methods will be featured prominently in this review.
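Voxel counting itself is a simple computation once a segmentation exists. The following minimal sketch assumes a binary mask stored as nested lists and a known voxel spacing; the function name, the mask, and the spacing values are hypothetical.

```python
def voxel_count_volume_mm3(mask, spacing_mm):
    """Volume from a binary mask: (number of nodule voxels) x (volume per voxel).

    mask       -- nested lists indexed [section][row][column], truthy = nodule
    spacing_mm -- (section spacing, row spacing, column spacing) in millimeters
    """
    dz, dy, dx = spacing_mm
    n_voxels = sum(1 for section in mask for row in section for v in row if v)
    return n_voxels * dz * dy * dx

# Hypothetical two-section mask with 0.7 x 0.7 mm pixels and 1.25 mm sections.
mask = [
    [[0, 1, 1, 0],
     [1, 1, 1, 1],
     [0, 1, 1, 0]],
    [[0, 0, 1, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 0]],
]
volume = voxel_count_volume_mm3(mask, (1.25, 0.7, 0.7))
print(f"{volume:.2f} mm^3")
```

The estimate is only as good as the mask: every voxel is counted as entirely nodule or entirely background, so the result depends on how the segmentation handles boundary voxels.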
Algorithmic Approaches Attempting to Compensate for Partial Volume Effect
Some voxel-counting algorithms attempt to improve measurement accuracy by accounting for voxels affected by the partial volume effect, an imaging artifact caused by the limited resolution of CT scanners and the averaging of the linear attenuation coefficients of all materials in a pixel (34). The problem results when an algorithm counts a percentage of voxels as containing pure nodule tissue or pure parenchyma tissue when, in fact, the voxels contain both (17). Since nodule tissue attenuation values may be considerably higher than those of surrounding parenchyma tissue, voxels containing only a small portion of nodule tissue may be interpreted as nodule voxels, which leads to an overestimation of true volume (13). It has been shown that the percentage of partial volume voxels decreases as nodule size increases and as section thickness decreases (18).
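The overestimation described above can be illustrated with a purely geometric toy model. The sketch below assumes an ideal sphere sampled on a cubic voxel grid, estimates the fraction of each voxel occupied by the sphere, and compares two estimates: counting every voxel whose occupancy reaches a low threshold, and weighting each voxel by its occupancy. It is not a CT simulation, and the fraction weighting is an illustrative stand-in rather than the method of any cited study; all names and parameter values are hypothetical.

```python
import math

def occupancy(center, voxel_mm, radius_mm, n_sub=4):
    """Fraction of a cubic voxel inside a sphere centered at the origin,
    estimated by testing n_sub**3 points on a regular sub-voxel grid."""
    step = voxel_mm / n_sub
    offsets = [-voxel_mm / 2 + step * (i + 0.5) for i in range(n_sub)]
    inside = 0
    for ox in offsets:
        for oy in offsets:
            for oz in offsets:
                x, y, z = center[0] + ox, center[1] + oy, center[2] + oz
                if x * x + y * y + z * z <= radius_mm ** 2:
                    inside += 1
    return inside / n_sub ** 3

def estimate_volumes(diameter_mm, voxel_mm, threshold):
    """Return (true, thresholded, fraction-weighted) volumes in mm^3."""
    radius = diameter_mm / 2
    half = math.ceil(radius / voxel_mm) + 1  # grid half-width in voxels
    voxel_volume = voxel_mm ** 3
    thresholded = weighted = 0.0
    for i in range(-half, half + 1):
        for j in range(-half, half + 1):
            for k in range(-half, half + 1):
                center = (i * voxel_mm, j * voxel_mm, k * voxel_mm)
                f = occupancy(center, voxel_mm, radius)
                if f >= threshold:
                    thresholded += voxel_volume  # whole voxel counted as nodule
                weighted += f * voxel_volume     # only the occupied share counted
    true = math.pi / 6 * diameter_mm ** 3
    return true, thresholded, weighted

# 5-mm sphere, 1-mm isotropic voxels, voxel counted if at least 10% occupied.
true_v, thresh_v, weighted_v = estimate_volumes(5.0, 1.0, threshold=0.1)
print(f"true {true_v:.1f}, thresholded {thresh_v:.1f}, weighted {weighted_v:.1f} mm^3")
```

With a permissive threshold, the thresholded count exceeds the true volume because boundary voxels are counted in full, whereas the weighted estimate stays close to it. Repeating the call with a larger diameter or a smaller voxel size shows the boundary voxels becoming a smaller share of the total.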
A number of approaches have attempted to compensate for the partial volume effect. The major differences between these approaches are rooted in the definitions of what constitutes a voxel containing pure nodule tissue, pure parenchyma tissue, or a mixture of the two. These varying definitions result in different volume measurements. In a phantom study by Ko et al (12) focusing on small nodules (diameter, <5 mm), three volumetric methods that differed in their definitions of nodule and parenchyma voxels were compared in terms of mean absolute error. Figure 1 is a CT section showing the synthetic nodules used in that study. Results demonstrated that algorithm choice had a significant effect on measurement error for low-dose (20-mAs) scans. Kuhnigk et al (17) used a voxel-counting method, where the contribution of partial volume voxels to the overall volumetric measurement was weighted on the basis of their distance from the segmented nodule boundary. The use of this method improved reproducibility in volumetry of clinical nodules across different image-acquisition protocols. An evaluation with synthetic nodules showed that algorithm performance depended on section thickness and the choice of reconstruction kernel, which was expected since these parameters influence the partial volume effect.
Segmentation Methods for Juxtavascular and Juxtapleural Nodules
Clinical nodules are often attached to surrounding structures, including vasculature and the pleural surface adjoining the thoracic wall. Kuhnigk et al (17) analyzed a set of 700 lung nodules, 90% of which had visible connections to vessels and 30% of which were adjacent to the pleura. Nodule attachments make it difficult to accurately define boundaries, as is evident from the interobserver variability that exists in the task of volume estimation with the use of manually drawn nodule boundaries (30), and can contribute to errors in the volume measurement process (35,36).
A number of semiautomated two- and three-dimensional lung nodule segmentation methods, initialized by observer-defined nodule localizations, have been developed. Since attenuation information is not sufficient to distinguish nodule boundaries from attached vessels, most methods incorporate morphologic operators. Kostis et al (23) developed a model-based volumetric measurement method that classifies nodules into four types on the basis of surrounding structures. Results showed acceptable segmentations (as determined by a radiologist) in approximately 80% and 72% of segmented nodules with vascular and pleural attachments, respectively. Reeves et al (18) developed a multistage three-dimensional nodule-segmentation algorithm for performing volume change analysis. Examples of pleural surface and vascular segmentation are shown in Figure 2. Results in 50 stable nodules showed an error (standard deviation in percent volume change) on the order of 9%. Related work includes the studies by Zhao et al (37,38), Wiemker et al (39,40), and Okada et al (26).
Commercial and Public-Domain Software for Volumetric Analysis
A number of studies have reported on the use of commercial or public domain software for volumetric assessment of lung nodules. Das et al (19) used a commercially available lung analysis software package (LungCARE; Siemens, Forchheim, Germany) for volume measurements of synthetic nodules with various attachment categories, as illustrated in Figure 3. Four 16-section CT scanners from four vendors were used in the analysis. Overall absolute percentage error varied for different nodule attachment categories and was highest for pleural nodules, where it ranged from 10.3% to 21.2% across vendors. Magnitudes of absolute percentage error as high as 28% were also reported by Kinnard et al (20) in a phantom study that used public-domain software (OsiriX, version 2.7.5 [41]).
The results from the studies above show the dependence of volumetric error on the performance of the segmentation algorithms, particularly in the presence of the nodule's vascular and pleural attachments. The limitations and assumptions used in the design of a specific algorithm need to be well understood. For instance, the software used in the Das et al study (19) assumed that nodules have a spherical shape. This assumption does not necessarily hold for clinical nodules. Another issue is the failure (incorrect segmentation) rate of measurement methods, which ranged from 20% to 28% for nodules with different attachments in the Kostis et al study (23). Software might require the supervision of, and perhaps correction by, a radiologist to reach acceptable levels of performance. Finally, a limitation of the majority of volumetric measurement algorithms is that they are only capable of segmenting solid nodules.
Variability in the Use of Software for Volumetric Measurements
Differences in volumetric measurements are also present in repeat scans of the same object. These differences occur due to the inherent variability of the acquisition system (42–44), as well as variability associated with the manually drawn boundaries and the use of volume measurement tools by human observers (7,27). These issues will be discussed in this section.
Goodman et al (27) presented results on the variability of volumetric measurements made with a commercial semiautomated software program (Advantage Lung Analysis; GE Healthcare). The evaluation was performed on a set of 50 nodules across three repeat scans read by three observers. The second and third scans were acquired during two breath holds performed 10–20 minutes after the first scan. Nodule volume was estimated as the average volume measurement of the three observers at the first scan and was used as ground truth to measure interobserver variability. This measure of interobserver variability is inherently biased because each observer is compared against a standard that is based in part on his or her own contours. Interscan variability was measured as the percentage difference between the average volume measurements and the estimated volume. Results showed significant interscan variability (on the order of 13%) but minimal interobserver variability. High interobserver agreement was attributed to the use of semiautomated software.
Similar high interobserver agreement was reported by Revel et al (25) and Juluru et al (45). In studies by Gietema et al (29,32), the most important cause of variability was incomplete segmentation due to irregular shape or margin.
In addition to intra- and interscan variability related to semiautomated measurement tools, studies have also focused on the variability of manually derived volumetric measurements. In a study that was part of the Lung Image Database Consortium, Meyer et al (30) evaluated sources of variability in defining the spatial extent of lung nodules. Six experienced thoracic radiologists used three software methods. The first method was a manual boundary-delineation technique, while the other two were semiautomatic. For one of the semiautomated algorithms, the user drew a line from the inside of the nodule to the outside, and this trace was used to determine the segmentation threshold. The other semiautomated algorithm presented a set of prethresholded nodules, from which the radiologist could pick the one he or she perceived to be the most accurate. This algorithm also contained manual software tools so that the radiologists could locally adjust the resulting segmentations. The overall focus of the study was to let experts determine truth and to have the resulting boundaries reflect each decision. The radiologists independently applied each nodule-contouring method to the boundaries of 23 lung nodules. Results showed that statistically significant variability existed between volumetric measurements, with interobserver variability accounting for approximately 40% of total volume variance. On the basis of the standard error across all radiologist-method combinations, the measured volume would need to change by more than 55% to have at least 95% confidence that the measured difference was an actual nodule volume change and not measurement error.
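A change threshold of this kind can be related to measurement variability through a common statistical convention. The sketch below assumes that the errors of two measurements are independent and normally distributed with the same standard deviation, so that their difference has a standard deviation larger by a factor of the square root of 2. This is not necessarily the computation used by Meyer et al, and the 10% input is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def minimum_detectable_change(sd_single_percent: float, confidence: float = 0.95) -> float:
    """Smallest percent change exceeding measurement error at the given
    two-sided confidence level, for two independent measurements."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    return z * sqrt(2.0) * sd_single_percent

threshold = minimum_detectable_change(10.0)  # hypothetical 10% SD per measurement
print(f"change must exceed about {threshold:.1f}%")
```

Under these assumptions, the detectable change is nearly three times the standard deviation of a single measurement, which is why modest measurement variability translates into large change thresholds.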
Results from the above studies underscore the idea that both bias (measurement deviation from the true value) and variance should be considered when choosing software for volumetric measurements. Automated tools may decrease observer variability, but they may also yield large biases, particularly in the presence of nodule-surrounding structures such as vessels. By allowing observers to interact with the software, it may be expected that these biases would be reduced. Research efforts are needed to develop methods for minimizing interobserver variability in the use of such interactive tools.
EFFECT OF SCAN ACQUISITION AND RECONSTRUCTION PARAMETERS
Other than system-design parameters, such as number and size of detectors and tube-gantry geometry, a number of user-defined settings can influence volumetric measurements of lung nodules. The CT acquisition and nodule parameters for the major studies referenced in this review are summarized in the Table.
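Table 1. CT acquisition and nodule parameters for the major studies referenced in this review (table body not reproduced).
Note: NA = not available. Manufacturer locations: Philips, Best, the Netherlands; Picker, Cleveland, Ohio. Unless otherwise indicated, exposure data are in milliampere-seconds. Unless otherwise indicated, nodule size data are the diameter (d) or average (s, length, width) in millimeters. Nodule volume data (v) are in cubic millimeters.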
Section Thickness, Collimation, and Overlap
In clinical practice, initial screening and diagnostic thoracic CT examinations are often performed with thicker (≥3-mm) sections for improved clinical workflow (13). Follow-up examinations of suspicious areas or nodules are then performed with thinner (1–3-mm) sections. These differences in section thickness need to be considered when comparing volumes between serial scans.
A number of studies have examined the effect of section thickness on the variability in volumetric measurement error of lung nodules. Winer-Muram et al (13) showed an average percentage difference of 20% in volumetric measurements between thin and thick sections (36% for the smallest tumors). Similar findings regarding the effect of section thickness were reported by Zhao et al (46), Petrou et al (31), and Kuhnigk et al (17). These studies also demonstrated that section thickness has a larger effect on the measurement error of small nodules.
Phantom studies have also been used to examine the bias of volumetric measurements as a function of section thickness. In the Winer-Muram et al study (13), measurements of 11 spherical nodules ranging from 12.7 to 38.1 mm in diameter were derived. Results showed volume overestimation that varied directly with section thickness and inversely with tumor diameter. For the largest nodule, bias error ranged from 11.15% to 16.44% for scans acquired with section thicknesses of 2 mm and 10 mm, respectively, whereas for the smallest nodule the error ranged from 13.04% to 28.04%. Similar results showing a more pronounced volumetric error for thicker sections were reported by Tao et al (15) and Way et al (21).
The effect of collimation (acquiring sections of a particular thickness from multiple detector rows) on volumetric accuracy with multisection scanners has only recently been examined. In the study by Das et al (19), the effect of section collimation on volumetric measurement accuracy was examined in a comparison of four different 16-section CT scanners. Depending on the system's settings, comparisons were made between 16 detector rows at 1.5 mm and at 0.75 mm section collimation (Siemens; Philips Medical Systems, Best, the Netherlands), between 16 detector rows at 1.25 mm and at 0.625 mm (GE Healthcare), and between 16 detector rows at 1.0 mm and at 0.5 mm (Toshiba, Tokyo, Japan) collimations. Section collimation was found to have a significant effect on the absolute percentage error of volumetric measurements across all scanners (P = .021). The extent of the effect varied for different types of nodule attachments and nodule sizes, although no analysis to determine statistical significance was reported.
Aside from different section collimations for a given number of detector rows as examined by Das et al (19), there may be several configurations that result in sections of the same thickness. The particular choice of a configuration must be made on the basis of desired volume coverage speed, which increases as the number of detector rows increases, and/or the ability to review thinner sections, which decreases as the number of detector rows increases (47). More studies are needed to examine the effect of such configurations on volumetric accuracy.
Another important acquisition parameter is section spacing (section overlap). In a study with single-detector CT, Brink et al (48) suggested that at least a 60% overlap relative to the effective section thickness is needed for maximal longitudinal resolution. However, this effect has not been examined relative to the volumetric assessment of lung nodules with multidetector CT.
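The overlap recommendation translates directly into a reconstruction interval. The sketch below assumes that overlap is defined as the fraction of the effective section thickness shared by adjacent reconstructed sections; the function name and the 1.25-mm example are hypothetical.

```python
def reconstruction_interval_mm(section_thickness_mm: float, overlap_fraction: float) -> float:
    """Spacing between reconstructed sections for a given fractional overlap."""
    if not 0.0 <= overlap_fraction < 1.0:
        raise ValueError("overlap_fraction must be in [0, 1)")
    return section_thickness_mm * (1.0 - overlap_fraction)

interval = reconstruction_interval_mm(1.25, 0.60)
print(f"reconstruct every {interval:.2f} mm or less")
```

Under that definition, a 60% overlap of 1.25-mm sections means reconstructing a section every 0.5 mm, which produces 2.5 times as many images as contiguous reconstruction.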
Radiation Exposure
Radiation exposure settings involve a trade-off between minimizing radiation dose and maintaining image quality. The effect of radiation exposure on volumetric measurement error was examined in the study by Ko et al (12). It was reported that bias error is significantly smaller for a 120-mAs scan than for a 20-mAs scan (P < .001). A different result was reported in the Das et al (19) phantom study, in which four commercial 16-section CT scanners were compared. Volumetric measurement error was measured for scans acquired with a low dose (20 mAs) and with a standard dose (100 mAs). Nodules varied in diameter from 3 to 10 mm and had a number of different vascular and pleural attachments. No statistically significant difference in absolute percentage error was reported for the two dose protocols. Similar results were reported by Way et al (21). The disparity of these results could be due to the use of newer-generation 16–detector row scanners in the Das et al and Way et al studies versus a four–detector row scanner in the Ko et al study.
Modern scanners use automatic radiation exposure control systems to optimize the tube current–time product during acquisition. However, the effect on lung nodule volumetry of variations in tube current along the x, y, and z axes has not been investigated.
Reconstruction Algorithm and Filter
Image reconstruction algorithms are interrelated with section thickness, resolution, and noise and can therefore affect volumetric measurement error. In the Ko et al study (12), the choice of reconstruction algorithm significantly affected measurement error. Specifically, the high-frequency reconstruction algorithm was more accurate (mean absolute error = 3.0 mm³) than the low-frequency algorithm (mean absolute error = 3.7 mm³, P = .002) for all 40 nodules. The effect could have occurred because only small nodules were studied (<5 mm), and the higher spatial resolution associated with a high-frequency reconstruction algorithm facilitates the sampling of small nodules (12). Kuhnigk et al (17) also studied the effects of reconstruction filter, comparing bone and soft-tissue reconstruction kernels. Synthetic nodule results showed that the median absolute volume deviation between the two kernels was 5.6%. As in the Ko et al (12) study, the nodules used by Kuhnigk et al were relatively small (<10 mm).
Type of CT System
Patients often undergo thoracic CT examinations at different sites or with different scanners during their follow-up. Scanner-specific parameters can potentially influence volumetric measurements. Das et al (19) evaluated volumetric measurement accuracy for four different 16-section scanners from different vendors. Their phantom contained spherical nodules with a diameter of 3–10 mm in five categories of nodule attachment. Data were analyzed by using a commercial semiautomated lung analysis software package (LungCARE; Siemens). A statistically significant effect for differences in the system was found across all protocols (P = .004). Overall mean absolute percentage error varied from 7.5% ± 7.2 (standard deviation) for one vendor to 14.3% ± 11.1 for another. Significant effects due to system type were reported for section collimation and nodule size but not for low- and standard-dose protocols.
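Absolute percentage error, summarized as mean ± standard deviation, is the metric quoted above. The sketch below shows one straightforward way to compute it from phantom data. The volumes are hypothetical (the true values correspond to ideal spheres of 3, 5, 7, and 10 mm) and are not data from Das et al.

```python
from statistics import mean, stdev

def absolute_percentage_errors(measured, truth):
    """Per-nodule |measured - true| / true, in percent."""
    return [100.0 * abs(m - t) / t for m, t in zip(measured, truth)]

# Hypothetical phantom data: true and measured volumes in mm^3.
truth = [14.1, 65.4, 179.6, 523.6]
measured = [16.5, 70.2, 171.0, 540.0]
ape = absolute_percentage_errors(measured, truth)
print(f"{mean(ape):.1f}% ± {stdev(ape):.1f}")
```

Because the absolute value discards the sign of each error, this metric mixes bias and variance: a large value does not indicate whether a system overestimates, underestimates, or simply scatters.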
Marten et al (14) compared volumetric measurements for a set of 70 synthetic lung nodules with estimated diameters of 1.4–7.8 mm, acquired from two CT systems: a clinical four-section CT system (Lightspeed Plus; GE Healthcare) and a prototype volumetric CT scanner (GE Healthcare). Results showed a significant decrease in bias error for the prototype system as compared with the four-section system, particularly for small (<4-mm) nodules. The volumetric CT system was less prone to measurement errors than was the four-section system because of its higher spatial resolution and ability to achieve near isotropic conditions. The results may be somewhat biased because of the authors' use of relatively small nodules (<7.8 mm), for which high spatial resolution would be expected to have a larger impact. The authors acknowledged that technologic advances are needed before this volumetric CT technology could be incorporated into clinical practice. However, the results of this study were encouraging for the development of imaging technology that would enable the accurate measurement of small nodules.
Pitch
A parameter that has not been systematically evaluated is helical pitch, which is a function of section width and table speed (47,49). As discussed in an article by Napel (47), for single-detector CT systems the section width is controlled entirely by the collimator, and the relationships among pitch, image quality, and dose are well understood. However, pitch for multidetector helical CT systems and the resulting trade-offs between dose and image quality are complicated because section width is controlled by the detector configuration. In the study by Way et al (21), the variation in pitch settings did not significantly affect the volume measurement error. However, only spherical nodules with no attachments were used in the study. The effect of pitch on volumetric measurement accuracy is open for additional investigation.
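For reference, pitch can be computed as follows. This sketch assumes the widely used definition in which pitch is the table feed per gantry rotation divided by the total collimated beam width (number of detector rows times row width); some older multidetector literature instead divides by the width of a single row, which gives larger numbers. The function name and the example settings are hypothetical.

```python
def helical_pitch(table_feed_mm_per_rotation: float, n_rows: int, row_width_mm: float) -> float:
    """Pitch as table feed per rotation divided by total collimated beam width."""
    beam_width_mm = n_rows * row_width_mm
    return table_feed_mm_per_rotation / beam_width_mm

single = helical_pitch(7.5, 1, 5.0)     # single-detector CT, 5-mm collimation
multi = helical_pitch(15.0, 16, 0.75)   # 16 rows of 0.75 mm, 12-mm beam
print(f"single-detector pitch {single:.2f}, 16-row pitch {multi:.2f}")
```

With a single detector row, the beam width and the section width coincide; with multiple rows they do not, which is one reason the two cases behave differently.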
The above studies demonstrate that the choice of CT system and protocol used for scan acquisition can have a substantial effect on volumetric measurements. With regard to acquisition protocols, section thickness was shown to be the most important factor. More studies are needed to address the effect of section overlap and pitch.
EFFECT OF LUNG NODULE CHARACTERISTICS
Nodule characteristics, such as size, shape, margination, and radiologic solidity, vary widely in clinical cases and can influence the precision and accuracy of volumetric measurements. Research findings on the effect of nodule characteristics will be discussed in this section.
Nodule Size
Nodule size is an important factor in volumetric analysis of lung nodules. Clinically, it has been shown that size is linked to nodule malignancy, with noncalcified nodules larger than 2 cm in diameter having a higher rate of malignancy than smaller nodules (50). Nodule size also has an effect on volumetric measurements because CT reconstructions of smaller nodules tend to have a larger proportion of voxels that have contributions from more than one tissue type (ie, partial volume voxels) (18).
A number of studies have shown an increase in lung nodule volume estimation error with decreasing nodule size. In a phantom study, Goo et al (16) used a set of synthetic nodules ranging from 3.2 to 12.7 mm and reported a significant increase in absolute error with decreasing nodule size. The effect was more pronounced for thicker sections. As for clinical nodules, Reeves et al (18) examined stable nodules and showed a decrease in percent volume variation from 12% to 1.8% as nodule size increased from 2 to 8 mm. A similar effect of increased variability in volumetric measurement for decreasing nodule size was reported by Petrou et al (31) and Winer-Muram et al (13).
Nodule Shape
Since nodules often grow irregularly (5), nodule shape and margination (one aspect of which is the presence of spiculation) have also been examined as variables in volume measurement. Yankelevitz et al (5) studied nodule shape, where volume was measured first by manually selecting the nodule region of interest and then by performing isotropic resampling of the extracted volume. Volume measurements showed larger measurement error for elongated shapes (0.9%–2.8%) than for spherical shapes (0.7%–1.43%). The small magnitude of error (<3%) can be attributed to the high mean attenuation of the nodules and the spherical shape of nodule boundaries. Similar findings of increased error for nonspherical shapes were reported by Marten et al (14).
Petrou et al (31) also studied the effect of shape and found no statistically significant differences in volume measurement variability for round (n = 55) versus elongated (n = 20) nodules. However, a significant effect was found for nodule margination, where nodules were classified as smooth or spiculated. For the nodules segmented by using a particular software package (Volume Analysis; GE Healthcare), the spiculated versus smooth comparison in volume measurement variability yielded statistically significant differences. The effect of nodule margin on volumetric measurement accuracy is clearly an underexamined area that needs further inquiry.
Nodule Attenuation
Lung nodules are typically categorized as solid, nonsolid (commonly known as ground-glass opacities), and part-solid (also known as ground-glass opacities with a solid center) (1,51). Studies have shown (52) that there is a higher prevalence of malignancy among nonsolid and part-solid nodules than among solid nodules. Despite the clinical importance of radiologic attenuation, only a small number of authors have reported volumetric measurement results on nonsolid nodules, possibly due to the lack of dedicated software that can accurately segment low-contrast and uneven boundaries. Ko et al (12) showed that absolute error values were higher for ground-glass opacities, or nonsolid nodules, than for solid nodules.
In summary, the size, shape, and attenuation of lung nodules have been shown to significantly affect volumetric measurement error. These characteristics need to be considered when planning CT scans (eg, by selecting the smallest section thickness when monitoring small nodules). Moreover, there is a clear need for software dedicated to evaluation of nonsolid lung nodules.
Researchers in a few recent studies have also attempted to measure how lung nodule volume estimates vary with changes in inspiratory level. Petkovska et al (53) and Weiss et al (54) reported significant differences in measured nodule size (volume and diameter) between different inspiration levels, although Gietema et al (32) reported that inspiration level had only a weak effect.
APPLICATIONS OF VOLUMETRIC ANALYSIS
The volumetric assessment of lung nodules is primarily applied to estimation of nodule growth rate or volume doubling time (VDT) and to evaluation of nodule response to treatment. Volumetric accuracy is critical for these applications because measurements with low uncertainty allow change to be measured in a shorter period of time, facilitating quicker and more accurate diagnosis and the selection of appropriate treatment. Revel et al (55) used volumetric analysis to extract VDT and consequently distinguish malignant from benign nodules with a sensitivity and specificity of 91% and 90%, respectively. Other studies have also used volumetry for calculating VDT (5,23). A concern with the use of VDT is that it assumes a constant exponential growth rate, which is not true for all tumors.
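As an illustration of the exponential growth assumption behind VDT, the following minimal Python sketch computes a doubling time from two volume measurements with the standard two-point formula VDT = t · ln 2 / ln(V2/V1). The function names, the 90-day interval, and the 10% overestimate are hypothetical choices made for this example; they are not taken from any of the cited studies.

```python
import math


def sphere_volume(diameter_mm):
    """Volume (mm^3) of an ideal sphere with the given diameter."""
    return math.pi * diameter_mm ** 3 / 6.0


def volume_doubling_time(v1_mm3, v2_mm3, interval_days):
    """Two-point VDT in days, assuming constant exponential growth.

    A negative result indicates shrinkage (a volume halving time).
    """
    if v1_mm3 <= 0 or v2_mm3 <= 0 or interval_days <= 0:
        raise ValueError("volumes and interval must be positive")
    if v1_mm3 == v2_mm3:
        return math.inf  # no measurable change between the two scans
    return interval_days * math.log(2.0) / math.log(v2_mm3 / v1_mm3)


# A 5-mm nodule (about 65 mm^3) that measures 6.3 mm at follow-up
# has roughly doubled in volume.
v_baseline = sphere_volume(5.0)
v_followup = sphere_volume(6.3)
vdt = volume_doubling_time(v_baseline, v_followup, 90)  # hypothetical 90-day interval
print(f"baseline {v_baseline:.1f} mm^3, follow-up {v_followup:.1f} mm^3, VDT {vdt:.1f} days")

# Effect of a hypothetical 10% overestimate of the follow-up volume on the VDT.
vdt_biased = volume_doubling_time(v_baseline, 1.10 * v_followup, 90)
print(f"VDT with 10% overestimate at follow-up: {vdt_biased:.1f} days")
```

Because the volume ratio enters through a logarithm, a fixed percentage error in either measurement shifts the estimated VDT most when the true change between scans is small, which is the situation in short-interval follow-up of small nodules.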
Nodule volumetry with temporal scans has also been used to categorize treatment response (6,24,28). Marten et al (6) compared interobserver variability in the evaluation of treatment response by using specific categories (ie, complete remission, partial remission, stable disease, and progressive disease) between different measurement methods. One-dimensional measurements were extracted by two observers using electronic calipers, followed by automated volumetric measurements with a commercial software package (LungCare; Siemens). While there were no discrepancies in patient response assessment between observers using volume measurements, there was discordance in 24% of patients when one-dimensional analysis with RECIST was used. Poor agreement between volumetric and single-section measurements is commonly seen when the nodule does not conform to the approximately spherical or ellipsoidal assumptions that underlie the one- and two-dimensional measurements, respectively (33). Related work by other investigators (24,28,33) has examined disagreement between volumetric and single-section (one-, two-dimensional) assessment of nodule size change.
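The discordance between one-dimensional and volumetric response assessment can be sketched under the spherical assumption. The minimal Python example below assumes the RECIST diameter cutoffs of a 30% decrease for partial remission and a 20% increase for progressive disease; these values are not stated in this section and are applied here to a single lesion for simplicity, whereas RECIST uses sums of diameters over target lesions. The volume cutoffs are derived by cubing the diameter ratio, and all names and the example lesion are hypothetical.

```python
# Assumed RECIST cutoffs on relative change in diameter (single-lesion simplification).
DIAM_PARTIAL = -0.30
DIAM_PROGRESSIVE = 0.20


def volume_cut(diameter_cut):
    """Equivalent relative volume change for a sphere: (1 + d)^3 - 1."""
    return (1.0 + diameter_cut) ** 3 - 1.0


VOL_PARTIAL = volume_cut(DIAM_PARTIAL)          # about -0.657
VOL_PROGRESSIVE = volume_cut(DIAM_PROGRESSIVE)  # about +0.728


def classify_change(relative_change, partial_cut, progressive_cut):
    """Map a relative size change, (follow-up - baseline) / baseline, to a category."""
    if relative_change <= -1.0:
        return "complete remission"
    if relative_change <= partial_cut:
        return "partial remission"
    if relative_change >= progressive_cut:
        return "progressive disease"
    return "stable disease"


# Hypothetical lesion that grows only along the craniocaudal axis:
# in-plane axes stay at 10 mm while the third axis grows from 10 to 18 mm.
baseline_axes = (10.0, 10.0, 10.0)
followup_axes = (10.0, 10.0, 18.0)

diameter_change = followup_axes[0] / baseline_axes[0] - 1.0
volume_change = (
    (followup_axes[0] * followup_axes[1] * followup_axes[2])
    / (baseline_axes[0] * baseline_axes[1] * baseline_axes[2])
    - 1.0
)

print("one-dimensional:", classify_change(diameter_change, DIAM_PARTIAL, DIAM_PROGRESSIVE))
print("volumetric:     ", classify_change(volume_change, VOL_PARTIAL, VOL_PROGRESSIVE))
```

In this constructed case the in-plane diameter is unchanged, so the one-dimensional rule reports stable disease, while the 80% volume increase exceeds the sphere-equivalent cutoff and is classified as progressive disease.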
DISCUSSION
Drawing conclusions from the studies covered in this review is difficult owing to the interdependency among different factors, the uncertainty in measurements, and the challenge of comparing results from studies that have been conducted with different protocols, patient populations, and synthetic phantoms. For instance, the significant improvement in measurement error reported with a high-frequency reconstruction kernel (12) was based on analysis of small (<5-mm) nodules. It is not clear if the improved volume estimates for the high-frequency kernel would generalize to larger nodules or to different acquisition protocols. The relatively small bias (<3%) reported by Yankelevitz et al (5) was based on solid, homogeneous, synthetic nodules without vascular attachments and with a mean attenuation of 175 HU, imaged with a high-resolution protocol (140 kVp, 200 mA). One would not expect this small magnitude of error to be seen in an actual drug trial or patient care setting.
Even with these limitations, we can generally conclude that section thickness (section width) is one of the most important CT acquisition parameters to control. Findings from several studies demonstrated differences in volumetric measurement error ranging from 10% to 40% between scans acquired with thin and thick section widths. These large estimation errors must be taken into consideration when temporal scans with different section thicknesses are compared, particularly for small nodules. With regard to exposure, the recent study by Das et al (19) showed no significant effect on volumetric measurement error between low-dose (20-mAs) and standard-dose (100-mAs) protocols for four major scanner vendors. These exposure results could support a trend toward low-dose protocols that minimize patient exposure without degrading tumor tracking over time. Ko et al (12), however, have reported exposure results that somewhat contradict the conclusions of Das et al. Finally, the current literature is not sufficient for developing conclusions on the effects of collimation or pitch on volumetric measurements with the new generation of multisection CT systems. This review also revealed a paucity of studies focusing on part-solid and/or nonsolid nodules.
Phantom studies are valuable for quantifying the sources and extent of measurement error and for establishing lower bounds on the estimation error of volumetric measurements. However, they lack the complexity and variability of clinical studies in terms of the characteristics of nodules and their surrounding anatomy. Realistic thoracic phantoms could serve to bridge the gap between findings from experimental studies and those from clinical studies. Phantom technology has advanced to the point where thoracic phantoms incorporating lung vasculature are now available, as shown in Figure 4. Such phantoms allow the evaluation of segmentation algorithms within a somewhat more realistic and variable lung field. More realistic synthetic nodules including those with irregular shapes and margins, as well as those with inhomogeneous attenuations, to mimic nonsolid nodules, are also needed to depict variables present in clinical practice.
In addition to phantom studies, Monte Carlo simulation studies could also be helpful. Recent advances in simulation tools for x-ray imaging systems (56) allow the generation of images with realistic properties by tracking the transport of particles from the x-ray source through the object of interest to the detector plane. In recent studies (57,58), Monte Carlo simulation was used to generate thoracic CT images of realistic anthropomorphic phantoms described by triangle meshes and to model realistic coronary angiograms. A similar approach could be used to simulate a variety of lung nodules while controlling for variables such as image-acquisition parameters, nodule characteristics, and the complexity of surrounding structures. The relevance of simulation clearly depends on the accuracy of the simulation tools.
Nearly all current volume estimation methods in the literature use voxel counting, which relies on the assumption that voxels accurately represent the underlying object. While this may be adequate for large volume differences, it can be problematic for smaller changes, owing to the inherent error in representing small nodules and edge features with CT voxels, even in the absence of noise. Alternative approaches may include the estimation of tumor shape or volume in the continuous space of the object, taking into account the continuous-to-discrete image-formation process and noise in the data. A comprehensive treatise on these issues can be found in the text by Barrett and Myers (59). Another approach to improve volumetry may involve the inclusion of image registration into the volume estimation process. Meyer et al (60) used low-degree-of-freedom registration to grossly align liver lesions, form a subtraction image, and then compute the volume change from the difference image. Related work was performed by Thirion and Calmon (61).
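The discretization error inherent in voxel counting can be illustrated with a deliberately idealized Python sketch: a noise-free sphere is sampled on a grid, and a voxel is counted if its center lies inside the sphere. This is a simplification assumed for the example; it ignores partial volume averaging, the scanner point spread function, noise, and the behavior of any particular segmentation algorithm. The function name, pixel size, and section thicknesses are hypothetical.

```python
import math


def voxel_count_volume(diameter_mm, pixel_mm, section_mm, offset=(0.0, 0.0, 0.0)):
    """Volume (mm^3) of an ideal sphere estimated by voxel counting.

    A voxel is counted when its center lies inside the sphere. `offset`
    shifts the sampling grid relative to the sphere center (mm).
    """
    r = diameter_mm / 2.0
    n_xy = int(math.ceil(r / pixel_mm)) + 2    # grid half-width in plane
    n_z = int(math.ceil(r / section_mm)) + 2   # grid half-width across sections
    ox, oy, oz = offset
    count = 0
    for k in range(-n_z, n_z + 1):
        z = k * section_mm + oz
        for j in range(-n_xy, n_xy + 1):
            y = j * pixel_mm + oy
            for i in range(-n_xy, n_xy + 1):
                x = i * pixel_mm + ox
                if x * x + y * y + z * z <= r * r:
                    count += 1
    return count * pixel_mm * pixel_mm * section_mm


true_volume = math.pi * 5.0 ** 3 / 6.0  # ideal 5-mm sphere, about 65 mm^3

# Same in-plane pixel size, increasing section thickness.
for section_mm in (0.625, 1.25, 2.5, 5.0):
    estimate = voxel_count_volume(5.0, 0.5, section_mm)
    error_pct = 100.0 * (estimate - true_volume) / true_volume
    print(f"section {section_mm:5.3f} mm: {estimate:7.2f} mm^3 ({error_pct:+.1f}%)")

# Same acquisition, but with the sphere centered between two 5-mm sections.
shifted = voxel_count_volume(5.0, 0.5, 5.0, offset=(0.0, 0.0, 2.5))
print(f"5-mm sections, shifted grid: {shifted:.2f} mm^3")
```

Even in this noise-free setting the estimate depends on both the section thickness and the position of the nodule relative to the sampling grid, which is the kind of error that estimation in the continuous object space is intended to address.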
Thoracic scans of patients and of phantoms containing synthetic nodules could be made publicly available to enable developers to perform comparisons regarding measurement error between different methods. Similarly, public databases of clinical data sets acquired with a set of standard protocols could be useful in testing the relative performance of measurement tools. The Reference Image Database for Evaluation of Response, or RIDER, consortium has been created by the National Cancer Institute and the National Institute of Biomedical Imaging and Bioengineering with the purpose of establishing such databases, and currently a database of serial CT scans is available for download from the National Cancer Imaging Archive. A second source of thoracic CT data is the Lung Image Database Consortium database, which contains lung nodules that have been annotated (62). Nodule boundary truth data, particularly that which has been evaluated by multiple observers as has been done by the Lung Image Database Consortium, is valuable for development and evaluation of volumetric segmentation algorithms. In a related manner, the Radiological Society of North America has recently launched the Quantitative Imaging Biomarkers Alliance (QIBA) as “an initiative by researchers, healthcare professionals and industry to advance quantitative imaging and the use of imaging biomarkers in clinical trials and clinical practice” (http://qibawiki.rsna.org/index.php?title=Main_Page). Along with fluorine 18 fluorodeoxyglucose positron emission tomography/CT and dynamic contrast material–enhanced MR imaging, volumetric CT was chosen by QIBA as a biomarker to quantify the effects of novel therapeutic candidates for cancer.
CONCLUSION
We have reviewed issues related to the volumetric assessment of lung nodules with thoracic CT. Our review points to the need for continued research to examine volumetric accuracy as a function of the multitude of interrelated variables that are involved in the assessment of lung nodules with CT. An understanding of the sources and extent of error would allow software developers and users of quantitative imaging to control for these effects through system improvements (hardware, software, and operator contributions), while physicians could incorporate this knowledge into their assessment of lung nodule change and patient care.
ESSENTIALS
The review examines the effect of different factors on the volumetric assessment of lung nodules with thoracic CT, including image-acquisition and reconstruction parameters and nodule characteristics.
The performance of algorithms for nodule segmentation and volume estimation is reviewed, and a number of underexamined areas of research regarding volumetric assessment of lung nodules are identified.
This review promotes understanding and tries to quantify the sources of volumetric measurement error in the assessment of lung nodules with CT as a first step toward the development of methods to minimize that error.
Improving the precision and accuracy of volumetric analysis could increase its use and lead to improved quantitative imaging for analysis of nodule size changes.
Acknowledgments
We especially thank Samuel G. Armato III, PhD, Ella Kazerooni, MD, and Charles R. Meyer, PhD, for their contributions to the preparation of this review, as well as the other members of the Reference Image Database for Evaluation of Response consortium.
Abbreviations
RECIST = Response Evaluation Criteria in Solid Tumors
Authors stated no financial relationship to disclose.
Funding: This research was supported by the Cancer Imaging Program of the National Cancer Institute and the Intramural Program of the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health.
References
- 1. Hasegawa M, Sone S, Takashima S, et al. Growth rate of small lung cancers detected on mass CT screening. Br J Radiol 2000;73:1252–1259.
- 2. Jaffe CC. Measures of response: RECIST, WHO, and new alternatives. J Clin Oncol 2006;24:3245–3251.
- 3. Therasse P, Arbuck SG, Eisenhauer EA, et al. New guidelines to evaluate the response to treatment in solid tumors. J Natl Cancer Inst 2000;92:205–216.
- 4. Miller AB, Hoogstraten B, Staquet M, Winkler A. Reporting results of cancer treatment. Cancer 1981;47:207–214.
- 5. Yankelevitz DF, Reeves AP, Kostis WJ, Zhao B, Henschke CI. Small pulmonary nodules: volumetrically determined growth rates based on CT evaluation. Radiology 2000;217:251–256.
- 6. Marten K, Auer F, Schmidt S, Kohl G, Rummeny EJ, Engelke C. Inadequacy of manual measurements compared to automated CT volumetry in assessment of treatment response of pulmonary metastases using RECIST criteria. Eur Radiol 2006;16:781–790.
- 7. Bogot NR, Kazerooni EA, Kelly AM, Quint LE, Desjardins B, Nan B. Interobserver and intraobserver variability in the assessment of pulmonary nodule size on CT using film and computer display methods. Acad Radiol 2005;12:948–956.
- 8. Erasmus JJ, Gladish GW, Broemeling L, et al. Interobserver and intraobserver variability in measurement of non-small-cell carcinoma lung lesions: implications for assessment of tumor response. J Clin Oncol 2003;21:2574–2582.
- 9. Kalender WA. Technical foundations of spiral CT. Semin Ultrasound CT MR 1994;15:81–89.
- 10. Dodd LE, Wagner RF, Armato SG, et al. Assessment methodologies and statistical issues for computer-aided diagnosis of lung nodules in CT: contemporary research topics relevant to the Lung Image Database Consortium. Acad Radiol 2004;11:462–475.
- 11. Zerhouni EA, Boukadoum M, Siddiky MA, et al. A standard phantom for quantitative CT analysis of pulmonary nodules. Radiology 1983;149:767–773.
- 12. Ko JP, Rusinek H, Jacobs EL, et al. Small pulmonary nodules: volume measurement at chest CT—phantom study. Radiology 2003;228:864–870.
- 13. Winer-Muram HT, Jennings SG, Meyer CA, et al. Effect of varying CT section width on volumetric measurement of lung tumors and application of compensatory equations. Radiology 2003;229:184–194.
- 14. Marten K, Funke M, Engelke C. Flat panel detector-based volumetric CT: prototype evaluation with volumetry of small artificial nodules in a pulmonary phantom. J Thorac Imaging 2004;19:156–163.
- 15. Tao P, Griess F, Lvov Y, et al. Characterization of small nodules by automatic segmentation of x-ray computed tomography images. J Comput Assist Tomogr 2004;28:372–377.
- 16. Goo JM, Tongdee T, Tongdee R, Yeo K, Hildebolt CF, Bae KT. Volumetric measurement of synthetic lung nodules with multi–detector row CT: effect of various image reconstruction parameters and segmentation thresholds on measurement accuracy. Radiology 2005;235:850–856.
- 17. Kuhnigk JM, Dicken V, Bornemann L, et al. Morphological segmentation and partial volume analysis for volumetry of solid pulmonary lesions in thoracic CT scans. IEEE Trans Med Imaging 2006;25:417–434.
- 18. Reeves AP, Chan AB, Yankelevitz DF, Henschke CI, Kressler B, Kostis WJ. On measuring the change in size of pulmonary nodules. IEEE Trans Med Imaging 2006;25:435–450.
- 19. Das M, Ley-Zaporozhan J, Gietema HA, et al. Accuracy of automated volumetry of pulmonary nodules across different multislice CT scanners. Eur Radiol 2007;17:1979–1984.
- 20. Kinnard LM, Gavrielides MA, Myers KJ, et al. Volume error analysis for lung nodules attached to bronchial vessels in an anthropomorphic thoracic phantom. In: Giger ML, Karssemeijer N, eds. Proceedings of SPIE: medical imaging 2008—computer-aided diagnosis. Vol 6915. Bellingham, Wash: International Society for Optical Engineering, 2008; 69152Q1–69152Q9.
- 21. Way TW, Chan HP, Goodsitt MM, et al. Effect of CT scanning parameters on volumetric measurements of pulmonary nodules by 3D active contour segmentation: a phantom study. Phys Med Biol 2008;53:1295–1312.
- 22. Ko JP, Betke M. Chest CT: automated nodule detection and assessment of change over time—preliminary experience. Radiology 2001;218:267–273.
- 23. Kostis WJ, Reeves AP, Yankelevitz DF, Henschke CI. Three-dimensional segmentation and growth-rate estimation of small pulmonary nodules in helical CT images. IEEE Trans Med Imaging 2003;22:1259–1274.
- 24. Tran LN, Brown MS, Goldin JG, et al. Comparison of treatment response classifications between unidimensional, bidimensional, and volumetric measurements of metastatic lung lesions on chest computed tomography. Acad Radiol 2004;11:1355–1360.
- 25. Revel MP, Lefort C, Bissery A, et al. Pulmonary nodules: preliminary experience with three-dimensional evaluation. Radiology 2004;231:459–466.
- 26. Okada K, Comaniciu D, Krishnan A. Robust anisotropic Gaussian fitting for volumetric characterization of pulmonary nodules in multislice CT. IEEE Trans Med Imaging 2005;24:409–423.
- 27. Goodman LR, Gulsun M, Washington L, Nagy PG, Piacsek KL. Inherent variability of CT lung nodule measurements in vivo using semiautomated volumetric measurements. AJR Am J Roentgenol 2006;186:989–994.
- 28. Zhao B, Schwartz LH, Moskowitz CS, Ginsberg MS, Rizvi NA, Kris MG. Lung cancer: computerized quantification of tumor response—initial results. Radiology 2006;241:892–898.
- 29. Gietema HA, Wang Y, Xu D, et al. Pulmonary nodules detected at lung cancer screening: interobserver variability of semiautomated volume measurements. Radiology 2006;241:251–257.
- 30. Meyer CR, Johnson TD, McLennan G, et al. Evaluation of lung MDCT nodule annotation across radiologists and methods. Acad Radiol 2006;13:1254–1265.
- 31. Petrou M, Quint LE, Nan B, Baker LH. Pulmonary nodule volumetric measurement variability as a function of CT slice thickness and nodule morphology. AJR Am J Roentgenol 2007;188:306–312.
- 32. Gietema HA, Schaefer-Prokop CM, Mali WPTM, Groenewegen G, Prokop M. Pulmonary nodules: interscan variability of semiautomated volume measurements with multisection CT—influence of inspiration level, nodule size, and segmentation performance. Radiology 2007;245:888–894.
- 33. Reeves AP, Biancardi AM, Apanasovich TV, et al. The Lung Image Database Consortium (LIDC): a comparison of different size metrics for pulmonary nodule measurements. Acad Radiol 2007;14:1475–1485.
- 34. Curry TS III, Dowdey JE, Murry RC Jr. Christensen's physics of diagnostic radiology. Philadelphia, Pa: Lea & Febiger, 1990.
- 35. Ko JM, Nicholas MJ, Mendel JB, Slanetz PJ. Prospective assessment of computer-aided detection in interpretation of screening mammography. AJR Am J Roentgenol 2006;187:1483–1491.
- 36. Kuhnigk JM, Dicken V, Zidowitz S, et al. Informatics in radiology (infoRAD): new tools for computer assistance in thoracic CT. I. Functional analysis of lungs, lung lobes, and bronchopulmonary segments. RadioGraphics 2005;25:525–536.
- 37. Zhao B, Reeves AP, Yankelevitz DF, Henschke CI. Three-dimensional multicriterion automatic segmentation of pulmonary nodules of helical computed tomography images. Opt Eng 1999;38:1340–1347.
- 38. Zhao B, Kostis W, Reeves AP, Yankelevitz D, Henschke CI. Consistent segmentation of repeat CT scans for growth assessment in pulmonary nodules. In: Hanson KM, ed. Proceedings of SPIE: medical imaging 1999—image processing. Vol 3661. Bellingham, Wash: International Society for Optical Engineering, 1999; 1012–1018.
- 39. Wiemker R, Rogala P, Hein E, Blaffert T, Rosch P. Computer aided segmentation of pulmonary nodules: automated vasculature cutoff in thick- and thin-slice CT. In: Proceedings of Computer Assisted Radiology and Surgery, CARS 2003. Amsterdam, the Netherlands: Elsevier, 2003; 965–970.
- 40. Wiemker R, Rogalla P, Blaffert T, et al. Aspects of computer-aided detection (CAD) and volumetry of pulmonary nodules using multislice CT. Br J Radiol 2005;78(spec no. 1):S46–S56.
- 41. Rosset A, Spadola L, Ratib O. OsiriX: an open-source software for navigating in multidimensional DICOM images. J Digit Imaging 2004;17:205–216.
- 42. Goodsitt MM, Chan HP, Way TW, Larson SC, Christodoulou EG, Kim J. Accuracy of the CT numbers of simulated lung nodules imaged with multi-detector CT scanners. Med Phys 2006;33:3006–3017.
- 43. Birnbaum BA, Hindman N, Lee J, Babb JS. Multi–detector row CT attenuation measurements: assessment of intra- and interscanner variability with an anthropomorphic body CT phantom. Radiology 2007;242:109–119.
- 44. Kostis WJ, Yankelevitz DF, Reeves AP, Fluture SC, Henschke CI. Small pulmonary nodules: reproducibility of three-dimensional volumetric measurement and estimation of time to follow-up CT. Radiology 2004;231:446–452.
- 45. Juluru K, Kim W, Boonn W, King T, Siddiqui K, Siegel E. Volumetric measurements of pulmonary nodules: variability in automated analysis tools. In: Horii SC, Andriole KP, eds. Proceedings of SPIE: medical imaging 2007—PACS and imaging informatics. Vol 6516. Bellingham, Wash: International Society for Optical Engineering, 2007; 6516131–6516137.
- 46. Zhao B, Schwartz LH, Moskowitz CS, et al. Pulmonary metastases: effect of CT section thickness on measurement—initial experience. Radiology 2005;234:934–939.
- 47. Napel S. Basic principles of MDCT. In: Fishman EK Jr, Jeffrey RB, eds. Multidetector CT: principles, techniques, and clinical applications. Philadelphia, Pa: Lippincott Williams & Wilkins, 2004.
- 48. Brink JA, Wang G, McFarland EG. Optimal section spacing in single-detector helical CT. Radiology 2000;214:575–578.
- 49. Mahesh M, Scatarige J, Cooper J, Fishman EK. Dose and pitch relationship for a particular multislice CT scanner. AJR Am J Roentgenol 2001;177:1273–1275.
- 50. MacMahon H, Austin JH, Gamsu G, et al. Guidelines for management of small pulmonary nodules detected on CT scans: a statement from the Fleischner Society. Radiology 2005;237:395–400.
- 51. Henschke CI, Yankelevitz DF. Screening for lung cancer. J Thorac Imaging 2000;15:21–27.
- 52. Henschke CI, Yankelevitz DF, Mirtcheva R, McGuinness G, McCauley D, Miettinen OS. CT screening for lung cancer: frequency and significance of part-solid and nonsolid nodules. AJR Am J Roentgenol 2002;178:1053–1057.
- 53. Petkovska I, Brown MS, Goldin JG, et al. The effect of lung volume on nodule size on CT. Acad Radiol 2007;14:476–485.
- 54. Weiss E, Wijesooriya K, Dill SV, Keall PJ. Tumor and normal tissue motion in the thorax during respiration: analysis of volumetric and positional variations using 4D CT. Int J Radiat Oncol Biol Phys 2007;67:296–307.
- 55. Revel MP, Merlin A, Peyrard S, et al. Software volumetric evaluation of doubling times for differentiating benign versus malignant pulmonary nodules. AJR Am J Roentgenol 2006;187:135–142.
- 56. Badano A, Sempau J. MANTIS: combined x-ray, electron and optical Monte Carlo simulations of indirect radiation imaging systems. Phys Med Biol 2006;51:1545–1561.
- 57. Badal A, Kyprianou I, Badano A, Sempau J, Myers KJ. Monte Carlo package for simulating radiographic images of realistic anthropomorphic phantoms described by triangle meshes. In: Hsieh J, Flynn MJ, eds. Proceedings of SPIE: medical imaging 2007—physics of medical imaging. Vol 6510. Bellingham, Wash: International Society for Optical Engineering, 2007; 65100Z1–65100Z10.
- 58. Kyprianou IS, Badal A, Badano A, et al. Monte Carlo simulated coronary angiograms of realistic anatomy and pathology models. In: Cleary KR, Miga MI, eds. Proceedings of SPIE: medical imaging 2007—visualization and image-guided procedures. Vol 6509. Bellingham, Wash: International Society for Optical Engineering, 2007; 65090O1–65090O12.
- 59. Barrett HH, Myers KJ. Foundations of image science. New York, NY: Wiley, 2004.
- 60. Meyer C, Park H, Balter JM, Bland PH. Method for quantifying volumetric lesion change in interval liver CT examinations. IEEE Trans Med Imaging 2003;22:776–781.
- 61. Thirion JP, Calmon G. Deformation analysis to detect and quantify active lesions in three-dimensional medical image sequences. IEEE Trans Med Imaging 1999;18:429–441.
- 62. McNitt-Gray MF, Armato SG 3rd, Meyer CR, et al. The Lung Image Database Consortium (LIDC) data collection process for nodule detection and annotation. Acad Radiol 2007;14:1464–1474.