Radiology. 2017 Sep 5;286(1):286–295. doi: 10.1148/radiol.2017162725

Added Value of Computer-aided CT Image Features for Early Lung Cancer Diagnosis with Small Pulmonary Nodules: A Matched Case-Control Study

Peng Huang 1, Seyoun Park 1, Rongkai Yan 1, Junghoon Lee 1, Linda C Chu 1, Cheng T Lin 1, Amira Hussien 1, Joshua Rathmell 1, Brett Thomas 1, Chen Chen 1, Russell Hales 1, David S Ettinger 1, Malcolm Brock 1, Ping Hu 1, Elliot K Fishman 1, Edward Gabrielson 1, Stephen Lam 1
PMCID: PMC5779085  PMID: 28872442

Computer-aided diagnosis significantly reduces the low-dose CT screening false-positive rate and increases the positive predictive value of lung nodule evaluation.

Abstract

Purpose

To test whether computer-aided diagnosis (CAD) approaches can increase the positive predictive value (PPV) and reduce the false-positive rate in lung cancer screening for small nodules compared with human reading by thoracic radiologists.

Materials and Methods

A matched case-control sample of low-dose computed tomography (CT) studies in 186 participants with 4–20-mm noncalcified lung nodules who underwent biopsy in the National Lung Screening Trial (NLST) was selected. Variables used for matching were age, sex, smoking status, chronic obstructive pulmonary disease status, body mass index, study year of the positive screening test, and screening results. Studies before lung biopsy were randomly split into a training set (70 cancers plus 70 benign controls) and a validation set (20 cancers plus 26 benign controls). Image features from within and outside dominant nodules were extracted. A CAD algorithm developed from the training set and a random forest classifier were applied to the validation set to predict biopsy outcomes. Receiver operating characteristic analysis was used to compare the prediction accuracy of CAD with the NLST investigator’s diagnosis and readings from three experienced and board-certified thoracic radiologists who used contemporary clinical practice guidelines.

Results

In the validation cohort, the area under the receiver operating characteristic curve for CAD was 0.9154. By default, the sensitivity, specificity, and PPV of the NLST investigators were 1.00, 0.00, and 0.43, respectively. For CAD, the sensitivity, specificity, PPV, and negative predictive value were 0.95, 0.88, 0.86, and 0.96, respectively; for the three radiologists’ combined reading, they were 0.70, 0.69, 0.64, and 0.75, respectively.

Conclusion

CAD could increase PPV and reduce the false-positive rate in the early diagnosis of lung cancer.

© RSNA, 2017

Online supplemental material is available for this article.


See also the editorial by MacMahon in this issue.

Introduction

The National Lung Screening Trial (NLST) demonstrated a 20% reduction in lung cancer mortality with low-dose computed tomography (CT) screening among heavy smokers (1), but the overall benefits of low-dose CT have been questioned because of concerns about its low positive predictive value (PPV) and high false-positive (FP) rates, which lead to unnecessary morbidity and mortality from additional radiographic or diagnostic procedures (2–4). Separating small malignant nodules from the majority of benign nodules in low-dose CT images is particularly challenging because their morphologic characteristics are difficult to discern with visual inspection. In the NLST study, the PPV for suspicious nodules was less than 10% (4). Other clinical trials and cohort studies of low-dose CT screening in lung cancer similarly found a PPV of less than 10% for screening-detected nodules, despite the use of different nodule size threshold values (5). Even when screening is restricted to high-risk populations, this consistently low PPV raises concerns about balancing the benefits of early lung cancer detection with the harms of low-dose CT screening, including FP findings, overdiagnosis, and radiation exposure (6,7).

The current clinical standard of reference in nodule diagnosis is the radiologist’s reading. However, many nodule heterogeneity patterns cannot be visualized. On the other hand, various computer-aided image features have been proposed to quantify such heterogeneity, in addition to those features in alignment with the radiologist’s description, such as size, shape, margin, attenuation, and growth rate (8–23). Although extensive investigations have revealed that different computer-aided image features are associated with lung cancer development, computer-aided diagnosis (CAD) approaches have not been translated into clinical practice because it is unclear whether such approaches can provide additive information beyond the radiographic characteristics used by radiologists in routine clinical practice. Furthermore, multiple optimal cutoff values are often used in CAD approaches without independent validation, which could lead to overestimated results and largely inflated type I errors (24,25); a small change in the data could result in different conclusions because of complex multicollinearity and interactions among features. A direct comparison in diagnostic performance between CAD approaches and radiologist reading is lacking.

We performed a matched case-control study using NLST data to evaluate the value of a novel CAD algorithm that analyzes texture features of nodules as well as surrounding lung tissues. Study samples were split into training and validation sets. We derived the CAD algorithm using machine learning from analysis of a training set and tested it in a validation set, using prespecified cutoff values derived from the training set. The purpose of this study was to test whether CAD approaches can increase the PPV and reduce the FP rate in lung cancer screening for small nodules, as compared with readings by thoracic radiologists.

Materials and Methods

Financial Support

This work was partially supported by the Johns Hopkins University Discovery Award, Johns Hopkins-Allegheny Health Network Cancer Research, and P30CA006973. The National Cancer Institute (NCI) provided access to the NLST database.

Technical Parameters

The technical parameters of the low-dose CT scanning protocol were 120 kVp; 40–80 mAs; detector collimation, 0.5–2.5 mm (for one data channel); image reconstruction section width, 2–3.2 mm; and interval, 1.8–2 mm. A soft-tissue/smoothing algorithm without high spatial frequency enhancement was used in image reconstruction.

NLST Low-Dose CT Image Selection

Low-dose CT studies were selected from among those performed in 26 722 NLST participants (age range, 55–74 years; 30 or more pack-years of cigarette smoking history; no more than 15 years of smoking cessation for former smokers) (1). A positive low-dose CT screening test was defined as the finding of one or more indeterminate noncalcified lung nodule(s) with a largest dimension of 4 mm or larger or mediastinal masses, pleural disease, or atelectasis of more than one segment. Figure 1 describes the sample selection procedure, which was performed by the Information Management Services team (J.R. and B.T.) who managed the NLST database and a statistician (P. Hu) from the NCI. A total of 8392 participants were identified, and 810 of them underwent lung biopsy linked to their positive low-dose CT screening results; 479 had positive biopsies (case patients) and 331 had negative biopsies (control subjects). One low-dose CT image per person acquired in the same study a year before the biopsy was selected to form matched case-control pairs. The matching variables were as follows: age ± 5 years, sex, smoking status at randomization (current/former), COPD status at randomization, BMI (<30 or ≥30 kg/m²), study year of the positive screening, number of screening examinations, and screening results. Screening results were defined as follows: A score of 1 indicated a negative screening, with no clinically significant abnormalities; a score of 2, a negative screening, with minor abnormalities not suspicious for lung cancer; a score of 3, a negative screening, with clinically significant abnormalities not suspicious for lung cancer; a score of 4, a positive screening, with change unspecified, nodule(s) 4 mm or larger or enlarging nodule(s), mass(es), or other nonspecific abnormalities suspicious for lung cancer; a score of 5, a positive screening, with no significant change, stable abnormalities potentially related to lung cancer, or no significant change since prior screening examination; and a score of 6, a positive screening, with “other.” A total of 201 matched case-control image pairs were identified. The Information Management Services team randomly sampled 134 image pairs and sent them to the Johns Hopkins University team for image processing and CAD algorithm development.
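The matching rule described above can be illustrated with a short Python sketch. This is not the procedure used by the Information Management Services team; the `Participant` fields, the `is_match` and `greedy_pairs` names, and the greedy pairing strategy are all assumptions made for illustration. Only the list of matching variables comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    # Hypothetical record layout; field names are illustrative only.
    age: int
    sex: str
    current_smoker: bool   # smoking status at randomization (current/former)
    copd: bool             # COPD status at randomization
    bmi: float             # kg/m^2
    study_year: int        # study year of the positive screening (0, 1, or 2)
    n_screens: int         # number of screening examinations
    screening_result: int  # screening result score (1 to 6)


def is_match(case: Participant, control: Participant, age_window: int = 5) -> bool:
    """Return True if the control agrees with the case on every matching variable."""
    return (
        abs(case.age - control.age) <= age_window
        and case.sex == control.sex
        and case.current_smoker == control.current_smoker
        and case.copd == control.copd
        and (case.bmi >= 30) == (control.bmi >= 30)  # BMI category: <30 or >=30
        and case.study_year == control.study_year
        and case.n_screens == control.n_screens
        and case.screening_result == control.screening_result
    )


def greedy_pairs(cases, controls):
    """Pair each case with the first unused matching control (greedy, not optimal)."""
    unused = list(controls)
    pairs = []
    for case in cases:
        for control in unused:
            if is_match(case, control):
                pairs.append((case, control))
                unused.remove(control)
                break
    return pairs
```

A greedy pass is the simplest choice and can leave some cases unmatched that an optimal matching would pair; the source does not state which matching algorithm was used.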

Figure 1:

The NLST sample selection process. The eight eligibility criteria were as follows: (a) had submitted the baseline questionnaire, (b) had adequate CT images, (c) had no lung cancer diagnosis prior to the first low-dose CT (LDCT) screening examination, (d) had at least one lung nodule diameter measured in the NLST database, (e) had at least one positive screening examination, (f) had at least one nodule larger than 4 mm in diameter, (g) had relevant images (ie, a patient with cancer had undergone low-dose CT screening within 1 year before the lung cancer diagnosis, and a subject without cancer had undergone all three screening examinations), and (h) did not have fluid or water attenuation in the largest nodule from the last low-dose CT screening examination. Variables used to match case patients and control subjects were study year of the positive screening, result of the low-dose CT screening, number of missed screening rounds, smoking status at randomization (current or former), sex, chronic obstructive pulmonary disease (COPD) status at randomization, body mass index (BMI) (<30 or ≥30 kg/m²), and age ± 5 years.

CT Examinations

Two radiologists (R.Y. and C.T.L., with 4 and 3 years of postfellowship experience, respectively) together identified one dominant nodule per image that was considered most likely to be biopsied or resected. Nodule locations were sent to a statistician (P. Huang, who was not blinded to biopsy outcomes), who checked whether tumor anatomic locations and diameters matched those from the NLST database reported by pathologists. Five dominant nodules were found to be inconsistent with records in the NLST database because of mismatched section numbers. In one case, a hilar lymph node rather than a parenchymal nodule was noted by the NLST. Meeting discussions were conducted to reach consensus regarding these dominant nodule locations from all radiologists (R.Y., L.C.C., C.T.L., A.H., and E.K.F., all of whom were blinded to biopsy outcomes). Among 134 pairs, 90 case patients and 96 control subjects had a dominant noncalcified nodule diameter of 20 mm or smaller. These images were randomly split into a training set (70 cancers plus 70 benign controls) for prediction algorithm development and a test set (20 cancers plus 26 benign controls) for algorithm validation (Fig 1). Tables E1 and E2 (online) summarize the nodule distribution on these images. The 46 images in the test set were sent to three experienced and board-certified thoracic radiologists (L.C.C., C.T.L., and A.H., with 4, 3, and 4 years of postfellowship experience, respectively) to read independently using contemporary clinical practice guidelines.

Computer-aided CT Image Analysis

Image processing was performed by S.P. and J.L. (with 5 and 11 years of experience, respectively, after PhD training). Both lungs were first automatically segmented by using a thresholding approach followed by morphologic image processing. For each image, the single identified dominant nodule and adjacent draining mediastinal lymph nodes were segmented by using a single-click ensemble semiautomatic segmentation approach (26). More specifically, the nodule region automatically grew from a user-given seed until it met the boundary, which is determined by intensity similarities and proximity to different intensity regions (eg, air). All segmented regions were reviewed on screen with manual correction when needed. To extract local image features from the lung, we detected lobar fissures and segmented the lung lobes with an adaptive fissure sweep to coarsely define fissure regions of the lobar fissures and followed this with a watershed transformation to refine the location and curvature of the fissures within the fissure regions (27–29). Subdivision of the lung allowed us to compute local lung image features (in addition to global features extracted from the whole lung), which, when combined with the target nodule image features, could improve prediction accuracy as compared with conventional methods that use image features computed only from the segmented nodules. Table 1 lists image feature categories extracted from intranodular areas (within the segmented nodules), surrounding lung parenchymal areas (perinodular), and extranodular areas (from the segmented lung). For each nodule, separate features from its core region and the margin region were extracted (Fig 2) to quantify how nodule central activities extended to the adjacent tissues. A total of 1342 features were extracted, including 1108 radiomics features (12), as well as others computed from our in-house software.
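The idea of growing a region from a user-given seed until it meets an intensity boundary can be sketched in a few lines of Python. This is a deliberately simplified two-dimensional illustration, not the single-click ensemble method of reference 26: the function name `grow_region`, the fixed `tolerance` parameter, and the 4-connected neighborhood are assumptions introduced here.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Grow a 4-connected region from `seed` on a 2D grid of intensities.

    A neighbor joins the region when its intensity is within `tolerance`
    of the seed intensity. This is a toy stand-in for seeded segmentation;
    the stopping rule is an assumption, not the published method.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Toy section: soft-tissue-like values surrounded by air-like values (HU).
toy = [
    [-800, -800, -800, -800],
    [-800,   20,   35, -800],
    [-800,   30,   25, -800],
    [-800, -800, -800, -800],
]
nodule = grow_region(toy, seed=(1, 1), tolerance=50)
```

In this toy section the region stops at the air-like pixels, which mirrors the description of growth halting near regions of different intensity.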

Table 1.

List of Image Features Extracted in the Study

Figure 2:

CT images show the segmentation of two three-dimensional regions for each identified dominant nodule. The first region consists of all voxels inside the core nodule, and the second region consists of all voxels in the nodule margin area. For the sections displayed, the first region is indicated by pixels inside the green curve and the second region is indicated by pixels between the green curve and yellow curve.

Statistical Analysis

Descriptive statistics were used to summarize the demographic data of the 186 individuals, the distribution of the dominant nodule size by margin, and attenuation categories. For highly correlated image features (R > 0.95), only a single representative feature was used, and others were removed. Features with variance equal to 0 were also removed. This process selected 458 variables, and, by using the training data, the dimension was further reduced by choosing the 38 most frequently top-ranked variables from a random forest algorithm with five-fold cross-validation (Table E3 [online]). A random forest with 5000 trees was also used to develop the CAD algorithm (from the training set) and was then applied to the validation set of 46 low-dose CT images to calculate the predicted probability of malignancy score (Pm) for each dominant nodule by using the majority vote of all 5000 trees. By using the prespecified cutoff value of 0.5, CAD classified an image as positive if its Pm was greater than 0.5, and negative otherwise. The biopsy outcomes were used as the ground truth because all biopsy outcomes in the NLST study have been confirmed by clinical outcomes at long-term follow-up (30). Two statisticians (including P. Huang, with 17 years of experience after PhD training) performed data analyses independently. The results of these two investigators were then compared for developing consensus results.
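The two preprocessing rules and the cutoff rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study code: the helper names (`pearson`, `filter_features`, `pm_from_votes`, `is_positive`) are hypothetical, the first feature encountered is kept as the representative of a correlated group, R is taken as the signed Pearson coefficient because the text does not say whether the absolute value was used, and Pm is assumed to be the fraction of trees voting for malignancy.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def filter_features(columns, r_max=0.95):
    """Drop zero-variance features and all but one of each correlated group.

    `columns` maps a feature name to its list of values. Keeping the first
    feature seen as the representative is an assumption of this sketch.
    """
    kept = []
    for name, values in columns.items():
        if max(values) == min(values):  # variance equal to 0
            continue
        if any(pearson(values, columns[k]) > r_max for k in kept):
            continue
        kept.append(name)
    return kept


def pm_from_votes(votes):
    """Assumed definition: Pm is the fraction of trees voting 'malignant' (1)."""
    return sum(votes) / len(votes)


def is_positive(pm, cutoff=0.5):
    """Prespecified rule from the text: positive if Pm is greater than 0.5."""
    return pm > cutoff
```

For example, `filter_features({"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [5, 5, 5, 5], "d": [4, 1, 3, 2]})` keeps `a` and `d`: `b` is almost perfectly correlated with `a`, and `c` is constant.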

CAD performance on the validation set was evaluated by using three methods: (a) the area under the receiver operating characteristic curve (AUC); (b) a comparison with readings of NLST trial investigators, who were likely to have compared these images with historical images and to have considered other available clinical information that was not included in the NLST database (and thus was not available to us); and (c) a comparison with the combined reading (majority voted) of three radiologists.
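Two of the quantities named above, the AUC and a majority-voted combined reading, are simple enough to sketch directly. The code below is a generic illustration, not the analysis software used in the study. It computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, counting ties as one half, which is the standard rank-based equivalent of the area under the empirical ROC curve. The function names and example scores are hypothetical.

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties contribute 0.5. This equals the area under the empirical ROC curve.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def majority_vote(reads):
    """Combine an odd number of binary reads (1 = malignant) by majority."""
    return 1 if sum(reads) * 2 > len(reads) else 0


# Hypothetical scores, for illustration only (not study data).
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
combined = majority_vote([1, 0, 1])
```

The pairwise loop is quadratic in sample size, which is negligible for a validation set of 46 images.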

Results

Study Participants

Among the selected 186 positive low-dose CT images, 25, 15, and 146 images were from the T0, T1, and T2 annual screening examinations, respectively. The mean dominant nodule size was 11.73 mm ± 4.08 (median, 11 mm; range, 4–20 mm; Fig E1 [online]). Thirty-six (40%) of 90 case patients and 47 (49%) of 96 control subjects had dominant nodule diameters of 10 mm or smaller. The demographic characteristics between the 90 case patients and the 96 control subjects were well matched, although patients with cancer had a slightly higher number of pack-years and total years of smoking (Table 2). The age of the female participants ranged from 55 to 74 years for both case patients and control subjects. The age of the male participants ranged from 55 to 74 years for case patients and from 55 to 73 years for control subjects. The total number of identified noncalcified nodules (≥4 mm in diameter) in case patients and control subjects was 139 and 183, respectively (Tables 3, E1, and E2 [online]). Dominant nodules (one per image per person) in both case patients and control subjects had a similar radiographic appearance and distributions (Table 4).

Table 2.

Demographic Data in the 90 Patients with Cancer and 96 Matched Control Subjects with Benign Conditions

Note.—Unless otherwise stated, data are means ± standard deviations. All individuals underwent lung biopsy because of a positive CT screening result.

*Data are numbers of patients or control subjects, with percentages in parentheses.

Table 3.

Distribution of All Detected Noncalcified Nodules or Masses with Opacity ≥ 4 mm in Diameter from the 186 Images

Note.—Unless otherwise stated, data are means ± standard deviations.

Table 4.

Characteristics of Dominant Nodules

Note.—Unless otherwise stated, data are means ± standard deviations.

Because nodule diameter was used to match case patients and control subjects, it was not associated with biopsy outcomes. Figure 3 displays the nodule distribution by using a single classification tree. The number of dominant nodules with cancerlike scores of 1, 2, and 3, respectively, was 68, 52, and 66. Among 66 nodules with a cancerlike score of 3, 12 were biopsy negative, and 51 biopsy-positive nodules had the highest cp feature scores (all > 0.649). Among the 68 nodules with a cancerlike score of 1, 11 were biopsy positive, and 53 biopsy-negative nodules had the highest perinodular mean second-order gradient score (d2m2 > 0.03). For the 52 nodules with a cancerlike score of 2, 45 were correctly classified as positive or negative by using a combination of smoking pack-year history, contrast features (f169) from wavelet imaging, and relative change in mean density from intranodular to perinodular regions (feature ct.m21).

Figure 3:

Graph shows distribution of the 186 dominant nodules using a single classification tree. f169 is the contrast feature from the wavelet image with LLL-pass filter. pkyr = Pack-year.

The random forest built from the 140 training set images with 38 variables (Table E3 [online]) identified 10 top-ranked variables (Fig E2 [online]). The summary score cancerlike is the most important feature to improve the prediction accuracy, on the basis of both mean decrease in accuracy and mean decrease in Gini impurity criteria in feature importance ranking. Gini impurity measures the loss in accuracy when observed labeling from that variable is not used. The second most important feature was vessel involvement, which indicates whether there is a vessel (either normal or abnormal) penetrating into the nodule. Although multiple image features were found to be significantly different (P < .001, rank test) between case patients and control subjects, none was strong enough when used alone, with AUCs for individual features ranging from 0.48 to 0.60. Thus, all single identified variables were weak classifiers.
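For readers less familiar with the Gini criterion, the standard definition can be written out in a few lines. For a node whose class proportions are p_k, the Gini impurity is 1 minus the sum of p_k squared, and the decrease produced by a split is the parent impurity minus the size-weighted impurities of the two children. In common random forest implementations, the mean decrease in Gini for a variable aggregates these decreases over every split on that variable across the trees. The sketch below uses hypothetical function names and toy labels and is not taken from the study software.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 - sum over classes of p_k squared."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease from splitting `parent` into `left` and `right`."""
    n = len(parent)
    return (
        gini(parent)
        - (len(left) / n) * gini(left)
        - (len(right) / n) * gini(right)
    )


# Toy node with two cancers (1) and two benign nodules (0).
parent = [1, 1, 0, 0]
perfect_split = gini_decrease(parent, [1, 1], [0, 0])  # separates the classes
useless_split = gini_decrease(parent, [1, 0], [1, 0])  # changes nothing
```

A variable that repeatedly yields large decreases of this kind ranks high in the importance plot.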

Prediction in the Validation Sample

The Pm score from the random forest analysis of the 140 training set images was applied to the test set of 46 images. The Pm scores in the 20 case patients were significantly higher than those in the 26 control subjects (Fig 4a, P < .001), with an AUC of 0.9154 (Fig 4b). When we used a prespecified cutoff of a Pm of 0.5 to define image positive prediction and with biopsy outcome as the ground truth, the sensitivity and specificity were 0.95 (19 of 20) and 0.88 (23 of 26), respectively. The PPV and negative predictive value (NPV) were 0.86 and 0.96, respectively. In contrast, all 46 individuals had positive low-dose CT screening results in the NLST study, while only 20 had positive lung biopsies. Because we selected only positive images, the sensitivity of the NLST investigators was 1.00 and specificity was 0.00 by default. The PPV for the NLST investigators was 0.43, and the NPV was not evaluable. The sensitivity, specificity, PPV, and NPV from the three radiologists using a majority vote were 0.70 (14 of 20), 0.69 (18 of 26), 0.64, and 0.75, respectively. Thus, CAD increased PPV by 0.43 as compared with the NLST and by 0.22 as compared with the three radiologists’ reading. Meanwhile, CAD decreased the FP rate by 0.88 as compared with the NLST and by 0.19 as compared with the three radiologists’ reading. The overall prediction accuracy from CAD (91% [42 of 46]) was significantly higher than the radiologist reading (70% [32 of 46]), with P = .0180 (two-sided test).
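The reported rates can be checked from the confusion-matrix counts implied by the text. The true-positive and true-negative counts below are quoted directly (19 of 20 and 23 of 26 for CAD; 14 of 20 and 18 of 26 for the radiologists); the false-negative and false-positive counts are derived by subtraction. The function name `diagnostic_metrics` is introduced here for illustration.

```python
def _ratio(numerator, denominator):
    """Return numerator/denominator, or None when it is not evaluable."""
    return numerator / denominator if denominator else None


def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "fp_rate": _ratio(fp, fp + tn),  # 1 - specificity
        "accuracy": _ratio(tp + tn, tp + fn + tn + fp),
    }


# Counts implied by the validation-set results (20 cancers, 26 benign).
cad = diagnostic_metrics(tp=19, fn=1, tn=23, fp=3)
radiologists = diagnostic_metrics(tp=14, fn=6, tn=18, fp=8)
# All 46 images were screening positive, so NLST has no negative calls.
nlst = diagnostic_metrics(tp=20, fn=0, tn=0, fp=26)
```

With no negative calls, the NLST denominator for NPV is zero, so the sketch returns `None`, matching the statement that the NPV was not evaluable.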

Figure 4:

Graphs show (a) the predicted probability of malignancy and (b) the diagnostic performance of the random forest algorithm applied to the 46 independent test set image scans. The random forest algorithm was built by using 140 image scans in the training set. CI = confidence interval.

CAD performed well, even for very small nodules. For example, Figure 5, A shows a nodule in a 71-year-old man with a 90–pack-year smoking history and no history of extrapulmonary malignancy who had negative T0 and T1 screenings and only a single nodule smaller than 4 mm identified in the left upper lobe. At T2 screening, the nodule size increased to 5 mm, showing an air bronchogram, a sign that is indeterminate for malignancy. There was no other abnormality in the lung parenchyma, pleura, or chest wall. Because the nodule grew slowly, it fell into the Lung CT Screening Reporting and Data System (Lung-RADS) 4A category. The patient underwent thoracotomy at T2 after diagnostic CT, fluorodeoxyglucose (FDG) positron emission tomography (PET), pulmonary function testing and spirometry, and clinical evaluation. The T2 image had a Pm of 0.7628 (> 0.5), and the patient was given a diagnosis of stage IA acinar adenocarcinoma at resection at T2. As another example, Figure 5, B is from the T0 screening study in a 68-year-old man with a 50–pack-year smoking history and no history of extrapulmonary malignancy. A 6-mm ovoid nodule was seen along the right minor fissure, with no other abnormality of the lung parenchyma, pleura, or chest wall. This nodule was best categorized as a Lung-RADS 3 lesion (probably benign) and may possibly represent an intrapulmonary lymph node. However, CAD scored this nodule as Pm of 0.7778 (> 0.5), suggesting a high malignancy potential. After magnetic resonance imaging, ultrasonography, diagnostic CT, FDG PET, and clinical evaluation, this individual underwent lung biopsy and was given a diagnosis of acinar adenocarcinoma. The malignancy rapidly spread (stage IV), and the patient died within 13 months of T0 screening, despite systemic chemotherapy.

Figure 5:

The texture analysis algorithm correctly classified, A, B, biopsy-positive nodules with a benign appearance as positive and, C, D, biopsy-negative nodules with a malignant appearance as negative. Biopsies of nodules in A, B, C, and D were performed at screening years T2, T0, T2, and T0, respectively, as described in the text. The low-dose CT images from the same screening year before these biopsies were used in the texture analysis to calculate their predicted probabilities of malignancy Pm scores, which were as follows: 0.7628 for the nodule in A, 0.7778 for the nodule in B, 0.317 for the nodule in C, and 0.113 for the nodule in D. Texture maps from these images are shown at the right.

An interesting finding is that CAD could also correctly classify benign nodules with CT image features that were generally considered to be typical for malignancy. For example, the nodule in Figure 5, C was in a 63-year-old woman with 46 pack-years of smoking history and grew from 12 mm (at T0 screening) to 14 mm (at T1 screening) and further increased to 19 mm (at T2 screening). The patient underwent diagnostic CT and FDG PET, and a subsequent percutaneous transthoracic biopsy was negative, consistent with the CAD score of Pm of 0.317 (< 0.5) derived from the T2 screening image. She was alive without lung cancer after 5 years of follow-up by the end of the NLST study. In another case, the nodule shown in Figure 5, D was present in a 61-year-old woman with 43 pack-years of smoking. Initial screening at T0 demonstrated a 15-mm spiculated solid nodule in the lower left lobe, and after diagnostic CT and clinical evaluation, the patient underwent lung biopsy at the T0 screening year that was negative for malignancy at pathologic examination. This T0 nodule was scored as having a Pm of 0.113 (< 0.5) by CAD, and the nodule was stable during the T1 and T2 follow-up screening rounds, with sizes of 15 and 14 mm, respectively. This patient was also alive without lung cancer at the end of the NLST study, after 7.11 years of follow-up.

Exploratory Analysis

Several CAD features provided independent nodule information other than the image features commonly used by radiologists. The cp score was significantly higher in case patients than in control subjects (P = .005, Wilcoxon rank test). Histologic subtype reports were available for 46 malignant nodules that were surgically resected. For each hematoxylin-eosin–stained slide, up to four regions of interest (ROIs) with a total of 103 ROIs were identified. The histologic subtype in each ROI was recorded. Among the ROIs, nodules with highest median cp scores were from small-cell carcinoma (median = 30.7; second quartile [Q2] = 30.7; third quartile [Q3] = 30.7; n = 2), papillary adenocarcinoma (median = 8.0; Q2 = 4.4; Q3 = 9.4; n = 7), and squamous cell carcinoma (median = 7.2; Q2 = 3.2; Q3 = 9.2; n = 10), while nodules with low median cp scores were from mixed subtypes of adenocarcinoma (median = 2.3; Q2 = 0.8; Q3 = 4.7; n = 8) or nonmucinous adenocarcinoma in situ (median = 3.9; Q2 = 1.2; Q3 = 8.4; n = 22). In addition, cp score was positively correlated with the percentage of tumor cells (Spearman correlation R = 0.2446, P = .0169), but not with the percentage of inflammatory cells (R = −0.0115, P = .9126). 

Feature d2m2 measures the mean second gradient in the margin area. It is a measure of the convexity of the image intensity. Larger values of d2m2 correspond to clearer margins (or sharper differences in Hounsfield unit scale between the nodule’s internal and margin areas), while nodules with lower d2m2 values often have blurred margins. The d2m2 score was negatively correlated with the percentage of tumor cells (R = −0.2079, P = .0432) and was positively correlated with the percentage of inflammatory cells (R = 0.2673, P = .0096).
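The Spearman correlations reported in this section are Pearson correlations computed on ranks, with tied values sharing their average rank. A minimal sketch of that calculation follows; the function names are hypothetical and the example values are illustrative, not study data.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)
```

Because only ranks enter the calculation, the coefficient is unchanged by any monotonic rescaling of a feature score, which suits features such as cp and d2m2 whose scales are not directly interpretable.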

Discussion

Radiologists typically risk stratify noncalcified indeterminate pulmonary nodules by interpreting nodule characteristics such as diameter, volume, margin, attenuation, and location. However, none of these variables alone is sufficient to accurately classify indeterminate pulmonary nodules, because of significant overlap among risk categories and complicated interactions among features (31–34). For example, many benign nodules, such as granulomas, could have a malignant appearance at CT because of spiculated or lobulated margins without a subsolid or ground-glass component. The CAD algorithm performed well, even for small (<10 mm) lung nodules.

By the design of this matched case-control study, both benign and malignant nodules were from patients with similar levels of risk for lung cancer, on the basis of smoking status and radiographic characteristics commonly used by radiologists. The improved diagnostic performance of CAD was thus likely due to the independent information from image features. The higher prediction accuracy of the CAD system on the independent validation data, as compared with the three radiologists’ readings, suggests that CAD can substantially reduce the low-dose CT screening FP rate and increase the PPV of lung nodule evaluation. By helping radiologists distinguish benign lesions from malignant ones, CAD has the potential to reduce the morbidity associated with low-dose CT screening, including radiation exposure, overdiagnosis of incidental findings, and anxiety, as well as to reduce unnecessary testing and the financial costs of lung cancer screening.

There were a number of limitations to this study. First, the study sample was restricted only to participants who underwent lung biopsy or resection after a positive low-dose CT screening result, with case patients and control subjects retrospectively selected and matched. Thus, our study sample might not be representative of a general lung screening population. Second, our analysis used only one low-dose CT image acquired right before the lung biopsy per person, and only one dominant nodule per image was used in feature extraction. In practice, patients could have multiple detected nodules and repeated CT studies. Including growth or stability information is likely to improve the diagnostic performance (35,36). However, by including only one nodule per patient in our analysis, correlation among nodules in the same patient was avoided. Third, the CAD algorithm was built from a random forest that used bootstrap samples from the training data. Because of the small sample size in this study, and because image reconstruction parameters vary among images, the performance of CAD requires prospective validation in a screening population.

Several prediction models have recently been proposed to estimate the probability of cancer for detected nodules by using demographic risk factors and limited image features such as nodule size, volume, margin, attenuation, and count (37,38). Addition of nodule and non-nodule image texture features to these prediction models could substantially increase their accuracies for lung cancer screening.

Advances in Knowledge

  • Even when patients with lung cancer and those without cancer were matched by known lung cancer development risk factors in age, sex, body mass index, smoking status, chronic obstructive pulmonary disease, dominant lung nodule diameter, margin category, and attenuation (nonsolid, part solid, or solid), computer-aided diagnosis (CAD) could still use intra- and extranodule characteristics to improve small (≤20 mm) nodule prediction accuracy from 0.70 (radiologists’ reading) to 0.91 (two-sided P = .0180).

  • As compared with radiologists’ reading using the contemporary clinical practice guidelines, CAD increased positive predictive value from 0.64 to 0.86 and decreased the false-positive rate (1 − specificity) from 0.31 to 0.11 for indeterminate pulmonary nodules with diameters of 20 mm or smaller.

Implication for Patient Care

  • The CAD image analysis method significantly improved diagnostic accuracy for lung nodules detected at low-dose CT, from 70% to 91% (two-sided P = .0180).

SUPPLEMENTAL TABLES

Tables E1–E3
ry162725suppa1.pdf (104.2KB, pdf)

SUPPLEMENTAL FIGURES

Figure E1:
ry162725suppf1.jpg (66.6KB, jpg)
Figure E2:
ry162725suppf2.jpg (68.9KB, jpg)

Acknowledgments

The authors thank the NCI for access to the NCI data collected by the NLST. We also thank Paul Pinsky, PhD, for his helpful comments on an early version of the manuscript, and Jon Steingrimsson, PhD, for statistical data analysis.

Received December 6, 2016; revision requested January 23, 2017; revision received April 10; accepted May 9; final version accepted May 24.

P. Huang, R.Y., J.L., and C.T.L. supported by Johns Hopkins-Allegheny Health Network Cancer Research. P. Huang and D.S.E. supported by National Cancer Institute (P30CA006973). P. Huang, S.P., J.L., M.B., and E.K.F. supported by Johns Hopkins University (2015 Discovery Award).

The statements contained herein are solely those of the authors and do not represent or imply concurrence or endorsement by NCI.

E.G. and S.L. contributed equally to this work.

Disclosures of Conflicts of Interest: P. Huang disclosed no relevant relationships. S.P. disclosed no relevant relationships. R.Y. disclosed no relevant relationships. J.L. disclosed no relevant relationships. L.C.C. disclosed no relevant relationships. C.T.L. disclosed no relevant relationships. A.H. disclosed no relevant relationships. J.R. disclosed no relevant relationships. B.T. disclosed no relevant relationships. C.C. disclosed no relevant relationships. R.H. disclosed no relevant relationships. D.S.E. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is a consultant for Boehringer Ingelheim, BMS, Eli Lilly, EMD Serono, Genentech, Helsinn Therapeutics, Herron Therapeutics, Trovagene, BeyondSpring Pharma, Celgene, Regeneron, Guardant, and AbbVie. Other relationships: disclosed no relevant relationships. M.B. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is a consultant with Cepheid. Other relationships: disclosed no relevant relationships. P. Hu disclosed no relevant relationships. E.K.F. disclosed no relevant relationships. E.G. disclosed no relevant relationships. S.L. disclosed no relevant relationships.

Abbreviations:

AUC = area under the receiver operating characteristic curve
BMI = body mass index
CAD = computer-aided diagnosis
COPD = chronic obstructive pulmonary disease
FP = false-positive
Lung-RADS = Lung CT Screening Reporting and Data System
NCI = National Cancer Institute
NLST = National Lung Screening Trial
PPV = positive predictive value

Contributor Information

Peng Huang, Email: phuang12@jhmi.edu.

Ping Hu, Email: phuang12@jhmi.edu.

References

  • 1. National Lung Screening Trial Research Team, Aberle DR, Adams AM, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 2011;365(5):395–409.
  • 2. Mahadevia PJ, Fleisher LA, Frick KD, Eng J, Goodman SN, Powe NR. Lung cancer screening with helical computed tomography in older adult smokers: a decision and cost-effectiveness analysis. JAMA 2003;289(3):313–322.
  • 3. Nanavaty P, Alvarez MS, Alberts WM. Lung cancer screening: advantages, controversies, and applications. Cancer Contr 2014;21(1):9–14.
  • 4. Bach PB, Mirkin JN, Oliver TK, et al. Benefits and harms of CT screening for lung cancer: a systematic review. JAMA 2012;307(22):2418–2429.
  • 5. Gierada DS, Pinsky P, Nath H, Chiles C, Duan F, Aberle DR. Projected outcomes using different nodule sizes to define a positive CT lung cancer screening examination. J Natl Cancer Inst 2014;106(11):dju284.
  • 6. Meza R, ten Haaf K, Kong CY, et al. Comparative analysis of 5 lung cancer natural history and screening models that reproduce outcomes of the NLST and PLCO trials. Cancer 2014;120(11):1713–1724.
  • 7. Moyer VA; U.S. Preventive Services Task Force. Screening for lung cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med 2014;160(5):330–338.
  • 8. Bartholmai BJ, Koo CW, Johnson GB, et al. Pulmonary nodule characterization, including computer analysis and quantitative features. J Thorac Imaging 2015;30(2):139–156.
  • 9. El-Baz A, Beache GM, Gimel’farb G, et al. Computer-aided diagnosis systems for lung cancer: challenges and methodologies. Int J Biomed Imaging 2013;2013:942353.
  • 10. Ganeshan B, Abaleke S, Young RC, Chatwin CR, Miles KA. Texture analysis of non-small cell lung cancer on unenhanced computed tomography: initial evidence for a relationship with tumour glucose metabolism and stage. Cancer Imaging 2010;10:137–143.
  • 11. Ganeshan B, Goh V, Mandeville HC, Ng QS, Hoskin PJ, Miles KA. Non-small cell lung cancer: histopathologic correlates for texture parameters at CT. Radiology 2013;266(1):326–336.
  • 12. Aerts HJ, Velazquez ER, Leijenaar RT, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 2014;5:4006.
  • 13. Lambin P, Rios-Velazquez E, Leijenaar R, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 2012;48(4):441–446.
  • 14. Raghunath S, Maldonado F, Rajagopalan S, et al. Noninvasive risk stratification of lung adenocarcinoma using quantitative computed tomography. J Thorac Oncol 2014;9(11):1698–1703.
  • 15. Foley F, Rajagopalan S, Raghunath SM, et al. Computer-aided nodule assessment and risk yield risk management of adenocarcinoma: the future of imaging? Semin Thorac Cardiovasc Surg 2016;28(1):120–126.
  • 16. Maldonado F, Duan F, Raghunath SM, et al. Noninvasive computed tomography-based risk stratification of lung adenocarcinomas in the National Lung Screening Trial. Am J Respir Crit Care Med 2015;192(6):737–744.
  • 17. Sluimer I, Schilham A, Prokop M, van Ginneken B. Computer analysis of computed tomography scans of the lung: a survey. IEEE Trans Med Imaging 2006;25(4):385–405.
  • 18. Cirujeda P, Dicente Cid Y, Muller H, et al. A 3-D Riesz-covariance texture model for prediction of nodule recurrence in lung CT. IEEE Trans Med Imaging 2016 Jul 18. [Epub ahead of print]
  • 19. Depeursinge A, Yanagawa M, Leung AN, Rubin DL. Predicting adenocarcinoma recurrence using computational texture models of nodule components in lung CT. Med Phys 2015;42(4):2054–2063. [Published correction appears in Med Phys 2015;42(5):2653.]
  • 20. Way TW, Hadjiiski LM, Sahiner B, et al. Computer-aided diagnosis of pulmonary nodules on CT scans: segmentation and classification using 3D active contours. Med Phys 2006;33(7):2323–2337.
  • 21. Way TW, Sahiner B, Chan HP, et al. Computer-aided diagnosis of pulmonary nodules on CT scans: improvement of classification performance with nodule surface features. Med Phys 2009;36(7):3086–3098.
  • 22. Raman SP, Chen Y, Schroeder JL, Huang P, Fishman EK. CT texture analysis of renal masses: pilot study using random forest classification for prediction of pathology. Acad Radiol 2014;21(12):1587–1596.
  • 23. Raman SP, Schroeder JL, Huang P, et al. Preliminary data using computed tomography texture analysis for the classification of hypervascular liver lesions: generation of a predictive model on the basis of quantitative spatial frequency measurements–a work in progress. J Comput Assist Tomogr 2015;39(3):383–395.
  • 24. Chalkidou A, O’Doherty MJ, Marsden PK. False discovery rates in PET and CT studies with texture features: a systematic review. PLoS One 2015;10(5):e0124165.
  • 25. Chen DC, Huang P, Cheng XZ. A concrete statistical realization of Kleinberg’s stochastic discrimination for pattern recognition. Part I. Two-class classification. Ann Stat 2003;31(5):1393–1413.
  • 26. Gu Y, Kumar V, Hall LO, et al. Automated delineation of lung tumors from CT images using a single click ensemble segmentation approach. Pattern Recognit 2013;46(3):692–702.
  • 27. Kuhnigk JM, Hahn H, Hindennach M, Dicken V, Krass S, Peitgen HO, eds. Lung lobe segmentation by anatomy-guided 3D watershed transform. In: Sonka M, Fitzpatrick JM, eds. Proceedings of SPIE: medical imaging 2003—image processing. Vol 5032. Bellingham, Wash: International Society for Optics and Photonics, 2003; 1482.
  • 28. Wei Q, Hu Y, Gelfand G, Macgregor JH. Segmentation of lung lobes in high-resolution isotropic CT images. IEEE Trans Biomed Eng 2009;56(5):1383–1393.
  • 29. Wei Q, Hu Y, MacGregor JH, Gelfand G. Segmentation of lung lobes in clinical CT images. Int J Comput Assist Radiol Surg 2008;3(1-2):151–163.
  • 30. Patz EF, Jr, Greco E, Gatsonis C, Pinsky P, Kramer BS, Aberle DR. Lung cancer incidence and mortality in National Lung Screening Trial participants who underwent low-dose CT prevalence screening: a retrospective cohort analysis of a randomised, multicentre, diagnostic screening trial. Lancet Oncol 2016;17(5):590–599.
  • 31. Li F, Sone S, Abe H, Macmahon H, Doi K. Malignant versus benign nodules at CT screening for lung cancer: comparison of thin-section CT findings. Radiology 2004;233(3):793–798.
  • 32. Siegelman SS, Zerhouni EA, Leo FP, Khouri NF, Stitik FP. CT of the solitary pulmonary nodule. AJR Am J Roentgenol 1980;135(1):1–13.
  • 33. Wahidi MM, Govert JA, Goudar RK, Gould MK, McCrory DC; American College of Chest Physicians. Evidence for the treatment of patients with pulmonary nodules: when is it lung cancer?: ACCP evidence-based clinical practice guidelines (2nd edition). Chest 2007;132(3 Suppl):94S–107S.
  • 34. Xu DM, van Klaveren RJ, de Bock GH, et al. Limited value of shape, margin and CT density in the discrimination between benign and malignant screen detected solid pulmonary nodules of the NELSON trial. Eur J Radiol 2008;68(2):347–352.
  • 35. Huang P, Tilley BC, Woolson RF, Lipsitz S. Adjusting O’Brien’s test to control type I error for the generalized nonparametric Behrens-Fisher problem. Biometrics 2005;61(2):532–539.
  • 36. Huang P, Woolson RF, O’Brien PC. A rank-based sample size method for multiple outcomes in clinical trials. Stat Med 2008;27(16):3084–3104.
  • 37. McWilliams A, Tammemagi MC, Mayo JR, et al. Probability of cancer in pulmonary nodules detected on first screening CT. N Engl J Med 2013;369(10):910–919.
  • 38. Horeweg N, van Rosmalen J, Heuvelmans MA, et al. Lung cancer probability in patients with CT-detected pulmonary nodules: a prespecified analysis of data from the NELSON trial of low-dose CT screening. Lancet Oncol 2014;15(12):1332–1341.

Articles from Radiology are provided here courtesy of Radiological Society of North America
