Abstract
Objective
To demonstrate a data-driven method for personalizing lung cancer risk prediction using a large clinical dataset.
Materials and Methods
An algorithm was used to categorize nodules found in the first screening year of the National Lung Screening Trial as malignant or nonmalignant. Risk of malignancy for nodules was calculated based on size criteria according to the Fleischner Society recommendations from 2005, along with the additional discriminators of pack-years smoking history, sex, and nodule location. Imaging follow-up recommendations were assigned according to Fleischner size category malignancy risk.
Results
Nodule size correlated with malignancy risk as predicted by the Fleischner Society recommendations. With the additional discriminators of smoking history, sex, and nodule location, significant risk stratification was observed. For example, men with ≥60 pack-years smoking history and upper lobe nodules measuring >4 and ≤6 mm demonstrated significantly increased risk of malignancy at 12.4% compared to the mean of 3.81% for similarly sized nodules (P < .0001).
Based on personalized malignancy risk, 54% of nodules >4 and ≤6 mm were reclassified to longer-term follow-up than recommended by Fleischner. Twenty-seven percent of nodules ≤4 mm were reclassified to shorter-term follow-up.
Discussion
Using available clinical datasets such as the National Lung Screening Trial in conjunction with locally collected datasets can help clinicians provide more personalized malignancy risk predictions and follow-up recommendations.
Conclusion
By incorporating 3 demographic data points, the risk of lung nodule malignancy within the Fleischner categories can be considerably stratified and more personalized follow-up recommendations can be made.
Keywords: medical informatics, data mining, clinical decision support, cancer screening, lung cancer
OBJECTIVE
To demonstrate a data-driven method for personalizing lung cancer risk prediction using a large clinical dataset.
BACKGROUND AND SIGNIFICANCE
Incidental pulmonary nodules in patients at high risk for lung cancer are a frequent diagnostic finding necessitating appropriate imaging follow-up. The Fleischner Society’s recommendations regarding management of small pulmonary nodules incidentally detected on computed tomography (CT) scans provide guidelines for the management of these often clinically challenging lesions.1,2 These guidelines, created by an expert panel performing a comprehensive review of the available literature, represent an estimate of the malignancy potential of a pulmonary nodule based on size and the patient’s relative lung cancer risk. A limitation of the Fleischner criteria is that risk stratification does not take into account a patient’s medical history, except for assignment to low- or high-risk categories based on clinical risk factors. Other widely used incidental lesion and screening criteria (eg, Liver Imaging Reporting and Data System [LI-RADS], Breast Imaging Reporting and Data System [BI-RADS], and Lung Imaging Reporting and Data System [Lung-RADS]) similarly make limited use of individual risk factors when estimating risk.1–5
Large clinical datasets, including the National Lung Screening Trial (NLST) and Pan-Canadian Early Detection of Lung Cancer Study (PanCan), have facilitated the development of data-driven predictive modeling of lung cancer risk, identifying patient demographics and nodule characteristics associated with a higher risk of cancer.6,7 McWilliams et al. used the PanCan dataset to identify multiple predictors of malignancy, such as larger nodule size, spiculation, female sex, and location of the nodule in the upper lobes.7 The availability of these databases for analysis facilitates evaluation of the efficacy of widely utilized, expert-based clinical guidelines such as the Fleischner criteria. In addition, these databases offer a resource that can form the basis of decision support tools, incorporating multiple data points and providing more precise recommendations based on individual patient characteristics.8
A major strength of generalized guidelines such as the Fleischner criteria is simplicity; they require a radiologist or clinician to know only a few data points (such as “high-risk” vs “low-risk” patient history, nodule density, and nodule size) that are often readily available. As clinical information systems become more widespread and integrated, automated availability of multiple data points could allow for more sophisticated clinical decision support tools. To demonstrate the utility of building such a tool backed by a large clinical dataset, we performed a retrospective evaluation of Fleischner criteria performance using the NLST clinical dataset. Three additional cancer risk factors were incorporated to identify potential risk adjustments to follow-up recommendations based on personalized knowledge of individual patient and nodule characteristics.
MATERIALS AND METHODS
The NLST dataset was obtained through the Cancer Data Access System, administered by the National Cancer Institute at the National Institutes of Health. A data transfer agreement was signed between the authors and the National Cancer Institute, permitting access to the dataset for use as described in the proposed research plan. Personally identifiable health information was not provided for any patient, in accordance with the Health Insurance Portability and Accountability Act, and researchers are restricted from attempting to identify participants.
The NLST enrolled 53 454 participants ages 55–74 years with a smoking history of ≥30 pack-years who were current or former smokers (former smokers being those who reported quitting within the last 15 years). A total of 26 722 participants were randomized to a helical CT arm and 26 732 to a chest radiography arm. Three yearly follow-up studies were performed from the time of randomization, with a median follow-up time of 6.5 years.9,10
A custom web application developed at the authors’ institution was used for initial querying and analysis of the NLST dataset and has been previously described.8 The interface allows real-time interaction with the data via a relational database. Secondary data analysis was performed with Excel (Microsoft; Redmond, WA, USA).
Nodule malignancy algorithm
To determine cancer outcomes from individual pulmonary nodules identified in the CT arm of NLST, several assumptions were made. Each cancer diagnosis in the NLST dataset is given a location by lobe; however, the cancer is not linked to a specific nodule identified in screening. Therefore, we were unable to absolutely confirm which specific nodules in a given lobe eventually became malignant. As a surrogate, an algorithm was developed to estimate nodule malignancy. The largest nodule found in each patient on the initial screening scan (T0) was recorded, and the nodule was designated as malignant if a cancer was eventually diagnosed within the same lobe. All other smaller nodules were not included in this analysis, even if cancer was eventually diagnosed within another lobe.
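The surrogate rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the record layout (`Nodule`, `patient_id`, `lobe`, `diameter_mm`), the lobe codes, and the `cancer_lobes` mapping are hypothetical names rather than the NLST schema, and keeping the first nodule encountered when two nodules tie for largest is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nodule:
    patient_id: str      # hypothetical identifier
    lobe: str            # hypothetical lobe code, e.g. "RUL", "LLL"
    diameter_mm: float   # longest diameter on the T0 scan


def label_largest_nodules(nodules, cancer_lobes):
    """Return {patient_id: (largest T0 nodule, is_malignant)}.

    cancer_lobes maps patient_id -> set of lobes in which a cancer
    was eventually diagnosed (empty or missing if none).
    """
    largest = {}
    for n in nodules:
        best = largest.get(n.patient_id)
        # Keep only the largest nodule per patient; smaller ones are dropped.
        if best is None or n.diameter_mm > best.diameter_mm:
            largest[n.patient_id] = n
    # A nodule counts as malignant if a cancer was diagnosed in its lobe.
    return {
        pid: (n, n.lobe in cancer_lobes.get(pid, set()))
        for pid, n in largest.items()
    }


# Toy data (not from the NLST): p1's cancer arises in a different lobe
# from the largest nodule, so the surrogate labels that nodule nonmalignant.
nodules = [
    Nodule("p1", "RUL", 7.0),
    Nodule("p1", "LLL", 4.0),
    Nodule("p2", "RML", 5.0),
]
cancer_lobes = {"p1": {"LLL"}, "p2": {"RML"}}
labels = label_largest_nodules(nodules, cancer_lobes)
```

The `p1` case in the toy data illustrates how a cancer in a different lobe from the largest nodule goes uncounted under this rule.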
Nodule categorization
Nodules were categorized by size according to the Fleischner criteria, using longest diameter, then subdivided based on smoking history (ie, ≥60 pack-years vs <60 pack-years), sex, and nodule location (ie, within or not within the upper lobes); see Figure 1. A smoking history cutoff of 60 pack-years was chosen empirically to produce roughly equal numbers of nodules above and below the cutoff. For simplicity, the original size-based Fleischner recommendations from 2005 were used.1 Nodule opacity, as incorporated in the revised Fleischner recommendations of 2013, was not considered.2 As the NLST participant population only includes high-risk patients, only the “high-risk patient” Fleischner follow-up guidelines were analyzed, in order to match recommendations with the appropriate patient population as closely as possible. The order of stratification (smoking history, followed by nodule location, and finally sex) was arbitrary. For simplicity, not all possible categorical combinations were analyzed. These stratification parameters were chosen because they were previously shown to be associated with cancer risk.7
Figure 1.
Nodule subcategorization schema. Nodules initially categorized by size according to the Fleischner Society recommendations were further subdivided by pack-year smoking history, nodule location, and sex.
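As a minimal sketch of the subcategorization schema, the function below maps a nodule to its Fleischner size category and then to a subcategory in the stratification order used here (smoking history, location, sex). The function names, lobe codes, and label strings are hypothetical choices made for illustration.

```python
UPPER_LOBES = {"RUL", "LUL"}  # hypothetical codes for right/left upper lobe


def fleischner_size_category(diameter_mm):
    """Size category from the 2005 Fleischner recommendations (longest diameter)."""
    if diameter_mm <= 4:
        return "≤4 mm"
    if diameter_mm <= 6:
        return ">4 and ≤6 mm"
    if diameter_mm <= 8:
        return ">6 and ≤8 mm"
    return ">8 mm"


def subcategory(diameter_mm, pack_years, lobe, sex):
    """Size category, then pack-years (60 cutoff), then location, then sex."""
    return (
        fleischner_size_category(diameter_mm),
        "≥60 PY" if pack_years >= 60 else "<60 PY",
        "upper lobes" if lobe in UPPER_LOBES else "not in upper lobes",
        sex,
    )


example = subcategory(5.0, 65, "RUL", "M")
# example == (">4 and ≤6 mm", "≥60 PY", "upper lobes", "M")
```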
Based on the NLST nodule database, the risk of malignancy was calculated within each Fleischner size category. This percent of risk was used to describe the risk of malignancy required to recommend a specific imaging follow-up interval using the Fleischner recommendations. For example, for nodules in the >4 mm and ≤6 mm size range, Fleischner recommends a follow-up interval of 6–12 months for high-risk patients. In the NLST dataset, nodules of this size were associated with a 3.81% risk of malignancy, on average. The risk associated with each subcategory (eg, percent of malignant upper lobe nodules between 4 and 6 mm in men with a ≥60 pack-year smoking history) was then directly compared with the risks of each Fleischner size category. If the risk of a subcategory was found to be greater than the next larger Fleischner size category or less than the next smaller size category, the subcategory was reassigned to the follow-up recommendation of the larger or smaller size category.
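The reassignment rule can be sketched as follows, using the mean malignancy rates per size category reported in Table 1. The function name is hypothetical, and allowing a subcategory to move more than one category when its risk passes successive category means is an assumption about how the rule extends beyond adjacent categories.

```python
# Size categories in ascending order, with mean malignancy rates (%) from Table 1.
CATEGORIES = ["≤4 mm", ">4 and ≤6 mm", ">6 and ≤8 mm", ">8 mm"]
MEAN_RISK = {"≤4 mm": 3.16, ">4 and ≤6 mm": 3.81, ">6 and ≤8 mm": 7.39, ">8 mm": 21.79}


def reassign(category, subcategory_risk):
    """Return the category whose follow-up recommendation the subcategory receives."""
    start = CATEGORIES.index(category)
    i = start
    # Move up while the risk exceeds the mean risk of the next larger category.
    while i + 1 < len(CATEGORIES) and subcategory_risk > MEAN_RISK[CATEGORIES[i + 1]]:
        i += 1
    if i == start:
        # Otherwise move down while the risk is below the next smaller category's mean.
        while i - 1 >= 0 and subcategory_risk < MEAN_RISK[CATEGORIES[i - 1]]:
            i -= 1
    return CATEGORIES[i]


up = reassign(">4 and ≤6 mm", 12.4)    # exceeds 7.39, so moves up one category
down = reassign(">4 and ≤6 mm", 2.0)   # below 3.16, so moves down one category
same = reassign(">6 and ≤8 mm", 5.8)   # between 3.81 and 21.79, so unchanged
```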
Statistical analysis
Nodule subcategories were compared using Microsoft Excel 2011 (version 14.6.2). Odds ratios, confidence intervals, and P values were calculated using a 95% confidence level. P < .05 was considered significant.
RESULTS
A total of 5480 patients were found to have nodules of any size on initial CT screening, with 465 total cancers in the same lobe as the largest nodules, according to our algorithm, for a total calculated prevalence of malignancy in the nodules at initial screening of 8.5%. In comparison, the NLST reported 649 confirmed cancers in the low-dose CT screening arm over the course of the study, with 7191 positive screening exams, for an overall rate of malignancy in patients with positive initial screening studies of 9% (649/7191). As our analysis only considers the initial screening study, malignant nodules that developed after the initial screening exam are not included, and therefore these rates of malignancy are likely underestimated. Categorization of the nodules by Fleischner criteria demonstrated an exponential increase in rate of malignancy as nodule diameter increased (Figure 2). The 886 nodules ≤4 mm in diameter included 28 cancers (3.16%), whereas 1239 nodules >8 mm included 270 cancers (21.79%). Rates of malignancy among nodules in the smallest 2 categories (≤4 mm and >4 to ≤6 mm) were similar, at 3.16% and 3.81%, respectively. Nodules >6 and ≤8 mm had a rate of malignancy of 7.39% (Table 1).
Figure 2.
Rate of nodule malignancy by size, categorized according to the Fleischner criteria, demonstrating exponential increase in malignancy risk with increasing nodule size.
Table 1.
Rate of malignancy among largest nodules identified at initial screening, categorized by Fleischner size criteria
| Category | Total Nodules | Cancers | Rate (%) |
|---|---|---|---|
| ≤4 mm | 886 | 28 | 3.16 |
| >4 and ≤6 mm | 2259 | 86 | 3.81 |
| >6 and ≤8 mm | 1096 | 81 | 7.39 |
| >8 mm | 1239 | 270 | 21.79 |
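The rates in Table 1 follow directly from the counts. The short sketch below recomputes them, along with the overall prevalence of 8.5% (465 of 5480); the variable names are illustrative.

```python
# (total nodules, cancers) per Fleischner size category, from Table 1.
table1 = {
    "≤4 mm": (886, 28),
    ">4 and ≤6 mm": (2259, 86),
    ">6 and ≤8 mm": (1096, 81),
    ">8 mm": (1239, 270),
}

# Malignancy rate (%) per category, rounded to two decimals as in the table.
rates = {cat: round(100 * cancers / total, 2) for cat, (total, cancers) in table1.items()}

total_nodules = sum(total for total, _ in table1.values())
total_cancers = sum(cancers for _, cancers in table1.values())
overall_rate = round(100 * total_cancers / total_nodules, 1)
```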
Additional discriminators
With added discriminators, further stratification of malignancy rates was observed. Nodules ≤4 mm were associated with significantly increased risk when located in the upper lobes of all participants (men or women) with a ≥60 pack-year smoking history, with a rate of 7.0% compared to the overall group mean of 3.16% (odds ratio [OR] = 2.29; 95% CI, 1.02–5.16; P = .04). With the additional discriminator of sex, the trend toward increased risk in the group with ≥60 pack-years and upper lobe nodules remained; however, this did not reach the level of significance (Figure 3A).
Figure 3.
Odds ratio of malignancy risk for nodules within the Fleischner size categories, further stratified by smoking pack-years, nodule location, and sex. Nodules with longest diameter: (A) ≤4 mm, (B) >4 and ≤6 mm, (C) >6 and ≤8 mm, and (D) >8 mm. PY = pack years; “not in upper lobes” = bilateral lower lobes and right middle lobe.
For all nodules in the >4 and ≤6 mm Fleischner size category, the malignancy rate was 3.81% (86 cancers, 2259 nodules). For the subcategory of upper lobe nodules in men with ≥60 pack-years smoking history, the rate was 12.4% (21 cancers, 169 nodules), which was significantly increased from the mean (OR = 3.59; 95% CI, 2.16–5.94; P < .0001) (Figure 3B).
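The comparison above can be worked through numerically. The methods state only that Excel was used, so the formulas here are an assumption: the sketch uses the standard log (Woolf) method for the confidence interval and a two-sided normal approximation for the P value, and it compares the subcategory against the whole size category. Under those assumptions it gives values that round to the reported OR of 3.59 and 95% CI of 2.16–5.94.

```python
import math


def odds_ratio_ci(cases_a, total_a, cases_b, total_b, z=1.959964):
    """Odds ratio of group A vs group B with a log-method 95% CI and two-sided P."""
    a, b = cases_a, total_a - cases_a   # group A: cancers, non-cancers
    c, d = cases_b, total_b - cases_b   # group B: cancers, non-cancers
    odds_ratio = (a / b) / (c / d)
    log_or = math.log(odds_ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    # Two-sided P value from the normal approximation to log(OR) / SE.
    p_value = math.erfc(abs(log_or) / se / math.sqrt(2))
    return odds_ratio, lower, upper, p_value


# Upper lobe nodules in men with ≥60 pack-years (21 of 169)
# vs all nodules >4 and ≤6 mm (86 of 2259).
odds_ratio, lower, upper, p_value = odds_ratio_ci(21, 169, 86, 2259)
```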
For nodules >6 and ≤8 mm, a rate of malignancy of 5.8% (18 cancers, 310 nodules) was observed among nodules not located in the upper lobes in men with a <60 pack-year smoking history, which was significantly less than the overall Fleischner size category mean of 7.39% (OR = 0.52; 95% CI, 0.31–0.89; P = .02) (Figure 3C).
For nodules >8 mm in diameter, upper lobe nodules in women with a ≥60 pack-year smoking history demonstrated a significantly increased rate of malignancy of 41.8% (28 cancers, 67 nodules) when compared to the mean of all nodules in the >8 mm Fleischner size category of 21.8% (270 cancers, 1239 nodules) (OR = 2.58; 95% CI, 1.56–4.26; P = .0002). This subcategory also demonstrated the highest malignancy rate of any subcategory in our analysis. In contrast, lower lobe nodules in women with a <60 pack-year history demonstrated a significantly decreased rate of malignancy when compared to the overall group mean (13.8% vs 21.8%, respectively; OR = 0.58; 95% CI, 0.37–0.89; P = .01) (Figure 3D).
The combination of <60 pack-year smoking history, nodule not within the upper lobes, and male sex was associated with significantly decreased rate of malignancy in 3 out of 4 Fleischner size categories (>4 to ≤6, >6 to ≤8, and >8 mm). Significantly increased risk was seen particularly in the combination of ≥60 pack-year smoking history with upper lobe nodules (>4 to ≤6, >8 mm). The largest range of malignancy risk within a single Fleischner category was seen in the >8 mm group, with a minimum risk of 13.8% for women with fewer than 60 pack-years and nodules not in the upper lobes and a maximum risk of 41.8% for women with 60 or more pack-years and upper lobe nodules, for a difference of 28.0%. Differences, or risk ranges, of 5.8, 11.0, and 8.5% between the highest and lowest risk subgroups were seen within the ≤4, >4 to ≤6, and >6 to ≤8 mm Fleischner categories, respectively.
Follow-up interval reclassification
After subcategorization based on nodule location, smoking history, and patient sex, nodule subcategories were assigned new “recommended” follow-up intervals based on the average malignancy risk of nodules in Fleischner categories, as described above. More than half (54%; 1180 of 2259) of nodules >4 and ≤6 mm were reclassified from the 6- to 12-month follow-up recommendation to 12-month follow-up, because the average risk of malignancy of the subcategory was less than the average risk of all nodules ≤4 mm (ie, 3.16%, as listed in Table 1). Of the total 2259 nodules in the >4 and ≤6 mm category, 169 (7%) had a higher risk of malignancy than the 7.39% average risk of the >6 to ≤8 mm size group, and were therefore reclassified from 6- to 12-month follow-up to 3- to 6-month follow-up. Of the nodules ≤4 mm, 8% (68 nodules) demonstrated greatly increased risk and were reclassified upward by 2 categories, from the 12-month follow-up category to the 3- to 6-month follow-up recommendation. Nineteen percent of nodules ≤4 mm (171 nodules) were reclassified upward 1 category to the 6- to 12-month follow-up recommendation. No nodules >6 mm in diameter were reclassified according to this method (Figure 4).
Figure 4.
Reclassification of nodules based on mean risk of malignancy after application of additional discriminating factors.
DISCUSSION
Retrospective analysis of the NLST data confirms that the risk of malignancy of a given nodule is correlated with the Fleischner criteria and also demonstrates areas for potential improvement. Our analysis shows significant stratification of malignancy risk when incorporating additional clinical historical and nodule characteristics.
For instance, a smoking history of ≥60 pack-years with a nodule in the upper lobes was associated with significantly increased risk of malignancy in 3 of the 4 Fleischner categories (≤4 mm, P = .04; >4 to ≤6 mm, P < .0001; >8 mm, P < .0001) when compared to all nodules within each category. On the other hand, a <60 pack-year smoking history (but ≥30 pack-years, per the NLST inclusion criteria) with a nodule not in the upper lobes in men was associated with significantly decreased risk in 3 of 4 Fleischner categories compared to baseline (>4 to ≤6 mm, P < .0001; >6 to ≤8 mm, P = .02; >8 mm, P < .0001). These findings are in line with those from the PanCan dataset, which found increased risk of malignancy with upper lobe nodules and in women.7
Lung-RADS has been released and promoted by the American College of Radiology as superseding the Fleischner criteria in the setting of lung cancer screening.11 Lung-RADS improves on the Fleischner guidelines by including additional variables beyond nodule diameter (nodule composition, calcification, and growth vs stability). For this initial study, the 2005 Fleischner recommendations were chosen for their simplicity and widespread use; however, the same methods could be applied to any structured guideline, including Lung-RADS.
Lung-RADS also includes several risk modifiers, such as prior history of lung cancer, rapid nodule growth, or other suspicious findings increasing the risk of cancer. These modifiers are limited in that the risk contribution associated with each one is not quantified, and modifiers that could significantly decrease risk are not considered. In addition, modifiers could contribute more or less risk depending on the local population being screened. For instance, a screening program in an area with a high proportion of workers exposed to occupational respiratory hazards might need to place more weight on an occupational risk modifier. It then becomes more important to both maintain a local clinical screening dataset and have the tools to mine that dataset and appropriately modify Lung-RADS or Fleischner categorizations. Structured reporting guidelines like Fleischner and Lung-RADS add clarity and reproducibility to lung cancer screening exams; however, the criteria for assigning a Lung-RADS or Fleischner category to an abnormality could benefit from incorporating additional data.
As an example of an important risk modification from our data, consider 2 high-risk patients with 5 mm lung nodules categorized equally under the Fleischner criteria. If 1 patient happens to have a 30 pack-year smoking history with a lower lobe nodule and the other a 65 pack-year history with an upper lobe nodule, the latter has 5–15 times greater risk of malignancy based on the NLST cohort. With geographically constrained datasets supplementing national datasets, these risk differences could become even more pronounced, as endemic exposure can lead to high incidence of lung nodules.
The cost-saving potential of improving risk estimation in screening programs must also be considered. Our findings show that more than half (54%) of nodules >4 and ≤6 mm in diameter are less likely to be malignant than the average ≤4 mm nodule in the NLST dataset. These nodules could potentially be followed after a longer interval than currently recommended, reducing utilization and costs for screening programs. Conversely, a subset of small nodules (27% of nodules ≤4 mm; 239 of 886 nodules) have increased adjusted risk and may require closer follow-up than currently recommended (Figure 5). However, increased overall accuracy of lung cancer screening could ultimately increase the number of malignancies detected at an early stage, further decreasing long-term costs.
Figure 5.
Difference in distribution of nodule follow-up recommendations after application of additional discriminators, using average risk of Fleischner size categories as baseline.
Increased use and integration of electronic medical records facilitates easier access to the patient demographic data necessary for effective data mining and decision support. Computer-assisted detection software, or exportable annotations such as those created with Annotation and Image Markup12 or DICOM Structured Reporting,13 could be used to automate the collection of size and characteristic descriptors of nodules. This would allow one-click entry of required patient and nodule data to a clinical decision support tool similar to the one we developed and utilized. The tool could then dynamically query a large national clinical dataset such as the NLST, or a locally maintained dataset, to produce personalized malignancy risk and follow-up recommendations at the point of care for both interpreting radiologists and clinicians in the office setting.
Dynamic querying and risk analysis for each patient affords several benefits. The underlying datasets can be continually updated and improved, and new data can be used immediately. Also, each risk calculation can be performed based on a tailored cohort of similar patients, depending on which pieces of data are available in the electronic medical record. Risk calculators based on precalculated formulas, on the other hand, may be inaccurate if certain input data are missing.
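A minimal sketch of such a dynamic query is shown below: the cohort is narrowed only by the attributes that are actually known for the patient, and the malignancy rate is computed from whatever cohort results. The function name, field names, and toy records are hypothetical and do not reflect the NLST schema or the authors' web application.

```python
def cohort_risk(records, **known):
    """Malignancy rate among records matching every attribute that is known.

    Attributes passed as None are treated as unavailable and are not filtered on.
    Returns (rate, cohort_size); rate is None when no records match.
    """
    filters = {key: value for key, value in known.items() if value is not None}
    cohort = [
        r for r in records
        if all(r.get(key) == value for key, value in filters.items())
    ]
    if not cohort:
        return None, 0
    return sum(r["malignant"] for r in cohort) / len(cohort), len(cohort)


# Toy records for illustration only (not NLST data).
records = [
    {"size_cat": ">4 and ≤6 mm", "heavy_smoker": True, "upper_lobe": True, "sex": "M", "malignant": True},
    {"size_cat": ">4 and ≤6 mm", "heavy_smoker": True, "upper_lobe": True, "sex": "M", "malignant": False},
    {"size_cat": ">4 and ≤6 mm", "heavy_smoker": True, "upper_lobe": True, "sex": "F", "malignant": False},
    {"size_cat": ">4 and ≤6 mm", "heavy_smoker": False, "upper_lobe": False, "sex": "F", "malignant": False},
]

# Sex known: the cohort is restricted to matching men.
risk_with_sex, n_with_sex = cohort_risk(
    records, size_cat=">4 and ≤6 mm", heavy_smoker=True, upper_lobe=True, sex="M")

# Sex missing from the record: the query falls back to a broader cohort.
risk_no_sex, n_no_sex = cohort_risk(
    records, size_cat=">4 and ≤6 mm", heavy_smoker=True, upper_lobe=True, sex=None)
```

When an attribute is missing, the query simply returns a broader, less specific cohort instead of failing, which is the behavior that a precalculated formula cannot offer.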
Limitations
The 2005 Fleischner recommendations were created to guide decision-making specifically in the setting of incidental pulmonary nodules and were not necessarily designed for the screening population. In contrast, the NLST data we have analyzed are the result of a screening study; therefore, there might be discordance between the NLST population and the intended target of Fleischner recommendations. However, the Fleischner guidelines do specify “low-risk” and “high-risk” patients, the latter defined as those with a “history of smoking or of other known risk factors,” and also specify the exclusion of young patients and those with suspected or known malignancy. NLST enrollment criteria similarly require at least a 30 pack-year smoking history, age between 55 and 74 years, and no known or suspected malignancy. It was felt that the NLST population likely matched the Fleischner “high-risk” population closely enough for this analysis to be applicable.
The NLST dataset lacks specific nodule-tracking data to determine which specific nodules eventually become malignant. We therefore developed an algorithm, described above, based on the available data to determine the likely culprit nodule. Our strategy is susceptible to false positives in the form of a larger benign nodule in the same lobe as a smaller malignant nodule, as well as a malignancy not present on T0 screening developing in the same lobe as a previous benign nodule. False negatives can occur in the form of malignant nodules in a separate lobe from a larger benign nodule. Despite these disadvantages, it was predicted that the described algorithm would provide a reasonable estimate using available data. The close concordance between the rate of nodule malignancy found using our algorithm (8.5%) and the rate of cancer among patients with positive initial exams (9.0%) suggests that the algorithm performs well, and the underestimation may be due to false negatives related to cancer developing in a different lobe from the largest nodule or after the initial screening exam. To remove the need for this estimation, the source dataset ideally would track specific nodules from time of detection to time of cancer diagnosis, noting nodule growth characteristics as well as which nodule (if possible to determine) produced the primary tumor.
CONCLUSION
Large clinical datasets such as the NLST can be used to personalize individual lung nodule cancer risk by incorporating additional demographic and nodule characteristic data points. Personalized risk could be used to tailor imaging and clinical follow-up recommendations. Dynamic querying of large clinical datasets could be used to improve the accuracy of clinical decision support tools.
Competing interests
None.
Funding
None.
ACKNOWLEDGMENTS
The authors thank Brigitte Pocta for her assistance in preparation and editing of this manuscript.
References
- 1. Macmahon H, Austin JHM, Gamsu G, et al. Radiology guidelines for management of small pulmonary nodules detected on CT scans: a statement from the Fleischner Society. Radiology. 2005;237:395–400.
- 2. Naidich DP, Bankier AA, Macmahon H, et al. Recommendations for the management of subsolid pulmonary nodules detected at CT: a statement from the Fleischner Society. Radiology. 2013;266(1):304–17.
- 3. American College of Radiology. Liver Imaging Reporting and Data System. http://www.acr.org/quality-safety/resources/LIRADS. Accessed January 14, 2015.
- 4. Hathout GM, Fink JR, El-Saden SM, Grant EG. Sonographic NASCET index: a new doppler parameter for assessment of internal carotid artery stenosis. Am J Neuroradiol. 2005;26:68–75.
- 5. Gould MK, Fletcher J, Iannettoni MD, et al. Evaluation of patients with pulmonary nodules: when is it lung cancer?: ACCP evidence-based clinical practice guidelines (2nd ed.). Chest. 2007;132(3 Suppl):108S–130S.
- 6. Tammemagi CM, Pinsky PF, Caporaso NE, et al. Lung cancer risk prediction: Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial models and validation. J Natl Cancer Inst. 2011;103(13):1058–68.
- 7. McWilliams A, Tammemagi MC, Mayo JR, et al. Probability of cancer in pulmonary nodules detected on first screening CT. N Engl J Med. 2013;369(10):910–19.
- 8. Morrison JJ, Hostetter J, Wang K, et al. Data-driven decision support for radiologists: re-using the National Lung Screening Trial dataset for pulmonary nodule management. J Digit Imaging. 2014;28(1):18–23.
- 9. National Lung Screening Trial Research Team, Aberle DR, Berg CD, et al. The National Lung Screening Trial: overview and study design. Radiology. 2011;258(1):243–53.
- 10. National Lung Screening Trial Research Team, Aberle DR, Adams AM, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med. 2011;365(5):395–409.
- 11. American College of Radiology. Lung CT Screening Reporting and Data System (Lung-RADS). http://www.acr.org/Quality-Safety/Resources/LungRADS. Accessed January 14, 2015.
- 12. Mongkolwat P, Kleper V, Talbot S, et al. The National Cancer Informatics Program (NCIP) Annotation and Image Markup (AIM) Foundation model. J Digit Imaging. 2014;27(6):692–701.
- 13. Noumeir R. Benefits of the DICOM structured report. J Digit Imaging. 2006;19(4):295–306.





