Structured Abstract
Purpose of review
Risk prediction models may be useful for facilitating effective and high-quality decision-making at critical steps in the lung cancer screening process. This review provides a current overview of published lung cancer risk prediction models and their applications to lung cancer screening and highlights both challenges and strategies for improving their predictive performance and use in clinical practice.
Recent findings
Since the 2011 publication of the National Lung Screening Trial results, numerous prediction models have been proposed to estimate the probability of developing or dying from lung cancer or the probability that a pulmonary nodule is malignant. These models appear to exhibit high discriminatory accuracy in identifying individuals at highest risk of lung cancer or in differentiating malignant from benign pulmonary nodules. However, validation and critical comparison of the performance of these models in independent populations are limited. Little is known, moreover, about the extent to which risk prediction models are being applied in clinical practice and are influencing decision-making processes and outcomes related to lung cancer screening.
Summary
Current evidence is insufficient to determine which lung cancer risk prediction models are most clinically useful and how best to implement them to optimize screening effectiveness and quality. To address these knowledge gaps, future research should be directed toward validating and enhancing existing risk prediction models for lung cancer and evaluating the application of model-based risk calculators and their impact on screening processes and outcomes.
Keywords: Risk prediction models, lung cancer, pulmonary nodules, lung cancer screening
INTRODUCTION
Aside from smoking cessation, the only strategy proven to reduce lung cancer mortality is screening with low-dose computed tomography (LDCT). In 2011, National Lung Screening Trial (NLST) investigators reported that annual screening with LDCT versus chest radiography decreased lung cancer mortality by 20% in high-risk individuals, specifically asymptomatic individuals ages 55–74 years with a ≥30 pack-year smoking history who currently smoke or quit within the last 15 years [1]. Results from this trial now support recommendations by the U.S. Preventive Services Task Force (USPSTF) and others to screen individuals of similar age and smoking history for lung cancer [2–5]. However, the NLST revealed potential harms to patients, including frequent false-positive findings (benign pulmonary nodules) and unnecessary invasive follow-up procedures; approximately 20% of those screened per round required diagnostic follow-up, yet only 1% had lung cancer [6].
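The NLST entry criteria described above amount to a simple categorical rule, which can be written out as a minimal sketch. The class and function names below are hypothetical, the treatment of boundary values (for example, exactly 15 years since quitting) is an assumption, and the requirement that individuals be asymptomatic is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SmokingHistory:
    """Hypothetical container for the three inputs used by the criteria."""
    age: int
    pack_years: float
    currently_smokes: bool
    years_since_quit: Optional[float] = None  # None for current smokers


def meets_nlst_entry_criteria(person: SmokingHistory) -> bool:
    """Ages 55-74, >=30 pack-years, current smoker or quit within 15 years.

    Boundary handling (inclusive limits) is an assumption of this sketch.
    """
    if not 55 <= person.age <= 74:
        return False
    if person.pack_years < 30:
        return False
    if person.currently_smokes:
        return True
    return person.years_since_quit is not None and person.years_since_quit <= 15


if __name__ == "__main__":
    print(meets_nlst_entry_criteria(SmokingHistory(60, 35, True)))
    print(meets_nlst_entry_criteria(SmokingHistory(70, 30, False, 20)))
```

A rule of this kind treats everyone who passes as equally eligible, which is the limitation that individualized risk prediction models aim to address.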
Risk prediction models are potentially useful tools for optimizing the effectiveness and quality of lung cancer screening [7]. They enable decision-making to be tailored to an individual’s disease risk, as predicted by clinical, lifestyle, and genetic factors. In fact, the USPSTF recommendation statement [2] emphasizes “the importance of accurately identifying persons who are at highest risk to maximize the benefits and minimize the harms of screening and calls for more research to improve risk assessment tools.” Many new models have emerged recently to estimate the probability of developing or dying from lung cancer or the probability of pulmonary nodule malignancy. This review seeks to evaluate existing risk prediction models for lung cancer, discuss their applications to lung cancer screening, and highlight challenges and strategies to improve their predictive performance and use in clinical practice.
EVALUATING RISK PREDICTION MODELS: KEY CONSIDERATIONS
A clinically useful risk prediction model should be accurate, reliable, and generalizable when applied to intended target populations. A model may not perform as well in populations beyond the one in which it was developed, due to flaws in the study design or methods used in its derivation, or differences in population characteristics and variable measurement [8, 9]. Therefore, external validation of a model’s predictive performance, preferably in large, well-characterized, and diverse populations, is essential [10].
Predictive performance is traditionally evaluated using measures of discrimination and calibration. Discrimination, the ability of a model to accurately classify individuals as cases versus non-cases, is often quantified by the area under the receiver operating characteristic curve (AUC), where a higher AUC indicates better discrimination [11]. Calibration reflects the closeness of model-predicted and observed event probabilities in a population and is commonly summarized as the ratio of the expected (E) to the observed (O) number of cases. An E/O ratio closer to 1.0 indicates better calibration, with values below and above 1.0 signifying underestimation and overestimation of risk, respectively. Fundamentally, a model that is miscalibrated provides biased risk estimates and should not be used, even if it discriminates well. Since discrimination and calibration do not weigh tradeoffs between predicted benefits and harms, decision curve analysis can determine the net benefit of a model across different risk thresholds upon which decisions may be based (e.g., whether to initiate screening) [12]; however, as a prerequisite, a model must exhibit good discrimination and calibration, particularly at critical decision-making thresholds [13, 14].
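The three measures described above can be illustrated with a minimal sketch. The function names and the toy data are hypothetical; the AUC is computed by direct pairwise comparison of cases and non-cases (ties counted as one half), and net benefit uses the usual decision curve definition, true positives per person minus false positives per person weighted by the odds of the risk threshold.

```python
from typing import Sequence


def auc(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a random case has a higher predicted risk than a random non-case."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    noncases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(cases) * len(noncases))


def expected_observed_ratio(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """E/O: sum of predicted risks divided by the number of observed cases."""
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no observed cases")
    return sum(risks) / observed


def net_benefit(risks: Sequence[float], outcomes: Sequence[int], threshold: float) -> float:
    """Net benefit of acting on everyone with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


if __name__ == "__main__":
    # Hypothetical predicted risks and outcomes for six individuals.
    toy_risks = [0.01, 0.02, 0.03, 0.05, 0.10, 0.20]
    toy_outcomes = [0, 0, 0, 1, 0, 1]
    print(auc(toy_risks, toy_outcomes))
    print(expected_observed_ratio(toy_risks, toy_outcomes))
    print(net_benefit(toy_risks, toy_outcomes, threshold=0.04))
```

In this toy example the model ranks individuals well yet predicts far fewer cases than were observed (E/O well below 1.0), which shows why discrimination alone does not establish that risk estimates are trustworthy.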
PREDICTING LUNG CANCER RISK PRIOR TO SCREENING INITIATION
Employing validated risk prediction models to select high-risk individuals for LDCT screening could optimize the balance of benefits and harms by screening fewer individuals and discovering fewer false-positive findings, while also detecting more early-stage lung cancers, compared with current screening criteria. There are over 25 distinct models, most proposed in the last five years, which include varying combinations of risk factors to predict an individual’s probability of developing or dying from lung cancer within a specified period (Table 1). However, empirical evidence to determine which are optimal for selecting individuals likely to benefit from screening remains limited.
TABLE 1.
PREDICTION MODELS FOR ESTIMATING FUTURE PROBABILITY OF DEVELOPING OR DYING FROM LUNG CANCER
| Author, Year | Location | Model Data Sources and Outcomes | Target Population | Predictors | AUC* |
|---|---|---|---|---|---|
| Bach [15], 2003 | US | CARET: 18,172 high-risk smokers. Lung cancer incidence in 1 year | Ever smokers | Age; sex; smoking intensity, duration, quit-years; asbestos exposure | 0.72 |
| Cassidy [17], 2008 | England | Population-based CC (LLP): 579 cases, 1,157 controls. Lung cancer incidence in 5 years | General population | Smoking duration; history of pneumonia; history of cancer; family history of lung cancer; asbestos exposure | 0.70 |
| Etzel [18], 2008 | US | Development: hospital-based CC, 491 cases, 497 controls. Internal validation: hospital-based CC, 89 cases, 67 controls. External validation: 2 CC studies, 172 cases, 153 controls. Lung cancer incidence in 5 years | African Americans | Smoking status, pack-years, quit age; COPD; hay fever; asbestos exposure; wood dust exposure | 0.75; 0.63 |
| El-Zein [34], 2014 | US | Development: hospital-based CC, 527 cases, 468 controls. External validation: hospital-based CC, 239 cases, 272 controls. Lung cancer incidence in 1 year | Current smokers | Smoking pack-years; emphysema; family history of any smoking-related cancer; asbestos exposure; wood dust exposure; cytokinesis-blocked micronucleus assay | 0.925 |
| | | | Former smokers | Smoking quit age; emphysema; family history of cancer; wood dust exposure; cytokinesis-blocked micronucleus assay | 0.910 |
| | | | Never smokers | Environmental tobacco smoke exposure; family history of cancer; cytokinesis-blocked micronucleus assay | 0.918 |
| Hippisley-Cox [27], 2015 | England | Adults registered at 753 general health practices from 1998 to 2013. Development: 4.96 million adults ages 25–84 years. Validation: 1.64 million adults ages 25–84 years. Lung cancer incidence in 10 years | General population – Men | Age; ethnicity; Townsend deprivation score; BMI; smoking status; COPD; asthma; history of specific cancer types; family history of lung cancer; alcohol use | 0.911 |
| | | | General population – Women | Age; ethnicity; Townsend deprivation score; BMI; smoking status; COPD; asthma; history of specific cancer types; family history of lung cancer | 0.905 |
| Hoggart [25], 2012 | Europe | European Prospective Investigation into Cancer and Nutrition cohort: 169,035 smokers. Lung cancer incidence in 1 year | Ever smokers | Age; smoking intensity; age at smoking initiation; smoking duration | 0.843 |
| | | | Current smokers | Age; smoking intensity; age at smoking initiation | 0.824 |
| | | | Former smokers | Age; smoking intensity; age at smoking initiation; smoking duration | 0.830 |
| Katki [23], 2016 | US | Development: 39,180 PLCO control arm smokers ages 55–74 years. Validation: 39,822 PLCO intervention arm smokers ages 55–74 years; 26,554 NLST control arm participants. Lung cancer incidence in 5 years | Ever smokers | Age; sex; education; race/ethnicity; BMI; smoking intensity, duration, pack-years, and quit-years; emphysema; family history of lung cancer | PLCO: 0.80; NLST: 0.70 |
| | | Development: 39,180 PLCO control arm smokers ages 55–74 years. Validation: 39,822 PLCO intervention arm smokers ages 55–74 years; 26,554 NLST control arm participants; 29,091 NHIS smokers ages 50–80 years. Lung cancer death in 5 years | Ever smokers | Age; sex; education; race/ethnicity; BMI; smoking intensity, duration, pack-years, and quit-years; emphysema; family history of lung cancer | PLCO: 0.81; NHIS: 0.78 |
| Kovalchik [22], 2013 | US | Development: 26,554 NLST control arm participants. External validation: 37,763 PLCO intervention arm smokers ages 55–74 years. Lung cancer death in 5 years | Ever smokers | Age; BMI; smoking pack-years and quit-years; emphysema; family history of lung cancer | 0.80 |
| Li [36], 2012 | China | Hospital-based CC: 2,283 cases, 2,785 controls. Lung cancer incidence (undefined time period) | General population (Han Chinese) | Smoking status; 4 candidate SNPs | 0.63 |
| Marcus [26], 2015 | England | LLP cohort: 8,760 adults ages 45–79 years. Lung cancer incidence in 8.7 years | General population | Age; sex; smoking duration; COPD; history of cancer; family history of lung cancer | 0.849 |
| Marcus [37], 2016 | England | Population-based CC (LLP): 718 cases, 1,667 controls. Lung cancer incidence in 5 years | General population | Age; sex; smoking duration; history of pneumonia; history of cancer; family history of lung cancer; asbestos exposure; 3 candidate SNPs | 0.79 |
| Muller [31], 2017 | UK | UK Biobank cohort: 502,321 adults ages 37–70 years. Lung cancer incidence in 2 years | Ever smokers | Sex; smoking intensity and quit age; difficulty of not smoking for one day; emphysema/chronic bronchitis; hay fever, allergic rhinitis or eczema; history of cancer; family history of lung cancer; maximum FEV1 | 0.82 |
| | | | Never smokers | Sex; emphysema/chronic bronchitis; hay fever, allergic rhinitis or eczema; history of cancer; family history of lung cancer; maximum FEV1 | 0.74 |
| Park [29], 2013 | Korea | Development: 1,309,144 men ages 30–80 years who underwent health examinations from 1996 to 1997. External validation: 507,046 men from the Korean National Health Corporation from 1998 to 1999. Lung cancer incidence in 8 years | General population – Men | Age; BMI; smoking status, pack-years, age at smoking initiation; physical activity; fasting glucose level | 0.87 |
| Raji [38], 2010 | England | Population-based CC (LLP): 200 cases, 188 controls. Lung cancer incidence in 5 years | General population | Smoking duration; history of pneumonia; history of cancer; family history of lung cancer; asbestos exposure; rs663048 (SEZ6L SNP) | 0.75 |
| Sin [35], 2013 | Canada | Development (PanCan): 2,485 high-risk smokers. External validation (CARET, current smokers): 61 cases, 121 controls. Lung cancer incidence (undefined time period) | Ever smokers | Age; sex; BMI; smoking intensity and duration; history of pneumonia; history of cancer; family history of cancer; FEV1 %; plasma pro-surfactant protein B | 0.74; 0.68 |
| Spitz [16], 2007 | US | Hospital-based CC: 1,851 cases, 2,001 controls. Lung cancer incidence in 1 year | Current smokers | Smoking pack-years; emphysema; hay fever; family history of any smoking-related cancer; asbestos exposure; wood dust exposure | 0.58 |
| | | | Former smokers | Smoking quit age; emphysema; hay fever; family history of cancer; asbestos exposure; wood dust exposure | 0.63 |
| | | | Never smokers | Environmental tobacco smoke exposure; family history of cancer | 0.57 |
| Spitz [33], 2008 | US | Hospital-based CC: 725 cases, 615 controls. Lung cancer incidence in 1 year | Current smokers | Smoking pack-years; emphysema; hay fever; family history of any smoking-related cancer; asbestos exposure; wood dust exposure; DNA repair capacity; bleomycin sensitivity | 0.73 |
| | | | Former smokers | Smoking quit age; emphysema; hay fever; family history of cancer; wood dust exposure; DNA repair capacity; bleomycin sensitivity | 0.70 |
| Spitz [39], 2013 | US | Development: hospital-based CC, 477 cases, 366 controls. External validation: population-based CC, 330 cases, 342 controls. Lung cancer incidence in 1 year | African Americans | Age; sex; smoking pack-years; emphysema; hay fever; family history of cancer; asbestos exposure; 6 candidate SNPs | 0.67 |
| Tammemagi [32], 2011 | Canada | BCCA chemoprevention trials: 2,596 high-risk smokers. Lung cancer incidence in 8 years | Ever smokers | Age; sex; BMI; smoking status, pack-years, and quit-years; family history of lung cancer; FEV1 %; sputum DNA image cytometry | 0.75 |
| Tammemagi [19], 2011 | US | Development + internal validation: 70,692 PLCO control arm participants. External validation: 44,223 PLCO intervention arm participants. Lung cancer incidence (undefined time period) | General population | Age; education; BMI; smoking status, pack-years, and duration; COPD; family history of lung cancer; recent chest x-ray | 0.857; 0.841 |
| | | | Ever smokers | Age; education; BMI; smoking status, pack-years, duration, and quit-years; COPD; family history of lung cancer; recent chest x-ray | 0.805; 0.784 |
| Tammemagi [21], 2013 | US | Development: 36,286 PLCO control arm smokers ages 55–74 years. External validation: 37,332 PLCO intervention arm smokers ages 55–74 years and 51,033 NLST participants. Lung cancer incidence in 6 years | Ever smokers | Age; race/ethnicity; education; BMI; smoking status, intensity, duration, and quit-years; COPD; history of cancer; family history of lung cancer | PLCO: 0.797; NLST: 0.701 |
| Tammemagi [20], 2014 | US | Development: 77,456 PLCO control arm participants. Validation: 77,445 PLCO intervention arm participants. Lung cancer incidence in 6 years | General population | Age; race/ethnicity; education; BMI; smoking status, intensity, duration, and quit-years; COPD; history of cancer; family history of lung cancer | 0.848 |
| Wang [28], 2015 | China | Hospital-based CC: 705 cases, 988 controls. Lung cancer incidence (undefined time period) | General population (Han Chinese) | Age; sex; education level; BMI; smoking pack-years; COPD; history of pneumonia; pulmonary tuberculosis; family history of cancer; occupational pesticide exposure; heavy exposure to cooking emissions; dietary intake of specific foods (seafood, vegetables, meat, dairy products, soybean products and nuts) | 0.885 |
| Wilson [24], 2015 | US | NLST: 51,577 participants. Pittsburgh Lung Screening Study: 3,654 smokers ages 50–79 years. Lung cancer incidence in 6 years | Ever smokers | Age; smoking status, intensity, and duration | 0.701 |
| Wu [30], 2015 | Taiwan | 395,875 Taiwan residents enrolled in a national health screening program. Lung cancer incidence in 5 and 10 years | General population | Age; sex; BMI; smoking status, intensity, pack-years; history of cancer; family history of lung cancer; maximum mid-expiratory flow; alpha-fetoprotein level; bilirubin level; carcinoembryonic antigen level; C-reactive protein level | 0.851 |
| | | | Heavy smokers (30+ pack-years) | Age; sex; BMI; smoking status and intensity; maximum mid-expiratory flow; carcinoembryonic antigen level | 0.732 |
| | | | Light smokers (<30 pack-years) | Age; sex; smoking status and intensity; family history of lung cancer; alpha-fetoprotein level; carcinoembryonic antigen level | 0.847 |
| | | | Never smokers | Age; sex; BMI; family history of lung cancer; maximum mid-expiratory flow; alpha-fetoprotein level; carcinoembryonic antigen level | 0.806 |
| Young [40, 41], 2009 | New Zealand | Hospital-based CC; only smokers of European ancestry with a 15+ pack-year history. Development: 239 cases, 200 controls. Validation: 207 cases, 248 controls. Lung cancer incidence (undefined time period) | Ever smokers | Age; sex; COPD; family history of lung cancer; 20 candidate SNPs | 0.75 |
*From internal or external validation analyses, except for studies by Raji et al. and Wang et al.
Abbreviations: BCCA, British Columbia Cancer Agency; BMI, body mass index; CARET, Carotene and Retinol Efficacy Trial; CC, case-control; COPD, chronic obstructive pulmonary disease; FEV1, forced expiratory volume in 1 second; LLP, Liverpool Lung Project; NHIS, National Health Interview Survey; PanCan, Pan-Canadian Early Detection of Lung Cancer Study; PLCO, Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial; SNP, single nucleotide polymorphism; US, United States; UK, United Kingdom
Models including conventional risk factors
The earliest prediction models, including the Bach [15], Spitz [16], and Liverpool Lung Project (LLP) [17] models, were built incorporating smoking history, asbestos exposure, and other known risk factors for lung cancer. Among high-risk smokers from the Carotene and Retinol Efficacy Trial (CARET), Bach et al. [15] developed a model to estimate annual lung cancer risk, as a function of age, sex, smoking history, and asbestos exposure (AUC, 0.72). In comparison, Spitz et al. constructed separate models for never, former, and current smokers that estimate one-year lung cancer risk, using data from a hospital-based case-control study matched on age, sex, ethnicity, and smoking status [16]. Although these models included more predictors than the Bach model, internal validation indicated only fair to moderate discrimination (AUCs, 0.57–0.63), presumably due to the loss of information from matching on two strong predictors, age and smoking status. Likewise using hospital-based matched case-control data and incorporating similar predictors to Spitz et al. [16], Etzel et al. [18] proposed the first model to predict five-year lung cancer risk in African Americans, which displayed moderate discrimination in external validation (AUC, 0.63). The LLP model was developed using population-based matched case-control data to estimate five-year lung cancer risk based on smoking duration, asbestos exposure, pneumonia, and both personal and family history of cancer [17]. This model exhibited good discrimination (AUC, 0.70), where a 2.5% risk cutoff corresponded to a sensitivity of 62% and specificity of 70%, although calibration was not measured.
Subsequent risk prediction models with similar variables were derived from U.S. cancer screening cohorts. Tammemagi et al. were the first to construct models for all individuals, ever smokers, and never smokers, using data from Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) participants in the control arm for model development and the intervention (chest radiography) arm for model validation [19–21]. Their model for ever smokers (PLCOM2012) was designed to estimate the probability of lung cancer diagnosis within six years, permitting comparison of its predictive performance in the NLST population [21]. Predictors included age; race/ethnicity; education; body mass index (BMI); chronic obstructive pulmonary disease (COPD); history of cancer; family history of lung cancer; and smoking status, intensity, duration, and quit-years. Discrimination was nearly identical in PLCO control and intervention arm smokers (AUCs, 0.802 and 0.797), yet lower in NLST participants (0.701), likely because the NLST population was more homogeneous with respect to smoking history. Among NLST control arm participants, Kovalchik et al. [22] developed a model to predict five-year risk of lung cancer death, as a function of age, BMI, emphysema, family history of lung cancer, smoking pack-years, and quit-years. When externally validated among PLCO intervention arm smokers, their model showed good discrimination and calibration (AUC, 0.80; E/O, 0.97). Leveraging PLCO, NLST, and National Health Interview Survey (NHIS) data, Katki et al. [23] developed and validated separate risk prediction models for lung cancer incidence (LCRAT, Lung Cancer Risk Assessment Tool) and death (LCDRAT, Lung Cancer Death Risk Assessment Tool) that included similar predictors to PLCOM2012. In external validation, both the model for five-year lung cancer incidence (PLCO intervention: AUC 0.80, E/O 0.94; NLST: AUC 0.70, E/O 1.06) and the model for five-year lung cancer death (PLCO intervention: AUC 0.81, E/O 1.08; NHIS: AUC 0.78, E/O 0.94) performed well. Using NLST and Pittsburgh Lung Screening Study (PLuSS) data, Wilson et al. [24] constructed a much simpler model, the Pittsburgh Predictor, to estimate the probability of lung cancer diagnosis within six years, based on age and smoking status, intensity, and duration. Among PLuSS smokers aged 50–79 years screened by LDCT, this model discriminated reasonably well (AUC, 0.70) relative to the Bach and PLCOM2012 models (0.71 and 0.72), despite its simplicity.
Cumulative evidence suggests that, by using these risk prediction models, a personalized risk-based approach is more effective than current age and smoking history criteria in selecting individuals for LDCT screening. Fundamentally, these models are able to better identify high-risk smokers who benefit from screening than current screening criteria, thereby potentially resulting in reduced patient burden, clinical workload, and financial costs. Among PLCO intervention arm smokers, applying a PLCOM2012 risk threshold of ≥1.51%, compared to USPSTF screening criteria, identified 8.8% fewer individuals for screening, yet detected 12.4% more lung cancers and fewer false positive results [20]. Comparing NLST participants who underwent LDCT versus chest radiography by lung cancer risk quintiles, Kovalchik et al. determined that 99% of lung cancer deaths prevented by LDCT occurred among the 80% at highest predicted risk, suggesting that the bottom 20% received little benefit from screening [22]. Similarly among NLST participants who received LDCT, Wilson et al. reported that 93% of lung cancers were diagnosed among the 80% at highest risk (>1.7%) according to the Pittsburgh Predictor [24]. Furthermore, Katki et al. estimated that screening smokers at highest five-year risk (≥1.9% by LCRAT), compared to smokers meeting USPSTF criteria, might prevent 20% more lung cancer deaths, as well as reduce the number needed to screen to prevent one lung cancer death by 17% and the number of false-positives per prevented lung cancer death by 13% [23].
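The comparisons above share one logic: apply two selection rules to the same population and count how many people each rule selects and how many lung cancers fall among them. A minimal sketch follows. The cohort, the field names, and the predicted risks are entirely hypothetical and are not outputs of PLCOM2012 or any other published model; the categorical rule uses the NLST-style age and smoking criteria, and 1.51% is the PLCOM2012 threshold quoted above.

```python
from typing import Callable, Dict, List

Person = Dict[str, float]  # hypothetical record: age, pack_years, quit_years, risk, cancer


def selection_summary(people: List[Person], rule: Callable[[Person], bool]) -> Dict[str, float]:
    """Count who a rule selects and how many eventual cancers are among them."""
    chosen = [p for p in people if rule(p)]
    total_cases = sum(p["cancer"] for p in people)
    detected = sum(p["cancer"] for p in chosen)
    return {
        "n_selected": len(chosen),
        "cases_among_selected": detected,
        "selected_without_cancer": len(chosen) - detected,
        "sensitivity": detected / total_cases if total_cases else float("nan"),
    }


def criteria_rule(p: Person) -> bool:
    """Categorical age and smoking-history rule (quit_years is 0 for current smokers)."""
    return 55 <= p["age"] <= 74 and p["pack_years"] >= 30 and p["quit_years"] <= 15


def risk_rule(p: Person, threshold: float = 0.0151) -> bool:
    """Select anyone whose model-predicted risk meets the threshold."""
    return p["risk"] >= threshold


# Hypothetical five-person cohort, for illustration only.
cohort: List[Person] = [
    {"age": 60, "pack_years": 40, "quit_years": 0, "risk": 0.030, "cancer": 1},
    {"age": 58, "pack_years": 31, "quit_years": 14, "risk": 0.009, "cancer": 0},
    {"age": 72, "pack_years": 25, "quit_years": 0, "risk": 0.028, "cancer": 1},
    {"age": 56, "pack_years": 30, "quit_years": 10, "risk": 0.008, "cancer": 0},
    {"age": 66, "pack_years": 45, "quit_years": 5, "risk": 0.022, "cancer": 0},
]

if __name__ == "__main__":
    print("criteria:", selection_summary(cohort, criteria_rule))
    print("risk:    ", selection_summary(cohort, risk_rule))
```

In this contrived cohort the risk rule selects fewer people and captures more cases, which mirrors the direction of the published comparisons but says nothing about their magnitude.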
Prediction models have also recently been derived from unscreened European and Asian populations. In the European Prospective Investigation into Cancer and Nutrition cohort, Hoggart et al. [25] developed and internally validated separate models to predict one-year lung cancer risk for former, current, and ever smokers (AUCs, 0.82–0.84), considering age and smoking history. Based on longitudinal data on LLP participants, Marcus et al. [26] constructed the LLPi model to include similar predictors to the original LLP model; it displayed good calibration and excellent discrimination (AUC, 0.85). Leveraging routinely collected data from English general health practices on >6.5 million adults, Hippisley-Cox and Coupland [27] built highly discriminatory and well-calibrated sex-specific models that predict ten-year lung cancer risk based on age, race/ethnicity, BMI, Townsend deprivation score, smoking status and intensity, COPD, asthma, history of cancer, family history of lung cancer, asbestos exposure, and alcohol use (AUCs >0.90). By contrast, Wang et al. [28] constructed a model tailored to the Han Chinese population using hospital-based case-control data, which incorporated conventional predictors along with pulmonary tuberculosis, occupational pesticide exposure, heavy exposure to cooking emissions, and intake of specific foods. Their model appeared to discriminate well (AUC, 0.88); however, internal validation was not performed. Of note, the latter three models were derived from populations including never smokers, which partly explains their relatively high predictive accuracy (AUCs ≥0.85).
Models including clinical, molecular, and genetic markers
Other models have included measures requiring clinical assessment. Among >1.3 million Korean men, Park et al. [29] developed and externally validated a model that exhibited excellent discrimination (AUC, 0.87) in predicting eight-year lung cancer risk, based on age; BMI; smoking status, intensity, and age at initiation; physical activity; and fasting glucose. Wu et al. [30] developed and internally validated models for never, light, and heavy smokers to predict five- and ten-year probabilities of lung cancer diagnosis using comprehensive health screening data on Taiwanese residents. Their general population model, which incorporated traditional risk factors along with a lung function measurement and four serum biomarkers, exhibited adequate calibration and high discrimination (AUC, 0.85). Additional models have likewise demonstrated the added value of lung function measures in predicting risk among high-risk and never smokers [31, 32].
Increasingly, molecular and genetic markers have been incorporated to enhance risk prediction, yet observed gains have been mostly incremental. Examining a subset of cases and controls from their original study, Spitz et al. found their models for former and current smokers performed better when adding two host DNA repair capacity markers (AUCs, 0.70 and 0.73, as listed in Table 1) [33]. El-Zein et al. [34] likewise extended the original smoking-stratified Spitz models by adding a cytokinesis-blocked micronucleus assay endpoint, which substantially improved prediction in a small external validation sample (AUC, from 0.61 to 0.92). Due to the case-control design, however, it is not entirely clear whether these markers reflect underlying causes or effects of lung cancer. In a cohort of high-risk smokers, Sin et al. [35] found that adding plasma pro-surfactant protein B improved prediction beyond a base model including age, sex, BMI, pneumonia, personal and family history of cancer, forced expiratory volume in one second % predicted, and smoking intensity and duration (AUC, 0.74 vs. 0.67). When externally validated using CARET data, however, discrimination was modest (AUC, 0.68). Similarly, others have extended existing models or developed new models to incorporate candidate single nucleotide polymorphisms (SNPs) associated with lung cancer [25, 36–41]. Improvements in risk prediction were consistently modest, although at most 20 SNPs were incorporated in a single model.
External validation of models predicting lung cancer risk
Unfortunately, relative to developing new risk prediction models, less effort has been devoted toward validating their performance, especially calibration, in independent populations. The few studies conducted so far (and not discussed above) have largely assessed the external validity of the Bach, Spitz, LLP, and PLCOM2012 models in populations of ever smokers. Among male smokers from the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study, Cronin et al. [42] discovered that the Bach model underestimated ten-year lung cancer risk by 11%; yet, they may have observed more cancers than expected because participants underwent periodic radiographic screening. Comparing models head-to-head, D’Amelio et al. [43] found the Spitz and LLP models (AUCs, 0.69) exhibited slightly better discrimination in predicting five-year lung cancer risk than the Bach model (AUC, 0.66), although calibration was not examined. The LLP model also exhibited modest to good discrimination (AUCs, 0.67–0.82) when assessed alone in several European and U.S. study populations [44]. In more recent and extensive external validation studies, however, the PLCOM2012 model demonstrated the best performance, with respect to discrimination, calibration, sensitivity, and specificity, although not exceedingly better than the Bach model [45, 46]. In support, a study of >95,000 Australian smokers aged ≥45 years also found that the PLCOM2012 model displayed good calibration and discrimination (AUC, 0.80), and that its performance was largely driven by the main predictors of the Bach model, age and smoking history [47].
PREDICTING NODULE MALIGNANCY AT INITIAL DETECTION
For lung cancer screening to be effective, prompt and accurate evaluation of newly detected pulmonary nodules is critical. Employing risk prediction models that accurately distinguish malignant from benign nodules could optimize nodule management by promoting early diagnosis and treatment of cancer, while limiting harms and costs associated with unnecessary diagnostic workup. In fact, pulmonary nodule management guidelines recommend that, before ordering follow-up imaging or biopsy, clinicians estimate the pre-test probability of malignancy using intuition or validated prediction models [48]. Existing models have originated from different populations with respect to clinical setting, patient selection, and prevalence of malignancy, with varying combinations of patient and radiologic characteristics as predictors (Table 2).
TABLE 2.
PREDICTION MODELS FOR ESTIMATING PRE-TEST PROBABILITY OF PULMONARY NODULE MALIGNANCY AT INITIAL DETECTION OR SURGICAL EVALUATION
Author, Year | Location | Data Source(s) for Model Development and Validation | Malignancy Prevalence | Predictors | AUC* |
---|---|---|---|---|---|
Dong [57], 2014 | China | 3,358 patients with SPN who underwent surgical resection from 2005 to 2013 | 77% | Age; smoking status; family history of cancer; nodule diameter, spiculation, border, calcification, lobulation; presence of satellite lesions; serum CEA level; serum CYFRA-21 level | 0.92 |
Deppen [74], 2014 | US | Development: 492 VUMC patients evaluated for suspicious lung nodule/mass; External validation: 226 VA patients who underwent lung cancer surgery | VUMC: 72%; VA: 93% | Age; sex; smoking pack-years; history of cancer; body mass index; nodule diameter, spiculation, location, growth; FDG-PET avidity; predicted FEV1; hemoptysis | VUMC: 0.87; VA: 0.89 |
Gould [52], 2007 | US | 375 VA patients with 7–30 mm SPN found by chest radiography | 54% | Age; smoking status and quit-years; nodule diameter | 0.78 |
Herder [53], 2005 | Netherlands | 106 VU University Medical Center (Amsterdam) patients with indeterminate ≤30 mm SPN referred for FDG-PET | 57% | Age; smoking status; history of extrathoracic cancer; nodule diameter, spiculation, location; FDG-PET avidity | 0.88 |
Jin [60], 2017 | China | 293 patients with solitary peripheral subsolid nodule who underwent surgical resection | 58% | Nodule diameter, spiculation, solidity; CT attenuation; vascular convergence; pleural tag | 0.89 |
Li [55], 2011 | China | Development: 371 patients with pathologically diagnosed SPN from 2000 to 2009; Validation: 62 patients with pathologically diagnosed SPN from 2009 to 2010 | 53% | Age; family history of cancer; nodule diameter, spiculation, border, calcification | 0.89 |
McWilliams [62], 2013 | Canada | Development: 1,871 high-risk smokers with nodules from the PanCan Study; External validation: 1,090 high-risk smokers with nodules from BCCA chemoprevention trials | PanCan: 5.5%; BCCA: 3.7% | Age; sex; family history of lung cancer; emphysema; nodule diameter, spiculation, location, type, count | >0.90 |
Mehta [54], 2014 | US | 221 Medical University of South Carolina patients with small (3–15 mm) pulmonary nodules found by chest CT | 37% | Model 1: Age; smoking status; history of cancer; nodule diameter, spiculation, location, volume | 0.786 |
Mehta [54], 2014 (continued) |  |  |  | Model 2: Age; smoking status; history of cancer; nodule diameter, spiculation, location, V:D ratio | 0.784 |
Mehta [54], 2014 (continued) |  |  |  | Model 3: Age; smoking status; history of cancer; nodule diameter, spiculation, location, sphericity index | 0.780 |
Swensen [51], 1997 | US | 629 Mayo Clinic patients with indeterminate 4–30 mm SPN found by chest radiography | 23% | Age; smoking status; history of extrathoracic cancer; nodule diameter, spiculation, location | 0.80 |
Yang [56], 2017 | China | Development: 1,078 patients with SPNs who had CT-guided needle biopsy from 2011 to 2015; Validation: 344 patients with SPNs who had CT-guided needle biopsy from 2015 to 2016 | 67% | Age; sex; smoking pack-years; history of cancer; nodule size, spiculation, lobulation | 0.78 |
Yonemori [58], 2007 | Japan | Development: 452 patients with CT-detected SPN who underwent surgical resection from 1998 to 2004; Validation: 148 patients with CT-detected SPN who underwent surgical resection from 2004 to 2005 | 79% | Nodule spiculation, calcification; presence of CT bronchus sign; serum CEA level; serum C-reactive protein level | 0.84 |
Zhang [59], 2015 | China | Development: 294 patients with pathologically diagnosed SPN from 2005 to 2011; Validation: 120 patients with pathologically diagnosed SPN from 2012 to 2014 | 60% | Age; smoking status; nodule diameter, spiculation, border; serum CYFRA-21 level | 0.91 |
Zheng [61], 2015 | China | 846 patients with newly detected SPN by chest CT who were referred to Fujian Medical University Union Hospital | 63% | Model for SPN with <50% GGO: Age; presence of symptoms; nodule diameter, lobulation; calcified nodes; serum total protein | 0.808 |
Zheng [61], 2015 (continued) |  |  |  | Model for SPN with ≥50% GGO: Sex; FEV1 %; nodule diameter; calcified nodes | 0.845 |
*From internal or external validation analyses
Abbreviations: BCCA, British Columbia Cancer Agency; CC, case-control; CEA, carcinoembryonic antigen; CYFRA, cytokeratin fragment; FDG-PET, 18F-fluorodeoxyglucose positron emission tomography; FEV1, forced expiratory volume in 1 second; GGO, ground-glass opacity; PanCan, Pan-Canadian Early Detection of Lung Cancer Study; SPN, solitary pulmonary nodule; US, United States; V:D, volume:diameter; VA, Veterans Affairs; VUMC, Vanderbilt University Medical Center
Since these guidelines were last published, the American College of Radiology introduced the Lung CT Screening Reporting and Data System (Lung-RADS), which is now widely used to standardize interpretation and management of LDCT screening results. Lung-RADS was developed in response to the high rate of false-positive findings in the NLST, and its ability to reduce the false-positive rate, albeit with a decrease in sensitivity, has been demonstrated retrospectively using NLST data [49]. Under Lung-RADS, LDCT results are classified into four major categories, and a scan is defined as negative (categories 1 or 2) or positive (categories 3 or 4A/B/X) based exclusively on characteristics of any pulmonary nodule(s) detected [50]. This classification reflects the likelihood of nodule malignancy but involves no calculation of the pre-test probability of malignancy.
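The negative/positive rule described above can be written as a small lookup. This is only a sketch of the category-to-result mapping stated in the text; the function name is hypothetical, and the nodule-level criteria that assign a category in the first place are not reproduced here.

```python
# Lung-RADS categories as named in the text:
# 1 and 2 are negative; 3, 4A, 4B, and 4X are positive.
NEGATIVE_CATEGORIES = {"1", "2"}
POSITIVE_CATEGORIES = {"3", "4A", "4B", "4X"}


def scan_is_positive(category: str) -> bool:
    """Return True if a Lung-RADS category counts as a positive screen."""
    label = category.strip().upper()
    if label in POSITIVE_CATEGORIES:
        return True
    if label in NEGATIVE_CATEGORIES:
        return False
    raise ValueError(f"Unrecognized Lung-RADS category: {category!r}")
```

Note that the function returns only a binary result, which mirrors the point made in the text: the category itself carries no calculated probability of malignancy.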
The earliest risk prediction models were developed by Swensen et al. [51] and Gould et al. [52] to estimate the pre-test probability of malignancy for solitary pulmonary nodules (SPN) detected by chest radiography. Using data on Mayo Clinic patients, Swensen et al. [51] built a highly discriminatory and well-calibrated model including age, smoking status, history of extrathoracic cancer, plus nodule diameter, spiculation, and location (AUC, 0.80). In their cohort, patients with any history of lung cancer, or with extrathoracic cancer in the prior five years, were excluded; 12% of patients lacked a final diagnosis; and the prevalence of malignancy was low (23%). Despite these limitations, Herder et al. [53] found this model had external validity (AUC, 0.79) in a smaller cohort with a higher prevalence of malignancy (57%); however, they also reported that it underestimated the actual probability of malignancy at lower values of the predicted probability, and that adding 18F-fluorodeoxyglucose positron emission tomography avidity enhanced prediction (AUC, 0.88). Mehta et al. [54] similarly demonstrated the added value of nodule volume to the Mayo Clinic model in predicting malignancy of small nodules. In comparison, Gould et al. [52] used a geographically diverse sample of Veterans Affairs (VA) patients with a relatively high prevalence of malignancy (54%) to construct a more parsimonious model including age, smoking status, time since quitting smoking, and nodule diameter (AUC, 0.78). Although no exclusion on history of cancer was imposed, their cohort was limited to patients with ≥7 mm SPNs, who were mostly older male smokers.
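Models of this kind are typically logistic regressions that convert a weighted sum of predictors into a probability. The sketch below illustrates only that general form. The intercept and coefficients are hypothetical placeholders, not the published estimates from the Mayo Clinic, VA, or any other model, and the predictor names are assumptions chosen to resemble those listed in the table.

```python
import math

# HYPOTHETICAL values for illustration only. They are NOT the published
# estimates from any model discussed in this review.
INTERCEPT = -6.0
COEFFICIENTS = {
    "age_years": 0.04,
    "ever_smoker": 0.8,      # 1 = current or former smoker, 0 = never
    "prior_cancer": 1.3,     # 1 = history of extrathoracic cancer
    "diameter_mm": 0.13,
    "spiculation": 1.0,      # 1 = spiculated margin
    "upper_lobe": 0.8,       # 1 = nodule located in an upper lobe
}


def malignancy_probability(predictors: dict) -> float:
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(beta_i * x_i))))."""
    linear = INTERCEPT + sum(
        COEFFICIENTS[name] * float(predictors[name]) for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Made-up patient used only to show the call.
patient = {"age_years": 65, "ever_smoker": 1, "prior_cancer": 0,
           "diameter_mm": 12, "spiculation": 1, "upper_lobe": 1}
print(round(malignancy_probability(patient), 3))
```

Anyone applying a real model should take the intercept and coefficients directly from the original publication, since the output is only as valid as those estimates and the population they were derived from.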
More recent prediction models were developed from cohorts of patients who underwent surgical resection or biopsy for SPNs, primarily from China, with a high prevalence of malignancy (53–79%) [55–61]. Among Peking University People’s Hospital (PKUPH) patients with SPNs and no history of cancer in the prior five years, Li et al. [55] developed a model including age, family history of cancer, plus nodule diameter, spiculation, border, and calcification (AUC, 0.89), which outperformed both Mayo Clinic (0.75) and VA (0.71) models. Several models have also incorporated blood-based biomarkers, such as serum carcinoembryonic antigen and cytokeratin fragment 21-1, and appear to discriminate better than the Mayo Clinic, VA, and PKUPH models [57–59]. Other models have been tailored to estimate malignancy risk for certain nodule types, such as peripheral subsolid nodules [60, 61].
The only models to estimate malignancy risk for nodules detected by LDCT screening and specifically among smokers have been proposed by McWilliams et al. [62]. Using two trial-based cohorts of high-risk smokers, with a much lower prevalence of malignancy (<6%), they developed and externally validated full and parsimonious models including age, sex, family history of lung cancer, emphysema, and nodule diameter, spiculation, location, type, and count. These models displayed excellent calibration and discrimination (AUC >0.90), even for smaller nodules (1–10 mm) and without accounting for smoking history. Using Danish Lung Cancer Screening Trial data, the McWilliams full model has been shown to outperform Lung-RADS in discriminating malignant from benign nodules detected at baseline [63].
External validation of models predicting nodule malignancy risk
Independent external validation studies have primarily assessed the performance of the Mayo Clinic, VA, and McWilliams models, generally yielding lower AUCs than originally reported. In the earliest studies, the Mayo Clinic model was examined alone or in comparison to other models in small, unscreened populations, nearly all consisting of patients undergoing resection for lung nodules (malignancy prevalence: 44–84%) [64–68]. The Mayo Clinic model exhibited good discrimination in all studies except one [67] (AUCs, 0.67–0.80) and performed better than the VA model (0.68–0.73) [65, 66, 68], yet similar to the PKUPH model (0.81) [68]. However, calibration results, when reported, suggested that the Mayo Clinic model underestimated while the VA model overestimated the actual probability of malignancy [64, 66]. Recent studies have examined the McWilliams full model, either alone in high-risk LDCT-screened populations (malignancy prevalence: 3–9%) [69–71] or compared to other models in unscreened populations (malignancy prevalence: 41–46%) [72, 73]. Overall, the McWilliams model exhibited relatively high accuracy in discriminating malignant from benign nodules (AUCs, 0.82–0.96) [69–73], particularly among NLST participants. In head-to-head comparisons, the Herder model displayed higher accuracy than the McWilliams, Mayo Clinic, and VA models [72], although for small nodules (which are more commonly detected by LDCT), the accuracy of the McWilliams model exceeded that of the Mayo Clinic and VA models [72, 73]. Of these more recent studies [69–73], the only one to assess calibration found that, while the Mayo Clinic, VA, and McWilliams models all underestimated the actual probability of malignancy, the McWilliams model displayed the greatest ability to predict accurately across the widest range of probabilities [73].
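The two performance measures referred to throughout this section, discrimination (AUC) and calibration, can be computed from a validation cohort's predicted probabilities and confirmed diagnoses. The following is a minimal sketch on invented toy data. The observed-to-expected ratio is used here as one simple calibration summary, which is an assumption and not necessarily the measure reported in the cited studies.

```python
def auc(probabilities, outcomes):
    """AUC as the share of malignant/benign pairs ranked correctly.

    Tied predictions count as half a correct pair.
    """
    cases = [p for p, y in zip(probabilities, outcomes) if y == 1]
    controls = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("Need at least one malignant and one benign nodule")
    score = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                score += 1.0
            elif p_case == p_control:
                score += 0.5
    return score / (len(cases) * len(controls))


def observed_expected_ratio(probabilities, outcomes):
    """Observed malignancies divided by the sum of predicted probabilities.

    A ratio above 1 means the model underestimates risk on average;
    a ratio below 1 means it overestimates risk.
    """
    return sum(outcomes) / sum(probabilities)


# Toy data, invented for illustration (1 = malignant, 0 = benign).
predicted = [0.10, 0.20, 0.30, 0.40, 0.60, 0.80]
malignant = [0, 0, 1, 0, 1, 1]

print(auc(predicted, malignant))
print(observed_expected_ratio(predicted, malignant))
```

A model can rank nodules well (high AUC) and still be systematically too low or too high in its absolute predictions, which is why the validation studies above report the two properties separately.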
PREDICTING NODULE MALIGNANCY RISK AT SURGICAL EVALUATION
Patients with a suspicious lung nodule are commonly referred to a surgeon for further assessment. In the NLST, 24% of surgical procedures were performed on lung nodules that were ultimately benign [1]. Employing risk prediction models that accurately estimate malignancy risk at referral could optimize decision-making and outcomes by minimizing unnecessary surgeries and missed opportunities for early diagnosis.
To date, Deppen et al. [74] have proposed the only model tailored to estimate risk at preoperative evaluation, although the aforementioned models developed in surgical populations might be applied in this context as well. Their model was developed and internally validated in Vanderbilt University Medical Center (VUMC) patients evaluated for a suspicious lung nodule/mass and externally validated in VA patients who underwent lung cancer surgery, outperforming the Mayo Clinic model on calibration and discrimination (AUCs, VUMC: 0.87 vs. 0.80; VA: 0.89 vs. 0.73). Whether this model performs similarly in other surgical populations, particularly patients with screening-detected nodules, and what risk threshold is optimal for guiding treatment decisions have yet to be determined.
EVALUATING AND IMPROVING RISK PREDICTION MODELS
To employ the best risk assessment tools in lung cancer screening, further research is needed to identify which lung cancer risk prediction models demonstrate the highest predictive performance when applied to screening-eligible or screened populations, preferably in community-based practice settings. This includes establishing model-based lung cancer risk thresholds to support optimal decision-making. Although many models appear to have similar discriminatory ability, few studies have externally validated and compared the performance of multiple models, particularly those incorporating biomarker or clinical assessment measures, within the same population. In particular, model calibration requires validation in population-representative cohorts, yet the existing risk prediction models that are most applicable to the screening context (e.g., Bach, PLCOM2012, LCRAT/LCDRAT, and McWilliams models) have originated from trial-based cohorts. Examination of the extent to which models perform differently within subgroups by age, sex, and race/ethnicity has also been limited. Furthermore, model-selected populations for screening have yet to be compared, in terms of how many smokers are selected for screening by each model and how much agreement exists between models on who is selected. The challenges include identifying suitable populations with well-measured data on all model predictors, calculating risk scores and performance measures for multiple models, and interpreting results across different studies. These challenges, however, may be overcome by analyzing data from the UK Biobank and other unique cohorts, including electronic health record-based cohorts; utilizing R-based packages developed to externally validate lung cancer risk prediction models [75, 76]; and reporting study results following TRIPOD guidelines [77].
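The comparison of model-selected populations described above could be approached along the following lines. This is a sketch under stated assumptions: the two models, the risk values, and the 1.5% threshold are all hypothetical, and Cohen's kappa is used as one possible agreement measure and is not one prescribed by the cited literature.

```python
def select(risks, threshold):
    """Return the set of person IDs whose predicted risk meets the threshold."""
    return {pid for pid, risk in risks.items() if risk >= threshold}


def cohen_kappa(selected_a, selected_b, everyone):
    """Chance-corrected agreement between two models' screening decisions."""
    n = len(everyone)
    if n == 0:
        raise ValueError("Population is empty")
    both = len(selected_a & selected_b)
    neither = len(everyone - (selected_a | selected_b))
    observed = (both + neither) / n
    p_a, p_b = len(selected_a) / n, len(selected_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both models make the same decision for everyone
    return (observed - expected) / (1 - expected)


# Hypothetical predicted risks from two hypothetical models for six smokers.
model_a = {"p1": 0.030, "p2": 0.020, "p3": 0.010,
           "p4": 0.005, "p5": 0.018, "p6": 0.002}
model_b = {"p1": 0.025, "p2": 0.012, "p3": 0.016,
           "p4": 0.004, "p5": 0.021, "p6": 0.001}
THRESHOLD = 0.015  # hypothetical risk threshold

chosen_a = select(model_a, THRESHOLD)
chosen_b = select(model_b, THRESHOLD)
print(len(chosen_a), len(chosen_b), sorted(chosen_a & chosen_b))
print(cohen_kappa(chosen_a, chosen_b, set(model_a)))
```

In this toy example both models select the same number of people but agree on only some of them, which is the kind of discrepancy that counts alone would hide.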
Continued efforts are also warranted to evaluate the added value of novel predictors to existing lung cancer risk prediction models. Improving risk prediction, even if marginal, is valuable, given the potential harms associated with screening. Although adding a polygenic risk score constructed from selected lung cancer-related SNPs to existing models has resulted in minimal improvement, better discrimination may be achieved by adding a polygenic risk score constructed from a much greater number of top-ranking SNPs that explain a larger proportion of variance [78] and accounting for genetic pathway effects [79]. Common respiratory conditions, including COPD, hay fever, asthma, and pneumonia, have been incorporated in nearly all models estimating risk of developing or dying from lung cancer. For those models, however, such conditions have not always been considered in combination, and other predisposing conditions, including hypertension and rheumatoid arthritis [80, 81], have not been examined as predictors. For models estimating nodule malignancy risk, only emphysema has been included; therefore, prediction may be enhanced by incorporating other conditions, along with new radiologic features and emerging biomarkers [82]. Nevertheless, risk prediction models requiring biomarker or clinical assessment may have limited utility in practice if such data are not readily available or recently measured. Also, modest gains in discrimination from incorporating such information may not justify the added costs.
APPLYING RISK PREDICTION MODELS TO LDCT SCREENING
With the recent implementation of LDCT screening, little is known about the use of risk prediction models and its impact on screening processes and outcomes. For coverage of lung cancer screening, the Centers for Medicare and Medicaid Services has mandated that a face-to-face shared decision-making (SDM) visit occur before the initial scan, representing a prime opportunity to use decision aids that calculate future lung cancer risk. Although no single model can be clearly recommended, selecting one of the more well-validated models to implement is reasonable, given their ability to better identify high-risk smokers than current screening criteria. User-friendly web-based risk calculators for several models are accessible [83–85], which National Comprehensive Cancer Network (NCCN) guidelines acknowledge may assist with SDM [86]. Current NCCN guidelines also endorse LDCT screening for individuals who do not meet NLST eligibility criteria, but have calculated six-year risks >1.3% using the PLCOM2012 model [86]. For diagnostic management of Lung-RADS category 4B findings, consideration of the probability of malignancy as predicted by the McWilliams model, along with patient evaluation and preference, is recommended [50]. Implementation and influence of risk calculators in these decision-making processes, however, depend on how well they are integrated into clinical workflows and facilitate risk communication between patient and provider. Other practical considerations include the ease of measuring and obtaining data on model predictors, particularly for estimating nodule malignancy risk, since key radiologic findings are not typically dictated in any structured format. This challenge could be addressed by creating automated tools that enable radiologists to extract and input relevant data to calculate nodule malignancy risk when images are read [69].
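The two routes to screening described above, NLST-type criteria or a model-based risk above 1.3%, can be combined in a simple decision rule. The sketch below assumes the six-year risk has already been computed elsewhere (for example, with a PLCOM2012 calculator) and is passed in as a number; the PLCOM2012 equation itself is not implemented, and the function and parameter names are hypothetical.

```python
def meets_nlst_criteria(age, pack_years, currently_smokes, years_since_quit=None):
    """NLST entry criteria as summarized in this review: ages 55-74, at least
    30 pack-years, and current smoking or quitting within the last 15 years."""
    recent_quitter = years_since_quit is not None and years_since_quit <= 15
    return (55 <= age <= 74
            and pack_years >= 30
            and (currently_smokes or recent_quitter))


def screening_candidate(age, pack_years, currently_smokes, years_since_quit=None,
                        six_year_risk=None, risk_threshold=0.013):
    """Return True if a person qualifies by NLST criteria or by model-based risk.

    six_year_risk is assumed to come from an external risk calculator and is
    expressed as a proportion (0.013 = 1.3%).
    """
    if meets_nlst_criteria(age, pack_years, currently_smokes, years_since_quit):
        return True
    return six_year_risk is not None and six_year_risk > risk_threshold


# Made-up example: too young and too few pack-years for NLST criteria,
# but with a hypothetical model-based six-year risk of 2%.
print(screening_candidate(52, 25, True, six_year_risk=0.02))
```

A rule like this only flags candidates for discussion; the shared decision-making visit described in the text remains where the actual screening decision is made.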
CONCLUSIONS
Many recently proposed risk prediction models show great promise in meeting two critical needs for implementing effective and cost-efficient lung cancer screening: accurate identification of individuals who will truly benefit from being screened and accurate discrimination of malignant from benign lung nodules detected by LDCT. However, current evidence is insufficient to determine which risk prediction models for lung cancer are most clinically useful and how to best implement their use to optimize screening effectiveness and quality. Future research should focus on externally validating and improving existing risk prediction models and evaluating the application of model-based risk calculators and its impact on screening processes and outcomes.
Acknowledgments
This work was supported in part by a career development award to Dr. Sakoda (K07 CA188142).
Footnotes
Human and Animal Rights:
All reported studies/experiments with human or animal subjects performed by the authors have been previously published and complied with all applicable ethical standards (including the Helsinki declaration and its amendments, institutional/national research committee standards, and international/national/institutional guidelines).
Reference List
Important (•) or very important (••) references within the past three years
- 1. Aberle DR, Adams AM, Berg CD, Black WC, Clapp JD, Fagerstrom RM, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med. 2011;365(5):395–409. doi: 10.1056/NEJMoa1102873.
- 2••. Moyer VA. Screening for lung cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2014;160(5):330–8. doi: 10.7326/m13-2771. Describes the U.S. Preventive Services Task Force recommendation to screen high-risk adults annually for lung cancer with low-dose computed tomography.
- 3. Jacobson FL, Austin JH, Field JK, Jett JR, Keshavjee S, MacMahon H, et al. Development of The American Association for Thoracic Surgery guidelines for low-dose computed tomography scans to screen for lung cancer in North America: recommendations of The American Association for Thoracic Surgery Task Force for Lung Cancer Screening and Surveillance. J Thorac Cardiovasc Surg. 2012;144(1):25–32. doi: 10.1016/j.jtcvs.2012.05.059.
- 4. Wender R, Fontham ET, Barrera E, Jr, Colditz GA, Church TR, Ettinger DS, et al. American Cancer Society lung cancer screening guidelines. CA Cancer J Clin. 2013;63(2):107–17. doi: 10.3322/caac.21172.
- 5. Wood DE, Eapen GA, Ettinger DS, Hou L, Jackman D, Kazerooni E, et al. Lung cancer screening. J Natl Compr Cancer Netw. 2012;10(2):240–65. doi: 10.6004/jnccn.2012.0022.
- 6. Bach PB, Mirkin JN, Oliver TK, Azzoli CG, Berry DA, Brawley OW, et al. Benefits and harms of CT screening for lung cancer: a systematic review. JAMA. 2012;307(22):2418–29. doi: 10.1001/jama.2012.5521.
- 7. Marcus PM, Pashayan N, Church TR, Doria-Rose VP, Gould MK, Hubbard RA, et al. Population-Based Precision Cancer Screening: A Symposium on Evidence, Epidemiology, and Next Steps. Cancer Epidemiol Biomarkers Prev. 2016;25(11):1449–55. doi: 10.1158/1055-9965.epi-16-0555.
- 8. Altman DG, Vergouwe Y, Royston P, Moons KG. Prognosis and prognostic research: validating a prognostic model. BMJ (Clinical research ed). 2009;338:b605. doi: 10.1136/bmj.b605.
- 9. Justice AC, Covinsky KE, Berlin JA. Assessing the generalizability of prognostic information. Ann Intern Med. 1999;130(6):515–24. doi: 10.7326/0003-4819-130-6-199903160-00016.
- 10. Bleeker SE, Moll HA, Steyerberg EW, Donders AR, Derksen-Lubsen G, Grobbee DE, et al. External validation is necessary in prediction research: a clinical example. J Clin Epidemiol. 2003;56(9):826–32. doi: 10.1016/s0895-4356(03)00207-5.
- 11. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29–36. doi: 10.1148/radiology.143.1.7063747.
- 12. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26(6):565–74. doi: 10.1177/0272989x06295361.
- 13. Tammemagi MC. Application of risk prediction models to lung cancer screening: a review. J Thorac Imaging. 2015;30(2):88–100. doi: 10.1097/rti.0000000000000142.
- 14. Van Calster B, Vickers AJ. Calibration of risk prediction models: impact on decision-analytic performance. Med Decis Making. 2015;35(2):162–9. doi: 10.1177/0272989x14547233.
- 15. Bach PB, Kattan MW, Thornquist MD, Kris MG, Tate RC, Barnett MJ, et al. Variations in lung cancer risk among smokers. J Natl Cancer Inst. 2003;95(6):470–8. doi: 10.1093/jnci/95.6.470.
- 16. Spitz MR, Hong WK, Amos CI, Wu X, Schabath MB, Dong Q, et al. A risk model for prediction of lung cancer. J Natl Cancer Inst. 2007;99(9):715–26. doi: 10.1093/jnci/djk153.
- 17. Cassidy A, Myles JP, van Tongeren M, Page RD, Liloglou T, Duffy SW, et al. The LLP risk model: an individual risk prediction model for lung cancer. Br J Cancer. 2008;98(2):270–6. doi: 10.1038/sj.bjc.6604158.
- 18. Etzel CJ, Kachroo S, Liu M, D’Amelio A, Dong Q, Cote ML, et al. Development and validation of a lung cancer risk prediction model for African-Americans. Cancer Prev Res (Philadelphia, Pa). 2008;1(4):255–65. doi: 10.1158/1940-6207.capr-08-0082.
- 19. Tammemagi CM, Pinsky PF, Caporaso NE, Kvale PA, Hocking WG, Church TR, et al. Lung cancer risk prediction: Prostate, Lung, Colorectal And Ovarian Cancer Screening Trial models and validation. J Natl Cancer Inst. 2011;103(13):1058–68. doi: 10.1093/jnci/djr173.
- 20. Tammemagi MC, Church TR, Hocking WG, Silvestri GA, Kvale PA, Riley TL, et al. Evaluation of the lung cancer risks at which to screen ever- and never-smokers: screening rules applied to the PLCO and NLST cohorts. PLoS Med. 2014;11(12):e1001764. doi: 10.1371/journal.pmed.1001764.
- 21. Tammemagi MC, Katki HA, Hocking WG, Church TR, Caporaso N, Kvale PA, et al. Selection criteria for lung-cancer screening. N Engl J Med. 2013;368(8):728–36. doi: 10.1056/NEJMoa1211776.
- 22. Kovalchik SA, Tammemagi M, Berg CD, Caporaso NE, Riley TL, Korch M, et al. Targeting of low-dose CT screening according to the risk of lung-cancer death. N Engl J Med. 2013;369(3):245–54. doi: 10.1056/NEJMoa1301851.
- 23. Katki HA, Kovalchik SA, Berg CD, Cheung LC, Chaturvedi AK. Development and Validation of Risk Models to Select Ever-Smokers for CT Lung Cancer Screening. JAMA. 2016;315(21):2300–11. doi: 10.1001/jama.2016.6255.
- 24. Wilson DO, Weissfeld J. A simple model for predicting lung cancer occurrence in a lung cancer screening program: The Pittsburgh Predictor. Lung Cancer (Amsterdam, Netherlands). 2015;89(1):31–7. doi: 10.1016/j.lungcan.2015.03.021.
- 25. Hoggart C, Brennan P, Tjonneland A, Vogel U, Overvad K, Ostergaard JN, et al. A risk model for lung cancer incidence. Cancer Prev Res (Philadelphia, Pa). 2012;5(6):834–46. doi: 10.1158/1940-6207.capr-11-0237.
- 26. Marcus MW, Chen Y, Raji OY, Duffy SW, Field JK. LLPi: Liverpool Lung Project Risk Prediction Model for Lung Cancer Incidence. Cancer Prev Res (Philadelphia, Pa). 2015;8(6):570–5. doi: 10.1158/1940-6207.capr-14-0438.
- 27. Hippisley-Cox J, Coupland C. Development and validation of risk prediction algorithms to estimate future risk of common cancers in men and women: prospective cohort study. BMJ Open. 2015;5(3):e007825. doi: 10.1136/bmjopen-2015-007825.
- 28. Wang X, Ma K, Cui J, Chen X, Jin L, Li W. An individual risk prediction model for lung cancer based on a study in a Chinese population. Tumori. 2015;101(1):16–23. doi: 10.5301/tj.5000205.
- 29. Park S, Nam BH, Yang HR, Lee JA, Lim H, Han JT, et al. Individualized risk prediction model for lung cancer in Korean men. PLoS One. 2013;8(2):e54823. doi: 10.1371/journal.pone.0054823.
- 30. Wu X, Wen CP, Ye Y, Tsai M, Wen C, Roth JA, et al. Personalized risk assessment in never, light, and heavy smokers in a prospective cohort in Taiwan. Sci Rep. 2016;6:36482. doi: 10.1038/srep36482.
- 31. Muller DC, Johansson M, Brennan P. Lung Cancer Risk Prediction Model Incorporating Lung Function: Development and Validation in the UK Biobank Prospective Cohort Study. J Clin Oncol. 2017. doi: 10.1200/jco.2016.69.2467. Jco2016692467.
- 32. Tammemagi MC, Lam SC, McWilliams AM, Sin DD. Incremental value of pulmonary function and sputum DNA image cytometry in lung cancer risk prediction. Cancer Prev Res (Philadelphia, Pa). 2011;4(4):552–61. doi: 10.1158/1940-6207.capr-10-0183.
- 33. Spitz MR, Etzel CJ, Dong Q, Amos CI, Wei Q, Wu X, et al. An expanded risk prediction model for lung cancer. Cancer Prev Res (Philadelphia, Pa). 2008;1(4):250–4. doi: 10.1158/1940-6207.capr-08-0060.
- 34. El-Zein RA, Lopez MS, D’Amelio AM, Jr, Liu M, Munden RF, Christiani D, et al. The cytokinesis-blocked micronucleus assay as a strong predictor of lung cancer: extension of a lung cancer risk prediction model. Cancer Epidemiol Biomarkers Prev. 2014;23(11):2462–70. doi: 10.1158/1055-9965.epi-14-0462.
- 35. Sin DD, Tammemagi CM, Lam S, Barnett MJ, Duan X, Tam A, et al. Pro-surfactant protein B as a biomarker for lung cancer prediction. J Clin Oncol. 2013;31(36):4536–43. doi: 10.1200/jco.2013.50.6105.
- 36. Li H, Yang L, Zhao X, Wang J, Qian J, Chen H, et al. Prediction of lung cancer risk in a Chinese population using a multifactorial genetic model. BMC Med Genet. 2012;13:118. doi: 10.1186/1471-2350-13-118.
- 37. Marcus MW, Raji OY, Duffy SW, Young RP, Hopkins RJ, Field JK. Incorporating epistasis interaction of genetic susceptibility single nucleotide polymorphisms in a lung cancer risk prediction model. Int J Oncol. 2016;49(1):361–70. doi: 10.3892/ijo.2016.3499.
- 38. Raji OY, Agbaje OF, Duffy SW, Cassidy A, Field JK. Incorporation of a genetic factor into an epidemiologic model for prediction of individual risk of lung cancer: the Liverpool Lung Project. Cancer Prev Res (Philadelphia, Pa). 2010;3(5):664–9. doi: 10.1158/1940-6207.capr-09-0141.
- 39. Spitz MR, Amos CI, Land S, Wu X, Dong Q, Wenzlaff AS, et al. Role of selected genetic variants in lung cancer risk in African Americans. J Thorac Oncol. 2013;8(4):391–7. doi: 10.1097/JTO.0b013e318283da29.
- 40. Young RP, Hopkins RJ, Hay BA, Epton MJ, Mills GD, Black PN, et al. A gene-based risk score for lung cancer susceptibility in smokers and ex-smokers. Postgrad Med J. 2009;85(1008):515–24. doi: 10.1136/pgmj.2008.077107.
- 41. Young RP, Hopkins RJ, Hay BA, Epton MJ, Mills GD, Black PN, et al. Lung cancer susceptibility model based on age, family history and genetic variants. PLoS One. 2009;4(4):e5302. doi: 10.1371/journal.pone.0005302.
- 42. Cronin KA, Gail MH, Zou Z, Bach PB, Virtamo J, Albanes D. Validation of a model of lung cancer risk prediction among smokers. J Natl Cancer Inst. 2006;98(9):637–40. doi: 10.1093/jnci/djj163.
- 43. D’Amelio AM, Jr, Cassidy A, Asomaning K, Raji OY, Duffy SW, Field JK, et al. Comparison of discriminatory power and accuracy of three lung cancer risk models. Br J Cancer. 2010;103(3):423–9. doi: 10.1038/sj.bjc.6605759.
- 44. Raji OY, Duffy SW, Agbaje OF, Baker SG, Christiani DC, Cassidy A, et al. Predictive accuracy of the Liverpool Lung Project risk model for stratifying patients for computed tomography screening for lung cancer: a case-control and cohort validation study. Ann Intern Med. 2012;157(4):242–50. doi: 10.7326/0003-4819-157-4-201208210-00004.
- 45. Li K, Husing A, Sookthai D, Bergmann M, Boeing H, Becker N, et al. Selecting High-Risk Individuals for Lung Cancer Screening: A Prospective Evaluation of Existing Risk Models and Eligibility Criteria in the German EPIC Cohort. Cancer Prev Res (Philadelphia, Pa). 2015;8(9):777–85. doi: 10.1158/1940-6207.capr-14-0424.
- 46. Ten Haaf K, Jeon J, Tammemagi MC, Han SS, Kong CY, Plevritis SK, et al. Risk prediction models for selection of lung cancer screening candidates: A retrospective validation study. PLoS Med. 2017;14(4):e1002277. doi: 10.1371/journal.pmed.1002277.
- 47. Weber M, Yap S, Goldsbury D, Manners D, Tammemagi M, Marshall H, et al. Identifying high risk individuals for targeted lung cancer screening: Independent validation of the PLCOM2012 risk prediction tool. Int J Cancer. 2017. doi: 10.1002/ijc.30673.
- 48. Gould MK, Donington J, Lynch WR, Mazzone PJ, Midthun DE, Naidich DP, et al. Evaluation of individuals with pulmonary nodules: when is it lung cancer? Diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest. 2013;143(5 Suppl):e93S–e120S. doi: 10.1378/chest.12-2351.
- 49. Pinsky PF, Gierada DS, Black W, Munden R, Nath H, Aberle D, et al. Performance of Lung-RADS in the National Lung Screening Trial: a retrospective assessment. Ann Intern Med. 2015;162(7):485–91. doi: 10.7326/M14-2086.
- 50. American College of Radiology. Lung-RADS Version 1.0 Assessment Categories. Release Date: April 28, 2014. https://www.acr.org/~/media/ACR/Documents/PDF/QualitySafety/Resources/LungRADS/AssessmentCategories.pdf. Last accessed 09/12/2017.
- 51. Swensen SJ, Silverstein MD, Ilstrup DM, Schleck CD, Edell ES. The probability of malignancy in solitary pulmonary nodules. Application to small radiologically indeterminate nodules. Arch Intern Med. 1997;157(8):849–55.
- 52. Gould MK, Ananth L, Barnett PG. A clinical model to estimate the pretest probability of lung cancer in patients with solitary pulmonary nodules. Chest. 2007;131(2):383–8. doi: 10.1378/chest.06-1261.
- 53. Herder GJ, van Tinteren H, Golding RP, Kostense PJ, Comans EF, Smit EF, et al. Clinical prediction model to characterize pulmonary nodules: validation and added value of 18F-fluorodeoxyglucose positron emission tomography. Chest. 2005;128(4):2490–6. doi: 10.1378/chest.128.4.2490.
- 54. Mehta HJ, Ravenel JG, Shaftman SR, Tanner NT, Paoletti L, Taylor KK, et al. The utility of nodule volume in the context of malignancy prediction for small pulmonary nodules. Chest. 2014;145(3):464–72. doi: 10.1378/chest.13-0708.
- 55. Li Y, Chen KZ, Wang J. Development and validation of a clinical prediction model to estimate the probability of malignancy in solitary pulmonary nodules in Chinese people. Clin Lung Cancer. 2011;12(5):313–9. doi: 10.1016/j.cllc.2011.06.005.
- 56. Yang L, Zhang Q, Bai L, Li T-Y, He C, Ma Q-L, et al. Assessment of the cancer risk factors of solitary pulmonary nodules. Oncotarget. 2017;8(17):29318–27. doi: 10.18632/oncotarget.16426.
- 57. Dong J, Sun N, Li J, Liu Z, Zhang B, Chen Z, et al. Development and validation of clinical diagnostic models for the probability of malignancy in solitary pulmonary nodules. Thorac Cancer. 2014;5(2):162–8. doi: 10.1111/1759-7714.12077.
- 58. Yonemori K, Tateishi U, Uno H, Yonemori Y, Tsuta K, Takeuchi M, et al. Development and validation of diagnostic prediction model for solitary pulmonary nodules. Respirology (Carlton, Vic). 2007;12(6):856–62. doi: 10.1111/j.1440-1843.2007.01158.x.
- 59. Zhang M, Zhuo N, Guo Z, Zhang X, Liang W, Zhao S, et al. Establishment of a mathematic model for predicting malignancy in solitary pulmonary nodules. J Thorac Dis. 2015;7(10):1833–41. doi: 10.3978/j.issn.2072-1439.2015.10.56.
- 60. Jin C, Cao J, Cai Y, Wang L, Liu K, Shen W, et al. A nomogram for predicting the risk of invasive pulmonary adenocarcinoma for patients with solitary peripheral subsolid nodules. J Thorac Cardiovasc Surg. 2017;153(2):462–9.e1. doi: 10.1016/j.jtcvs.2016.10.019.
- 61. Zheng B, Zhou X, Chen J, Zheng W, Duan Q, Chen C. A Modified Model for Preoperatively Predicting Malignancy of Solitary Pulmonary Nodules: An Asia Cohort Study. Ann Thorac Surg. 2015;100(1):288–94. doi: 10.1016/j.athoracsur.2015.03.071.
- 62. McWilliams A, Tammemagi MC, Mayo JR, Roberts H, Liu G, Soghrati K, et al. Probability of cancer in pulmonary nodules detected on first screening CT. N Engl J Med. 2013;369(10):910–9. doi: 10.1056/NEJMoa1214726.
- 63. van Riel SJ, Ciompi F, Jacobs C, Winkler Wille MM, Scholten ET, Naqibullah M, et al. Malignancy risk estimation of screen-detected nodules at baseline CT: comparison of the PanCan model, Lung-RADS and NCCN guidelines. Eur Radiol. 2017. doi: 10.1007/s00330-017-4767-2.
- 64.Isbell JM, Deppen S, Putnam JB, Jr, Nesbitt JC, Lambright ES, Dawes A, et al. Existing general population models inaccurately predict lung cancer risk in patients referred for surgical evaluation. The Ann Thorac Surg. 2011;91(1):227–33. doi: 10.1016/j.athoracsur.2010.08.054. discussion 33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Melo CB, Perfeito JA, Daud DF, da Costa AS, Junior, Santoro IL, Leao LE. Analysis and validation of probabilistic models for predicting malignancy in solitary pulmonary nodules in a population in Brazil. J Bras Pneumol. 2012;38(5):559–65. doi: 10.1590/s1806-37132012000500004. [DOI] [PubMed] [Google Scholar]
- 66.Schultz EM, Sanders GD, Trotter PR, Patz EF, Jr, Silvestri GA, Owens DK, et al. Validation of two models to estimate the probability of malignancy in patients with solitary pulmonary nodules. Thorax. 2008;63(4):335–41. doi: 10.1136/thx.2007.084731. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Shinohara S, Hanagiri T, Takenaka M, Chikaishi Y, Oka S, Shimokawa H, et al. Evaluation of undiagnosed solitary lung nodules according to the probability of malignancy in the American College of Chest Physicians (ACCP) evidence-based clinical practice guidelines. Radiol Oncol. 2014;48(1):50–5. doi: 10.2478/raon-2013-0064. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Xiao F, Liu D, Guo Y, Shi B, Song Z, Tian Y, et al. Novel and convenient method to evaluate the character of solitary pulmonary nodule-comparison of three mathematical prediction models and further stratification of risk factors. PloS One. 2013;8(10):e78271. doi: 10.1371/journal.pone.0078271. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.White CS, Dharaiya E, Campbell E, Boroczky L. The Vancouver Lung Cancer Risk Prediction Model: Assessment by Using a Subset of the National Lung Screening Trial Cohort. Radiology. 2016:152627. doi: 10.1148/radiol.2016152627. [DOI] [PubMed] [Google Scholar]
- 70.Winkler Wille MM, van Riel SJ, Saghir Z, Dirksen A, Pedersen JH, Jacobs C, et al. Predictive accuracy of the PanCan Lung Cancer Risk Prediction Model -External validation based on CT from the Danish Lung Cancer Screening Trial. Eur Radiol. 2015;25(10):3093–9. doi: 10.1007/s00330-015-3689-0. [DOI] [PubMed] [Google Scholar]
- 71.Zhao H, Marshall HM, Yang IA, Bowman RV, Ayres J, Crossin J, et al. Screen-detected subsolid pulmonary nodules: long-term follow-up and application of the PanCan lung cancer risk prediction model. Br J Radiol. 2016;89(1060):20160016. doi: 10.1259/bjr.20160016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Al-Ameri A, Malhotra P, Thygesen H, Plant PK, Vaidyanathan S, Karthik S, et al. Risk of malignancy in pulmonary nodules: A validation study of four prediction models. Lung cancer (Amsterdam, Netherlands) 2015;89(1):27–30. doi: 10.1016/j.lungcan.2015.03.018. [DOI] [PubMed] [Google Scholar]
- 73.Talwar A, Rahman NM, Kadir T, Pickup LC, Gleeson F. A retrospective validation study of three models to estimate the probability of malignancy in patients with small pulmonary nodules from a tertiary oncology follow-up centre. Clin Radiol. 2017;72(2):177e1–e8. doi: 10.1016/j.crad.2016.09.014. [DOI] [PubMed] [Google Scholar]
- 74.Deppen SA, Blume JD, Aldrich MC, Fletcher SA, Massion PP, Walker RC, et al. Predicting lung cancer prior to surgical resection in patients with lung nodules. J Thorac Oncol. 2014;9(10):1477–84. doi: 10.1097/jto.0000000000000287. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.National Cancer Institute, Division of Cancer Epidemiology & Genetics, Lung Cancer. [Last accessed 09/12/2017];Risk Models for Screening (R package: lcrisks) http://dceg.cancer.gov/tools/risk-assessment/lcrisks.
- 76.National Cancer Institute, Division of Cancer Epidemiology & Genetics. [Last accessed 09/12/2017];R package: lcmodels. http://dceg.cancer.gov/tools/risk-assessment/lcmodels.
- 77•.Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55–63. doi: 10.7326/m14-0697. Presents guidelines for the systematic and transparent reporting of studies designed to develop, validate, or update a prediction model. [DOI] [PubMed] [Google Scholar]
- 78.Simonson MA, Wills AG, Keller MC, McQueen MB. Recent methods for polygenic analysis of genome-wide data implicate an important effect of common variants on cardiovascular disease risk. BMC Med Genet. 2011;12:146. doi: 10.1186/1471-2350-12-146. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Qian DC, Han Y, Byun J, Shin HR, Hung RJ, McLaughlin JR, et al. A Novel Pathway-Based Approach Improves Lung Cancer Risk Prediction Using Germline Genetic Variations. Cancer Epidemiol Biomarkers Prev. 2016;25(8):1208–15. doi: 10.1158/1055-9965.epi-15-1318. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Khurana R, Wolf R, Berney S, Caldito G, Hayat S, Berney SM. Risk of development of lung cancer is increased in patients with rheumatoid arthritis: a large case control study in US veterans. J Rheumatol. 2008;35(9):1704–8. [PubMed] [Google Scholar]
- 81.Stocks T, Van Hemelrijck M, Manjer J, Bjorge T, Ulmer H, Hallmans G, et al. Blood pressure and risk of cancer incidence and mortality in the Metabolic Syndrome and Cancer Project. Hypertension (Dallas, Tex : 1979) 2012;59(4):802–10. doi: 10.1161/hypertensionaha.111.189258. [DOI] [PubMed] [Google Scholar]
- 82.Carter BW, Godoy MC, Erasmus JJ. Predicting Malignant Nodules from Screening CTs. J Thorac Oncol. 2016;11(12):2045–7. doi: 10.1016/j.jtho.2016.09.117. [DOI] [PubMed] [Google Scholar]
- 83.Memorial Sloan Ketting Cancer Center. [Last accessed 09/12/2017];Lung Cancer Screening Decision Tool. http://nomograms.mskcc.org/Lung/Screening.aspx.
- 84. [Last accessed 09/12/2017];Lung Cancer CT Screening. http://www.shouldiscreen.com/
- 85.National Cancer Institute, Division of Cancer Epidemiology & Genetics. [Last accessed 09/12/2017];Risk-based NLST Outcomes Tool (RNOT) http://analysistools.nci.nih.gov/lungCancerScreening/
- 86•.NCCN Clinical Practice Guidelines in Oncology - Lung Cancer Screening Version 2.2018. Represents the first clinical practice guidelines for lung cancer screening in the United States to endorse LDCT screening based on model-based predicted lung cancer risk