Table 3.
Logistic regression lung cancer prediction models developed in the entire PLCO control arm (model 1) and in smokers only (model 2)*
Variable | Model 1†: all PLCO control arm (N = 61 999), OR (95% CI) | P | Model 2‡: smokers only in PLCO control arm (N = 33 049), OR (95% CI) | P |
Age, per year | ||||
Age spline 1 | 1.212 (1.110 to 1.322) | <.001 | 1.245 (1.130 to 1.372) | <.001 |
Age spline 2 | 0.732 (0.551 to 0.972) | .031 | 0.705 (0.505 to 0.984) | .040 |
Age spline 3 | 1.884 (0.917 to 3.869) | .085 | 2.205 (0.860 to 5.651) | .100 |
Education, per 1 of 7 levels change | 0.930 (0.890 to 0.971) | .001 | 0.928 (0.887 to 0.971) | .001 |
BMI, per 1 unit change | 0.970 (0.955 to 0.985) | <.001 | 0.972 (0.956 to 0.988) | .001 |
Family history of lung cancer, yes vs no | 1.564 (1.323 to 1.848) | <.001 | 1.561 (1.313 to 1.856) | <.001 |
COPD, yes vs no | 1.380 (1.153 to 1.651) | <.001 | 1.374 (1.145 to 1.648) | .001 |
Chest x-ray in past 3 y, per 1 of 3 levels | 1.114 (1.020 to 1.217) | .017 | 1.117 (1.019 to 1.225) | .019 |
Pack-years smoked, per 1 pack-year | ||||
PKYR spline 1 | 1.108 (1.073 to 1.144) | <.001 | 1.059 (1.044 to 1.074) | <.001 |
PKYR spline 2 | 0.500 (0.392 to 0.636) | <.001 | 0.949 (0.935 to 0.964) | <.001 |
Smoking duration, linear, per 1 y | | | 1.012 (0.995 to 1.029) | .171 |
Smoking duration, splines, per 1 y | ||||
Duration spline 1 | 0.986 (0.949 to 1.025) | .480 | | |
Duration spline 2 | 1.127 (1.019 to 1.246) | .020 | | |
Smoking quit-time in smokers, per 1 y | ||||
Quit-time spline 1 | | | 0.945 (0.918 to 0.974) | <.001 |
Quit-time spline 2 | | | 1.047 (1.011 to 1.085) | .010 |
Smoking status | ||||
Never/former | Baseline | | Baseline | |
Current | 1.721 (1.426 to 2.077) | <.001 | 1.356 (1.077 to 1.708) | .010 |
Model performance statistics | ||||
Hosmer–Lemeshow goodness of fit, P | .274 | .416 | ||
Nagelkerke’s R² (BOC) | 0.199 (0.195)§ | 0.152 (0.147)§ | ||
ROC AUC/c statistic (95% CI) (BOC) | 0.859 (95% CI = 0.8476 to 0.8707) (0.857)§ | 0.809 (95% CI = 0.7957 to 0.8219) (0.805)§ | ||
Calibration line, slope (BOC) | 0.987§ | 0.979§ | ||
Calibration line, intercept (BOC) | −0.042§ | −0.061§ | ||
Calibration line, mean absolute error | 0.0009 | 0.0014 | ||
Calibration line, 0.9 quantile of absolute error | 0.0025 | 0.0029 | ||
External validation | ROC AUC (95% CI) | ROC AUC (95% CI) | ||
All validation sample | 0.841 (0.813 to 0.870), n = 36 363 | 0.784 (0.745 to 0.824), n = 15 169 | ||
Women | 0.828 (0.781 to 0.876), n = 18 988 | 0.779 (0.711 to 0.847), n = 6121 | ||
Men | 0.849 (0.815 to 0.883), n = 17 375 | 0.789 (0.743 to 0.835), n = 9048 | ||
Whites | 0.843 (0.813 to 0.872), n = 33 116 | 0.778 (0.736 to 0.819), n = 13 900 | ||
Nonwhites, including Hispanics | 0.829 (0.719 to 0.939), n = 3247 | 0.876 (0.809 to 0.943), n = 1269 |
*BMI = body mass index; BOC = bootstrap optimism corrected; CI = confidence interval; OR = odds ratio; PKYR = pack-years smoked; PLCO = Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial; ROC AUC = receiver operator characteristic area under the curve.
†Splines for age, pack-years smoked, and smoking duration in model 1 are based on all PLCO control subjects. Knots for age were at 55, 60, 65, and 72 years. Knots for pack-years were at 0, 2.25, and 49 pack-years. Knots for smoking duration were at 0, 6, and 41 years.
‡Splines for age, pack-years smoked, and quit-time in model 2 are based on the distribution of these variables in smokers only. Knots for age were at 55, 60, 64, and 72 years. Knots for pack-years were at 3.25, 23.25, and 63 pack-years. Knots for quit-time were at 0, 15, and 35 years.
§Bootstrap optimism corrected estimate of model performance based on 200 bootstrap resamplings.
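
The odds ratios, confidence intervals, and P values in Table 3 can be cross-checked against one another. The following is a minimal sketch, not the authors' code, and it assumes the reported intervals are Wald-type, that is, symmetric around the coefficient on the log-odds scale; the table does not state this. The function name `wald_p_from_or` is hypothetical. It recovers the logistic regression coefficient, its standard error, and a two-sided P value from a reported OR and 95% CI.

```python
import math

def wald_p_from_or(or_est, ci_low, ci_high, z_crit=1.959964):
    """Recover (beta, se, z, p) from an odds ratio and its 95% CI.

    Assumption: the CI is a Wald interval, exp(beta +/- z_crit * se).
    """
    beta = math.log(or_est)  # logistic regression coefficient (log odds ratio)
    # The width of the CI on the log scale equals 2 * z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p

# Rows copied from Table 3: (OR, lower 95% limit, upper 95% limit).
rows = {
    "Current smoking, model 2": (1.356, 1.077, 1.708),
    "COPD, model 1": (1.380, 1.153, 1.651),
}

for label, (or_est, lo, hi) in rows.items():
    beta, se, z, p = wald_p_from_or(or_est, lo, hi)
    print(f"{label}: beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

Under the Wald assumption, the recovered P value for current smoking in model 2 rounds to .010 and the one for COPD in model 1 falls below .001, matching the tabulated values. Because the inputs are rounded to three decimals, small discrepancies in other rows would not by themselves indicate an error.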