2019 Jul 25;5(10):1421–1429. doi: 10.1001/jamaoncol.2019.1800

Table 2. Model Performance Characteristics in 14 230 Total Imaging Reports Among Curation Set Patients^a

| Outcome (% of All Reports With Outcome) | Metric | Training Subset (11 182 Reports), CV 1 | Training Subset, CV 2 | Training Subset, CV 3 | Training Subset, CV 4 | Training Subset, CV 5 | Validation Subset (1545 Reports) | Test Subset (1503 Reports) |
|---|---|---|---|---|---|---|---|---|
| Any cancer (61.7) | AUC | 0.92 | 0.94 | 0.91 | 0.93 | 0.90 | 0.91 | 0.92 |
| | Area under PR curve | 0.95 | 0.96 | 0.92 | 0.95 | 0.93 | 0.92 | 0.94 |
| | Best F1 score | 0.90 | 0.90 | 0.86 | 0.90 | 0.87 | 0.88 | 0.88 |
| Worsening/progressing (24.7) | AUC | 0.92 | 0.94 | 0.90 | 0.92 | 0.92 | 0.92 | 0.94 |
| | Area under PR curve | 0.79 | 0.85 | 0.78 | 0.81 | 0.82 | 0.78 | 0.83 |
| | Best F1 score | 0.74 | 0.79 | 0.72 | 0.75 | 0.76 | 0.73 | 0.78 |
| Improving/responding (11.5) | AUC | 0.94 | 0.93 | 0.94 | 0.93 | 0.93 | 0.93 | 0.95 |
| | Area under PR curve | 0.78 | 0.75 | 0.76 | 0.74 | 0.72 | 0.75 | 0.82 |
| | Best F1 score | 0.74 | 0.73 | 0.73 | 0.73 | 0.70 | 0.72 | 0.76 |
| Metastasis in liver (8.1) | AUC | 0.97 | 0.95 | 0.97 | 0.97 | 0.97 | 0.96 | 0.98 |
| | Area under PR curve | 0.83 | 0.73 | 0.86 | 0.84 | 0.83 | 0.76 | 0.74 |
| | Best F1 score | 0.78 | 0.73 | 0.81 | 0.78 | 0.77 | 0.71 | 0.71 |
| Metastases in bone (17.3) | AUC | 0.96 | 0.95 | 0.96 | 0.94 | 0.95 | 0.93 | 0.95 |
| | Area under PR curve | 0.85 | 0.87 | 0.85 | 0.81 | 0.82 | 0.74 | 0.75 |
| | Best F1 score | 0.80 | 0.82 | 0.83 | 0.78 | 0.79 | 0.75 | 0.76 |
| Metastases in brain/spine (8.3) | AUC | 0.99 | 0.97 | 0.98 | 0.97 | 0.99 | 0.95 | 0.97 |
| | Area under PR curve | 0.85 | 0.77 | 0.90 | 0.77 | 0.89 | 0.79 | 0.83 |
| | Best F1 score | 0.83 | 0.78 | 0.85 | 0.77 | 0.83 | 0.75 | 0.79 |
| Metastases in lymph nodes (13.4) | AUC | 0.86 | 0.84 | 0.87 | 0.82 | 0.87 | 0.87 | 0.89 |
| | Area under PR curve | 0.54 | 0.45 | 0.47 | 0.43 | 0.41 | 0.49 | 0.49 |
| | Best F1 score | 0.55 | 0.51 | 0.53 | 0.49 | 0.48 | 0.53 | 0.55 |
| Metastases in adrenal (4.7) | AUC | 0.97 | 0.96 | 0.97 | 0.96 | 0.97 | 0.96 | 0.97 |
| | Area under PR curve | 0.76 | 0.80 | 0.75 | 0.68 | 0.82 | 0.69 | 0.73 |
| | Best F1 score | 0.70 | 0.77 | 0.72 | 0.64 | 0.75 | 0.76 | 0.73 |

Abbreviations: AUC, area under the receiver operating characteristic curve; CV, cross-validation; F1 score, harmonic mean of precision and recall; PR, precision-recall.
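As an illustration of two of the metrics defined above, the following is a minimal sketch in plain Python. It assumes that "best F1 score" means the maximum F1 obtained over all candidate decision thresholds, which the table does not state explicitly, and it computes AUC through the rank interpretation (the probability that a randomly chosen positive report scores higher than a randomly chosen negative one). The function names and the toy labels and scores are hypothetical and are not taken from the study; the area under the PR curve is omitted for brevity.

```python
from typing import Sequence


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_f1(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Maximum F1 over every threshold taken from the observed scores.

    Assumption: "best F1" is read as the maximum over thresholds.
    """
    best = 0.0
    for threshold in sorted(set(scores)):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        best = max(best, f1_score(precision, recall))
    return best


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented purely for illustration (not from the study).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(round(roc_auc(labels, scores), 3))  # 0.889
print(round(best_f1(labels, scores), 3))  # 0.857
```

Because the best F1 is chosen after sweeping thresholds, it depends on the share of positive reports for each outcome, which may help explain why rarer outcomes in the table can pair a high AUC with a noticeably lower F1.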

^a Each cross-validation model was trained on a random sample of 80% of the training subset patients and evaluated on the remaining 20% of the training subset. A given training subset patient could be included in more than 1 cross-validation model.
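The splitting scheme in this footnote can be sketched as repeated random patient-level splits. The example below is a minimal sketch under stated assumptions: each of the 5 models draws its own independent 80/20 shuffle of patients (so a patient can land in the training sample of several models), and all reports from a patient follow that patient into the same side of the split. The function names, seed, and toy patient and report identifiers are hypothetical; the footnote does not describe the actual implementation.

```python
import random
from typing import Dict, List, Tuple


def patient_level_splits(
    patient_ids: List[str],
    n_models: int = 5,
    train_fraction: float = 0.8,
    seed: int = 0,
) -> List[Tuple[List[str], List[str]]]:
    """Draw an independent random train/evaluation patient split per model."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    n_train = round(train_fraction * len(patient_ids))
    splits = []
    for _ in range(n_models):
        shuffled = patient_ids[:]
        rng.shuffle(shuffled)
        # First 80% of the shuffled patients train; the remaining 20% evaluate.
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits


def reports_for(patients: List[str], reports_by_patient: Dict[str, List[str]]) -> List[str]:
    """Collect every report belonging to the given patients."""
    return [r for p in patients for r in reports_by_patient[p]]


# Hypothetical toy data: 10 patients with 3 reports each.
reports_by_patient = {
    f"patient_{i}": [f"report_{i}_{j}" for j in range(3)] for i in range(10)
}
splits = patient_level_splits(sorted(reports_by_patient))
train_reports = reports_for(splits[0][0], reports_by_patient)
print(len(splits), len(splits[0][0]), len(splits[0][1]), len(train_reports))
# prints: 5 8 2 24
```

Splitting by patient rather than by report keeps all reports from one patient on the same side of a given split, so a model is never evaluated on reports from a patient it was trained on.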