J Natl Cancer Inst. 2020 Apr 7;112(10):979–988. doi: 10.1093/jnci/djaa050

Table 2. Algorithm outcome and validation

| Reference | Model type | Sensitivity, % (95% CI) | Specificity, % (95% CI) | PPV, % (95% CI) | NPV, % (95% CI) | Calculated performance measures, accuracy, % (95% CI) | Internal validation (size and performance) | External validation (size and performance) | Distribution of development and validation set |
|---|---|---|---|---|---|---|---|---|---|
| Xu, 2019 (7) | CART | 94.2 (90.1 to 98.4) | 98.3 (97.2 to 99.5) | 93.4 (89.1 to 97.8) | 98.5 (97.4 to 99.6) | 97.5 (95.8 to 98.5)^a | Yes (—) | | 60-40 |
| Rasmussen, 2019 (15) | Rules | 97.3 (93 to 99) | 97.2 (95 to 99) | 94.2 | 98.7 | 97.2 (95.3 to 98.4)^a | | | |
| Cronin-Fenton, 2018 (12) | Rules | 88.1 (75.9 to 95.3) | 87.6 (80.6 to 92.7) | 72.5 (59.3 to 83.3) | 95.2 (89.8 to 98.1) | 87.7 (81.6 to 92.1)^a | | | |
| Ritzwoller, 2018 (16) | Logistic regression | 80.5 (77.5 to 87.7) | 97.3 (91.9 to 98.1) | 70.0 (44.2 to 77.7) | 98.5 (98.2 to 99.0) | 96.1 (95.4 to 96.7) | Yes (3370; AUROC of validation set: 0.96 [0.94 to 0.97]) | Yes (3961; AUROC of validation set: 0.90 [0.87 to 0.93]) | 50-50 |
| Chubak, 2012 (17) | CART | 94 (90 to 97) | 92 (91 to 94) | 58 (52 to 63) | 99 (99 to 100) | 92.2 (90.9 to 93.3) | Yes (—) | | 60-40 |
| Chubak, 2017 (18) | CART | 94 (90 to 97) | 92 (91 to 94) | 58 (52 to 63) | 99 (99 to 100) | 92.2 (90.9 to 93.3) | | | |
| Kroenke, 2016 (5) | CART | LACE cohort: 88.5; WHI cohort: 86.7 | LACE cohort: 94.6; WHI cohort: 92.3 | LACE cohort: 76.4; WHI cohort: 63.4 | LACE cohort: 97.7; WHI cohort: 97.8 | LACE cohort: 93.6 (92.3 to 94.7); WHI cohort: 91.6 (87.1 to 94.6) | | Validates algorithms for a previous dataset published by Chubak et al. (18) | |
| Haque, 2015 (13) | Rules | 96.8 (87.6 to 99.4) | 93.0 (85.1 to 96.1) | 88.2 (75.9 to 93.4) | 98.1 (92.8 to 99.7) | 94.3 (89.7 to 97.0) | Yes (500; sensitivity: 96.9% [88.4% to 99.5%]; specificity: 92.4% [89.4% to 94.6%]; PPV: 65.6% [55.2% to 74.8%]; NPV: 99.5% [98.0% to 99.9%]) | | |
| Liede, 2015 (6) | Rules | 77.5 (73.9 to 81.1) | 98.1 (97.8 to 98.4) | 72.1 (68.4 to 75.8) | 98.6 (98.3 to 98.8) | 96.9 (96.5 to 97.2) | | | |
| Lamont, 2006 (19) | Rules | 92 (66 to 100) | 94 (82 to 99) | | | 93.3 (81.5 to 98.4) | | | |
| Nordstrom, 2012 (8) | CART and random forests^b | 62 | 97 | 75 | 95 | 92.6 (91.1 to 93.9)^a | Yes (—) | | 60-40 |
| Nordstrom, 2015 (9) | Random forests^b | 47.1 | 95.9 | 28.6 | 98.1 | 94.2 (91.8 to 96.0) | Yes (—) | | |
| Whyte, 2015 (11) | Rules | 77.19 | 79.54 | 30.91 | 97.71 | 79.3 (78.1 to 80.5)^a | | Validates 28 different algorithms used to identify metastatic cancers (algorithms not published) | |
| Chawla, 2014 (20) | Rules | 43.9 | 98.6 | 93.8 | 78.5 | 80.8 (80.3 to 81.3) | | | |
| Hassett, 2014 (21) | Rules | 81 (67 to 90) | 78 (76 to 80) | 20 | | 78.1 (76.4 to 79.6) | | | |
| Sathiakumar, 2017 (10) | Rules | 96.8 (83.8 to 99.4) | 98.6 (95.9 to 99.5) | 90.9 (76.4 to 96.9) | 99.5 (97.3 to 99.9) | 98.3 (95.6 to 99.5)^a | | | |
| McClish, 2003 (14) | Logistic regression | | | | | ^a | Yes (—) | | |
^a Studies reporting performance measures: Xu et al. (2019): accuracy = 97.5% (96.2% to 98.7%); Ritzwoller et al. (2018): AUROC = 0.96 (0.94 to 0.97); Rasmussen et al. (2019): kappa = 0.94 (0.90 to 0.97); Nordstrom et al. (2012): AUROC = 0.82; Whyte et al. (2015): accuracy = 79.3%; Sathiakumar et al. (2017): kappa = 0.93 (0.80 to 1.05); McClish et al. (2003): AUROC = 0.90. — = not reported; AUROC = area under the receiver operating characteristic curve; CART = classification and regression tree; CI = confidence interval; LACE = Life After Cancer Epidemiology; NPV = negative predictive value; PPV = positive predictive value; WHI = Women's Health Initiative.
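Footnote a distinguishes measures reported by the studies themselves (accuracy, AUROC, kappa) from the accuracy values calculated for this table from each study's 2 × 2 validation counts. As a minimal sketch of how sensitivity, specificity, PPV, NPV, and accuracy with 95% CIs can be derived from such counts, assuming Wilson score intervals and purely hypothetical counts (the reviewed studies do not specify either):

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion."""
    if n == 0:
        return (float("nan"), float("nan"))
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - half, centre + half)

def performance(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Point estimate and 95% CI for each measure in the table's columns."""
    measures = {
        "sensitivity": (tp, tp + fn),   # true recurrences correctly flagged
        "specificity": (tn, tn + fp),   # non-recurrences correctly left unflagged
        "ppv":         (tp, tp + fp),   # flagged cases that are true recurrences
        "npv":         (tn, tn + fn),   # unflagged cases that are truly recurrence free
        "accuracy":    (tp + tn, tp + fp + fn + tn),
    }
    return {name: (num / den, wilson_ci(num, den)) for name, (num, den) in measures.items()}

# Hypothetical 2x2 counts (algorithm vs. gold-standard chart review), illustration only.
for name, (est, (lo, hi)) in performance(tp=120, fp=15, fn=10, tn=480).items():
    print(f"{name:12s} {100 * est:5.1f}% (95% CI {100 * lo:.1f} to {100 * hi:.1f})")
```

Exact (Clopper–Pearson) intervals would give slightly wider bounds than the Wilson intervals assumed here; the choice does not change the point estimates.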

^b CART is a single decision tree built on a dataset, whereas a random forest is an ensemble of decision trees, each grown on a randomly selected subset of variables. CART and random forests are categorized together as model-based approaches.
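To make the distinction in footnote b concrete, the sketch below fits a single CART-style tree and a random forest and scores both on a held-out 40% validation set, mirroring the 60-40 development–validation split in the Distribution column. The use of scikit-learn and synthetic features is an assumption for illustration, not the tooling or data of any reviewed study.

```python
# Illustrative only: synthetic stand-in for claims/EHR features with a rare "recurrence" outcome.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=5_000, n_features=20, weights=[0.9, 0.1], random_state=0)

# 60-40 development/validation split, as in the Distribution column.
X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.4, stratify=y, random_state=0)

# CART: one decision tree grown on the development set.
cart = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_dev, y_dev)

# Random forest: an ensemble of trees, each grown on a bootstrap sample with a
# random subset of variables considered at each split.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_dev, y_dev)

for name, model in [("CART", cart), ("Random forest", forest)]:
    auroc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
    print(f"{name}: internal-validation AUROC = {auroc:.2f}")
```

Because each tree in the forest sees a different bootstrap sample and variable subset, averaging their predictions typically reduces variance relative to a single tree, which is why the two approaches can perform differently on the same data.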