Figure 3.
Test statistics for a potential screening tool using the Wang-attMIL image-only models
Test performance at thresholds of 0.25, 0.5, and 0.75 (top) and at the threshold that yielded 95% in-domain sensitivity (95-Sens. threshold), averaged across the five models per biomarker. In-domain performance is measured by the summed model predictions over the respective test sets. External performance on DACHS is obtained by averaging the biomarker prediction scores of all five Wang-attMIL models per biomarker. Clinical statistics for correctly classified and misclassified patients in QUASAR and DACHS at a threshold of 0.5 are given in Tables S7 and S8.
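The thresholding described in this caption can be illustrated with a minimal sketch. It is not the authors' code. It rests on several assumptions: a patient is called positive when the score is greater than or equal to the threshold, the 95-Sens. threshold is taken as the largest score cut-off that still reaches the target sensitivity on the in-domain data, and external scores are the plain mean over models. The function names `average_scores`, `sensitivity_specificity`, and `threshold_for_sensitivity` are hypothetical.

```python
import math


def average_scores(scores_per_model):
    """Average per-patient scores across models (one inner list per model)."""
    n_models = len(scores_per_model)
    return [sum(column) / n_models for column in zip(*scores_per_model)]


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity, assuming score >= threshold means positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def threshold_for_sensitivity(scores, labels, target=0.95):
    """Largest cut-off whose sensitivity on the given data is at least `target`."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("at least one positive case is required")
    # Number of positive cases that must score at or above the cut-off.
    k = max(1, math.ceil(target * len(positives)))
    return positives[k - 1]


if __name__ == "__main__":
    # Toy data only: three hypothetical models scoring four patients.
    model_scores = [
        [0.90, 0.40, 0.20, 0.70],
        [0.80, 0.50, 0.10, 0.60],
        [0.85, 0.30, 0.30, 0.65],
    ]
    labels = [1, 1, 0, 0]
    mean_scores = average_scores(model_scores)
    for thr in (0.25, 0.5, 0.75):
        print(thr, sensitivity_specificity(mean_scores, labels, thr))
    print(threshold_for_sensitivity(mean_scores, labels, target=0.95))
```

In this sketch the sensitivity-matched threshold would be derived from in-domain scores and then applied unchanged to the averaged external scores. Whether the original analysis derived it per model or on pooled predictions is not specified here.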