Receiver Operating Characteristic (ROC) plots comparing five
anaesthesiologists’ predictions from recorded data with and without
Prescience assistance. Light-colored lines represent individual
anaesthesiologists’ performance; dark lines represent their average
performance. (A) For initial risk prediction, anaesthesiologists
(green, AUC = 0.60) performed significantly better with Prescience assistance
(blue, AUC = 0.76; P-value < 0.0001) than without it, and Prescience alone
outperformed anaesthesiologists in a direct comparison
(purple, AUC = 0.83; P-value < 0.0001). (B) For
intraoperative real-time (next 5 minutes) risk prediction, anaesthesiologists
(green, AUC = 0.66) again performed better with Prescience assistance (blue, AUC
= 0.78; P-value < 0.0001), and Prescience alone outperformed
anaesthesiologists’ predictions (purple, AUC = 0.81; P-value < 0.0001).
Note that the False Positive Rate (FPR) (x-axis) measures what fraction of time
points without upcoming hypoxemia were incorrectly predicted to have upcoming
hypoxemia. The True Positive Rate (TPR) (y-axis) measures what fraction of
hypoxemic events were correctly predicted. P-values were computed using
bootstrap resampling over the tested time points while measuring the difference
in area between the curves. If we instead resample over anaesthesiologists, we
observe bootstrap P-values of 0 and t-test P-values < 0.001 for
Prescience improvements. See Supplementary Figure 8 for plots of the statistical separation
between the mean ROC curves across all false positive rates.
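
The bootstrap comparison described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc` and `bootstrap_auc_diff_pvalue`, the number of resamples, and the one-sided definition of the P-value (the fraction of resamples in which the second predictor does not beat the first) are all assumptions made for illustration. The sketch resamples time points with replacement and measures the difference in area under the ROC curve on each resample.

```python
import numpy as np


def auc(labels, scores):
    """Area under the ROC curve via the rank formulation.

    Equals the fraction of (positive, negative) pairs in which the
    positive point receives the higher score; ties count as one half.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


def bootstrap_auc_diff_pvalue(labels, scores_a, scores_b, n_boot=2000, seed=0):
    """Hypothetical helper: bootstrap over time points, comparing two predictors.

    Returns the observed AUC difference (B minus A) and a one-sided P-value,
    defined here (an assumption) as the fraction of resamples in which
    predictor B fails to achieve a higher AUC than predictor A.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    n = labels.size

    observed = auc(labels, scores_b) - auc(labels, scores_a)

    diffs = []
    while len(diffs) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample time points with replacement
        y = labels[idx]
        if y.all() or not y.any():
            continue  # AUC is undefined without both classes; draw again
        diffs.append(auc(y, scores_b[idx]) - auc(y, scores_a[idx]))

    p_value = float(np.mean(np.array(diffs) <= 0.0))
    return observed, p_value


# Small check of the AUC helper: 3 of the 4 positive/negative pairs are ordered correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

With a finite number of resamples, this estimate can be exactly 0 when no resample favors the weaker predictor, which is one way a reported bootstrap P-value of 0 can arise. Resampling over anaesthesiologists instead of time points would draw whole sets of per-anaesthesiologist predictions rather than individual indices.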