Skip to main content
. 2023 May 8;29(5):1113–1122. doi: 10.1038/s41591-023-02332-5

Fig. 4. Estimated performance of a surveillance program for high-risk patients in different health systems and with different operational choices.

Fig. 4

Estimated relative risk (RR) for the top n (horizontal axis) high-risk patients is based on evaluating the accuracy of prediction on the withheld test set (a,c,d) and on a full external dataset (b). a,c, In designing surveillance programs, one can choose between models trained on all data (Exclusion 0) versus models trained excluding data from the last 3 months before cancer occurrence (Exclusion 3) and between prediction for cancer within 12 months or 36 months of assessment (legend top right in each panel). b, Estimated performance is somewhat lower for cross-application of a model trained on Danish (DK) data applied to US-VA patient data, illustrating the challenge of deriving globally valid prediction tools without independent localized or system-specific training. d, A proposed practical choice for a surveillance program with good estimated accuracy of prediction, in either system, would involve application of independently trained models with 3-month data exclusion for a prediction interval of 12 months for patients older than age 50 years. 1M, 1 million.