Assessment of model performance on training and test data. (A) Calibration plot for internal temporal validation using the training data (405,692 observations from 12,307 patients). (B, left) Receiver operating characteristic curves for each method applied to the test data (36,769 observations from 1,150 patients); the reference for random classification is the 45-degree diagonal, with an area under the curve of 50%. (B, right) Precision-recall curves for each method applied to the test data; the reference is the horizontal line at a precision equal to the prevalence of adverse events, 0.93%. P denotes the number of adverse events and N denotes the number of non-events.
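
The two reference lines in panel (B) can be illustrated with a minimal sketch. It assumes binary labels (1 = adverse event, 0 = non-event); the function names `pr_baseline` and `roc_auc` are hypothetical, and the counts in the example are illustrative values chosen to give a 0.93% prevalence, not the study's actual P and N.

```python
def pr_baseline(p, n):
    """Precision-recall reference line: the prevalence P / (P + N)."""
    return p / (p + n)


def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen event scores higher than a randomly chosen non-event (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


# Illustrative counts only (assumption): 93 events among 10,000 observations.
print(pr_baseline(93, 9907))

# An uninformative classifier that gives every observation the same score
# sits on the 45-degree diagonal, so its AUC equals the 50% reference.
print(roc_auc([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 1]))
```

With so few events, the precision-recall reference lies far below the 50% ROC reference, which is why the two panels are best read against their own baselines.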