a, b, Precision-recall curves for predictions of antibiotic activity, for an ensemble of 10 Chemprop models without RDKit features (a) and the best-performing random forest classifier model based on Morgan fingerprints (b), trained and tested using data from a screen of 39,312 molecules (Fig. 1 of the main text). The black dashed line represents the baseline fraction of active compounds in the training set (1.3%). Blue curves and the 95% confidence interval indicate the variation generated by bootstrapping. AUC, area under the curve.