Author manuscript; available in PMC: 2025 Apr 18.
Published in final edited form as: Cell Chem Biol. 2023 Nov 28;31(4):712–728.e9. doi: 10.1016/j.chembiol.2023.10.026

Figure 2. A multi-task graph neural network enriches predictions for compounds with killing activity.

A) Models that take chemical fingerprints as input are compared with the graph neural network (GNN) by auROC. The random forest classifier (RFC), support vector machine (SVM), and feed-forward neural network (FFN) were evaluated on several 10% test sets held out from folds with scaffold splits. Each point denotes one model in the ensemble. Asterisks denote p < 0.05 (two-sided Mann-Whitney U test).

B) The multi-task GNN predicts both growth inhibition and stationary-phase killing by aggregating information in neighborhoods of atoms and bonds, shown here as a graph representation of an arbitrary compound (e.g., aspirin).

C) Predicted high-activity compounds in clusters of similar chemical structure, such as polymyxin-like structures (Cluster 10) and carbazole-containing structures (Cluster 13).

D) Comparison of killing activity (y-axis) and growth inhibitory activity (x-axis) of the 86 top-scoring predictions.

E) Compounds validated to have killing activity from the 36 top-scoring predictions after model retraining and strict similarity and drug-likeness filtering. For this and all subsequent similarity comparisons, a more comprehensive set of known antibiotics was used than for the initial antibiotic similarity filtering.

F) Experimentally validated compounds predicted by the models and compounds identified in the primary screen, evaluated for similarity to known antibiotics and antiseptics. ML-curated compounds are derived from both rounds of model prediction on the Broad 800K compound library and larger chemical vendor libraries, respectively. Asterisks denote p < 0.05 (two-sided Mann-Whitney U test).
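The comparison in panel A rests on two related quantities: the auROC of each model and a two-sided Mann-Whitney U test between groups of auROC values. The sketch below is illustrative only and is not the authors' code. It computes both from average ranks in plain Python, using the identity that auROC equals the U statistic of positive versus negative scores divided by the number of positive-negative pairs. The function names and the auROC values in the usage section are hypothetical, and the p-value uses a tie-corrected normal approximation without continuity correction, so it can differ from an exact small-sample test.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U statistic of x versus y and a two-sided p-value.

    Assumption: normal approximation with tie correction and no
    continuity correction (not an exact test).
    """
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    n = n1 + n2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


def auroc(scores, labels):
    """auROC as U(positives vs negatives) / (n_pos * n_neg)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    u, _ = mann_whitney_u(pos, neg)
    return u / (len(pos) * len(neg))


# Hypothetical usage: made-up per-model auROC values for two ensembles.
gnn_aurocs = [0.90, 0.91, 0.92, 0.93, 0.94]
rfc_aurocs = [0.80, 0.81, 0.82, 0.83, 0.84]
u_stat, p_value = mann_whitney_u(gnn_aurocs, rfc_aurocs)
significant = p_value < 0.05  # the threshold marked by asterisks in the figure
```

In practice, library routines such as `scipy.stats.mannwhitneyu` and `sklearn.metrics.roc_auc_score` would likely be used instead; the hand-written version only makes the rank-based relationship between the two quantities explicit.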