Author manuscript; available in PMC: 2023 Aug 12.
Published in final edited form as: ACS Infect Dis. 2022 Jul 27;8(8):1553–1562. doi: 10.1021/acsinfecdis.2c00189

Random Forest Model Predictions Afford Dual-Stage Antimalarial Agents

Haseeb Mughal 1,¥, Elise C Bell 2,¥, Khadija Mughal 1, Emily R Derbyshire 2,3,*, Joel S Freundlich 1,4,*
PMCID: PMC9987178  NIHMSID: NIHMS1870301  PMID: 35894649

Abstract

The need for novel antimalarials is apparent given the continuing disease burden worldwide, despite significant drug discovery advances from the bench to the bedside. In particular, small molecule agents with potent efficacy against both the liver and blood stages of Plasmodium parasite infection are critical for clinical settings as they would simultaneously prevent and treat malaria with a reduced selection pressure for resistance. While experimental screens for such dual-stage inhibitors have been conducted, the time and cost of these efforts limit their scope. Here, we have focused on leveraging machine learning approaches to discover novel antimalarials with such properties. A random forest modeling approach was taken to predict small molecules with in vitro efficacy versus liver stage Plasmodium berghei parasites and a lack of human liver cell cytotoxicity. Empirical validation of the model was achieved with the realization of hits with liver stage efficacy after prospective scoring of a commercial diversity library and consideration of structural diversity. A subset of these hits also demonstrated promising blood stage Plasmodium falciparum efficacy. These 18 validated dual-stage antimalarials represent novel starting points for drug discovery and mechanism of action studies with significant potential for seeding a new generation of therapies.

Mughal and colleagues present a computational platform that leverages high-throughput screening data to afford experimentally validated dual-stage antimalarial agents of relevance to downstream drug discovery and target identification efforts. Next generation therapeutic approaches may benefit from this time- and cost-efficient approach.

Keywords: Random Forest, Plasmodium spp., dual-stage, antimalarial



Malaria, caused by the parasite Plasmodium spp., continues to impact global morbidity and mortality. In 2020, over 241 million cases of malaria in 87 endemic countries were reported1, but our ability to combat this disease is hindered by emerging parasite resistance that is steadily reducing the effectiveness of all currently used antimalarials2. New chemical agents of novel chemotypes that target multiple stages of the Plasmodium life cycle are urgently needed. Plasmodium spp. exhibit a complex life cycle that involves a multiplicity of stages within the human host and the mosquitoes that transmit the disease. This cycle within the human host begins with the bite of a Plasmodium-infected Anopheles mosquito, which transmits sporozoites into the bloodstream3. Sporozoites then migrate to the liver where they traverse several cells before invasion4, 5. After this event, sporozoites dedifferentiate and replicate asexually, generating exo-erythrocytic forms3, 6. After maturation, tens of thousands of blood infective merozoites are released from a single liver cell into the blood stream7. Merozoites then cyclically invade, replicate and burst erythrocytes, generating the debilitating symptoms of malaria. Small molecules that inhibit asexual blood stage Plasmodium parasites are critical for disease treatment. In contrast, chemical entities capable of targeting the obligatory, asymptomatic liver stage of Plasmodium provide a means to prevent the disease and block transmission. Consequently, the discovery of molecules with dual-stage activity against the Plasmodium blood and liver stages would be highly valuable to malaria drug development efforts.

Phenotypic high-throughput screening (HTS) has highlighted thousands of small molecules with activity against Plasmodium blood stage parasites8–12 and liver stage parasites13–17 as well as parasite transmission18–24. Specifically, molecules have been evaluated for efficacy against blood-stage P. falciparum, the species that contributes most to disease mortality. This human-infective parasite can be continuously cultured to support small molecule inhibitor screening25. HTS to identify liver stage Plasmodium inhibitors has also been completed utilizing rodent-infective P. berghei or P. yoelii models13, 26. These in vitro models require the harvesting of Plasmodium spp. sporozoites from Anopheles mosquitoes to infect hepatoma cells. Unfortunately, low in vitro infection rates and the dissection of live Plasmodium-infected mosquitoes create a bottleneck for identifying molecules with liver stage activity. Dual liver and blood stage inhibitors are ideal as they would enable prophylaxis to prevent disease from developing while also addressing symptoms if the infection has already progressed. Dual-stage inhibitors would additionally limit transmission and impede the development of drug-resistant parasites27. While campaigns have been completed to identify dual-stage inhibitors15, 28–33, a significant drug discovery need for new hit and lead compounds remains. Compounding the urgent need for dual-stage antimalarials are the relatively limited resources for antimalarial drug discovery compared to other human diseases such as cancer. Thus, innovative alternative strategies are needed to reduce both the time and cost for dual-stage malaria drug discovery.

A machine learning strategy for finding novel dual-stage small molecule antimalarial hit compounds appears a fitting and innovative solution. This general strategy has only been reported with blood stage inhibitor data to predict novel compounds with activity against the asexual blood stage of P. falciparum34, 35. Thus, an opportunity was realized, given the paucity of promising antimalarials with this type of activity, to leverage liver stage inhibitor data to first predict molecules with liver stage efficacy in the absence of significant cytotoxicity to the liver cell, and then to experimentally select for dual-stage activity. For the first time, we have applied machine learning to liver stage Plasmodium spp. HTS data to predict new active compounds against this clinically silent liver stage of infection. A significant fraction of these antimalarials additionally demonstrated potent blood stage efficacy. We present herein the results of our studies where computation has leveraged HTS data to afford candidate dual-stage antimalarials with translational promise.

RESULTS

Random Forest Model Construction.

The machine learning approach relied on a binary classification to accurately differentiate active versus inactive compounds with a random forest model36. The data set comprised 5,972 small molecules that were screened at a concentration of 5.0 μg/mL for the inhibition of P. berghei ANKA parasite load in human hepatoma HepG2 cells in addition to their cytotoxicity to HepG2 cells37, 38 (Supplementary Table 1). Compounds that exhibited ≥85% inhibition of P. berghei ANKA load and afforded hepatocyte growth of ≥50%, with respect to the no-drug controls, were assigned as active compounds. Molecules that did not meet these criteria were assigned as inactive compounds. This, in essence, was a dual-event approach. Our earlier efforts with the prediction of antitubercular agents via a dual-event approach39 uncovered active compounds with a success rate of 5 out of 7. With this binary classification, the liver stage data set was highly unbalanced with 245 actives (4.1%) and 5,727 inactives (95.9%). A 60:20:20 stratified split of the data set afforded training, cross-validation and external test sets. The external test set was augmented by an additional 116 compounds, derived from two previously reported liver stage screening campaigns15, 26.
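The dual-event labeling rule and the stratified 60:20:20 split described above can be illustrated with a minimal pure-Python sketch. The function names, the random seed, and the rounding scheme are assumptions made for illustration; the actual split was performed inside the Knime workflow, whose sampling details may differ.

```python
import random

def label_active(parasite_inhibition_pct, hepg2_growth_pct):
    """Dual-event call: >=85% parasite load inhibition AND >=50% HepG2 growth."""
    return parasite_inhibition_pct >= 85.0 and hepg2_growth_pct >= 50.0

def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return three index lists whose active/inactive ratios mirror the full set."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    parts = ([], [], [])
    for cls in (True, False):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut1 = round(fractions[0] * len(idx))
        cut2 = cut1 + round(fractions[1] * len(idx))
        parts[0].extend(idx[:cut1])      # training set
        parts[1].extend(idx[cut1:cut2])  # cross-validation set
        parts[2].extend(idx[cut2:])      # external test set
    return parts

# Toy labels matching the reported class counts (245 actives, 5,727 inactives).
labels = [True] * 245 + [False] * 5727
train, cv, test = stratified_split(labels)
print(len(train), len(cv), len(test))
```

Splitting each class separately keeps the roughly 4% active fraction in all three subsets, which matters when actives are this scarce.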

Machine learning efforts focused on random forest models in the Knime Version 4.2.2 (www.knime.com) modeling environment (Fig. 1). Details as to data preprocessing, feature selection with two-dimensional descriptors, and hyperparameter optimization are contained in the Methods section. The first random forest models were created with a standard probability threshold of 0.50 and expectedly performed poorly (Supplementary Table 2) given the unbalanced nature of the training set. In contrast, considering all possible thresholds (0.0 – 1.0) resulted in significantly better performance (Supplementary Table 2). Subsequently, feature selection was considered in the optimizations with square root sampling compared to linear sampling. The five best models from consideration of feature selection with the square root sampling method were noted as models 1 – 5 (Supplementary Table 3). These models were further optimized for their hyperparameters. Optimizations considered 1) only the number of trees or 2) the number of trees, maximum tree depth, and minimum node split. The combinations of hyperparameters that performed best are listed as models 1a – 5a (Supplementary Table 3). Comparing the average balanced accuracy with secondary consideration of the area under the curve for the Receiver Operating Characteristic curve (AUC; where AUC = 1.0 corresponds to a perfect model, while an AUC = 0.5 is no better than random)40 from a ten-fold cross-validation and Matthews Correlation Coefficient ($\mathrm{MCC} = \frac{\mathrm{TP}\times\mathrm{TN} - \mathrm{FP}\times\mathrm{FN}}{\sqrt{(\mathrm{TP}+\mathrm{FP})(\mathrm{TP}+\mathrm{FN})(\mathrm{TN}+\mathrm{FP})(\mathrm{TN}+\mathrm{FN})}}$; a chance-corrected statistic where an MCC of 1 indicates perfect agreement, an MCC of −1 indicates total disagreement, and an MCC of 0 indicates the model is no better than chance; TP = true positives, FP = false positives, TN = true negatives, FN = false negatives)41, we determined models 2a and 4a were the two best performing models on the cross-validation and external test sets.
This model building and performance comparison were then repeated with the same workflow except that linear sampling was substituted for square root sampling. The five best models were noted both without (models 6 – 10) and with hyperparameter optimization (6a – 10a) (Supplementary Table 4). While models 6a and 8 performed best amongst this subset, they were inferior to models 2a or 4a.
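To illustrate why scanning the probability threshold matters for an unbalanced set, the following sketch computes balanced accuracy and MCC from a confusion matrix and searches thresholds between 0.0 and 1.0. It is a simplified stand-in for the Knime optimization loop; the function names, the 0.01 step, and the toy scores are assumptions.

```python
import math

def confusion(y_true, scores, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are called active."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = s >= threshold
        if pred and y:
            tp += 1
        elif pred and not y:
            fp += 1
        elif not pred and y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity."""
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return (sens + spec) / 2

def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient; defined as 0 when the denominator is 0."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def best_threshold(y_true, scores, steps=100):
    """Scan thresholds 0.00..1.00 and keep the first with the highest balanced accuracy."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda t: balanced_accuracy(*confusion(y_true, scores, t)))

# Toy example: two actives among ten compounds, all with low predicted probabilities.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
scores = [0.30, 0.05, 0.10, 0.02, 0.08, 0.12, 0.01, 0.04, 0.06, 0.20]
t = best_threshold(y_true, scores)
print(t, balanced_accuracy(*confusion(y_true, scores, t)))
```

In this toy case the default cutoff of 0.50 calls every compound inactive (balanced accuracy 0.5), whereas a lower threshold separates the two classes.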

Figure 1. Random forest workflow for the prediction of liver stage Plasmodium activity without host cytotoxicity.

The workflow (Supplementary File 1 – Malaria workflow.knwf) was created and executed within the Knime Version 4.2.2 modeling environment (www.knime.com).

Random Forest Model Prospective Testing.

To score a library of compounds, we looked to select a single model. Inspection of the statistics for models 2a and 4a (Supplementary Table 3) showed them to be equivalent. We, thus, decided to proceed with the simpler model (2a) which only optimized the number of trees. With model 2a, a library of over 1.5 million ChemDiv (www.chemdiv.com) compounds was scored. Amongst the top 275 and bottom 100 scoring compounds, comprising the most and least likely, respectively, to meet the model’s classification of activity, the most diverse compounds with respect to the training set and each other were chosen for biological testing. From this analysis, 120 compounds (100 predicted to be active and 20 predicted to be inactive) were purchased and subsequently tested for their inhibition of liver stage P. berghei ANKA parasites and blood stage P. falciparum 3D7 parasites as well as cytotoxicity to in vitro cultured HepG2 cells at the single point concentration of 15 μM (Figure 2, Supplementary Figure 1, and Supplementary Table 5). It was not possible to standardize the concentration tested between the training (μg/mL) and external (μM) libraries due to their availability from vendors. Therefore, we tested at 15 μM with a more stringent inhibition cutoff. This enhanced the probability of identifying compounds that led to the near complete inhibition of the parasite at a reasonable screening concentration.

Figure 2. Summary of the prospective antimalarial testing.

Heat map depicting the biological activity profiles of the 120 candidates (100 predicted actives and 20 predicted inactives). The predicted active and inactive compounds may be viewed with regard to their inhibition of HepG2 cells, the P. falciparum 3D7 blood stage and the P. berghei ANKA liver stage. Compounds identified as dual-stage, liver stage (LS) and blood stage (BS) actives, in the absence of HepG2 cytotoxicity, are indicated. Data shown are the average of triplicate measurements and scaled between 0 – 100% after normalization to controls.

Among the predicted inactive compounds, 19 of the 20 (95%) failed to demonstrate ≥90% inhibition of liver stage P. berghei ANKA parasite load. In contrast, 26 of the 100 (26%) candidate active compounds exhibited ≥90% liver stage P. berghei ANKA parasite load inhibition with less than 50% HepG2 cell growth inhibition. Of note, many other molecules significantly reduced parasite load (50–90% inhibition) but were not defined as actives since they did not meet our stringent selection criteria. To determine if liver stage active molecules were enriched in the test set, the P. berghei ANKA parasite load was compared to the unbiased HTS data set of 5,972 molecules used for model construction and validation (Supplementary Figure 2). We observed that our library of predicted actives had a significantly lower average parasite load signal (p < 0.0001, unpaired Student’s t test), suggesting there was an enrichment of liver stage active molecules.

Identification of Dual-Stage Antimalarials.

Of these 26 compounds, 20 also showed ≥90% inhibition of blood stage P. falciparum 3D7 parasites. Interestingly, when the entire test set was tested for activity against the blood stage of P. falciparum 3D7, 40 of the 100 (40%) compounds demonstrated ≥90% inhibition with less than 50% HepG2 cell growth inhibition. By comparison, none of the predicted inactive compounds demonstrated ≥90% inhibition of blood stage P. falciparum 3D7 parasites. These results further support the ability of the random forest model to successfully predict liver stage anti-plasmodial activity (p = 0.0002, unpaired Student’s t test) and also perhaps unexpectedly blood stage anti-plasmodial activity (p < 0.0001, unpaired Student’s t test) (Supplementary Figure 1).

The 20 compounds with single concentration dual-stage activity were then tested in dose response for reduction of parasite load in assays with liver stage P. berghei ANKA or blood stage P. falciparum 3D7, in addition to the growth inhibition of HepG2 cells. These studies validated 18 out of 20 compounds as dual-stage hits with liver stage EC50 values of 0.53 – 19 μM, blood stage EC50 values of 0.66 – 14 μM, and HepG2 CC50 values of ≥40 μM (where EC50 was the minimum compound concentration to afford 50% inhibition of the parasite load and CC50 was the minimum compound concentration to afford 50% inhibition of the liver cell line) (Table 1). Amongst the dual-stage hits, we noted common motifs such as the nitrofuran (10/18 compounds), thiazole/thiadiazole (4/18 compounds), and N-(2,2,2-trichloro-1-(phenylamino)ethyl)benzamide (3/18 compounds). None of the eighteen compounds were noted for their antimalarial efficacy in searches of SciFinder (scifinder.cas.org), PubChem (https://pubchem.ncbi.nlm.nih.gov), or ChEMBL (https://www.ebi.ac.uk/chembl/). The 18 dual-stage inhibitors differentially occupy a composite activity-chemical space (Figure 3). The chemical space of these molecules is distinct from that of the training set given their mean pairwise Tanimoto similarities (with respect to the training set) of ca. 0.3. Molecules 1501–0649, 3305–0127, and 8002–2180 are representative of prioritized chemotypes and notably exhibit different degrees of inhibition of liver stage versus blood stage parasites (Figure 4). 3305–0127 is liver stage selective (8-fold), 1501–0649 is blood stage selective (6-fold), and 8002–2180 displayed similar efficacy against the different parasite stages (1.6-fold).
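The stage selectivities quoted above follow directly from the ratio of the two EC50 values in Table 1. A minimal sketch of that arithmetic is shown here; the function name is hypothetical and the EC50 pairs are copied from Table 1.

```python
def fold_selectivity(liver_ec50_um, blood_ec50_um):
    """Return (preferred stage, fold difference) from two EC50 values in μM."""
    if liver_ec50_um <= blood_ec50_um:
        return "liver", blood_ec50_um / liver_ec50_um
    return "blood", liver_ec50_um / blood_ec50_um

# (liver stage EC50, blood stage EC50) in μM, taken from Table 1.
table1 = {
    "3305-0127": (0.53, 4.4),
    "1501-0649": (4.1, 0.66),
    "8002-2180": (4.2, 2.6),
}
for compound_id, (liver, blood) in table1.items():
    stage, fold = fold_selectivity(liver, blood)
    print(f"{compound_id}: {fold:.1f}-fold toward the {stage} stage")
```

A lower EC50 means higher potency, so the stage with the smaller value is the preferred one.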

Table 1.

Biological profiling of the validated dual-stage antimalarials.

Compound ID    Chemical Structure    Liver Stage EC50 (μM)    Blood Stage EC50 (μM)
3305–0127      (image)               0.53                     4.4
1936–1986      (image)               1.2                      3.0
1936–1704      (image)               1.5                      3.2
2181–0317      (image)               1.6                      1.4
8008–8109      (image)               2.0                      3.2
2207–0038      (image)               2.2                      4.7
4227–4396      (image)               2.4                      2.2
4373–2978      (image)               3.4                      5.9
1897–1416      (image)               3.8                      11
8007–7439      (image)               7.3                      13
8009–8306      (image)               3.9                      1.3
8002–2180      (image)               4.2                      2.6
8001–5597      (image)               6.9                      5.0
8020–3806      (image)               5.6                      7.1
1501–0649      (image)               4.1                      0.66
1501–0954      (image)               5.0                      2.4
1503–0792      (image)               6.7                      2.1
V007–0627      (image)               19                       14

Figure 3. Three-dimensional plot of dual-stage hits.

Compounds are represented by spheres plotted against the liver stage P. berghei ANKA potency, blood stage P. falciparum 3D7 potency, and Tanimoto coefficients (mean pairwise with respect to the 200 most diverse amongst the top-scoring 275 candidates). The plot was generated in Spotfire (www.tibco.com).

Figure 4. EC50 values of prioritized compounds.

Structures and dose-response curves of (A) 1501–0649, (B) 3305–0127 and (C) 8002–2180 against the P. berghei ANKA liver stage (green circles) and P. falciparum 3D7 blood stage (blue circles).

DISCUSSION

The efforts herein constitute, to the best of our knowledge, the first reported utilization of machine learning methods to predict in vitro antimalarial liver stage efficacy. This work builds on our previous efforts predicting antibacterial in vitro efficacy39, 40, 42, 43 and molecular properties44–47 critical to the attainment of high quality chemical tools and drug discovery agents. Leveraging a data set of 5,972 compounds, a random forest model was constructed, validated, and then utilized to predict the liver stage efficacy of drug-like small molecules from a ChemDiv diversity library. The model performed well in our opinion, making correct predictions amongst labeled active (26%) and inactive candidates (95%) as based on single concentration assays. This performance is consistent with our experience with our in vitro efficacy models for a range of bacteria39, 40, 43, 48. The machine learning approach offers hit rates considerably better than typical empirical screening (≤0.5%) while demonstrating a significant ability to predict inactive compounds. For resource-limited screening efforts, especially for parasitic diseases such as malaria, we view this as a meaningful savings in time and cost to identify active compounds for downstream study.

Our liver stage modeling complements previously published antimalarial machine learning modeling efforts. For example, a recent effort49 combined a generative approach for the design of candidate antimalarials with a quantitative structure-activity relationship model for in vitro blood stage activity50. Two molecules, with a high similarity to the training set (i.e., the closest training set compound exhibited a Tanimoto similarity of 0.67), were designed, synthesized, and demonstrated sub-micromolar EC50 values in a blood stage assay with the P. falciparum 3D7 strain. Another report leveraged a naïve Bayesian modeling approach with a large compendium of blood stage inhibition screens to produce different predictive strategies51. These were subjected to external validation, although prospective scoring and experimental testing with other sets of molecules were not discussed. Neves and colleagues have reported both binary classifier and regression models to learn from blood stage activity and mammalian cell (NIH/3T3) cytotoxicity assay data, leveraging deep learning approaches with Keras and TensorFlow52. A combination of models was used to select five molecules from a commercial diversity library of 486,155 compounds. One candidate demonstrated a sub-micromolar EC50 value versus the P. falciparum 3D7 strain, while it and a second candidate exhibited similar efficacy versus the W2 strain. Summarily, to the best of our knowledge, these previous reports have focused on predicting solely blood stage efficacy and have relied on significantly smaller scale prospective testing as compared to our efforts with 120 candidates.

From the prospective testing with our model for liver stage efficacy, we were able to identify 18 novel dual-stage antimalarials as validated in dose-response assays. These may be characterized as principally falling into the nitrofuran, thiazole/thiadiazole, and N-(2,2,2-trichloro-1-(phenylamino)ethyl)benzamide chemotypes. While examples of each parental chemotype may be found in the antimalarial literature11, 53, 54, we assert that the dual-stage antimalarial hits disclosed herein by virtue of their novelty of overall structure may offer additional opportunities to evolve molecules of significance for mechanism of action studies to establish new drug targets and drug discovery entities of translational value.

Stemming from our ability to identify liver stage inhibitors and from these a high percentage of dual-stage antimalarials (18 out of 26), we were drawn to contemplate the connection between blood stage and liver stage efficacy. To the best of our knowledge, a systematic study has not been reported that compares both activities for a specific library of small molecules tested at the same concentration in both screens. We note that previous liver stage screens of libraries of compounds active against the blood stage demonstrated a moderate correlation between potencies of hit compounds in the liver and blood stage13, 15. With regard to our experimental results, an analysis of the single-concentration blood stage activity of the 100 liver stage candidates (Supplementary Figure 3) demonstrated two distinct populations using a cutoff of 50% reduction in parasite load. With this cutoff, the 59 blood stage actives and 41 blood stage inactives were described using a t-distributed Stochastic Neighbor Embedding (t-SNE) analysis55 (Supplementary Figure 4) which failed to visually separate the two sets. Subsequently, we calculated the mean and standard deviation for each of the two-dimensional descriptors (MOE Version 2019.0102, Chemical Computing Group) for these blood stage active and inactive sets. Of the ten descriptors which displayed the largest difference in mean value between these two sets of molecules that were statistically significant (unpaired Student’s t test, p < 0.01), we, in particular, noted the importance of various estimations of atomic charge and van der Waals surface area (Supplementary Table 6). Future efforts in the computational and experimental realms will seek to better comprehend the physicochemical and chemical properties that correlate with blood stage activity as well as liver stage activity.

CONCLUSIONS

The need for new starting points for dual-stage antimalarials has motivated our contribution at the interface between computation, chemistry, and biology. Data from previous high-throughput phenotypic screens have enabled the construction of a random forest model to learn the chemical and physicochemical properties of molecules with respect to the inhibition of P. berghei ANKA liver stage infection in the absence of HepG2 cell cytotoxicity. In conjunction with experimental assessment, the model’s scoring of a commercial diversity library afforded small molecule hits with the desired properties. A subset of these molecules also demonstrated potent in vitro efficacy versus blood stage P. falciparum 3D7 parasites. These dual-stage antimalarials represent promising starting points for drug discovery and mechanism of action studies that would form the basis of novel therapeutic strategies to impact this disease of global health significance.

METHODS

Materials.

HepG2 cells were obtained from the Duke Cell Culture Facility, Anopheles stephensi mosquitoes infected with luciferase-expressing P. berghei ANKA were obtained from the SporoCore (University of Georgia, Athens), and P. falciparum 3D7 parasites (MRA-102) were obtained from BEI Resources. The sets of predicted parasite inhibitors and the predicted inactive compounds were purchased from ChemDiv (www.chemdiv.com). Compounds in 96-well plates were reconstituted in DMSO to 10 mM and stored at −80°C.

Data Preprocessing.

All compounds and their SMILES strings were uploaded into the Molecular Operating Environment (MOE; Version 2018.01, Chemical Computing Group) where the two-dimensional descriptors for the compounds were calculated. A CSV file containing the 2D descriptors, parasite activity, and SMILES string for each compound was uploaded into Knime (Version 4.2.2). The complete Knime workflow is available as Supplementary File 1. All compounds were stripped of any counterions. The RDKit fingerprint node (Version 2020.03.1; http://www.rdkit.org/docs/Install.html) was used to obtain a 1024-bit Morgan fingerprint. To the best of our knowledge, Knime does not enable raw fingerprints and features to be processed together, so the Morgan fingerprint was split into 1024 single-bit columns, with each integer of the fingerprint string becoming its own feature. A Gaussian normalization was applied to the features, with the exception of the 1024 fingerprint bit features and the dual-event activity call. The 5,972-member dataset37, 38 was split (60:20:20) using stratified sampling to arrive at the training set, cross-validation set, and an external test set that was fortified with an additional 116 compounds (113 active and 3 inactive) from two previously reported liver stage screens15, 26. In accordance with our previous work47, the features were filtered to remove those with a low variance (selected via a parameter optimization loop where values from 0.0 to 1.0 were tested to determine the value that resulted in the largest balanced accuracy), a constant value, or a linear correlation with R² greater than 0.90. During the feature optimization, fingerprints were excluded.
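The descriptor preprocessing described above (Gaussian normalization, then removal of constant, low-variance, and highly correlated features) can be sketched in plain Python as follows. This is an illustrative approximation, not the Knime nodes themselves: the function names and descriptor names are hypothetical, and the variance cutoff is applied to the values exactly as passed in.

```python
import math

def zscore(column):
    """Gaussian (z-score) normalization: zero mean, unit standard deviation."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [0.0 if sd == 0 else (x - mean) / sd for x in column]

def variance(column):
    """Population variance of one feature column."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)

def pearson_r2(a, b):
    """Squared Pearson correlation between two equal-length columns."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 0.0 if va == 0 or vb == 0 else cov * cov / (va * vb)

def filter_features(features, min_variance, max_r2=0.90):
    """Drop constant/low-variance columns, then any column whose R^2 with an
    already kept column exceeds max_r2. `features` maps name -> list of values."""
    kept = {}
    for name, col in features.items():
        if variance(col) <= min_variance:
            continue
        if any(pearson_r2(col, other) > max_r2 for other in kept.values()):
            continue
        kept[name] = col
    return kept

# Hypothetical descriptor columns for four compounds.
features = {
    "desc_a": [1, 2, 3, 4],
    "desc_b": [2, 4, 6, 8.1],  # nearly collinear with desc_a
    "desc_c": [5, 5, 5, 5],    # constant
    "desc_d": [4, 1, 3, 2],
}
kept = filter_features(features, min_variance=0.0)
print(list(kept))
```

In the reported workflow the variance cutoff itself was tuned in a loop over 0.0 to 1.0 for the best balanced accuracy; here it is simply passed in as `min_variance`.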

Model Optimization.

The Youden’s index (sensitivity + specificity – 1) was selected within the Knime platform as the optimal metric for maximizing model performance since it accounted for performance on inactive and active compounds simultaneously. It should be noted that sensitivity = TP/(TP + FN) and specificity = TN/(TN + FP) where TP = the number of true positives, TN = the number of true negatives, FP = the number of false positives, and FN = the number of false negatives (positive = active; negative = inactive). Also calculated for each model were the AUC ROC and MCC, as defined in the main text. During the course of the optimizations, the use of balanced accuracy (equal to the mean of sensitivity and specificity) was found to be more intuitive than Youden’s index. Given their mathematical relationship, we have chosen in the text to discuss optimization of balanced accuracy although explicitly Youden’s index was being maximized.
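The mathematical relationship mentioned above can be written out explicitly. With the definitions of sensitivity and specificity given in this section, Youden's index $J$ and balanced accuracy (abbreviated BA here for convenience) are related linearly:

```latex
\begin{align*}
J &= \text{sensitivity} + \text{specificity} - 1, \\
\mathrm{BA} &= \tfrac{1}{2}\,(\text{sensitivity} + \text{specificity}), \\
J &= 2\,\mathrm{BA} - 1 .
\end{align*}
```

Because $J$ increases monotonically with BA, any threshold or hyperparameter setting that maximizes one also maximizes the other. For example, a sensitivity of 0.80 and a specificity of 0.90 give BA = 0.85 and $J$ = 0.70.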

Tree Ensemble Hyperparameters.

A series of hyperparameter optimization loops was conducted to determine the optimum number of trees, minimum node split, and maximum tree depth. For each of the top five models from the two feature selection loops (linear versus square root sampling), a parameter optimization loop for the number of trees to maximize balanced accuracy was conducted. The number of trees between 100 and 1000 with a step size of 20 was tested. Once an initial optimum number of trees was found, that number ±20 with a step size of one was examined in a loop. The resulting optimum tree number was then used in a parameter optimization loop to determine the maximum depth of the trees, searching between 1 and 300 with a step size of 20. When the initial optimum maximum depth level was identified, that number ±20 with a step size of one was examined in a loop. The resulting optimum maximum tree depth, along with the optimum number of trees, was passed to the last loop, which tested the minimum node split of the trees between 1 and 300 with a step size of 20. When the initial optimum minimum node split was identified, that number ±20 with a step size of one was examined in a loop. Each loop was performed while optimizing the threshold to maximize Youden’s index (and, similarly, balanced accuracy). The final model utilized the optimum number of trees, depth, and minimum node split, and was compared to its corresponding model with standard hyperparameters.
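The coarse-then-fine search pattern used for each hyperparameter can be sketched generically as below. The helper name and the toy objective are assumptions; in the real workflow each evaluation would retrain the tree ensemble and return its balanced accuracy.

```python
def coarse_to_fine(score, low, high, coarse_step=20, window=20):
    """Scan [low, high] in coarse steps, then every integer within
    +/- window of the coarse optimum. Returns the best integer value."""
    coarse = range(low, high + 1, coarse_step)
    best = max(coarse, key=score)
    fine = range(max(low, best - window), min(high, best + window) + 1)
    return max(fine, key=score)

# Hypothetical stand-in for "train a forest with n trees and return its
# balanced accuracy"; this toy objective simply peaks at 437 trees.
def toy_score(n_trees):
    return -abs(n_trees - 437)

best_trees = coarse_to_fine(toy_score, 100, 1000)
print(best_trees)
```

The coarse pass needs 46 evaluations and the fine pass 41, compared with 901 for an exhaustive scan of 100 to 1000 at a step of one. The same helper would then be called with bounds of 1 and 300 for the maximum depth and the minimum node split, holding the earlier optima fixed.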

Scoring Compounds.

A library of 1.5 million ChemDiv compounds (www.chemdiv.com) was scored with the best model (2a). Among the predicted active compounds, the top 275 compounds were selected and compared to the training set with a calculation of mean pairwise Tanimoto similarity. The 200 most dissimilar compounds to the training set were then compared to one another and 100 of these compounds with the lowest similarity values were chosen for biological testing. Among the predicted inactive compounds, the bottom 100 compounds were selected and compared to the training set through calculation of each compound’s mean pairwise Tanimoto similarity. The 50 most dissimilar compounds to the training set were then compared to one another and 20 of these compounds with the lowest similarity values were chosen for biological assessment. The Python code for these similarity calculations may be found in Supplementary File 2 (Diversity filtering workflow.ipynb).
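The two-stage diversity filter described above can be sketched as follows. This is not the code from Supplementary File 2; it is a simplified illustration in which fingerprints are represented as Python sets of on-bit indices, and all function names and toy fingerprints are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def mean_similarity(fp, others):
    """Mean pairwise Tanimoto similarity of one fingerprint to a collection."""
    others = list(others)
    if not others:
        return 0.0
    return sum(tanimoto(fp, o) for o in others) / len(others)

def diversity_pick(candidates, training, n_vs_training, n_final):
    """Stage 1: keep the n_vs_training candidates least similar to the training set.
    Stage 2: among those, keep the n_final least similar to one another."""
    ranked = sorted(candidates, key=lambda name: mean_similarity(candidates[name], training))
    pool = {name: candidates[name] for name in ranked[:n_vs_training]}

    def internal(name):
        rest = [fp for other, fp in pool.items() if other != name]
        return mean_similarity(pool[name], rest)

    return sorted(pool, key=internal)[:n_final]

# Toy fingerprints (sets of on-bit positions).
training = [{1, 2, 3}, {2, 3, 4}]
candidates = {"A": {1, 2, 3}, "B": {7, 8, 9}, "C": {8, 9, 10}, "D": {20, 21}}
print(diversity_pick(candidates, training, n_vs_training=3, n_final=2))
```

For the predicted actives, the reported selection corresponds to 275 top-scoring candidates with `n_vs_training=200` and `n_final=100`; for the predicted inactives, 100 bottom-scoring candidates with 50 and 20.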

P. berghei Liver Stage Assays.

HepG2 cells were maintained in Dulbecco’s Modified Eagle Medium (DMEM) with L-glutamine (Gibco) supplemented with 10% heat-inactivated fetal bovine serum (HI-FBS) (v/v) (Sigma-Aldrich) and 1% antibiotic-antimycotic (ThermoFisher Scientific) in a standard tissue culture incubator (37 °C, 5% CO2). P. berghei ANKA sporozoites used for liver stage experiments were isolated from freshly dissected mosquito salivary glands. P. berghei ANKA parasite load in hepatocytes was evaluated as previously described37. Briefly, HepG2 (8,000 cells/well) were seeded into 384-well white microplates (Corning). After 24 h, compounds (typically at 15 μM) were added using a multichannel pipette before infection with P. berghei ANKA sporozoites (4,000 sporozoites/well). Atovaquone (10 nM) and DMSO (1% v/v) were added as positive and negative controls, respectively. All samples were evaluated in triplicate and had a final DMSO concentration of 1%. After 48 h post-infection, HepG2 viability and parasite load were assessed using CellTiter-Fluor (Promega) and Bright-Glo (Promega) reagents, respectively, according to manufacturer’s protocols. Relative fluorescence and luminescence signals were measured using an EnVision plate reader (PerkinElmer). HepG2 viability was assessed by normalizing the signal intensity of each well to the negative control (1% DMSO). To assess relative parasite viability, the signal intensity of each well was normalized to the negative control (1% DMSO) and the positive control (10 nM atovaquone), or the highest concentration of compounds that lacked cytotoxicity. Parasite inhibition (%) was calculated using the formula:

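$$\%\,\text{Inhibition} = 100 \times \frac{\text{Signal}_{\text{Compound}} - \text{Signal}_{\text{Average Negative Control}}}{\text{Signal}_{\text{Average Positive Control}} - \text{Signal}_{\text{Average Negative Control}}}$$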

Dose-response studies (typically 0.31–40 μM, 8-point curve) were performed as described above, except that a D300 picoliter dispenser (Hewlett-Packard) was used for compound addition and samples were evaluated in duplicate. The Z-factor ranged from 0.5 to 0.6. EC50 values were then obtained by fitting the data to a standard inhibition dose-response curve (GraphPad Prism Version 9).
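Two quantities in this paragraph are easy to check numerically. The sketch below makes two assumptions that are not stated in the text: that the 8-point series is a two-fold serial dilution from 40 μM (which reproduces the reported lower bound of about 0.31 μM), and that the Z-factor is computed from the plate controls using the widely used definition 1 − 3(σ_pos + σ_neg)/|μ_pos − μ_neg|. The function names are hypothetical.

```python
from statistics import mean, stdev


def dilution_series(top, points, fold=2.0):
    """Concentrations of a serial dilution starting at `top` (assumed two-fold)."""
    return [top / fold ** i for i in range(points)]


def z_factor(positive_controls, negative_controls):
    """Control-based Z-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    mu_p, mu_n = mean(positive_controls), mean(negative_controls)
    if mu_p == mu_n:
        raise ValueError("control means are identical; Z-factor is undefined")
    sd_p, sd_n = stdev(positive_controls), stdev(negative_controls)
    return 1.0 - 3.0 * (sd_p + sd_n) / abs(mu_p - mu_n)


# An 8-point series from 40 uM ends at 0.3125 uM under the two-fold assumption.
print(dilution_series(40.0, 8))
```

Values of the Z-factor between 0.5 and 1 are conventionally taken to indicate good separation between the positive and negative controls.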

P. falciparum Blood Stage Assays.

P. falciparum 3D7 parasites were maintained in red blood cells (Gulf Coast Regional Blood Center) and cultured in RPMI 1640 medium supplemented with 0.5% (m/v) AlbuMAX II, 25 mM HEPES, pH 7.2, 24 mM sodium bicarbonate, 25 μg/mL gentamicin, and 50 μg/mL hypoxanthine in a standard tissue culture incubator (37 °C, 5% CO2). Synchronization was performed with 5% D-sorbitol (Sigma) as previously described (56). For parasite inhibition assays, P. falciparum 3D7 parasites were synchronized with 5% sorbitol at 37 °C for 10 min and adjusted to 2% parasitemia and 2% hematocrit. Parasites (100 μL) were dispensed into 96-well black microplates (Corning) containing complete medium (100 μL) in the presence or absence of compounds (15 μM). Quinacrine dihydrochloride (5.0 μM) and DMSO (0.5% v/v) were added as positive and negative controls, respectively. All samples were evaluated in triplicate and had a final DMSO concentration of 0.5%. Plates were incubated at 37 °C for 72 h before cell lysis by addition of 20 mM Tris-HCl, pH 7.5, 5 mM EDTA dipotassium salt dihydrate, 0.16% saponin, 1.6% Triton X-100, and fresh 1x SYBR Green I (ThermoFisher Scientific). Plates were incubated at room temperature in the dark for 24 h (12) before measuring fluorescence at 535 nm with excitation at 485 nm using an EnVision plate reader (PerkinElmer). To assess relative parasite viability, the signal intensity of each well was normalized to the negative control (0.5% DMSO) and the positive control (5.0 μM quinacrine dihydrochloride). Parasite inhibition (%) was calculated as described above. Dose-response studies (typically 0.31–40 μM, 8-point curve) were performed as described above, except that a D300 picoliter dispenser (Hewlett-Packard) was used for compound addition and samples were evaluated in duplicate. The Z-factor ranged from 0.5 to 0.6. EC50 values were then obtained by fitting the data to a standard inhibition dose-response curve (GraphPad Prism).

Supplementary Material

SI text and figures
Table S5
CSV
SI zip
Table S1

Acknowledgments

This work was supported by the US National Institutes of Health grants U19AI109713 (J.S.F.) and 1DP2AI138239 (E.R.D).

Abbreviations

HTS: high-throughput screening
AUC: area under the curve
MCC: Matthews Correlation Coefficient
TP: true positives
FP: false positives
TN: true negatives
FN: false negatives
t-SNE: t-distributed Stochastic Neighbor Embedding

Footnotes

Supporting Information

Bioactivity of predicted liver stage Plasmodium inhibitors; Comparison of the liver stage efficacy of the 5,972 compound liver stage data set versus the random forest predicted actives; Violin plot demonstrating two distinct populations in blood stage anti-plasmodial activity amongst the candidate liver stage actives; t-SNE visualization of the 100 candidate liver stage actives with respect to their empirical blood stage activity; Data for the small molecule antimalarial liver stage screen; Random forest models constructed with or without a fixed threshold; Random forest models constructed with feature selection and square root sampling; Random forest models constructed with feature selection and linear sampling; Data for the biological assay of the model predicted active and inactive compounds; Select MOE 2D features ascribed to the differential blood stage activity amongst the 100 liver stage candidate actives; Supplementary File 1: Malaria workflow.knwf; Supplementary File 2: Diversity filtering workflow.ipynb; Supplementary File 3: tSNE visualization.ipynb; Supplementary File 4: Predicted liver stage actives stratified by blood stage activity.csv.

References

  • (1) World Health Organization, World Malaria Report 2021.
  • (2) Haldar K; Bhattacharjee S; Safeukui I. Drug resistance in Plasmodium. Nat. Rev. Microbiol 2018, 16, 156–170.
  • (3) Ejigiri I; Sinnis P. Plasmodium sporozoite-host interactions from the dermis to the hepatocyte. Curr. Opin. Microbiol 2009, 12, 401–407.
  • (4) Mota MM; Pradel G; Vanderberg JP; Hafalla JC; Frevert U; Nussenzweig RS; Nussenzweig V; Rodriguez A. Migration of Plasmodium sporozoites through cells before infection. Science 2001, 291, 141–144.
  • (5) Amino R; Giovannini D; Thiberge S; Gueirard P; Boisson B; Dubremetz JF; Prevost MC; Ishino T; Yuda M; Menard R. Host cell traversal is important for progression of the malaria parasite through the dermis to the liver. Cell Host Microbe 2008, 3, 88–96.
  • (6) Ploemen IH; Prudencio M; Douradinha BG; Ramesar J; Fonager J; van Gemert GJ; Luty AJ; Hermsen CC; Sauerwein RW; Baptista FG; Mota MM; Waters AP; Que I; Lowik CWGM; Khan SM; Janse CJ; Franke-Fayard BMD. Visualisation and quantitative analysis of the rodent malaria liver stage by real time imaging. PLoS One 2009, 4, e7881.
  • (7) Prudencio M; Rodriguez A; Mota MM. The silent path to thousands of merozoites: the Plasmodium liver stage. Nat. Rev. Microbiol 2006, 4, 849–856.
  • (8) Plouffe D; Brinker A; McNamara C; Henson K; Kato N; Kuhen K; Nagle A; Adrian F; Matzen JT; Anderson P; Nam T-G; Gray NS; Chatterjee A; Janes J; Yan SF; Trager R; Caldwell JS; Schultz PG; Zhou Y; Winzeler EA. In silico activity profiling reveals the mechanism of action of antimalarials discovered in a high-throughput screen. Proc. Natl. Acad. Sci. USA 2008, 105, 9059–9064.
  • (9) Guiguemde WA; Shelat AA; Bouck D; Duffy S; Crowther GJ; Davis PH; Smithson DC; Connelly M; Clark J; Zhu F; Jimenez-Diaz MB; Martinez MS; Wilson EB; Tripathi AK; Gut J; Sharlow ER; Bathurst I; El Mazouni F; Fowble JW; Forquer I; McGinley PL; Castro S; Angulo-Barturen I; Ferrer S; Rosenthal PJ; Derisi JL; Sullivan DJ; Lazo JS; Roos DS; Risco MK; Phillips MA; Rathod PK; Van Voorhis WC; Avery VM; Guy RK. Chemical genetics of Plasmodium falciparum. Nature 2010, 465, 311–315.
  • (10) Avery VM; Bashyam S; Burrows JN; Duffy S; Papadatos G; Puthukkuti S; Sambandan Y; Singh S; Spangenberg T; Waterson D; Willis P. Screening and hit evaluation of a chemical library against blood-stage Plasmodium falciparum. Malar. J 2014, 13, 190.
  • (11) Gamo FJ; Sanz LM; Vidal J; de Cozar C; Alvarez E; Lavandera JL; Vanderwall DE; Green DV; Kumar V; Hasan S; Brown JR; Peishoff CE; Cardon LR; Garcia-Bustos JF. Thousands of chemical starting points for antimalarial lead identification. Nature 2010, 465, 305–310.
  • (12) Kato N; Comer E; Sakata-Kato T; Sharma A; Sharma M; Maetani M; Bastien J; Brancucci NM; Bittker JA; Corey V; Clarke D; Derbyshire ER; Dornan GL; Duffy S; Eckley S; Itoe MA; Koolen KMJ; Lewis TA; Lui PS; Lukens AK; Lund E; March S; Meibalan E; Meier BC; McPhail JA; Mitasev B; Moss EL; Sayes M; Van Gessel Y; Wawer MJ; Yoshinaga T; Zeeman A-M; Avery VM; Bhatia SN; Burke JE; Catteruccia F; Clardy JC; Clemons PA; Dechering KJ; Duvall JR; Foley MA; Gusovsky F; Kocken CHM; Marti M; Morningstar ML; Munoz B; Neafsey DE; Sharma A; Winzeler EA; Wirth DF; Scherer CA; Schreiber SL. Diversity-oriented synthesis yields novel multistage antimalarial inhibitors. Nature 2016, 538, 344–349.
  • (13) Meister S; Plouffe DM; Kuhen KL; Bonamy GM; Wu T; Barnes SW; Bopp SE; Borboa R; Bright AT; Che J; Cohen S; Dharia NV; Gagaring K; Gettayacamin M; Gordon P; Groessl T; Kato N; Lee MCS; McNamara CW; Fidock DA; Nagle A; Nam T-G; Richmond W; Roland J; Rottman M; Zhou B; Froissard P; Glynne RJ; Mazier D; Sattabongkot J; Schultz PG; Tuntland T; Walker JR; Zhou Y; Chatterjee A; Diagana TT; Winzeler EA. Imaging of Plasmodium liver stages to drive next-generation antimalarial drug discovery. Science 2011, 334, 1372–1377.
  • (14) da Cruz FP; Martin C; Buchholz K; Lafuente-Monasterio MJ; Rodrigues T; Sonnichsen B; Moreira R; Gamo FJ; Marti M; Mota MM; Hannus M; Prudencio M. Drug screen targeted at Plasmodium liver stages identifies a potent multistage antimalarial drug. J. Infect. Dis 2012, 205, 1278–1286.
  • (15) Raphemot R; Lafuente-Monasterio MJ; Gamo-Benito FJ; Clardy J; Derbyshire ER. Discovery of Dual-Stage Malaria Inhibitors with New Targets. Antimicrob. Agents Chemother 2015, 60, 1430–1437.
  • (16) Swann J; Corey V; Scherer CA; Kato N; Comer E; Maetani M; Antonova-Koch Y; Reimer C; Gagaring K; Ibanez M; Plouffe D; Zeeman A-M; Kocken CHM; McNamara CW; Schreiber SL; Campo B; Winzeler EA; Meister S. High-Throughput Luciferase-Based Assay for the Discovery of Therapeutics That Prevent Malaria. ACS Infect. Dis 2016, 2, 281–293.
  • (17) Antonova-Koch Y; Meister S; Abraham M; Luth MR; Ottilie S; Lukens AK; Sakata-Kato T; Vanaerschot M; Owen E; Jado JC; Maher SP; Calla J; Plouffe D; Zhong Y; Chen K; Chaumeau V; Conway AJ; McNamara CW; Ibanez M; Gagaring K; Serrano FN; Eribez K; Taggard CM; Cheung AL; Lincoln C; Ambachew B; Rouillier M; Siegel D; Nosten F; Kyle DE; Gamo F-J; Zhou Y; Llinas M; Fidock DA; Wirth DF; Burrows J; Campo B; Winzeler EA. Open-source discovery of chemical leads for next-generation chemoprotective antimalarials. Science 2018, 362, eaat9446.
  • (18) Duffy S; Avery VM. Identification of inhibitors of Plasmodium falciparum gametocyte development. Malar. J 2013, 12, 408.
  • (19) Lelievre J; Almela MJ; Lozano S; Miguel C; Franco V; Leroy D; Herreros E. Activity of clinically relevant antimalarial drugs on Plasmodium falciparum mature gametocytes in an ATP bioluminescence “transmission blocking” assay. PLoS One 2012, 7, e35019.
  • (20) Almela MJ; Lozano S; Lelievre J; Colmenarejo G; Coteron JM; Rodrigues J; Gonzalez C; Herreros E. A New Set of Chemical Starting Points with Plasmodium falciparum Transmission-Blocking Potential for Antimalarial Drug Discovery. PLoS One 2015, 10, e0135139.
  • (21) D’Alessandro S; Silvestrini F; Dechering K; Corbett Y; Parapini S; Timmerman M; Galastri L; Basilico N; Sauerwein R; Alano P; Taramelli D. A Plasmodium falciparum screening assay for anti-gametocyte drugs based on parasite lactate dehydrogenase detection. J. Antimicrob. Chemother 2013, 68, 2048–2058.
  • (22) Tanaka TQ; Williamson KC. A malaria gametocytocidal assay using oxidoreduction indicator, alamarBlue. Mol. Biochem. Parasitol 2011, 177, 160–163.
  • (23) Lucantoni L; Silvestrini F; Signore M; Siciliano G; Eldering M; Dechering KJ; Avery VM; Alano P. A simple and predictive phenotypic High Content Imaging assay for Plasmodium falciparum mature gametocytes to identify malaria transmission blocking compounds. Sci. Rep 2015, 5, 16414. DOI: 10.1038/srep16414.
  • (24) Plouffe DM; Wree M; Du AY; Meister S; Li F; Patra K; Lubar A; Okitsu SL; Flannery EL; Kato N; Tanaseichuk O; Comer E; Zhou B; Kuhen K; Zhou Y; Leroy D; Schreiber SL; Scherer CA; Vinetz J; Winzeler EA. High-Throughput Assay and Discovery of Small Molecules that Interrupt Malaria Transmission. Cell Host Microbe 2016, 19, 114–126.
  • (25) Trager W; Jensen JB. Human malaria parasites in continuous culture. Science 1976, 193, 673–675.
  • (26) Derbyshire ER; Min J; Guiguemde WA; Clark JA; Connelly MC; Magalhaes AD; Guy RK; Clardy J. Dihydroquinazolinone inhibitors of proliferation of blood and liver stage malaria parasites. Antimicrob. Agents Chemother 2014, 58, 1516–1522.
  • (27) White NJ. Antimalarial drug resistance. J. Clin. Invest 2004, 113, 1084–1092.
  • (28) Dorjsuren D; Eastman RT; Wicht KJ; Jansen D; Talley DC; Sigmon BA; Zakharov AV; Roncal N; Girvin AT; Antonova-Koch Y; Will PM; Shah P; Sun H; Klumpp-Thomas C; Mok S; Yeo T; Meister S; Marugan JJ; Ross LS; Xu X; Maloney DJ; Jadhav A; Mott BT; Sciotti RJ; Winzeler EA; Waters NC; Campbell RF; Huang W; Simeonov A; Fidock DA. Chemoprotective antimalarials identified through quantitative high-throughput screening of Plasmodium blood and liver stage parasites. Sci. Rep 2021, 11, 2121.
  • (29) Mackwitz MKW; Hesping E; Eribez K; Scholer A; Antonova-Koch Y; Held J; Winzeler EA; Andrews KT; Hansen FK. Investigation of the in vitro and in vivo efficacy of peptoid-based HDAC inhibitors with dual-stage antiplasmodial activity. Eur. J. Med. Chem 2021, 211, 113065.
  • (30) Lima MNN; Borba JVB; Cassiano GC; Mottin M; Mendonca SS; Silva AC; Tomaz KCP; Calit J; Bargieri DY; Costa FTM; Andrade CH. Artificial Intelligence Applied to the Rapid Identification of New Antimalarial Candidates with Dual-Stage Activity. ChemMedChem 2021, 16, 1093–1103.
  • (31) Capela R; Magalhaes J; Miranda D; Machado M; Sanches-Vaz M; Albuquerque IS; Sharma M; Gut J; Rosenthal PJ; Frade R; Perry MJ; Moreira R; Prudencio M; Lopes F. Endoperoxide-8-aminoquinoline hybrids as dual-stage antimalarial agents with enhanced metabolic stability. Eur. J. Med. Chem 2018, 149, 69–78.
  • (32) Carrasco MP; Machado M; Goncalves L; Sharma M; Gut J; Lukens AK; Wirth DF; Andre V; Duarte MT; Guedes RC; Dos Santos DJVA; Rosenthal PJ; Mazitschek R; Prudencio M; Moreira R. Probing the Azaaurone Scaffold against the Hepatic and Erythrocytic Stages of Malaria Parasites. ChemMedChem 2016, 11, 2194–2204.
  • (33) Sun W; Huang X; Li H; Tawa G; Fisher E; Tanaka TQ; Shinn P; Huang W; Williamson KC; Zheng W. Novel lead structures with both Plasmodium falciparum gametocytocidal and asexual blood stage activity identified from high throughput compound screening. Malar. J 2017, 16, 147.
  • (34) Danishuddin; Madhukar G; Malik MZ; Subbarao N. Development and rigorous validation of antimalarial predictive models using machine learning approaches. SAR QSAR Environ. Res 2019, 30, 543–560.
  • (35) Mahmoudi N; de Julian-Ortiz JV; Ciceron L; Galvez J; Mazier D; Danis M; Derouin F; Garcia-Domenech R. Identification of new antimalarial drugs by linear discriminant analysis and topological virtual screening. J. Antimicrob. Chemother 2006, 57, 489–497.
  • (36) Breiman L. Random Forests. Mach. Learn 2001, 45, 5–32.
  • (37) Derbyshire ER; Prudencio M; Mota MM; Clardy J. Liver-stage malaria parasites vulnerable to diverse chemical scaffolds. Proc. Natl. Acad. Sci. USA 2012, 109, 8511–8516.
  • (38) Derbyshire ER; Zuzarte-Luis V; Magalhaes AD; Kato N; Sanschagrin PC; Wang J; Zhou W; Miduturu CV; Mazitschek R; Sliz P; Mota MM; Gray NS; Clardy J. Chemical interrogation of the malaria kinome. Chembiochem 2014, 15, 1920–1930.
  • (39) Ekins S; Reynolds RC; Kim H; Koo M-S; Ekonomidis M; Talaue M; Paget SD; Woolhiser LK; Lenaerts A; Bunin BA; Connell N; Freundlich JS. Bayesian models leveraging bioactivity and cytotoxicity information for drug discovery. Chem. Biol 2013, 20, 370–378.
  • (40) Pereira JC; Daher SS; Zorn KM; Sherwood M; Russo R; Perryman AL; Wang X; Freundlich MJ; Ekins S; Freundlich JS. Machine Learning Platform to Discover Novel Growth Inhibitors of Neisseria gonorrhoeae. Pharm. Res 2020, 37, 141.
  • (41) Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta 1975, 405, 442–451.
  • (42) Wang X; Inoyama D; Russo R; Li SG; Jadhav R; Stratton TP; Mittal N; Bilotta JA; Singleton E; Kim T; Paget SD; Pottorf RS; Ahn Y; Davila-Pagan A; Kandasamy S; Grady C; Hussain S; Soteropoulos P; Zimmerman MD; Ho HP; Park S; Dartois V; Ekins S; Connell N; Kumar P; Freundlich JS. Antitubercular Triazines: Optimization and Intrabacterial Metabolism. Cell Chem. Biol 2020, 27, 172–185.
  • (43) Patel JS; Norambuena J; Al-Tameemi H; Ahn YM; Perryman AL; Wang X; Daher SS; Occi J; Russo R; Park S; Zimmerman M; Ho H-P; Perlin DS; Dartois V; Ekins S; Kumar P; Connell N; Boyd JM; Freundlich JS. Bayesian Modeling and Intrabacterial Drug Metabolism Applied to Drug-Resistant Staphylococcus aureus. ACS Infect. Dis 2021, 7, 2508–2521.
  • (44) Perryman AL; Inoyama D; Patel JS; Ekins S; Freundlich JS. Pruned Machine Learning Models to Predict Aqueous Solubility. ACS Omega 2020, 5, 16562–16567.
  • (45) Perryman AL; Patel JS; Russo R; Singleton E; Connell N; Ekins S; Freundlich JS. Naive Bayesian Models for Vero Cell Cytotoxicity. Pharm. Res 2018, 35, 170.
  • (46) Stratton TP; Perryman AL; Vilcheze C; Russo R; Li SG; Patel JS; Singleton E; Ekins S; Connell N; Jacobs WR Jr.; Freundlich JS. Addressing the Metabolic Stability of Antituberculars through Machine Learning. ACS Med. Chem. Lett 2017, 8 (10), 1099–1104.
  • (47) Mughal H; Wang H; Zimmerman M; Paradis MD; Freundlich JS. Random Forest Model Prediction of Compound Oral Exposure in the Mouse. ACS Pharmacol. Transl. Sci 2021, 4, 338–343.
  • (48) Wang X; Perryman AL; Li SG; Paget SD; Stratton TP; Lemenze A; Olson AJ; Ekins S; Kumar P; Freundlich JS. Intrabacterial Metabolism Obscures the Successful Prediction of an InhA Inhibitor of Mycobacterium tuberculosis. ACS Infect. Dis 2019, 5, 2148–2163.
  • (49) Godinez WJ; Ma EJ; Chao AT; Pei L; Skewes-Cox P; Canham SM; Jenkins JL; Young JM; Martin EJ; Guiguemde WA. Design of potent antimalarials with generative chemistry. Nature Machine Intelligence 2022, 4, 180–186.
  • (50) Martin EJ; Polyakov VR; Zhu XW; Tian L; Mukherjee P; Liu X. All-Assay-Max2 pQSAR: Activity Predictions as Accurate as Four-Concentration IC50s for 8558 Novartis Assays. J. Chem. Inf. Model 2019, 59, 4450–4459.
  • (51) Bosc N; Felix E; Arcila R; Mendez D; Saunders MR; Green DVS; Ochoada J; Shelat AA; Martin EJ; Iyer P; Engkvist O; Verras A; Duffy J; Burrows J; Gardner JMF; Leach A. MAIP: a web service for predicting blood-stage malaria inhibitors. J. Cheminform 2021, 13, 13.
  • (52) Neves BJ; Braga RC; Alves VM; Lima MNN; Cassiano GC; Muratov EN; Costa FTM; Andrade CH. Deep Learning-driven research for drug discovery: Tackling Malaria. PLoS Comput. Biol 2020, 16, e1007025.
  • (53) Mjambili F; Njoroge M; Naran K; De Kock C; Smith PJ; Mizrahi V; Warner D; Chibale K. Synthesis and biological evaluation of 2-aminothiazole derivatives as antimycobacterial and antiplasmodial agents. Bioorg. Med. Chem. Lett 2014, 24, 560–564.
  • (54) Navidpour L; Chibale K; Esmaeili S; Ghiaee A; Hadj-Esfandiari N; Irani M; Ahmadi Koulaei S; Yassa N. Antimalarial Activities of (Z)-2-(Nitroheteroarylmethylene)-3(2H)-Benzofuranone Derivatives: In Vitro and In Vivo Assessment and beta-Hematin Formation Inhibition Activity. Antimicrob. Agents Chemother 2021, 65, e0268320.
  • (55) van der Maaten L; Hinton G. Visualizing data using t-SNE. J. Mach. Learn. Res 2008, 9, 2579–2605.
  • (56) Radfar A; Mendez D; Moneriz C; Linares M; Marin-Garcia P; Puyet A; Diez A; Bautista JM. Synchronous culture of Plasmodium falciparum at high parasitemia levels. Nat. Protoc 2009, 4, 1899–1915.
