Protein Science: A Publication of the Protein Society. 2021 Mar 25;30(5):1087–1097. doi: 10.1002/pro.4065

EDock‐ML: A web server for using ensemble docking with machine learning to aid drug discovery

Tanay Chandak 1, Chung F Wong 1
PMCID: PMC8040857  PMID: 33733530

Abstract

EDock‐ML is a web server that facilitates the use of ensemble docking with machine learning to help decide whether a compound is worth considering further in a drug discovery process. Ensemble docking provides an economical way to account for receptor flexibility in molecular docking. Machine learning improves the use of the resulting docking scores to evaluate whether a compound is likely to be useful. EDock‐ML takes a bottom‐up approach in which machine‐learning models are developed one protein at a time to improve predictions for the proteins included in its database. Because the machine‐learning models are intended to be used without changing the docking and model parameters with which they were trained, novice users can use EDock‐ML directly without worrying about what parameters to choose. A user simply submits a compound specified by an ID from the ZINC database (Sterling, T.; Irwin, J. J., J Chem Inf Model 2015, 55[11], 2324–2337) or uploads a file prepared by a chemical drawing program, and receives an output that helps the user judge how likely the compound is to be active or inactive against a drug target. EDock‐ML can be accessed freely at edock‐ml.umsl.edu.

Keywords: cloud computing, drug discovery, ensemble docking, machine learning, web server

1. INTRODUCTION

Molecular docking provides a popular tool for screening chemical libraries to find potential drug candidates for a biomolecular target. 1 Early docking studies were performed with both the ligand and the receptor treated as rigid molecules. 2 , 3 , 4 With increasing computing power and improved docking methodologies, later studies allowed the ligand to be flexible, but the flexibility of the receptor is still often ignored. Nevertheless, publications incorporating receptor flexibility in various ways have appeared more frequently in recent years. Although simulations incorporating full system flexibility have become feasible, 5 , 6 such simulations are still difficult to use for screening many compounds. Approximate methods, such as ensemble docking, 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 can screen a large number of compounds more easily. Ensemble docking does not attempt to simulate the coupled motion of the receptor and the ligand directly. Instead, it first generates an ensemble of structures of the receptor alone, once. Many ligands are then docked to the same structural ensemble.

However, ensemble docking is complicated by the lack of a rigorous theory to guide how the docking scores of a compound to multiple structures should be used to decide whether the compound is likely to be active. When only one structure is used, one simply uses the docking scores of different compounds to the same structure to rank‐order the compounds. When many structures are used, more than one docking score is obtained for each compound. Without a rigorous theory, researchers have used the multiple docking scores in different ways to identify compounds from a chemical library that are more likely to be active. Examples include choosing the best x% of compounds from a virtual screening by taking the top x/N% from each structure, where N is the number of structures included in the ensemble for docking, 9 selecting the best x% based on the best docking score of each compound among all the structures, 9 and picking molecules that give favorable docking scores to most of the receptor's conformations. 21 More variations were discussed in recent reviews. 22 , 23 These variations differ in how reliably they classify compounds into actives and inactives, and their performance can depend on the systems studied. An additional problem was that including more than a few structures in the ensemble decreased rather than increased performance. 24

Recently, we used machine learning to resolve these problems. 25 , 26 Machine learning compensates for the lack of a rigorous theory by learning from known experimental data. We found that this approach could improve the ability of ensemble docking to classify compounds into actives and inactives, and the performance did not decrease as more structures were added to the ensemble.

The basic idea of our machine‐learning approach is to use the multiple docking scores of each compound to the ensemble of structures as features in machine‐learning models. 25 By learning how active compounds differ from inactive compounds in these features, the machine‐learning approach improves predictive performance. As illustrated earlier, our machine‐learning approach also removed the problem that including more than a few structures could worsen rather than improve performance. In addition, instead of using only one machine‐learning model, we use four machine‐learning models – k nearest neighbors, logistic regression, random forest, and support vector machine – to aid predictions.
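As a concrete illustration of this idea, the short sketch below builds a feature matrix of ensemble‐docking scores and trains the four classifiers on known actives and decoys. It is a minimal sketch using scikit‐learn; the file names, array shapes, and default hyperparameters are illustrative assumptions rather than the exact code inside EDock‐ML (the parameters actually used are summarized in Materials and Methods).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Hypothetical inputs: one row per compound, one column per receptor structure;
# each entry is the docking score of that compound to that structure.
# y = 1 for experimentally known actives, 0 for decoys.
X = np.load("docking_scores.npy")   # shape: (n_compounds, n_structures)
y = np.load("labels.npy")           # shape: (n_compounds,)

models = {
    "k nearest neighbors": KNeighborsClassifier(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(),
    "support vector machine": SVC(kernel="rbf", probability=True),
}

for name, model in models.items():
    model.fit(X, y)                          # learn from the known compounds
    p = model.predict_proba(X[:1])[0, 1]     # estimated probability of being active
    print(f"{name}: P(active) = {p:.2f}")
```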

Our machine‐learning approach is a bottom‐up approach in which machine‐learning models are trained one drug target at a time. This provides an alternative to a generic approach that can be applied to many targets. A generic approach has the advantage of broad applicability, but it is difficult for such an approach to be accurate across a wide range of drug targets because of the many underlying approximations employed. 27 By training models for one target at a time, one can develop more reliable predictive models for each target. This is useful in practical drug discovery, which often centers on one drug target. It is worthwhile to fine‐tune predictive models for that central drug target because they will be used over and over again during the lengthy process of drug discovery.

Because of the success of our machine‐learning approach, 25 it is beneficial to make these models more widely available through a web server. As the machine‐learning models are trained using specific parameters in the docking and in the machine‐learning models, users should not change these parameters when they apply the models to new compounds they want to evaluate. This has the advantage that users less familiar with the methodologies of docking or machine learning can also use the methods easily, as they do not need to worry about choosing model parameters. A user simply submits a compound, chooses its target for docking, and receives results that help the user decide whether the compound is likely to be active or inactive. The web server developed here will make it easier for bench scientists in drug discovery to use this approach on a day‐to‐day basis to help decide whether a compound is worth testing biologically, especially when the compound would first need to be synthesized.

The web server described here currently includes only protein kinases, one of the largest classes of current drug targets. However, more drug targets will be added gradually in the future using the web framework already developed here. In addition, EDock‐ML provides an option for users to train their own models for targets that are not yet included in EDock‐ML. A user familiar with docking can supply docking results for EDock‐ML to train the machine‐learning models, which can then be used by colleagues less familiar with the methodologies.

EDock‐ML also allows users to perform ensemble docking without machine learning if they supply their own input files required for docking. This can be useful when sufficient experimental data are not yet available for training the machine‐learning models. One can still use the docking scores to make preliminary predictions, as illustrated by an example below.

EDock‐ML also allows a user to supply their own docking scores to the machine‐learning models in EDock‐ML to predict whether a compound is active or inactive, as long as they use the same docking parameters as those employed in training the machine‐learning models.

In Results, we use several examples to illustrate how EDock‐ML can be used. In Materials and Methods, we outline the implementation of EDock‐ML, and summarize the underlying methods and benchmarking published earlier. 25

2. RESULTS

2.1. An example of docking Sorafenib to the vascular endothelial growth factor receptor 2 (VEGFR2)

In the main menu (Figure 1(a)), select “Docking + Machine Learning”. In the menu that comes up (Figure 1(b)), check the circle “Retrieve from Zinc”. Enter in the box labeled “Zinc Id (ex 123)” the ID for Sorafenib (1493878) found in the ZINC database. 28 Click “Select Protein” and select VEGFR2. Then click “Begin Docking”. Figure 2 shows the progress of the docking at three different time points. In Figure 2(a), the animated ellipses (…) indicate that the dockings of the compound to the structures labeled 1y6b_A, 2oh4_A, 3c7q_A, and 3cjf_A were being carried out by the four processors that were available. Dockings to the other structures were waiting in the queue. Figure 2(b) shows that the docking jobs for the first four structures had been completed and their docking scores are shown in the boxes. The figure also shows that docking runs were being performed for the next four structures. In Figure 2(c), docking jobs to all the structures had been completed and their docking scores are shown.

FIGURE 1.

FIGURE 1

(a) The home page of EDock‐ML with the “Run Models” pull‐down menu activated to show the three options available: performing machine learning only, performing docking only, and performing docking followed by evaluation with machine‐learning models. (b) The option for docking with machine learning, “Docking and ML”, is selected. The options to use proteins stored in the database of EDock‐ML and to retrieve compounds from the ZINC database 28 are also selected

FIGURE 2.

FIGURE 2

Monitoring progress after a docking job has been submitted. (a) Docking runs are being performed for four structures, as indicated by the animated ellipses (…) appearing under the labels of the structures. Docking runs for the other structures are queued and their queue numbers are shown. (b) Docking to the first four structures has been completed and their docking scores are shown. Docking to the next four structures is being run. (c) Docking to all structures has been completed and the docking scores are shown below the labels of the structures

One can then click “Run All Models”, on the same page, to run the four machine‐learning models. Upon completion, a table is shown (Figure 3). The column labeled “Specificity = 0.9” indicates that three of the four machine‐learning models predicted with 90% probability that the compound was active. The column labeled “AUC Value” tells more of the story. AUC stands for the Area Under the Receiver Operating Characteristic Curve. Each value varies between 0 and 1, with 1 corresponding to the best possible model. A value of 0.5 means that a model performs only as well as a random model. The closer the AUC value is to 1, the more reliable the machine‐learning model. From this table, we see that the Support Vector Machine, which did not predict the compound to be active with 90% probability, gave the lowest AUC value. In other words, this model was less reliable than the other three models for this drug target. As the three best machine‐learning models, with higher AUC values, were consistent in predicting the compound to be active, the probability that this compound was active was high. One could examine this further by clicking “Select Model” to display the receiver operating characteristic curve for the Support Vector Machine model (Figure 4).

FIGURE 3.

FIGURE 3

The table that comes up after the four machine‐learning models have been run to study the interaction between Sorafenib and the vascular endothelial growth factor receptor 2. The column “Specificity = 0.9” shows that three machine‐learning models – k nearest neighbors, logistic regression, and random forest – predicted the compound to be active with 90% certainty. The Support Vector Machine model did not predict the compound to be active with 90% certainty. The column labeled “AUC Value” gives the Area Under the Receiver Operating Characteristic Curve for each machine‐learning model, with a value of 1 corresponding to the best possible model and 0.5 to a model that performs only as well as a random model

FIGURE 4.

FIGURE 4

The receiver operating characteristic curve obtained for the Support Vector Machine model shown in Figure 3. The points on the curve were colored red when the model with the corresponding sensitivity and specificity predicted Sorafenib to be inactive against the vascular endothelial growth factor receptor 2, and colored green when the model predicted the compound to be active

A receiver operating characteristic curve for a predictive model is generated by choosing different cutoffs to divide compounds into actives and inactives and checking how well the model performs in terms of sensitivity and specificity. Choosing a cutoff giving high sensitivity misses few true positives but produces many false positives. On the other hand, choosing a cutoff giving high specificity produces few false positives but misses many true positives. By default, EDock‐ML always gives a table with sensitivity = 0.9 and specificity = 0.9 as described above, but sometimes one does not need 90% certainty. The receiver operating characteristic curve helps a user draw conclusions using other values of sensitivity or specificity.
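The table and the curve both come from standard receiver operating characteristic calculations. The sketch below shows, under assumed file names and label conventions, how the curve, the AUC, and the sensitivity achievable at specificity = 0.9 can be computed with scikit‐learn for any trained model; it is for orientation only and is not EDock‐ML's internal code.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical held-out data: y_true = 1 for actives, 0 for decoys;
# y_prob = probability of being active, taken from any trained model.
y_true = np.load("test_labels.npy")
y_prob = np.load("test_probabilities.npy")

fpr, tpr, thresholds = roc_curve(y_true, y_prob)   # fpr = 1 - specificity
print("AUC =", roc_auc_score(y_true, y_prob))

# Best sensitivity achievable while keeping specificity >= 0.9,
# corresponding to the default "Specificity = 0.9" column.
mask = fpr <= 0.1
print("sensitivity at specificity >= 0.9:", tpr[mask].max())
```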

In the receiver operating characteristic curve in Figure 4, the points on the curve are colored red when the cutoff value lies in a range that predicted the compound to be inactive, and green when the cutoff value lies in a range that predicted the compound to be active. The crossing point from the red points to the green ones gives the best specificity achievable by this model, 0.66 (because 1 − specificity = 0.34), for the system studied in this example. In other words, the Support Vector Machine model could predict the compound to be active with 66% certainty. This probability is not as high as 90% but is still good, providing support for the other three machine‐learning models that predicted the compound to be active with 90% probability. This feature of EDock‐ML is valuable in helping a user make decisions, especially for compounds lying at the borderline between active and inactive. A user can make decisions based on the acceptable uncertainty for the problem at hand. For example, if a user has not found many active compounds yet, the user may accept a compound for further evaluation even if the models predict with lower certainty that the compound is active. On the other hand, if a user already has many active compounds, the user may accept a compound for further evaluation only when the models predict with high probability that the compound is active.

If one clicks a point on the receiver operating characteristic curve, another table (Figure 5) opens up to show the numerical values of the specificity and sensitivity at that point. Clicking the box labeled “Save data to Image” on the same page allows a user to save the results to a file in PNG format. The button “Print Page” allows the user to print the current page displaying the results.

FIGURE 5.

FIGURE 5

Additional details provided when a point on the receiver operating characteristic curve in Figure 4 was clicked. “Model Probability” was obtained from the machine‐learning model. “Threshold” was the cutoff used to divide compounds into actives and inactives at the point chosen

2.2. Docking aspirin (ZINC id 53) to VEGFR2

For comparison, Figure 6 shows the results for an inactive compound. In this case, aspirin was docked to VEGFR2. For this compound, all four models predicted the compound to be inactive at sensitivity = 0.9 and specificity = 0.9. The column labeled “Sensitivity = 0.9” is particularly useful for discarding compounds that are less likely to be active. If “Decoy” is shown for a machine‐learning model in that column, it indicates that the machine‐learning model predicts with 90% probability that the compound is inactive. In this example, all four machine‐learning models predicted with 90% probability that aspirin was not active against VEGFR2.

FIGURE 6.

FIGURE 6

The ensemble docking/machine‐learning models predicted with 90% certainty that aspirin was an inactive compound for the vascular endothelial growth factor receptor 2. “Decoy” means inactive

2.3. Performing docking only

For systems that do not yet have enough experimental data to train machine‐learning models, one can still use EDock‐ML to perform ensemble docking and use the resulting docking scores to predict whether a compound is likely to be active. The predictions may not be as accurate, but they offer a good start. Once some active compounds have been found and confirmed experimentally, the experimental data can be used to start training machine‐learning models. This process can be iterated to gradually converge on better and better machine‐learning models for prediction.

Without machine‐learning models, one useful way to utilize docking scores from ensemble docking is to choose the best docking score of a compound among the structures for comparison with another compound. This strategy gave reasonable enrichment factors in virtual screening in previous studies. 9 One way to use this approach is to first perform a docking study for a positive control and extract its best docking score. One can then dock another compound and compare its best docking score with that of the positive control to estimate whether the compound is likely to be active. An experienced modeler may prepare the structure and docking configuration files for AutoDock Vina 29 for the desired drug target so that a collaborator less familiar with docking can simply use the files in EDock‐ML to dock other compounds to this target.
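The comparison can be as simple as taking the most favorable (most negative) Vina score across the ensemble for each compound. The sketch below illustrates this with invented per‐structure scores; the structure labels and values are placeholders, not EDock‐ML output.

```python
# Hypothetical per-structure Vina scores (kcal/mol); lower (more negative) is better.
control_scores = {"1y6b_A": -11.2, "2oh4_A": -10.4, "3c7q_A": -10.9}   # positive control
candidate_scores = {"1y6b_A": -9.1, "2oh4_A": -10.6, "3c7q_A": -8.7}   # new compound

best_control = min(control_scores.values())
best_candidate = min(candidate_scores.values())

# A candidate whose best score approaches or beats the control's is worth a closer look.
print(f"control best: {best_control:.1f}, candidate best: {best_candidate:.1f}")
if best_candidate <= best_control:
    print("candidate scores at least as well as the positive control")
```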

To use this option, a user chooses the “Docking” option (Figure 1(a)) and then selects the option “Upload Custom Structures” (Figure 1(b)) from the web page. The user then supplies the structure and configuration files required by AutoDock Vina 29 for each structure of the receptor included in the ensemble. The user may upload the files for each structure one at a time by using the button “Upload One at a time” (Figure 7) or select all the files for all the structures at once by selecting the option “Select Multiple”. Once a window opens up, the user may navigate to the directory/folder containing the files and select them to be uploaded to EDock‐ML. The configuration and receptor files should be named (structure‐name).conf and (structure‐name).receptor.pdbqt or (structure‐name).pdbqt, respectively. Clicking “Begin Docking” displays the same menu as shown in Figure 2. Once the jobs have been completed, a table like that shown in Figure 3 comes up to show the docking scores to each structure in the ensemble. One can continue to evaluate more compounds by using the button “dock another ligand” found on the same page.
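For readers who prefer to run the same ensemble docking locally before (or instead of) uploading, a minimal sketch of the per‐structure loop is given below. It assumes the naming convention just described and a vina executable on the PATH; the file names are placeholders.

```python
import glob
import subprocess

ligand = "ligand.pdbqt"   # placeholder ligand prepared in pdbqt format

# One docking run per receptor structure, following the (structure-name).conf
# naming convention; each .conf is assumed to point to its receptor file and
# define the search box.
for conf in sorted(glob.glob("*.conf")):
    name = conf[:-len(".conf")]
    subprocess.run(
        ["vina", "--config", conf, "--ligand", ligand,
         "--out", f"{name}_docked.pdbqt"],
        check=True,
    )
```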

FIGURE 7.

FIGURE 7

Screen shot of EDock‐ML for uploading custom structures for ensemble docking. In this screen, the options “Choose File” and “Upload One at a Time” were checked. If one checks “Select Multiple”, a screen will open up to allow a user to upload all the files placed in a folder

2.4. Performing machine learning only

EDock‐ML also provides an option for a user to supply their own docking scores to the machine‐learning models for a drug target already trained in EDock‐ML. To use this option, the user needs to use exactly the same docking grid and parameters as those used in training the machine‐learning models.

To use this option, the user first selects “machine learning” (Figure 1(a)), selects a protein (such as IGF1R shown in Figure 8), and then enters the docking scores as a JSON object. Alternatively, a user may enter the docking scores for each structure by hand by clicking “Find Structures”. Another page comes up (Figure 9) with small boxes, labeled by the names of the structures, into which docking scores can be entered. After all the docking scores have been entered, the button “Run All Models” becomes active. Clicking it runs the pre‐trained machine‐learning models to predict the likelihood of the compound being active, as described above.
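For orientation, such a JSON object is essentially a mapping from structure IDs to docking scores. The snippet below shows one plausible form with invented values; the structure IDs must match those listed for the chosen target on the “Find Structures” page, and the exact format expected by EDock‐ML is indicated on the web page itself.

```python
import json

# Hypothetical docking scores (kcal/mol) keyed by structure ID.
scores = {"1y6b_A": -9.3, "2oh4_A": -8.7, "3c7q_A": -9.0, "3cjf_A": -8.5}
print(json.dumps(scores))   # text that could be pasted into the "Structure Object" box
```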

FIGURE 8.

FIGURE 8

Screen shot of EDock‐ML for performing machine learning. The button “Use Stored Proteins” was highlighted and the insulin‐like growth factor 1 receptor (IGF1R) in the database was chosen. The box starting with “Structure Object” allows a user to enter docking scores as a JSON object

FIGURE 9.

FIGURE 9

If one clicks “Find Structures” shown in Figure 8, a menu with small boxes opens up to allow a user to enter the docking scores for the different structures. Each box is labeled with the ID of a structure used in the structural ensemble

3. DISCUSSION

In this “Tools for Protein Science” article, we describe with examples a new web server, EDock‐ML, that uses ensemble docking assisted by machine learning to help predict whether a compound is likely to be active against a biomolecular target. As described in an earlier paper, 25 machine learning improves the reliability of ensemble docking in classifying compounds into actives and inactives. EDock‐ML provides a web application to help bench scientists use this approach in practical applications. A user can submit a compound from the ZINC database 28 or a compound prepared by a chemical drawing program and receive results that help the user decide whether the compound is worth considering further. EDock‐ML first creates a table indicating which machine‐learning models predict the compound to be active with 90% probability and which models predict the compound to be inactive with 90% probability. If all four machine‐learning models predict the compound to be active or inactive with 90% probability, it is straightforward for the user to decide whether the compound is active or inactive. In cases where the results are less clear, such as when not all models agree, EDock‐ML provides other assistance. One is the Area Under the Receiver Operating Characteristic Curve (AUC), which measures how well each machine‐learning model performs. When there are discrepancies among the models, one may place more trust in the models that have better AUC values. In addition, EDock‐ML allows a user to display the receiver operating characteristic curve to examine situations other than 90% probability. For example, if a user is satisfied with 80% certainty in prediction, the user can deduce from the curve whether the model predicts the compound to be active or inactive with this certainty.

EDock‐ML takes a bottom‐up approach in which machine‐learning models are developed one drug target at a time. Unlike generic models that apply to many drug targets, this bottom‐up approach is usually more accurate because it does not require a single model containing numerous approximations to work for many systems. This targeted approach is useful in a practical drug discovery project in which researchers have a specific drug target for developing drug candidates. On the other hand, this bottom‐up approach takes time to develop machine‐learning models for many targets. Although we shall continue to use the web framework already developed here to add more drug targets in the future, EDock‐ML also provides an option for users to develop their own machine‐learning models for targets not yet included.

4. MATERIALS AND METHODS

4.1. Software

EDock‐ML is built with Python in the backend. It uses AutoDock Vina 29 as the docking engine, Open Babel 30 to convert file formats for the ligands, SQLAlchemy to manage the SQL database, 31 and scikit‐learn 32 as the source of machine‐learning libraries. All front‐end web pages are built with HTML/CSS and JavaScript. The buttons, tables, forms, and graphs in the web user interface use modern HTML and CSS styling. We provide the software at http://www.umsl.edu/~wongch/Software/EDock-ML/edock-ml_0.0.1.gz for those interested in running their own web server.

The website is split into three primary operations: Docking, Machine Learning, and Docking and Machine Learning. Each primary operation has two sub‐options: use proteins and machine‐learning models already stored in EDock‐ML, or upload the user's own data for proteins not yet included in EDock‐ML. These menus can be accessed from the bottom half of the home page, or from the pull‐down menu under the button labeled “Run Models” on the home page or “Run” at the top of other pages. This paper does not describe all these features, only those expected to be used by many users.

4.2. File formats accepted

Currently, EDock‐ML accepts protein structure files in pdbqt format (not needed if a scientist uses proteins already included in EDock‐ML's database) and ligand files in pdbqt or mol2 format. If a ligand file is submitted in mol2 format, EDock‐ML uses Open Babel 30 to convert the mol2 file to a pdbqt file before use. Alternatively, the user can specify a ZINC ID, and EDock‐ML will pull the corresponding ligand directly from the ZINC database. 33
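Users preparing their own ligand files can perform the same mol2 to pdbqt conversion locally with Open Babel. The sketch below assumes the obabel command‐line tool is installed and uses placeholder file names.

```python
import subprocess

# Convert a ligand from mol2 to pdbqt with Open Babel (obabel must be on the PATH).
# -O sets the output file; the formats are inferred from the file extensions.
subprocess.run(["obabel", "ligand.mol2", "-O", "ligand.pdbqt"], check=True)
```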

4.3. Exporting docking scores and viewing processes later

To record the docking scores, EDock‐ML offers a few options in its web interface: saving them to a JSON object, saving them to a PDF file, or printing them out. The first option is available when the user clicks the button “show data” after an ensemble docking has been completed.

After submitting a docking job, EDock‐ML offers the user a shareable URL link. The user can return later to check the results by using this link.

4.4. Underlying methods and benchmarking

Although the methods and benchmarking were described in an earlier publication, 25 we summarize the key ideas here for completeness and clarity.

4.4.1. Dataset for benchmarking

We used all the protein kinases and the associated compounds in the DUD‐E dataset 34 designed for evaluating docking programs.

4.4.2. Docking

For all the actives and decoys for each protein kinase in DUD‐E, we docked them to an ensemble of structures selected from the Protein Data Bank. 35 Initial exploration was done on the epidermal growth factor receptor (EGFR). The number of structures needed to give good performance was then also used for the other protein kinases studied.

For EGFR, 26 we docked 832 active compounds and 35,442 decoys from DUD‐E to the protein using AutoDock Vina. 29 Initially, we used 34 structures from the Protein Data Bank 35 for docking: entries 2RGP, 2J5F, 3VJO, 4G5J, 4JQ7, 5CAV, 4LI5, 5FED, 1XKK, 3BEL, 4ZJV, 1M14, 2GS2, 4I23, 4TKS, 5CNN, 4RIW, 3IKA, 4LQM, 4LL0, 2JIU, 2EB2, 5XDL, 2EB3, 4I24, 5Y9T, 4LRM, 2ITN, 2JIT, 4G5P, 5FEE, 2ITT, 3UG1, and 2GS7. We chose these structures by performing a sequence search on the Protein Data Bank 35 using the sequence of the wild‐type protein. We selected structures with the wild‐type sequence first. To add more structures, we also selected structures with single mutations. For these mutants, we modeled the structures back to the wild type before docking. As we found that the performance increased more slowly after a few structures had been added to the structural ensemble, we included only 11 structures for each of the other 20 protein kinases studied. These protein kinases were ABL1, AKT2, CDK2, CSF1R, JAK2, LCK, MAPK2, MET, MK01, MK10, MK14, MP2K1, PLK1, ROCK1, TGFR1, VGFR2, WEE1, BRAF, FGFR1, and IGF1R.

4.4.3. Measurement of performance

We calculated areas under the receiver operating characteristic curves to measure performance. 36 , 37 , 38 , 39 , 40 The area under this curve (AUC) ranges from 0 to 1, with 1 giving the best possible model. A model with an AUC of 0.5 only performs as well as a random model does.

4.4.4. Training and test sets

We divided the actives and decoys of the DUD‐E dataset 34 into three groups. We took each group in turn as the test set, with the remaining groups as the training set. The variation of the AUCs across the three runs gave an idea of the statistical fluctuations of the results.
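A minimal sketch of this three‐fold evaluation, using scikit‐learn utilities and a random‐forest classifier as an example; the data file names, the shuffling, and the choice of classifier here are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

X = np.load("docking_scores.npy")   # (n_compounds, n_structures) docking-score features
y = np.load("labels.npy")           # 1 = active, 0 = decoy

aucs = []
for train_idx, test_idx in StratifiedKFold(n_splits=3, shuffle=True).split(X, y):
    clf = RandomForestClassifier(n_estimators=750).fit(X[train_idx], y[train_idx])
    p = clf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], p))

print("AUC per fold:", aucs, "mean:", np.mean(aucs), "std:", np.std(aucs))
```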

4.4.5. Feature representation

We represented each compound by a vector containing its docking scores to an ensemble of structures. For example, when all the 34 structures of EGFR were used, this vector had a length of 34. When fewer structures were used, the length of the vector decreased.

4.4.6. K nearest neighbors

In using this method, we first calculated the Euclidean distance between each compound in the test set and each compound in the training set. The training compounds were then sorted according to this distance, and the top k compounds with the shortest distances were recorded. The percentage of active compounds among these k compounds was then calculated and used to predict whether a compound was likely to be active; a larger percentage implied a higher probability of being active. For EGFR, we varied the value of k to find the value that gave the best performance as measured by the AUCs. As k ≈ 50 gave good performance, we used this value for all the other protein kinases without individual optimization for each protein kinase.
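In scikit‐learn terms, this procedure corresponds roughly to the sketch below, with k = 50 and Euclidean distances; the input arrays are placeholders.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train = np.load("train_scores.npy")   # docking-score vectors of training compounds
y_train = np.load("train_labels.npy")   # 1 = active, 0 = decoy
X_test = np.load("test_scores.npy")

# k = 50 nearest neighbors by Euclidean distance; predict_proba returns the
# fraction of actives among the 50 nearest training compounds.
knn = KNeighborsClassifier(n_neighbors=50, metric="euclidean").fit(X_train, y_train)
p_active = knn.predict_proba(X_test)[:, 1]
```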

4.4.7. Logistic regression

We used scikit‐learn 32 to perform logistic regression. It minimizes the cost function:

\[
\frac{1}{2}\,\bar{w}^{T}\bar{w} + C\sum_{i=1}^{n}\log\!\left[\exp\!\left(-y_{i}\left(\bar{S}_{i}^{T}\bar{w}+c\right)\right)+1\right]
\]

with respect to the weights that form the components of the column vector $\bar{w}$. In our study, $\bar{S}_{i}$ was a column vector containing the docking scores of compound $i$; $y_{i} = 1$ if compound $i$ was active and $y_{i} = -1$ if it was inactive; $n$ was the number of compounds in a training set; $c$ was the intercept of the logistic regression model; and $C$ controlled the strength of regularization. We used the default value of 1 for $C$, as it already gave good AUCs.
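A matching scikit‐learn call, for orientation; the default C = 1 follows the text, while the file names are placeholders. The fitted coefficients and intercept correspond to the weight vector and c in the cost function above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X_train = np.load("train_scores.npy")   # rows are the docking-score vectors S_i
y_train = np.load("train_labels.npy")   # 0/1 labels; the +/-1 convention of the cost function is handled internally

# L2-regularized logistic regression with the default C = 1.0 used in the study.
logreg = LogisticRegression(C=1.0, max_iter=1000).fit(X_train, y_train)
w, c = logreg.coef_[0], logreg.intercept_[0]   # learned weights and intercept
```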

4.4.8. Support Vector Machine

We used the implementation in scikit‐learn 32 to perform this task. For EGFR, we first carried out a grid search to find a good combination of the gamma value of the radial basis functions and the C value that controlled how strongly misclassified points were penalized. We performed the grid search with C = 0.1, 1, 10, 100, and 1,000 and gamma = 1, 0.5, 0.45, 0.4, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1, 0.01, 0.001, and 0.0001. The combination C = 100 and gamma = 0.2 gave better performance and was used in all subsequent calculations for all the proteins studied. We also applied the “probability” option to obtain estimates of the probabilities of compounds being active rather than only classifying them as actives or inactives.
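A minimal sketch of such a grid search with scikit‐learn, using the parameter grid quoted above; the file names, the AUC scoring choice, and the three‐fold cross‐validation inside the search are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X_train = np.load("train_scores.npy")
y_train = np.load("train_labels.npy")

param_grid = {
    "C": [0.1, 1, 10, 100, 1000],
    "gamma": [1, 0.5, 0.45, 0.4, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1, 0.01, 0.001, 0.0001],
}

# RBF-kernel SVM with probability estimates, tuned by grid search on AUC.
search = GridSearchCV(SVC(kernel="rbf", probability=True),
                      param_grid, scoring="roc_auc", cv=3)
search.fit(X_train, y_train)
print(search.best_params_)   # e.g., {'C': 100, 'gamma': 0.2} as reported in the text
```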

4.4.9. Random forest

We used scikit‐learn 32 to perform this task. For EGFR, we varied the parameter n_estimators, the number of trees in the forest, from 500 to 2,000, but the performance was not sensitive to this choice, and we used n_estimators = 750 for all subsequent calculations for all the proteins studied.

4.4.10. Summary of benchmarking results

We found that machine‐learning‐enhanced ensemble docking removed the previously observed artifact that increasing the number of structures in an ensemble could decrease rather than increase performance. 25 k nearest neighbors and random forest gave better performance than logistic regression and support vector machine in most cases. The best area under the receiver operating characteristic curve ranged from ≈0.72 to close to 1 for the proteins studied, with most values greater than 0.8. The performance is generally higher than the results obtained by using NNScore, derived from machine learning, in previous studies. 27 This supports our statement in the introduction that a bottom‐up approach using machine learning for each protein could perform better than a generic approach applying to many proteins. This is particularly worth noting as EDock‐ML applies machine learning only to the docking scores from AutoDock Vina without changing its scoring function.

AUTHOR CONTRIBUTIONS

Tanay Chandak: Formal analysis; investigation; methodology; software; validation; writing‐original draft; writing‐review & editing. Chung F. Wong: Conceptualization; formal analysis; investigation; methodology; software; validation; writing‐original draft; writing‐review & editing.

ACKNOWLEDGMENTS

The U.S. National Institutes of Health (CA224033) has provided support for this work. We also acknowledge computing support from UMSL Information Technology Services. John Mayginnes provided some docking data described in reference [25].

Chandak T, Wong CF. EDock‐ML: A web server for using ensemble docking with machine learning to aid drug discovery. Protein Science. 2021;30:1087–1097. 10.1002/pro.4065

Funding information National Cancer Institute, Grant/Award Number: CA224033

REFERENCES

1. Pinzi L, Rastelli G. Molecular docking: Shifting paradigms in drug discovery. Int J Mol Sci. 2019;20:4331.
2. Meng EC, Shoichet BK, Kuntz ID. Automated docking with grid‐based energy evaluation. J Comput Chem. 1992;13:505–524.
3. Kuntz ID, Meng EC, Shoichet BK. Structure‐based molecular design. Acc Chem Res. 1994;27:117–123.
4. Shoichet BK, Kuntz ID. Matching chemistry and shape in molecular docking. Protein Eng. 1993;6:723–732.
5. Shan Y, Kim ET, Eastwood MP, Dror RO, Seeliger MA, Shaw DE. How does a drug molecule find its target binding site? J Am Chem Soc. 2011;133:9181–9183.
6. Decherchi S, Berteotti A, Bottegoni G, Rocchia W, Cavalli A. The ligand binding mechanism to purine nucleoside phosphorylase elucidated via molecular dynamics and machine learning. Nat Commun. 2015;6:7155.
7. Lorber DM, Shoichet BK. Flexible ligand docking using conformational ensembles. Protein Sci. 1998;7:938–950.
8. Kong X, Pan P, Li D, Tian S, Li Y, Hou T. Importance of protein flexibility in ranking inhibitor affinities: Modeling the binding mechanisms of piperidine carboxamides as type I1/2 ALK inhibitors. Phys Chem Chem Phys. 2015;17:6098–6113.
9. Ellingson SR, Miao Y, Baudry J, Smith JC. Multi‐conformer ensemble docking to difficult protein targets. J Phys Chem B. 2015;119:1026–1034.
10. Amaro RE, Li WW. Emerging methods for ensemble‐based virtual screening. Curr Top Med Chem. 2010;10:3–13.
11. Polgar T, Keseru GM. Ensemble docking into flexible active sites. Critical evaluation of FlexE against JNK‐3 and beta‐secretase. J Chem Inf Model. 2006;46:1795–1805.
12. Sørensen J, Demir Ö, Swift RV, Feher VA, Amaro RE. Molecular docking to flexible targets. Methods Mol Biol. 2015;1215:445–469.
13. Osguthorpe DJ, Sherman W, Hagler AT. Exploring protein flexibility: Incorporating structural ensembles from crystal structures and simulation into virtual screening protocols. J Phys Chem B. 2012;116:6952–6959.
14. Wong CF, Kua J, Zhang Y, Straatsma TP, McCammon JA. Molecular docking of balanol to dynamics snapshots of protein kinase A. Proteins. 2005;61:850–858.
15. Cavasotto CN, Kovacs JA, Abagyan RA. Representing receptor flexibility in ligand docking through relevant normal modes. J Am Chem Soc. 2005;127:9632–9640.
16. Osterberg F, Morris GM, Sanner MF, Olson AJ, Goodsell DS. Automated docking to multiple target structures: Incorporation of protein mobility and structural water heterogeneity in AutoDock. Proteins. 2002;46:34–40.
17. Knegtel RM, Kuntz ID, Oshiro CM. Molecular docking to ensembles of protein structures. J Mol Biol. 1997;266:424–440.
18. Lin JH, Perryman AL, Schames JR, McCammon JA. Computational drug design accommodating receptor flexibility: The relaxed complex scheme. J Am Chem Soc. 2002;124:5632–5633.
19. Lin JH, Baker NA, McCammon JA. Bridging implicit and explicit solvent approaches for membrane electrostatics. Biophys J. 2002;83:1374–1379.
20. Totrov M, Abagyan R. Flexible ligand docking to multiple receptor conformations: A practical alternative. Curr Opin Struct Biol. 2008;18:178–184.
21. Bottegoni G, Rocchia W, Rueda M, Abagyan R, Cavalli A. Systematic exploitation of multiple receptor conformations for virtual ligand screening. PLoS ONE. 2011;6:e18845.
22. Wong CF. Incorporating receptor flexibility into structure‐based drug discovery. Computer‐Aided Drug Discovery, Methods in Pharmacology and Toxicology. New York, NY: Springer, 2016.
23. Wong CF. Flexible receptor docking for drug discovery. Expert Opin Drug Discov. 2015;10:1189–1200.
24. Abagyan R, Rueda M, Bottegoni G. Recipes for the selection of experimental protein conformations for virtual screening. J Chem Inf Model. 2010;50:186–193.
25. Chandak T, Mayginnes JP, Mayes H, Wong CF. Using machine learning to improve ensemble docking for drug discovery. Proteins. 2020;88:1263–1270.
26. Wong CF. Improving ensemble docking for drug discovery by machine learning. J Theor Comput Chem. 2019;18:1920001.
27. Durrant JD, McCammon JA. NNScore 2.0: A neural‐network receptor‐ligand scoring function. J Chem Inf Model. 2011;51:2897–2903.
28. Irwin JJ, Shoichet BK. ZINC – A free database of commercially available compounds for virtual screening. J Chem Inf Model. 2005;45:177–182.
29. Trott O, Olson AJ. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2010;31:455–461.
30. O'Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. Open Babel: An open chemical toolbox. J Cheminform. 2011;3:33.
31. Bayer M. SQLAlchemy. In: Brown A, Wilson G, editors. The Architecture of Open Source Applications, Volume II: Structure, Scale, and a Few More Fearless Hacks. aosabook.org, 2012; p. 291–314.
32. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit‐learn: Machine learning in Python. J Mach Learn Res. 2011;12:2825–2830.
33. Sterling T, Irwin JJ. ZINC 15 – Ligand discovery for everyone. J Chem Inf Model. 2015;55:2324–2337.
34. Mysinger MM, Carchia M, Irwin JJ, Shoichet BK. Directory of useful decoys, enhanced (DUD‐E): Better ligands and decoys for better benchmarking. J Med Chem. 2012;55:6582–6594.
35. Berman HM, Westbrook J, Feng Z, et al. The Protein Data Bank. Nucleic Acids Res. 2000;28:235–242.
36. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36.
37. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology. 1983;148:839–843.
38. Huang Z, Wong CF. Inexpensive method for selecting receptor structures for virtual screening. J Chem Inf Model. 2016;56:21–34.
39. Huang Z, He Y, Zhang X, et al. Derivatives of salicylic acid as inhibitors of YopH in Yersinia pestis. Chem Biol Drug Des. 2010;76:85–99.
40. Triballeau N, Acher F, Brabet I, Pin JP, Bertrand HO. Virtual screening workflow development guided by the “receiver operating characteristic” curve approach. Application to high‐throughput docking on metabotropic glutamate receptor subtype 4. J Med Chem. 2005;48:2534–2547.
