Abstract
We describe two methods of automated covalent docking using AutoDock4: the two‐point attractor method and the flexible side chain method. Both methods were applied to a training set of 20 diverse protein–ligand covalent complexes, evaluating their reliability in predicting the crystallographic pose of the ligands. The flexible side chain method performed best, recovering the pose in 75% of cases, with failures for the largest inhibitors tested. Both methods are freely available at the AutoDock website (http://autodock.scripps.edu).
Keywords: computational docking, computer‐aided drug design, covalent inhibitors, ligand–protein interactions
Introduction
There has recently been a resurgence of interest in inhibitors that bind covalently to their biomolecular targets.1, 2, 3 These compounds have the advantage of very tight binding, allowing design of compounds with small molecular mass but with high potency. Potential problems with selectivity have been a concern for development of covalent inhibitors, in spite of the fact that roughly one‐third of currently approved drugs act through covalent mechanisms. The major approach to reducing toxicity is to improve the selectivity of the compounds, both by optimizing the noncovalent interactions with the site of binding and by tailoring the chemistry of reaction with a specific site of alkylation.
Computational docking provides an effective way to evaluate the interaction of a trial compound with a target. With the recent increased interest in covalent inhibitors, a variety of methods have been reported for using computational docking to predict the binding of covalently bound compounds (see, for instance, Refs 4, 5, 6, 7). These methods search the conformations available to the ligand in its covalently attached state, and evaluate the energetics of interaction with the target binding site.8, 9, 10 More detailed methods, such as quantum mechanical analysis, complement these docking analyses to address the chemical reactivity of the compound.
We present methods to modify the distributed version of AutoDock11 to evaluate covalent complexes, by extending its standard functionality with custom potentials and new atom types. The specification and architecture of AutoDock were designed to allow this type of user modification for specialized systems, and have been used previously to model protein flexibility,12 flexible rings in ligands,13 zinc interaction,14 and selective hydration.15 Two different approaches have been developed for covalent docking,11 which differ in the way the ligand is modeled [Fig. 1]. With the flexible side chain method, the ligand is joined in an arbitrary conformation with the target, and the attached ligand is modeled as a fully flexible side chain in the AutoDock simulation, in a fashion similar to other docking programs (GOLD9 and FlexX10). Previously reported results suggest that the AutoDock flexible residue method outperformed GOLD on a dataset of 76 complexes.4 With the two‐point attractor method, the alkylating molecule is modeled as a free ligand, and a custom potential is used to bring together the covalently bound portions of ligand and alkylated residue. With both methods, it is possible to model the most frequently covalently modified residues, such as cysteine, lysine, threonine, and serine. The methods are presented and tested on a set of 20 diverse systems in order to compare their relative performance and identify the pros and cons of each.
Figure 1.

Schematic diagram of the two covalent docking approaches, with the protein in gray and the ligand in black. a) The two‐point attractor method calculates two energetically attractive maps (shown schematically with circles) at the site of covalent attachment on the protein, and uses two dummy atom types (X and Z) to target the ligand to the site. b) The flexible side chain method overlaps the ligand with the protein residue, and then uses AutoDock to optimize the conformation.
Results and Discussion
The goal of this work is to assess and compare the performance of the two approaches for modifying AutoDock to handle covalent docking: the two‐point attractor method and the flexible side chain method. Accuracy was assessed by measuring the RMSD between the lowest energy pose and the experimental coordinates, and results with deviations below 3.0 Å were considered successful. The total number of result clusters (2.0 Å RMSD tolerance) and the number of poses in the lowest energy cluster were used to assess the reliability in finding the correct pose. For the two‐point attractor method, the RMSD values between the Cα and Cβ of the docked poses and the experimental coordinates were all less than 0.09 Å, so the method is effective for forming the covalent bond. RMSD results are included in Figure 2 along with images of the best docked conformation for each complex. Data for RMSD and cluster size, and their relationship to the flexibility of the complexes, are presented graphically in Figure 3 and in Supporting Information Tables 2–4.
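The accuracy criterion can be made concrete with a short Python sketch. It assumes that the RMSD is computed without refitting, over atoms listed in the same order in the docked and crystallographic poses; the function names `rmsd` and `is_successful` are illustrative and are not part of AutoDock.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (in Å) between two poses given as equally ordered lists of (x, y, z)."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("poses must be non-empty and contain the same atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def is_successful(docked, crystal, cutoff=3.0):
    """Apply the success criterion of the text: RMSD below the cutoff (Å)."""
    return rmsd(docked, crystal) < cutoff
```

No superposition is applied before the comparison, because the docked pose and the crystal structure are assumed to share the receptor frame.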
Figure 2.

Best pose of compounds with the covalently bound residues (purple) compared with the crystallographic pose (green) of the ligands. RMSD values are given in parentheses. a) Two‐point attractor method at 100 runs; b) Flexible side chain method at 50 runs.
Figure 3.

Correlation between the torsional degrees of freedom and RMSD results. a) Two‐point attractor method; b) Flexible side chain method. The size of data points is roughly proportional to the number of conformations in the cluster of best predicted energy.
Overall, the two‐point attractor method performed poorly, reproducing the experimental coordinates in 4 systems out of 20 (20% success rate). Energies improved slightly when going from default LGA settings (Table 3 in Supporting Information) to extended settings (Table 4 in Supporting Information), but no significant variations were found in the overall success rate. Also, accuracy is roughly correlated with the number of rotatable bonds of the ligand [Fig. 3(a)], with only one successful result with ligands over 9 torsional degrees of freedom. Accuracy of the flexible side chain method is considerably higher, reproducing experimental coordinates in 15 out of 20 systems (75% success rate). As expected, because of the reduction in degrees of freedom in the tethered docking method, it is less sensitive than the two‐point attractor method to the torsional complexity of the ligands, obtaining successful results with up to 14 torsional degrees of freedom [Fig. 3(b)].
Overall, highly flexible ligands showed poorer reliability in finding the correct result, as observed in the larger number of scarcely populated clusters for very flexible ligands. Since the ligand is docked untethered with the two‐point attractor method, the search involves a larger number of degrees of freedom, making it inherently more complex. In the flexible side chain method, the ligand is properly aligned along the Cα–Cβ bond and only torsional degrees of freedom need to be searched, whereas the two‐point attractor method needs to expend much computational effort finding and optimizing the location and orientation of the covalent bond at the site of attachment. Therefore, the two‐point attractor method is more prone to finding false‐positive poses that can be ranked higher than the correct pose, resulting in RMSD values as high as 6 Å. As expected, the search is more efficient with the flexible residue method than with the two‐point attractor method. Both methods are successful with low torsional degrees of freedom (6, 7, and 8 for 2awz,18 3c9w,19 and 1pwc,17 respectively), but they both fail with complex 3lok20 (7 torsions). This is due to the absence of the coordinating water between the ligand and residue Glu339 (data not shown). Also, neither method is successful in redocking the complex 1w3c,21 due to the large fraction of the ligand (i.e., thiophene and indole rings) exposed to the solvent.
Methods
AutoDock is an automated computational procedure for predicting the interactions of ligands with macromolecular targets.16 A typical AutoDock calculation is a two‐step process: first, a map of interaction energies is calculated using a series of probe atoms, and then these maps are used as look‐up tables during the docking conformational search to speed up ligand energy evaluation. The two covalent docking methods modify different aspects of the standard AutoDock method.
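The two‐step idea (precompute a map, then use it as a look‐up table) can be illustrated with a minimal Python sketch. It assumes a regular cubic grid and trilinear interpolation between the eight surrounding grid points; the function names `build_map` and `lookup`, the toy energy function, and the grid size are illustrative assumptions and do not reproduce AutoDock's actual code or file formats.

```python
def build_map(energy_fn, origin, spacing, npts):
    """Step 1: evaluate a probe energy function once at every grid point."""
    nx, ny, nz = npts
    return [[[energy_fn((origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing))
              for k in range(nz)]
             for j in range(ny)]
            for i in range(nx)]


def lookup(grid, origin, spacing, point):
    """Step 2: trilinear interpolation of the precomputed map at an atom position."""
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    base, frac = [], []
    for axis in range(3):
        t = (point[axis] - origin[axis]) / spacing  # fractional grid index
        if t < 0 or t > dims[axis] - 1:
            raise ValueError("point lies outside the grid")
        i0 = min(int(t), dims[axis] - 2)  # lower corner of the enclosing cell
        base.append(i0)
        frac.append(t - i0)
    value = 0.0
    for di in (0, 1):
        wx = frac[0] if di else 1.0 - frac[0]
        for dj in (0, 1):
            wy = frac[1] if dj else 1.0 - frac[1]
            for dk in (0, 1):
                wz = frac[2] if dk else 1.0 - frac[2]
                value += wx * wy * wz * grid[base[0] + di][base[1] + dj][base[2] + dk]
    return value


# Toy example: a linear "energy" is reproduced exactly by the interpolation.
toy_map = build_map(lambda p: p[0] + 2 * p[1] + 3 * p[2], (0.0, 0.0, 0.0), 0.375, (5, 5, 5))
toy_energy = lookup(toy_map, (0.0, 0.0, 0.0), 0.375, (0.5, 0.7, 1.1))
```

The expensive pairwise energy evaluation is done once per grid point, so scoring a pose during the search reduces to one interpolation per ligand atom.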
Two‐point attractor method
The two‐point attractor method uses modified interaction maps and modified atom types. The target residue side chain is clipped to remove two terminal atoms (i.e., Cβ and Oγ for serine). These two atoms are attached to the alkylating ligand at the appropriate location with ideal chemical geometry, and assigned two special atom types (X and Z). Two specialized interaction maps are created for these atom types, with a Gaussian potential (termed here the Z‐potential) centered on the location of the original atoms in the receptor structure, with a negative value close to the desired location and rising to zero at distant locations. The Z‐potential penalizes poses where the Z or X atoms are away from their covalently attached location, pulling the ligand into the proper pose [Fig. 1(a)].
The Gaussian Z‐potential is defined with two parameters: width (δ, unit: Å) and amplitude (ε, unit: kcal/mol). Optimal values for these parameters need to provide a good compromise between atom placement accuracy and search accessibility. The combination of small δ and large ε results in a narrow and deep potential, whose energetic minimum will be accurately located on the atom position, but will be much harder to find during the search. Conversely, a potential with large δ and small ε will be easier to reach during the search, but will provide lower precision in reproducing the covalent geometry. Prior to performing docking on the whole dataset with the two‐point attractor method, we identified the optimal values of the Gaussian potential coefficients δ and ε using the complex of a DD‐peptidase with penicillin G (PDB entry 1pwc17), which was also used as the test case for the first implementation of the method.11 We tested a range of values (0.5–100 Å for δ and 10–50 kcal/mol for ε, Supporting Information Table 1). Most values provided satisfactory results, with several combinations achieving RMSD below 1.0 Å. As expected, extreme value pairs (i.e., high‐δ/low‐ε, low‐δ/high‐ε) gave the worst results, with higher deviations from the experimental poses. To identify the optimal values, the following criteria were adopted: (a) the largest δ was chosen that still provides proper atom placement, to improve the search effectiveness; (b) the smallest magnitude of ε was chosen that results in proper placement, to ensure that the contribution of the Z‐potential does not swamp out the energetics of interaction of the noncovalent portion of the molecule. The optimal values selected were δ = 3 Å and ε = 10 kcal/mol.
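The trade‐off between δ and ε can be explored with a small Python sketch. The text does not give the exact functional form of the Z‐potential, so the expression below is an assumption: δ is treated as the Gaussian standard deviation and ε as the depth of the well, giving an energy of −ε at the target position that rises toward zero with distance. The name `z_potential` is illustrative.

```python
import math


def z_potential(point, center, delta=3.0, epsilon=10.0):
    """Assumed Gaussian well: -|epsilon| * exp(-r^2 / (2 * delta^2)), in kcal/mol.

    point, center: (x, y, z) in Å; delta: width in Å; epsilon: amplitude in kcal/mol.
    The functional form is an assumption, not taken from the AutoDock source.
    """
    r2 = sum((p - c) ** 2 for p, c in zip(point, center))
    return -abs(epsilon) * math.exp(-r2 / (2.0 * delta ** 2))


target = (0.0, 0.0, 0.0)
probe = (3.0, 0.0, 0.0)  # an atom 3 Å away from the target position
wide = z_potential(probe, target, delta=3.0, epsilon=10.0)    # still clearly attractive
narrow = z_potential(probe, target, delta=0.5, epsilon=10.0)  # practically zero
```

Under this assumed form, an atom 3 Å from the target still feels a sizeable attraction with δ = 3 Å, whereas with δ = 0.5 Å the well is essentially invisible at that distance, which is the search accessibility problem described above.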
Flexible side chain method
In the flexible side chain method, a ligand coordinate file is modified by connecting the two target residue atoms at the site of alkylation, with ideal chemical geometry. These two ligand atoms are then overlapped with the matching atoms in the receptor structure to establish the covalent bond with the residue before running the docking. Then, during the docking, the complex is treated as a fully flexible side chain, using the existing AutoDock method for modeling selected receptor flexibility. In our approach, which assumes a rigid protein backbone, Cα and Cβ atoms are fixed in space while all torsions of both ligand and residue (including Cα–Cβ) are allowed to rotate [Fig. 1(b)].
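The overlap step can be sketched in Python as a rigid‐body move that places the first attached atom on its receptor counterpart and aligns the bond between the two attached atoms with the corresponding receptor bond. This is a minimal illustration, not the AutoDockTools implementation; the function name `superimpose_two_atoms` and the helper functions are hypothetical. Two atoms leave the rotation about the shared bond undetermined, which is consistent with the ligand starting in an arbitrary conformation.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return tuple(x / n for x in a)


def _rotate(v, axis, cos_t, sin_t):
    """Rodrigues rotation of v about a unit axis."""
    kxv = _cross(axis, v)
    kdv = _dot(axis, v)
    return tuple(v[i] * cos_t + kxv[i] * sin_t + axis[i] * kdv * (1.0 - cos_t)
                 for i in range(3))


def superimpose_two_atoms(coords, lig_a, lig_b, rec_a, rec_b):
    """Move coords so that lig_a lands on rec_a and lig_a->lig_b points along rec_a->rec_b."""
    u = _unit(_sub(lig_b, lig_a))
    w = _unit(_sub(rec_b, rec_a))
    axis = _cross(u, w)
    sin_t = math.sqrt(_dot(axis, axis))
    cos_t = _dot(u, w)
    if sin_t < 1e-12:
        # Bonds already parallel (no rotation) or antiparallel (180° about any
        # axis perpendicular to the bond).
        helper = (1.0, 0.0, 0.0) if abs(u[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = _unit(_cross(u, helper))
        sin_t = 0.0
        cos_t = 1.0 if cos_t > 0 else -1.0
    else:
        axis = tuple(x / sin_t for x in axis)
    moved = []
    for p in coords:
        r = _rotate(_sub(p, lig_a), axis, cos_t, sin_t)
        moved.append(tuple(r[i] + rec_a[i] for i in range(3)))
    return moved
```

If the two ligand atoms were built with the same bond length as the receptor atoms, the second atom also lands on its receptor counterpart after the move.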
Data set
In order to compare results obtained with the two methods, we filtered the PDB to obtain a representative data set using the following criteria:
covalent ligand binding;
X‐ray diffraction resolution ≤2.65 Å;
structurally diverse ligands, with torsional degrees of freedom between 5 and 22;
Ser or Cys covalent residue;
no co‐factors in the binding site.
The final data set includes 20 structures from 19 different protein families. Ligands in these complexes present a wide range of structural complexity, with torsional degrees of freedom ranging from 6 to 20, and molecular weight from 369 to 738 Da. The full list of PDB accession codes and information is available in Supporting Information Table 2.
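The per‐entry selection criteria listed above can be expressed as a simple filter. The Python sketch below assumes that each candidate complex has already been annotated as a dictionary with the hypothetical keys shown; these field names and the placeholder entries are illustrative and do not correspond to any PDB query interface or to real structures. Structural diversity of the ligands is a property of the set as a whole and is not captured by this per‐entry test.

```python
def passes_filters(entry):
    """Return True if one annotated complex meets the per-entry criteria."""
    return (entry["covalent"]                            # covalent ligand binding
            and entry["resolution"] <= 2.65              # X-ray resolution in Å
            and 5 <= entry["torsions"] <= 22             # ligand torsional degrees of freedom
            and entry["residue"] in {"SER", "CYS"}       # covalently modified residue
            and not entry["cofactor_in_site"])           # no co-factors in the binding site


def select_complexes(entries):
    """Keep only the entries that satisfy every criterion."""
    return [e for e in entries if passes_filters(e)]


# Placeholder records (not real PDB entries), for illustration only.
candidates = [
    {"id": "aaaa", "covalent": True, "resolution": 1.90, "torsions": 8,
     "residue": "SER", "cofactor_in_site": False},
    {"id": "bbbb", "covalent": True, "resolution": 3.00, "torsions": 8,
     "residue": "CYS", "cofactor_in_site": False},
    {"id": "cccc", "covalent": True, "resolution": 2.00, "torsions": 25,
     "residue": "CYS", "cofactor_in_site": False},
    {"id": "dddd", "covalent": True, "resolution": 2.65, "torsions": 5,
     "residue": "LYS", "cofactor_in_site": False},
]
selected = select_complexes(candidates)
```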
Coordinate preparation and docking
The two covalent docking methods require different input preparation protocols. Receptor structures were obtained from the PDB complexes by selecting a single protein subunit and removing all waters and co‐factors. AutoDockTools11 (ADT) was used to add hydrogens, calculate Gasteiger charges, and generate PDBQT files.
For the two‐point attractor method, the ligand was extracted from the PDB file and modified by adding two extra atoms X and Z, corresponding to the two side chain terminal atoms (i.e., serine Cβ and Oγ). Hydrogens and charges were added, and to prevent any bias toward the known input ligand configuration, the resulting PDBQT coordinates were randomized (orientation, translation, and torsions) using AutoDock Vina.16 The grid parameter file (GPF) was modified to include the definition of the two Z‐potentials for atoms X and Z. The potential centers were defined on the original coordinates of the residue atoms, and grid maps were calculated following the standard AutoDock protocol for ligand docking.
In the flexible side chain method, the ligand file was created by adding two receptor atoms to the ligand coordinates in ideal chemical geometry, and then using this file to superimpose the ligand on the appropriate residue in the target protein [Fig. 1(b)]. ADT was used to add hydrogens, calculate Gasteiger charges, and generate a modified flexible ligand, using default methods. The resulting side chain–ligand structure was treated as flexible during the docking simulation, sampling torsional degrees of freedom to optimize the interaction of the tethered ligand with the rest of the protein. Grid maps were calculated following the standard AutoDock protocol for flexible side chains.11 The torsions of the flexible residue were randomized using AutoDock Vina.
Docking simulations were performed using the Lamarckian Genetic Algorithm (LGA). For the flexible residue method, dockings on the whole dataset were run with default LGA settings (50 poses generated). Due to the higher complexity of the untethered two‐point attractor method, results obtained with the default LGA settings (50 poses) were compared with extended search parameters (ga_population = 300, ga_num_evals = 10^6, 100 poses generated). Final results were clustered with AutoDock using a 2.0 Å RMSD tolerance.
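The clustering step can be illustrated with a simplified Python sketch. It assumes an energy‐ranked scheme in which poses are visited from lowest to highest energy and each pose joins the first cluster whose lowest‐energy member lies within the RMSD tolerance; this is an approximation for illustration and is not AutoDock's own code. The names `rmsd` and `cluster_poses` are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (Å) between two equally ordered lists of (x, y, z), without refitting."""
    total = sum((a - b) ** 2
                for pa, pb in zip(coords_a, coords_b)
                for a, b in zip(pa, pb))
    return math.sqrt(total / len(coords_a))


def cluster_poses(poses, tolerance=2.0):
    """Group poses given as (energy, coords) pairs.

    Returns a list of clusters, each a list of pose indices; the first index
    in each cluster is its lowest-energy member, and clusters are ordered by
    the energy of that member.
    """
    order = sorted(range(len(poses)), key=lambda i: poses[i][0])
    clusters = []
    for i in order:
        for members in clusters:
            seed = members[0]
            if rmsd(poses[i][1], poses[seed][1]) <= tolerance:
                members.append(i)
                break
        else:
            clusters.append([i])  # no cluster is close enough: start a new one
    return clusters


# Toy single-atom "poses": two close together, one far away.
toy = [(-8.0, [(0.0, 0.0, 0.0)]),
       (-9.0, [(0.5, 0.0, 0.0)]),
       (-7.0, [(10.0, 0.0, 0.0)])]
toy_clusters = cluster_poses(toy)
```

The number of clusters and the population of the lowest‐energy cluster (here `len(toy_clusters)` and `len(toy_clusters[0])`) correspond to the two reliability measures used in the Results.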
Conclusions
We compared two methods for performing covalent docking with AutoDock: a two‐point attractor method and a modified protocol based on flexible receptor residues. The methods were tested on a diverse set of complexes to assess their relative performance. Overall, the flexible residue method provided better accuracy (75% success rate) than the two‐point attractor method (20%). The success rate of both methods showed a correlation with the complexity of the ligands, with accuracy decreasing as the number of torsional degrees of freedom increases. Based on the results from the data set, the flexible residue method provides the most effective approach for simulating covalent ligands. The two‐point attractor method, on the other hand, allows incorporation of a certain degree of approximation in the position of the covalent attachment, which may be important for accommodating any conformational changes in residue backbone structure surrounding the site of the covalent ligand (e.g., when docking in apo structures). We are currently assessing this possibility. Finally, both methods provide examples of how AutoDock suite parameters can be customized to implement new methods and extend basic functionalities. Both methods are implemented in the current version of AutoDock (v.4.2.6), and tools and scripts for preparing input files are included in the latest release of AutoDockTools (v.1.5.7‐latest), all available at http://autodock.scripps.edu.
Supporting information
Supporting Information
Acknowledgments
This is manuscript 29076 from the Scripps Research Institute. We acknowledge Dr. Simona Distinto of University of Cagliari for her support of G.B. for this work.
It is with great pleasure that we contribute this article to this special issue of Protein Science in honor of our long‐time colleague and collaborator Prof. Ron Levy. A.J.O. first met Ron Levy almost 40 years ago and has appreciated his significant contributions to the fields of computational chemistry and biology over these intervening years. Ron has been an active and inspirational collaborator on a number of our recent papers on docking and, importantly, has introduced his free energy methods to improve the results of virtual screening in the drug design pipeline. Congratulations and best wishes for continued research excellence go to Ron on the occasion of this Festschrift.
References
1. Johnson DS, Weerapana E, Cravatt BF (2010) Strategies for discovering and derisking covalent, irreversible enzyme inhibitors. Future Med Chem 2:949–964.
2. Singh J, Petter RC, Baillie TA, Whitty A (2011) The resurgence of covalent drugs. Nat Rev Drug Discov 10:307–317.
3. Mah R, Thomas JR, Shafer CM (2014) Drug discovery considerations in the development of covalent inhibitors. Bioorg Med Chem Lett 24:33–39.
4. Ouyang X, Zhou S, Su CTT, Ge Z, Li R, Kwoh CK (2013) CovalentDock: automated covalent docking with parameterized covalent linkage energy estimation and molecular geometry constraints. J Comput Chem 34:326–336.
5. Zhu K, Borrelli K, Greenwood JR, Day T, Abel R, Farid RS, Harder E (2014) Docking covalent inhibitors: a parameter free approach to pose prediction and scoring. J Chem Inf Model 54:1932–1940.
6. London N, Miller RM, Krishnan S, Uchida K, Irwin JJ, Eidam O, Gibold L, Cimermancic P, Bonnet R, Shoichet BK, Taunton J (2014) Covalent docking of large libraries for the discovery of chemical probes. Nat Chem Biol 10:1066–1072.
7. Scholz C, Knorr S, Hamacher K, Schmidt B (2015) DOCKTITE‐A highly versatile step‐by‐step workflow for covalent docking and virtual screening in the molecular operating environment. J Chem Inf Model 55:398–406.
8. Moitessier N, Englebienne P, Lee D, Lawandi J, Corbeil CR (2008) Towards the development of universal, fast and highly accurate docking/scoring methods: a long way to go. Br J Pharmacol 153:S7–S26.
9. Verdonk ML, Cole JC, Hartshorn MJ, Murray CW, Taylor RD (2003) Improved protein–ligand docking using GOLD. Proteins 52:609–623.
10. Kramer B, Rarey M, Lengauer T (1999) Evaluation of the FlexX incremental construction algorithm for protein‐ligand docking. Proteins 37:228–241.
11. Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, Olson AJ (2009) AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J Comput Chem 30:2785–2791.
12. Österberg F, Morris GM, Sanner MF, Olson AJ, Goodsell DS (2002) Automated docking to multiple target structures: incorporation of protein mobility and structural water heterogeneity in AutoDock. Proteins 46:34–40.
13. Forli S, Botta M (2007) Lennard‐Jones potential and dummy atom settings to overcome the AUTODOCK limitation in treating flexible ring systems. J Chem Inf Model 47:1481–1492.
14. Santos‐Martins D, Forli S, Ramos MJ, Olson AJ (2014) AutoDock4(Zn): an improved AutoDock force field for small‐molecule docking to zinc metalloproteins. J Chem Inf Model 54:2371–2379.
15. Forli S, Olson AJ (2012) A force field with discrete displaceable waters and desolvation entropy for hydrated ligand docking. J Med Chem 55:623–638.
16. Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, Belew RK, Olson AJ (1998) Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J Comput Chem 19:1639–1662.
17. Silvaggi NR, Josephine HR, Kuzin AP, Nagarajan R, Pratt RF, Kelly JA (2005) Crystal structures of complexes between the R61 DD‐peptidase and peptidoglycan‐mimetic beta‐lactams: a non‐covalent complex with a "perfect penicillin". J Mol Biol 345:521–533.
18. Powers JP, Piper DE, Li Y, Mayorga V, Anzola J, Chen JM, Jaen JC, Lee G, Liu J, Peterson G, Tonn GR, Yu Q, Walker NPC, Wang Z (2006) SAR and mode of action of novel non‐nucleoside inhibitors of hepatitis C NS5b RNA polymerase. J Med Chem 49:1034–1046.
19. Rastelli G, Rosenfeld R, Reid R, Santi DV (2008) Molecular modeling and crystal structure of ERK2‐hypothemycin complexes. J Struct Biol 164:18–23.
20. Kluter S, Simard JR, Rode HB, Grutter C, Pawar V, Raaijmakers HCA, Barf TA, Rabiller M, van Otterlo WAL, Rauh D (2010) Characterization of irreversible kinase inhibitors by directly detecting covalent bond formation: a tool for dissecting kinase drug resistance. Chembiochem 11:2557–2566.
21. Ontoria JM, Di Marco S, Conte I, Di Francesco ME, Gardelli C, Koch U, Matassa VG, Poma M, Steinkuhler C, Volpari C, Harper S (2004) The design and enzyme‐bound crystal structure of indoline based peptidomimetic inhibitors of hepatitis C virus NS3 protease. J Med Chem 47:6443–6446.
