Abstract
We describe the formalization of the reactive docking protocol, a method developed to model and predict reactions between small molecules and biological macromolecules. The method has been successfully used in a number of applications already, including recapitulating large proteomics datasets, performing structure-reactivity target optimizations and prospective virtual screenings. By modeling a near-attack conformation-like state, no QM calculations are required to model ligand and receptor geometries. Here, we present its generalization using a large dataset containing more than 400 ligand-target complexes, 8 nucleophilic modifiable residue types, and more than 30 warheads. The method correctly predicts the modified residue in ~85% of complexes and shows enrichments comparable to standard focused virtual screenings in ranking ligands. This performance supports this approach for the docking and screening of reactive ligands in virtual chemoproteomics and drug design campaigns.
Introduction
Small molecules containing electrophilic reactive warheads have emerged as an important ligand class1, especially in drug design and chemical proteomics. Covalent drugs have been employed against a variety of targets as well as disease classes and their rational design has been a topic of increasing study2. Relative to conventional drugs, they can exhibit improved pharmacodynamics, higher potency, and improved selectivity. However, concerns have been raised due to their potential toxicity and promiscuous binding3.
Taking into account the issues that a reactive molecule could potentially cause, research in the past decade has focused on merging the strengths of covalent and non-covalent binders (i.e., high selectivity and high affinity), giving rise to “targeted covalent inhibitors” (TCIs)1. These molecules install low-reactivity warheads (i.e., the electrophilic moiety of the ligand) on a molecular scaffold with high binding affinity for the target structure, resulting in less promiscuity4,5. Molecules bearing acrylamide warheads are one example of TCIs. They have been used extensively to design targeted covalent inhibitors aimed at non-catalytic residues in different protein families, such as kinases and enzymes, of which EGFR6 and KRAS7 are notable examples, where selectivity is challenging to achieve with conventional binders. TCIs target the G12C mutant of KRAS, in which the mutated residue is located close to an allosteric pocket that has proven difficult to target with non-covalent ligands6. Binders with acrylamide warheads covalently engage C797 in EGFR, a residue present in a small group of kinases6, whereas inhibition with conventional binders gave rise to drug resistance mechanisms. To date, there are at least 7 TCIs targeting this residue, including neratinib and acalabrutinib6.
This rediscovered interest in covalent binders increases the importance of having a structure or high-quality model on which to perform structure-based optimizations8. Conversely, in chemical proteomics the promiscuity of electrophilic fragments is a feature that allows the identification of potentially modifiable residues across a proteome. Nonetheless, structure-based modeling can be critical here, for example in disambiguating which residue is labeled in a peptide, or which enantiomer of a racemic probe may be more active for further development. Modeling may also provide rational bases for the selectivity across probes for a given site, which can improve the SAR information extracted from the original proteomic experiment.
This interest in TCIs has been accompanied by a resurgence of interest in computational methods that can model such inhibitors, which come with unique modeling challenges9. Existing methods tend to focus on the evaluation of the end-point of the reaction, modeling the ligand in the bound state, such as the tethered10,11 or biased (or constrained) approaches12,13. More computationally expensive methods supplement these models with free energy calculations8,14,15.
These methods are suitable for the analysis of well-defined sets of molecules in well-characterized binding sites16 but are not ideal for predicting binding sites across the nucleophilic residues of a protein, let alone a proteome. For that, an ideal method should be capable of predicting both the correct residue for modification and the optimal reactive ligand(s).
In this work, we present the formalization of the reactive docking method, which was specifically designed to address both challenges. This method uses a modified version of the AutoDock4 standard force field17 to model a modified near-attack conformation18 (NAC, Fig. 1a), the last ground-state geometry preceding the transition-state geometries, with a decreased equilibrium distance between the reactive atoms. This species was chosen to model the ligand-target interaction because it represents an essential thermodynamic checkpoint for the reaction while containing minimal geometric perturbations to the ligand or target structures. With this model, the ligand is docked in the unmodified state prior to the reaction, with the reactive warhead still intact. Another advantage of modeling a NAC-like state is that, as a ground-state model, it requires no expensive QM calculations to determine the ligand and receptor geometries, minimizing the domain knowledge required to parameterize a new reaction class. Despite its geometric simplicity, the modeled state is an essential step in the reaction pathway. The AutoDock standard force field estimates the free energy of binding as a sum of the following terms for any pairwise interaction.
Figure 1.
a) Schematic representation of the reactive docking model and the NAC-like state, showing the pseudo 1–3 and 1–4 interactions and their van der Waals radii. Reactive atoms are highlighted with red asterisks, pseudo 1–3 and 1–4 interactions are shown with dashed lines. b) Representation of the independent docking on selected solvent-exposed residue (cysteine, teal spheres) on the target protein and the possible outcomes of the simulations.
In brief, the force field terms are the van der Waals, directional hydrogen bond, and electrostatic enthalpies, the torsional entropy, and the desolvation free energy (for a full description of all the terms, see reference19). The standard force field is modified by adding a 13–7 potential between the ligand reactive atom and the receptor reactive atom to represent incipient bond formation.
The potential is defined by an equilibrium distance (Å) and an equilibrium energy (kcal/mol). Like the van der Waals and hydrogen bond potentials, the reactive potential is softened according to the standard AutoDock force field description20. The 13–7 potential was initially chosen21 to provide a slightly narrower and steeper potential than the standard van der Waals term; however, during the calibration process its impact proved to be minimal. A major approximation of this approach is that variations in the intrinsic reactivity of the ligand are ignored. As a result of this newly created pseudo-interaction, atoms within two bonds of the two reactive atoms are placed at distances shorter than their respective van der Waals equilibrium distances (Fig.1a). To compensate for the tight proximity that this geometry induces, the distances for pseudo 1–3 and 1–4 pairwise interactions around the reactive atoms are scaled by factors W1,3 and W1,4, respectively, while any hydrogens bound to these atoms are excluded entirely from the pairwise calculation. Any atom beyond two bonds from the reactive atom pair is treated with default parameters. During docking, ligands are free to explore multiple binding modes, including those not compatible with the formation of a covalent bond with the target residue, which is modeled as flexible17. Once dockings are completed, the result with the best docking score is analyzed to measure the distance between the ligand and residue reactive atoms, which determines the outcome of the reaction (i.e., covalent or not, Fig.1b).
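As an illustration of the reactive term, the sketch below evaluates a generalized n–m (Mie-type) potential with exponents 13 and 7. The functional form, the prefactors, and the function name `reactive_potential` are assumptions, chosen so that the minimum lies at the equilibrium distance with a depth equal to the equilibrium energy; the exact expression and the softening applied in AutoDock-GPU may differ. The default values (1.7 Å, 3.5 kcal/mol) are the calibrated parameters reported in this work.

```python
def reactive_potential(r, r_eq=1.7, eps=3.5, n=13, m=7):
    """Generalized n-m potential (assumed form, not taken from the paper).

    V(r) = eps * [ m/(n-m) * (r_eq/r)**n  -  n/(n-m) * (r_eq/r)**m ]
    The prefactors place the minimum at r = r_eq with V(r_eq) = -eps.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    x = r_eq / r
    return eps * (m / (n - m) * x**n - n / (n - m) * x**m)


if __name__ == "__main__":
    # Scan a few reactive-atom distances (in Å) around the equilibrium value.
    for r in (1.5, 1.7, 2.0, 3.0):
        print(f"r = {r:.1f} A   V = {reactive_potential(r):8.3f} kcal/mol")
```

With these assumed prefactors the well depth and position are set independently, which mirrors the way the equilibrium distance is held fixed while the equilibrium energy is calibrated.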
This method was initially developed to model acrylamides and chloroacetamides reacting with cysteine thiols21; it was then extended to more warheads and residues and successfully applied to a variety of chemical proteomic tasks. In particular, it was used to resolve ambiguity in reactive residues in ABPP experiments on human T cells22,23, identifying the residue most likely to be modified based on structural data. The method has been extended and applied to other reactions, such as those involving SuFEx warheads24, and used in an “inverse drug discovery” campaign targeting lysines and tyrosines in the human proteome25. In addition, it has been used in focused drug design campaigns to model the reaction between SuFEx warheads and serine residues in specific targets26,27. In this work, the method is optimized and validated on a diverse dataset of targets (Fig.2), warheads, and reactions (Fig.3).
Figure 2.
Dataset statistics. a) Count of complexes containing a given modified residue type; b) Total number of solvent-accessible residues by type. c) Average number of solvent-accessible residue types per complex.
Figure 3.
a) Warheads contained in the dataset, grouped by the residues they modify. b) Abundance of warhead types in the dataset. c) Abundance of reaction mechanisms in the dataset.
Methods
Ligand preparation.
Ligand structures identified from the PDB were built in their native form, restoring the active warheads according to the primary literature citations (e.g., reconstruction of the β-lactam ring, addition of leaving groups, restoration of unsaturated bonds). Initial 3D coordinates were generated using OpenBabel28, modeling protonation states at pH 7.4, and then minimized (MMFF94s; 300 steps of steepest descent; 300 steps of conjugate gradient). Partial charges, torsions, and standard and reactive atom type parameters were assigned according to the AutoDock protocol29 using Meeko (https://github.com/forlilab/Meeko.git), with SMARTS patterns to define the warhead atoms and assign the reactive docking force field parameters (Fig.1a). If present, macrocyclic structures were modeled as flexible17 by default during docking30.
Target preparation.
Target structures were retrieved from the Protein Data Bank and hydrogens were added with Reduce31. For oligomeric structures, dockings were performed only on the first chain. Partial charges, torsions, and standard and reactive atom type parameters were assigned using Meeko according to the AutoDock protocol29.
For each site, a cubic docking box of 30 Å side (80 points in the AutoGrid parameter file17) was defined and centered on the Cα of the target residue.
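The box size follows from the grid settings: assuming the default AutoGrid spacing of 0.375 Å, and that the point count refers to grid intervals per axis, 80 points span 30 Å per side. The sketch below computes the box corners around a Cα position; the helper name `docking_box` and the example coordinates are hypothetical.

```python
def docking_box(center, npts=80, spacing=0.375):
    """Return (min_corner, max_corner) of a cubic grid box.

    Assumes `npts` is the number of grid intervals per axis, so the
    side length is npts * spacing (80 * 0.375 = 30 Å).
    """
    half = npts * spacing / 2.0
    min_corner = tuple(c - half for c in center)
    max_corner = tuple(c + half for c in center)
    return min_corner, max_corner


if __name__ == "__main__":
    ca_xyz = (12.0, -3.5, 27.25)  # hypothetical C-alpha coordinates (Å)
    lo, hi = docking_box(ca_xyz)
    print("side length:", hi[0] - lo[0])
    print("min corner:", lo)
    print("max corner:", hi)
```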
Reactive docking.
By default, the residues to be evaluated by reactive docking are identified automatically on each target structure, using MSMS32 to find all solvent-accessible residues (default probe radius: 1.5 Å) of a given type (e.g., cysteines), including those in buried cavities (Fig.1b). Optionally, a user-defined list can be provided. Each ligand is then docked against each of the residues to be evaluated. During docking, ligands are modeled in their unmodified form with reactive warheads in place using a conventional (untethered) docking method, while the side chain of the target residue is modeled as flexible17. Dockings are performed using AutoDock-GPU20, generating 50 poses for each ligand with the default Lamarckian Genetic Algorithm (LGA) parameters29. A ligand and a residue are considered to react if the distance between their reactive atoms in the lowest-energy pose is ≤2.0 Å. Residues are then ranked by their likelihood of reacting, based on the best energy of the ligand(s) predicted to react with them. Reactive docking parameters affect neither the length nor the complexity of the calculation, resulting in docking times nearly identical to those of conventional dockings. A step-by-step description of the setup and execution of the reactive docking protocol is available at https://github.com/forlilab/Meeko.
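The decision rule and residue ranking described above can be sketched as follows. The `Pose` record, its field names, and the example values are hypothetical and do not reflect the output format of AutoDock-GPU or Meeko; only the 2.0 Å cutoff and the use of the lowest-energy pose come from the protocol.

```python
from dataclasses import dataclass

REACTIVE_CUTOFF = 2.0  # Å, maximum reactive-atom distance for a predicted reaction


@dataclass
class Pose:
    residue: str     # hypothetical residue label, e.g. "CYS12"
    ligand: str
    energy: float    # docking score in kcal/mol (lower is better)
    distance: float  # ligand-residue reactive atom distance in Å


def best_pose(poses):
    """Lowest-energy pose from one ligand/residue docking run."""
    return min(poses, key=lambda p: p.energy)


def rank_residues(docking_runs):
    """Rank residues by the best energy among ligands predicted to react.

    `docking_runs` holds one list of poses per (ligand, residue) docking.
    Residues with no reacting ligand are left out of the ranking.
    """
    best_by_residue = {}
    for poses in docking_runs:
        top = best_pose(poses)
        if top.distance <= REACTIVE_CUTOFF:
            current = best_by_residue.get(top.residue)
            if current is None or top.energy < current:
                best_by_residue[top.residue] = top.energy
    return sorted(best_by_residue.items(), key=lambda item: item[1])


if __name__ == "__main__":
    runs = [
        [Pose("CYS12", "lig1", -7.2, 1.8), Pose("CYS12", "lig1", -6.0, 4.5)],
        [Pose("CYS51", "lig1", -8.0, 5.1), Pose("CYS51", "lig1", -7.5, 1.9)],
        [Pose("CYS80", "lig1", -6.1, 2.0)],
    ]
    print(rank_residues(runs))
```

In this toy input the ligand is not predicted to react with CYS51, because its lowest-energy pose there is far from the residue even though a higher-energy pose is within the cutoff.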
Datasets.
In order to calibrate and validate the reactive docking parameters, covalent complexes containing a diverse pool of residues (Fig.2) and warheads (Fig.3) were collected from the PDB33. Complexes were visually inspected to discard highly distorted or problematic structures, yielding a dataset of 431 complexes with structurally diverse ligands and chemical reactions. The dataset was then subdivided into specialized sets with no overlap between them. The list of all PDB entries and warheads in the dataset is available in the Supplementary Material.
Training and test sets.
First, we selected a training set of 80 complexes using the most represented residues (cysteine, serine) for which the largest number of complexes with the most diverse ligands and protein families is available. The training set was built by randomly selecting 20 complexes representative of each of the four most abundant warheads (Fig.3a): acrylamides and chloroacetamides for cysteine; boronic acids and β-lactams for serine. The training set was used to calibrate the reactive docking force field parameters.
The remaining complexes were used to build the test set of 351 complexes, which includes 8 residue types (cysteine, serine, threonine, lysine, tyrosine, glutamic acid, histidine, and methionine, Fig.2a) and 37 warheads (Fig.3a). The test set was used to assess the performance and transferability of the parameters obtained with the calibration process on the training set.
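The split described above amounts to a stratified random draw. The sketch below is one way to express it; the record layout `(pdb_id, warhead)`, the function name, and the fixed seed are assumptions for illustration, not the procedure actually used to build the sets.

```python
import random


def split_dataset(complexes, train_warheads, n_per_warhead=20, seed=0):
    """Split (pdb_id, warhead) records into training and test sets.

    Randomly draws `n_per_warhead` complexes for each training warhead;
    every remaining complex goes to the test set.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    train = []
    for warhead in train_warheads:
        pool = sorted(c for c in complexes if c[1] == warhead)
        train.extend(rng.sample(pool, n_per_warhead))
    train_ids = {pdb_id for pdb_id, _ in train}
    test = [c for c in complexes if c[0] not in train_ids]
    return train, test


if __name__ == "__main__":
    # Placeholder identifiers, not real PDB entries.
    warheads = ["acrylamide", "chloroacetamide", "boronic acid", "beta-lactam"]
    data = [(f"{w}-{i:03d}", w) for w in warheads for i in range(30)]
    data.append(("other-000", "sulfonyl fluoride"))
    train, test = split_dataset(data, warheads)
    print(len(train), len(test))
```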
Virtual screening set.
A virtual screening set was built based on the experimental data published by Resnick et al.34, who used a library of 993 reactive fragments functionalized with mild electrophiles, including chloroacetamides (n = 752, 76%) and acrylamides (n = 241, 24%), targeting cysteines. This virtual screening set contains three targets for which structure coordinates were available and potent and selective fragments were identified: two enzymes, the deubiquitinase OTUB235 and the pyrophosphatase NUDT736, and K-RasG12C37.
Parameter selection.
The training set was used to calibrate the reactive docking parameters to obtain a single set of weights that maximizes the predictive accuracy. The equilibrium distance is fixed at 1.7 Å for all reactions; this value was chosen because it lies between the lengths of the shortest bond in our datasets (C-C bond, ~1.5 Å) and the longest (C-S bond, ~1.8 Å). After preliminary tests, the remaining values were obtained by performing an exhaustive search over the equilibrium energy (from 2.0 to 10.0 kcal/mol, in 0.5 kcal/mol intervals) and the scaling factors W1,3 and W1,4 (from 0.0 to 0.9, in 0.1 intervals). During the calibration, each parameter set was tested on the entire training set by docking each ligand on its cognate target, including all the solvent-accessible residues that could be modified by its warhead (e.g., all solvent-accessible cysteines for acrylamides).
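The exhaustive search can be sketched as a plain grid enumeration. The ranges follow the text; the `success_rate` callback is a hypothetical stand-in for docking the whole training set with one parameter set, and the toy objective in the usage example is invented purely so the code runs.

```python
from itertools import product


def parameter_grid():
    """Iterate over (eps, w13, w14) across the ranges described in the text."""
    eps_values = [2.0 + 0.5 * i for i in range(17)]    # 2.0 .. 10.0 kcal/mol
    w_values = [round(0.1 * i, 1) for i in range(10)]  # 0.0 .. 0.9
    return product(eps_values, w_values, w_values)


def calibrate(success_rate):
    """Return the parameter set with the highest training-set success rate."""
    return max(parameter_grid(), key=lambda params: success_rate(*params))


if __name__ == "__main__":
    # Toy objective for demonstration only; a real run would dock every
    # training complex with each parameter set and count correct predictions.
    def toy_success_rate(eps, w13, w14):
        return -abs(eps - 3.5) - abs(w13 - 0.8) - abs(w14 - 0.4)

    print(len(list(parameter_grid())), "parameter sets")
    print("best:", calibrate(toy_success_rate))
```

The grid contains 17 × 10 × 10 = 1700 parameter sets, each of which requires docking the full training set.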
Virtual screening protocol.
First, we assessed the ability of reactive docking to find the correct modified cysteine from the pool of all solvent-accessible cysteines in each structure. We calculated the average, median, and standard deviation of the docking scores of the whole library for each residue (Table S1). The average docking score of the whole library against each residue was then used to rank the cysteines by their likelihood of being alkylated by the fragments. True positive success rates for each target were then calculated by ranking the ligands predicted to react with the target residue by their best docking score, and the hit rates of known binders were calculated for the top 0.5%, 1%, and 10% of the docking results.
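The hit-rate calculation can be sketched as below. The function name, the input layout (a list of `(ligand_id, score)` pairs for ligands predicted to react, plus a set of experimental binders), and the rounding rule for the cutoff are assumptions; the fractions are those used in this work.

```python
import math


def top_fraction_hit_rate(scored_ligands, known_binders, fraction):
    """Fraction of the top-ranked ligands that are experimental binders.

    scored_ligands: iterable of (ligand_id, docking_score); lower is better.
    The cutoff is rounded up so that at least one ligand is always selected.
    """
    ranked = sorted(scored_ligands, key=lambda item: item[1])
    n_top = max(1, math.ceil(fraction * len(ranked)))
    top_ids = [ligand_id for ligand_id, _ in ranked[:n_top]]
    hits = sum(1 for ligand_id in top_ids if ligand_id in known_binders)
    return hits / n_top


if __name__ == "__main__":
    # Made-up scores and binders, for illustration only.
    scores = [(f"frag{i}", -10.0 + 0.01 * i) for i in range(1000)]
    binders = {"frag0", "frag3", "frag40"}
    for fraction in (0.005, 0.01, 0.10):
        rate = top_fraction_hit_rate(scores, binders, fraction)
        print(f"top {fraction:.1%}: hit rate = {rate:.2f}")
```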
Results
Calibration.
The optimal parameters for the reactive docking force field were identified by performing an exhaustive parameter sweep over the equilibrium energy, W1,3, and W1,4 values, optimizing the docking performance on the 80 complexes in the training set. The calibration showed that multiple sets of values could achieve comparable success rates, with the equilibrium energy being the dominant component. In fact, the analysis of the results shows that accuracy starts dropping significantly at equilibrium energies lower than 2.0 kcal/mol, below which the scaling factors have limited to no effect. The optimal values maximizing success rates and minimizing false positive rates were an equilibrium energy of 3.5 kcal/mol, W1,3 = 0.8, and W1,4 = 0.4. These parameters achieved an overall success rate across the training set of 95% in identifying the correct residues as the top-scored result, and >95% in the top 3 results (Fig.4a). Predictions of reactions for cysteines with acrylamides and chloroacetamides, as well as serines with boronic acids, showed excellent performance (95% top score in all cases), while predictions of reactions between serines and β-lactams performed slightly worse (80%) (Fig.4a). Due to the higher relative abundance of solvent-accessible serines (693, approximate reactive hit ratio 1:17) over cysteines (186, approximate reactive hit ratio 1:4, Figs.2b-c) in the proteins of the training set, we anticipated that predicting the former would be a more challenging task, but overall success rates remained high for both residues.
Figure 4.
a) Training set docking performance in reactive residues predictions (1:best docking score; 2:second best-docking score; 3:third best or higher docking score). b) Test set performance in reactive residues predictions (1:best docking score; 2:second best-docking score; 3:third best or higher docking score). c) Distribution of experimental binder strengths by warhead (CA: chloroacetamide; AC: acrylamide) from Resnick et al.34 for the three targets considered in the virtual screening. d) Virtual screening success rates in binders recovery (0.5%, 1.0% and 10% fractions).
In fact, solvent accessibility alone is not sufficient to identify ligand-modifiable residues. When ranking cysteine (173 complexes), lysine (24 complexes), and histidine (8 complexes) residues in both training and test sets by solvent accessibility38 (see Supplementary Information), we found very low prediction accuracies (32%, 0%, and 0% for top results, respectively) compared with the reactive docking results (92%, 62.5%, and 75% for top results, respectively) (Tables S1, S2, and S3).
Due to improvements in the model (e.g. pseudo 1–3 and 1–4 scaling factors, specialized parameters for hydrogens bound to reactive atoms, etc.) and the use of larger and more diverse datasets, the generalized parameters obtained with this calibration differ from the specialized proof of concepts we reported previously21,23,25,26.
Testing of selected parameters.
The performance of the optimal values obtained in calibration was tested on the 351 complexes in the test set, which is larger and significantly more diverse than the training set. This set includes all residue types and warheads in the training set and extends it with more well-represented warhead/residue combinations (e.g., α,β-unsaturated carbonyls such as cyanoacrylamides, vinyl carbonyls, and phosphates for cysteines) as well as residues and warheads for which it was not possible to obtain a representative number of complexes. While the overall performance in identifying the correct residue as the top result dropped from 95% for the training set to 86.6% for the test set (Fig.4b), all results showed a consistent success rate of at least 75%, except for carbonyl warheads targeting lysines (62%, Fig. 4b). The more marked drop for this residue is likely due to a combination of its high abundance in the proteins considered (14 lysines per system, Fig.2c) and its preferred localization in highly solvent-accessible regions on the surface of the protein. Both factors make docking predictions more challenging because of the potentially higher rate of false positives and the lack of well-defined pockets.
Virtual screening.
The performance of reactive docking was then validated in a virtual screening (VS) setting against experimental results reported for a small covalent ligand discovery effort. In particular, this validation aimed at addressing the implications of ignoring ligand intrinsic reactivity for the discrimination of binders.
For that, a virtual screening set was built based on the experimental data published by Resnick et al.34, who used a library of 993 reactive fragments functionalized with mild electrophiles, including chloroacetamides (n = 752, 76%) and acrylamides (n = 241, 24%) (Fig.4c). The library was used to screen for binders of 10 cysteine-containing protein targets, using intact-protein MS and high-throughput crystallography to identify and characterize hits. The experimental characterization identified two classes of hits: strong hits (>50% labeling) and weak binders (<50% labeling) (Fig. 4c). The virtual screening set used here contains 3 of the 10 targets from the paper for which structures were available and potent and selective fragments were identified: two enzymes (deubiquitinase OTUB2; pyrophosphatase NUDT7), and K-RasG12C. For the validation, the set of parameters obtained from the calibration step was used to dock the entire fragment library against the targets in a blind docking fashion, considering all solvent-accessible cysteines for each target and predicting both the most likely residue to be modified and the most likely ligands to react (Table S4).
For the first task, the method was able to detect the correct cysteines in all three targets, confirming its residue prediction capabilities. For the VS task, we then analyzed the docking results considering the top 0.5%, 1.0%, and 10% of the results ranked by docking score for the different warheads, as summarized in Fig.4d. The corresponding receiver operating characteristic (ROC) plots are shown in Fig.5. The method achieved true positive hit rates for the top 0.5% of ranked results of 100% on NUDT7 chloroacetamide binders (50% strong and 50% weak binders) and 50% on OTUB2 chloroacetamide binders (25% strong and 25% weak binders). For KRAS, the true positive rate at the top 0.5% was only 25%. Acrylamide fragments were more difficult to predict correctly given the lower number of representatives of this warhead in the library, as well as the smaller number of strong binders. Combined success rates at identifying binders of both chemical classes in the top 0.5% of ranked results were 40% (20% strong and 20% weak binders) for OTUB2, 40% (weak binders) for NUDT7, and 40% (weak binders) for KRAS.
Figure 5.
ROC plots of the VS of the top 0.5%, 1%, 10% and 100% of the virtual screening results for a) NUDT7, b) OTUB2, and c) K-RasG12C.
RMSD performance.
As the method models the ligand in its native form, prior to the formation of the covalent bond, incipient reaction geometries might deviate significantly from the final product coordinates, especially for reactions involving ring openings (e.g., β-lactam acylation of serines). Because of that, only 37.5% of the top-ranked docking results in the training set showed RMSD values below 2.0 Å with respect to experimental coordinates (Fig.S1). Therefore, the RMSD was not considered as a relevant metric for evaluating success rate.
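For reference, the RMSD criterion can be sketched as a plain coordinate RMSD over matched atoms. This version assumes that both poses are in the same reference frame with identical atom ordering and applies no symmetry correction; how the original analysis matched atoms between the intact ligand and the covalent adduct (for example, atoms of a leaving group) is not specified here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical three-atom fragment: docked pose vs. crystal coordinates (Å).
    docked = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.3, 0.0)]
    crystal = [(0.2, 0.1, 0.0), (1.6, 0.3, 0.1), (2.9, 1.0, 0.4)]
    value = rmsd(docked, crystal)
    print(f"RMSD = {value:.2f} A, below 2.0 A: {value < 2.0}")
```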
Discussion
Here, we present the reactive docking protocol, a predictive method for irreversible ligand binding events based on the analysis of a modified NAC, which combines descriptions of the ground-state ligand and receptor with a bias toward the incipient bond formation. This relatively simple model is advantageous because it requires neither prior modification of the ligand structure (a non-trivial effort for large chemical collections) nor distinct parameters for different reaction classes, while still describing the energetics of a key thermodynamic step in the covalent modification. Additionally, because of the large contribution of the ground-state character to this model, the modified force field is still responsive to the structural features of the ligand-protein complex that dominate non-covalent interactions. The reactive parameters described here, covering the reactive potential and the scalings on pseudo 1–3 and 1–4 interactions across the incipient bond, were calibrated on an inverse docking task with a diverse set of warheads (β-lactams, boronic acids, chloroacetamides, and acrylamides) and different reaction classes (β-lactam addition, borylation, nucleophilic substitution, and Michael addition) targeting serine and cysteine residues. These parameters were then tested against a more diverse set containing both withheld examples from the training warhead classes and out-of-domain reactions differing in the warhead, mechanism, and labeled residue, and they performed well at predicting the labeled residue (86.6% top-ranked accuracy on the test set). The residue prediction performance is in line with a previously reported application23, in which reactive docking was successfully used to discriminate between equivalent cysteine residues in the same tryptic peptide. This highlights how the relative simplicity of this model avoids overfitting and affords a generalizable force field that does not require extensive prior knowledge of either warhead or reaction mechanism.
No correlation was found between the predictive performance and the presence of bulky leaving groups or changes in the molecular topology of reacting molecules. For example, β-lactam warheads and boronic acids showed fairly similar performance in both training and test sets, despite the major structural rearrangements occurring in the former with the opening of the lactam ring. A high success rate was also found when modeling the fewer ligands with bulky leaving groups in the validation set (PDB ids: 6ax1, 3lj6, 6bq0, Fig.3a). These results are compatible with the NAC model, in which the receptor site needs to be able to accommodate and stabilize the unmodified ligands in order for the reaction to occur.
Reactive docking was additionally validated on a virtual screening task with reactive fragments against reactive cysteines in three separate proteins. Here we demonstrate that in a fully predictive setting (i.e., neither the site nor the ligand is known in advance) the method can provide an enrichment of binders (particularly strong binders) comparable to focused conventional virtual screenings. While the use of a fragment library for assessing the VS performance is intrinsically more challenging than the use of drug-like ligands39, the overall results in the simultaneous prediction of both residue and ligands suggest that this protocol is well suited for prospective VS campaigns.
This method and the calibration protocol also serve as a baseline for future development. While ligand and receptor intrinsic reactivities did not limit predictive accuracy, we anticipate that improvements in this direction are likely to strengthen the method performance.
Force field parameters described here can be easily modified, allowing interested researchers to optimize the parameters for an individual target or warhead class if sufficient data is available. Similarly, with the availability of more diverse and structurally rich experimental datasets, we anticipate the performance of the model on ligand ranking can be improved significantly.
Collectively, the results presented here confirm and expand the successful applications previously reported, suggesting this method can be readily applied to a diverse range of docking tasks and warheads, requiring only the reassignment of atom types and inclusion of the modified force field parameters. To the best of our knowledge, there is no other computational method capable of addressing both predictive aspects of proteomics experiments, that is, predicting both the residue to be modified and the ligands capable of modifying it. The very limited computational overhead and the predictive power make this method well suited for large virtual screening campaigns on the growing, commercially available chemical collections of reactive molecules.
Moreover, by leveraging the structural data generated by AlphaFold40,41, the method makes proteome-wide labeling predictions possible, providing in silico support for large chemical proteomics applications.
Supplementary Material
Acknowledgements
This work was supported by the National Institutes of Health grants R01GM069832 and U54AI150472. We thank the anonymous reviewers for the very insightful suggestions that contributed to improving the manuscript.
Data and Software availability.
All the code is distributed under open source licenses and is available at https://github.com/forlilab/Meeko and https://github.com/ccsb-scripps/AutoDock-GPU
References
- (1).Singh J. The Ascension of Targeted Covalent Inhibitors. J. Med. Chem 2022, 65, 5886–5901. 10.1021/acs.jmedchem.1c02134. [DOI] [PubMed] [Google Scholar]
- (2).Boike L; Henning NJ; Nomura DK Advances in Covalent Drug Discovery. Nat. Rev. Drug Discov 2022, 21, 881–898. 10.1038/s41573-022-00542-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (3).Bandyopadhyay A; Gao J. Targeting Biomolecules with Reversible Covalent Chemistry. Curr. Opin. Chem. Biol 2016, 34, 110–116. 10.1016/j.cbpa.2016.08.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (4).Singh J; Petter RC; Baillie TA; Whitty A. The Resurgence of Covalent Drugs. Nat. Rev. Drug Discov 2011, 10, 307–317. 10.1038/nrd3410. [DOI] [PubMed] [Google Scholar]
- (5).Gehringer M; Laufer SA Emerging and Re-Emerging Warheads for Targeted Covalent Inhibitors: Applications in Medicinal Chemistry and Chemical Biology. J. Med. Chem 2019, 62, 5673–5724. 10.1021/acs.jmedchem.8b01153. [DOI] [PubMed] [Google Scholar]
- (6).Lu X; Smaill JB; Patterson AV; Ding K. Discovery of Cysteine-Targeting Covalent Protein Kinase Inhibitors. J. Med. Chem 2022, 65, 58–83. 10.1021/acs.jmedchem.1c01719. [DOI] [PubMed] [Google Scholar]
- (7).Li H; Qi W; Wang Y; Meng L. Covalent Inhibitor Targets KRasG12C: A New Paradigm for Drugging the Undruggable and Challenges Ahead. Genes Dis. 2023, 10, 403–414. 10.1016/j.gendis.2021.08.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (8).Lonsdale R; Ward RA Structure-Based Design of Targeted Covalent Inhibitors. Chem. Soc. Rev 2018, 47, 3816–3830. 10.1039/c7cs00220c. [DOI] [PubMed] [Google Scholar]
- (9).Bianco G; Goodsell DS; Forli S. Selective and Effective: Current Progress in Computational Structure-Based Drug Discovery of Targeted Covalent Inhibitors. Trends Pharmacol. Sci 2020, 41, 1038–1049. 10.1016/j.tips.2020.10.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (10).Bianco G; Forli S; Goodsell DS; Olson AJ Covalent Docking Using Autodock: Two-Point Attractor and Flexible Side Chain Methods. Protein Sci. Publ. Protein Soc 2016, 25, 295–301. 10.1002/pro.2733. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (11).Schröder J; Klinger A; Oellien F; Marhöfer RJ; Duszenko M; Selzer PM Docking-Based Virtual Screening of Covalently Binding Ligands: An Orthogonal Lead Discovery Approach. J. Med. Chem 2013, 56, 1478–1490. 10.1021/jm3013932. [DOI] [PubMed] [Google Scholar]
- (12).Ouyang X; Zhou S; Su CTT; Ge Z; Li R; Kwoh CK CovalentDock: Automated Covalent Docking with Parameterized Covalent Linkage Energy Estimation and Molecular Geometry Constraints. J. Comput. Chem 2013, 34, 326–336. 10.1002/jcc.23136. [DOI] [PubMed] [Google Scholar]
- (13).Ai Y; Yu L; Tan X; Chai X; Liu S. Discovery of Covalent Ligands via Noncovalent Docking by Dissecting Covalent Docking Based on a “Steric-Clashes Alleviating Receptor (SCAR)” Strategy. J. Chem. Inf. Model 2016, 56, 1563–1575. 10.1021/acs.jcim.6b00334. [DOI] [PubMed] [Google Scholar]
- (14).Zhu K; Borrelli KW; Greenwood JR; Day T; Abel R; Farid RS; Harder E. Docking Covalent Inhibitors: A Parameter Free Approach to Pose Prediction and Scoring. J. Chem. Inf. Model 2014, 54, 1932–1940. 10.1021/ci500118s. [DOI] [PubMed] [Google Scholar]
- (15).Friesner RA; Murphy RB; Repasky MP; Frye LL; Greenwood JR; Halgren TA; Sanschagrin PC; Mainz DT Extra Precision Glide: Docking and Scoring Incorporating a Model of Hydrophobic Enclosure for Protein−Ligand Complexes. J. Med. Chem 2006, 49, 6177–6196. 10.1021/jm051256o. [DOI] [PubMed] [Google Scholar]
- (16).London N; Miller RM; Krishnan S; Uchida K; Irwin JJ; Eidam O; Gibold L; Cimermančič P; Bonnet R; Shoichet BK; Taunton J. Covalent Docking of Large Libraries for the Discovery of Chemical Probes. Nat. Chem. Biol 2014, 10, 1066–1072. 10.1038/nchembio.1666.
- (17).Morris GM; Huey R; Lindstrom W; Sanner MF; Belew RK; Goodsell DS; Olson AJ AutoDock4 and AutoDockTools4: Automated Docking with Selective Receptor Flexibility. J. Comput. Chem 2009, 30, 2785–2791. 10.1002/jcc.21256.
- (18).Hur S; Bruice TC The Near Attack Conformation Approach to the Study of the Chorismate to Prephenate Reaction. Proc. Natl. Acad. Sci 2003, 100, 12015–12020. 10.1073/pnas.1534873100.
- (19).Morris GM; Goodsell DS; Halliday RS; Huey R; Hart WE; Belew RK; Olson AJ Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function. J. Comput. Chem 1998, 19, 1639–1662. 10.1002/(SICI)1096-987X(19981115)19:14<1639::AID-JCC10>3.0.CO;2-B.
- (20).Santos-Martins D; Solis-Vasquez L; Tillack AF; Sanner MF; Koch A; Forli S. Accelerating AutoDock4 with GPUs and Gradient-Based Local Search. J. Chem. Theory Comput 2021, 17, 1060–1073. 10.1021/acs.jctc.0c01006.
- (21).Backus KM; Correia BE; Lum KM; Forli S; Horning BD; González-Páez GE; Chatterjee S; Lanning BR; Teijaro JR; Olson AJ; Wolan DW; Cravatt BF Proteome-Wide Covalent Ligand Discovery in Native Biological Systems. Nature 2016, 534, 570–574. 10.1038/nature18002.
- (22).Weerapana E; Wang C; Simon GM; Richter F; Khare S; Dillon MBD; Bachovchin DA; Mowen K; Baker D; Cravatt BF Quantitative Reactivity Profiling Predicts Functional Cysteines in Proteomes. Nature 2010, 468, 790–795. 10.1038/nature09472.
- (23).Vinogradova EV; Zhang X; Remillard D; Lazar DC; Suciu RM; Wang Y; Bianco G; Yamashita Y; Crowley VM; Schafroth MA; Yokoyama M; Konrad DB; Lum KM; Simon GM; Kemper EK; Lazear MR; Yin S; Blewett MM; Dix MM; Nguyen N; Shokhirev MN; Chin EN; Lairson LL; Melillo B; Schreiber SL; Forli S; Teijaro JR; Cravatt BF An Activity-Guided Map of Electrophile-Cysteine Interactions in Primary Human T Cells. Cell 2020, 182, 1009–1026.e29. 10.1016/j.cell.2020.07.001.
- (24).Dong J; Krasnova L; Finn MG; Sharpless KB Sulfur(VI) Fluoride Exchange (SuFEx): Another Good Reaction for Click Chemistry. Angew. Chem. Int. Ed 2014, 53, 9430–9448. 10.1002/anie.201309399.
- (25).Mortenson DE; Brighty GJ; Plate L; Bare G; Chen W; Li S; Wang H; Cravatt BF; Forli S; Powers ET; Sharpless KB; Wilson IA; Kelly JW “Inverse Drug Discovery” Strategy To Identify Proteins That Are Targeted by Latent Electrophiles As Exemplified by Aryl Fluorosulfates. J. Am. Chem. Soc 2018, 140, 200–210. 10.1021/jacs.7b08366.
- (26).Zheng Q; Woehl JL; Kitamura S; Santos-Martins D; Smedley CJ; Li G; Forli S; Moses JE; Wolan DW; Sharpless KB SuFEx-Enabled, Agnostic Discovery of Covalent Inhibitors of Human Neutrophil Elastase. Proc. Natl. Acad. Sci. U. S. A 2019, 116, 18808–18814. 10.1073/pnas.1909972116.
- (27).Cheng Y; Li G; Smedley CJ; Giel M-C; Kitamura S; Woehl JL; Bianco G; Forli S; Homer JA; Cappiello JR; Wolan DW; Moses JE; Sharpless KB Diversity Oriented Clicking Delivers β-Substituted Alkenyl Sulfonyl Fluorides as Covalent Human Neutrophil Elastase Inhibitors. Proc. Natl. Acad. Sci 2022, 119, e2208540119. 10.1073/pnas.2208540119.
- (28).O’Boyle NM; Banck M; James CA; Morley C; Vandermeersch T; Hutchison GR Open Babel: An Open Chemical Toolbox. J. Cheminformatics 2011, 3, 33. 10.1186/1758-2946-3-33.
- (29).Forli S; Huey R; Pique ME; Sanner MF; Goodsell DS; Olson AJ Computational Protein–Ligand Docking and Virtual Drug Screening with the AutoDock Suite. Nat. Protoc 2016, 11, 905–919. 10.1038/nprot.2016.051.
- (30).Santos-Martins D; Eberhardt J; Bianco G; Solis-Vasquez L; Ambrosio FA; Koch A; Forli S. D3R Grand Challenge 4: Prospective Pose Prediction of BACE1 Ligands with AutoDock-GPU. J. Comput. Aided Mol. Des 2019, 33, 1071–1081. 10.1007/s10822-019-00241-9.
- (31).Word JM; Lovell SC; Richardson JS; Richardson DC Asparagine and Glutamine: Using Hydrogen Atom Contacts in the Choice of Side-Chain Amide Orientation. J. Mol. Biol 1999, 285, 1735–1747. 10.1006/jmbi.1998.2401.
- (32).Sanner MF; Olson AJ; Spehner J-C Reduced Surface: An Efficient Way to Compute Molecular Surfaces. Biopolymers 1996, 38, 305–320. 10.1002/(SICI)1097-0282(199603)38:3<305::AID-BIP4>3.0.CO;2-Y.
- (33).Berman HM; Westbrook J; Feng Z; Gilliland G; Bhat TN; Weissig H; Shindyalov IN; Bourne PE The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242. 10.1093/nar/28.1.235.
- (34).Resnick E; Bradley A; Gan J; Douangamath A; Krojer T; Sethi R; Geurink PP; Aimon A; Amitai G; Bellini D; Bennett J; Fairhead M; Fedorov O; Gabizon R; Gan J; Guo J; Plotnikov A; Reznik N; Ruda GF; Díaz-Sáez L; Straub VM; Szommer T; Velupillai S; Zaidman D; Zhang Y; Coker AR; Dowson CG; Barr HM; Wang C; Huber KVM; Brennan PE; Ovaa H; von Delft F; London N. Rapid Covalent-Probe Discovery by Electrophile-Fragment Screening. J. Am. Chem. Soc 2019, 141, 8951–8968. 10.1021/jacs.9b02822.
- (35).Borodovsky A; Ovaa H; Kolli N; Gan-Erdene T; Wilkinson KD; Ploegh HL; Kessler BM Chemistry-Based Functional Proteomics Reveals Novel Members of the Deubiquitinating Enzyme Family. Chem. Biol 2002, 9, 1149–1159. 10.1016/s1074-5521(02)00248-x.
- (36).Reilly S-J; Tillander V; Ofman R; Alexson SEH; Hunt MC The Nudix Hydrolase 7 Is an Acyl-CoA Diphosphatase Involved in Regulating Peroxisomal Coenzyme A Homeostasis. J. Biochem. (Tokyo) 2008, 144, 655–663. 10.1093/jb/mvn114.
- (37).Janes MR; Zhang J; Li L-S; Hansen R; Peters U; Guo X; Chen Y; Babbar A; Firdaus SJ; Darjania L; Feng J; Chen JH; Li S; Li S; Long YO; Thach C; Liu Y; Zarieh A; Ely T; Kucharski JM; Kessler LV; Wu T; Yu K; Wang Y; Yao Y; Deng X; Zarrinkar PP; Brehmer D; Dhanak D; Lorenzi MV; Hu-Lowe D; Patricelli MP; Ren P; Liu Y. Targeting KRAS Mutant Cancers with a Covalent G12C-Specific Inhibitor. Cell 2018, 172, 578–589.e17. 10.1016/j.cell.2018.01.006.
- (38).Olsson MHM; Søndergaard CR; Rostkowski M; Jensen JH PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. J. Chem. Theory Comput 2011, 7, 525–537. 10.1021/ct100578z.
- (39).Hubbard RE; Chen I; Davis B. Informatics and Modeling Challenges in Fragment-Based Drug Discovery. Curr. Opin. Drug Discov. Devel 2007, 10, 289–297.
- (40).Tunyasuvunakool K; Adler J; Wu Z; Green T; Zielinski M; Žídek A; Bridgland A; Cowie A; Meyer C; Laydon A; Velankar S; Kleywegt GJ; Bateman A; Evans R; Pritzel A; Figurnov M; Ronneberger O; Bates R; Kohl SAA; Potapenko A; Ballard AJ; Romera-Paredes B; Nikolov S; Jain R; Clancy E; Reiman D; Petersen S; Senior AW; Kavukcuoglu K; Birney E; Kohli P; Jumper J; Hassabis D. Highly Accurate Protein Structure Prediction for the Human Proteome. Nature 2021, 596, 590–596. 10.1038/s41586-021-03828-1.
- (41).Holcomb M; Chang Y-T; Goodsell DS; Forli S. Evaluation of AlphaFold2 Structures as Docking Targets. Protein Sci. 2023, 32, e4530. 10.1002/pro.4530.