Optimizing Multisite λ-Dynamics Throughput with Charge Renormalization

Jonah Z Vilseck; Luis F Cervantes; Ryan L Hayes; Charles L Brooks, III

doi:10.1021/acs.jcim.2c00047

Author manuscript; available in PMC: 2022 Nov 26.

Published in final edited form as: J Chem Inf Model. 2022 Mar 14;62(6):1479–1488. doi: 10.1021/acs.jcim.2c00047


PMCID: PMC9700484 NIHMSID: NIHMS1851979 PMID: 35286093

Abstract

With the ability to sample combinations of alchemical perturbations at multiple sites off a small molecule core, multisite λ-dynamics (MSλD) has become an attractive alternative to conventional alchemical free energy methods for exploring large combinatorial chemical spaces. However, current software implementations dictate that combinatorial sampling with MSλD must be performed with a multiple topology model (MTM), which is non-trivial to create by hand, especially for a series of ligand analogs which may have diverse functional groups attached. This work introduces an automated workflow, referred to as msld_py_prep, to assist in the creation of a MTM for use with MSλD. One approach for partitioning partial atomic charges between ligands to create a MTM, called charge renormalization, is also presented and rigorously evaluated. We find that msld_py_prep greatly accelerates the preparation of MSλD ready-to-use files and that charge renormalization can provide a successful approach for MTM generation, as long as bookending calculations are applied to correct small differences introduced by charge renormalization. Charge renormalization also facilitates the use of many different force field parameters with MSλD, broadening the applicability of MSλD for computer-aided drug design.


INTRODUCTION

Alchemical free energy calculations have become an integral part of hit-to-lead optimization in both academic and industrial structure-based drug design.^1-7 The past two decades have seen a flurry of developments to improve both accuracy and efficiency of these methods, enabling more reliable prospective binding affinity predictions for small molecule derivatization and optimization.^1-10 Of note, large efficiency gains have been observed from the collective sampling of multiple ligands in a single simulation with λ-dynamics,^11-13 enveloping distribution sampling,^14-17 or λ-local elevation umbrella sampling (λ-LEUS) methods.^18-20 λ-Dynamics, the subject of this manuscript, has similarly undergone a variety of advancements since its introduction in 1996 to position it as an attractive alternative to traditional free energy perturbation (FEP)²¹ or thermodynamic integration (TI)²² methods. Several recent reports have shown the utility of λ-dynamics for drug discovery,^23-25 and a commercial workflow for widespread application of this approach has been developed.⁶

One advancement of λ-dynamics, termed multisite λ-dynamics (MSλD), enables multiple substituents to be alchemically perturbed at one or more sites of attachment off a common molecular core.¹³ This approach enables MSλD to break through an inherent scalability limitation associated with FEP or TI to sample large combinatorial alchemical spaces very efficiently. Baseline estimates suggest up to an order of magnitude speed up is obtainable with MSλD over TI for investigating a congeneric series of 10-30 compounds, and even larger efficiency gains of 20-75 times have been reported for sampling many hundreds of alchemical end-states.^6,23-25

In the current implementation of MSλD in the CHARMM molecular simulation package,^26,27 the alchemical sampling of multiple substituents is accomplished by using a multiple topology model (MTM).²⁴ Similar to the double topology model used with FEP/TI methods,²⁸ a MTM represents all alchemical substituent atoms explicitly and scales their atomic interactions with neighboring atoms and molecules via the alchemical coupling parameter, λ. In MSλD, each λ_s,i, for the i^th substituent at site s off a ligand core, is holonomically constrained to be 0 < λ_s,i < 1, and all site specific λ_s,i must sum to 1.0 to ensure that only one physical end state is sampled at any given point in time. With these constraints in place, the λ_s,i parameters can then dynamically sample between interacting (λ_s,i ≅ 1.0) and non-interacting (λ_s,i ≅ 0.0) states in a continuous manner in conjunction with the forces in a molecular dynamics (MD) simulation using an extended Lagrangian approach.^13,29 The dynamic nature of the λ_s,i parameters removes the need to define discrete intermediate λ_s,i states and allows relative free energy differences between many ligands to be readily computed from a single MSλD simulation.
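As a small illustration of these constraints, the sketch below checks that the λ values at one site lie in the open interval (0, 1) and sum to 1.0, and reports which substituent, if any, is effectively "on". The function names and the example values are hypothetical, and the 0.99 cutoff is the end-state approximation quoted later in the Computational Details; none of this is taken from CHARMM itself.

```python
import math

def check_site_lambdas(lambdas, tol=1e-6):
    """Return True if one site's lambda values satisfy 0 < lambda < 1 and sum to 1."""
    in_range = all(0.0 < lam < 1.0 for lam in lambdas)
    sums_to_one = math.isclose(sum(lambdas), 1.0, abs_tol=tol)
    return in_range and sums_to_one

def dominant_substituent(lambdas, cutoff=0.99):
    """Index of the substituent approximating a physical end state, else None."""
    for i, lam in enumerate(lambdas):
        if lam >= cutoff:
            return i
    return None

# Hypothetical snapshot of three substituents competing at one site.
site = [0.995, 0.003, 0.002]
print(check_site_lambdas(site), dominant_substituent(site))
```

A frame in which no λ exceeds the cutoff corresponds to an alchemical intermediate and would not be counted toward any physical end state.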

Though use of a MTM facilitates efficient sampling of multiple substituents within a single simulation, adaptation of force field parameters across a congeneric ligand series to fit this topological representation can be difficult. This is especially true when considering the varying electrostatic characteristics different functional groups may present. For example, it is well known that electron-withdrawing or electron-donating groups can shift the distribution of electrostatic charge around an aromatic ring.³⁰ Thus, to create a MTM from a pre-selected group of ligand analogs, a single set of unperturbed core atoms and alchemical substituents that should be scaled by λ must be defined in such a way that the combination of any one substituent at each perturbation site combined with the core will yield a complete molecule with integer net charge. Correctly partitioning diverse sets of ligand analogs, which may feature substituents that vary in size, flexibility, or polarity, and their accompanying force field parameters to fit into a MTM is a non-trivial problem to solve manually. In this work, we first derive a workflow, expressed as a series of Python scripts collectively referred to as msld_py_prep, to automate the generation of MTM files for use with MSλD in the CHARMM software package. We then discuss our approach for partitioning force field partial atomic charges for an MTM, which we call charge renormalization, and show that it enables electrostatically different functional groups to be collectively sampled with MSλD.

ALGORITHMIC WORKFLOW

In this section we discuss our workflow for the creation of a multiple topology model to represent multisite substituent perturbations with MSλD in the CHARMM molecular simulation package. Collectively we refer to this algorithm as msld_py_prep, and a series of Python scripts have been written to perform each operation in the workflow (Figure 1). The details of each step of msld_py_prep are described below. As an overview, a maximum common substructure search (MCSS) is first performed to identify a set of core atoms with matching chemical environments in all ligands of interest. These core atoms are characterized as sharing identical atom types and bonded neighbors; atoms with different chemical environments are categorized into alchemical functional groups. Second, appropriate partial atomic charges for core atoms and alchemical atoms are determined to yield correct net charges for all combinatorial molecule end-states. All other intramolecular parameters for the combinatorial system are then identified. Finally, all necessary files for running a MSλD simulation in CHARMM are written to disk. Prior to performing msld_py_prep, all ligand analogs should be docked and aligned within a protein’s binding site so that the identified ligand cores are spatially near one another. This is helpful for successful completion of the MCSS step of msld_py_prep, and it ensures that the ligands maintain the same binding orientation and pose within the binding site, thus enabling the sampling of a diverse set of ligands. Furthermore, molecule coordinate files should employ the Tripos Mol2 format with one molecule per file, all ligand atoms should have unique atom names per file, and appropriate topology and parameter files formatted for CHARMM should be generated for each molecule prior to running msld_py_prep. 
Incidentally, the Mol2 coordinate file requirements match what is needed to obtain CGenFF parameters from the ParamChem/CGenFF or MATCH atom typing programs, and msld_py_prep works with outputs obtained from both utilities.^31-34 The following subsections describe each step of msld_py_prep in more detail.

Figure 1. — The *msld_py_prep* workflow for generating a multiple topology model for use with MSλD in the CHARMM molecular simulation package. Input files should be CHARMM formatted and generated prior to running *msld_py_prep*. While inspection of the MCSS results can be performed in any molecular visualization software package, *vis_check.py* works specifically with PyMOL.

Maximum Common Substructure Search (msld_mcs.py).

The first step in preparing a MTM for MSλD is a MCSS to identify which atoms will be in the ligand core and which will be in an alchemical substituent. With msld_mcs.py, a MCSS is performed by identifying atoms in separate molecules with identical force field atom types and lists of bonded neighbor atom types. This creates an effective fingerprint of each atom’s chemical environment, as defined by the force field. Similarities between atoms in different molecules are identified by matching atoms with the same atom type and then matching atoms with identical bonded-neighbor lists. This analysis generally resolves most atom-to-atom matches between molecules; however, any remaining discrepancies are resolved by calculating a coordinate root mean square deviation (RMSD) between potential atom matches, emphasizing the need for all ligand cores to be aligned prior to MTM construction. Atoms within a predefined RMSD threshold value of 0.8 Å are considered a match. Several redundancy checks are then performed to ensure that the same number of core atoms are matched from each molecule, resulting in a single uniform set of core atoms for use with MSλD. In a second stage of the MCSS routine, all non-core atoms in each molecule are categorized as alchemical atoms. Bonds between alchemical atoms are used to form alchemical fragments, and each substituent is grouped by its site of attachment to the ligand core. Bonds between alchemical substituents and different core atoms define unique perturbation sites around the core, and fragments at each site are filtered to remove possible duplicates. Finally, a formatted text file of the MCSS output is printed. This file identifies all core atom names from each molecule, the number of alchemical perturbation sites, and a list of all fragments and fragment atom names at each perturbation site. 
Within the msld_py_prep workflow, MCSS by msld_mcs.py is typically performed as an independent step with manual visualization of the MCSS results prior to charge renormalization (Figure 1). This ensures that the user can visually examine the MCSS output and confirm that the ligand system has been partitioned into core and alchemical components correctly before proceeding with msld_py_prep. To assist with visual inspection of the MCSS output, the Python script vis_check.py can be used to quickly visualize core and fragment atoms in PyMOL (Figure S1).³⁵ We also note that many other MCSS routines exist in a variety of open-source programs, such as RDKit, that could also be used for this step of the workflow.³⁶ The advantage of the msld_mcs.py MCSS procedure used here is the ability to separate atoms based on their force field atom types and bonded environments, rather than elemental identities alone.
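The fingerprint matching described in this subsection can be sketched as follows. This is a minimal illustration rather than the msld_mcs.py implementation: the data layout, function names, and toy coordinates are assumptions, and for simplicity the 0.8 Å distance check is applied to every candidate pair instead of only to ambiguous ones.

```python
import math
from collections import defaultdict

def fingerprints(atom_types, bonds):
    """Map atom name -> (atom type, sorted tuple of bonded-neighbor atom types)."""
    neighbors = defaultdict(list)
    for a, b in bonds:
        neighbors[a].append(atom_types[b])
        neighbors[b].append(atom_types[a])
    return {name: (atype, tuple(sorted(neighbors[name])))
            for name, atype in atom_types.items()}

def match_atoms(mol_a, mol_b, cutoff=0.8):
    """Pair atoms with identical fingerprints, choosing the nearest within cutoff (A)."""
    fp_a = fingerprints(mol_a["types"], mol_a["bonds"])
    fp_b = fingerprints(mol_b["types"], mol_b["bonds"])
    matches, used = {}, set()
    for name_a, fp in fp_a.items():
        best, best_d = None, cutoff
        for name_b, other in fp_b.items():
            if other != fp or name_b in used:
                continue
            d = math.dist(mol_a["xyz"][name_a], mol_b["xyz"][name_b])
            if d <= best_d:
                best, best_d = name_b, d
        if best is not None:
            matches[name_a] = best
            used.add(best)
    return matches

# Toy, pre-aligned fragments: identical fingerprints, so distance breaks the tie.
mol_a = {
    "types": {"C1": "CG2R61", "C2": "CG2R61", "H1": "HGR61", "H2": "HGR61"},
    "bonds": [("C1", "C2"), ("C1", "H1"), ("C2", "H2")],
    "xyz": {"C1": (0.0, 0.0, 0.0), "C2": (1.4, 0.0, 0.0),
            "H1": (-0.5, 0.9, 0.0), "H2": (1.9, 0.9, 0.0)},
}
mol_b = {
    "types": {"CA": "CG2R61", "CB": "CG2R61", "HA": "HGR61", "HB": "HGR61"},
    "bonds": [("CA", "CB"), ("CA", "HA"), ("CB", "HB")],
    "xyz": {"CA": (1.41, 0.0, 0.0), "CB": (0.01, 0.0, 0.0),
            "HA": (1.9, 0.91, 0.0), "HB": (-0.5, 0.9, 0.0)},
}
print(match_atoms(mol_a, mol_b))
```

Because both carbons (and both hydrogens) share the same fingerprint here, the example only resolves correctly when the two molecules are spatially aligned, which is the reason the workflow requires alignment beforehand.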

Charge Renormalization (msld_crn.py).

Following the correct identification of core atoms and alchemical functional groups for a series of congeneric ligands, a process called charge renormalization (CRN) begins. In this routine, partial atomic charges for each MTM group are identified such that any one alchemical fragment, at each perturbation site, combined with the ligand core will result in a physical ligand end-state with integer net charge (Figure 2). To begin, a preliminary set of partial atomic charges are identified for all core atoms by averaging together the atomic charges of equivalent atoms from each ligand. These initial charges are later adjusted to neutralize a molecule’s net charge after determining the atomic charges for atoms in each alchemical fragment. Charges for atoms in alchemical fragments are obtained next. To facilitate correct combinatorial sampling with MSλD, each alchemical substituent at a single perturbation site must maintain the same net group charge, or in cases where one substituent may feature a charge change, have the same offset group charge with respect to the core’s net charge to yield an integer charge for the full molecule. Thus, to generate appropriate partial atomic charges to meet this criterion, a site-specific target group charge is first determined by averaging the net charges of all substituents at a single perturbation site. For substituents with charge changes, group charges are first offset by their charge change prior to averaging to avoid large deviations in computed averages. The difference between the site-specific target charge and a substituent’s actual group charge is calculated, and atomic charges within that group are iteratively adjusted by ± 0.000001 e⁻ until the substituent’s net charge matches the target group charge. This is performed for all alchemical fragments at all perturbation sites connected to the common core. 
Finally, the partial atomic charges of the core atoms are revisited to ensure an integer net charge is obtained for a complete ligand end-state. A final target charge for the core is determined by subtracting each site-specific alchemical group charge from the expected final net charge of the ligand series. The initial core atomic charges are then iteratively adjusted by ± 0.000001 e⁻ until the core’s net charge matches the target charge. At the conclusion of charge renormalization, all functional group combinations should be feasibly sampled without introducing artifacts into the simulation caused by non-integer net charges for different physical end-states. Though charge renormalization formally changes the force field parameters of a ligand, we’ve found in practice that the changes are small. Functional groups that differ drastically in size or electrostatic characteristics can also be grouped by the user in such a way to minimize differences introduced by charge renormalization. As shown in the Results section below, errors introduced into computed free energy differences with this approach are mostly within statistical noise. However, because some outliers are observed, which may be difficult to predict in advance, quick bookending calculations (described below) can be performed post hoc to compute free energy differences between charge renormalized and original force field charge states to rigorously correct for differences introduced by charge renormalization.
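A minimal sketch of the renormalization steps is given below. It is not the msld_crn.py code: the round-robin distribution of the ± 0.000001 e⁻ increments over atoms, the use of integer micro-charges, the function names, and the example charges are all assumptions made for illustration, and the offset handling for substituents with charge changes is omitted.

```python
MICRO = 1_000_000  # store charges as integer multiples of 0.000001 e

def to_micro(charges):
    """Convert charges in e to integer micro-charges to avoid float drift."""
    return [round(q * MICRO) for q in charges]

def renormalize_group(charges, target):
    """Shift atomic charges one micro-charge at a time until they sum to target."""
    charges = list(charges)
    diff = target - sum(charges)
    step = 1 if diff > 0 else -1
    i = 0
    while diff != 0:
        charges[i % len(charges)] += step  # assumed round-robin over atoms
        diff -= step
        i += 1
    return charges

def charge_renormalize(core_by_ligand, fragments_by_site, net_charge=0):
    """Return renormalized core charges and per-site fragment charges."""
    n_lig = len(core_by_ligand)
    # Preliminary core charges: average over equivalent atoms in each ligand.
    core = [round(sum(col) / n_lig) for col in zip(*core_by_ligand)]
    site_targets, new_sites = [], []
    for frags in fragments_by_site:
        # Site target: average net group charge of all substituents at the site.
        target = round(sum(sum(f) for f in frags) / len(frags))
        site_targets.append(target)
        new_sites.append([renormalize_group(f, target) for f in frags])
    # Core target: expected net ligand charge minus every site's group charge.
    core_target = net_charge * MICRO - sum(site_targets)
    return renormalize_group(core, core_target), new_sites

# Made-up charges (not CGenFF values) for two neutral ligands with a two-atom core.
cores = [to_micro([-0.10, 0.12]), to_micro([-0.08, 0.16])]
site1 = [to_micro([-0.27, 0.09, 0.09, 0.07]), to_micro([-0.08])]
core, sites = charge_renormalize(cores, [site1])
totals = [sum(core) + sum(frag) for frag in sites[0]]
print(totals)
```

In this toy case both substituents end with the same group charge, so either one combined with the shared core gives a neutral molecule.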

Figure 2. — Example charge renormalization (CRN) of toluene and chlorobenzene into a dual topology model. On the left, CGenFF charges obtained from the ParamChem/CGenFF program have been grouped into core (blue box) and alchemical (red box) atom groups. Net group charges differ between toluene and chlorobenzene. Following small charge adjustments via CRN, net core (orange box) and alchemical (green box) group charges now match between these molecules, facilitating their use with MSλD.

Ligand Parameter Consolidation (msld_prm.py).

Following the conclusion of charge renormalization, the remaining intramolecular and Lennard-Jones force field parameters for the MTM are created. Because no atom type modifications occur with charge renormalization, or in msld_py_prep in general, all original intramolecular and Lennard Jones parameters can be consolidated and printed to a single CHARMM formatted parameter file. Thus, in this step, all force field parameters are extracted from each ligand analog, checked for duplication or potential differences in bond, angle, or dihedral parameter definitions, and written to disk.
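The duplicate and conflict check can be illustrated with a short sketch. The dictionary layout, function name, and parameter values below are hypothetical (the values are not taken from CGenFF), and the order normalization shown applies to bond, angle, and proper dihedral terms, where a reversed atom-type sequence describes the same term.

```python
def consolidate(param_sets):
    """Merge per-ligand parameters; report terms defined with different values."""
    merged, conflicts = {}, []
    for ligand, params in param_sets.items():
        for key, value in params.items():
            # A-B and B-A are the same term, so store one canonical ordering.
            key = min(key, key[::-1])
            if key in merged and merged[key] != value:
                conflicts.append((key, merged[key], value, ligand))
            merged.setdefault(key, value)
    return merged, conflicts

# Illustrative bond terms: (force constant, equilibrium length); values are made up.
bonds = {
    "lig1": {("CG2R61", "CG331"): (230.0, 1.49)},
    "lig2": {("CG331", "CG2R61"): (230.0, 1.49),
             ("CG2R61", "CLGR1"): (350.0, 1.74)},
    "lig3": {("CLGR1", "CG2R61"): (340.0, 1.74)},
}
merged, conflicts = consolidate(bonds)
print(len(merged), conflicts)
```

Any entry in `conflicts` would flag a term whose definition differs between ligand analogs and needs attention before a single parameter file is written.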

File Generation (msld_wrt.py).

Finally, all files that are needed to run MSλD in CHARMM are written to disk. Core atom coordinates and topology definitions are generated first. All alchemical functional group coordinates and topology definitions are then written in a site-specific manner. Finally, a generic CHARMM script input file to set up and run ligand perturbations with MSλD is created. This input script is system specific to the ligand series processed with msld_py_prep, and requires minimal modification prior to initiating a MSλD simulation, such as inserting the appropriate protein, solvent, and ion file designations.

COMPUTATIONAL DETAILS

With the assistance of msld_py_prep, the generation of a MTM for examining a wide range of small molecule perturbations with MSλD in CHARMM is readily accomplished. MSλD can then be used to compute changes in solvation and binding free energies between multiple ligand analogs. Because charge renormalization formally changes a force field’s partial atomic charge model for a given molecule, it is important to evaluate how impactful these changes are to the final solvation or binding free energy results computed with MSλD. Small molecule hydration free energy calculations have long served as a test for both new alchemical free energy methods and force field parameterization schemes.^37-42 Thus, to assess the impact of charge renormalization, relative hydration free energies between original force field (FF) and renormalized charges ( $Δ Δ G_{h y d}^{F F \to C R N}$ ) were computed for a variety of small molecule analogs grouped into congeneric-like series (Figure S2 - S7). Specifically, four sets of small molecules originating from CGenFF and two sets of drug-like molecules were investigated, including one set of compounds obtained from the ChEMBL database and another of known fatty acid amide hydrolase (FAAH) receptor inhibitors.^31-33,43-45 The impact of charge renormalization on computed binding affinities was additionally investigated using the same group of FAAH inhibitors bound to the FAAH receptor.

CGenFF Small Molecule Setup.

A common functionalization approach in computer-aided lead optimization is to explore a variety of substituents at ortho, meta, and/or para positions off aromatic rings on a lead compound. Thus, four groups of aromatic compounds explicitly parameterized in the CGenFF force field were examined.^31-33 These are referred to as “123benz”, “12benz”, “12fused”, and “1benz” to reflect their patterns of substitution (Figures S2 - S5). A pdb file for each molecule was built with CHARMM,^26,27 using the internal coordinate table definitions in the CGenFF topology file (version 4.1).^31-33 A mol2 file was obtained with OpenBabel,⁴⁶ and a CGenFF stream file was generated with the ParamChem/CGenFF program.^32,33 Spatial alignment was performed in Chimera,⁴⁷ and a MTM was created with msld_py_prep.

ChEMBL Small Molecule Setup.

A series of drug-like molecules obtained from the ChEMBL database was examined next (Figure S6; referred to as “ChEMBL”).^43,44 By scanning the ChEMBL database with RDKit, a total of 45 molecules containing a common ‘c1ccc2c(N)ncnc2c1’ substructure was selected for study. This scaffold is a common bicycle frequently found in drug candidates.^48,49 Filtering of compounds with this scaffold core was performed to allow for two sites of substitution off the ligand. Compound SMILES strings were used to generate mol2 files with RDKit,³⁶ which were then used to obtain CGenFF parameters with the ParamChem/CGenFF program.^31-33 Ligands were aligned with RDKit and a MTM was created with msld_py_prep.

FAAH Inhibitor Setup.

Finally, a series of 32 FAAH inhibitors,⁴⁵ which have been studied previously with MSλD,⁶ was examined (Figure S7). Initial 3D ligand conformations were generated with RDKit from SMILES strings using the ETKDG method.⁵⁰ The molecules were then aligned and minimized in RDKit using the MMFF94 force field.^51,52 For hydration calculations, all ligands were aligned with respect to ligand 31. For FAAH binding calculations, the ligands were aligned to the prepared FAAH receptor’s co-crystallized ligand structure. All alignments were performed with RDKit.^36,45 MCSS and charge renormalization were then performed with msld_py_prep.

FAAH Receptor Preparation.

Initial coordinates for the FAAH receptor-ligand complex were obtained from PDB ID: 6MRG.⁴⁵ Protonation states for titratable residues consistent with a pH of 7.4 were assigned with ProPKA.⁵³ With the CHARMM-GUI, the protein-ligand complex was then solvated in a cubic box of TIP3P water with an NaCl concentration of 0.145 M, allowing for a minimum of 7.0 Å between the complex and the edge of the box.^54,55 The protein was parameterized with the CHARMM36 force field.^56,57 This setup matches what was used previously by Raman, et al., facilitating direct comparison between previously published MSλD results for FAAH-ligand binding with MATCH-parameterized ligands versus the use of ParamChem/CGenFF-parameterized ligands reported herein.⁶

Hydration Free Energy Calculations.

Following the creation of a MTM for each ligand series, including CGenFF, ChEMBL, and FAAH compounds, hydration free energies between original and renormalized partial atomic charges were computed for each small molecule. Each molecule was first solvated in a cubic box of TIP3P water with a 12.0 Å buffer between solute and box edge with the MMTSB utility convpdb.pl.⁵⁸ Molecular dynamics simulations were then performed in vacuum and water-solvated states with both original and renormalized force field charges. MD simulations were run for 5 ns each, in the NPT ensemble at 25 °C and 1 atm, with CHARMM and the domain decomposition (domdec) module for running on graphics processing units (GPUs).^26,27,59 Periodic boundary conditions were employed, and long-range interactions were truncated with force-switching between 10 and 12 Å.⁶⁰ MD frames were saved every 500 steps, and a timestep of 2 fs was used. Following the conclusion of MD sampling, the trajectories were postprocessed and hydration free energy differences ( $Δ G_{h y d}^{F F \to C R N}$ ) were calculated with the Bennett acceptance ratio (BAR) free energy estimator.⁶¹ Free energies and percent overlap between end-point energy distributions were computed with pymbar.py.^62,63 Final relative free energy differences with respect to the original force field ( $Δ Δ G_{h y d}^{F F \to C R N}$ ) were obtained with a thermodynamic cycle for modeling solvation processes (Figure S8A).
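The bookkeeping in this protocol can be sketched in a few lines. The frame count follows directly from the stated simulation length, timestep, and save interval. The cycle is assumed to take the usual solvation form, in which the vacuum FF → CRN free energy is subtracted from the water FF → CRN free energy; the function name, the quadrature error propagation, and the input numbers are illustrative assumptions, not values from this work.

```python
import math

# Frames per trajectory: 5 ns at a 2 fs timestep, saved every 500 steps.
total_fs = 5 * 1_000_000
n_steps = total_fs // 2
n_frames = n_steps // 500

def relative_hydration(dg_water, dg_vacuum, err_water=0.0, err_vacuum=0.0):
    """Close the solvation cycle: ddG_hyd(FF->CRN) = dG_water - dG_vacuum."""
    ddg = dg_water - dg_vacuum
    err = math.hypot(err_water, err_vacuum)  # assumes independent uncertainties
    return ddg, err

# Made-up BAR results (kcal/mol) for one molecule.
ddg, err = relative_hydration(-1.25, -1.00, 0.03, 0.04)
print(n_frames, ddg, err)
```

Each 5 ns trajectory therefore contributes 5000 frames to the BAR analysis for its end state.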

MSλD FAAH Binding Calculations.

Retrospectively, FAAH – ligand binding affinities were investigated with MSλD for 32 FAAH inhibitors,⁴⁵ modeled with the CGenFF force field.^31-33 The simulation setup and details matched what was performed previously by Raman, et al., except that the simulations were performed directly in CHARMM.⁶ Free energy differences were calculated by subdividing the full set of 32 ligands into 5 separate MSλD calculations. Perturbations were performed with ligands solvated in solution (“water-unbound”) and bound to FAAH (“protein-bound”), in accordance with a protein-ligand binding thermodynamic cycle (Figure S8B). Bias optimization with the Adaptive Landscape Flattening (ALF)²³ algorithm was performed in three stages, with 100 iterations of 100 ps long MSλD simulations, followed by 13 iterations of 1 ns long MSλD simulations, and concluding with 5 independent trials of 5 ns long simulations. Production MSλD simulations were then run with 5 independent trials for 20 ns and 50 ns for water-unbound and protein-bound states, respectively. Binding free energy differences (ΔG_bind) for each state were calculated as a ratio of the amount of sampling compared to a single reference compound, using a λ-cutoff value of 0.99 to approximate the λ = 1.00 end-states.^13,23-25 Final relative differences in free energies of binding (ΔΔG_bind) between ligands were computed by taking the difference between water-unbound and protein-bound ligand states. As mentioned above, a MTM is necessary to sample multiple perturbations simultaneously, thus these MSλD binding calculations were performed with renormalized partial atomic charges ( $Δ Δ G_{b i n d}^{C R N}$ ). Absolute, computed binding free energies (ΔG_comp) were determined with the following equation, which utilizes known experimental binding free energies (ΔG_expt) and the computed relative binding free energies (ΔΔG_comp)⁴:

$$\Delta G_{comp} = \Delta\Delta G_{comp} - \left( \frac{\sum \Delta\Delta G_{comp}}{n} - \frac{\sum \Delta G_{expt}}{n} \right) \qquad (1)$$
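Equation 1 shifts the computed relative free energies so that their mean coincides with the mean experimental value. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def absolute_from_relative(ddg_comp, dg_expt):
    """Apply Equation 1: shift relative values so their mean matches experiment."""
    n = len(ddg_comp)
    offset = sum(ddg_comp) / n - sum(dg_expt) / n
    return [x - offset for x in ddg_comp]

# Made-up values in kcal/mol for three ligands.
ddg_comp = [0.0, 1.0, -0.5]
dg_expt = [-9.0, -8.2, -9.8]
dg_comp = absolute_from_relative(ddg_comp, dg_expt)
print(dg_comp)
```

Because every ligand receives the same offset, differences between ligands, and therefore correlation coefficients, are unchanged; only metrics that depend on absolute values, such as the mean unsigned error, are affected by the shift.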

MSλD Bookending Corrections.

Similar to the hydration free energy calculations described above, single step charge perturbations were performed with BAR between original force field and renormalized atomic charges to assess the impact of charge renormalization on MSλD binding predictions. As shown in Figure 3, “bookending” correction terms ( $Δ G_{b i n d}^{F F \to C R N}$ ) were computed post hoc to correct the MSλD $Δ G_{b i n d}^{C R N}$ results obtained with renormalized charges to the true force field free energy result ( $Δ G_{b i n d}^{F F}$ ). These correction terms were calculated by running MD simulations in water-unbound and protein-bound ligand states, with original and renormalized atomic charges separately. Simulation parameters matched what was used for the MSλD binding calculations to ensure sufficient overlap between simulations would be obtained, and simulations were run for 5 ns each. Although one bookending correction value requires four additional simulations to be run, these simulations are independent from MSλD and can be run in parallel with MSλD production calculations.

Figure 3. — Thermodynamic bookending corrections for investigating alchemical transformations with MSλD and renormalized ligand charges within a specific state (gas, water-unbound, or protein-bound). The true force field L₁ → L₂ free energy difference ( $Δ G_{b i n d}^{F F} (L_{1} \to L_{2})$ ) is composed of the MSλD free energy difference between two ligands with renormalized charges ( $Δ G_{b i n d}^{C R N} (L_{1}^{*} \to L_{2}^{*})$ ), plus bookending free energy terms ( $Δ G_{b i n d}^{F F \to C R N}$ ) to correct for charge renormalization differences.
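The cycle in Figure 3 can be written out as simple arithmetic. Within one state, the force field result equals the FF → CRN term for L₁, plus the MSλD result between renormalized ligands, minus the FF → CRN term for L₂. The sketch below applies that closure to the water-unbound and protein-bound states and takes their difference; the function names, dictionary layout, and numbers are assumptions for illustration.

```python
def bookend_state(dg_crn_l1_to_l2, dg_ff_to_crn_l1, dg_ff_to_crn_l2):
    """Close the cycle in one state: FF(L1->L2) from CRN(L1*->L2*) and two bookends."""
    return dg_ff_to_crn_l1 + dg_crn_l1_to_l2 - dg_ff_to_crn_l2

def corrected_relative_binding(crn, corr_l1, corr_l2):
    """Each argument is a dict holding 'water' and 'protein' free energies (kcal/mol)."""
    water = bookend_state(crn["water"], corr_l1["water"], corr_l2["water"])
    protein = bookend_state(crn["protein"], corr_l1["protein"], corr_l2["protein"])
    return protein - water

# Made-up free energies (kcal/mol).
crn = {"water": 2.0, "protein": 1.0}       # MSLD results with renormalized charges
corr_l1 = {"water": 0.1, "protein": 0.3}   # FF -> CRN terms for ligand 1
corr_l2 = {"water": 0.2, "protein": 0.1}   # FF -> CRN terms for ligand 2
uncorrected = crn["protein"] - crn["water"]
corrected = corrected_relative_binding(crn, corr_l1, corr_l2)
print(uncorrected, corrected)
```

The four FF → CRN inputs per ligand pair correspond to the four additional simulations (two states, two charge sets) mentioned in the text.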

RESULTS

To test the effectiveness of charge renormalization and measure the extent of error introduced by adjusting a small molecule’s partial atomic charges to fit within a MTM, hydration and binding free energy calculations were performed on a total of 191 compounds spanning 6 ligand series. The degree of change introduced by charge renormalization was analyzed as a charge root mean square deviation (Q-RMSD), and by measuring changes in dipole moment magnitudes (Δμ) and directions (Δμ(θ)). Complete data for each ligand and ligand series can be found in Tables S1 - S6 of the Supporting Information. As shown in Table 1, the average $Δ Δ G_{h y d}^{F F \to C R N}$ for all molecules is ca. 0.1 kcal/mol, which is generally within the limits of statistical noise for solvation transfer processes investigated with alchemical free energy methods.^37-42 Overall, the Q-RMSD and the Δμ metrics, which represent the degree of change between renormalized and original force field charges, tended to be small. This suggests that, on average, differences introduced by charge renormalization should also be small. However, when larger deviations in Q-RMSD or Δμ occur, for instance for the 12fused series of molecules, larger differences in $Δ Δ G_{h y d}^{F F \to C R N}$ are subsequently observed. This trend is clear in Figure 4, which plots $Δ Δ G_{h y d}^{F F \to C R N}$ in relation to Q-RMSD and Δμ. As Q-RMSD grows larger than 0.001 e⁻, log₁₀(Q-RMSD) > −3.0, large deviations in both Δμ and $Δ Δ G_{h y d}^{F F \to C R N}$ can occur. Above this Q-RMSD threshold, while most $Δ Δ G_{h y d}^{F F \to C R N}$ remains small, e.g., < 0.5 kcal/mol, some outliers with $Δ Δ G_{h y d}^{F F \to C R N}$ > 1.0 kcal/mol are also observed. The largest outliers unanimously come from the 12fused series of molecules. In this series, the core is defined as two carbon atoms and two hydrogen atoms that are located opposite the bond connecting the fused rings. 
A deeper analysis of the electrostatic descriptions of these 41 compounds indicated that many of these molecules have significant charge discrepancies for these four equivalent core atoms. Thus, while there is sufficient atom type overlap to identify these atoms as core atoms by the MCSS routine in msld_py_prep, there is poorer electrostatic overlap which leads to large Q-RMSD and Δμ by charge renormalization and, subsequently, large $Δ Δ G_{h y d}^{F F \to C R N}$ differences. In ligands that have a larger number of core atoms, this situation may be remedied by manually assigning core atoms with large electrostatic discrepancies into adjacent alchemical substituents, thereby reducing the magnitude of charge modifications by charge renormalization. This can be easily performed with the “InFrag” option in msld_py_prep. Checking Q-RMSD values after running msld_py_prep can help identify if this is necessary. Furthermore, to eliminate potential $Δ Δ G_{h y d}^{F F \to C R N}$ errors, we recommend calculating $Δ Δ G_{h y d}^{F F \to C R N}$ bookending corrections whenever charge renormalization is used to create an MTM. Encouragingly, the drug-like ChEMBL and FAAH ligand series showed among the lowest $Δ Δ G_{h y d}^{F F \to C R N}$ changes. Future applications of MSλD to ligand design and lead optimization will show how impactful charge renormalization is for analyzing many diverse sets of drug-like compounds.
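The three diagnostics used here can be computed from the two charge sets and one set of coordinates. The sketch below is an assumed formulation: Q-RMSD is taken as the root mean square of per-atom charge differences, the dipole is the point-charge sum Σ qᵢrᵢ converted with 1 e·Å ≈ 4.803 D, and Δμ is taken as renormalized minus original. The sign convention, function names, and example values are not from the paper.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.803  # 1 e*Angstrom is approximately 4.803 D

def q_rmsd(q_ff, q_crn):
    """Root mean square deviation between two sets of atomic charges (e)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q_ff, q_crn)) / len(q_ff))

def dipole(charges, coords):
    """Point-charge dipole vector in Debye (origin independent for neutral molecules)."""
    return tuple(E_ANGSTROM_TO_DEBYE * sum(q * r[k] for q, r in zip(charges, coords))
                 for k in range(3))

def dipole_change(q_ff, q_crn, coords):
    """Return (change in dipole magnitude in D, angle between dipoles in radians)."""
    mu_ff, mu_crn = dipole(q_ff, coords), dipole(q_crn, coords)
    n_ff, n_crn = math.hypot(*mu_ff), math.hypot(*mu_crn)
    cos_t = sum(a * b for a, b in zip(mu_ff, mu_crn)) / (n_ff * n_crn)
    theta = math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against round-off
    return n_crn - n_ff, theta

# Made-up two-atom example with charges in e and coordinates in Angstrom.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
q_ff, q_crn = [0.5, -0.5], [0.6, -0.6]
print(q_rmsd(q_ff, q_crn), dipole_change(q_ff, q_crn, coords))
```

A check of this kind after running msld_py_prep would flag ligands whose Q-RMSD exceeds the 0.001 e⁻ level discussed above as candidates for the “InFrag” reassignment or for closer inspection of their bookending corrections.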

Table 1.

Signed Average Deviations in Computed Relative Free Energies of Hydration (kcal/mol), Charge RMSD (e⁻), Dipole Moment Magnitudes (D) and Directions (radians) Induced by Charge Renormalization for Six Series of Small Molecules.

Series Name	$⟨ΔΔG_{hyd}^{FF \to CRN}⟩$ ± σ (kcal/mol)	⟨Q-RMSD⟩ (e⁻)	⟨Δμ⟩ (D)^a	⟨Δμ(θ)⟩ (radians)^b
123benz	0.072 ± 0.002	0.006	−0.018	0.348
12benz	0.200 ± 0.002	0.008	0.112	0.254
12fused	0.451 ± 0.004	0.022	0.034	0.146
1benz	0.028 ± 0.000	0.001	−0.001	0.072
ChEMBL	−0.098 ± 0.002	0.003	0.090	0.077
FAAH	−0.044 ± 0.001	0.002	0.015	0.042
all series	0.101 ± 0.002	0.007	0.039	0.157

^a Difference in dipole moment magnitudes between original and renormalized charges.

^b Angle between dipole vectors.

Figure 4. — Charge RMSD (e⁻) and changes in dipole moments (D) for 191 small molecules due to charge renormalization. Computed relative free energies of hydration (kcal/mol) between original and renormalized partial atomic charges are color coded as a heatmap.

To begin to explore this idea, binding free energy calculations were computed with MSλD for the 32 FAAH ligands bound to the FAAH receptor. The analogs contained a wide variety of phenyl ring substitutions with differing chemical properties. Charge renormalization was performed on the entire set of ligands, and relative binding free energies were calculated from 5 separate MSλD calculations. Initial MSλD $Δ G_{b i n d}^{C R N}$ results obtained with renormalized charges were then corrected with bookending calculations to obtain $Δ G_{b i n d}^{F F}$ , the CGenFF force field result. Absolute free energies of binding are shown in Figure 5 for both data sets (see also Table S7). Overall, the uncorrected MSλD results yielded a Pearson correlation of 0.54 (± 0.001), a Spearman correlation of 0.56 (± 0.001), and a Kendall tau of 0.38 (± 0.001); yet the MUE was low at 1.07 (± 0.001) kcal/mol, a common target for alchemical binding free energy studies.^4-10 Symmetric confidence intervals were computed at the 95% confidence level with the leave-one-out method using 1000 iterations.⁶⁴ The corrected MSλD results, however, showed improved Pearson, Spearman, and Kendall tau coefficients of 0.60 (± 0.001), 0.64 (± 0.001), and 0.45 (± 0.001), respectively, and the MUE slightly improved to 1.00 (± 0.001) kcal/mol. Though Table 1 suggests that the Q-RMSD and Δμ are small for the FAAH ligands, these binding results indicate that charge renormalization does affect MSλD free energy calculations in a non-trivial manner and that improved accuracy can be obtained by correcting MSλD results obtained with renormalized charges to the true force field result. Fortunately, a single step charge perturbation from original force field charges to renormalized charges allows charge renormalization differences to be corrected quickly without significantly impacting MSλD scalability. 
For example, the MSλD FAAH binding calculations performed herein utilized a total of 2.23 μs of sampling, with an extra 0.64 μs of MD for bookending corrections. Raman et al.’s estimate for running FEP or TI calculations for all 32 FAAH ligands, with cycle closure loops and 5 duplicate production calculations to match our MSλD procedure, suggests that 14.4 μs of sampling would be required with FEP/TI methodologies to obtain equivalent ΔG_bind results. Based solely on amounts of sampling, MSλD alone is ca. 6.5 times more efficient than FEP/TI, and this dips only slightly to ca. 5.0 times more efficient when bookending correction simulations are included. This comparison neglects the contribution of uncertainty in computed estimates of computational efficiency. Thus, MSλD remains highly efficient, in terms of sampling alone, even when ΔG^FF→CRN corrections are included. It is important to note that accuracy and correlation to experiment are force field dependent and, thus, an ancillary focus of this paper. Fortunately, charge renormalization provides a means to employ new or non-CHARMM-based force field parameters with MSλD, allowing accuracy to be tuned if necessary for specific systems of interest, while still using a highly efficient and scalable free energy method.
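The efficiency ratios quoted above follow directly from the three sampling totals given in the text. The short check below simply redoes that arithmetic; the variable names are illustrative and not part of any published script.

```python
# Sampling totals as stated in the text (microseconds).
msld_us = 2.23      # MSλD production sampling
bookend_us = 0.64   # extra MD for bookending corrections
fep_ti_us = 14.4    # FEP/TI estimate attributed to Raman et al.

# Ratio of FEP/TI sampling to MSλD sampling, without and with bookending.
ratio_msld_only = fep_ti_us / msld_us
ratio_with_bookends = fep_ti_us / (msld_us + bookend_us)

print(f"MSλD alone:      {ratio_msld_only:.1f}x")
print(f"with bookending: {ratio_with_bookends:.1f}x")
```

Both ratios count simulation time only, so they say nothing about the statistical uncertainty reached per microsecond.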

Figure 5. — (A) Correlation between experimental and MSλD computed FAAH – ligand binding affinities (kcal/mol) obtained with renormalized partial atomic charges. (B) Correlation between experimental and corrected MSλD computed binding affinities (kcal/mol), representing the original force field’s free energy results.
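As a rough illustration of the agreement metrics reported for Figure 5, the sketch below computes the MUE and the Pearson, Spearman, and Kendall tau coefficients in plain Python. The five data points are invented toy values, not results from this work, and the leave-one-out confidence interval procedure is not reproduced here.

```python
from itertools import combinations
from math import sqrt

def mue(x, y):
    """Mean unsigned error between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) pairs / all pairs (no tie correction)."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    n = len(x)
    return s / (n * (n - 1) / 2)

# Hypothetical binding free energies in kcal/mol (toy values for illustration only).
dg_exp = [-9.0, -8.0, -10.0, -7.0, -8.5]
dg_calc = [-8.5, -8.2, -9.0, -7.5, -7.0]

print("MUE     ", round(mue(dg_exp, dg_calc), 2))
print("Pearson ", round(pearson(dg_exp, dg_calc), 2))
print("Spearman", round(spearman(dg_exp, dg_calc), 2))
print("Kendall ", round(kendall_tau(dg_exp, dg_calc), 2))
```

Rank-based coefficients such as Spearman and Kendall tau depend only on the ordering of the ligands, which is why they can improve after bookending even when the MUE changes little.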

Finally, the results from a previous MSλD investigation⁶ of the same set of 32 FAAH inhibitors bound to the FAAH receptor can be discussed.⁴⁵ That work employed commercial MATCH/CGenFF ligand parameters³⁴ and reported a MUE of 0.70 kcal/mol for computed free energies of binding compared to experiment.⁶ An analysis of their force field parameters was performed to understand potential sources of differences compared to the ParamChem/CGenFF parameters we used. This revealed large deviations in the partial atomic charges, with an average Q-RMSD of 0.069 e⁻. The trends in Figure 4 suggest that large relative free energy differences would be highly likely, a pertinent reminder that electrostatic definitions can have strong impacts on ΔG_hyd or ΔG_bind calculations. Differences in dihedral parameters were also observed. Table S8 shows 6 dihedral angles found in the FAAH ligands where energetic differences between MATCH and ParamChem/CGenFF dihedral definitions were notable (RMSD > 1.0 kcal/mol). Most of these definitions can be mapped to the more flexible regions of the FAAH molecule core, such as torsions in the linker between the 4-aminopyrimidine and hydroxyl groups. Differences in torsional preferences could lead to sampling of alternative ligand conformations or to increased ligand strain upon binding, which could also help explain MUE differences between this work and Raman et al. Thus, this comparison emphasizes the importance of obtaining accurate ligand dihedral parameters and partial atomic charges when performing free energy calculations.
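One way to quantify an energetic difference between two dihedral definitions is to scan the torsion angle and take the RMSD between the two energy profiles. The sketch below does this with the standard CHARMM dihedral form, a sum of K[1 + cos(nφ − δ)] terms. The scan-based RMSD is an assumption about how such a comparison could be made, and both parameter sets are hypothetical rather than taken from Table S8.

```python
from math import cos, radians, sqrt

def dihedral_energy(phi_deg, terms):
    """CHARMM-style dihedral energy: sum over terms of K * (1 + cos(n*phi - delta)).

    terms: list of (K in kcal/mol, multiplicity n, phase delta in degrees).
    """
    return sum(k * (1.0 + cos(radians(n * phi_deg - delta))) for k, n, delta in terms)

def profile_rmsd(terms_a, terms_b, step_deg=5):
    """RMSD (kcal/mol) between two dihedral energy profiles on a uniform scan."""
    angles = range(-180, 180, step_deg)
    diffs = [dihedral_energy(p, terms_a) - dihedral_energy(p, terms_b) for p in angles]
    return sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical parameter sets for the same torsion (not values from this work).
params_set_a = [(2.0, 2, 180.0)]
params_set_b = [(0.5, 2, 180.0)]

rmsd = profile_rmsd(params_set_a, params_set_b)
print(f"profile RMSD = {rmsd:.2f} kcal/mol; notable: {rmsd > 1.0}")
```

This comparison keeps any constant energy offset between the two profiles. Subtracting each profile's minimum first would be a reasonable alternative if only the shape of the torsional preference matters.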

DISCUSSION AND CONCLUSIONS

In this work, we have developed a new framework, instantiated as a set of Python scripts collectively called msld_py_prep, to automate the creation of multiple topology models to investigate single- and multi-site perturbations of small molecule functional groups with MSλD in the CHARMM molecular simulation package. As a key step in this workflow, a process called charge renormalization has been introduced to facilitate the collective modeling of a series of structurally diverse ligands with substituent changes at potentially many different sites of attachment. Within msld_py_prep, an MCSS partitions ligands into a set of core atoms and groups of alchemical fragments for sampling with MSλD; charge renormalization then processes and adjusts atomic charges to facilitate sampling of many alchemical substituents with a single ligand core, while maintaining correct integer net charges for each ligand end-state. Our evaluation of the effects of using charge renormalization for computing free energy differences has shown that while most differences introduced by charge renormalization are small (ΔΔG^FF→CRN < 0.5 kcal/mol), significant outliers (ΔΔG^FF→CRN > 1.0 kcal/mol) are sometimes observed. Since the ΔΔG^FF→CRN magnitude cannot be estimated a priori for an arbitrary ligand series, it is recommended that post hoc bookending corrections be calculated and applied whenever charge renormalization is used. Fortunately, these corrections are easy to set up and can be run in parallel with an MSλD calculation.
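To make the idea of a shared core with integer end-state charges concrete, the toy sketch below averages core charges across ligands and spreads the leftover charge evenly over each ligand's fragment atoms. This is a simplified scheme assumed purely for illustration; it is not the msld_py_prep implementation, and the data layout, function names, and charges are all hypothetical.

```python
from math import sqrt

def renormalize(ligands):
    """Give all ligands one shared set of core charges and keep integer net charges.

    ligands: list of dicts with 'core' and 'frag' lists of partial charges (e);
    every ligand must list the same core atoms in the same order.
    """
    n_core = len(ligands[0]["core"])
    # Shared core: average each core atom's charge over all ligands.
    core_avg = [sum(l["core"][i] for l in ligands) / len(ligands) for i in range(n_core)]
    renormalized = []
    for lig in ligands:
        # The original end-state carries an integer net charge; recover it by rounding.
        target = round(sum(lig["core"]) + sum(lig["frag"]))
        # Charge left over after swapping in the averaged core.
        residual = target - sum(core_avg) - sum(lig["frag"])
        shift = residual / len(lig["frag"])  # assumption: spread evenly over the fragment
        renormalized.append({"core": list(core_avg),
                             "frag": [q + shift for q in lig["frag"]]})
    return renormalized

def q_rmsd(orig, new):
    """Charge RMSD (e) between original and renormalized charges of one ligand."""
    a = orig["core"] + orig["frag"]
    b = new["core"] + new["frag"]
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Two hypothetical neutral ligands sharing a three-atom core.
ligands = [
    {"core": [0.10, -0.20, 0.05], "frag": [0.09, -0.04]},
    {"core": [0.12, -0.18, 0.03], "frag": [-0.27, 0.15, 0.15]},
]
crn = renormalize(ligands)
for lig, new in zip(ligands, crn):
    net = sum(new["core"]) + sum(new["frag"])
    print(f"net charge = {net:+.3f} e, Q-RMSD = {q_rmsd(lig, new):.4f} e")
```

Whatever partitioning rule is used, the Q-RMSD between original and adjusted charges gives a quick measure of how far the multiple topology model has moved from the source force field.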
Because charge renormalization makes only minor adjustments to a molecule’s original set of atomic charges, bookending calculations can consist of a single step charge perturbation with BAR.⁶¹ Convergence in these calculations is established by measuring the degree of overlap between energy distributions of ligand states featuring original or renormalized charges with pymbar.^62,63 In this work, high levels of overlap were observed for all charge perturbations (Figures S9–S15). Furthermore, we note that charge renormalization is only required when alchemical functional groups have different net group charges. For example, with the MATCH atom typing program,³⁴ atoms are charged in such a way that groups of nearby atoms have a neutral or integer net charge; this is not true of ParamChem/CGenFF parameters or most other force fields in popular use. This means that with MATCH parameters it is often feasible to define alchemical fragments in the MTM such that no charge renormalization is needed at all, and in such instances bookending calculations are unnecessary because msld_py_prep has no effect on the original MATCH parameters. In addition, charge renormalization facilitates the use of many additional force fields with MSλD, which may be useful for modeling many diverse series of congeneric ligands. Finally, files generated with msld_py_prep are fully compatible with other MSλD utilities, specifically the ALF algorithm,²³ thus adding to the continued and accelerated use of MSλD in a variety of research endeavors, including structure-based drug discovery.
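A single step BAR estimate of the kind described above can be sketched in a few lines. The code below solves Bennett's self-consistent equation by bisection in reduced (kT) units and adds a crude histogram overlap check. It is a hand-written stand-in and does not use the pymbar API. The synthetic work values are Gaussian toy data, and the function names are hypothetical.

```python
import math
import random

def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def bar(w_f, w_r, lo=-50.0, hi=50.0, tol=1e-8):
    """Reduced free energy difference beta*(F_B - F_A) from Bennett's equation.

    w_f: beta*(U_B - U_A) evaluated on samples drawn from state A
    w_r: beta*(U_A - U_B) evaluated on samples drawn from state B
    """
    m = math.log(len(w_f) / len(w_r))

    def g(df):
        lhs = sum(fermi(m + w - df) for w in w_f)
        rhs = sum(fermi(-m + w + df) for w in w_r)
        return lhs - rhs  # increases monotonically with df

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def histogram_overlap(a, b, bins=30):
    """Overlap coefficient (0 to 1) of two samples using shared histogram bins."""
    low, high = min(a + b), max(a + b)
    width = (high - low) / bins or 1.0

    def hist(v):
        h = [0] * bins
        for x in v:
            h[min(int((x - low) / width), bins - 1)] += 1
        return [c / len(v) for c in h]

    return sum(min(p, q) for p, q in zip(hist(a), hist(b)))

# Synthetic Gaussian work values consistent with a known answer (toy data).
random.seed(0)
df_true, sigma, n = 0.3, 0.5, 2000
w_forward = [random.gauss(df_true + sigma**2 / 2, sigma) for _ in range(n)]
w_reverse = [random.gauss(-df_true + sigma**2 / 2, sigma) for _ in range(n)]

df_est = bar(w_forward, w_reverse)
overlap = histogram_overlap(w_forward, [-w for w in w_reverse])
print(f"BAR estimate = {df_est:.3f} kT (true {df_true}), overlap = {overlap:.2f}")
```

When the forward and negated reverse distributions overlap well, as they do for a small charge adjustment, a single perturbation step is usually sufficient; poor overlap would be a signal to add intermediate states.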

Supplementary Material

Supplementary Information

NIHMS1851979-supplement-Supplementary_Information.pdf (8.4 MB, PDF)

ACKNOWLEDGMENT

We gratefully acknowledge the many Brooks lab members who helped beta test the msld_py_prep workflow.

Funding Sources

The National Institutes of Health is gratefully acknowledged for financial support through grants GM037554, GM130587, and GM107233.

ABBREVIATIONS

MSλD: multisite λ-dynamics
MTM: multiple topology model
FEP: free energy perturbation
TI: thermodynamic integration
MD: molecular dynamics
MCSS: maximum common substructure search
RMSD: root mean square deviation
CRN: charge renormalization
GPU: graphics processing unit
BAR: Bennett acceptance ratio
Q-RMSD: charge root mean square deviation

Footnotes

Supporting Information.

The following files are available free of charge.

Supporting figures and tables: chemical structures of all ligands investigated, thermodynamic cycles, overlapping energy distributions, computed hydration free energies, computed FAAH-ligand binding free energies, and FAAH ligand dihedral comparison. (PDF)

The authors declare no competing financial interests.

Data and Software Availability: msld_py_prep scripts and an accompanying tutorial can be found at github.com/Vilseck-Lab/msld-py-prep for download.

REFERENCES

1. Jorgensen WL The Many Roles of Computation in Drug Discovery. Science 2004, 303, 1813–1818.
2. Song LF; Merz KM Jr. Evolution of Alchemical Free Energy Methods in Drug Discovery. J. Chem. Inf. Model 2020, 60, 5308–5318.
3. Cournia Z; Allen B; Sherman W Relative Binding Free Energy Calculations in Drug Discovery: Recent Advances and Practical Considerations. J. Chem. Inf. Model 2017, 57, 2911–2937.
4. Wang L; Wu Y; Deng Y; Kim B; Pierce L; Krilov G; Lupyan D; Robinson S; Dahlgren MK; Greenwood J; Romero DL; Masse C; Knight JL; Steinbrecher T; Beuming T; Damm W; Harder E; Sherman W; Brewer M; Wester R; Murcko M; Frye L; Farid R; Lin T; Mobley DL; Jorgensen WL; Berne BJ; Friesner RA; Abel R Accurate and Reliable Prediction of Relative Ligand Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field. J. Am. Chem. Soc 2015, 137, 2695–2703.
5. Schindler CEM; Baumann H; Blum A; Böse D; Buchstaller H-P; Burgdorf L; Cappel D; Chekler E; Czodrowski P; Dorsch D; Eguida MKI; Follows B; Fuchß T; Grädler U; Gunera J; Johnson T; Jorand Lebrun C; Karra S; Klein M; Knehans T; Koetzner L; Krier M; Leiendecker M; Leuthner B; Li L; Mochalkin I; Musil D; Neagu C; Rippmann F; Schiemann K; Schulz R; Steinbrecher T; Tanzer E-M; Unzue Lopez A; Viacava Follis A; Wegener A; Kuhn D Large-Scale Assessment of Binding Free Energy Calculations in Active Drug Discovery Projects. J. Chem. Inf. Model 2020, 60, 5457–5474.
6. Raman EP; Paul TJ; Hayes RL; Brooks CL III Automated, Accurate, and Scalable Relative Protein-Ligand Binding Free-Energy Calculations Using Lambda Dynamics. J. Chem. Theory Comput 2020, 16, 7895–7914.
7. Abel R; Wang L; Mobley DL; Friesner RA A Critical Review of Validation, Blind Testing, and Real-World Use of Alchemical Protein-Ligand Binding Free Energy Calculations. Curr. Top. Med. Chem 2017, 17, 2577–2585.
8. Jespers W; Esguerra M; Åqvist J; Gutiérrez-de-Terán H QligFEP: an automated workflow for small molecule free energy calculations in Q. J. Cheminform 2019, 11(26), 1–16.
9. Gapsys V; Pérez-Benito L; Aldeghi M; Seeliger S; van Vlijmen H; Tresadern G; de Groot BL Large scale relative protein ligand binding affinities using non-equilibrium alchemy. Chem. Sci 2020, 11, 1140–1152.
10. Kuhn M; Firth-Clark S; Tosco P; Mey ASJS; Mackey M; Michel J Assessment of Binding Affinity via Alchemical Free-Energy Calculations. J. Chem. Inf. Model 2020, 60, 3120–3130.
11. Kong X; Brooks CL III λ Dynamics: A New Approach to Free Energy Calculations. J. Chem. Phys 1996, 105, 2414–2423.
12. Knight JL; Brooks CL III λ Dynamics Free Energy Simulation Methods. J. Comput. Chem 2009, 30, 1692–1700.
13. Knight JL; Brooks CL III Multisite λ Dynamics for Simulated Structure-Activity Relationship Studies. J. Chem. Theory Comput 2011, 7, 2728–2739.
14. Christ CD; van Gunsteren WF Enveloping distribution sampling: A method to calculate free energy differences from a single simulation. J. Chem. Phys 2007, 126, 184110.
15. Christ CD; van Gunsteren WF Simple, Efficient, and Reliable Computation of Multiple Free Energy Differences from a Single Simulation: A Reference Hamiltonian Parameter Update Scheme for Enveloping Distribution Sampling (EDS). J. Chem. Theory Comput 2009, 5, 276–286.
16. Perthold JW; Oostenbrink C Accelerated Enveloping Distribution Sampling: Enabling Sampling of Multiple End States while Preserving Local Energy Minima. J. Phys Chem. B 2018, 122, 5030–5037.
17. Sidler D; Schwaninger A; Riniker S Replica exchange enveloping distribution sampling (RE-EDS): A robust method to estimate multiple free-energy differences from a single simulation. J. Chem. Phys 2016, 145, 154114.
18. Bieler NS; Häuselmann R; Hünenberger PH Local Elevation Umbrella Sampling Applied to the Calculation of Alchemical Free-Energy Changes via λ-Dynamics: the λ-LEUS Scheme. J. Chem. Theory Comput 2014, 10, 3006–3022.
19. Bieler NS; Hünenberger PH Communication: Estimating the initial biasing potential for λ-local-elevation umbrella-sampling (λ-LEUS) simulations via slow growth. J. Chem. Phys 2014, 141, 201101.
20. Bieler NS; Tschopp JP; Hünenberger PH Multistate λ-Local-Elevation Umbrella-Sampling (MSλ-LEUS): Method and Application to the Complexation of Cations by Crown Ethers. J. Chem. Theory Comput 2015, 11, 2575–2588.
21. Zwanzig RW High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys 1954, 22, 1420–1426.
22. Kirkwood JG Statistical Mechanics of Fluid Mixtures. J. Chem. Phys 1935, 3, 300–313.
23. Hayes RL; Armacost KA; Vilseck JZ; Brooks CL III Adaptive Landscape Flattening Accelerates Sampling of Alchemical Space in Multisite λ Dynamics. J. Phys. Chem. B 2017, 121, 3626–3635.
24. Vilseck JZ; Armacost KA; Hayes RL; Goh GB; Brooks CL III Predicting Binding Free Energies in a Large Combinatorial Chemical Space Using Multisite λ Dynamics. J. Phys. Chem. Lett 2018, 9, 3328–3332.
25. Vilseck JZ; Sohail N; Hayes RL; Brooks CL III Overcoming Challenging Substituent Perturbations with Multisite λ-Dynamics: A Case Study Targeting β-Secretase 1. J. Phys. Chem. Lett 2019, 10, 4875–4880.
26. Brooks BR; Brooks CL III; MacKerell AD Jr; Nilsson L; Petrella RJ; Roux B; Won Y; Archontis G; Bartels C; Boresch S; Caflisch A; Caves L; Cui Q; Dinner AR; Feig M; Fischer S; Gao J; Hodoscek M; Im W; Kuczera K; Lazaridis T; Ma J; Ovchinnikov V; Paci E; Pastor RW; Post CB; Pu JZ; Schaefer M; Tidor B; Venable RM; Woodcock HL; Wu X; Yang W; York DM; Karplus M CHARMM: The Biomolecular Simulation Program. J. Comput. Chem 2009, 30, 1545–1614.
27. Brooks BR; Bruccoleri RE; Olafson BD; States BD; Swaminathan S; Karplus M CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations. J. Comput. Chem 1983, 4, 187–217.
28. Pohorille A; Jarzynski C; Chipot C Good Practices in Free-Energy Calculations. J. Phys. Chem. B 2010, 114, 10235–10253.
29. Knight JL; Brooks CL III Applying Efficient Implicit Constraints in Alchemical Free Energy Simulations. J. Comput. Chem 2011, 32, 3423–3432.
30. Smith MB; March J March’s Advanced Organic Chemistry: Reactions, Mechanisms, and Structure, 6th ed.; John Wiley & Sons, Inc., 2007.
31. Vanommeslaeghe K; Hatcher E; Acharya C; Kundu S; Zhong S; Shim J; Darian E; Guvench O; Lopes P; Vorobyov I; MacKerell AD Jr. CHARMM General Force Field: A Force Field for Drug-Like Molecules Compatible with the CHARMM All-Atom Additive Biological Force Fields. J. Comput. Chem 2010, 31, 671–690.
32. Vanommeslaeghe K; MacKerell AD Jr. Automation of the CHARMM General Force Field (CGenFF) I: Bond Perception and Atom Typing. J. Chem. Inf. Model 2012, 52, 3144–3154.
33. Vanommeslaeghe K; Raman EP; MacKerell AD Jr. Automation of the CHARMM General Force Field (CGenFF) II: Assignment of Bonded Parameters and Partial Atomic Charges. J. Chem. Inf. Model 2012, 52, 3155–3168.
34. Yesselman JD; Price DJ; Knight JL; Brooks CL III MATCH: An Atom-Typing Toolset for Molecular Mechanics Force Fields. J. Comput. Chem 2012, 33, 189–202.
35. The PyMOL Molecular Graphics System, Version 1.8, Schrodinger LLC, New York, New York, United States.
36. RDKit: Open-source cheminformatics. Landrum, G. 2010. https://www.rdkit.org/.
37. Vilseck JZ; Tirado-Rives J; Jorgensen WL Evaluation of CM5 Charges for Condensed-Phase Modeling. J. Chem. Theory Comput 2014, 10, 2802–2812.
38. Mobley DL; Guthrie JP FreeSolv: a database of experimental and calculated hydration free energies, with input files. J. Comput.-Aided Mol. Des 2014, 28, 711–720.
39. Matos GDR; Kyu DY; Loeffler HH; Chodera JD; Shirts MR; Mobley DL Approaches for Calculating Solvation Free Energies and Enthalpies Demonstrated with an Update of the FreeSolv Database. J. Chem. Eng. Data 2017, 62, 1559–1569.
40. Mobley DL; Bayly CI; Cooper MD; Shirts MR; Dill KA Small molecule hydration free energies in explicit solvent: An extensive test of fixed-charge atomistic simulations. J. Chem. Theory Comput 2009, 5, 350–358.
41. Shivakumar D; Deng Y; Roux B Computations of Absolute Solvation Free Energies of Small Molecules Using Explicit and Implicit Solvent Model. J. Chem. Theory Comput 2009, 5, 919–930.
42. Shivakumar D; Williams J; Wu Y; Damm W; Shelley J; Sherman W Prediction of Absolute Solvation Free Energies using Molecular Dynamics Free Energy Perturbation and the OPLS Force Field. J. Chem. Theory Comput 2010, 6, 1509–1519.
43. Mendez D; Gaulton A; Bento AP; Chambers J; De Veij M; Félix E; Magariños MP; Mosquera JF; Mutowo P; Nowotka M; Gordillo-Marañón M; Hunter F; Junco L; Mugumbate G; Rodriguez-Lopez M; Atkinson F; Bosc N; Radoux CJ; Segura-Cabrera A; Hersey A; Leach AR ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 2019, 47, D930–D940.
44. Davies M; Nowotka M; Papadatos G; Dedman N; Gaulton A; Atkinson F; Bellis L; Overington JP ChEMBL web services: streamlining access to drug discovery data and utilities. Nucleic Acids Res. 2015, 43, W612–W620.
45. Saha A; Shih AY; Mirzadegan T; Seierstad M Predicting the Binding of Fatty Acid Amide Hydrolase Inhibitors by Free Energy Perturbation. J. Chem. Theory Comput 2018, 14, 5815–5822.
46. O'Boyle NM; Banck M; James CA; Morley C; Vandermeersch T; Hutchison GR Open Babel: An open chemical toolbox. J Cheminform. 2011, 3, 33.
47. Pettersen EF; Goddard TD; Huang CC; Couch GS; Greenblatt DM; Meng EC; Ferrin TE UCSF Chimera--a visualization system for exploratory research and analysis. J. Comput. Chem 2004, 25, 1605–1612.
48. Zdrazil B; Guha R The Rise and Fall of a Scaffold: A Trend Analysis of Scaffolds in the Medicinal Chemistry Literature. J. Med. Chem 2018, 61, 4688–4703.
49. Taylor RB; MacCoss R; Lawson ADG Combining Molecular Scaffolds from FDA Approved Drugs: Application to Drug Discovery. J. Med. Chem 2017, 60, 1638–1647.
50. Riniker S; Landrum GA Better Informed Distance Geometry: Using What We Know to Improve Conformation Generation. J. Chem. Inf. Model 2015, 55, 2562–2574.
51. Tosco P; Stiefel N; Landrum G Bringing the MMFF Force Field to RDKit: Implementation and Validation. Journal of Cheminformatics. 2014, 6, 37.
52. Halgren TA Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. J Comput Chem. 1996, 17, 490–519.
53. Olsson MH; Sondergaard CR; Rostkowski M; Jensen JH PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. J. Chem. Theory Comput 2011, 7, 525–537.
54. Jo S; Kim T; Iyer VG; Im W CHARMM-GUI: A Web-based Graphical User Interface for CHARMM. J. Comput. Chem 2008, 29, 1859–1865.
55. Lee J; Cheng X; Swails JM; Yeom MS; Eastman PK; Lemkul JA; Wei S; Buckner J; Jeong JC; Qi Y; Jo S; Pande VS; Case DA; Brooks CL III; MacKerell AD Jr; Klauda JB; Im W CHARMM-GUI Input Generator for NAMD, GROMACS, AMBER, OpenMM, and CHARMM/OpenMM Simulations using the CHARMM36 Additive Force Field. J. Chem. Theory Comput 2016, 12, 405–413.
56. Best RB; Mittal J; Feig M; MacKerell AD Jr. Inclusion of Many-Body Effects in the Additive CHARMM Protein CMAP Potential Results in Enhanced Cooperativity of α-Helix And β-Hairpin Formation. Biophys. J 2012, 103, 1045–1051.
57. Best RB; Zhu X; Shim J; Lopes PEM; Mittal J; Feig M; MacKerell AD Jr. Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ, and Side-Chain χ₁ and χ₂ Dihedral Angles. J. Chem. Theory Comput 2012, 8, 3257–3273.
58. Feig M; Karanicolas J; Brooks CL III MMTSB Tool Set: Enhanced Sampling and Multiscale Modeling Methods for Applications in Structural Biology. J. Mol. Graph. Model 2004, 22, 377–395.
59. Hynninen AP; Crowley MF New Faster CHARMM Molecular Dynamics Engine. J. Comput. Chem 2014, 35, 406–413.
60. Steinbach PJ; Brooks BR New Spherical-Cutoff Methods for Long-Range Forces in Macromolecular Simulation. J. Comp. Chem 1994, 15, 667–683.
61. Shirts MR; Bair E; Hooker G; Pande VS Equilibrium Free Energies from Nonequilibrium Measurements Using Maximum-Likelihood Methods. Phys. Rev. Lett 2003, 91, 140601.
62. Shirts MR; Chodera JD Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys 2008, 129, 124105.
63. Klimovich PV; Shirts MR; Mobley DL Guidelines for the analysis of free energy calculations. J. Comput. Aided Mol. Des 2015, 29, 397–411.
64. Leave-One-Out Cross-Validation. In: Sammut C, Webb GI (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. 2011. doi:10.1007/978-0-387-30164-8_469

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Information

NIHMS1851979-supplement-Supplementary_Information.pdf^{(8.4MB, pdf)}

[R1] 1.Jorgensen WL The Many Roles of Computation in Drug Discovery. Science 2004, 303, 1813–1818. [DOI] [PubMed] [Google Scholar]

[R2] 2.Song LF; Merz KM Jr. Evolution of Alchemical Free Energy Methods in Drug Discovery. J. Chem. Inf. Model 2020, 60, 5308–5318. [DOI] [PubMed] [Google Scholar]

[R3] 3.Cournia Z; Allen B; Sherman W Relative Binding Free Energy Calculations in Drug Discovery: Recent Advances and Practical Considerations. J. Chem. Inf. Model 2017, 57, 2911–2937. [DOI] [PubMed] [Google Scholar]

[R4] 4.Wang L; Wu Y; Deng Y;Kim B; Pierce L; Krilov G; Lupyan D; Robinson S; Dahlgren MK; Greenwood J; Romero DL; Masse C; Knight JL; Steinbrecher T; Beuming T; Damm W; Harder E; Sherman W; Brewer M; Wester R; Murcko M; Frye L; Farid R; Lin T; Mobley DL; Jorgensen WL; Berne BJ; Friesner RA; Abel R Accurate and Reliable Prediction of Relative Ligand Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field. J. Am. Chem. Soc 2015, 137, 2695–2703. [DOI] [PubMed] [Google Scholar]

[R5] 5.Schindler CEM; Baumann H; Blum A; Böse D; Buchstaller H-P; Burgdorf L; Cappel D; Chekler E; Czodrowski P; Dorsch D; Eguida MKI; Follows B; Fuchß T; Grädler U; Gunera J; Johnson T; Jorand Lebrun C; Karra S; Klein M; Knehans T; Koetzner L; Krier M; Leiendecker M; Leuthner B; Li L; Mochalkin I; Musil D; Neagu C; Rippmann F; Schiemann K; Schulz R; Steinbrecher T; Tanzer E-M; Unzue Lopez A; Viacava Follis A; Wegener A; Kuhn D Large-Scale Assessment of Binding Free Energy Calculations in Active Drug Discovery Projects. J. Chem. Inf. Model 2020, 60, 5457–5474. [DOI] [PubMed] [Google Scholar]

[R6] 6.Raman EP; Paul TJ; Hayes RL; Brooks CL III Automated, Accurate, and Scalable Relative Protein-Ligand Binding Free-Energy Calculations Using Lambda Dynamics. J. Chem. Theory Comput 2020, 16, 7895–7914. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] 7.Abel R; Wang L; Mobley DL; Friesner RA A Critical Review of Validation, Blind Testing, and Real-World Use of Alchemical Protein-Ligand Binding Free Energy Calculations. Curr. Top. Med. Chem 2017, 17, 2577–2585. [DOI] [PubMed] [Google Scholar]

[R8] 8.Jespers W; Esguerra M; Åqvist J; Gutiérrez-de-Terán H QligFEP: an automated workflow for small molecule free energy calculations in Q. J. Cheminform 2019, 11(26), 1–16. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R9] 9.Gapsys V; Pérez-Benito L; Aldeghi M; Seeliger S; van Vlijmen H; Tresadern G; de Groot BL Large scale relative protein ligand binding affinities using non-equilibrium alchemy. Chem. Sci 2020, 11, 1140–1152. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] 10.Kuhn M; Firth-Clark S; Tosco P; Mey ASJS; Mackey M; Michel J Assessment of Binding Affinity via Alchemical Free-Energy Calculations. J. Chem. Inf. Model 2020, 60, 3120–3130. [DOI] [PubMed] [Google Scholar]

[R11] 11.Kong X; Brooks CL III λ Dynamics: A New Approach to Free Energy Calculations. J. Chem. Phys 1996, 105, 2414–2423. [Google Scholar]

[R12] 12.Knight JL; Brooks CL III λ Dynamics Free Energy Simulation Methods. J. Comput. Chem 2009, 30, 1692–1700. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R13] 13.Knight JL; Brooks CL III Multisite λ Dynamics for Simulated Structure-Activity Relationship Studies. J. Chem. Theory Comput 2011, 7, 2728–2739. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R14] 14.Christ CD; van Gunsteren WF Enveloping distribution sampling: A method to calculate free energy differences from a single simulation. J. Chem. Phys 2007, 126, 184110. [DOI] [PubMed] [Google Scholar]

[R15] 15.Christ CD; van Gunsteren WF Simple, Efficient, and Reliable Computation of Multiple Free Energy Differences from a Single Simulation: A Reference Hamiltonian Parameter Update Scheme for Enveloping Distribution Sampling (EDS). J. Chem. Theory Comput 2009, 5, 276–286. [DOI] [PubMed] [Google Scholar]

[R16] 16.Perthold JW; Oostenbrink C Accelerated Enveloping Distribution Sampling: Enabling Sampling of Multiple End States while Preserving Local Energy Minima. J. Phys Chem. B 2018, 122, 5030–5037. [DOI] [PubMed] [Google Scholar]

[R17] 17.Sidler D; Schwaninger A; Riniker S Replica exchange enveloping distribution sampling (RE-EDS): A robust method to estimate multiple free-energy differences from a single simulation. J. Chem. Phys 2016, 145, 154114. [DOI] [PubMed] [Google Scholar]

[R18] 18.Bieler NS; Häuselmann R; Hünenberger PH Local Elevation Umbrella Sampling Applied to the Calculation of Alchemical Free-Energy Changes via λ-Dynamics: the λ-LEUS Scheme. J. Chem. Theory Comput 2014, 10, 3006–3022. [DOI] [PubMed] [Google Scholar]

[R19] 19.Bieler NS; Hünenberger PH Communication: Estimating the initial biasing potential for λ-local-elevation umbrella-sampling (λ-LEUS) simulations via slow growth. J. Chem. Phys 2014, 141, 201101. [DOI] [PubMed] [Google Scholar]

[R20] 20.Bieler NS; Tschopp JP; Hünenberger PH Multistate λ-Local-Elevation Umbrella-Sampling (MSλ-LEUS): Method and Application to the Complexation of Cations by Crown Ethers. J. Chem. Theory Comput 2015, 11, 2575–2588. [DOI] [PubMed] [Google Scholar]

[R21] 21.Zwanzig RW High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases. J. Chem. Phys 1954, 22, 1420–1426. [Google Scholar]

[R22] 22.Kirkwood JG Statistical Mechanics of Fluid Mixtures. J. Chem. Phys 1935, 3, 300–313. [Google Scholar]

[R23] 23.Hayes RL; Armacost KA; Vilseck JZ; Brooks CL III Adaptive Landscape Flattening Accelerates Sampling of Alchemical Space in Multisite λ Dynamics. J. Phys. Chem. B 2017, 121, 3626–3635. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R24] 24.Vilseck JZ; Armacost KA; Hayes RL Goh GB; Brooks CL III Predicting Binding Free Energies in a Large Combinatorial Chemical Space Using Multisite λ Dynamics. J. Phys. Chem. Lett 2018, 9, 3328–3332. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R25] 25.Vilseck JZ; Sohail N; Hayes RL; Brooks CL III Overcoming Challenging Substituent Perturbations with Multisite λ-Dynamics: A Case Study Targeting β-Secretase 1. J. Phys. Chem. Lett 2019, 10, 4875–4880. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R26] 26.Brooks BR; Brooks CL III; MacKerell AD Jr; Nilsson L; Petrella RJ; Roux B; Won Y; Archontis G; Bartels C; Boresch S; Caflisch A; Caves L; Cui Q; Dinner AR; Feig M; Fischer S; Gao J; Hodoscek M; Im W; Kuczera K; Lazaridis T; Ma J; Ovchinnikov V; Paci E; Pastor RW; Post CB; Pu JZ; Schaefer M; Tidor B; Venable RM; Woodcock HL; Wu X; Yang W; York DM; Karplus M CHARMM: The Biomolecular Simulation Program. J. Comput. Chem 2009, 30, 1545–1614. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R27] 27.Brooks BR; Bruccoleri RE; Olafson BD; States BD; Swaminathan S; Karplus M CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations. J. Comput. Chem 1983, 4, 187–217. [Google Scholar]

[R28] 28.Pohorille A; Jarzynski C; Chipot C Good Practices in Free-Energy Calculations. J. Phys. Chem. B 2010, 114, 10235–10253. [DOI] [PubMed] [Google Scholar]

[R29] 29.Knight JL; Brooks CL III Applying Efficient Implicit Constraints in Alchemical Free Energy Simulations. J. Comput. Chem 2011, 32, 3423–3432. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R30] 30.Smith BS; March J March’s Advanced Organic Chemistry: Reactions, Mechanisms, and Structure, 6^th ed.; John Wiley & Sons, Inc., 2007. [Google Scholar]

[R31] 31.Vanommeslaeghe K; Hatcher E; Acharya C; Kundu S; Zhong S; Shim J; Darian E; Guvench O; Lopes P; Vorobyov I; MacKerell AD Jr. CHARMM General Force Field: A Force Field for Drug-Like Molecules Compatible with the CHARMM All-Atom Additive Biological Force Fields. J. Comput. Chem 2010, 31, 671–690. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R32] 32.Vanommeslaeghe K; MacKerell AD Jr. Automation of the CHARMM General Force Field (CGenFF) I: Bond Perception and Atom Typing. J. Chem. Inf. Model 2012, 52, 3144–3154. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R33] 33.Vanommeslaeghe K; Raman EP; MacKerell AD Jr. Automation of the CHARMM General Force Field (CGenFF) II: Assignment of Bonded Parameters and Partial Atomic Charges. J. Chem. Inf. Model 2012, 52, 3155–3168. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R34] 34.Yesselman JD; Price DJ; Knight JL; Brooks CL III MATCH: An Atom-Typing Toolset for Molecular Mechanics Force Fields. J. Comput. Chem 2012, 33, 189–202. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R35] 35.The PyMOL Molecular Graphics System, Version 1.8, Schrodinger LLC, New York, New York, United States. [Google Scholar]

[R36] 36.RDKit: Open-source cheminformatics. Landrum, G. 2010. https://www.rdkit.org/. [Google Scholar]

[R37] 37.Vilseck JV; Tirado-Rives J; Jorgensen WL Evaluation of CM5 Charges for Condensed-Phase Modeling. J. Chem. Theory Comput 2014, 10, 2802–2812. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R38] 38.Mobley DL; Guthrie JP FreeSolv: a database of experimental and calculated hydration free energies, with input files. J. Comput.-Aided Mol. Des 2014, 28, 711–720. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R39] 39.Matos GDR; Kyu DY; Loeffler HH; Chodera JD; Shirts MR; Mobley DL Approaches for Calculating Solvation Free Energies and Enthalpies Demonstrated with an Update of the FreeSolv Database. J. Chem. Eng. Data 2017, 62, 1559–1569. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R40] 40. Mobley DL; Bayly CI; Cooper MD; Shirts MR; Dill KA Small molecule hydration free energies in explicit solvent: An extensive test of fixed-charge atomistic simulations. J. Chem. Theory Comput 2009, 5, 350–358.

[R41] 41. Shivakumar D; Deng Y; Roux B Computations of Absolute Solvation Free Energies of Small Molecules Using Explicit and Implicit Solvent Model. J. Chem. Theory Comput 2009, 5, 919–930.

[R42] 42. Shivakumar D; Williams J; Wu Y; Damm W; Shelley J; Sherman W Prediction of Absolute Solvation Free Energies using Molecular Dynamics Free Energy Perturbation and the OPLS Force Field. J. Chem. Theory Comput 2010, 6, 1509–1519.

[R43] 43. Mendez D; Gaulton A; Bento AP; Chambers J; De Veij M; Félix E; Magariños MP; Mosquera JF; Mutowo P; Nowotka M; Gordillo-Marañón M; Hunter F; Junco L; Mugumbate G; Rodriguez-Lopez M; Atkinson F; Bosc N; Radoux CJ; Segura-Cabrera A; Hersey A; Leach AR ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 2019, 47, D930–D940.

[R44] 44. Davies M; Nowotka M; Papadatos G; Dedman N; Gaulton A; Atkinson F; Bellis L; Overington JP ChEMBL web services: streamlining access to drug discovery data and utilities. Nucleic Acids Res. 2015, 43, W612–W620.

[R45] 45. Saha A; Shih AY; Mirzadegan T; Seierstad M Predicting the Binding of Fatty Acid Amide Hydrolase Inhibitors by Free Energy Perturbation. J. Chem. Theory Comput 2018, 14, 5815–5822.

[R46] 46. O'Boyle NM; Banck M; James CA; Morley C; Vandermeersch T; Hutchison GR Open Babel: An open chemical toolbox. J Cheminform. 2011, 3, 33.

[R47] 47. Pettersen EF; Goddard TD; Huang CC; Couch GS; Greenblatt DM; Meng EC; Ferrin TE UCSF Chimera--a visualization system for exploratory research and analysis. J. Comput. Chem 2004, 25, 1605–1612.

[R48] 48. Zdrazil B; Guha R The Rise and Fall of a Scaffold: A Trend Analysis of Scaffolds in the Medicinal Chemistry Literature. J. Med. Chem 2018, 61, 4688–4703.

[R49] 49. Taylor RD; MacCoss M; Lawson ADG Combining Molecular Scaffolds from FDA Approved Drugs: Application to Drug Discovery. J. Med. Chem 2017, 60, 1638–1647.

[R50] 50. Riniker S; Landrum GA Better Informed Distance Geometry: Using What We Know to Improve Conformation Generation. J. Chem. Inf. Model 2015, 55, 2562–2574.

[R51] 51. Tosco P; Stiefl N; Landrum G Bringing the MMFF Force Field to RDKit: Implementation and Validation. J Cheminform. 2014, 6, 37.

[R52] 52. Halgren TA Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. J. Comput. Chem 1996, 17, 490–519.

[R53] 53. Olsson MHM; Søndergaard CR; Rostkowski M; Jensen JH PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. J. Chem. Theory Comput 2011, 7, 525–537.

[R54] 54. Jo S; Kim T; Iyer VG; Im W CHARMM-GUI: A Web-based Graphical User Interface for CHARMM. J. Comput. Chem 2008, 29, 1859–1865.

[R55] 55. Lee J; Cheng X; Swails JM; Yeom MS; Eastman PK; Lemkul JA; Wei S; Buckner J; Jeong JC; Qi Y; Jo S; Pande VS; Case DA; Brooks CL III; MacKerell AD Jr; Klauda JB; Im W CHARMM-GUI Input Generator for NAMD, GROMACS, AMBER, OpenMM, and CHARMM/OpenMM Simulations using the CHARMM36 Additive Force Field. J. Chem. Theory Comput 2016, 12, 405–413.

[R56] 56. Best RB; Mittal J; Feig M; MacKerell AD Jr. Inclusion of Many-Body Effects in the Additive CHARMM Protein CMAP Potential Results in Enhanced Cooperativity of α-Helix and β-Hairpin Formation. Biophys. J 2012, 103, 1045–1051.

[R57] 57. Best RB; Zhu X; Shim J; Lopes PEM; Mittal J; Feig M; MacKerell AD Jr. Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ, and Side-Chain χ₁ and χ₂ Dihedral Angles. J. Chem. Theory Comput 2012, 8, 3257–3273.

[R58] 58. Feig M; Karanicolas J; Brooks CL III MMTSB Tool Set: Enhanced Sampling and Multiscale Modeling Methods for Applications in Structural Biology. J. Mol. Graph. Model 2004, 22, 377–395.

[R59] 59. Hynninen AP; Crowley MF New Faster CHARMM Molecular Dynamics Engine. J. Comput. Chem 2014, 35, 406–413.

[R60] 60. Steinbach PJ; Brooks BR New Spherical-Cutoff Methods for Long-Range Forces in Macromolecular Simulation. J. Comput. Chem 1994, 15, 667–683.

[R61] 61. Shirts MR; Bair E; Hooker G; Pande VS Equilibrium Free Energies from Nonequilibrium Measurements Using Maximum-Likelihood Methods. Phys. Rev. Lett 2003, 91, 140601.

[R62] 62. Shirts MR; Chodera JD Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys 2008, 129, 124105.

[R63] 63. Klimovich PV; Shirts MR; Mobley DL Guidelines for the analysis of free energy calculations. J. Comput.-Aided Mol. Des 2015, 29, 397–411.

[R64] 64. Leave-One-Out Cross-Validation. In: Sammut C, Webb GI (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. 2011. doi:10.1007/978-0-387-30164-8_469
