J Chem Inf Model. 2025 Jun 6;65(12):6057–6072. doi: 10.1021/acs.jcim.5c00452

Path-Based Nonequilibrium Binding Free Energy Estimation, from Protein–Ligand to RNA-Ligand Binding

Eleonora Serra, Alessia Ghidini, Riccardo Aguti, Mattia Bernetti, Sergio Decherchi, Andrea Cavalli
PMCID: PMC12199290  PMID: 40476389

Abstract

In this study, we addressed the challenge of estimating binding free energies in complex biological systems of pharmaceutical relevance, including both protein–ligand and RNA-ligand complexes. As case studies, we examined the intricate binding of the drug Gleevec to Abl-tyrosine kinase and two ligands binding to the preQ1 RNA riboswitch. By refining our approach based on nonequilibrium steered molecular dynamics simulations and path-based collective variables, we tackled the specific difficulties posed by these systems. In particular, the Abl–Gleevec complex is characterized by significant system size and extensive conformational rearrangements of the protein, whereas the systems involving RNA are characterized by marked conformational flexibility. For the Abl–Gleevec system, our method produced binding free energy estimates closely aligned with experimental values, demonstrating its reliability. For the RNA-ligand complexes investigated, we found that the simpler water model TIP3P yields more accurate free energy estimates than the TIP4P-D model, offering practical insight for future research. In this case, the agreement with the experimental results is reasonable. Overall, this work underscores the effectiveness of the proposed path-based workflow in handling complex biomolecular systems with unique characteristics, enabling systematic binding free energy predictions across a variety of targets.



Introduction

The binding free energy (ΔF b) quantifies the affinity of a potential drug for its biological target, making ΔF b a central thermodynamic observable in drug discovery campaigns. During the past few decades, numerous computational methods have been developed to estimate ΔF b, yet accurately predicting this parameter remains a challenge in many cases. The difficulties stem from various aspects, including the high degrees of freedom of the systems and the inherent flexibility of both the receptor and the ligand.

Motivated by the goal of developing a systematic protocol for computing binding free energies with path-based methods, we recently presented a semiautomatic computational workflow based on nonequilibrium simulations, which was successful on systems of moderate size. However, challenges emerged when dealing with intricate systems, such as the Abl–Gleevec complex, which produced very high work dissipation and hence poor convergence of the free energy estimates. To improve convergence in large systems, we developed a refined strategy aimed at mitigating dissipation. In this work, taking advantage of this improved nonequilibrium strategy for binding free energy estimation, we deal with intricate protein and RNA targets. Specifically, we take the Abl tyrosine kinase–Gleevec complex and two RNA-ligand complexes as test cases, showcasing the adaptability of our computational workflow. Additionally, for the RNA case, we propose an ad hoc approach to generate accurate unbinding paths via electrostatics.

The 2-phenylaminopyrimidine derivative Imatinib, commercially known as Gleevec, is a marketed anticancer drug that structurally resembles ATP, the natural substrate of Abl kinase, allowing it to bind to the ATP-binding site of the kinase. Upon binding, Imatinib inhibits this site, blocking the enzymatic activity of the kinase. From a computational drug design perspective, modeling the binding and unbinding of Imatinib to Abl kinase poses significant challenges. This is mainly due to the extensive protein rearrangements that distinguish the active state of Abl kinase, which is the predominant conformation in solution, from the inactive state it adopts upon binding to Imatinib (Figure 2). Key changes occur in the activation loop (A-loop) of Abl, which controls access to the active site and contains a tyrosine residue that is phosphorylated to modulate activity. In the Abl active state, the A-loop is in an open conformation that permits the binding of substrates (ATP or ADP) and their phosphorylation. In contrast, when Abl binds Imatinib, the A-loop undergoes a rotation that displaces central residues by up to 35 Å. Additionally, a distinctive DFG (Asp-Phe-Gly) motif within the kinase activation loop undergoes a 180° rotation in the Imatinib-inhibited structure compared to the uninhibited form. In the latter, the motif adopts the DFG-in conformation, where the DFG Asp residue points into the active site, allowing it to coordinate with Mg-ATP. However, in the structure inhibited by Imatinib, the DFG motif is in the out conformation, causing the Phe of the DFG motif to flip into the catalytic pocket.

Figure 2. (Top) X-ray crystallographic structures of the c-Abl kinase (left) in an active state (A-loop in the open conformation and DFG-flip in the “in” conformation, PDB ID: 2F4J) and (right) in an inhibited state, in complex with the ligand Gleevec (A-loop in the closed conformation and DFG-flip in the “out” conformation, PDB ID: 1IEP). The smaller amino-terminal part is the upper lobe, while the lower lobe represents the carboxyl-terminal part. The catalytic site lies in a cleft between the two lobes, and the relative orientations of the two lobes open or close the cleft. The activation loop begins with the DFG sequence and adopts different conformations in the active and inactive states. (Bottom) Focus on the DFG motif (left) in the “in” conformation and (right) in the “out” conformation.

The ability of Imatinib to induce substantial protein conformational changes, involving more than 20 residues, is closely related to its therapeutic impact. An additional complexity arises from the high flexibility and large size of the ligand Imatinib. Nonetheless, investigating the interactions and binding/unbinding mechanisms of the Abl–Gleevec complex is of high biological and pharmaceutical significance. To this end, our nonequilibrium protocol based on physical pathways may represent an ideal choice to compute the binding free energy of this complex, as it can provide insights into the underlying interactions and mechanisms. Consequently, the Abl–Gleevec complex is used in this work as a representative case of a challenging and pharmaceutically relevant protein–ligand complex.

The challenges faced in the Abl–Gleevec system mirror the complexity encountered when targeting RNA molecules. RNA molecules have recently attracted increasing interest as promising pharmaceutical targets for small molecules, boosted by the recent FDA approval of Risdiplam, a small-molecule drug targeting RNA. The peculiar features of RNA molecules, such as complex structural dynamics and high charge density, along with the limited knowledge about RNA-ligand interactions, make them challenging targets for both experimental and computational approaches. Thus, further investigations of RNA-ligand recognition are highly desirable and may promote tangible advancements toward the design of RNA-targeting small-molecule drugs. In this context, RNA riboswitches are emerging as a compelling class of RNA targets, as they are able to bind metabolites and modulate gene expression as a result. In particular, since they are mainly found in bacteria, including human pathogens, they are gaining relevance given the urgency of bacterial resistance, which is recognized by the World Health Organization (WHO) as a major threat. A concrete example is given by the campaign against the bacterial FMN riboswitch, with a compound at the preclinical stage. Given the availability of experimental structures in complex with different ligands, along with experimental affinity data, we take the preQ1 riboswitch as a test case of our procedure in the context of RNA-ligand binding.

Overall, this work demonstrates the applicability of the proposed path-based method to protein–ligand systems of realistic complexity, as well as to the emerging class of RNA-ligand partners. The approach is designed to be systematically applicable, highlighting its potential for drug discovery endeavors.

Methods

The main steps of our nonequilibrium strategy combining Steered Molecular Dynamics (SMD), Path Collective Variables (PCVs) and the Crooks Fluctuation Theorem (CFT) are summarized here and detailed in the following subsections:

  1. Generation of unbinding MD trajectories starting from a bound complex through Adiabatic Bias MD (ABMD) coupled with an electrostatic-like collective variable (CV) to promote protein–ligand dissociation. Among several trajectories generated, the one representing the most probable and plausible mechanism is chosen as the guess path.

  2. The guess path is further optimized using two path algorithms, namely the Principal Path Algorithm and the Equidistant Waypoints Algorithm, resulting in an optimized reference path to be used in the subsequent simulations.

  3. Multiple replicates of bidirectional (i.e., binding and unbinding) nonequilibrium SMD simulations are conducted following the reference path via the PCVs. The Jarzynski work (W J) performed during the simulations is calculated.

  4. The Free Energy Surface (FES) is computed by applying the CFT piece-wise along the path to the W J values collected during binding and unbinding SMD simulations.

  5. Finally, the standard binding free energy is estimated as the sum of the binding free energy obtained by the ratio of the bound and unbound partition functions along the FES and a correction term due to the accessible ligand volume with respect to the standard volume.

A schematic representation of the computational pipeline is reported in Figure 1.

Figure 1. Schematic representation of the computational pipeline based on nonequilibrium SMD simulations and the Crooks Fluctuation Theorem.

The strategy relies on Steered Molecular Dynamics, where a time-dependent harmonic potential is added to the regular system potential. Several trivially parallel replicates of SMD simulations are performed to sample the unbinding/binding process following the predefined physical pathway connecting the bound and unbound states. As bidirectional simulations are performed, the Crooks Fluctuation Theorem can be used to determine the FES, leveraging data from both binding and unbinding simulations. The FES is the Landau free energy profile along the collective variable used to perform the enhanced sampling simulation, in this case S(x).

Adiabatic Bias MD Simulations

The first step in the pipeline is the definition of PCVs. To identify PCVs for each studied system, multiple MD trajectories of ligand unbinding from the native bound pose, as found in the respective X-ray crystallographic complexes, were produced with the enhanced sampling method ABMD. In ABMD, a target value for the collective variable is defined and achieved thanks to a moving harmonic restraint that is active only when the collective variable is not progressing toward the defined value. The main advantage of using ABMD to derive guess trajectories instead of other enhanced sampling methods lies in its gentleness, as the restraint center progresses toward the target CV value due to thermal fluctuations only.
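The ratchet-like behavior of the ABMD restraint can be illustrated with a minimal sketch. It assumes the commonly used form of the bias, in which the squared distance of the CV from its target is compared with the smallest value reached so far, and it omits any stochastic term on the restraint center. The function name `abmd_bias` and the argument `kappa` are illustrative and do not come from the simulation setup used here.

```python
import numpy as np

def abmd_bias(cv_traj, target, kappa):
    """Evaluate an ABMD-style bias energy along a CV time series.

    rho(t) is the squared distance of the CV from the target value, and
    rho_min(t) is the smallest rho reached up to time t. The harmonic
    restraint is felt only when the system moves back (rho > rho_min),
    so progress toward the target is never pushed, only retained.
    """
    rho = (np.asarray(cv_traj, dtype=float) - target) ** 2
    rho_min = np.minimum.accumulate(rho)  # closest approach so far
    return 0.5 * kappa * (rho - rho_min) ** 2

# Illustrative CV series: the bias is zero whenever a new minimum is reached.
bias = abmd_bias([5.0, 4.0, 4.5, 3.0, 3.2], target=0.0, kappa=2.0)
```

In this toy series the bias is nonzero only at the third and fifth frames, where the CV moves away from the target after having approached it.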

For protein–ligand systems, starting from the bound complex, the unbinding process is sampled with ABMD using an electrostatic-like CV. Specifically, this CV consists of a fictitious electrostatic potential generated by fictitious charges of the same sign placed on the ligand and the protein. The target value of this fictitious potential is set to zero, corresponding to ligand dissociation. An ABMD force constant of 10–17 (kJ/mol)−3 was employed. Conversely, for the RNA-ligand complexes we took advantage of the inherent electrostatics of the systems. Given the highly negative RNA electrostatic potential and the positive charge of the ligands, ABMD simulations were run employing a CV based on the Debye–Hückel interaction energy. Through this strategy, we guided the systems toward zero electrostatic interaction energy between the RNA and the ligands, leading to complete dissociation of the two partners. Force constants of 10–1 and 10–2 (kJ/mol)−3 were used in the ABMD simulations.
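To make the RNA-ligand CV more concrete, the following is a minimal sketch of a screened (Debye–Hückel) electrostatic interaction energy between two groups of point charges. The functional form is the standard one; the relative permittivity, the temperature and the function names are assumptions made for illustration, and only the 0.15 M ionic strength mirrors the salt concentration used in this work.

```python
import numpy as np

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
KB = 1.380649e-23           # Boltzmann constant, J/K
NA = 6.02214076e23          # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19  # elementary charge, C
COULOMB = 138.935458        # 1/(4*pi*eps0) in kJ mol^-1 nm e^-2

def debye_length_nm(ionic_strength_molar, eps_r=78.5, temperature=298.15):
    """Debye screening length (nm) for a 1:1 electrolyte (eps_r, T assumed)."""
    ionic_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    length_m = np.sqrt(EPS0 * eps_r * KB * temperature
                       / (2.0 * NA * E_CHARGE**2 * ionic_si))
    return length_m * 1e9

def debye_huckel_energy(xyz_a, q_a, xyz_b, q_b,
                        ionic_strength_molar=0.15, eps_r=78.5,
                        temperature=298.15):
    """Screened Coulomb energy (kJ/mol) between groups A and B.

    Coordinates are in nm and charges in units of e. The energy tends to
    zero as the two groups separate, which is the target used to drive
    dissociation.
    """
    xyz_a = np.asarray(xyz_a, dtype=float)
    xyz_b = np.asarray(xyz_b, dtype=float)
    r = np.linalg.norm(xyz_a[:, None, :] - xyz_b[None, :, :], axis=-1)
    lam = debye_length_nm(ionic_strength_molar, eps_r, temperature)
    qq = np.outer(q_a, q_b)
    return COULOMB / eps_r * np.sum(qq * np.exp(-r / lam) / r)

# A -1 charge (RNA-like) and a +1 charge (ligand-like) 1 nm apart.
energy = debye_huckel_energy([[0, 0, 0]], [-1.0], [[1, 0, 0]], [1.0])
```

Because the energy is attractive for oppositely charged partners and vanishes at large separation, steering it toward zero separates the ligand from the RNA without prescribing an exit direction.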

For each complex, several ABMD simulations were carried out, aiming to capture all possible unbinding pathways. Among these, the trajectory observed with the highest probability is chosen as the guess path. In the Results and Discussion section, we discuss additional criteria that we found relevant to select an optimal guess path.

The selected trajectory of each complex is optimized with two path algorithms: the principal path algorithm, which optimizes the path in the configurational space from an initial bound state to a final unbound state, and a refined version of the equidistant waypoint algorithm, which equalizes the mean-square deviation (MSD) between subsequent configurations of the path, as required when using PCVs. As a result, a smooth path connecting the bound and unbound states is obtained, composed of consecutive configurations that are equidistant in terms of MSD (with an average MSD between subsequent frames of 1 Å²).

Path Collective Variables (PCVs)

In order to use the CFT estimator, SMD simulations must be performed bidirectionally (i.e., simulating the binding and unbinding) using the same protocol. The PCVs developed by Branduardi et al. are ideal for this task, as they enable mapping the position of a point in the configurational space relative to a predefined path. Therefore, PCVs can be used to follow a predetermined pathway forward and backward, avoiding ambiguities in the evolution of the system. PCVs consist of S(x), which measures the progression along the predefined path, and Z(x), which measures the orthogonal deviation from the pathway

$$S(x) = \frac{\sum_{i=1}^{p} i\, e^{-\lambda \lVert x - x_i \rVert^{2}}}{\sum_{j=1}^{p} e^{-\lambda \lVert x - x_j \rVert^{2}}} \tag{1}$$

$$Z(x) = -\lambda^{-1} \ln \sum_{i=1}^{p} e^{-\lambda \lVert x - x_i \rVert^{2}} \tag{2}$$

where p is the number of molecular structures included in the reference path and ‖x − x_i‖² measures the distance (herein the MSD in the Cartesian space) between the ith configuration in the path (x_i) and the instantaneous microscopic configuration (x), while λ controls the smoothness. The S(x) PCV can be seen as an indicator function of the progression of the configuration: when x ≈ x_i, S(x) takes approximately the integer value i. To achieve a correct mapping, λ can be parametrized as follows

$$\lambda = \frac{2.3\, p}{\sum_{i=1}^{p-1} \lVert x_i - x_{i+1} \rVert^{2}} \tag{3}$$

where ‖x_i − x_{i+1}‖² measures the distance between the ith and the (i + 1)th configuration and p is the number of configurations in the reference path. S(x) can assume values from 1 to p; however, in this study, S(x) is normalized to a range of 0 to 1.
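As an illustration of eqs 1 and 2, the sketch below evaluates S(x) and Z(x) for a configuration against a reference path with NumPy. It is a simplified stand-in for what an enhanced sampling plugin computes: structures are not aligned before the MSD is taken, `lam` is passed in directly rather than derived from the path, and the function names are illustrative.

```python
import numpy as np

def msd(a, b):
    """Mean-square deviation between two (n_atoms, 3) arrays (no alignment)."""
    return float(np.mean(np.sum((a - b) ** 2, axis=-1)))

def path_cvs(x, path, lam):
    """Return S(x), Z(x) and S(x) rescaled to [0, 1] for a reference path.

    path has shape (p, n_atoms, 3); frame indices run from 1 to p as in eq 1.
    """
    d2 = np.array([msd(x, frame) for frame in path])
    d2_min = d2.min()
    # Shift the exponents by the smallest distance for numerical stability.
    w = np.exp(-lam * (d2 - d2_min))
    index = np.arange(1, len(path) + 1)
    s = np.sum(index * w) / np.sum(w)
    z = d2_min - np.log(np.sum(w)) / lam
    s_norm = (s - 1.0) / (len(path) - 1.0)
    return s, z, s_norm

# Toy path: one atom moving along x in unit steps (consecutive MSD of 1).
path = np.array([[[float(i), 0.0, 0.0]] for i in range(5)])
lam = 2.3 / 1.0  # about 2.3 divided by the MSD between neighboring frames
s, z, s_norm = path_cvs(path[2], path, lam)
```

For the middle frame of this symmetric toy path, S is 3 and the normalized progress is 0.5, while Z is slightly negative because neighboring frames also contribute to the logarithm.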

SMD Simulations

To sample the binding/unbinding events, SMD simulations with PCVs are performed. In SMD simulations, a time-dependent harmonic restraint R(x, t) is applied along the pulling coordinate Ŝ(t)

$$R(x,t) = \frac{1}{2} k \left( S(x) - \hat{S}(t) \right)^{2} \tag{4}$$

Additionally, a half-harmonic wall (flat-bottom potential) is employed to confine the system along Z. This upper wall is active only when the Z(x) PCV value surpasses a threshold of Z = 0.05 nm².

A key quantity to be introduced is the Jarzynski work (W J). This is the path integral of ξ̇ ∂H_ξ/∂ξ along the trajectory Γ_t

$$W_J = \int_{0}^{t_s} dt\, \dot{\xi}\, \frac{\partial H_{\xi}}{\partial \xi}(\Gamma_t) \tag{5}$$

where H ξ represents the time-dependent part of the Hamiltonian that is added to the regular potential in SMD and ξ is the time-varying variable. The Jarzynski work of a simulation in the canonical ensemble accounts for the work required to transition the system from the initial to the final state (or vice versa).
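For the moving harmonic restraint of eq 4, the time-varying variable is the restraint center Ŝ(t), and the integrand of eq 5 reduces to −k (S(x) − Ŝ) dŜ. The sketch below accumulates this work from recorded time series of S(x) and Ŝ(t) using the trapezoidal rule. The function name and the toy numbers (including the lag of 0.1) are illustrative; only the force constant of 20 echoes the value quoted in this section.

```python
import numpy as np

def smd_work(s_traj, s_center, k):
    """Cumulative work of a moving harmonic restraint 0.5*k*(S - S_hat)^2.

    dH/dS_hat = -k * (S - S_hat), so the work is the integral of this
    quantity over the displacement of the restraint center.
    """
    s_traj = np.asarray(s_traj, dtype=float)
    s_center = np.asarray(s_center, dtype=float)
    dh_dcenter = -k * (s_traj - s_center)
    d_center = np.diff(s_center)
    # Trapezoidal rule on each step of the restraint center.
    steps = 0.5 * (dh_dcenter[1:] + dh_dcenter[:-1]) * d_center
    return np.concatenate([[0.0], np.cumsum(steps)])

# Toy unbinding run: the system lags 0.1 behind a center moving from 0 to 1.
center = np.linspace(0.0, 1.0, 101)
work = smd_work(center - 0.1, center, k=20.0)
```

In this toy case the final work is k × lag × displacement = 2.0, and it would be zero if S(x) followed the restraint center exactly.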

According to the second law of thermodynamics, in a quasi-static transformation, this amount of work corresponds to the free energy difference between states A and B of the system, ΔF AB = F(B) – F(A). However, for nonequilibrium irreversible transformations, the total Jarzynski work will, on average, exceed the free energy difference by a quantity known as the dissipated work (W_J^diss)

$$W_J^{\mathrm{diss}} = \langle W_J \rangle - \Delta F_{AB} \geq 0 \tag{6}$$

During SMD simulations, the higher the pulling speed, the greater the total W J, and in turn the higher the dissipated work. Several binding and unbinding nonequilibrium simulations with a specific pulling speed are necessary for free energy estimation. The SMD restraint moves at constant velocity, thus the simulation length is inversely proportional to the pulling speed.

The pulling speed and the number of SMD replicas for each system were chosen to ensure convergence of the free energy estimates. For all complexes, 30 binding and 30 unbinding simulations of 100 ns were required. Moreover, to ensure consistency across different pulling speeds, the SMD simulations of the Abl–Gleevec system were also performed with a length of 200 ns. The force constant of the moving harmonic restraint applied during the SMD is 20 kJ/mol for both systems.

FES Estimation

The free energy profile along S(x) is reconstructed from SMD simulations using the bidirectional nonequilibrium CFT estimator. According to the Crooks Fluctuation Theorem

$$\frac{P_f(W_J)}{P_b(-W_J)} = \exp\left[ \beta \left( W_J - \Delta F_{AB} \right) \right] \tag{7}$$

where β = 1/k B T, k B is the Boltzmann constant, and T is the absolute temperature of the simulated system. P f (W J) and P b (−W J) are the forward W J and backward −W J distributions, respectively. Specifically, to compute the free energy profile we relied on a maximum likelihood interpretation of the CFT based on the Bennett algorithm. As we perform an equal number of forward and backward SMD replicates with PCVs along the same reference pathway, free energy differences can be estimated by solving self-consistently the following equation

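$$\left\langle \left\{ 1 + \exp\left[ \beta \left( W_J^{f} - \Delta F_{S_i S_{i+1}} \right) \right] \right\}^{-1} \right\rangle_{A \to B} = \left\langle \left\{ 1 + \exp\left[ -\beta \left( W_J^{b} - \Delta F_{S_i S_{i+1}} \right) \right] \right\}^{-1} \right\rangle_{B \to A} \tag{8}$$

where the work values from forward and backward simulations, W_J^f and W_J^b, are related to the [S_i, S_{i+1}] interval, with W_J^b expressed in the sign convention of the backward distribution in eq 7 (i.e., as the negative of the work performed during the backward simulation). The FES point F(S_{i+1}) corresponds to F(S_i) + ΔF_{S_i S_{i+1}}, where the index i identifies the ith configuration of the reference path. Therefore, solving eq 8 for progressively increasing values of i yields the complete free energy profile along S(x). As we normalized S(x) between 0 and 1, S(x) = 0 corresponds to the bound state, while S(x) = 1 corresponds to the unbound state.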
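The self-consistent solution of the Bennett-type relation can be sketched with a simple bisection, assuming equal numbers of forward and backward replicas. Here `w_reverse` is taken as the work actually performed during the backward simulations, so it enters with +ΔF; this is equivalent to using the sign-reversed backward work of the Crooks relation. The function names and the bisection solver are illustrative choices and are not claimed to reproduce the implementation used for the results.

```python
import numpy as np

def fermi(x):
    """Fermi function 1 / (1 + exp(x)), clipped to avoid overflow."""
    return 1.0 / (1.0 + np.exp(np.clip(x, -700.0, 700.0)))

def bar_delta_f(w_forward, w_reverse, beta, tol=1e-10):
    """Solve <f(beta*(Wf - dF))>_f = <f(beta*(Wr + dF))>_r for dF."""
    w_f = np.asarray(w_forward, dtype=float)
    w_r = np.asarray(w_reverse, dtype=float)

    def imbalance(delta_f):
        # Monotonically increasing in delta_f, so bisection is safe.
        return (np.mean(fermi(beta * (w_f - delta_f)))
                - np.mean(fermi(beta * (w_r + delta_f))))

    # The root is bracketed by the extreme values of Wf and -Wr.
    candidates = np.concatenate([w_f, -w_r])
    lo, hi = candidates.min(), candidates.max()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def free_energy_profile(wf_segments, wr_segments, beta):
    """Accumulate F(S_{i+1}) = F(S_i) + dF_i along the path, with F(S_0) = 0."""
    increments = [bar_delta_f(wf, wr, beta)
                  for wf, wr in zip(wf_segments, wr_segments)]
    return np.concatenate([[0.0], np.cumsum(increments)])

# Two path intervals with one forward and one backward work value each.
profile = free_energy_profile([[4.0], [1.0]], [[-2.0], [1.0]], beta=1.0)
```

In the toy input, each interval has the same dissipation in both directions, so the estimator returns the midpoint between the forward work and the negated backward work.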

Finally, to enable a comparison with the estimates obtained from the bidirectional method, unidirectional Jarzynski estimates can be calculated using a similar procedure and the Jarzynski equality

$$\Delta F_{AB} = -\frac{1}{\beta} \ln \left\langle \exp\left( -\beta W_J^{f} \right) \right\rangle_{f} \tag{9}$$

Here, the angular brackets denote an exponential average taken over N nonequilibrium trajectories, in which the system transitions from its initial state to the target state, following an identical protocol.
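The exponential average is dominated by the smallest work values and can overflow or underflow when evaluated naively. A minimal sketch of a numerically stable evaluation, using a log-sum-exp shift, is given below; the function name is illustrative.

```python
import numpy as np

def jarzynski_delta_f(work, beta):
    """Unidirectional Jarzynski estimate: -(1/beta) * ln < exp(-beta W) >."""
    a = -beta * np.asarray(work, dtype=float)
    a_max = a.max()
    # Shift by the largest exponent before exponentiating.
    log_mean = a_max + np.log(np.mean(np.exp(a - a_max)))
    return -log_mean / beta

# With identical work values the estimate equals that work.
estimate = jarzynski_delta_f([2.0, 2.0, 2.0], beta=1.0)
```

By Jensen's inequality the estimate never exceeds the arithmetic mean of the work values, in line with the non-negative dissipated work discussed above.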

Standard Binding Free Energy Estimation

To compare the results of the protocol with experimental values, we calculate the standard binding free energy (ΔF b°), namely the sum of the binding free energy (ΔF b) and the standard volume correction term (ΔF v), as described, for instance, by Doudou et al.

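$$\Delta F_b^{\circ} = \Delta F_b + \Delta F_v = -\frac{1}{\beta} \ln \frac{Q_{\mathrm{site}}}{Q_{\mathrm{bulk}}} - \frac{1}{\beta} \ln \frac{V_{\mathrm{bulk}}}{V^{\circ}} \tag{10}$$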

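Here, ΔF b is determined by the ratio between the probabilities of the bound and unbound ligand states, i.e., the ratio of the canonical partition functions of the bound (Q site) and unbound (Q bulk) states. Specifically, this ratio is obtained by integrating the FES along S(x) in the bound and unbound regions

$$\frac{Q_{\mathrm{site}}}{Q_{\mathrm{bulk}}} = \frac{\int_{\mathrm{site}} \exp\left( -\frac{F(S)}{RT} \right) dS}{\int_{\mathrm{bulk}} \exp\left( -\frac{F(S)}{RT} \right) dS} \tag{11}$$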

This approach requires identifying the specific molecular configuration in the reference path that discriminates between the bound and unbound regions. The S(x) value corresponding to this configuration is selected through analysis of the FES followed by visual inspection of the trajectories, relying on the protocol we outlined previously.

The second term in eq 10 is the standard volume correction term, ΔF v. It quantifies the variation of the free energy due to considering the standard-state volume V°, corresponding to 1661 Å³ (concentration of 1 M), instead of the effectively sampled unbound volume V bulk. This contribution is computed using NanoShaper. In detail, we first isolated the set of ligand configurations of the reference pathway associated with the unbound state. On the union (namely, the aggregation of the pdb files) of these configurations, we computed the solvent excluded surface. The resulting volume is a proxy of the volume spanned by the ligand in the unbound state.
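The two contributions to the standard binding free energy can be combined in a few lines once a free energy profile, a bound/unbound dividing value of S and an unbound volume are available. The sketch below assumes energies in kcal/mol, a temperature of 300 K and a simple trapezoidal integration; the function name, the toy profile and the dividing value `s_cut` are hypothetical and serve only to show the bookkeeping.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1
V_STANDARD = 1661.0     # standard-state volume in cubic angstroms (1 M)

def _trapezoid(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def standard_binding_free_energy(s, fes, s_cut, v_bulk, temperature=300.0):
    """Sum of the FES-based term and the standard volume correction.

    s, fes  : grid along the path CV and free energy (kcal/mol) on that grid
    s_cut   : value of S separating bound (S <= s_cut) from unbound states
    v_bulk  : sampled unbound volume in cubic angstroms
    """
    s = np.asarray(s, dtype=float)
    fes = np.asarray(fes, dtype=float)
    kt = KB_KCAL * temperature
    # The constant shift by the minimum cancels in the ratio of integrals.
    weights = np.exp(-(fes - fes.min()) / kt)
    bound = s <= s_cut
    q_site = _trapezoid(weights[bound], s[bound])
    q_bulk = _trapezoid(weights[~bound], s[~bound])
    delta_f_b = -kt * np.log(q_site / q_bulk)
    delta_f_v = -kt * np.log(v_bulk / V_STANDARD)
    return delta_f_b + delta_f_v

# Toy profile: an 8 kcal/mol well for S <= 0.3 and a flat unbound region.
s_grid = np.linspace(0.0, 1.0, 101)
toy_fes = np.where(s_grid <= 0.3, -8.0, 0.0)
dF_std = standard_binding_free_energy(s_grid, toy_fes, 0.3, v_bulk=1661.0)
```

When the sampled unbound volume equals the standard volume the correction vanishes, and a tenfold larger unbound volume makes the standard binding free energy more favorable by kT ln 10.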

Finally, binding free energy errors are calculated via bootstrap analysis.
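A bootstrap error of this kind can be sketched as follows, assuming that whole replicas are resampled with replacement and that the estimator is recomputed on each resampled set. The number of resamples, the function names and the toy estimator are assumptions for illustration and are not the settings used for the SMD results.

```python
import numpy as np

def bootstrap_std(w_forward, w_reverse, estimator, n_boot=400, seed=0):
    """Standard deviation of an estimator over bootstrap resamples.

    Forward and backward replicas are resampled independently, each with
    replacement and with the original sample size.
    """
    rng = np.random.default_rng(seed)
    w_f = np.asarray(w_forward, dtype=float)
    w_r = np.asarray(w_reverse, dtype=float)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        f_sample = rng.choice(w_f, size=w_f.size, replace=True)
        r_sample = rng.choice(w_r, size=w_r.size, replace=True)
        estimates[b] = estimator(f_sample, r_sample)
    return float(estimates.std(ddof=1))

def midpoint_estimator(w_f, w_r):
    """Toy estimator used only to demonstrate the resampling loop."""
    return 0.5 * (w_f.mean() - w_r.mean())

# 30 forward and 30 backward toy work values, as many as the SMD replicas.
rng = np.random.default_rng(42)
err = bootstrap_std(rng.normal(12.0, 1.0, 30), rng.normal(-8.0, 1.0, 30),
                    midpoint_estimator)
```

In practice the toy estimator would be replaced by the full free energy calculation, so that the spread reflects every step from work values to the final estimate.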

Well-Tempered Metadynamics Simulations

To further assess the reliability of our results, we repeated the calculations with an established protocol, namely Well-Tempered Metadynamics (MetaD) simulations coupled with PCVs. The deposition time of Gaussians was set to 250 MD steps, and a bias factor of 15 was selected. A Gaussian height of 1.0 kcal/mol was used, while the Gaussian width was set to 0.2 along S(x) and 0.01 nm² along Z(x). As for the SMD simulations, the orthogonal deviation from the reference path was restricted with a wall at Z = 0.05 nm² with a force constant of 4 × 10⁷ kJ mol⁻¹ nm⁻⁴.

Production runs resulted in 1 μs simulations for the two Riboswitch-ligand systems, and 1.5 μs for the Abl–Gleevec complex. Convergence was determined based on two conditions: the full diffusivity of the system along the PCV S(x), and the residual Gaussian height being less than 10% of the initial height, aligning with previous methodologies.
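The residual-height criterion follows from the way well-tempered metadynamics rescales each deposited Gaussian by the bias already accumulated at its center. The sketch below uses the standard well-tempered relations with the bias factor of 15 and the initial height of 1.0 kcal/mol quoted above; the temperature of 300 K and the function names are assumptions.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1

def wt_gaussian_height(initial_height, bias_at_center, bias_factor,
                       temperature):
    """Height of the next Gaussian given the bias already deposited there.

    The height decays as exp(-V / (kB * dT)) with dT = (bias_factor - 1) * T.
    """
    delta_t = (bias_factor - 1.0) * temperature
    return initial_height * np.exp(-bias_at_center / (KB_KCAL * delta_t))

def fes_from_bias(bias, bias_factor):
    """Free energy estimate from the converged bias, zeroed at its minimum."""
    fes = -bias_factor / (bias_factor - 1.0) * np.asarray(bias, dtype=float)
    return fes - fes.min()

# Bias needed for the height to fall below 10% of its initial value.
threshold = KB_KCAL * (15.0 - 1.0) * 300.0 * np.log(10.0)
height = wt_gaussian_height(1.0, threshold, bias_factor=15.0,
                            temperature=300.0)
```

Under these assumptions the height drops to 10% of its initial value once roughly 19 kcal/mol of bias has accumulated at the deposition point.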

For each system, we derived the FES from the Well-Tempered MetaD simulations, and the free energy was set to zero at the lowest point, corresponding to the ligand bound state. To estimate the standard binding free energies from the FES, the same procedure used for the SMD simulations was applied. The statistical errors on the standard binding free energies estimated from the Well-Tempered MetaD were computed via bootstrap, after dividing the MetaD simulation into 10 blocks and using 400 bootstrapping iterations.

Setup of the Systems

The Abl kinase is composed of a small N-terminal lobe, consisting of five β-strands and an α-helix, and a larger C-terminal lobe, made up of multiple helices. The ATP-binding site is positioned in the cleft between these two lobes. The X-ray structure of the Abl–Gleevec complex is represented in Figure 2 and compared to an X-ray structure of Abl in an active conformation.

The binding mechanism and selectivity of Abl are strongly influenced by two loops: the activation-loop (residues 384–403) and the glycine-rich loop (residues 248–257). The activation loop (A-loop) is integral to the structure and function of the catalytic active site and begins with a highly conserved DFG (Asp-Phe-Gly) sequence that plays a crucial role in determining the active or inactive state of the receptor. When the kinase is inactive, the A-loop adopts a compact conformation, while when the enzyme is active, the A-loop is in a more extended conformation. The glycine-rich loop in the N-lobe, also known as the phosphate-binding loop (P-loop), primarily facilitates binding of the natural ATP ligand through hydrophobic interactions. Another pivotal amino acid residue is the gatekeeper Thr338, which regulates access to the active site.

To model the Abl–Gleevec complex, we started from the X-ray structure with PDB code 1IEP (chain A only), where the Abl A-loop in the carboxy-terminal lobe is in the closed conformation, with the catalytic motif in the DFG-out state. The ligand binds in the cleft between the amino- and carboxy-terminal lobes of the kinase. In this inhibitor-bound state, two conserved residues, Leu248 and Val256, play crucial roles by providing key interactions with the inhibitor. When Gleevec binds in the binding site, its amide group acts as an anchor by interacting with adjacent glutamate and aspartate residues, helping to properly orient the ligand within the pocket. Additionally, the NH group of the secondary amine of Gleevec interacts with the side chain of the gatekeeper residue.

Abl was modeled with the AMBER99SB force field, while Gleevec was modeled using the General Amber Force Field (GAFF) and AM1-BCC point charges. The protonation states of amino acid residues were assigned based on their ionization states at physiological pH, assuming standard pKa values. Additionally, the piperazine group of Gleevec is known to be protonated within the Abl pocket to facilitate interactions with surrounding residues. Thus, a positive charge was added to the outermost N-methyl piperazine nitrogen, in accordance with previous studies. Solvation was carried out using the TIP3P water model within a cubic box with sides of 1.5 nm, large enough to ensure complete dissociation of the ligand from the protein pocket. To achieve physiological salt concentration (0.15 M) and neutralization, we introduced sodium and chloride ions. The fully solvated protein–ligand system then underwent energy minimization and equilibration, following a previously detailed procedure.

The topology and starting structure for the preQ1 riboswitch in complex with the two ligands were taken from a recent work by Wang et al. In that work, the RNA-ligand complexes were modeled from the X-ray crystallographic structures with PDB codes 6E1W and 6E1U for the cognate and synthetic ligand, respectively. Ligands were parametrized according to the GAFF force field with AM1-BCC charges, considering a total charge of +1 due to the presence of a quaternary nitrogen in both. The systems were then solvated with the TIP4P-D water model, a four-site model commonly employed in conjunction with the DESRES force field for RNA. Charges were neutralized with sodium and chloride ions, using a concentration of 0.15 M. The solvated systems were then equilibrated in two steps following a previously reported protocol.

In addition, we reparameterized the cognate ligand with a more accurate representation of the charges, using Restrained Electrostatic Potential (RESP) charges and optimized dihedral parameters obtained via the PlayMolecule web server. Specifically, RESP charges were derived via quantum mechanics using the 6-311++G** basis set at the ωB97X-D level of theory, while optimization of dihedral angle parameters was performed via the xTB method. As the starting X-ray structure (PDB code 6E1W) does not have hydrogen atoms, the cognate ligand can be bound to the RNA receptor in two possible tautomeric states. Consequently, an additional complex was constructed for the cognate ligand, starting from the same X-ray structure but with a different ligand protonation state, representing the alternative tautomeric form of the ligand.

Results and Discussion

In this study, we address the estimation of binding free energy (with a previously defined methodology) in challenging biological systems of pharmaceutical relevance, namely a complex protein–ligand system together with the emerging class of RNA and their ligand partners. As test cases, we take the drug Gleevec binding to the Abl-tyrosine kinase and the preQ1 RNA riboswitch in complex with two ligands. Both these systems present remarkable challenges in terms of size and conformational flexibility. In the following sections, we discuss the results for these two test cases.

Abl–Gleevec: Ligand Flexibility and Large Conformational Rearrangements

The complex formed by the Abl-tyrosine kinase and its potent inhibitor Gleevec holds significance not only from a therapeutic perspective but also due to its complexity, particularly due to the flexibility and size of Gleevec and the large rearrangements that Abl must undergo. Gleevec is a marketed drug widely used for cancer treatment that exhibits remarkable inhibitory activity for the Abl-tyrosine kinase (experimental ΔF b = −10.9 kcal/mol).

In a previous study, we applied our nonequilibrium protocol to investigate the Abl–Gleevec system, but we encountered significant challenges due to its large number of atoms and complex binding mechanism. In this work, we overcome such issues, obtaining a satisfactory binding free energy estimate for the Abl–Gleevec complex. Moreover, we streamline the computational procedure required to extend the applicability of our protocol to larger, pharmaceutically relevant biomolecular systems (with complex structure and dynamics), using Abl–Gleevec as a relevant prototype of this class.

Path Definition and PCVs for Abl–Gleevec

A proper parametrization of PCVs can help minimize the dissipated work generated during nonequilibrium SMD simulations. This aspect is critical since high dissipated work (i.e., heat production) can hamper the convergence of binding free energy estimates. As we demonstrated in a recent work, improved PCVs can be obtained by introducing into the reference path, along with ligand atoms, other degrees of freedom of the system that are relevant to the process. This is mainly reflected in the S PCV definition, as these additional degrees of freedom will be pulled by the SMD bias. To improve the Abl–Gleevec PCV definition, we visually inspected the binding trajectories, since binding (and not unbinding) was the critical event in terms of dissipated work. Binding simulations revealed that the rapid pull of the ligand into the binding pocket did not allow sufficient time for the pocket to adapt. As a result, the Abl binding pocket maintained a predominantly closed conformation as the ligand approached, leading to significant dissipated work.

Our analysis aligns with other studies, which describe Gleevec binding to the apo-kinase as an extremely complex process involving a hybrid mechanism of conformational selection and induced fit. Moreover, the substantial conformational changes of the activation and glycine-rich loops must be taken into account to allow the binding pocket of Abl to adopt its previously identified, unusual tunnel-like shape.

Consequently, the opening and closing of the Abl pocket during the binding and unbinding simulations is a crucial degree of freedom in the overall process. To properly include this conformational rearrangement during SMD simulations, heavy atoms of the binding pocket (within 6 Å from the ligand in the binding pose) were incorporated into the reference path. Thus, during binding simulations, the SMD bias potential not only pulls the ligand toward the binding pocket, but also guides structural rearrangements within the protein binding site. This refined path is anticipated to reduce the dissipated Jarzynski work, improving the convergence of binding free energy estimates. In particular, for binding simulations, this optimized path is expected to facilitate the conformational changes required in the target, thereby helping the binding mechanism.
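The selection of pocket atoms to be added to the reference path can be sketched as a distance filter on the bound pose. The example below assumes an atom-wise criterion (a protein heavy atom is kept if it lies within 6 Å of any ligand atom) and plain coordinate arrays in Å; the function name, the `is_heavy` mask and the toy coordinates are illustrative.

```python
import numpy as np

def pocket_atom_indices(protein_xyz, ligand_xyz, is_heavy=None, cutoff=6.0):
    """Indices of protein atoms within `cutoff` of any ligand atom.

    protein_xyz : (n_protein, 3) coordinates of the bound pose
    ligand_xyz  : (n_ligand, 3) coordinates of the bound ligand
    is_heavy    : optional boolean mask selecting non-hydrogen protein atoms
    """
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    diff = protein_xyz[:, None, :] - ligand_xyz[None, :, :]
    min_dist = np.sqrt(np.sum(diff ** 2, axis=-1)).min(axis=1)
    selected = min_dist <= cutoff
    if is_heavy is not None:
        selected &= np.asarray(is_heavy, dtype=bool)
    return np.flatnonzero(selected)

# Three protein atoms and one ligand atom on a line.
protein = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [20.0, 0.0, 0.0]]
ligand = [[1.0, 0.0, 0.0]]
pocket = pocket_atom_indices(protein, ligand)
```

Because the same atom set must be present in every frame of the reference path, the selection is made once on the bound pose and then kept fixed along the whole path.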

Additionally, to define an optimal path, the exit trajectory of the ligand (its main axis) must be as orthogonal as possible to the protein surface during the unbinding event, especially for proteins with solvent-exposed binding pockets such as Abl. Moreover, for large and charged ligands like Gleevec, the conventional solvation shell definition used for smaller ligands is inadequate for accurately defining the unbound state. In the final configuration of the reference path, the ligand must be positioned at least 15 Å away from the protein surface, corresponding to more than three solvation shells, to avoid relevant electrostatic interactions in the unbound state (see Supporting Information section “Additional results for Abl-Gleevec”). Finally, the reference path must include all relevant intermediate metastable states involved during the binding event.

As anticipated, multiple ABMD simulations were performed to generate the initial trajectory used to define the reference path and the PCVs. Longer ABMD simulations can promote a gentler, more realistic and spontaneous unbinding process. For this reason, we followed the same setup as previously reported, except that we now performed several 20 ns ABMD runs (instead of 10 ns) simulating the unbinding of Gleevec from Abl, until a separation of at least 15 Å between the centers of mass of the ligand and the binding pocket was reached. From these simulations, we selected the guess path based on the observed frequency of the mechanism, as well as the orthogonality of the ligand to the protein surface during unbinding. For each system studied, only a single predominant path is observed. This guess path was then refined using the previously mentioned path algorithms, resulting in an optimized path of 42 equidistant molecular configurations comprising both the ligand and the pocket heavy atoms within 6 Å from the ligand in the bound pose, for a total of 51 residues. Among the residues selected to refine the path definition, we ensured the inclusion of key residues critical to protein rearrangements, as identified in the literature and confirmed by our ABMD trajectories. These include the DFG motif (Asp381, Phe382, Gly383), the first residue of the activation loop (Leu384), and nearly all residues of the glycine-rich loop. In particular, we included two conserved residues of the glycine-rich loop, Leu248 and Val256, which consistently interact with Gleevec in all bound conformations and remain in contact during the unbinding process, until Gleevec reaches its fully solvated state.

The main relevant configurations of the determined reference pathway are shown in Figure 3. The path starts from the bound pose found in the X-ray structure of the complex. Here, the inhibitor’s pyridine and pyrimidine rings occupy the binding site where the natural ligand binds, while the remaining portion of the inhibitor extends deeper into the hydrophobic core of the kinase, stabilizing it in its inactive conformation. Then, in this pathway, the significant rearrangements of the P-loop and the C-helix allow the unbinding of Gleevec, accompanied by movements of the A-loop. Gleevec unbinds orthogonally to the cleft between the N-terminal lobe and the C-terminal lobe, reaching the unbound state in the bulk. These observations are in agreement with the reported binding mechanism of Gleevec to Abl. However, capturing the complete conformational rearrangements associated with the transition of Abl from its inactive to active state is far from trivial and would require plain MD simulations on the microsecond time scale. As a result, the reference path induced by our fast ABMD simulations does not include the flip of the DFG-motif in the A-loop, nor does it fully capture the extensive rearrangement of the A-loop, which involves a rotation that results in displacements of up to 35 Å for central residues. Note that our goal here is to apply a standard procedure and not to systematically customize the protocol for each system; the outcome is that our protocol does not completely capture the activation process, hence it represents an approximate, coarse view of the binding event (see later for a further discussion).

Figure 3. Schematic representation of the reference path of the Abl–Gleevec complex. Abl kinase is shown in gray in a ribbon representation, with the pocket atoms included in the reference path in green and licorice. The surfaces of the path configurations of Gleevec are represented from dark red for the bound state to dark blue for the unbound state (for clarity, only the most relevant configurations are displayed).

Finally, to minimize boundary effects of PCVs that could distort the FES profile, fictitious configurations generated with short SMD runs were added at both ends of the reference path. Despite being physically unreachable during simulations, these configurations help to improve the sampling of the true bound and unbound states.

Binding Free Energy Estimates

Using the newly defined reference path for PCV-based sampling via SMD simulations, we performed 30 production runs for both binding and unbinding events. Additionally, we carried out the simulations for two time lengths, 100 and 200 ns (i.e., for two different pulling speeds, since we performed simulations with a restraint at constant velocity). To understand whether these SMD simulation times were sufficient to reach convergence of the estimates, we performed a statistical analysis assessing convergence as a function of the number of replicas.

We calculated W J values for the system, obtaining the work profiles over the simulation times reported in Supporting Information (Figure S3). These profiles demonstrate a reduction in the dissipated work during SMD simulations compared to the previously obtained ones (ref Figure 8), resulting from the inclusion of pocket atoms in the reference path. Furthermore, the similarity of the work curves at 100 and 200 ns suggests that 100 ns already provides a sufficiently gentle pulling speed to reach convergence.

Using our procedure based on the CFT estimator, we reconstructed the FESs along S(x), shown in Figure 4. The error details for each point of the FES are provided in the Supporting Information in Figure S4. The 100 ns free energy profile reveals a higher activation energy barrier for binding, probably because faster SMD simulations are more affected by kinetics, resulting in elevated barriers in the free energy profile. Nonetheless, the similarity between the free energy profiles (as for the work curves) obtained at both pulling speeds indicates that 100 ns is sufficient to achieve thermodynamically (not kinetically) accurate results.

Figure 4. Free energy profiles along S(x) obtained by applying CFT to SMD of 100 and 200 ns with the refined PCVs for Abl–Gleevec.

To calculate the binding free energy from the FESs, we computed the ratio of the bound and unbound partition functions, selecting the 18th configuration as the molecular structure distinguishing the bound and unbound states. After applying the standard volume correction (amounting to −1.4 kcal/mol), we obtained standard binding free energies comparable to the experimental reference of −10.9 kcal/mol. The CFT-derived estimates from 100 and 200 ns SMD simulations are −12.5 ± 1 kcal/mol and −13.0 ± 2 kcal/mol, respectively, in fair agreement with the experimental value. The results show that the inclusion of the protein pocket degrees of freedom leads to a reduction of the dissipated work, improving the convergence and precision of binding free energy estimates.
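The partition-function ratio described above can be sketched in a few lines. This is a minimal illustration and not the authors' script: the function name, the 300 K default temperature, and the convention that grid points up to the dividing value count as bound are assumptions, and the standard volume correction is passed in as a number rather than derived.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def _trapezoid(y, x):
    # Plain trapezoidal rule, written out to avoid NumPy version differences.
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def binding_free_energy(s, fes, s_cut, temperature=300.0, volume_correction=0.0):
    """Return F_bound - F_unbound (kcal/mol) from a 1D profile F(s).

    s                 : grid of path-progress values, bound state at low s (assumed)
    fes               : free energy on that grid, in kcal/mol
    s_cut             : dividing value; s <= s_cut is treated as bound (assumed)
    volume_correction : standard-state term added to the result, in kcal/mol
    """
    s = np.asarray(s, dtype=float)
    fes = np.asarray(fes, dtype=float)
    beta = 1.0 / (KB * temperature)
    # Shift by the minimum so the Boltzmann factors stay well conditioned.
    weights = np.exp(-beta * (fes - fes.min()))
    bound = s <= s_cut
    z_bound = _trapezoid(weights[bound], s[bound])
    z_unbound = _trapezoid(weights[~bound], s[~bound])
    return -np.log(z_bound / z_unbound) / beta + volume_correction
```

With a 42-frame path, `binding_free_energy(s, fes, s_cut=18, volume_correction=-1.4)` mirrors the choices quoted in the text; the profile `fes` itself must come from the estimator.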

Finally, we evaluated the necessity of the new path criteria and the improvements they bring to the reference path definition with an independent sampling approach. Specifically, we performed Well-Tempered MetaD simulations. With the refined reference path, Gleevec is able to move further away from the protein, sampling the bulk state more extensively. This qualitative observation underscores the importance of the newly introduced criteria. The FES, S(x) profile, and hills deposition from the Well-Tempered MetaD simulations are available in the Supporting Information.

Conformational Rearrangements and DFG-Flip Transition

Experimental and computational studies have highlighted the crucial role of the DFG-motif in the inhibitory action of Gleevec on Abl. As reported by Schindler et al., in order for Gleevec to bind to the Abl pocket, the DFG motif needs to undergo a conformational rearrangement from an active to an inactive state, characterized by a 180° rotation of the DFG motif called the DFG-flip.

Consequently, in the Abl–Gleevec complex, the DFG-motif is in the out inactive conformation, while the activation loop adopts a closed conformation, preventing the binding of the natural substrate. Moreover, the glycine-rich loop is folded in a kinked conformation toward the binding pocket, facilitating favorable interactions with the bound ligand. The glycine-rich loop conformation is stabilized by a specific hydrogen bond between Tyr253 and Asn322.

Various computational methods have been employed to assess the contribution of the DFG-flip to the overall binding free energy of the Abl–Gleevec complex. As reported by Roux’s group in ref , the DFG-flip was found to make a relevant contribution to the binding free energy, amounting to 1.4 kcal/mol.

However, during our 200 ns simulations, the DFG-flip was not observed. Conformational rearrangements of both the P-loop and A-loop occur during simulations, while the DFG motif remains stable over time (Figure S5).

Although our reference path included the residues of the DFG-motif, the rearrangement sampled in the ABMD simulation does not reflect a transition of this motif from a closed to an open configuration. To simulate the DFG-flip, a tailored collective variable or longer simulation time would be required. In ref , in order to observe this rearrangement, a 200 μs plain MD simulation was run. Consequently, during the 20 ns unbinding ABMD trajectory, obtained using an electrostatic-like CV between the ligand and pocket atoms, the DFG-motif retained an out conformation even when reaching the unbound state.

To qualitatively account for the contribution of the DFG-flip to the free energy difference, we can add to our standard binding free energy estimates a correction term ΔF in→out = 1.4 kcal/mol, as computed in ref . With this contribution, the CFT estimates for the 100 and 200 ns simulations become −11.1 and −11.6 kcal/mol, respectively, which shifts both estimates toward the experimental value.
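For reference, the arithmetic behind the corrected values can be written out explicitly. The symbol names below are shorthand introduced here for illustration; all numbers are those quoted in the text.

```latex
\begin{align*}
\Delta F^{\circ}_{\mathrm{corr}} &= \Delta F^{\circ}_{\mathrm{CFT}} + \Delta F_{\mathrm{in}\to\mathrm{out}} \\
\text{100 ns:}\quad \Delta F^{\circ}_{\mathrm{corr}} &= -12.5 + 1.4 = -11.1\ \mathrm{kcal/mol} \\
\text{200 ns:}\quad \Delta F^{\circ}_{\mathrm{corr}} &= -13.0 + 1.4 = -11.6\ \mathrm{kcal/mol} \\
\text{Experiment:}\quad \Delta F^{\circ}_{\mathrm{exp}} &= -10.9\ \mathrm{kcal/mol}
\end{align*}
```

The residual deviations from experiment are therefore 0.2 and 0.7 kcal/mol, both smaller than the statistical uncertainties of ±1 and ±2 kcal/mol reported for the uncorrected estimates.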

Riboswitch-Ligand: Receptor Flexibility and Conformational Rearrangements

Having established the methodology for protein systems, we explored the applicability to another challenging class of biomolecules, namely RNAs.

Path Definition and PCVs for RNA-Ligand Systems

As demonstrated with complex proteins, optimal parametrization of PCVs minimizes dissipated work during nonequilibrium SMD simulations, thereby improving the convergence of free energy estimations. Similarly, defining an optimized and plausible unbinding path is of critical importance also for RNA-ligand systems.

A total of 10 ABMD simulations, each lasting 10 ns, were run to generate the initial trajectories for each of the two RNA-ligand complexes in the TIP4P-D water model. These trajectories were guided by a collective variable based on the Debye–Hückel interaction energy between the highly negative charges of RNA and the positive charge of the ligands. Specifically, they were guided toward a collective variable value of zero, which represents complete ligand unbinding. The trajectories were then cut when the distance between the ligand and the RNA exceeded 20 Å, ensuring a realistic unbound state. In these simulations, both ligands dissociated through a solvent-exposed pathway located between loop 2 (residues U12-U13-A14-U15-A16-C17) and stem 1 (residues U9-A10-G11 and C31-U32-A33-A34), as illustrated in Figure S9. This consistent unbinding mechanism involved structural rearrangements of loop 2 and stem 2, opening up the binding pocket to facilitate ligand exit.
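To make the guiding collective variable concrete, the sketch below evaluates a screened Coulomb (Debye–Hückel) energy between two groups of point charges. It is an illustration under stated assumptions, not the authors' setup: the function names, the relative permittivity of 80, the 300 K temperature, and any ionic strength used to obtain the screening length are placeholders that this section does not specify.

```python
import numpy as np

COULOMB = 332.0637  # Coulomb constant in kcal*Angstrom/(mol*e^2)


def debye_length(ionic_strength_molar, eps_r=80.0, temperature=300.0):
    """Debye screening length in Angstrom for an ionic strength in mol/L."""
    eps0 = 8.8541878128e-12   # vacuum permittivity, F/m
    k_b = 1.380649e-23        # Boltzmann constant, J/K
    n_a = 6.02214076e23       # Avogadro constant, 1/mol
    e = 1.602176634e-19       # elementary charge, C
    conc = ionic_strength_molar * 1000.0  # mol/m^3
    length_m = np.sqrt(eps0 * eps_r * k_b * temperature / (2.0 * n_a * e**2 * conc))
    return length_m * 1e10


def debye_huckel_energy(pos_a, q_a, pos_b, q_b, kappa, eps_r=80.0):
    """Screened Coulomb energy (kcal/mol) between groups A and B.

    pos_a, pos_b : (n, 3) coordinates in Angstrom
    q_a, q_b     : partial charges in units of e
    kappa        : inverse Debye length in 1/Angstrom (0 means no screening)
    """
    pos_a = np.asarray(pos_a, dtype=float)
    pos_b = np.asarray(pos_b, dtype=float)
    # All pairwise A-B distances, shape (n_a, n_b).
    d = np.linalg.norm(pos_a[:, None, :] - pos_b[None, :, :], axis=-1)
    qq = np.outer(q_a, q_b)
    return COULOMB / eps_r * float(np.sum(qq * np.exp(-kappa * d) / d))
```

For oppositely charged groups the energy is negative at contact and decays to zero as they separate, which is why a target value of zero corresponds to the unbound state.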

The Debye–Hückel interaction energy during the ABMD simulations is reported in Figure S6. These plots demonstrate the reduction of the interaction energy as the ligands unbind, with final values of 1.73 and 1.14 kcal/mol for the cognate and synthetic ligands, respectively.

A key distinction between protein and RNA systems lies in the greater structural flexibility exhibited by RNA. Consequently, defining a reference pathway for PCVs for RNA-ligand complexes necessitated specific adjustments.

An implicit assumption of the PCV calculation is the inherent reliability of the optimal-alignment mean-square deviation between the configurations of the reference path and the instantaneous configuration during the simulation. For protein–ligand complexes, the Cα atoms of the protein were used to align the instantaneous configuration to the reference configurations. However, because of the highly flexible nature of RNA, identifying suitable atoms for alignment is less trivial, yet necessary to define a correct alignment and PCV values. To define suitable atoms, we performed three unbiased 100 ns MD simulations and aggregated the data to calculate residue-wise root-mean-square fluctuations (RMSF), reported in Figure S7. We identified residues with an RMSF below a 1.8 Å threshold and selected a subset of their atoms for the alignment, as shown in Figure S8. Specifically, for those residues we used the P and C1′ atoms for alignment. Notably, this is also conceptually consistent with the alignment selection used for the protein system.
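The selection of alignment residues by an RMSF threshold can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, the frames are assumed to be already superimposed on a common reference and concatenated across the three runs, and the per-residue value is taken here as the root of the atom-averaged mean-square fluctuation, which may differ from the averaging used by the authors.

```python
import numpy as np


def residue_rmsf(coords, residue_ids):
    """Residue-wise RMSF from pre-aligned coordinates.

    coords      : (n_frames, n_atoms, 3) array in Angstrom
    residue_ids : (n_atoms,) residue label for each atom
    Returns (unique residue labels, RMSF per residue in Angstrom).
    """
    coords = np.asarray(coords, dtype=float)
    residue_ids = np.asarray(residue_ids)
    mean_pos = coords.mean(axis=0)
    # Per-atom mean-square fluctuation around the average position.
    msf = ((coords - mean_pos) ** 2).sum(axis=-1).mean(axis=0)
    residues = np.unique(residue_ids)
    rmsf = np.array([np.sqrt(msf[residue_ids == r].mean()) for r in residues])
    return residues, rmsf


def rigid_residues(coords, residue_ids, threshold=1.8):
    """Residues whose RMSF is below the threshold (1.8 Angstrom in the text)."""
    residues, rmsf = residue_rmsf(coords, residue_ids)
    return residues[rmsf < threshold]
```

The returned residue labels would then be filtered to the P and C1′ atoms to build the alignment set.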

Additionally, the selection of atoms of the RNA-ligand system to be included in the reference pathway requires careful consideration. This is particularly crucial in RNA complexes, since the ligand (un)binding process is accompanied by rearrangements in stem 1. This region forms an intricate hydrogen-bonding network with the cognate ligand and adopts specific conformations of residue C15 to accommodate the bulkier synthetic ligand within the binding site (Figure 5).

Figure 5. Structural superposition of the apo (white), cognate ligand-bound (purple), and synthetic ligand-bound (yellow) states of the receptor, highlighting the conformational rearrangement of loop 1. A close-up view, on the left, reveals the distinct orientations of residue C15 induced by the different ligands.

Therefore, to account for these structural rearrangements, the path definition included RNA atoms within 6 Å of the ligand in the ABMD unbinding trajectories. Specifically, N1, C8, and C1′ atoms were included for purines, while C6, N3, and C1′ atoms were considered for pyrimidines (Figure S8). This resulted in an optimized unbinding reference path for PCVs consisting of approximately 35 equidistant conformations for each RNA-ligand system (Figure 6).
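For readers less familiar with path collective variables, the sketch below computes the progress variable S(x) and the distance-from-path variable Z(x) in their commonly used exponential-weighting form. It is a simplified illustration: the function name is hypothetical, the configuration is assumed to be already optimally aligned to each reference frame (the alignment step is omitted), and the smoothing parameter `lam` must be supplied by the user; a common heuristic sets it to about 2.3 divided by the mean-square deviation between adjacent frames.

```python
import numpy as np


def path_cvs(x, path, lam):
    """Return (S, Z) for configuration x relative to a reference path.

    x    : (n_atoms, 3) instantaneous coordinates of the path atoms
    path : (n_frames, n_atoms, 3) reference configurations, frame 1 = bound
    lam  : smoothing parameter in 1/Angstrom^2
    """
    x = np.asarray(x, dtype=float)
    path = np.asarray(path, dtype=float)
    # Mean-square deviation of x from every reference frame.
    msd = ((path - x) ** 2).sum(axis=-1).mean(axis=-1)
    # Subtract the smallest MSD before exponentiating for numerical stability.
    shift = msd.min()
    w = np.exp(-lam * (msd - shift))
    index = np.arange(1, len(path) + 1)
    s = float((index * w).sum() / w.sum())
    z = float(shift - np.log(w.sum()) / lam)
    return s, z
```

Because S(x) is a weighted average of frame indices, a configuration sitting exactly on the first frame yields a value slightly above 1, which illustrates the boundary effect that the extra end-point configurations are meant to mitigate.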

Figure 6. Schematic representation of the reference paths of riboswitch-preQ1 in complex with (left) the cognate ligand and (right) the synthetic ligand. The RNA is shown in light gray. The surfaces of the ligand configurations are colored from dark red for the bound state to dark blue for the unbound state (for clarity, only the most relevant configurations are displayed).

Binding Free Energy Estimates

After the path optimization, we followed the same protocol used for the Abl–Gleevec complex, performing 30 replicates of 100 ns SMD simulations for both binding and unbinding. During the SMD simulations, we calculated the Jarzynski work W J, obtaining the work profiles over the simulation time, as reported in Figure S11. The unbinding simulations of the cognate ligand (Figure S11A) show a higher dissipation in disrupting the hydrogen-bond network with the pocket compared to those of the synthetic ligand. When the ligand reaches the unbound state and is fully solvated, the work profile reaches a plateau.
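As a reminder of how work profiles translate into a free energy and a dissipation estimate, the following sketch applies the Jarzynski exponential average across replicas at each time point. It is a generic illustration, not the authors' analysis code: the function names, the array layout, and the 300 K default temperature are assumptions.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def jarzynski_free_energy(work, temperature=300.0):
    """Jarzynski estimate -kT ln <exp(-W/kT)> along the pulling coordinate.

    work : (n_replicas, n_steps) accumulated work in kcal/mol
    Returns an array of length n_steps.
    """
    work = np.asarray(work, dtype=float)
    beta = 1.0 / (KB * temperature)
    # Factor out the smallest work value per step to avoid underflow.
    w_min = work.min(axis=0)
    avg = np.mean(np.exp(-beta * (work - w_min)), axis=0)
    return w_min - np.log(avg) / beta


def dissipated_work(work, temperature=300.0):
    """Mean work minus the Jarzynski estimate; non-negative by Jensen's inequality."""
    work = np.asarray(work, dtype=float)
    return work.mean(axis=0) - jarzynski_free_energy(work, temperature)
```

A small dissipated work across replicas indicates that the pulling is close to reversible, which is the regime in which the exponential average converges with few replicas.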

In contrast, the dissipation during binding simulations required to reach the binding pose (Figure S11B) is fairly similar for both ligands. Different replicas exhibit a different dissipation, reflecting the complexity of the ligand association process. Nevertheless, the work dissipation is limited for both processes, demonstrating the suitability of the criteria to define an optimal reference path for RNA-ligand systems.

Applying our protocol based on the bidirectional CFT estimator, we reconstructed the FESs along S(x) for both ligands in the TIP4P-D water model (Figure 7). To calculate the binding free energy from the FESs, we selected a discriminative conformation corresponding to S(x) values of 16 and 14 for the cognate and synthetic ligands, respectively. Standard binding free energies were obtained by computing the ratio of the bound and unbound partition functions and adding the standard volume correction. This correction contributes 0.6 kcal/mol for the complex with the natural ligand and 0.2 kcal/mol for that with the synthetic ligand. These results were compared with experimental affinity data from recent literature. The estimated standard binding free energy (ΔF°) for the RNA-synthetic ligand complex is −5.6 ± 1 kcal/mol, slightly deviating from the experimental value of −7.9 kcal/mol. For the RNA-cognate ligand complex, ΔF° is estimated to be −17.2 ± 1 kcal/mol, with a more pronounced discrepancy compared with the experimental value of −10.9 kcal/mol.
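The idea of combining forward and reverse work values can be illustrated with the Bennett acceptance ratio (BAR), a standard maximum-likelihood estimator that follows from the Crooks fluctuation theorem. This is a stand-in chosen for illustration and not necessarily the estimator used by the authors: it returns only the end-to-end free energy difference, not the profile along S(x), and the function names and the 300 K default are assumptions.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def _fermi(x):
    # 1 / (1 + exp(x)), written with tanh so large |x| cannot overflow.
    return 0.5 * (1.0 - np.tanh(0.5 * x))


def bar_free_energy(work_forward, work_reverse, temperature=300.0, n_iter=200):
    """Free energy change of the forward process from bidirectional work values.

    work_forward : total work of each forward run (kcal/mol)
    work_reverse : total work of each reverse run, as measured in that run
    """
    w_f = np.asarray(work_forward, dtype=float)
    w_r = np.asarray(work_reverse, dtype=float)
    beta = 1.0 / (KB * temperature)
    c = np.log(len(w_f) / len(w_r))

    def imbalance(delta_f):
        # Monotonically increasing in delta_f; zero at the BAR solution.
        forward = _fermi(beta * (w_f - delta_f) + c).sum()
        reverse = _fermi(beta * (w_r + delta_f) - c).sum()
        return forward - reverse

    # Bracket the root generously, then bisect.
    lo = min(w_f.min(), (-w_r).min()) - 50.0 / beta
    hi = max(w_f.max(), (-w_r).max()) + 50.0 / beta
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

If unbinding is taken as the forward process, the returned value is F(unbound) − F(bound), so the binding free energy is its negative before any standard-state correction.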

Figure 7. Free energy profiles along S(x) obtained by applying CFT to SMD (simulation time of 100 ns in TIP4P-D) for the cognate (A) and synthetic (B) ligands.

While the free energy profile along S(x) determined with the CFT for the cognate ligand appears reasonable, the profiles obtained for the binding simulations using the Jarzynski equality revealed a scattered, unexpected trend (Figure S13A). This trend may be linked to the lower capacity of the Jarzynski estimator to mitigate the dissipative work. This high dissipation likely contributes to the error in the binding free energy estimate for this ligand.

Water Model Effect on Binding Free Energy Estimates

Hitherto, all simulations for RNA-ligand complexes were performed using the TIP4P-D water model. However, recent findings suggest that the four-site model may not be the optimal choice to achieve good agreement between experimental results and calculations in nonequilibrium simulations. In fact, the TIP3P model might be less affected by work dissipation, leading to estimates with lower bias. Although TIP3P is neither the most recent water model nor the ideal choice for RNA, it may prove to be the most practical option for nonequilibrium simulations.

Therefore, to explore the potential impact of the water model on the binding free energy estimates in our nonequilibrium SMD simulations, we repeated all simulations using the less dissipative TIP3P model, while maintaining the same reference path (as performed in ref ).

The systems were solvated with TIP3P waters, equilibrated and, subsequently, SMD simulations were conducted with the same setup employed before. For both ligands, the general trend of the work curves with TIP3P water (Figure S12) revealed no significant differences compared to the TIP4P-D results (Figure S11). However, for the synthetic ligand, there is a non-negligible effect of the water model on the work dissipation. As seen for proteins, also for the RNA-ligand systems investigated here, the TIP3P model is shown to be less affected by dissipation than the TIP4P-D model.

A visual inspection of the simulations revealed conformational differences of the RNA in the bound state with TIP3P water compared with TIP4P-D water. Simulations with the TIP4P-D model showed difficulties in tightly following the reference pathway in the last stages (i.e., when reaching the bound state). Indeed, in these simulations stem 1 is unable to assume the exact conformation of the X-ray crystallographic structure.

Consequently, for the synthetic ligand the CFT-derived FES (Figure S14B) and the resulting binding free energy obtained with TIP3P deviate from those estimated with TIP4P-D. The estimated ΔF° for the RNA-synthetic ligand using the TIP3P water model is −8.7 ± 0.7 kcal/mol (discriminating frame: 14th molecular configuration of the reference path), demonstrating better agreement with the reference experimental value (−7.9 kcal/mol) compared to the TIP4P-D model, which yielded a value of −5.6 ± 1 kcal/mol. In contrast, for the cognate ligand, the FES (Figure 8A) and the standard binding free energy value obtained with TIP3P were rather similar to those obtained with TIP4P-D. The ΔF° for the RNA-cognate ligand estimated using TIP3P is −18.5 ± 3 kcal/mol, still deviating from the reference experimental value of −10.9 kcal/mol.

Figure 8. Free energy profiles along S(x) obtained by applying CFT to SMD (simulation time of 100 ns in TIP3P) for the cognate (A) and synthetic (B) ligands.

The results for the synthetic ligand highlight the significant impact of the chosen water model on the accuracy of nonequilibrium free energy estimations. While simulations employing the TIP4P-D water model underestimated the binding free energy (ΔF), those using the TIP3P water model yielded values that fell within the experimental error range.

Nevertheless, for the cognate ligand, even with the less viscous TIP3P model, the SMD results still display a large discrepancy from the reference experimental data.

To further assess the reliability of our nonequilibrium method on RNA-ligand systems, we also calculated the binding free energy with an independent sampling approach. This analysis is of particular interest because Well-Tempered MetaD does not rely on the Jarzynski work for the free energy estimation. By contrast, the nonequilibrium approaches based on the CFT and Jarzynski estimators depend heavily on this quantity.

Thus, we performed Well-Tempered MetaD simulations with PCVs, using the previously generated reference pathways and the TIP4P-D model. The Well-Tempered MetaD simulations were able to exhaustively sample the S(x) PCV, allowing us to observe multiple (un)binding events in each simulation, as both ligands were able to transition multiple times between the bound and the unbound state (Figure S19). Interestingly, the cognate ligand resides in the bound state for a significant fraction of the total simulation time. This is because the cognate ligand forms a higher number of interactions with RNA in the bound state. After reconstructing the FES (Figure S20), we estimated the binding free energy. The results obtained from the Well-Tempered MetaD simulations are reported in Table S2 and compared with the SMD results.
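For context, the standard relation used to reconstruct a free energy surface from a converged Well-Tempered MetaD bias is recalled below. The symbols are introduced here for illustration and are not defined in this section: V(s,t) is the deposited bias along the collective variable s, T is the simulation temperature, ΔT is the well-tempered boosting temperature, γ is the corresponding bias factor, and C is an immaterial additive constant.

```latex
\begin{align*}
\gamma &= \frac{T + \Delta T}{T}, \\
F(s) &= -\frac{T + \Delta T}{\Delta T}\, \lim_{t \to \infty} V(s,t) + C
      = -\frac{\gamma}{\gamma - 1}\, \lim_{t \to \infty} V(s,t) + C .
\end{align*}
```

Unlike the Jarzynski and CFT estimators, this reconstruction involves no work values, which is why it serves as an independent check on the SMD-based estimates.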

The Well-Tempered MetaD results are different from those obtained with the SMD simulations using TIP4P-D. However, the results obtained with Well-Tempered MetaD exhibit better agreement with the SMD simulations with the TIP3P water model (see Table S3). This suggests that models like TIP3P may be more suitable for reducing dissipative work in SMD simulations, while more accurate (although more expensive) models such as TIP4P-D may be more appropriate for Well-Tempered MetaD simulations.

Possible Source of Errors for the Cognate Ligand

The binding free energy values obtained for the cognate ligand deviate markedly from the reference experimental value. In both the SMD and Well-Tempered MetaD simulations, the cognate ligand appears to have a higher binding affinity than the experimental one. In contrast, the calculated binding affinity for the synthetic ligand is in agreement with the experimental value.

This discrepancy could arise from the fact that a correct estimation of the binding affinity to the RNA target is more difficult for the cognate ligand than for the synthetic one, as the cognate ligand is more strongly bound to the target and has a higher experimentally reported affinity. In particular, the crystallographic poses of the two ligands reveal that the binding mode of the cognate ligand is driven by a higher number of interactions with the RNA target. Both ligands form a comparable number of stacking interactions with RNA; however, the cognate ligand establishes 7 hydrogen bonds with the target, while only 2 are formed with the synthetic ligand (see Figures 9B and S10). Consequently, the strong interactions between RNA and the cognate ligand introduced challenges in accurately estimating the binding free energy for this complex. Here, we investigated potential sources of discrepancies between our results for the cognate ligand and the experimental affinity.

Figure 9. (A) Structures of the two tautomeric forms of the cognate ligand: N1 (left) and N2 (right). (B) Hydrogen-bonding networks formed between the RNA target and the cognate ligand in the tautomeric form N1 and (C) in the tautomeric form N2. The two tautomeric forms establish different hydrogen-bonding networks with the RNA target.

A potential source of error could be the inaccuracy of force fields for RNA-ligand simulations. However, macromolecule force fields have improved significantly over the years, becoming progressively more reliable and increasingly capable of quantitative prediction of experimental observables. While this is true for macromolecular species such as proteins and RNAs, small molecule parametrization is still lagging behind. This is mainly due to the heterogeneity and vast size of the chemical space associated with the small organic molecules that can potentially be designed. In particular, inaccuracies in charge and dihedral angle parameters can markedly affect simulation results. In this respect, nowadays it is rather standard to use the AM1-BCC method to assign charges to ligands for MD simulations. This results in a high-quality description of the electrostatic properties of ligands in most cases. However, AM1-BCC charges may be insufficient in more complex scenarios. In these cases, resorting to a more accurate level of theory, such as using charges derived from Density Functional Theory and the Restrained Electrostatic Potential (RESP) method, may be more appropriate. For the systems studied here, a large number of heteroatoms participate in the formation of hydrogen bond interactions with the RNA target in the bound state. Therefore, we improved the parametrization of the cognate ligand with RESP charges and optimized dihedral angle parameters. Then, we applied our protocol to generate again a guess unbinding pathway via ABMD simulations, and performed the production SMD runs. Interestingly, the preferred unbinding pathway, passing through loop 2 and stem 1, was consistent with the one observed using the previous ligand parametrization and was similar across all ABMD runs. Moreover, to assess the effect of the different water models in combination with the newly parametrized ligand, both the TIP4P-D and TIP3P water models were considered. The results are reported in Table S4.

The results obtained for the RESP-parametrized cognate ligand are not strongly affected by the water model. Indeed, the binding free energies obtained in TIP3P and TIP4P-D solvent are fairly similar (within the statistical error), as for the results obtained with AM1-BCC charges. Moreover, the results with RESP parameters are fairly consistent with those obtained with the AM1-BCC parameters, both deviating from the experimental value. This may suggest that the charge model is not the major source of error in this case.

Furthermore, to ensure consistency of our results and definitively exclude the parametrization procedure as a source of error, we reparametrized all ligands, i.e. both synthetic and cognate. Thus, instead of using topologies from ref , we reconstructed from scratch the topologies for all ligands, using the AM1-BCC charge model and GAFF (additional details provided in the Supporting Information). This provided consistent results, confirming the reproducibility of our findings and excluding the parametrization procedure as the source of discrepancy with experimental data.

Another potential source of inaccuracy for the cognate ligand may arise from the presence of different tautomeric states in solution. QM calculations (details in Supporting Information) revealed the possibility of having a different tautomeric form in solution. In particular, the cognate form considered hitherto, denoted N1, can tautomerize into the form N2 by the shift of a proton from N1 (cognate N1) to N2 (cognate N2), as reported in Figure 9A. This alternative state N2 may be promoted in solution by the intramolecular hydrogen bond between the carbonyl and the protonated amine groups. Consequently, for completeness of our results, we decided to include the tautomeric state N2 in our simulation panel by repeating the entire pipeline with the cognate ligand in this alternative tautomeric form.

To obtain a comprehensive picture and compare with the other results, we conducted SMD simulations with the TIP4P-D and TIP3P water models and performed Well-Tempered MetaD simulations. The binding affinity calculated for the tautomeric form N2 is significantly lower than the one for the form N1. Notably, the results for the cognate ligand in the tautomeric form N2 show a markedly improved agreement with the reference experimental data. Specifically, in the TIP4P-D simulations, the standard binding free energy is −13.2 ± 1 kcal/mol, while in TIP3P it is −11.9 ± 1 kcal/mol, in fair agreement with the experimental value of −10.9 kcal/mol (Table S4). Moreover, they remain statistically consistent with the independent results from the MetaD simulation in TIP4P-D water, which yielded −13.9 ± 1 kcal/mol. All the results for the cognate ligand are summarized in Figure 10.

Figure 10. Binding free energies for all complexes tested. Dashed lines indicate experimental values (green: synthetic, blue: cognate). The x-axis represents the calculation method (SMD in TIP4P-D and TIP3P or MetaD), while the y-axis shows the corresponding binding free energy with associated errors. Ligands are color-coded: synthetic (green), cognate with AM1-BCC charges (orange) and cognate tautomer (light blue).

Notably, the tautomeric state N2 is a less expected form, as the cognate ligand N1 closely resembles the guanine nucleobase with only minor modifications. Thanks to this similarity, the cognate ligand N1 is more likely to establish the expected hydrogen bonding network within the binding site, interacting with nucleobase C15 through their Watson–Crick edges. This is consistent with the behavior of guanine in this specific three-dimensional arrangement (Figure 9B). As shown in Figure 9B,C, the cognate ligand in state N2 loses two hydrogen bond interactions with C15 and A29 compared to the state N1.

Furthermore, analysis of 500 ns plain MD simulations (two replicas) revealed distinct hydrogen bonding patterns for the two tautomers within the binding pocket over time. Specifically, in the N1 tautomer simulations, the number of hydrogen bonds remained stable at approximately seven, whereas in the N2 tautomer, the average number of hydrogen bonds was five. The corresponding hydrogen bond plots are provided in Supporting Information. These observations on the hydrogen bond network may help explain the lower binding affinity calculated for the N2 tautomeric form.

Finally, the presence of both tautomeric forms in solution could explain the discrepancy between the experimental affinity of the cognate ligand and our simulation results for the tautomer N1. As both tautomeric forms coexist in solution, we may speculate that the experimental binding free energy reflects a weaker affinity than that of the stronger binder, tautomer N1, alone, due to the contribution of the weaker binder, tautomer N2.
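One way to make this speculation quantitative is the textbook relation for a ligand that exists as several interconverting forms in solution, where the apparent association constant is the population-weighted sum of the per-form constants. The sketch below implements that relation; it is an illustration only, the function name and the 300 K default are assumptions, and the solution populations of N1 and N2 are not reported in this section, so they must be supplied from elsewhere.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def apparent_binding_free_energy(delta_f, populations, temperature=300.0):
    """Apparent binding free energy of a mixture of ligand forms.

    delta_f     : binding free energy of each form (kcal/mol)
    populations : relative populations of the unbound forms in solution
                  (normalized internally)
    Implements dF_app = -kT ln( sum_i p_i exp(-dF_i / kT) ).
    """
    delta_f = np.asarray(delta_f, dtype=float)
    p = np.asarray(populations, dtype=float)
    p = p / p.sum()
    beta = 1.0 / (KB * temperature)
    # Factor out the most favorable term for numerical stability.
    f_min = delta_f.min()
    total = np.sum(p * np.exp(-beta * (delta_f - f_min)))
    return float(f_min - np.log(total) / beta)
```

The result always lies between the per-form values and is weighted toward the stronger binder by its Boltzmann factor, so the solution populations determine how far the apparent value shifts toward that of N2.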

Conclusions

In this study, we addressed the binding free energy estimation in challenging biological systems of pharmaceutical relevance, using nonequilibrium MD simulations. In particular, we focused on the binding of the Gleevec drug to the Abl protein target, and of two ligands to the preQ1 RNA riboswitch. These complexes present inherent challenges given the realistic size of Abl–Gleevec, the conformational rearrangement of the Abl protein, and the marked flexibility of the RNA receptor. Building on recent insights, we optimized the criteria for the construction of the reference binding pathways for PCVs. In particular, we emphasized the inclusion of all degrees of freedom that are critical to the binding process in the reference pathway, such as both the ligand and the binding pocket atoms for the systems treated here. This was crucial to capture the intricate structural rearrangements in the receptor binding sites, relevant both in the Abl–Gleevec system, given the complexity of the binding mechanism, and in the riboswitch-ligand systems, due to the inherent structural flexibility of the RNA molecule. Upon construction of robust reference pathways for PCV-based MD simulations, we were able to estimate standard binding free energies for both the protein- and RNA-ligand systems considered. The estimation for Gleevec exhibited higher consistency with the experimental value, confirming the inherent difficulties of RNA system calculations. Interestingly, our investigation confirmed that using the less dissipative TIP3P water model is preferable in a nonequilibrium setting, resulting in estimates in better agreement with experiments than the four-point model TIP4P-D. Conversely, when used in conjunction with MetaD, the four-point model TIP4P-D provided results compatible with experiments. 
This observation aligns with our recent findings, demonstrating the greater sensitivity of nonequilibrium steered molecular dynamics to the kinetics of the systems, and highlighting the importance of selecting suitable water models depending on the simulation approach employed. Although equilibrium approaches are more traditionally applied in the field of free energy calculations, nonequilibrium methods still represent an open challenge and may present computational advantages. In particular, the main advantage of the proposed pipeline lies in its straightforward parallelization compared to methods such as MetaD. Although the total simulation time required by nonequilibrium SMD is significantly longer than that of MetaD, all simulations can be executed in parallel. Therefore, by leveraging high-performance computing architectures, the time-to-solution is determined by the execution time of a single replica, i.e., 100 ns for the systems discussed in this work. The execution time of a single replica was selected as a trade-off between the single replica time and the number of replicas to achieve the fastest convergence with the fastest time-to-solution. Finally, the proposed workflow allows for a more seamless assessment of convergence compared to MetaD. Undoubtedly, the methodology presented here is more computationally demanding than established and cost-effective “end-point” methods. Nevertheless, in addition to yielding binding free energy estimates, our approach can provide mechanistic insights into the binding process and relevant intermediates, which can be extremely useful within drug design endeavors. This information can be particularly valuable for effective applications in pharmaceutical research.

Supplementary Material

ci5c00452_si_001.pdf (16.6MB, pdf)

Acknowledgments

The authors acknowledge ISCRA for awarding this project access to the LEONARDO supercomputer, owned by the EuroHPC Joint Undertaking, hosted by CINECA (Italy). A.G. thanks the European project “Molecular Dynamics Data Bank. The European Repository for Biosimulation Data” (grant number 101094651) for financial support. The authors acknowledge UCSF for the Chimera molecular graphics software suite, used for molecular images. M.B. acknowledges funding by the project “National Center for Gene Therapy and Drugs based on RNA Technology” (CN00000041), financed by NextGenerationEU PNRR MUR - M4C2 - Action 1.4 - Call “Potenziamento strutture di ricerca e di campioni nazionali di R&S” (CUP: J33C22001130001). S.D. acknowledges the financial support from the European Union - NextGenerationEU and the Ministry of University and Research (MUR), National Recovery and Resilience Plan (NRRP): Research program CN00000013 “National Centre for HPC, Big Data and Quantum Computing”, funded by the D.D. n.1031 del 17.06.2022 and Mission 4, Component 2, Investment 1.4 - Avviso “Centri Nazionali” D.D. n. 3138, 16 December 2021. We gratefully acknowledge the Data Science and Computation Facility and its Team for their support and assistance on the IIT High Performance Computing Infrastructure. The authors thank Matteo Donnini for preliminary experiments on RNA-ligand systems.

All the data and scripts to reproduce our findings are available at https://gitlab.iit.it/hpc/FreeEnergyPath/. Some scripts and simulations require the BiKi Software. A temporary license can be requested for free to reproduce our findings.

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.5c00452.

  • Additional results for Abl–Gleevec: Analysis of PCVs and Path definition, Work Profiles and FESs, DFG-flip contribution. Additional results for RNA-ligand systems: Debye–Hückel interaction energy profile, Analysis of PCV and Path definition, 3D structure of Riboswitch, Work Profiles and FESs. Well-Tempered MetaDynamics additional results: Well-Tempered MetaDynamics of Abl–Gleevec, Well-Tempered MetaDynamics of RNA-ligands. Possible source of errors for the cognate ligand - additional results: Additional parameters, Hydrogen bonds analysis (PDF)

E.S. and A.G. contributed equally to the work. E.S. contributed to Software, Validation, Formal analysis, Investigation, Data Curation, Writing - Original Draft, Visualization. A.G. contributed to Software, Validation, Formal analysis, Investigation, Data Curation, Writing - Original Draft, Visualization. R.A. contributed to Validation, Formal analysis, Investigation, Data Curation, Writing - Original Draft, Visualization. M.B. contributed to Investigation, Writing - Review & Editing, Supervision, Project administration. S.D. contributed to Conceptualization, Methodology, Investigation, Resources, Writing - Review & Editing, Supervision, Project administration. A.C. contributed to Resources, Writing - Review & Editing, Project administration, Funding acquisition.

The authors declare the following competing financial interest(s): Sergio Decherchi and Andrea Cavalli are co-founders of BiKi Technologies s.r.l., a company selling the BiKi Life Sciences software suite for computational drug discovery.

Articles from Journal of Chemical Information and Modeling are provided here courtesy of American Chemical Society
