J Chem Theory Comput. 2023 Jun 7;19(12):3705–3720. doi: 10.1021/acs.jctc.3c00140

Tackling Hysteresis in Conformational Sampling: How to Be Forgetful with MEMENTO

Simon M Lichtinger 1,*, Philip C Biggin 1,*
PMCID: PMC10308841  PMID: 37285481

Abstract


The structure of proteins has long been recognized to hold the key to understanding and engineering their function, and rapid advances in structural biology and protein structure prediction are now supplying researchers with an ever-increasing wealth of structural information. Most of the time, however, structures can only be determined in free energy minima, one at a time. While conformational flexibility may thus be inferred from static end-state structures, their interconversion mechanisms—a central ambition of structural biology—are often beyond the scope of direct experimentation. Given the dynamical nature of the processes in question, many studies have attempted to explore conformational transitions using molecular dynamics (MD). However, ensuring proper convergence and reversibility in the predicted transitions is extremely challenging. In particular, a commonly used technique to map out a path from a starting to a target conformation called steered MD (SMD) can suffer from starting-state dependence (hysteresis) when combined with techniques such as umbrella sampling (US) to compute the free energy profile of a transition. Here, we study this problem in detail on conformational changes of increasing complexity. We also present a new, history-independent approach that we term “MEMENTO” (Morphing End states by Modelling Ensembles with iNdependent TOpologies) to generate paths that alleviate hysteresis in the construction of conformational free energy profiles. MEMENTO utilizes template-based structure modelling to restore physically reasonable protein conformations based on coordinate interpolation (morphing) as an ensemble of plausible intermediates, from which a smooth path is picked. We compare SMD and MEMENTO on well-characterized test cases (the toy peptide deca-alanine and the enzyme adenylate kinase) before discussing its use in more complicated systems (the kinase P38α and the bacterial leucine transporter LeuT). 
Our work shows that for all but the simplest systems SMD paths should not in general be used to seed umbrella sampling or related techniques, unless the paths are validated by consistent results from biased runs in opposite directions. MEMENTO, on the other hand, performs well as a flexible tool to generate intermediate structures for umbrella sampling. We also demonstrate that extended end-state sampling combined with MEMENTO can aid the discovery of collective variables on a case-by-case basis.

Introduction

Molecular dynamics (MD) simulations promise to place static protein structures into their dynamical context, which frequently involves large-scale conformational changes.1 Thanks to rapid advances in structural biology, one may have information about several conformational states of a given protein available, but the mechanism, energetics, and kinetics of their interconversion often remain unknown. This establishes a vital, continuing role for MD in the study of protein structure that is not without challenges. While tens of microseconds can now be simulated on commercial hardware (rising to hundreds of microseconds with stratification, running multiple boxes in parallel), this is still orders of magnitude shorter than many conformational changes, which often reach into regimes of milliseconds and beyond.2–4 Moreover, to accurately describe conformational dynamics at equilibrium, one needs to observe repeated transitions to obtain good statistics. Enhanced sampling methods can help address this issue in various ways: by adaptive spawning of new trajectories, by adjusting potential energy barriers, or by biasing progress along specific reaction coordinates, termed collective variables (CVs).5

Several techniques have been developed to implement these strategies. Weighted ensemble methods6 use the systematic launching of unbiased MD trajectories to sample along a set of CVs. Temperature-replica exchange MD (REMD)7 and accelerated MD (aMD)8 are popular strategies to accelerate slow dynamics without reference to collective variables, with modern extensions like replica exchange with solute tempering (REST)9,10 and Gaussian accelerated MD (GaMD).11 Umbrella sampling (US),12 metadynamics,13 and adaptive biasing force (ABF) sampling,14 on the other hand, rely on CVs that capture the relevant slow degrees of freedom (DOFs) of a system. These methods can be highly effective to sample desired conformational changes; however, the design of the required CVs is a formidable challenge in itself, often requiring extensive prior knowledge.

With the goal of obtaining a potential of mean force (PMF), or free energy surface (FES), of a given conformational change, it is conceptually useful for the majority of the above methods to separate the task into two parts. One first needs to obtain an initial path connecting two known end states, followed by focusing extensive sampling on the vicinity of this path in phase space to gather a PMF. A popular approach is to use targeted MD (TMD)15 or steered MD (SMD)16 (here, we choose SMD with a CV based on the RMSD to a target state17–19) to generate a path, followed by replica-exchange US (REUS)20 and processing with the weighted histogram analysis method (WHAM).21 We note here that although time-dependent biasing schemes like metadynamics do not separate path generation and free energy sampling, they still internally construct transition paths by biased sampling and will suffer from issues similar to those described below.

In Figure 1a,b, we present an example of what the SMD + REUS approach may yield in practice. We ran SMD between the DFG-in and DFG-out conformations of P38α (details of this system are provided below). In projection onto a simple collective variable (DRMSD = RMSD(DFG-in) – RMSD(DFG-out), where the whole proteins are aligned but the RMSD is calculated along the DFG loop only), both biasing directions appear to produce metastable states in their respective target conformations (Figure 1a). Proceeding to REUS, however, reveals substantial starting-state dependence (or hysteresis), illustrated in Figure 1b. Generally speaking, when a bias is applied to a CV to move between two conformations, the implicit assumption is that orthogonal degrees of freedom (DOFs) will equilibrate within the available sampling. Where this is not the case, as we show schematically in Figure 1c, the result is a PMF skewed to the starting state of path generation.

Figure 1.


(a) Steered MD (SMD) between DFG-in and DFG-out conformations of P38α. Bidirectional steers on the RMSD to the respective target configuration, projected on the difference of RMSDs (DRMSD). (b) Replica-exchange umbrella sampling (REUS) on the obtained transition paths shows strong hysteresis. (c) Schematic representation of the problem of orthogonal degrees of freedom (DOFs) in SMD. (d) Schematic overview of the MEMENTO procedure, fixing unphysical morphing intermediates by template-based modelling.

It is widely recognized that TMD and SMD depend strongly on the choice of CV.18,19,22 In particular, global RMSD-based CVs have been shown to be biased in a “large-scale-first” fashion, where large-scale features of a target state get reconstituted before those at smaller scales as an unphysical artifact of the global best-fit alignment step.19 The potential for hysteresis when running SMD between two protein conformations has also been discussed in specific cases. For example, Meshkin et al.4 note that initially they observed hysteresis of tens of kcal/mol when sampling the transition between the occluded and outward-facing states of the Mhp1 transporter. Only through “repeated trials and errors” could they design a set of better CVs that reduced hysteresis to an “acceptable level”. To our knowledge, however, there has not been a dedicated investigation of hysteresis in conformational sampling of the scope we attempt to provide as the first objective of this paper. CV choice and path generation are evidently coupled problems, since an ideal CV would generate a history-free equilibrium path between protein conformations with SMD. Because an optimal CV space is not known in most cases (and may not even exist), one needs to consider either how to discover or approximate it, or how to build good paths independent of suboptimal CVs to eliminate starting-state bias. We show in this work that such paths may perform better than SMD paths in umbrella sampling (which still requires CVs), even without optimal CVs.

There are therefore, broadly speaking, two ways to address hysteresis in PMFs of conformational changes. A researcher can attempt to design better CVs that capture all slow movements of the protein by using extensive physical knowledge and restraints (recently demonstrated on a membrane transporter by the aforementioned Meshkin et al.4), by combining a larger number of CVs in bias-exchange metadynamics23 or—with some limits on intuitive interpretability—using a host of new machine-learning approaches.24 Alternatively, one may wish to construct a path of intermediate structures that is free of hysteresis. In transition path sampling,25,26 an ensemble of transition paths can be constructed from one rare event trajectory, and the string method27,28 has been applied with success in transitions as complex as membrane transporter conformational changes.29,30

Since SMD paths like the P38α runs we presented above can be highly metastable (to the extent that they appear converged in REUS even with hundreds of nanoseconds of simulation time per window), we decided not to attempt refining paths with MD. Instead, we focus on alternative path-generation algorithms. From various takes on morphing (coordinate interpolation)31–33 and rigid-body approximations34,35 to elastic network-based models36,37 and elaborate analysis–biasing iterative schemes,38 the field is ripe with ways to connect protein conformations. Unfortunately, the potential for hysteresis in umbrella sampling remains understudied because a full validation of the PMFs is often not undertaken. Furthermore, such validation is hampered by poor code availability and user-friendliness in some of the more complex approaches.

For these reasons, we asked what might be the most conceptually simple, easy-to-implement way to generate paths that can eliminate hysteresis in umbrella sampling. Linear interpolation of coordinates (morphing) is by definition history-free, but the intermediates are unphysical: bonds and angles are distorted and side-chain interactions are disrupted. Many tools exist to fix morphing intermediates (such as RigiMOL in PyMOL39 and a panoply of morph servers32,40–42), but they have been developed mainly for visual purposes and lack substantial MD validation. On the other hand, the MODELLER package43 has been used and improved for decades to prepare starting structures for MD simulations. Here, we find that its algorithm based on satisfying probability distributions for angles and dihedrals together with side-chain interactions is indeed able to fix unphysical morphing intermediates.

This approach for history-free path generation, which we term MEMENTO (Morphing Endstates by Modelling Ensembles with iNdependent TOpologies, illustrated in Figure 1d), is validated in this paper on four example systems: the toy peptide deca-alanine, the enzyme ADK, the kinase P38α, and the membrane transporter LeuT. At each stage, we rigorously compare our results to what is attainable using SMD alone and find that MEMENTO outperforms SMD on all systems except for deca-alanine, where they behave equally well. We also show how, by combining long end-state sampling with MEMENTO, suitable CVs for umbrella sampling can be developed iteratively. Although we see this research as presenting mostly general findings on how to best sample conformational changes, we also release a documented, tested, and user-friendly Python implementation of MEMENTO as the path-generation package PyMEMENTO.

Methods

Path Generation with MEMENTO

Basic Workflow

To automatically morph protein end states and fix the intermediates with MODELLER, we designed a Python package. The source code, documentation, and examples are available on GitHub: https://github.com/simonlichtinger/PyMEMENTO. We also provide a static fork of the package taken at the time of publication at https://github.com/bigginlab/PyMEMENTO. Our package uses the MDAnalysis,44 numpy,45 pandas,46 matplotlib,47 and GromacsWrapper48 packages. In the most basic cases, we perform five key steps.

1. Coordinate Interpolation

A linear morph between two protein coordinate files is calculated. If Xi is the coordinate of the ith atom of a protein, then

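$$X_i(n) = X_i^{\mathrm{start}} + \frac{n}{N_{\mathrm{windows}} - 1}\left(X_i^{\mathrm{end}} - X_i^{\mathrm{start}}\right) \qquad (1)$$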

where n runs from 0 to Nwindows – 1 and Nwindows is the number of windows one wishes to use for subsequent umbrella sampling. In this work, we find that 24 windows are usually sufficient for good sampling. This is the number of windows used for all examples unless otherwise stated.
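As a minimal sketch of the interpolation step, the following NumPy function morphs between two coordinate arrays. It assumes that the two end states contain the same atoms in the same order, that they have already been superimposed, and that the first and last windows coincide with the end states; the function name `linear_morph` is hypothetical and not part of PyMEMENTO.

```python
import numpy as np

def linear_morph(start, end, n_windows=24):
    """Linearly interpolate between two (n_atoms, 3) coordinate arrays.

    Returns an array of shape (n_windows, n_atoms, 3). Window 0 equals
    `start` and window n_windows - 1 equals `end` (an assumption of this sketch).
    """
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    if start.shape != end.shape:
        raise ValueError("end states must contain the same atoms in the same order")
    # Fractions 0, 1/(N-1), ..., 1 along the straight line between end states.
    fractions = np.linspace(0.0, 1.0, n_windows)
    return start[None] + fractions[:, None, None] * (end - start)[None]

# Toy usage: two atoms moving 23 length units along every axis.
frames = linear_morph(np.zeros((2, 3)), np.full((2, 3), 23.0))
```

The intermediate frames produced this way are deliberately unphysical; they only serve as templates for the modelling step that follows.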

2. Template-Based Modelling

MODELLER43 is used to generate Nmodels models for each set of morphed atomic coordinates, that is, unphysical geometries from the previous step. This fixes the geometry and provides reasonable side-chain orientations and interactions. We achieve this in the MODELLER package by running model generation with the protein sequence aligned onto itself. The output is an ensemble of models that are close to the morphed intermediate state but have reasonable geometry and side-chain interactions. In this work, we always generated 50 models per intermediate to provide sufficient diversity for picking a smooth path in the next step.

3. Finding a Smooth Path

In our initial trials, we noticed that in the MODELLER ensembles there was spread in some side-chain rotamers, which we feared might prove to be sampling bottlenecks if the path becomes too rough by jumping between these rotamers. There may also be concern that MODELLER could be biased toward the end states, which will likely lie lower in energy than transition-state-like structures (although we never saw any indication of this being an issue in our work). We therefore wished to pick, from each intermediate ensemble (each model being by itself a reasonable geometry fix), the model that allows for the smoothest overall path and minimizes any discontinuities. The ensemble of models at intermediate positions allows potentially for Nmodels^Nwindows paths between the end states, which is usually an astronomically large number. We therefore attempt to find a smooth path through the model space by running Monte Carlo annealing to minimize an energy proxy:

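$$E = \sqrt{\frac{1}{N_{\mathrm{windows}} - 1} \sum_{n=0}^{N_{\mathrm{windows}} - 2} \mathrm{RMSD}\big(X(n), X(n+1)\big)^{2}} \qquad (2)$$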

where X(n) are the heavy-atom coordinates of intermediate n out of Nwindows. In essence, we minimize the RMSD of RMSDs between neighboring frames in the path, which gives a penalty to discontinuities with its quadratic term as the equivalent of a strain energy.
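The energy proxy can be sketched in a few lines of NumPy. This sketch reads "RMSD of RMSDs" as the root mean square over the Nwindows – 1 neighbor pairs and computes each RMSD without refitting the frames onto each other; both the normalization and the absence of a superposition step are assumptions, and the function names are hypothetical.

```python
import numpy as np

def rmsd(a, b):
    """Coordinate RMSD between two (n_atoms, 3) arrays, without refitting."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def path_energy(frames):
    """Root mean square of the RMSDs between neighboring frames of a path."""
    steps = np.array([rmsd(frames[n], frames[n + 1])
                      for n in range(len(frames) - 1)])
    return float(np.sqrt((steps ** 2).mean()))

# Toy one-atom paths with the same end points: evenly spaced vs. one big jump.
smooth = np.array([[[0.0, 0, 0]], [[1.0, 0, 0]], [[2.0, 0, 0]]])
jumpy = np.array([[[0.0, 0, 0]], [[0.0, 0, 0]], [[2.0, 0, 0]]])
e_smooth, e_jumpy = path_energy(smooth), path_energy(jumpy)
```

Because the neighbor RMSDs enter quadratically, the path with one large jump scores worse than the evenly spaced one even though both cover the same total distance.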

At step s of our Monte Carlo chain we have a path p(s), that is, a sequence of Nwindows integers pi(s) for which 0 ≤ pi(s) < Nmodels. We initialize these as random numbers in the appropriate range. A Monte Carlo step is then the random exchange of a model for another one from the ensemble at one of the intermediates, i.e., the random reassignment of one element of p(s) to give p(s + 1). This is then accepted or rejected according to the Metropolis criterion and the energy proxy stated in eq 2. The temperature proxy progression used in the Metropolis algorithm through the annealing procedure was determined by trial and error to work well as

[eq 3: temperature proxy T(s) as a function of the step s; formula not preserved]

where the initial temperature proxy is T(0) = 50 and the run continues for smax = 10^4 steps. We always run 12 replicates of this procedure and pick the path with minimum energy sampled in the combined runs. Supplementary Figure S1 illustrates the Monte Carlo trajectories we obtained for the adenylate kinase open ↔ closed transition. While it would be difficult to properly converge to a global minimum path, here we are merely interested in some appropriately smooth path.
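The path search described above can be illustrated with a short Metropolis annealing loop. This is only a sketch under stated assumptions: the cooling schedule used here is a simple linear ramp chosen for illustration (it is not the schedule of eq 3), the energy is the root mean square of neighbor RMSDs without refitting, and all function names are hypothetical rather than taken from PyMEMENTO.

```python
import numpy as np

def path_energy(models, path):
    """RMS of neighbor RMSDs for the models selected by `path`.

    models: array (n_windows, n_models, n_atoms, 3); path: one index per window.
    """
    frames = models[np.arange(len(path)), path]          # (n_windows, n_atoms, 3)
    diff = frames[1:] - frames[:-1]
    steps = np.sqrt((diff ** 2).sum(axis=-1).mean(axis=-1))  # neighbor RMSDs
    return float(np.sqrt((steps ** 2).mean()))

def anneal_path(models, t0=50.0, s_max=10_000, seed=0):
    """Monte Carlo annealing over model indices; returns (best_path, best_energy)."""
    rng = np.random.default_rng(seed)
    n_windows, n_models = models.shape[:2]
    path = rng.integers(0, n_models, size=n_windows)      # random initial path
    energy = path_energy(models, path)
    best_path, best_energy = path.copy(), energy
    for s in range(s_max):
        temperature = t0 * (1.0 - s / s_max)              # ASSUMED linear cooling
        trial = path.copy()
        trial[rng.integers(n_windows)] = rng.integers(n_models)  # reassign one window
        delta = path_energy(models, trial) - energy
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < np.exp(-delta / temperature):
            path, energy = trial, energy + delta
            if energy < best_energy:
                best_path, best_energy = path.copy(), energy
    return best_path, best_energy

# Toy ensemble: 6 windows, 5 candidate models each, 4 atoms per model.
toy_models = np.random.default_rng(1).normal(size=(6, 5, 4, 3))
toy_path, toy_energy = anneal_path(toy_models, t0=0.1, s_max=500, seed=0)
```

Independent replicates, as used in the text, would correspond to calling `anneal_path` with different seeds and keeping the lowest-energy result.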

4. Processing of the Results

The frames of the determined path are then subjected to common MD simulation preparation steps: capping of termini (if appropriate), solvating MD boxes, and adding ions. We have automated these within the package; for this, we rely on various “gmx” tools in the GROMACS simulation engine.49 Details are provided below for how ligands and lipids are handled. PyMEMENTO supports cubic/orthorhombic and hexagonal boxes as well as the CHARMM and AMBER force fields.

5. Equilibration

For each of these boxes, in addition to the equilibration protocol described in the section on MD simulation details, we run extensive further NPT equilibration with position restraints (1000 kJ mol–1) on Cα atoms to ensure that all side chains have found suitable interaction partners. For deca-alanine, ADK, and P38α, we use 10 ns per window; for LeuT and holo-ADK we increase this to 90/100 ns to allow for the membrane and the ligand to equilibrate to a new protein conformation.

From these equilibrated path frames we start our umbrella sampling simulations as detailed in the section on MD simulations. PyMEMENTO provides some additional functionality for setting this up with GROMACS + PLUMED, but the process will fall to the user if a more involved setup (for example 2D umbrella sampling) is required.

Adding Ligands

It is often instructive to study how a ligand affects the conformational landscape of a protein, and for our ADK and P38α systems we have investigated this question with crystallographic ligands. However, for both of these proteins the ligand binding pose is only known for one of the conformational states (and the ligand may well not bind tightly to the other). We therefore aligned the end states and manually inserted the holo-state ligand conformation into the apo state, which is not a general approach but in these cases resulted in metastable ligand-bound poses after energy minimization. PyMEMENTO cannot fix internal ligand coordinates, but provided with holo end states thus prepared, it can include the ligand in all intermediate boxes after fractional translation of the center of mass. The ligand pose can then relax in our long NPT equilibrations with protein Cα restraints. The assumption is therefore that the ligand does not move much during conformational change and will relax on the given time scale. For ADK and P38α, this assumption holds well enough for computing PMFs; we validate this in Figure S2 against the alternative approach of using the best-scoring docking pose (obtained on the same MEMENTO intermediates using default options and 10 poses per window in the GNINA docking package50). We found that the ligand poses output by MEMENTO after equilibration with protein Cα restraints give docking scores only slightly lower than the best docked poses (Figure S2a,b); however, they are much smoother as measured by an RMSD metric of ligand poses between windows (when aligning the protein, Figure S2c,d). We also attempted to compute a 2D-PMF equivalent to those described below starting from the best docking poses. The shape was qualitatively similar (Figure S2e), but convergence was much degraded (Figure S2f), presumably because of slow interconversion between the “jumpy” docked poses in REUS. 
For the purpose of this study, we have therefore restricted ourselves to the simple center-of-mass morphing + equilibration approach. To tackle conformational changes that have a larger component of ligand movement in future applications, we are also working on additional functionality to use an MC-annealing procedure equivalent to the one employed to smoothen MODELLER ensembles for picking docked poses. This extension will be available in a future version of PyMEMENTO and validated alongside its first application study.

We have further implemented alternative functionality to linearly morph ligand coordinates and use only energy minimization to fix them, but we do not employ it in this paper (we foresee it might become important when considering bound ions or crystallographic water in the future).

Adding Lipids

Our LeuT transporter example case is a membrane protein and therefore needs to be simulated within a membrane box. MODELLER cannot fix lipids, and attempting to morph them would likely be futile. Instead, one can provide one of the end states embedded in a lipid membrane. PyMEMENTO uses this pre-equilibrated membrane to embed all other conformational intermediates in lipids in a procedure inspired by the popular inflate-gro script.51 We first stretch the aligned membrane in its plane (here by 15%), followed by cycles (here 5 cycles) of compression (here by 4%) and energy minimization to fit it snugly around the new protein conformation. These parameters are customizable and may need to be adapted for different protein–lipid systems.

End-State Sampling, Replicates, and Two-Dimensional CVs

As detailed in the main text, we found great value in running extensive unbiased MD at the end states. Using snapshots from these trajectories to seed MEMENTO, we obtained independent replicates for which we first ran 1D REUS along a naive CV. Where there were significant differences between replicates, we concluded that the end-state sampling had captured a meaningful trend that was propagated through MEMENTO. We therefore concatenated all 1D REUS trajectories (at 5 ns stride for efficiency) and ran principal-components analysis (PCA) on the Cα positions using the GROMACS implementation. For P38α, we did this including only the DFG loop (residues 166–176) and also for the entire protein. The two first PCs obtained in each of these cases were our 2D-CVs for subsequent umbrella sampling.
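The PCA step can be illustrated with plain NumPy. The analysis in this work used the GROMACS implementation; the sketch below is only an illustrative equivalent and assumes that the Cα coordinates of all frames have already been superimposed onto a common reference. The function name `pca_project` is hypothetical.

```python
import numpy as np

def pca_project(coords, n_components=2):
    """PCA on Cartesian coordinates of shape (n_frames, n_atoms, 3).

    Returns (projections, components, explained_variance_ratio).
    """
    coords = np.asarray(coords, dtype=float)
    flat = coords.reshape(len(coords), -1)        # (n_frames, 3 * n_atoms)
    centred = flat - flat.mean(axis=0)            # remove the average structure
    # Rows of vt are the principal components; s**2 is proportional to variance.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    components = vt[:n_components]
    projections = centred @ components.T          # candidate 2D CV values per frame
    return projections, components, explained

# Toy trajectory: one atom moves a lot along x, another a little along y.
t = np.linspace(-1.0, 1.0, 50)
toy = np.zeros((50, 2, 3))
toy[:, 0, 0] = 10.0 * t
toy[:, 1, 1] = t ** 2
proj, comps, ratio = pca_project(toy)
```

The two columns of `proj` play the role of the 2D-CV values of each frame, and `ratio` shows how much of the variance each component captures.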

For LeuT, using the entire protein for PCA, we found that the first PC unsurprisingly correlated very well with the 1D-CV; we used it as the first CV for 2D-REUS (PC 1). However, the contributions from the remaining PCs trailed off rather slowly, requiring 16 PCs to explain >75% of the variance. We thus decided to make a linear combination of PCs 2–16 that would optimally separate out the replicates (under the assumption that this is where the interesting conformational diversity would sit). We built an in-house script that uses scipy52 to maximize by differential evolution53 an entropy-like metric of distances between MEMENTO path frames:

[eq 4: entropy-like metric of inter-replicate distances; formula image ct3c00140_m004.jpg not preserved]

where Nrep is the number of replicates (here 3) and by X(n, i) – X(n, j) we denote the distance between two conformational frames in different replicates i and j, evaluated in a projection along a given principal component. The result was termed PC 2 and used as the second CV in 2D-REUS.

While we believe that this approach for iterative CV design can be very useful in combination with MEMENTO, it does not generalize well across systems. Therefore, we did not include it in the PyMEMENTO package, which focuses solely on path generation. The design of specific umbrella sampling CVs is left to the user, as far as the scope of this paper is concerned.

MD Simulation Details

All simulations in this study were run in GROMACS49 in versions 2021.3/2021.4 (the slight version discrepancy is because of different installations on two compute clusters we used). For production simulations, we used the leapfrog integrator with a time-step of 2 fs, the v-rescale thermostat with stochastic term54 with a time constant of 0.5 ps and target temperature of 300 K (310 K for LeuT), the Parrinello–Rahman barostat55 with a time constant of 2 ps and target pressure of 1 bar, and a short-range cutoff of 1.2 nm. Where lipids were present, we employed two temperature coupling groups (membrane + protein/water + ions) and a semi-isotropic version of the barostat (x/y, z axes). Example GROMACS *.mdp files are provided with PyMEMENTO.

Before starting production simulations, we energy-minimized and then equilibrated all simulation boxes for 200 ps in the NVT ensemble at a time step of 1 fs, followed by 1 ns in the NPT ensemble at a time step of 2 fs (with the Berendsen barostat), both with Cα position restraints in place. This was done before the extra MEMENTO equilibrations as described above.

Umbrella sampling simulations were run with PLUMED56 in versions 2.7.2/2.7.3 (again, due to different cluster installations). We used replica exchange every 1000 steps (2 ps), and the WHAM algorithm21 in the implementation by Alan Grossfield.57 Convergence was assessed, as discussed in the main text and shown in the supplementary figures, by ensuring histogram overlap of neighboring windows, visualizing the PMF using different fractions of the data, and calculating an RMSD between PMFs incorporating successively more data.

Further details are given for the individual systems below. We are making all simulation data available at 10.5281/zenodo.7851906 in the form of key coordinate files, full PLUMED output files with CV projections, and WHAM output free energy data. Raw simulation trajectories and various processing scripts will be made available upon reasonable request.

Deca-alanine

We used the peptide building functionality of PyMOL39 to generate helical and extended conformations of the deca-alanine peptide, which we capped with ACE and NME residues. The simulations were run with the CHARMM 36 force field58 (version July 2021). The peptide was solvated with 5237 solvent molecules in a cubic box of around 5.5 nm side length. Steered MD was run starting from the helical state on the Cα end-to-end distance over 50 ns, with a restraint sliding from 1.446 to 2.666 nm and a force constant of 5000 kJ mol–1 nm–2. US was done along the same CV, with restraint centers linearly interpolated between the terminal values and a force constant of 1000 kJ mol–1 nm–2 for 500 ns per window. For the SMD comparison, the starting frames were extracted from the SMD trajectory at even spacing in CV values. We conducted three replicates of both the MEMENTO and SMD-derived REUS. The total sampling time expended for deca-alanine was 72 μs.
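As a quick worked check of the deca-alanine setup, the restraint centers and the sampling budget can be reproduced with a few lines of NumPy. The sketch assumes 24 windows (the default stated earlier) with both terminal CV values included as window centers; the 50 ns SMD run is not counted in the total.

```python
import numpy as np

N_WINDOWS = 24          # default number of windows used in this work
NS_PER_WINDOW = 500     # US sampling per window (ns)
N_REPLICATES = 3        # replicates per path-generation method
N_METHODS = 2           # MEMENTO and SMD

# Restraint centers interpolated between the terminal end-to-end distances (nm).
centres_nm = np.linspace(1.446, 2.666, N_WINDOWS)
spacing_nm = centres_nm[1] - centres_nm[0]

# Total umbrella sampling time in microseconds.
total_us = N_WINDOWS * NS_PER_WINDOW * N_REPLICATES * N_METHODS / 1000
print(f"window spacing: {spacing_nm:.4f} nm, total sampling: {total_us:.0f} us")
```

The window spacing comes out at roughly 0.053 nm, and the total agrees with the 72 μs quoted above.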

ADK

We obtained structures for ADK in open59 and closed60 states from the PDB (open: 4AKE, closed: 1AKE, with inhibitor). Since we could simulate all residues (1–214), we did not cap the termini. We solvated the protein with approximately 16,500 solvent molecules (precise number varies between replicates) and a NaCl concentration of 0.15 M in a cubic box of around 8.1 nm side length. The simulations were run using the AMBER ff14.sb force field;61 in the holo-state simulations the AP5A inhibitor was parametrized using GAFF2.62 At each of the apo end states, we ran 500 ns of unbiased MD, though the closed state opened spontaneously (as is expected from the PMF), so that our MEMENTO replicates could only be seeded from the initial closed structure and the 0, 250, and 500 ns frames of the unbiased open run. Steered MD was conducted in 3 replicates in the closed → open direction and one replicate in the open → closed direction (since assessing the stability of the resultant conformation was impossible, given that even the native closed state opened spontaneously). We ran these for 150 ns each with a restraint on the Cα-RMSD to the respective target state that linearly increased to 5000 kJ mol–1 nm–2, followed by 50 ns unbiased equilibration to judge stability of the obtained conformation. For 1D US, we ran three replicates for MEMENTO and SMD starting configurations each (the former prepared as described above, the latter as equally spaced frames from the three opening SMD runs) along the center-of-mass distance between the ATP-binding LID (residues 123–159) and AMP-binding NMP (residues 31–73) domains. Restraint centers were averaged CV values over the MEMENTO equilibration runs for all boxes, with a force constant of 2000 kJ mol–1 nm–2. 
2D US ran along the LID–CORE (residues 1–30, 74–122, and 160–214) and NMP–CORE distances, with restraint centers again extracted from box equilibrations or the appropriate SMD frames, and a force constant of 1000 kJ mol–1 nm–2 along each CV. The “alternative closed state” was obtained by clustering (using GROMACS simple linkage with default parameters) MEMENTO-1D-US rep1 window 0 and SMD-1D-US rep1 window 2 as the most occupied cluster in each case. They were incorporated into 2D US by MEMENTO or SMD as described above.

We detail the amount of sampling collected on ADK in Table 1.

Table 1. Overview of All MD Sampling Performed on ADK.

Type of run | Simulation time (ns)
Unbiased MD | 500 × 2
SMD | 200 × 6
1D US MEMENTO | (259 + 234 + 258) × 24
1D US SMD | (372 + 240 + 245) × 24
2D US MEMENTO apo | (147 + 158) × 24
2D US SMD apo | (158 + 159 + 157 + 250 + 255) × 24
2D US MEMENTO holo | 201 × 24
2D US MEMENTO holo, docked ligands | 205 × 24
Total | 81 μs
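The total in Table 1 can be reproduced by summing the rows. The short script below simply transcribes the table entries (in ns) and is meant as a worked arithmetic check, not as part of any analysis pipeline.

```python
# Simulation time per row of Table 1, in nanoseconds.
adk_sampling_ns = {
    "Unbiased MD": 500 * 2,
    "SMD": 200 * 6,
    "1D US MEMENTO": (259 + 234 + 258) * 24,
    "1D US SMD": (372 + 240 + 245) * 24,
    "2D US MEMENTO apo": (147 + 158) * 24,
    "2D US SMD apo": (158 + 159 + 157 + 250 + 255) * 24,
    "2D US MEMENTO holo": 201 * 24,
    "2D US MEMENTO holo, docked ligands": 205 * 24,
}

total_ns = sum(adk_sampling_ns.values())
print(f"total: {total_ns} ns = {total_ns / 1000:.1f} us")
```

The sum is 81,352 ns, which rounds to the 81 μs reported in the table.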

P38α

We obtained structures for the DFG-in63 and DFG-out64 states from the PDB (DFG-in: 1P38, DFG-out: 1W83, with inhibitor). We used PyMOL to mutate two residues in 1P38 to match the sequence of 1W83: H48L and T263A. Note: in the PDB, entry 1P38 was superseded by 5UOJ in 2017, which does not model the DFG loop anymore. However, for consistency with other simulation work we wish to compare our results against,37 we still use the 1P38 coordinates, which is validated by the fact that it is a stable conformation in unbiased MD. We simulated residues 4–354 capped with ACE and NME in the AMBER ff14.sb force field;61 in the holo-state simulations the pyridine-containing L11 inhibitor was parametrized using GAFF2.62 We solvated the protein with approximately 27,300 solvent molecules (precise number varies between replicates) and a NaCl concentration of 0.15 M in a cubic box of around 9.6 nm side length. At each of the apo and holo end states (see note in Adding Ligands), we simulated 500 ns of unbiased MD, the 0, 250, and 500 ns frames of which formed the input structures for the MEMENTO and SMD replicates. For bidirectional SMD, we used the RMSD of all Cα atoms to the respective target structure as CV, gradually increasing the force constant to 20 000 kJ mol–1 nm–2 over 150 ns, followed by 50 ns of unbiased relaxation. When proceeding to 1D US, we initially trialed a difference RMSD, DRMSD = RMSD(DFG-in) – RMSD(DFG-out), on all Cα atoms, but found slightly better behavior when keeping whole-protein aligning but restricting the RMSD calculation to the DFG loop (residues 166–176). We therefore used this CV for all 1D US, with a force constant of 4000 kJ mol–1 nm–2 and restraint centers averaged from MEMENTO box equilibrations or extracted from SMD frames (only reps 1 and 3 of the SMD were used for US, since rep 2 did not lead to metastable conformations in the vicinity of the target states). 
We also tried to resolvate (using the GROMACS solvation tool) the conformations of 1D-US-SMD rep 1 at each intermediate but stopped our simulations short of the sampling time used in other runs, since we observed no significant difference in PMF. The 2D-US-CVs were derived as discussed in the previous section; we used a force constant of 2 × 10^6 kJ mol–1 on each CV during US (note that the PCs are dimensionless and have small absolute values when output by GROMACS).
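The loop-restricted DRMSD used for 1D US can be sketched with NumPy as follows. The sketch assumes unweighted Kabsch superposition on all supplied atoms (for example, all Cα atoms) and an RMSD evaluated only over the indices in `loop_idx`; in the simulations this CV was evaluated by PLUMED, and the function names here are hypothetical.

```python
import numpy as np

def kabsch_fit(mobile, target):
    """Rotation and centroids such that (mobile - cm) @ rot + ct best fits target."""
    cm, ct = mobile.mean(axis=0), target.mean(axis=0)
    h = (mobile - cm).T @ (target - ct)               # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(u @ vt))                # avoid improper rotations
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return rot, cm, ct

def loop_rmsd(frame, reference, loop_idx):
    """Align on all atoms, then compute the RMSD over the loop atoms only."""
    rot, cm, ct = kabsch_fit(frame, reference)
    fitted = (frame - cm) @ rot + ct
    diff = fitted[loop_idx] - reference[loop_idx]
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def drmsd(frame, ref_in, ref_out, loop_idx):
    """DRMSD = RMSD(in) - RMSD(out); negative near the 'in' reference."""
    return loop_rmsd(frame, ref_in, loop_idx) - loop_rmsd(frame, ref_out, loop_idx)

# Toy references: 30 atoms, of which atoms 10-14 form the "loop".
rng = np.random.default_rng(0)
ref_in = rng.normal(size=(30, 3))
loop = np.arange(10, 15)
ref_out = ref_in.copy()
ref_out[loop] += np.array([1.5, 0.0, 0.0])            # displaced loop
c, s = np.cos(0.7), np.sin(0.7)
rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
moved_in = ref_in @ rot_z.T + np.array([3.0, -2.0, 1.0])   # rigidly moved copy
```

A rigidly rotated and translated copy of the "in" reference gives a negative DRMSD, while the "out" reference gives a positive one, so the CV distinguishes the two loop conformations independently of overall protein placement.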

We detail the amount of sampling collected on P38α in Table 2.

Table 2. Overview of All MD Sampling Performed on P38α.

Type of run | Simulation time (ns)
Unbiased MD | 500 × 4
SMD | 200 × 6
1D US MEMENTO | (150 + 142 + 140) × 24
1D US SMD (closing) | (148 + 141) × 24
1D US SMD (opening) | (147 + 130) × 24
1D US SMD (resolvation) | (118 + 60) × 24
2D US MEMENTO apo | (137 + 123 + 101) × 24
2D US SMD (closing) | (110 + 158) × 24
2D US SMD (opening) | (179 + 101) × 24
2D US MEMENTO holo | (132 + 114 + 131) × 24
Total | 62 μs

LeuT

We obtained structures for the inward-facing (IF)65 and outward-facing occluded (OCC)66 states from the PDB (IF: 3TT3, OCC: 3F3E), using MODELLER to fix a missing loop in the OCC structure from sequence. We simulated residues 11–507, with ACE and NME caps, embedded with the CHARMM-GUI membrane builder67 in a bilayer of 344 POPE lipids, solvated with approximately 21 500 solvent molecules (precise number varies between replicates) at a NaCl concentration of 0.15 M in an orthorhombic box of around 10.6 × 10.6 × 9.8 nm side lengths. We performed 1 μs of unbiased MD for each end state, where the 0, 500, and 1000 ns frames served as the starting structures for MEMENTO and SMD replicates. Steered MD was run on the Cα atom RMSD to the respective target state with a force constant linearly increasing to 10 000 kJ mol⁻¹ nm⁻² over 250 ns, then held for a further 250 ns, and followed by unbiased relaxation for 100 ns. 1D US was set up with a difference RMSD CV, DRMSD = RMSD(IF) – RMSD(OCC), calculated on all Cα atoms. The force constant was 20 000 kJ mol⁻¹ nm⁻², and restraint centers were averaged from MEMENTO box equilibrations. The derivation of 2D CVs is described in the section above; we used a force constant of 5 × 10⁶ kJ mol⁻¹ along each CV. We used the same MEMENTO starting frames as for 1D US, and SMD frames extracted at equal spacing (only reps 1 and 2 of the OCC → IF SMD were used for US, since the others did not lead to metastable conformations in the vicinity of the target states).
To improve histogram overlap, we included additional windows in our 2D-US: (1) a MEMENTO run between the starting and final conformations of the IF unbiased run for the 2D-MEMENTO-FES; (2) frames extracted from the IF unbiased run at equal CV spacing for the 2D-SMD-FES; (3) a MEMENTO run with 16 windows between the coordinates of windows 10 and 12 of MEMENTO rep 3, with a higher force constant of 2 × 10⁷ kJ mol⁻¹, for the 2D-MEMENTO-FES; and (4) a MEMENTO run with 16 windows between the coordinates of windows 9 and 12 of MEMENTO rep 2, with a higher force constant of 2 × 10⁷ kJ mol⁻¹, for the 2D-MEMENTO-FES.
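The steered MD protocol for LeuT described above (linear ramp over 250 ns, 250 ns hold, 100 ns unbiased relaxation) can be written as a simple piecewise schedule. The sketch below is illustrative only: it assumes that the ramp starts from zero and that the bias has the conventional harmonic form ½·k·(CV − target)², neither of which is spelled out in the text, and the function names are hypothetical.

```python
def smd_force_constant(t_ns, k_max=10_000.0, ramp_ns=250.0, hold_ns=250.0):
    """Force constant (kJ mol^-1 nm^-2) at time t_ns for a ramp/hold/release schedule.

    Assumption: the ramp starts at zero and the bias is switched off
    completely during the final unbiased relaxation.
    """
    if t_ns < 0:
        raise ValueError("time must be non-negative")
    if t_ns <= ramp_ns:
        return k_max * t_ns / ramp_ns
    if t_ns <= ramp_ns + hold_ns:
        return k_max
    return 0.0

def harmonic_bias(rmsd_nm, k):
    """Harmonic restraint energy (kJ/mol) pulling the RMSD to the target towards zero."""
    return 0.5 * k * rmsd_nm ** 2

# LeuT schedule: 250 ns ramp to 10 000, 250 ns hold, then relaxation.
for t in (0, 125, 250, 400, 550):
    print(t, smd_force_constant(t))

# The P38α values (20 000 kJ mol^-1 nm^-2 over 150 ns, no hold) fit the same form,
# if one assumes that "gradually increasing" also means a linear ramp.
k_p38_mid = smd_force_constant(75, k_max=20_000.0, ramp_ns=150.0, hold_ns=0.0)
```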

We detail the amount of sampling collected on LeuT in Table 3.

Table 3. Overview of All MD Sampling Performed on LeuT.
Type of run | Simulation time (ns unless stated)
Unbiased MD | 1000 × 2
SMD | 600 × 6
1D US MEMENTO | (156 + 155 + 157) × 24
2D US MEMENTO | (495 + 514 + 517) × 24
2D US MEMENTO extra | 510 × 24 + (523 + 499) × 16
2D US SMD | (507 + 511) × 24
2D US SMD extra | 500 × 24
Total | 118 μs

Results

Deca-alanine

We first sought to verify that MEMENTO can indeed fix unphysical morphs without introducing artifacts. Therefore, we applied the method to a simple toy model: deca-alanine in water (see Methods for details on simulation setup). This peptide undergoes a transition between α-helical folded and unfolded conformations that has a significant energetic and entropic barrier, but it has also been successfully described with US in the literature.68 Figure 2a,b and the step-through Supplementary Video 1 illustrate how reasonable bond lengths, angles, and dihedrals are reconstructed by MEMENTO. Note also that the procedure produces evenly spaced intermediates, and thus, there is no bias toward the end-state structures. We ran one-dimensional REUS from these conformations, using the end-to-end distance of the peptide as a CV, and found a sharp free energy minimum at the helical state as well as a broad minimum at an ensemble of extended states (Figure 2c). We note that the shape of this PMF is compatible with the literature.

Figure 2.


(a) MEMENTO windows 0, 7, 15, and 23 after linear morphing, showing unphysical geometry in the intermediates. (b) The same windows after MODELLER processing, re-adding caps and pdb2gmx processing. Unphysical geometry is now fixed. (c) PMF of deca-alanine unfolding in water, sampled by 1D REUS along the end-to-end distance. The shaded area is the range of PMFs observed when taking only the first 60%, the last 60%, and the full sampling, which gives an indication of error and convergence. (d) Convergence of triplicate REUS simulations from SMD and MEMENTO paths. Both methods yield paths that converge on comparable time scales for deca-alanine, though convergence is inherently stochastic.

To further compare MEMENTO with existing methods, we also ran steered MD along the end-to-end distance CV to open up the helical state (Figure S3a). REUS from snapshots along this trajectory yielded a PMF virtually identical to MEMENTO (Figure S3b–f). Using three REUS replicates from independent MEMENTO and SMD runs, we also validated that they converge on comparable time scales (Figure 2d), though convergence is highly variable owing to the inherently stochastic nature of MD. We conclude that MEMENTO performance is on a par with SMD for simple systems and that it produces high-quality molecular conformers that behave well in US.
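The coordinate interpolation that underlies the MEMENTO windows, and the reason the raw morphs need repairing, can be illustrated with a few lines of NumPy. This sketch covers only the linear morphing step for two pre-aligned structures; it does not reproduce the MODELLER-based reconstruction or any PyMEMENTO function, and the two-atom "bond" is a hypothetical toy case.

```python
import numpy as np

def linear_morph(start, end, n_windows=24):
    """Evenly spaced linear interpolation between two aligned coordinate sets.

    start, end: arrays of shape (n_atoms, 3); returns (n_windows, n_atoms, 3).
    """
    lam = np.linspace(0.0, 1.0, n_windows)[:, None, None]
    return (1.0 - lam) * start + lam * end

# Toy case: a 0.15 nm bond whose orientation flips between the end states.
start = np.array([[0.0, 0.0, 0.0], [0.15, 0.0, 0.0]])
end = np.array([[0.0, 0.0, 0.0], [-0.15, 0.0, 0.0]])

path = linear_morph(start, end)
bond = np.linalg.norm(path[:, 1] - path[:, 0], axis=1)
print(bond.round(3))  # the bond collapses towards zero length mid-path
```

The intermediates are evenly spaced by construction, but a group that rotates between the end states is compressed along a straight line, which is the kind of unphysical geometry seen in Figure 2a and repaired in Figure 2b.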

Adenylate Kinase (ADK)

While the previous example has established that MEMENTO does not degrade sampling of a system where sampling issues have no impact within the available time scale, we begin to see its benefits when moving to more complex systems. Adenylate kinase (ADK), an enzyme that catalyzes the interconversion of adenosine phosphates, is known to exhibit a large-scale conformational transition between open59 and closed60 states (Figure 3a). This conformational change may be described either by a varying distance between the LID and NMP domains or by the distance of LID and NMP domains from the protein CORE. For many years ADK has served as a benchmark system for computational research on conformational changes and enhanced sampling; we summarize studies which calculated PMFs on ADK in Table 4. While different studies have obtained substantially different results with various methodologies, the clear literature consensus of atomistic studies is that apo ADK favors the open state while the presence of substrate or inhibitor stabilizes the closed conformation. This is also consistent with the crystallographic data and mechanistic intuition. We note that many other papers69–74 have studied ADK conformational changes without computing a PMF. Systematically comparing these is difficult, however, so within the scope of the current work we focused on broad consensus features of ADK PMFs.

Figure 3.


(a) An overview of the domain motions in the ADK open ↔ closed conformational change. (b) Representations of MEMENTO ADK intermediates 0, 7, 15 and 23, displaying the required domain motion. (c and d) PMFs from 1D-REUS of the ADK conformational change along the LID–NMP CV, with paths from (c) MEMENTO and (d) SMD. Shaded areas are the ranges of PMFs observed when taking only the first 60%, the last 60%, and the full sampling.
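The shaded convergence ranges used in the PMF figures (first 60%, last 60%, and full sampling) follow a simple recipe that can be sketched as below. Note the substitution: the PMFs in this work come from REUS analysed with WHAM, whereas this sketch uses a plain histogram of unbiased samples, F = −kT ln p, purely to show how the band is assembled. The temperature of 300 K, the function names, and the synthetic data are assumptions.

```python
import numpy as np

KT = 0.0083144626 * 300.0  # kJ/mol; assumes T = 300 K

def pmf_from_samples(cv, bins):
    """Histogram-based free energy, shifted so that its minimum is zero."""
    hist, _ = np.histogram(cv, bins=bins, density=True)
    with np.errstate(divide="ignore"):
        f = -KT * np.log(hist)  # empty bins become +inf
    return f - f[np.isfinite(f)].min()

def convergence_band(cv, bins, fraction=0.6):
    """Lower/upper envelope over the first 60%, last 60% and full data."""
    n = len(cv)
    cut = int(round(fraction * n))
    subsets = [cv[:cut], cv[n - cut:], cv]
    pmfs = np.array([pmf_from_samples(s, bins) for s in subsets])
    return pmfs.min(axis=0), pmfs.max(axis=0), pmfs[2]

# Synthetic, unbiased "trajectory" of a CV (hypothetical data).
rng = np.random.default_rng(0)
cv = rng.normal(size=20_000)
bins = np.linspace(-3.0, 3.0, 31)

lower, upper, full = convergence_band(cv, bins)
print((upper - lower).round(2))
```

A narrow band indicates that the early and late parts of the sampling agree with the full data set; a wide band flags regions that are still drifting.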

Table 4. Comparison of Our Results to Published PMFs of the ADK Conformational Change [a]

Study | Ref | Model | Apo | Holo | Other observations
Arora 2007 | 75 | atm [b] | open | closed | LID closes before NMP
Lu 2008 | 76 | CG [c] | comp [e] | n/a | LID and NMP closing equivalent
Beckstein 2009 | 77 | impl/sol [d] | closed | n/a | open state is plateau
Jana 2011 | 78 | atm | int [f] | n/a | LID more flexible
Matsunaga 2012 | 79 | atm | open | closed | n/a
Song 2013 | 80 | atm | open | n/a | n/a
Wang 2013 | 81 | CG | comp | comp | LID more flexible
Wang 2014 | 82 | atm | open | closed | holo closed has two wells
Li 2015 | 83 | atm | open | closed | NMP prefers open in apo
Formoso 2015 | 84 | atm | comp | n/a | LID more flexible
Zeller 2015 | 85 | atm | open | closed | n/a
Shao 2016 | 86 | atm | comp | n/a | Intermediate: LID closed, NMP open
Matsunaga 2016 | 87 | CG | comp | n/a | NMP opens before LID
Halder 2017 | 88 | atm | int | n/a | NMP more flexible than LID
Zheng 2018 | 89 | atm | open | n/a | LID can close, NMP cannot
Wang 2020 | 37 | atm | open | closed | NMP can close, LID cannot
Wang 2020 | 90 | atm | open | closed | Intermediate at partial LID open
Peng 2021 | 91 | atm | open | n/a | LID closure preferred to NMP, but NMP first possible
Lu 2022 | 92 | atm | open | closed | native ligand binding pose → closed, non-native → open
This work | n/a | atm | open | closed | LID can close in apo, NMP cannot

[a] We indicate the type of computational model used and which state was found to be favored in the apo and holo proteins. Our results agree with the consensus of the atomistic studies, while coarse-grained models appear to not capture the conformational changes well.
[b] Atomistic.
[c] Coarse-grained.
[d] Implicit solvent.
[e] Comparable, distinct basins for open and closed states of similar depth.
[f] Intermediate, only one basin that lies between open and closed states.

We applied the MEMENTO procedure to ADK (intermediates are shown in Figure 3b and step-through Supplementary Videos 2 and 3), where we performed independent replicates by equilibration of the open state in unbiased MD (see Methods for details; closed-state equilibration could not be used to seed replicates due to spontaneous opening in MD). We also ran three replicates of SMD based on a Cα-RMSD CV in the closed → open direction, and one replicate in the open → closed direction (Figure S4a,b; since the closed state opened in unbiased MD, we had no measure for how successful the open → closed SMD direction might be; thus, we did not pursue it further). Using the LID–NMP domain distance as a simple one-dimensional CV, we computed PMFs for our three MEMENTO and SMD replicates (Figure 3c,d, convergence analysis in Figure S4c–f).

As expected, the open state was favored over the closed state by roughly similar free energy differences in all replicates. However, some discrepancies between replicates remained, which prompted us to investigate what a relevant orthogonal DOF may be in this case. By clustering windows 0 (MEMENTO) and 2 (SMD) of the respective first replicates, we found a metastable alternative closed state (Figure 4a), in which the LID is closed on the CORE, but the NMP domain remains open. On inspection, this conformation can be rationalized through a number of positively charged residues that coordinate the highly negatively charged substrate in holo ADK (see Figure 4c), which can engage in alternative salt bridges in the absence of ligand: aspartate 33 and arginine 36 (NMP domain) coordinate arginine 131 and aspartates 146 and 147 (LID domain). Several of the studies summarized in Table 4 have also found that the LID domain can close while NMP remains open in apo ADK simulations.86,89,91 This alternative closed state as an orthogonal DOF is sampled to different extents in the MEMENTO replicates (as it was not explicitly included in the simulation setup), leading to the observed differences between our 1D-REUS replicates.

Figure 4.


(a) An alternative closed-state identified by clustering trajectories from 1D-REUS with MEMENTO paths. The LID domain is closed while the NMP domain remains open. (b) PMF from 2D-REUS along the LID–CORE and NMP–CORE CVs, with MEMENTO paths connecting the closed, open, and alternative closed states. Crosses indicate the REUS window starting frames, connected in sequence. (c) AP5A inhibitor binding pose in the 1AKE crystal structure, illustrating how the negative ligand charge is accommodated by multiple positively charged protein residues. (d) PMF from 2D-REUS in the presence of AP5A, displaying a switch of the conformational preference to the closed state.

By connecting this alternative closed state to the open conformation with another MEMENTO path, we captured a 2D PMF along the LID–CORE and NMP–CORE distances as CVs (a combination used previously in the literature37), shown in Figure 4b, that converges very well (Figure S5). In a separate set of simulations including the crystallographic inhibitor AP5A, we likewise achieved a converged PMF—this time favoring the closed protein conformation (Figure 4d). Consistent with the majority literature conclusions, taken together these PMFs show how the crystallographic closed state is unstable in apo ADK, but also that the LID domain has significant flexibility in the open state—driven by non-native salt-bridge interactions that compensate partially for the lack of substrate. By contrast, holo ADK prefers the closed state, where a number of positively charged residues form salt bridges with the ligand.

We also investigated whether a similar PMF can be recapitulated using only SMD paths. When we projected our SMD runs onto the 2D-CV space, we discovered that the closed → open runs opened the LID before the NMP domain, and the open → closed run closed the LID domain before the NMP domain (Figure S6a). We note that this may have been the reason why inter-replicate differences were less pronounced in the 1D-REUS from SMD paths (Figure 3d) compared to MEMENTO. Since we only used the opening-direction paths for those calculations, the alternative closed state was sampled less compared with MEMENTO paths. In light of our MEMENTO results regarding LID flexibility, the SMD paths look encouraging; however, an attempt to compute a 2D-PMF by REUS (Figure S6b) converged poorly (Figure S6c,d), even if the rough shape of the PMF was consistent with our previous results. Notably, our MEMENTO paths are much smoother in CV space (Figure 4b, as indicated by black crosses and connecting lines on the 2D PMFs) than the SMD paths, which move less evenly (Figure S6b) and with more noise. Furthermore, we did not succeed in incorporating the alternative closed state explicitly into the PMF, because REUS from bidirectional SMD between the open and alternative closed states showed significant hysteresis (Figure S6e,f). We therefore conclude that MEMENTO not only produces results in line with expectations for ADK but also that there is a tangible advantage over SMD for path generation in this case. The example further highlights how MEMENTO can help identify orthogonal DOFs and thus design better CVs through the possibility of incorporating extensive end-state sampling to propagate conformational diversity that would not otherwise be feasible to achieve across many stratified windows. There is also full flexibility to add extra paths to an existing PMF, which is useful if one does discover relevant alternative conformations.

P38α

To provide a second example of a conformational change in a globular protein (that is not as much of a canonical example in the enhanced sampling field as ADK, but for which previous work still exists) we next focused our efforts on P38α, a mitogen-activated protein kinase with roles in heart physiology and implications in heart disease.93 P38α undergoes a conformational change in its DFG loop, where the DFG-in conformation63 is active, while the DFG-out state64 is inactive (illustrated in Figure 5a). It is known from NMR94 and EPR95 studies that the apo P38α displays a conformational equilibrium between DFG-in and DFG-out geometries, and that type 2 inhibitors (which bind to the DFG-out conformation) reduce the accessibility of the active DFG-in conformation. Several computational studies have broadly come to the same conclusions using various enhanced sampling methods,37,82,91 although inhibitors—where they were simulated—are not found to abolish completely the (meta-)stability of a DFG-in-like state but merely raise it in energy relative to DFG-out.

Figure 5.


(a) Overview of the P38α system, highlighting DFG-in and DFG-out states. (b) 1D-REUS along the DRMSD CV from MEMENTO paths, showing big differences between replicates. Shaded area is the range of PMFs observed when taking only the first 60%, the last 60%, and the full sampling. (c) PCA results, showing the two CVs obtained for 2D-REUS sampling. (d) 2D-REUS along PCA-CVs with MEMENTO paths. Crosses indicate the REUS window starting frames, connected in sequence. (e and f) 2D-REUS from SMD paths, in the (e) DFG-in → DFG-out and (f) DFG-out → DFG-in directions.

As for ADK, we first ran 500 ns of unbiased equilibrations at all combinations of DFG-in/out and apo/holo states. In the introductory Figure 1a,b, we have already demonstrated how SMD suffers from large hysteresis in 1D-REUS along a difference RMSD calculated for the DFG-loop only (DRMSD). To see how a history-independent path would perform in comparison, we ran three MEMENTO replicates (step-through Supplementary Videos 4 and 5), followed by 1D-REUS along the DRMSD (Figure 5b). While the results already look more reasonable than SMD, the substantial differences between replicates we found led us to hypothesize that our end-state sampling included significant conformational relaxation which was propagated through MEMENTO, but was of too large a time scale to be equilibrated in each REUS window. To find a set of CVs which includes this slow DOF, we opted to extract relevant information from the 1D-REUS sampling we had already accumulated, in an iterative fashion via principal component analysis (PCA; see Methods for more details). By including the DFG loop only or the whole protein in the PCA, we hoped to obtain principal components (PCs) that separate the DFG-loop motion from a superimposed protein conformational bias (Figure 5c and Supplementary Videos 6 and 7 illustrate the highest contributing PCs in each case, which we used as CVs in subsequent simulations). The loop PC expectedly covers a motion that would interconvert DFG-in and DFG-out loop conformations. The Protein PC also carries an element of DFG-loop motion, but in conjunction with a twist in the protein along the thin hinge that lies under the DFG loop.
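The PCA step described above can be illustrated with a minimal NumPy sketch: aligned Cartesian coordinates from the sampled frames are flattened, mean-centred, and decomposed, and the leading component is then used as a projection-based CV. This is not the GROMACS workflow used in this work (see Methods); the function names and the synthetic trajectory are hypothetical, and the frames are assumed to be already superimposed.

```python
import numpy as np

def pca_components(coords, n_components=1):
    """PCA of aligned coordinates with shape (n_frames, n_atoms, 3).

    Returns the mean structure (flattened), the leading components
    (rows are unit vectors) and their explained-variance fractions.
    """
    x = coords.reshape(len(coords), -1)
    mean = x.mean(axis=0)
    _, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return mean, vt[:n_components], explained[:n_components]

def project(frame, mean, components):
    """CV values of a single frame: its projection onto each component."""
    return components @ (frame.reshape(-1) - mean)

# Synthetic trajectory (hypothetical): one dominant collective motion plus noise.
rng = np.random.default_rng(1)
n_frames, n_atoms = 200, 5
base = rng.normal(size=(n_atoms, 3))
direction = rng.normal(size=(n_atoms, 3))
direction /= np.linalg.norm(direction)
amplitude = np.linspace(-1.0, 1.0, n_frames)
traj = (base + amplitude[:, None, None] * direction
        + 0.01 * rng.normal(size=(n_frames, n_atoms, 3)))

mean, comps, explained = pca_components(traj, n_components=2)
cv1 = np.array([project(frame, mean, comps)[0] for frame in traj])
print(explained.round(3))
```

Restricting the input to a subset of atoms (for example `traj[:, loop_idx]` with a hypothetical index array) gives a loop-only component, while passing all atoms gives a whole-protein component, mirroring the two CVs described in the text.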

Subsequent 2D-REUS we ran along the same MEMENTO paths with the new CVs produced a PMF that was highly instructive about the processes at work (Figure 5d). Projecting our initial apo-P38α unbiased MD (that started from the crystallographic DFG-out conformation) onto this PMF (Figure S7), one can clearly see how the protein assumes an alternative, less extreme DFG-out geometry on removal of the crystallographic ligand. Replicates 2 and 3 therefore start from this free energy basin, which explains why these replicates showed an increasing bias toward DFG-out in 1D-REUS, and also why replicate 1 converged poorly in the DFG-out region. Because the relaxation movement is now included in the PMF, convergence stands much improved and is no longer worse in some regions of the PMF than others (Figure S8).

With a working set of CVs at hand, we further wished to investigate whether the 1D-REUS hysteresis with SMD paths we observed was only caused by our poor initial 1D-CV guess for umbrella sampling, or whether the path exhibited starting-state dependence also in our more detailed 2D-CV space. We therefore projected our SMD replicates onto the 2D-PMF (Figure S9a) and found that replicates 1 and 3 appeared to lead to metastable states in the vicinity of their respective targets, even though the paths were not as smooth as MEMENTO in this CV space. However, in 2D-REUS, the paths still showed significant bias toward the SMD starting states (Figure 5e,f). This is not a simple convergence issue within sampling time-scales similar to those we used for MEMENTO-REUS (Figure S9c–f), although the DFG-out → DFG-in SMD direction has some issues with REUS histogram overlap leading to noticeably slower convergence. We also investigated whether the hysteresis might be predominantly due to slow solvent DOFs that could be addressed by resolvating the SMD intermediates, but we found that this does little to improve the situation (Figure S9b). In conclusion, while better CVs may well theoretically capture those DOFs that produce hysteresis in SMD for P38α, they are not as simple to obtain as with MEMENTO (and we have not managed to develop any here).

Finally, we wished to explore the role of the crystallographic type 2 inhibitor that was bound in the DFG-out structure64 (Figure S10a). Using the same 2D-CVs as before, we obtained a converged PMF (Figure S10b–d) that showed a free-energy basin near the DFG-out crystal structure, in contrast to the apo state. However, the PMF displays low free energy barriers to conformations that register as DFG-in to our CVs, which contradicts the experimental evidence. Noting that other MD studies (using 1D US)37,82 also found DFG-in conformations to be accessible in holo P38α, we judge that this may either be an issue with MD models of this system, or that what we observe as DFG-in-like in our simulations may not be experimentally relevant DFG-in conformations, thus hinting at another CV problem. To our knowledge, this issue has not been addressed before, and it would be worthwhile to do so in the future; however, we see this as beyond the scope of the present study.

Leucine Transporter (LeuT)

Thus far, our validation examples were all soluble proteins. However, conformational dynamics of other types of systems such as membrane proteins also play a crucial role in biology. For example, the solute carrier (SLC) superfamily encompasses 65 families of more than 450 transport proteins, which facilitate the movement of substrates ranging in size from protons to steroids and heme across the cell membrane.96 These transporters function by exposing a substrate binding site in turn to the intra- and extracellular medium, in what is termed an alternating access mechanism. Among the SLC transporters, many structurally diverse variants have been described, which operate by the rocker-switch, rocking bundle, and elevator mechanisms.97,98 A dynamical representation of these conformational changes is of great pharmacological interest since many drugs are carried by SLC transporters, and it may help design better drug delivery approaches in the future.

LeuT, a bacterial sodium-coupled leucine transporter, is an archetypal SLC transporter with a rocking bundle mechanism (the folds observed in the SLC 6, 7, 11, 12, and 38 families are its namesakes). Therefore, understanding the LeuT conformational changes is likely to facilitate insight into the transport of various amino acids, neurotransmitters, and inorganic ions. Here, we apply MEMENTO and SMD to the conformational change between the inward-facing (IF)65 and outward-facing occluded (OCC)66 conformations (Figure 6a).

Figure 6.


(a) Overview of the IF and OCC structures of LeuT, and pore radius profiles calculated with CHAP.102 (b) Illustration of the closing motion of TM1, before (green) and after (pink) 1 μs unbiased MD. (c) 1D-REUS along the DRMSD CV from MEMENTO paths, showing big differences between replicates. Shaded area is the range of PMFs observed when taking only the first 60%, the last 60%, and the full sampling. (d) PCA results, showing the two CVs obtained for further sampling. (e and f) 2D-REUS from (e) MEMENTO and (f) SMD paths. Crosses indicate the REUS window starting frames, connected in sequence.

As in the previous examples, we first ran long (here 1 μs) unbiased MD on lipid-embedded boxes of these two conformations (both as apo protein). While the OCC state is stable in MD, we noticed that in the IF simulations, transmembrane helix 1 (TM1) collapsed at its intracellular end to an orientation that takes it closer to the opposing bundle; that is, it partially closed spontaneously (Figure 6b). Propagated through MEMENTO paths (step-through Supplementary Videos 8–11) and 1D-REUS along a DRMSD CV (based on the original IF and OCC coordinates), this leads to big differences between replicates. Notably, in replicates 2 and 3 (the starting states of which were “collapsed TM1”), the PMF is not sampled near the original IF structure (Figure 6c). Since we wished to determine the energetics of the full connecting path between the IF and OCC structures, we once again set out to construct 2D-CVs using PCA based on the 1D-REUS trajectories. The details of this procedure are described in Methods; in short, we found that too many PCs were required to explain a substantial proportion of the variance for them to be directly useful in US. Therefore, we took PC 1, the highest contributing direction, as one CV, while constructing the second CV as the linear combination of the next 15 PCs that best separates the replicate MEMENTO paths (we termed this CV “PC 2” for simplicity). These CVs are global and complex, but roughly speaking PC 1 covers the conformational changes occurring both at the intra- and extracellular sides between IF and OCC, while PC 2 focuses more on the TM1 movement with smaller contributions at the extracellular face (Figure 6d and Supplementary Videos 12–15).

Equipped with these CVs, we proceeded to computing the 2D-PMF of the overall conformational change. In our first attempts, we experienced issues with undersampling the IF ↔ “collapsed TM1” region (unsurprisingly, since it was not originally included as an explicit path) as well as parts of the OCC ↔ “collapsed TM1” transition region (it appears that 24 windows were not quite enough for this conformational change). We therefore used MEMENTO again to supplement our PMF with extra paths between the crystallographic and “collapsed TM1” IF states, as well as an additional 16 windows along the replicate 2–3 paths within the transition region. The final PMF is shown in Figure 6e and converges reasonably well (Figure S11), though somewhat worse than previous examples even with 500 ns/window of sampling. The PMF reproduces the observation from unbiased MD that the IF state is not stable in the simulated conditions but collapses into a broad basin of an inward-facing, partially occluded state. There is then a second free energy barrier between this basin and the crystallographic outward-facing OCC state, which corresponds to the rocking bundle motion. Since this barrier is, however, significantly smaller than that for the direct IF ↔ OCC interconversion, our simulations suggest a preferred sequential mechanism for the overall conformational change, which proceeds by first closing TM1 before engaging in the rocking bundle movement.

This PMF agrees well with previous data published on LeuT. Gur et al.99 simulated a set of unbiased MD trajectories, starting from crystal structures and paths obtained using an anisotropic network model-based biasing scheme termed coMD. They found two free energy basins of inward-facing and outward-facing conformations that interconvert most favorably via occluded states (although obtaining sufficient sampling in the transition region was challenging). They also found that the intracellular side of TM1 tends to partially close up during their simulations. This observation had also been made in previous experimental100 and computational101 investigations and is reproduced in this work in the “collapsed TM1” state. The literature therefore supports the shape of the PMF we have obtained for apo LeuT. Moreover, to our knowledge this study contributes the most extensive sampling of LeuT IF ↔ OCC conformational dynamics expended to date, therefore corroborating earlier results. However, investigating the role of substrate and coupled sodium ions in the transport cycle (as other studies have attempted) was beyond the scope of the present work, since we are chiefly concerned with sampling methodology here.

In order to compare our MEMENTO 2D-PMF to what would be achievable with SMD for path generation, we ran three replicates of long (500 ns) SMD, with initial and target states set to the same frames of unbiased MD as we used for MEMENTO above (Figure S12). We found that only SMD OCC → IF replicates 1–2 and no IF → OCC runs reached metastable conformations near their target states. When we calculated a 2D-PMF along these paths, supplemented with extra frames from the apo IF unbiased run (in place of the extra MEMENTO path used above to close the gap between IF and “collapsed TM1”), we found a strong bias toward the OCC state (Figure 6f) and convergence worse than with MEMENTO paths (Figure S13). The fact that the windows from unbiased MD and SMD OCC → “collapsed IF” did not overlap well with each other (see Figure S13b,c) made us suspect that hysteresis is at play here. A comparison to the MEMENTO 2D-PMF is also suggestive of strong hysteresis, if we assume the MEMENTO results as ground truth. We cannot explicitly show the starting-state bias by contrasting PMFs from SMD in opposite directions, since we were unable to perform SMD in the IF → OCC direction at all. Nonetheless, this in itself can be taken as a strong indication of hysteresis.

We have demonstrated in conclusion that MEMENTO—together with an iterative approach for deriving CVs to match DOFs sampled in long unbiased MD—can provide valuable insight into complex and global protein conformational changes in LeuT. Steered MD, in turn, fails again to provide paths of matching quality for the purposes of US, due to starting state bias.

Discussion

We investigated in this work four protein systems that exhibit conformational changes for which structural data is available at both end states: deca-alanine, ADK, P38α, and LeuT. Furthermore, for each of these systems, previous studies that calculated the PMFs for the conformational changes are available in the literature. The aim of this project was to determine the extent to which SMD suffers from hysteresis when used to seed umbrella sampling simulations for computing PMFs. We also wished to study whether these adverse effects can be mitigated by the use of paths generated with MEMENTO (implemented and freely available as PyMEMENTO), a simple procedure we devised for crafting history-independent paths using a combination of coordinate morphing and template-based structure modelling.

On deca-alanine, we found no difference in the quality of SMD and MEMENTO paths: the simple nature of the system evidently does not produce significant hysteresis. The example therefore serves as validation of the PyMEMENTO code, showing that it does not introduce artifacts in an easy test case. Moving on to ADK, SMD and MEMENTO paths can both yield converged PMFs along physically intuitive CVs. When attempting to investigate the role of an alternative closed conformation, however, hysteresis appeared in bidirectional SMD, thus conferring an advantage to MEMENTO for a thorough exploration of the system. For P38α, SMD suffered from large-scale hysteresis. MEMENTO, in turn, was able to generate converged and consistent PMFs, albeit only once we had derived custom CVs using an iterative scheme that utilizes MEMENTO, extensive end-state sampling, and PCA. Lastly, concerning the membrane transporter LeuT, this approach for generating custom CVs was again successful with MEMENTO, while SMD suffered from hysteresis even when using the consistent CV space obtained from MEMENTO results. We note that for all examples, our results largely agree with previously published data in the literature.

In conclusion, SMD is prone to hysteresis when combined with umbrella sampling for all but the simplest conformational changes, at least without substantial prior knowledge regarding CV choice. We therefore suggest as best practice to all investigators who use SMD to generate paths for refinement and PMF computations to carefully use controls in both targeting directions, where possible. Hysteresis—if it is an issue within the relevant CV space—will become apparent as substantial differences between the obtained PMFs. It is unclear still whether this is also an issue when using SMD as a part of more elaborate biasing schemes; however, our results warrant a degree of caution. It is left for future work to establish this for each of the many methods using SMD. We also show that MEMENTO, despite its conceptual simplicity and easy implementation, is a powerful and flexible tool for conformational sampling. This is especially the case due to the ease of its combination with long end-state sampling and dimensionality reduction for CV derivation, and because newly discovered conformational states can effortlessly be integrated into existing ensembles of paths. Nonetheless, the free energy of protein conformational changes remains a formidable challenge that requires substantial expertise and system-specific knowledge or deep ad hoc investigation on the part of the researcher.
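The bidirectional control suggested above amounts to comparing the PMFs obtained from paths generated in opposite directions. A minimal sketch of such a comparison is given below; aligning the two profiles at their minima and using the maximum absolute difference as a summary number are our choices for illustration, not a prescription from this work, and the model PMFs are synthetic.

```python
import numpy as np

def pmf_discrepancy(cv_fwd, pmf_fwd, cv_rev, pmf_rev, n_grid=100):
    """Maximum absolute difference between two 1D PMFs on their shared CV range.

    Both CV arrays must be sorted in increasing order. Each PMF is shifted so
    that its minimum on the shared grid is zero, since PMFs are only defined
    up to an additive constant.
    """
    lo = max(cv_fwd.min(), cv_rev.min())
    hi = min(cv_fwd.max(), cv_rev.max())
    grid = np.linspace(lo, hi, n_grid)
    f = np.interp(grid, cv_fwd, pmf_fwd)
    r = np.interp(grid, cv_rev, pmf_rev)
    f -= f.min()
    r -= r.min()
    return float(np.abs(f - r).max())

# Synthetic double-well PMFs (kJ/mol), each tilted towards its own starting state.
cv = np.linspace(-1.0, 1.0, 50)
base = 10.0 * (cv ** 2 - 1.0) ** 2
pmf_from_a = base + 5.0 * cv   # path started in state A (cv = -1)
pmf_from_b = base - 5.0 * cv   # path started in state B (cv = +1)

consistent = pmf_discrepancy(cv, pmf_from_a, cv, pmf_from_a)
hysteretic = pmf_discrepancy(cv, pmf_from_a, cv, pmf_from_b)
print(consistent, hysteretic)
```

A discrepancy well above the statistical uncertainty of the individual PMFs points to starting-state dependence within the chosen CV space.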

Acknowledgments

We thank Dr. Zhiyi Wu for training and helpful discussions during the early stages of the project, Dr. Irfan Alibay and Dr. Rocco Meli for their help with Python software development, as well as all members of the Biggin group for their continued feedback. This project was funded by the Wellcome Trust (Grant ID: 218514/Z/19/Z) and used compute resources at the EPSRC ARCHER2 and N8 CIR BEDE facilities, granted via the High-End Computing Consortium for Biomolecular Simulation (HECBioSim, https://www.hecbiosim.ac.uk/), supported by EPSRC (EP/L000253).

Data Availability Statement

Supplementary data with key coordinate files and simulation biasing and analysis output (in PLUMED format, and as PMFs produced by WHAM) are available at 10.5281/zenodo.7851906. Our PyMEMENTO package is available with source code, documentation, and examples on GitHub: https://github.com/simonlichtinger/PyMEMENTO. We also provide a static fork of the package taken at the time of writing at https://github.com/bigginlab/PyMEMENTO.

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.3c00140.

  • Supplementary figures S1–S13 (PDF)

  • Set of 15 supplementary videos illustrating the progression of MEMENTO paths as discrete intermediates, as well as the collective variables we derived using principal component analysis as rocking animations (ZIP)

The authors declare no competing financial interest.

Footnotes

1

It should be noted that TMD and SMD differ in the type of algorithm employed to pull the CV: while TMD uses a holonomic constraint, SMD employs a harmonic restraint. The name TMD has also been used to denote SMD if the CV is an RMSD to a target state,17,18 as is mostly the case in this paper. However, we use the name SMD throughout to highlight our use of a harmonic restraint. TMD and SMD have been shown to give consistent results, so that any problems are general to the formulation of the CV.18,19
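
For orientation, the distinction drawn in this footnote can be written in a generic textbook form. The notation is an assumption and does not come from the paper: s(x) is the CV, k a force constant, T the total pulling time, and the linear schedule for the moving target s_0(t) is one common choice.

```latex
\begin{align}
  U_{\mathrm{SMD}}(\mathbf{x}, t) &= \frac{k}{2}\,\bigl(s(\mathbf{x}) - s_0(t)\bigr)^2
    && \text{(SMD: harmonic restraint)} \\
  s(\mathbf{x}) - s_0(t) &= 0
    && \text{(TMD: holonomic constraint)} \\
  s_0(t) &= s_{\mathrm{start}} + \frac{t}{T}\,\bigl(s_{\mathrm{target}} - s_{\mathrm{start}}\bigr)
    && \text{(assumed linear schedule)}
\end{align}
```

In the limit of a very stiff restraint (large k), the SMD bias approaches the TMD constraint.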

Supplementary Material

ct3c00140_si_001.pdf (5.1MB, pdf)
ct3c00140_si_002.zip (24.8MB, zip)

References

  1. Orellana L. Large-scale conformational changes and protein function: breaking the in silico barrier. Frontiers in Molecular Biosciences 2019, 6, 117. 10.3389/fmolb.2019.00117. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Jackson M. B. On the time scale and time course of protein conformational changes. J. Chem. Phys. 1993, 99, 7253–7259. 10.1063/1.465418. [DOI] [Google Scholar]
  3. Tokonami S.; Onose M.; Nakasone Y.; Terazima M. Slow conformational changes of blue light sensor bluf proteins in milliseconds. J. Am. Chem. Soc. 2022, 144, 4080–4090. 10.1021/jacs.1c13121. [DOI] [PubMed] [Google Scholar]
  4. Meshkin H.; Zhu F. Toward convergence in free energy calculations for protein conformational changes: a case study on the thin gate of mhp1 transporter. J. Chem. Theory Comput. 2021, 17, 6583–6596. 10.1021/acs.jctc.1c00585. [DOI] [PubMed] [Google Scholar]
  5. Hénin J.; Lelièvre T.; Shirts M. R.; Valsson O.; Delemotte L. Enhanced sampling methods for molecular dynamics simulations. arXiv 2022, 2202.04164. 10.48550/arXiv.2202.04164. [DOI] [Google Scholar]
  6. Zuckerman D. M.; Chong L. T. Weighted ensemble simulation: review of methodology, applications, and software. Annual Review of Biophysics 2017, 46, 43–57. 10.1146/annurev-biophys-070816-033834. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Sugita Y.; Okamoto Y. Replica-exchange molecular dynamics method for protein folding. Chem. Phys. Lett. 1999, 314, 141–151. 10.1016/S0009-2614(99)01123-9. [DOI] [Google Scholar]
  8. Hamelberg D.; Mongan J.; McCammon J. A. Accelerated molecular dynamics: a promising and efficient simulation method for biomolecules. J. Chem. Phys. 2004, 120, 11919–11929. 10.1063/1.1755656. [DOI] [PubMed] [Google Scholar]
  9. Liu P.; Kim B.; Friesner R. A.; Berne B. J. Replica exchange with solute tempering: a method for sampling biological systems in explicit water. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 13749–13754. 10.1073/pnas.0506346102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Wang L.; Friesner R. A.; Berne B. J. Replica exchange with solute scaling: a more efficient version of replica exchange with solute tempering (rest2). J. Phys. Chem. B 2011, 115, 9431–9438. 10.1021/jp204407d. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Miao Y.; Feher V. A.; McCammon J. A. Gaussian accelerated molecular dynamics: unconstrained enhanced sampling and free energy calculation. J. Chem. Theory Comput. 2015, 11, 3584–3595. 10.1021/acs.jctc.5b00436. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Torrie G.; Valleau J. Nonphysical sampling distributions in monte carlo free-energy estimation: umbrella sampling. J. Comput. Phys. 1977, 23, 187–199. 10.1016/0021-9991(77)90121-8. [DOI] [Google Scholar]
  13. Barducci A.; Bussi G.; Parrinello M. Well-tempered metadynamics: a smoothly converging and tunable free-energy method. Phys. Rev. Lett. 2008, 100, 020603 10.1103/PhysRevLett.100.020603. [DOI] [PubMed] [Google Scholar]
  14. Darve E.; Rodríguez-Gómez D.; Pohorille A. Adaptive biasing force method for scalar and vector free energy calculations. J. Chem. Phys. 2008, 128, 144120 10.1063/1.2829861. [DOI] [PubMed] [Google Scholar]
  15. Schlitter J.; Engels M.; Krüger P.; Jacoby E.; Wollmer A. Targeted molecular dynamics simulation of conformational change-application to the t - r transition in insulin. Mol. Simul. 1993, 10, 291–308. 10.1080/08927029308022170. [DOI] [Google Scholar]
  16. Izrailev S.; Stepaniants S.; Isralewitz B.; Kosztin D.; Lu H.; Molnar F.; Wriggers W.; Schulten K. In Computational Molecular Dynamics: Challenges, Methods, Ideas; Griebel M., Keyes D. E., Nieminen R. M., Roose D., Schlick T., Deuflhard P., Hermans J., Leimkuhler B., Mark A. E., Reich S., Skeel R. D., Eds.; Springer: Berlin, 1999; Vol. 4; pp 39–65. [Google Scholar]
  17. Apostolakis J.; Ferrara P.; Caflisch A. Calculation of conformational transitions and barriers in solvated systems: application to the alanine dipeptide in water. J. Chem. Phys. 1999, 110, 2099–2108. 10.1063/1.477819. [DOI] [Google Scholar]
  18. Huang H.; Ozkirimli E.; Post C. B. Comparison of three perturbation molecular dynamics methods for modeling conformational transitions. J. Chem. Theory Comput. 2009, 5, 1304–1314. 10.1021/ct9000153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Ovchinnikov V.; Karplus M. Analysis and elimination of a bias in targeted molecular dynamics simulations of conformational transitions: application to calmodulin. J. Phys. Chem. B 2012, 116, 8584–8603. 10.1021/jp212634z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Sugita Y.; Kitao A.; Okamoto Y. Multidimensional replica-exchange method for free-energy calculations. J. Chem. Phys. 2000, 113, 6042–6051. 10.1063/1.1308516. [DOI] [Google Scholar]
  21. Kumar S.; Rosenberg J. M.; Bouzida D.; Swendsen R. H.; Kollman P. A. The weighted histogram analysis method for free-energy calculations on biomolecules. i. the method. J. Comput. Chem. 1992, 13, 1011–1021. 10.1002/jcc.540130812. [DOI] [Google Scholar]
  22. Pan A. C.; Weinreich T. M.; Shan Y.; Scarpazza D. P.; Shaw D. E. Assessing the accuracy of two enhanced sampling methods using egfr kinase transition pathways: the influence of collective variable choice. J. Chem. Theory Comput. 2014, 10, 2860–2865. 10.1021/ct500223p. [DOI] [PubMed] [Google Scholar]
  23. Piana S.; Laio A. A bias-exchange approach to protein folding. J. Phys. Chem. B 2007, 111, 4553–4559. 10.1021/jp067873l. [DOI] [PubMed] [Google Scholar]
  24. Chen M. Collective variable-based enhanced sampling and machine learning. European Physical Journal B 2021, 94, 211. 10.1140/epjb/s10051-021-00220-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Dellago C.; Bolhuis P. G. In Atomistic approaches in modern biology; Reiher M., Ed.; Springer: Berlin, 2007; Vol. 268; pp 291–317; Series Title: Topics in Current Chemistry. [Google Scholar]
  26. Juraszek J.; Vreede J.; Bolhuis P. G. Transition path sampling of protein conformational changes. Chem. Phys. 2012, 396, 30–44. 10.1016/j.chemphys.2011.04.032. [DOI] [Google Scholar]
  27. Maragliano L.; Fischer A.; Vanden-Eijnden E.; Ciccotti G. String method in collective variables: minimum free energy paths and isocommittor surfaces. J. Chem. Phys. 2006, 125, 024106 10.1063/1.2212942. [DOI] [PubMed] [Google Scholar]
  28. Pan A. C.; Sezer D.; Roux B. Finding transition pathways using the string method with swarms of trajectories. J. Phys. Chem. B 2008, 112, 3432–3440. 10.1021/jp0777059. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Moradi M.; Enkavi G.; Tajkhorshid E. Atomic-level characterization of transport cycle thermodynamics in the glycerol-3-phosphate:phosphate antiporter. Nat. Commun. 2015, 6, 8393. 10.1038/ncomms9393. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. McComas S. E.; Reichenbach T.; Mitrovic D.; Alleva C.; Bonaccorsi M.; Delemotte L.; Drew D. Determinants of sugar-induced influx in the mammalian fructose transporter glut5. bioRxiv 2022, 1. 10.1101/2022.06.17.495601. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Weiss D. R.; Levitt M. Can morphing methods predict intermediate structures? J. Mol. Biol. 2009, 385, 665–674. 10.1016/j.jmb.2008.10.064. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Castellana N. E.; Lushnikov A.; Rotkiewicz P.; Sefcovic N.; Pevzner P. A.; Godzik A.; Vyatkina K. Morph-pro: a novel algorithm and web server for protein morphing. Algorithms for Molecular Biology 2013, 8, 19. 10.1186/1748-7188-8-19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Lee B. H.; Park S. W.; Jo S.; Kim M. K. Protein conformational transitions explored by a morphing approach based on normal mode analysis in internal coordinates. PLoS One 2021, 16, e0258818. 10.1371/journal.pone.0258818. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Wells S.; Menor S.; Hespenheide B.; Thorpe M. F. Constrained geometric simulation of diffusive motion in proteins. Physical Biology 2005, 2, S127–S136. 10.1088/1478-3975/2/4/S07. [DOI] [PubMed] [Google Scholar]
  35. Farrell D. W.; Speranskiy K.; Thorpe M. F. Generating stereochemically acceptable protein pathways. Proteins: Struct., Funct., Bioinf. 2010, 78, 2908–2921. 10.1002/prot.22810. [DOI] [PubMed] [Google Scholar]
  36. López-Blanco J. R.; Chacón P. New generation of elastic network models. Curr. Opin. Struct. Biol. 2016, 37, 46–53. 10.1016/j.sbi.2015.11.013. [DOI] [PubMed] [Google Scholar]
  37. Wang A.; Zhang D.; Li Y.; Zhang Z.; Li G. Large-scale biomolecular conformational transitions explored by a combined elastic network model and enhanced sampling molecular dynamics. J. Phys. Chem. Lett. 2020, 11, 325–332. 10.1021/acs.jpclett.9b03399. [DOI] [PubMed] [Google Scholar]
  38. Li Y.; Gong H. Identifying a feasible transition pathway between two conformational states for a protein. J. Chem. Theory Comput. 2022, 18, 4529–4543. 10.1021/acs.jctc.2c00390. [DOI] [PubMed] [Google Scholar]
  39. The pymol molecular graphics system, version 2.6.0a0 open-source; Schrödinger, LLC: New York, NY, 2022. [Google Scholar]
  40. Krebs W. G. Survey and summary: the morph server: a standardized system for analyzing and visualizing macromolecular motions in a database framework. Nucleic Acids Res. 2000, 28, 1665–1675. 10.1093/nar/28.8.1665. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Seo S.; Kim M. K. Kosmos: a universal morph server for nucleic acids, proteins and their complexes. Nucleic Acids Res. 2012, 40, W531–W536. 10.1093/nar/gks525. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Veevers R.; Hayward S. Morphing and docking visualisation of biomolecular structures using multi-dimensional scaling. Journal of Molecular Graphics and Modelling 2018, 82, 108–116. 10.1016/j.jmgm.2018.04.013. [DOI] [PubMed] [Google Scholar]
  43. Šali A.; Blundell T. L. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 1993, 234, 779–815. 10.1006/jmbi.1993.1626. [DOI] [PubMed] [Google Scholar]
  44. Gowers R.; Linke M.; Barnoud J.; Reddy T.; Melo M.; Seyler S.; Domański J.; Dotson D.; Buchoux S.; Kenney I.; Beckstein O. Mdanalysis: a python package for the rapid analysis of molecular dynamics simulations. Python in Science Conference, Austin, Texas, 2016; pp 98–105.
  45. Harris C. R.; Millman K. J.; van der Walt S. J.; Gommers R.; Virtanen P.; Cournapeau D.; Wieser E.; Taylor J.; Berg S.; Smith N. J.; Kern R.; Picus M.; Hoyer S.; van Kerkwijk M. H.; Brett M.; Haldane A.; del Río J. F.; Wiebe M.; Peterson P.; Gérard-Marchant P.; Sheppard K.; Reddy T.; Weckesser W.; Abbasi H.; Gohlke C.; Oliphant T. E. Array programming with numpy. Nature 2020, 585, 357–362. 10.1038/s41586-020-2649-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. McKinney W. Data structures for statistical computing in python. Python in Science Conference, Austin, Texas, 2010; pp 56–61.
  47. Hunter J. D. Matplotlib: a 2d graphics environment. Computing in Science & Engineering 2007, 9, 90–95. 10.1109/MCSE.2007.55. [DOI] [Google Scholar]
  48. Beckstein O. Becksteinlab/gromacswrapper, release 0.8.2; Tempe, AZ; 2021. [Google Scholar]
  49. Abraham M. J.; Murtola T.; Schulz R.; Páll S.; Smith J. C.; Hess B.; Lindahl E. Gromacs: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1–2, 19–25. 10.1016/j.softx.2015.06.001. [DOI] [Google Scholar]
  50. McNutt A. T.; Francoeur P.; Aggarwal R.; Masuda T.; Meli R.; Ragoza M.; Sunseri J.; Koes D. R. Gnina 1.0: molecular docking with deep learning. J. Cheminf. 2021, 13, 43. 10.1186/s13321-021-00522-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Kandt C.; Ash W. L.; Peter Tieleman D. Setting up and running molecular dynamics simulations of membrane proteins. Methods 2007, 41, 475–488. 10.1016/j.ymeth.2006.08.006. [DOI] [PubMed] [Google Scholar]
  52. Virtanen P.; Gommers R.; Oliphant T. E.; Haberland M.; Reddy T.; Cournapeau D.; Burovski E.; Peterson P.; Weckesser W.; Bright J.; van der Walt S. J.; Brett M.; Wilson J.; Millman K. J.; Mayorov N.; Nelson A. R. J.; Jones E.; Kern R.; Larson E.; Carey C. J.; Polat I.; Feng Y.; Moore E. W.; VanderPlas J.; Laxalde D.; Perktold J.; Cimrman R.; Henriksen I.; Quintero E. A.; Harris C. R.; Archibald A. M.; Ribeiro A. H.; Pedregosa F.; van Mulbregt P.; Vijaykumar A.; Bardelli A. P.; Rothberg A.; Hilboll A.; Kloeckner A.; Scopatz A.; Lee A.; Rokem A.; Woods C. N.; Fulton C.; Masson C.; Häggström C.; Fitzgerald C.; Nicholson D. A.; Hagen D. R.; Pasechnik D. V.; Olivetti E.; Martin E.; Wieser E.; Silva F.; Lenders F.; Wilhelm F.; Young G.; Price G. A.; Ingold G.-L.; Allen G. E.; Lee G. R.; Audren H.; Probst I.; Dietrich J. P.; Silterra J.; Webber J. T.; Slavič J.; Nothman J.; Buchner J.; Kulick J.; Schönberger J. L.; de Miranda Cardoso J. V.; Reimer J.; Harrington J.; Rodríguez J. L. C.; Nunez-Iglesias J.; Kuczynski J.; Tritz K.; Thoma M.; Newville M.; Kümmerer M.; Bolingbroke M.; Tartre M.; Pak M.; Smith N. J.; Nowaczyk N.; Shebanov N.; Pavlyk O.; Brodtkorb P. A.; Lee P.; McGibbon R. T.; Feldbauer R.; Lewis S.; Tygier S.; Sievert S.; Vigna S.; Peterson S.; More S.; Pudlik T.; Oshima T.; Pingel T. J.; Robitaille T. P.; Spura T.; Jones T. R.; Cera T.; Leslie T.; Zito T.; Krauss T.; Upadhyay U.; Halchenko Y. O.; Vázquez-Baeza Y. Scipy 1.0: fundamental algorithms for scientific computing in python. Nat. Methods 2020, 17, 261–272. 10.1038/s41592-019-0686-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Storn R.; Price K. Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization 1997, 11, 341–359. 10.1023/A:1008202821328. [DOI] [Google Scholar]
  54. Bussi G.; Donadio D.; Parrinello M. Canonical sampling through velocity rescaling. J. Chem. Phys. 2007, 126, 014101 10.1063/1.2408420. [DOI] [PubMed] [Google Scholar]
  55. Parrinello M.; Rahman A. Polymorphic transitions in single crystals: a new molecular dynamics method. J. Appl. Phys. 1981, 52, 7182–7190. 10.1063/1.328693. [DOI] [Google Scholar]
  56. Tribello G. A.; Bonomi M.; Branduardi D.; Camilloni C.; Bussi G. Plumed 2: new feathers for an old bird. Comput. Phys. Commun. 2014, 185, 604–613. 10.1016/j.cpc.2013.09.018. [DOI] [Google Scholar]
  57. Grossfield A. Wham: the weighted histogram analysis method, ver. 2.0.10; Rochester, NY, 2020. [Google Scholar]
  58. Huang J.; Rauscher S.; Nawrocki G.; Ran T.; Feig M.; de Groot B. L.; Grubmüller H.; MacKerell A. D. Charmm36m: an improved force field for folded and intrinsically disordered proteins. Nat. Methods 2017, 14, 71–73. 10.1038/nmeth.4067. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Müller C.; Schlauderer G.; Reinstein J.; Schulz G. Adenylate kinase motions during catalysis: an energetic counterweight balancing substrate binding. Structure 1996, 4, 147–156. 10.1016/S0969-2126(96)00018-4. [DOI] [PubMed] [Google Scholar]
  60. Müller C. W.; Schulz G. E. Structure of the complex between adenylate kinase from escherichia coli and the inhibitor ap5a refined at 1.9 å resolution. J. Mol. Biol. 1992, 224, 159–177. 10.1016/0022-2836(92)90582-5. [DOI] [PubMed] [Google Scholar]
  61. Maier J. A.; Martinez C.; Kasavajhala K.; Wickstrom L.; Hauser K. E.; Simmerling C. Ff14sb: improving the accuracy of protein side chain and backbone parameters from ff99sb. J. Chem. Theory Comput. 2015, 11, 3696–3713. 10.1021/acs.jctc.5b00255. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Wang J.; Wolf R. M.; Caldwell J. W.; Kollman P. A.; Case D. A. Development and testing of a general amber force field. J. Comput. Chem. 2004, 25, 1157–1174. 10.1002/jcc.20035. [DOI] [PubMed] [Google Scholar]
  63. Wang Z.; Harkins P. C.; Ulevitch R. J.; Han J.; Cobb M. H.; Goldsmith E. J. The structure of mitogen-activated protein kinase p38 at 2.1-å resolution. Proc. Natl. Acad. Sci. U. S. A. 1997, 94, 2327–2332. 10.1073/pnas.94.6.2327. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Gill A. L.; Frederickson M.; Cleasby A.; Woodhead S. J.; Carr M. G.; Woodhead A. J.; Walker M. T.; Congreve M. S.; Devine L. A.; Tisi D.; O’Reilly M.; Seavers L. C. A.; Davis D. J.; Curry J.; Anthony R.; Padova A.; Murray C. W.; Carr R. A. E.; Jhoti H. Identification of novel p38α map kinase inhibitors using fragment-based lead generation. J. Med. Chem. 2005, 48, 414–426. 10.1021/jm049575n. [DOI] [PubMed] [Google Scholar]
  65. Krishnamurthy H.; Gouaux E. X-ray structures of leut in substrate-free outward-open and apo inward-open states. Nature 2012, 481, 469–474. 10.1038/nature10737. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Singh S. K.; Piscitelli C. L.; Yamashita A.; Gouaux E. A competitive inhibitor traps leut in an open-to-out conformation. Science 2008, 322, 1655–1661. 10.1126/science.1166777. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Wu E. L.; Cheng X.; Jo S.; Rui H.; Song K. C.; Dávila-Contreras E. M.; Qi Y.; Lee J.; Monje-Galvan V.; Venable R. M.; Klauda J. B.; Im W. Charmm-gui membrane builder toward realistic biological membrane simulations. J. Comput. Chem. 2014, 35, 1997–2004. 10.1002/jcc.23702. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Hazel A.; Chipot C.; Gumbart J. C. Thermodynamics of deca-alanine folding in water. J. Chem. Theory Comput. 2014, 10, 2836–2844. 10.1021/ct5002076. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Kirillova S.; Cortés J.; Stefaniu A.; Siméon T. An nma-guided path planning approach for computing large-amplitude conformational changes in proteins. Proteins: Struct., Funct., Bioinf. 2008, 70, 131–143. 10.1002/prot.21570. [DOI] [PubMed] [Google Scholar]
  70. Korkut A.; Hendrickson W. A. Computation of conformational transitions in proteins by virtual atom molecular mechanics as validated in application to adenylate kinase. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 15673–15678. 10.1073/pnas.0907684106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Kubitzki M. B.; de Groot B. L. The atomistic mechanism of conformational transition in adenylate kinase: a tee-rex molecular dynamics study. Structure 2008, 16, 1175–1182. 10.1016/j.str.2008.04.013. [DOI] [PubMed] [Google Scholar]
  72. Ye C.; Ding C.; Ma R.; Wang J.; Zhang Z. Electrostatic interactions determine entrance/release order of substrates in the catalytic cycle of adenylate kinase. Proteins: Struct., Funct., Bioinf. 2019, 87, 337–347. 10.1002/prot.25655. [DOI] [PubMed] [Google Scholar]
  73. Yuan Y.; Zhu Q.; Song R.; Ma J.; Dong H. A two-ended data-driven accelerated sampling method for exploring the transition pathways between two known states of protein. J. Chem. Theory Comput. 2020, 16, 4631–4640. 10.1021/acs.jctc.9b01184. [DOI] [PubMed] [Google Scholar]
  74. Kanada R.; Terayama K.; Tokuhisa A.; Matsumoto S.; Okuno Y. Enhanced conformational sampling with an adaptive coarse-grained elastic network model using short-time all-atom molecular dynamics. J. Chem. Theory Comput. 2022, 18, 2062–2074. 10.1021/acs.jctc.1c01074. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Arora K.; Brooks C. L. Large-scale allosteric conformational transitions of adenylate kinase appear to involve a population-shift mechanism. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 18496–18501. 10.1073/pnas.0706443104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Lu Q.; Wang J. Single molecule conformational dynamics of adenylate kinase: energy landscape, structural correlations, and transition state ensembles. J. Am. Chem. Soc. 2008, 130, 4772–4783. 10.1021/ja0780481. [DOI] [PubMed] [Google Scholar]
  77. Beckstein O.; Denning E. J.; Perilla J. R.; Woolf T. B. Zipping and unzipping of adenylate kinase: atomistic insights into the ensemble of open-closed transitions. J. Mol. Biol. 2009, 394, 160–176. 10.1016/j.jmb.2009.09.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Jana B.; Adkar B. V.; Biswas R.; Bagchi B. Dynamic coupling between the lid and nmp domain motions in the catalytic conversion of atp and amp to adp by adenylate kinase. J. Chem. Phys. 2011, 134, 035101 10.1063/1.3516588. [DOI] [PubMed] [Google Scholar]
  79. Matsunaga Y.; Fujisaki H.; Terada T.; Furuta T.; Moritsugu K.; Kidera A. Minimum free energy path of ligand-induced transition in adenylate kinase. PLoS Computational Biology 2012, 8, e1002555 10.1371/journal.pcbi.1002555. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Song H. D.; Zhu F. Conformational dynamics of a ligand-free adenylate kinase. PLoS One 2013, 8, e68023 10.1371/journal.pone.0068023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Wang Y.; Gan L.; Wang E.; Wang J. Exploring the dynamic functional landscape of adenylate kinase modulated by substrates. J. Chem. Theory Comput. 2013, 9, 84–95. 10.1021/ct300720s. [DOI] [PubMed] [Google Scholar]
  82. Wang J.; Shao Q.; Xu Z.; Liu Y.; Yang Z.; Cossins B. P.; Jiang H.; Chen K.; Shi J.; Zhu W. Exploring transition pathway and free-energy profile of large-scale protein conformational change by combining normal mode analysis and umbrella sampling molecular dynamics. J. Phys. Chem. B 2014, 118, 134–143. 10.1021/jp4105129. [DOI] [PubMed] [Google Scholar]
  83. Li D.; Liu M.; Ji B. Mapping the dynamics landscape of conformational transitions in enzyme: the adenylate kinase case. Biophys. J. 2015, 109, 647–660. 10.1016/j.bpj.2015.06.059. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Formoso E.; Limongelli V.; Parrinello M. Energetics and structural characterization of the large-scale functional motion of adenylate kinase. Sci. Rep. 2015, 5, 8425. 10.1038/srep08425. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Zeller F.; Zacharias M. Substrate binding specifically modulates domain arrangements in adenylate kinase. Biophys. J. 2015, 109, 1978–1985. 10.1016/j.bpj.2015.08.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Shao Q. Enhanced conformational sampling technique provides an energy landscape view of large-scale protein conformational transitions. Phys. Chem. Chem. Phys. 2016, 18, 29170–29182. 10.1039/C6CP05634B. [DOI] [PubMed] [Google Scholar]
  87. Matsunaga Y.; Komuro Y.; Kobayashi C.; Jung J.; Mori T.; Sugita Y. Dimensionality of collective variables for describing conformational changes of a multi-domain protein. J. Phys. Chem. Lett. 2016, 7, 1446–1451. 10.1021/acs.jpclett.6b00317. [DOI] [PubMed] [Google Scholar]
  88. Halder R.; Manna R. N.; Chakraborty S.; Jana B. Modulation of the conformational dynamics of apo-adenylate kinase through a pi–cation interaction. J. Phys. Chem. B 2017, 121, 5699–5708. 10.1021/acs.jpcb.7b01736. [DOI] [PubMed] [Google Scholar]
  89. Zheng Y.; Cui Q. Multiple pathways and time scales for conformational transitions in apo-adenylate kinase. J. Chem. Theory Comput. 2018, 14, 1716–1726. 10.1021/acs.jctc.7b01064. [DOI] [PubMed] [Google Scholar]
  90. Wang J.; Peng C.; Yu Y.; Chen Z.; Xu Z.; Cai T.; Shao Q.; Shi J.; Zhu W. Exploring conformational change of adenylate kinase by replica exchange molecular dynamic simulation. Biophys. J. 2020, 118, 1009–1018. 10.1016/j.bpj.2020.01.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Peng C.; Wang J.; Shi Y.; Xu Z.; Zhu W. Increasing the sampling efficiency of protein conformational change by combining a modified replica exchange molecular dynamics and normal mode analysis. J. Chem. Theory Comput. 2021, 17, 13–28. 10.1021/acs.jctc.0c00592. [DOI] [PubMed] [Google Scholar]
  92. Lu J.; Scheerer D.; Haran G.; Li W.; Wang W. Role of repeated conformational transitions in substrate binding of adenylate kinase. J. Phys. Chem. B 2022, 126, 8188–8201. 10.1021/acs.jpcb.2c05497. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Marber M. S.; Rose B.; Wang Y. The p38 mitogen-activated protein kinase pathway—a potential target for intervention in infarction, hypertrophy, and heart failure. Journal of Molecular and Cellular Cardiology 2011, 51, 485–490. 10.1016/j.yjmcc.2010.10.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Vogtherr M.; Saxena K.; Hoelder S.; Grimme S.; Betz M.; Schieborr U.; Pescatore B.; Robin M.; Delarbre L.; Langer T.; Wendt K. U.; Schwalbe H. Nmr characterization of kinase p38 dynamics in free and ligand-bound forms. Angew. Chem., Int. Ed. 2006, 45, 993–997. 10.1002/anie.200502770. [DOI] [PubMed] [Google Scholar]
  95. Roser P.; Weisner J.; Simard J. R.; Rauh D.; Drescher M. Direct monitoring of the conformational equilibria of the activation loop in the mitogen-activated protein kinase p38α. Chem. Commun. 2018, 54, 12057–12060. 10.1039/C8CC06128A. [DOI] [PubMed] [Google Scholar]
  96. Pizzagalli M. D.; Bensimon A.; Superti-Furga G. A guide to plasma membrane solute carrier proteins. FEBS Journal 2021, 288, 2784–2835. 10.1111/febs.15531. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Drew D.; Boudker O. Shared molecular mechanisms of membrane transporters. Annu. Rev. Biochem. 2016, 85, 543–572. 10.1146/annurev-biochem-060815-014520. [DOI] [PubMed] [Google Scholar]
  98. Drew D.; North R. A.; Nagarathinam K.; Tanabe M. Structures and general transport mechanisms by the major facilitator superfamily (mfs). Chem. Rev. 2021, 121, 5289–5335. 10.1021/acs.chemrev.0c00983. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Gur M.; Zomot E.; Cheng M. H.; Bahar I. Energy landscape of leut from molecular simulations. J. Chem. Phys. 2015, 143, 243134. 10.1063/1.4936133. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Kazmier K.; Sharma S.; Quick M.; Islam S. M.; Roux B.; Weinstein H.; Javitch J. A.; Mchaourab H. S. Conformational dynamics of ligand-dependent alternating access in leut. Nature Structural & Molecular Biology 2014, 21, 472–479. 10.1038/nsmb.2816. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Grouleff J.; Søndergaard S.; Koldsø H.; Schiøtt B. Properties of an inward-facing state of leut: conformational stability and substrate release. Biophys. J. 2015, 108, 1390–1399. 10.1016/j.bpj.2015.02.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Klesse G.; Rao S.; Sansom M. S.; Tucker S. J. Chap: a versatile tool for the structural and functional annotation of ion channel pores. J. Mol. Biol. 2019, 431, 3353–3365. 10.1016/j.jmb.2019.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]

Articles from Journal of Chemical Theory and Computation are provided here courtesy of American Chemical Society
