PLOS Computational Biology. 2022 Oct 7;18(10):e1010583. doi: 10.1371/journal.pcbi.1010583

Markov state modelling reveals heterogeneous drug-inhibition mechanism of Calmodulin

Annie M Westerlund 1,#, Akshay Sridhar 1,#, Leo Dahl 1, Alma Andersson 1,2, Anna-Yaroslava Bodnar 1, Lucie Delemotte 1,*
Editor: Guanghong Wei 3
PMCID: PMC9581412  PMID: 36206305

Abstract

Calmodulin (CaM) is a calcium sensor which binds and regulates a wide range of target-proteins. This implicitly enables the concentration of calcium to influence many downstream physiological responses, including muscle contraction, learning and depression. The antipsychotic drug trifluoperazine (TFP) is a known CaM inhibitor. By binding to various sites, TFP prevents CaM from associating to target-proteins. However, the molecular and state-dependent mechanisms behind CaM inhibition by drugs such as TFP are largely unknown. Here, we build a Markov state model (MSM) from adaptively sampled molecular dynamics simulations and reveal the structural and dynamical features behind the inhibitory mechanism of TFP-binding to the C-terminal domain of CaM. We specifically identify three major TFP binding-modes from the MSM macrostates, and distinguish their effect on CaM conformation by using a systematic analysis protocol based on biophysical descriptors and tools from machine learning. The results show that depending on the binding orientation, TFP effectively stabilizes features of the calcium-unbound CaM, either affecting the CaM hydrophobic binding pocket, the calcium binding sites or the secondary structure content in the bound domain. The conclusions drawn from this work may in the future serve to formulate a complete model of pharmacological modulation of CaM, which furthers our understanding of how these drugs affect signaling pathways as well as associated diseases.

Author summary

Calmodulin (CaM) is a calcium-sensing protein which, by binding to other proteins, makes them dependent on the surrounding calcium concentration. Such protein-protein interactions with CaM are vital for calcium to control many physiological pathways within the cell. The antipsychotic drug trifluoperazine (TFP) inhibits CaM’s ability to bind and regulate other proteins. Here, we use molecular dynamics simulations together with Markov state modeling and machine learning to understand the structural and dynamical features by which TFP bound to one domain of CaM prevents association with other proteins. We find that TFP encourages CaM to adopt a conformation that is like the one stabilized in the absence of calcium: depending on the binding orientation of TFP, the drug affects either the CaM hydrophobic binding pocket, the calcium binding sites or the secondary structure content in the domain. Understanding TFP binding is a first step towards designing better drugs targeting CaM.

Introduction

Cellular Ca2+-signaling pathways affect a multitude of physiological processes, including learning and memory, muscle contraction, metabolism, and long-term depression [1,2]. Calmodulin (CaM) is a small and highly conserved Ca2+-sensing protein which contributes to the ubiquity of Ca2+-signaling by binding and regulating a wide range of target-proteins. Indeed, CaM has been found to regulate voltage-gated ion channels, G protein-coupled receptors, NMDA receptors and even proteins with opposite cellular functions, such as kinases and phosphatases [3,4].

CaM consists of N-terminal and C-terminal lobes (N-CaM and C-CaM), as well as a linker connecting the two lobes (Fig 1A). The binding promiscuity is partly due to the flexible linker, which permits a wrapping of the lobes around target-proteins, but also due to the conformational plasticity of the two lobes [3,5–7]. Each of the N-CaM and C-CaM lobes contains two (helix-loop-helix) EF-hand motifs (Fig 1B) which each form a Ca2+ binding site [8–10]. The helices within each lobe enclose a hydrophobic pocket containing bulky aromatic residues and methionines that make up the interaction surface for target-proteins [11,12]. In line with the known binding promiscuity of CaM, various NMR, small-angle X-ray scattering (SAXS) and molecular dynamics (MD) simulation studies have demonstrated the protein’s conformational flexibility [5,13–16]. Moreover, previous studies have demonstrated that Ca2+-ion binding in the two lobes affects properties of the conformational landscape by inducing a change in the interhelical angles within the EF motifs [5,8,9,17]. Specifically, the Ca2+-free (apo) CaM tends to adopt a closed conformation, shielding the hydrophobic residues from the surrounding solvent. Ca2+-bound (holo) CaM, on the other hand, may adopt more open conformations where the target-protein interacting residues are exposed to solvent [5,13].

Fig 1.


(A) The molecular structure of Calmodulin (PDB: 1CLL [8]). The N-CaM, linker and C-CaM are shown in green, orange and grey respectively. (B) The two EF-hand motifs making up the structure of C-CaM. The bound Ca2+ ions are shown as red spheres together with their coordinating residues. (C) The molecular structure of Trifluoperazine. The five atoms distributed across the ligand used to featurize the trajectory are labelled. (D) Trifluoperazine binding configurations within calmodulin (PDB: 1LIN [23]). Ligands at the inter-lobe sites are coloured yellow and those at the N-CaM/C-CaM intra-lobe sites are coloured purple. (E) The hydrophobic methionine and aromatic residues making up the intra-lobe binding site within C-CaM.

Antipsychotic phenothiazines make up a class of drugs which are known CaM modulators [18–21]. By binding to CaM, the drugs effectively inhibit target-protein association. Interestingly, the hydrophobicity of these drugs correlates with the ability to inhibit CaM function, suggesting important interactions with the hydrophobic binding pockets of the two lobes [19]. Trifluoperazine (TFP, Fig 1C) is a potent member of this family of drugs whose interaction with CaM is well-studied [22–25]. Apart from antipsychotic purposes, TFP has also been investigated as a treatment for various forms of cancer [20]. Crystallographic structures of TFP bound to CaM revealed that the drug is capable of binding to CaM with varying stoichiometries at both inter- and intra-lobe binding sites (Fig 1D).

TFP binding at the lower-affinity inter-lobe site is suggested to inhibit target-protein association by causing a compact CaM structure [26–28]. In contrast, the effect of intra-lobe drug binding on CaM’s ability to bind target-proteins is still poorly understood. In experimentally resolved CaM-TFP complex structures, TFP at this intra-lobe C-CaM site occupies two different poses where the drug interacts with the hydrophobic methionine and aromatic residues (Fig 1E) [25]. However, the mechanism and kinetics of transitions between the binding poses, as well as possible inhibitory effects on target-protein binding, remain unclear.

Molecular Dynamics (MD) simulations provide atomistic resolution of protein dynamic behavior and are routinely used to reveal the impact of drug-binding on protein function. However, conformational changes of CaM lobes, as well as phenothiazine binding, occur on the microsecond (μs) timescale; beyond the range of classical unbiased MD simulations [13,29]. Markov state modeling (MSM) is a computational technique that allows the stitching together of multiple short independent simulations and thus circumvents the sampling issue [30,31].

Here, we analyze the ensemble of TFP binding-modes in C-CaM, the most favorable binding site for TFP [32]. This is done by constructing an MSM from ~21 μs adaptively sampled atomistic simulations of CaM in presence of a bound TFP molecule. The obtained free energy landscape and kinetic network then validate the sampled drug-binding modes and illustrate their coarse structural and dynamical attributes. Through further systematic and comparative analysis of the MSM macrostates with previously sampled trajectories of TFP-unbound CaM, we uncover the state-dependent molecular determinants through which TFP prevents CaM from binding target-proteins.

Materials and methods

Calmodulin-TFP system preparation and equilibration

To enable extensive sampling of TFP bound to the multitude of C-CaM conformational states, we chose to initiate simulations in a diversity of conformational states. To do so, we obtained the three C-CaM states predicted from simulations of the 3CLN structure [33] in our prior work [5]–states 2, 5 and 6, which correspond to Calmodulin’s binding to partner proteins in shallow, intermediate and deep manner respectively (S1 Fig). The initial TFP positions within each state were generated by alignment of the C-CaM Cɑ to chain A of the 4RJD [25] structure. To account for the reduced ligand diffusivity within the collapsed hydrophobic core of state 2, we obtained an additional TFP initial pose by alignment of the C-CaM Cɑ to chain B of the 4RJD structure.

Each Ca2+-bound CaM-TFP complex thus obtained was placed in a box containing 22598 water molecules and Na+ counter-ions. The system energy was minimized until the maximum force on any atom was < 1000 kJ/mol/nm, and the system was then equilibrated for 20 ps in the NVT ensemble with harmonic restraints of 1000 kJ/mol/nm2 applied to protein and TFP atoms. Pressure was then relaxed, and the box scaled, using a Berendsen barostat [34] for 20 ps with similar restraints on the protein and TFP atoms.

Replica exchange solute tempering and adaptive sampling simulations

Before initiating the MD simulations for MSM construction, we ran replica exchange solute tempering (REST) simulations [35,36] to quickly explore the CaM-TFP conformational landscape and obtain a wide range of initial configurations to launch the initial round of adaptive sampling simulations. As such, the REST simulations were not run until convergence of the free energy landscape and were not used in the MSM analysis. In temperature replica exchange (REMD) simulations, parallel simulations are run at different temperatures, with the higher temperatures allowing a faster exploration of conformational states. REST simulations are a modification of the REMD algorithm wherein only a subset of the system is heated to enable a wider temperature span for the same number of replicas, thereby making the method more efficient for state exploration [36]. We initiated three sets of REST simulations from each of the three bound states with 5 replicas run for 500 ns each. Exchanges between adjacent replicas were attempted every 4 ps and the ‘temperature’ span (303.15 K to 330 K) was determined using the temperature predictor of Patriksson and van der Spoel [37] by only considering the CaM-TFP atoms. Efficiency of the REST exchanges was assessed by analyzing the energy overlap between adjacent replicas and the exchange probability (S2 Fig). Protein-ligand contacts of the lowest-temperature replica were then projected onto the two slowest degrees of freedom (called ‘tICs’; see ‘Selecting distance-based features and projecting the data onto slow degrees of freedom’). The two tICs were obtained from the continuous REST trajectories. For each CaM-TFP state, 10 points uniformly distributed across the two-dimensional grid spanning the extent of these two tICs were selected, and their representative conformations were used to initiate the first round of 100 ns-long unbiased MD simulations. Doing so ensured the initiation of MD simulations from both free energy basins and barriers with equal probabilities.
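As a reminder of how exchanges are decided in the plain temperature REMD scheme mentioned above, the following minimal sketch evaluates the standard Metropolis acceptance probability for swapping two replicas. It is illustrative only: REST rescales the solute Hamiltonian rather than the whole-system temperature, so the acceptance rule used in the actual simulations differs, and the function name and energies below are hypothetical.

```python
import math

KB = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def remd_acceptance(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of swapping replicas i and j in temperature REMD.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1 / (kB * T).
    Energies are potential energies in kJ/mol, temperatures in K.
    """
    beta_i = 1.0 / (KB * temp_i)
    beta_j = 1.0 / (KB * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


# Hypothetical energies for replicas at the two ends of the 303.15-330 K span.
p_favourable = remd_acceptance(-1000.0, 303.15, -1010.0, 330.0)
p_unfavourable = remd_acceptance(-1010.0, 303.15, -1000.0, 330.0)
```

A swap that moves the lower-energy configuration to the colder replica is always accepted; the reverse swap is accepted with a probability that shrinks as the temperature gap or energy gap grows, which is why the energy overlap between adjacent replicas is the quantity to check.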

For subsequent rounds of adaptive sampling, we used a counts-based method to sample new initial configurations. This was done by first building an MSM (see ‘Construction of a Markov state model’) using the existing unbiased MD simulations and then randomly selecting microstates to sample from, based on the stationary distribution over microstates. Specifically, the probability of sampling an initial configuration from a microstate was inversely proportional to its stationary probability [38]. Thus, microstates with a low stationary probability were more likely to be selected for seeding new simulations. In other words, rarely visited parts of the conformational landscape were more often chosen to launch new simulations. This simple counts-based method has been demonstrated to be efficient for exploring new states [38,39]. Through eight rounds of such adaptive sampling, we accumulated a combined simulation time of 20.8 μs across the initial C-CaM/TFP binding configurations.
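The seeding rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the stationary probabilities are toy values, and microstates are drawn with replacement.

```python
import random


def inverse_population_weights(stationary):
    """Sampling weights proportional to 1/pi_i, normalized to sum to 1."""
    if any(p <= 0 for p in stationary):
        raise ValueError("stationary probabilities must be positive")
    inverse = [1.0 / p for p in stationary]
    total = sum(inverse)
    return [w / total for w in inverse]


def pick_seed_microstates(stationary, n_seeds, rng=None):
    """Draw microstate indices (with replacement) to seed new simulations."""
    rng = rng or random.Random()
    weights = inverse_population_weights(stationary)
    return rng.choices(range(len(stationary)), weights=weights, k=n_seeds)


# Toy stationary distribution over four microstates (illustrative values only).
pi = [0.70, 0.20, 0.08, 0.02]
weights = inverse_population_weights(pi)
seeds = pick_seed_microstates(pi, n_seeds=10, rng=random.Random(0))
```

With these toy values, the least populated microstate is 35 times more likely to be picked than the most populated one (0.70 / 0.02), which is what pushes new simulations towards rarely visited regions.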

Molecular dynamics simulation parameters

The MD simulations were performed using Gromacs 2018 [40] with a timestep of 2 fs. For the REST simulations, we used Plumed 2.3.5 [41,42]. Long-range electrostatic interactions were calculated using the particle mesh Ewald method [43] and hydrogen-bond lengths were constrained using LINCS [44]. Pressure and temperature were maintained using the Parrinello–Rahman barostat [45] (1 bar) and the Nosé-Hoover thermostat [46] (300 K), respectively. The protein was described using the Charmm36 force field [47], water using the TIP3P model and Ca2+-ions using the Charmm27 parameters of Liao et al. [48]; TFP parameters were generated using STaGE [49].

Selecting distance-based features and projecting the data onto slow degrees of freedom

To build the MSM, we first determined distance features to describe TFP binding-modes within C-CaM. For this, the minimum distances between the 60 C-CaM residues 88–147 and five TFP atoms (C25, C24, C10, SC4, C16) were calculated. The five atoms were chosen to encompass the ligand’s functional groups (Fig 1C)–the C25 atom is within the aromatic ring while C10 is adjacent to the trifluoromethyl group, and the SC4 Sulphur is situated between the two. The C16 and C24 atoms, on the other hand, are situated within the alkyl linker and piperazine groups respectively. These distances were transformed into quasi-binary contacts, varying cut-offs from 5 to 8 Å, using the following transformations:

$$D = \left\{\, D < i,\; (D < i)^{-1} \,\right\}, \quad \text{where } i = 5, 6, 7, 8~\text{Å}$$

To maximize the kinetic variance within the features, we evaluated these eight different feature types using the variational approach for Markov processes (VAMP2) score [50,51]. Each feature type was scored at 5 lag-times (2 ns, 5 ns, 10 ns, 15 ns, 20 ns) using the top ten eigenvalues and 5-fold cross-validation. S3 Fig shows the mean scores for different lag-times. The scores suggested that these feature types yielded MSMs of similar quality. However, the feature type given by a 6 Å cutoff and inverse distances yielded the highest mean VAMP2 score across lag-times, and this feature type was therefore selected for MSM construction.
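The eight feature types (four cut-offs, each as a contact or as an inverse distance) can be enumerated as in the sketch below. The reading of the inverse-distance type is an assumption: here a distance below the cut-off is replaced by its inverse and a distance beyond the cut-off is set to zero. The function names and distances are hypothetical.

```python
CUTOFFS = (5.0, 6.0, 7.0, 8.0)  # cut-offs in Angstrom, as listed in the text


def contact_feature(distance, cutoff):
    """Binary contact: 1.0 if the distance is below the cut-off, else 0.0."""
    return 1.0 if distance < cutoff else 0.0


def inverse_distance_feature(distance, cutoff):
    """Assumed reading of (D < i)^-1: 1/D within the cut-off, 0.0 beyond it."""
    return 1.0 / distance if distance < cutoff else 0.0


def featurize(distances):
    """Return {(kind, cutoff): transformed values} for all eight feature types."""
    features = {}
    for cutoff in CUTOFFS:
        features[("contact", cutoff)] = [
            contact_feature(d, cutoff) for d in distances
        ]
        features[("inverse", cutoff)] = [
            inverse_distance_feature(d, cutoff) for d in distances
        ]
    return features


# Toy minimum distances (Angstrom) between one residue and one TFP atom.
toy_distances = [3.5, 5.5, 7.9, 12.0]
feature_types = featurize(toy_distances)
```

Each entry of `feature_types` would then be scored separately (for example with VAMP2) to choose the feature type used downstream.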

The data were projected with time-lagged independent component analysis (tICA) [52] using time-lagged correlation matrices (tICA time-lag of 20 ns). A tICA projection provides a low-dimensional representation of the data along the slowest degrees of freedom. Overall, we used the Python package MDTraj [53] to compute distance-based features, MSMBuilder [54] for parameter optimization and PyEMMA [55] for scoring the distance-based features, projecting the data, building the MSM and extracting the final macrostates.

Construction of a Markov state model

The Generalized matrix Rayleigh quotient (GMRQ) method was used for MSM hyperparameter selection. S4 Fig shows the GMRQ scores calculated from 5-fold cross-validation, suggesting that the inclusion of the 10 slowest time-lagged independent components (tICs) and 200 microstates is sufficient to describe the Markov model. The 10 slowest time-lagged independent components (tICs) obtained from the full set of simulations were thus used as input to k-means clustering to obtain 200 microstates. To select a Markovian, or memoryless, lag-time for the MSM, we first estimated a set of transition probability matrices at varying lag-times. Each such matrix describes the probability of transitioning between microstates at a specific lag-time. By plotting the implied timescales [56,57] of the obtained transition matrices, we identified a Markovian lag-time where the timescales appear to converge (15 ns–S5 Fig). The final MSM was constructed using this lag-time, and used to assign probability weights (πi) to each trajectory frame i. Equilibrium averages of any system characteristic O could then be obtained with a weighted sum,

$$\langle O \rangle = \frac{\sum_i \pi_i O_i}{\sum_i \pi_i}.$$
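The weighted sum above translates directly into code. The sketch below is a minimal illustration with a hypothetical function name and toy values; it assumes one weight per trajectory frame, as in the text.

```python
def msm_weighted_average(values, weights):
    """Equilibrium average <O> = sum_i(pi_i * O_i) / sum_i(pi_i)."""
    if len(values) != len(weights):
        raise ValueError("need exactly one weight per frame")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * o for w, o in zip(weights, values)) / total


# Toy example: an inter-residue distance (Angstrom) in three frames.
distances = [4.0, 6.0, 10.0]
frame_weights = [0.5, 0.25, 0.25]
mean_distance = msm_weighted_average(distances, frame_weights)
```

Dividing by the sum of the weights means the weights do not need to be normalized beforehand, which is convenient when averaging over a subset of frames such as a single macrostate.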

The number of MSM metastable states was selected using eigenvalue spectral analysis of the MSM transition matrix (S6 Fig). PCCA++ spectral clustering [58,59] was then used to estimate the probability of a trajectory frame belonging to each macrostate. Finally, we identified core-states by assigning a trajectory frame to one of the macrostates if the state probability was >80%. The rest of the trajectory frames were left as transition points. The core states allow us to better distinguish the states due to reduced noise in state-definition.
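The core-state assignment can be illustrated as follows. This is a sketch with a hypothetical function name and toy membership probabilities; it assumes the per-frame macrostate memberships (for example from PCCA++) are already available as rows that sum to one.

```python
def assign_core_states(memberships, threshold=0.8):
    """Assign each frame to its most probable macrostate if that probability
    exceeds the threshold; otherwise mark it as a transition frame (None)."""
    assignments = []
    for probs in memberships:
        best = max(range(len(probs)), key=probs.__getitem__)
        assignments.append(best if probs[best] > threshold else None)
    return assignments


# Toy memberships for four frames and three macrostates.
toy_memberships = [
    [0.95, 0.03, 0.02],  # clearly in state 0
    [0.45, 0.50, 0.05],  # ambiguous, left as a transition point
    [0.10, 0.05, 0.85],  # clearly in state 2
    [0.80, 0.15, 0.05],  # exactly at the threshold, not strictly above it
]
core = assign_core_states(toy_memberships)
```

Frames marked `None` are excluded from the per-state analyses, which is how the core-state definition reduces noise at the boundaries between macrostates.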

Identification of important residues using machine learning

To pinpoint C-CaM residues participating in important residue-interactions of each identified TFP binding state, we used the demystifying toolkit [60]. This approach is based on explainable artificial intelligence (AI). In short, a machine learning model can be trained to recognize the three major states based on each frame’s internal coordinates and metastable state-assignment. We may then ask the model which input features contributed the most to making a classification (state assignment) decision. Here, the input features were given by the inverse distances between all C-CaM residue pairs which were less than 6.5 Å in at least one frame and more than 6.5 Å in another. It should be noted that this contrasts with the feature types used for MSM construction which considered distances between CaM residues and selected TFP atoms. We trained random forest (RF) models using scikit-learn [61] and default parameters as determined in Fleetwood et al. [60] to recognize each metastable state given the input features and calculated the importance of residue-interactions for distinguishing between states by their Gini-importance [62]. Moreover, we calculated the Kullback-Leibler divergence of inverse residue-distance distributions for comparison. For both models, we trained one model per state in a one-versus-the-rest fashion. Finally, the per-residue importance was calculated as a sum over all the residues’ interactions. The obtained residue importance was normalized between 0 and 1.
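The last two steps, summing interaction importances per residue and normalizing the result, can be sketched as below. The pairwise importance values are made up for illustration, the function name is hypothetical, and min-max scaling is an assumption about how the normalization between 0 and 1 was done.

```python
from collections import defaultdict


def per_residue_importance(pair_importance):
    """Sum pairwise importances over every pair a residue takes part in,
    then min-max normalize the per-residue totals to the range [0, 1]."""
    totals = defaultdict(float)
    for (res_i, res_j), value in pair_importance.items():
        totals[res_i] += value
        totals[res_j] += value
    low, high = min(totals.values()), max(totals.values())
    span = high - low
    if span == 0:
        return {res: 0.0 for res in totals}
    return {res: (value - low) / span for res, value in totals.items()}


# Made-up importances for four residue pairs, keyed by residue numbers.
toy_pairs = {(92, 141): 0.5, (92, 108): 0.3, (108, 141): 0.1, (97, 104): 0.1}
profile = per_residue_importance(toy_pairs)
```

In practice the pairwise values would come from a trained classifier (for instance the Gini importances of a random forest) or from the Kullback-Leibler divergence, with one such profile per state in the one-versus-the-rest setup.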

Results and discussion

Validation of Markov state model and identification of three major metastable states

We projected two-dimensional MSM free-energy landscapes along the three slowest time-lagged independent components (tICs). The best state separation was observed when the data were projected along tIC1-tIC3 (Figs 2A, S7 and S8). This free-energy landscape suggested the presence of three well-defined major local minima. The projection of the TFP-bound C-CaM structures onto the tIC space revealed how the free energy basins obtained using MD simulations compared with experimentally resolved structures: the 4RJD (Chain B, i.e. CaM molecule B) and 1CTR structures indeed differ from 1A29 and 1LIN along tIC3. Interestingly, Chain A of the 4RJD structure is displaced from the other structures along tIC1, located close to a separate free-energy minimum. To further validate these free energy minima, we analyzed the paths taken by the various sampled trajectories and confirmed that we observe multiple transitions between the free-energy basins. Hence, the coverage of the conformational space by the combined set of MD simulations was extensive (S9 Fig).

Fig 2.


(A) Projection of the Markov state model free-energy surface along the time-lagged independent components tIC1 and tIC3. The projection of the 4RJD (Chains A & B) [25], 1LIN [23], 1A29 [24] and 1CTR [22] crystallographic structures onto the surface are shown as white dots. (B) S1-S7 macrostate assignments along the tICs obtained from PCCA++ clustering. Each trajectory frame is represented as a dot within the scatter plot. The stationary probability distributions of the seven macrostates and the associated standard deviations are shown in the inset.

Based on the MSM spectral analysis, we selected and extracted seven metastable states (S6 Fig). Fig 2B shows the resulting clustering, where each dot corresponds to a trajectory frame, colored by its macrostate assignment. The seven states (S1-S7), which were validated by Chapman-Kolmogorov tests (S10 Fig), are well separated in the tIC space (S7 and S8 Figs). We note, however, that the three major macrostates (S5, S6, S7) accounted for ~98% of the stationary probability distribution (Fig 2B). We thus proceeded to characterize these three major TFP binding-modes and their effect on C-CaM conformations.

The metastable states represent distinct TFP binding modes

We first identified the coarse molecular features of each metastable state by calculating the ligand root mean square deviation (RMSD) across frames in the state after alignment of the C-CaM Cɑ atoms. The trajectory frame with the lowest mean RMSD was then selected as a representative structure. A visual inspection of the resulting representative binding modes provides an initial explanation of their separation in tIC-space (Fig 3A). The trifluoromethyl (CF3) moiety of TFP within S5/S6 and S7, for example, is found in reversed orientations, either pointing towards or away from the hydrophobic pocket, respectively. Consistently, the experimentally resolved structures projected onto the free energy basins display similar ‘flipped’ configurations: the CF3 group points towards the hydrophobic pocket in 4RJD (Chain A) [25], 1A29 [24] and 1LIN [23], and away from the pocket in 4RJD (Chain B) [25] and 1CTR [22].

Fig 3.


(A) Representative structures of the S5, S6 and S7 macrostates aligned on the C-CaM Cɑ atoms. The flipped orientation that buries the CF3 moiety within the S5 and S6 macrostates is highlighted using an orange circle. The three residues F92, F141 and M124, whose interactions characterize apo- and holo-CaM states, are illustrated. (B) Relative contact frequency of the TFP atoms with C-CaM residues compared between the three S5, S6 and S7 macrostates. The flipped orientation that buries the CF3 moiety within the S5 and S6 macrostates is highlighted using an orange circle and the contrasting burying of the C10 atom within S7 is highlighted using cyan circles. (C) Projection of the centres of the three populated macrostates onto the calculated free-energy surface along the time-lagged independent components tIC1 and tIC3. Transition rates representing the mean first passage times between the macrostates and the associated standard deviations are illustrated as arrows.

Clearly, state S7 represents a separate binding pose compared to S5 and S6. A distinct difference between S5 and S6 is, however, difficult to assess by visual inspection. Moreover, unlike the static representative structures, the ensembles of microstates provide a dynamic picture of residue movements and interactions with TFP. To account for the conformational heterogeneity within metastable core-states, we therefore analyzed the ensemble of frames assigned to each state. By doing so, we set out to reveal subtle, but important, differences in the interactions between TFP and CaM. First, we validated the pose difference across metastable states by studying the state-dependent interactions between TFP atoms and C-CaM residues. The difference in contact frequencies between the three metastable states is shown in Fig 3B. Consistent with a flipped orientation between S5/S6 and S7, the C10 and C25 atoms located at opposite edges display similar delta contact frequencies within the S7:S5 and S7:S6 differences. In S5 and S6, the C25 atom and the Fluorines are buried to interact with the F89/F92 residues at the base of the hydrophobic pocket (orange circle). Alternatively, in S7, the C10 atom slots into the hydrophobic pocket and interacts with the buried aromatic residues (cyan circle). The differences between the S5 and S6 binding poses, however, remained difficult to characterize. Indeed, these differences are subtle, and the orientation of the phenothiazine tricyclic is identical in both states. The atomistic contact analysis instead suggests that the main difference between the two binding-modes primarily stems from the interactions between C24 and the residues making up the binding pocket.

To depict the dynamics of interconversion between TFP binding modes, we calculated the mean first passage times (MFPT) [63] between the three major macrostates. The results demonstrate that the S5 and S6 states interchange on the sub-microsecond timescale while exchanges with the S7 state occur on a significantly longer timescale, consistent with the structural differences observed above. Interestingly, the MFPTs also show that a transition to S7 from the S5 state with the buried TFP typically requires a longer timescale than a transition from the less buried TFP pose in state S6. This hints that although S6 is globally more similar to S5, it may represent an intermediate state with subtle features similar to both S7 and S5.
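For a discrete Markov chain with transition matrix T estimated at lag-time τ, the mean first passage time m_i from state i to a target state j satisfies m_j = 0 and m_i = τ + Σ_{k≠j} T_ik m_k. The sketch below solves this by simple fixed-point iteration on a toy two-state matrix. It is illustrative only: the function name and matrix are hypothetical, and the MFPTs in this work were computed between macrostates of the full MSM rather than on a toy chain.

```python
def mean_first_passage_times(transition_matrix, target, lag_time,
                             tol=1e-12, max_iter=100000):
    """MFPT from every state to `target`, in the units of `lag_time`.

    Iterates m_i = lag_time + sum_{k != target} T[i][k] * m[k], with
    m[target] = 0, until the largest update falls below `tol`.
    """
    n = len(transition_matrix)
    mfpt = [0.0] * n
    for _ in range(max_iter):
        updated = [0.0] * n
        for i in range(n):
            if i == target:
                continue  # the MFPT from the target to itself is zero
            updated[i] = lag_time + sum(
                transition_matrix[i][k] * mfpt[k]
                for k in range(n) if k != target
            )
        change = max(abs(a - b) for a, b in zip(updated, mfpt))
        mfpt = updated
        if change < tol:
            break
    return mfpt


# Toy two-state transition matrix at a 15 ns lag-time (rows sum to 1).
T = [[0.9, 0.1],
     [0.2, 0.8]]
to_state_1 = mean_first_passage_times(T, target=1, lag_time=15.0)
to_state_0 = mean_first_passage_times(T, target=0, lag_time=15.0)
```

For a two-state chain the result reduces to τ divided by the escape probability, giving 150 ns from state 0 to state 1 and 75 ns in the opposite direction. This kind of asymmetry between forward and backward MFPTs is what the comparison of the S5, S6 and S7 transitions relies on.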

TFP binding-modes alter the C-CaM binding pocket by affecting local interactions at a Ca2+-binding site

With the MSM, we were able to identify three major macrostates which agree with the experimentally resolved TFP-bound configurations. However, the backbone conformations of C-CaM in the representative structures are nearly identical, with a maximum Cɑ RMSD of only 1.71 Å between them. Hence, we hypothesize that the effect of TFP binding-modes on C-CaM conformation is instead manifested through subtle state-dependent changes of residue interactions. To pinpoint CaM residues with distinct interaction patterns in the three states, we utilized machine learning and explainable artificial intelligence [60]. Briefly, we used a random forest (RF) classifier to recognize the three major states based on internal coordinates and metastable state-assignment, and then extracted the per-residue accumulated feature importance (see ‘Materials and Methods’).

Fig 4A–4C show the state-dependent RF importance profiles. A high RF importance indicates large state-dependent changes in the residue’s interaction pattern and thereby suggests movement of the residue relative to its interacting residues. Estimating importance from the Kullback-Leibler divergence between distances’ probability distributions validates these importance profiles by yielding profiles which share main features with the ones obtained by RF (S11 Fig). As expected for S5/S6 and S7 states with opposing orientations, the RF models identify M144 and F92 residues as important for discerning the states. This result is consistent with the C10 atom oriented towards these residues in the respective states [25]. In fact, both F92 and M144 are highly conserved residues [64], often interacting with target-proteins [7]. Earlier work showed that exposure of F92 to solvent favors deep binding to target-proteins [5], and that the F92A mutant is linked to a drop in ion affinity [65]. In addition to F92, we also note that F141, another highly conserved residue [64], has a relatively high importance in the S7 profile. Not only is F141L a known long QT syndrome (LQTS) mutation which also decreases Ca2+-affinity [66,67], but the interaction and stacking of F141 and F92 has been shown to associate with the transition from apo- to holo-like conformations [13]. Moreover, we observe an increased importance of V108 in S7 (Fig 4A). This residue is indeed important for the packing and repacking of these aromatic residues during the transition from apo- to holo-like conformations [13]. It was also found important to characterize the overall conformational ensemble of C-CaM [60]. Finally, we note that N111 has a relatively high importance in the S7 profile (Fig 4A). This residue has also been suggested to change interactions upon Ca2+-binding [68]. 

To further rationalize the structural basis of the importance profiles, we calculated the change in mean inter-residue distances between the states (Fig 4B). Consistent with the RF importance predictions, the helices containing the F92 and F141 residues are further apart in S7 than in S6/S5 (orange circle). The helix containing the V108 and N111 residues shows similar state-dependent differences, with its distance to the F92 helix larger in S7 than in S6/S5 (cyan circle) and its distance to the F141 helix larger in S6/S5 than in S7 (black circle), a result consistent with the residues’ capability to impede hydrophobic stacking within the C-CaM core.

Fig 4.


(A) The per-residue importance in discerning the S5, S6 and S7 macrostates calculated using the supervised Random Forest method. Plots represent the mean importance values calculated from five-fold cross-validation and the standard deviations are plotted as error bars. Physiologically important residues and those with high importance values are illustrated using red dotted lines. (B) Relative C-CaM inter-residue distances compared between the S5, S6 and S7 macrostates. The important F92, V108, N111 and F141 residues identified by the Random Forest method are illustrated as dotted lines. The increased distances between the F92-F141 and F92-V108 residue pairs in S7 are highlighted as orange and cyan circles respectively. The decreased distance between the V108-F141 residues in S7 is highlighted with a black circle. (C) Distance between the centers of mass of the N97 and E104 residues making up the Ca2+ binding site within the S5, S6 and S7 macrostates. Each violin plot spans the 5th to 95th percentile of distances and densities are weighted by the Markov state model probabilities. The median value for each macrostate is represented as a white dot. The inter-residue distances calculated from apo- (PDB: 1CFD [9]) and holo- (PDB: 1CLL [8]) calmodulin structures are shown as grey and black dotted lines respectively. The mean inter-residue distances within MD simulations [5] of apo- and holo-CaM are shown as grey and black dashed lines respectively. The standard deviations of the residue distances within the MD simulations are shown as a shaded area.

Next, we detect differences in the importance profiles of S5 and S6 (Fig 4A). We find that N97 is important for recognizing S5, while F92 appears to change its interaction pattern in S6. Interestingly, the N97I mutant has been shown to lead to a reorientation of the hydrophobic domain [3], while the N97S mutant causes LQTS [67]. The latter has specifically been shown to affect the activation of the voltage-gated ion channel KCNQ1, likely due to the interactions between CaM and the voltage-sensor domain during channel activation [69,70]. The estimated importance of F92 is likely due to its position within the hydrophobic pocket and interaction with TFP (Fig 3A). N97, on the other hand, is on a loop distant from the hydrophobic pocket, and instead coordinates one of the bound Ca2+ ions (Fig 1B). To assess the state-dependence of residue interactions within this Ca2+-binding site, we computed the distribution of distances between N97 and its interacting residue E104 for each state (Fig 4C). In the S5 state, the residues remain stably close to each other, a characteristic of holo CaM. The flipped TFP pose in state S7, however, disrupts the ion-binding site and leads to a more apo-like conformation with a larger separation of the residues. Incidentally, TFP binding in C-CaM has been suggested to markedly reduce the Ca2+ affinity [32], possibly due to such disruptions at the ion binding site. Consistent with the higher affinity of the second Ca2+-binding site [71], however, the TFP binding modes display no noticeable disruption of its stability (D129-D133, S12 Fig).

Flipping of the TFP molecule affects the hydrophobic pocket and is associated with a changed β-sheet structure content

As mentioned above, the process of binding and unbinding Ca2+ is intrinsically associated with CaM conformational changes. In the presence of ions, the lobe undergoes a ‘repacking’ of hydrophobic residues which is characterized by the movement of the F92 residue. This repacking results in a stacking of four aromatic residues (S13 Fig) [13]. Motivated by the findings from the state-dependent RF importance profiles, we therefore investigated how the different TFP binding-modes affect the hydrophobic pocket. To characterize this, we studied the interactions between F92, M124 and F141, three residues from different helices whose sidechains extend into the pocket (Fig 1E). In fact, M124 belongs, together with M109, M144 and M145, to the set of highly conserved methionines [64] which often participate in target-protein binding [11,12].

Fig 5A shows the hydrophobic packing along two inter-residue distances: the F92-F141 and F92-M124 distances. In contrast to the other two states, the slotting of the C10 atom with its aromatic ring in S7 prevents the otherwise advantageous stacking interactions of F92 with the other aromatic residues (Fig 3A). This leads to the adoption of a conformation reminiscent of the experimental apo-CaM structure (Fig 5A). However, the observed binding mode-dependent effects are local. As such, the packing of the methionines lining the pocket is left unaffected (S14 Fig). Viewing S7 as a state with local apo-like structural features is consistent with the findings that the transition from apo- to holo-like conformation is mainly characterized by the repacking of aromatic residues rather than changed interactions between the methionines [13,72]. In summary, our results suggest that the F92, F141 and N97 residues play particularly significant roles in the coupling between the hydrophobic domain and Ca2+-site, and that the orientation of TFP controls the switch between these apo- and holo-like interactions.
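The two-dimensional distance distributions described here are histograms in which each frame contributes its MSM weight rather than a unit count. A minimal sketch of such a weighted histogram is given below, assuming precomputed per-frame distances and weights; the function name, bin edges and values are hypothetical and serve only to illustrate the bookkeeping:

```python
import bisect


def weighted_histogram2d(x, y, weights, x_edges, y_edges):
    """Accumulate frame weights on a 2D grid and normalize to unit total.

    Bins are half-open, [edge_i, edge_i+1); samples outside the grid are skipped.
    Rows are indexed by the x bin and columns by the y bin.
    """
    n_x, n_y = len(x_edges) - 1, len(y_edges) - 1
    hist = [[0.0] * n_y for _ in range(n_x)]
    total = 0.0
    for xi, yi, wi in zip(x, y, weights):
        i = bisect.bisect_right(x_edges, xi) - 1
        j = bisect.bisect_right(y_edges, yi) - 1
        if 0 <= i < n_x and 0 <= j < n_y:
            hist[i][j] += wi
            total += wi
    if total <= 0:
        raise ValueError("no weight falls inside the grid")
    return [[value / total for value in row] for row in hist]


# Hypothetical F92-F141 and F92-M124 distances (nm) for three frames.
d_f92_f141 = [0.5, 0.6, 1.2]
d_f92_m124 = [0.5, 0.5, 1.1]
frame_weights = [1, 1, 2]
density = weighted_histogram2d(d_f92_f141, d_f92_m124, frame_weights,
                               [0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
print(density)  # [[0.5, 0.0], [0.0, 0.5]]
```

In this toy example the single heavily weighted frame in the far bin carries as much probability as the two lightly weighted frames in the near bin, which is the effect the MSM reweighting is meant to capture.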

Fig 5.

(A) Inter-residue distances between the sidechains of the F92, M124 and F141 within the S5, S6 and S7 macrostates. The two-dimensional histograms are weighted by their Markov state model probabilities. The inter-residue distances calculated from apo- (PDB: 1CFD [9]) and holo- (PDB: 1CLL [8]) calmodulin structures are shown as grey and black dots respectively. The mean inter-residue distances within MD simulations [5] of apo- and holo-CaM are shown as grey and black squares respectively. Error bars represent the standard deviations of residue distances within the MD simulations. (B) The per-residue secondary structure of C-CaM within the three populated macrostates compared to their behaviour within simulations of apo-CaM. The Y99 and N137 residues displaying enhanced sheet behaviour are illustrated as dotted lines. (C) The molecular structure of C-CaM illustrating the Y99-N137 antiparallel sheet situated adjacent to the two Ca2+ ion binding sites. The bound Ca2+ ions are shown as red spheres.

The two Ca2+ sites in C-CaM are cooperative with ion-binding at one interface resulting in up to a 10 kJ/mol enhancement in binding at the other interface [5,6]. Moreover, binding Ca2+ at the two sites yields a shift in the position of the β-sheet structure within the lobe [5]. To investigate whether the state-dependent TFP-binding can also affect the apo- and holo-like hallmarks of secondary structure content, we compared the per-residue secondary structure frequency in each state with that of the drug-unbound CaM ensemble (Fig 5B) [5].

The 98GYISA102 and 134GQVNY138 loops are adjacent to the two ion-sites and transiently form antiparallel β-sheet structures [73]. Specifically, the Y99 and N137 residues participate more frequently in sheet structures in the S5 and S6 states (Fig 5B and 5C). Recent work has shown that Y99 and N137 are important for the functional interactions between CaM and the activated-open state of KCNQ1 [70]. Together, the results provide a hypothesis on the mechanism of Calmodulin inhibition by the varying TFP binding modes. Apart from obstructing the hydrophobic pocket, in the S5 and S6 states, TFP could potentially hinder interactions between CaM and target-proteins by locking the crucial Y99 and N137 residues in a β-sheet structure. Conversely, within the S7 state, TFP functions by enabling a local switch to an apo-like state characterized by differential interactions of the aromatic F89, F92 and F141 residues.
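The per-residue comparison in Fig 5B rests on how often each residue is assigned to a sheet conformation within a state. The following sketch shows one way to compute such a frequency from per-frame secondary-structure strings, weighting frames by their MSM probabilities. It assumes a simplified one-letter assignment in which "E" denotes a strand; the function name, the code alphabet and the data are illustrative assumptions rather than the authors' implementation:

```python
def sheet_frequency(dssp_frames, weights, sheet_codes=("E",)):
    """Return the weighted fraction of frames in which each residue is in a sheet.

    dssp_frames: one string per frame, one character per residue.
    weights:     one non-negative weight per frame (for example MSM frame weights).
    """
    if not dssp_frames or len(dssp_frames) != len(weights):
        raise ValueError("need one weight per frame and at least one frame")
    n_residues = len(dssp_frames[0])
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    frequency = [0.0] * n_residues
    for codes, weight in zip(dssp_frames, weights):
        if len(codes) != n_residues:
            raise ValueError("all frames must cover the same residues")
        for residue, code in enumerate(codes):
            if code in sheet_codes:
                frequency[residue] += weight / total
    return frequency


# Three hypothetical frames of a three-residue segment.
frames = ["CEC", "CEE", "CCC"]
frame_weights = [2, 1, 1]
print(sheet_frequency(frames, frame_weights))  # [0.0, 0.75, 0.25]
```

Subtracting the corresponding frequencies of a reference ensemble, such as the drug-unbound simulations, would then give a per-residue difference profile of the kind discussed above.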

Conclusions

Rationalizing protein-ligand interactions forms the basis of drug design. The structural heterogeneity of CaM, which forms the basis of its function, however, makes it a challenging drug target [74]. The inhibition of CaM requires a drug to bind across its conformational ensemble, including transitioning between apo- and holo-like states. To understand the molecular aspects of such an inhibition, we performed extensive unbiased and adaptively sampled MD simulations of TFP-bound (holo) CaM to build an MSM from which metastable TFP-binding modes could be extracted. The results demonstrate that TFP is a heterogeneous inhibitor which acts by blocking the different CaM binding pockets via various binding poses. The burial of the halogen moiety into the hydrophobic pocket, for example, correlates with C-CaM adopting subtle holo-like features around a Ca2+ binding site, while the ‘flipped’ TFP-binding mode is associated with apo-like features in this region. Moreover, we observed a general apo-like secondary structure content in C-CaM due to TFP binding. This, together with the observed state-dependent packing of hydrophobic residues in C-CaM, hints that TFP may prevent target-protein binding through subtle yet distinct blocking mechanisms. The different binding-poses may affect the coupling between the C-CaM β-sheet structure, a Ca2+-binding site and the packing of aromatic residues.

An extension of the current work may address the construction of MSMs for understanding TFP binding to N-CaM. Since N-CaM is less dynamic than C-CaM [16], a comparison can shed further light on the inhibitory mechanism of TFP. Nonetheless, the notion that TFP uses different binding poses to block CaM is in line with the conformational heterogeneity and binding-promiscuity of CaM. This work thus deepens our understanding of how one drug can inhibit such highly flexible binding pockets. The results presented here may serve as a stepping-stone towards a full understanding of the pharmacological modulation of CaM, with implications for signaling pathways and associated diseases.

Supporting information

S1 Fig. Calmodulin states 2, 5 and 6 obtained from Molecular Dynamics simulations of the 3CLN structure that were used as initial configurations for the REST simulations.

The bound Ca2+ ions included in the simulations are shown as red spheres.

(TIFF)

S2 Fig. Efficiency of the Replica-Exchange with Solute Tempering (REST) simulations initiated from the three CaM states assessed by the energy overlap and mean exchange acceptance probability between adjacent replicas.

(TIFF)

S3 Fig. VAMP2 scores of the 10 slowest processes for different feature transformations calculated at a variety of lag times τ.

The mean values from five-fold cross-validation are plotted as bars and error bars represent the standard deviations. The mean value across the lag times calculated for each feature transformation is given in the legend.

(TIFF)

S4 Fig. The optimization of MSM hyperparameters through the calculation of GMRQ scores for different feature transformations at varying numbers of microstates and processes.

The mean values from five-fold cross-validation are plotted as dots and standard deviations are plotted as error bars.

(TIFF)

S5 Fig. Top 8 eigenvalues of the transition probability matrix calculated at varying lag-times to identify a memoryless Markovian time.

The 95% confidence intervals of the eigenvalues are shown as shaded regions. The black solid curve delimits a shaded region where the implied timescales are shorter than the lagtime.

(TIFF)
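For orientation, the eigenvalues of a transition matrix estimated at lag time τ relate to implied timescales through the standard expression t_i = -τ / ln|λ_i|. A minimal sketch of this conversion follows; the eigenvalues and lag time are made-up numbers, and the function name is hypothetical:

```python
import math


def implied_timescales(eigenvalues, lag_time):
    """Convert transition-matrix eigenvalues into implied timescales.

    Uses t_i = -lag_time / ln|lambda_i|. The stationary eigenvalue (|lambda| = 1)
    and zero eigenvalues are skipped because their timescales are not finite
    and positive.
    """
    return [-lag_time / math.log(abs(ev)) for ev in eigenvalues if 0 < abs(ev) < 1]


# Hypothetical eigenvalues at a lag time of 10 (arbitrary time units).
eigenvalues = [1.0, math.exp(-0.1), math.exp(-0.5)]
timescales = implied_timescales(eigenvalues, lag_time=10)
print([round(t, 6) for t in timescales])  # [100.0, 20.0]
```

A lag time is usually considered acceptable once these timescales stop changing appreciably as the lag time increases, and processes whose timescale falls below the lag time itself cannot be resolved by the model.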

S6 Fig. Spectral analysis of the eigenvalues to identify the number of clusters.

The cutoff between the sixth and seventh relaxation timescales selected in this work is illustrated as a red dotted line.

(TIFF)

S7 Fig. Top: Projection of the Markov state model free-energy surface along the three slowest time-lagged independent components (tICs).

Bottom: S1-S7 macrostate assignments along the three tICs obtained from PCCA++ clustering. Each trajectory frame is represented as a dot within the scatter plot.

(TIFF)

S8 Fig. Separation of the S5, S6 and S7 macrostates along the 10 slowest time-lagged independent components (tICs) used for Markov state model construction.

(TIFF)

S9 Fig. Transitions between the macrostate basins analyzed by projecting the initial and final configurations of individual trajectories onto the free-energy surface.

Individual trajectories are represented as subplots with the initial and final configuration shown as black and white dots respectively. Trajectories with transitions between the basins are highlighted with a magenta outline.

(TIFF)

S10 Fig. Chapman-Kolmogorov test validating the Markov state model by comparing the probabilities of transitioning between the macrostates (blue line) and the calculated probabilities from the constructed model (black line).

(TIFF)
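The idea behind a Chapman-Kolmogorov test is that a Markovian model estimated at lag time τ should predict the transition probabilities observed at kτ, that is T(kτ) ≈ T(τ)^k. The sketch below illustrates this comparison for small matrices in plain Python; the matrices are invented for illustration and the helper names are hypothetical:

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_power(matrix, k):
    """Raise a square matrix to the non-negative integer power k."""
    n = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, matrix)
    return result


def ck_max_deviation(t_tau, t_ktau, k):
    """Largest absolute difference between T(tau)^k and the estimate at k*tau."""
    predicted = mat_power(t_tau, k)
    return max(abs(p - e)
               for row_p, row_e in zip(predicted, t_ktau)
               for p, e in zip(row_p, row_e))


# Hypothetical two-state model at lag tau and an estimate at 2*tau.
t_tau = [[0.9, 0.1], [0.2, 0.8]]
t_2tau_estimated = [[0.80, 0.20], [0.34, 0.66]]
print(round(ck_max_deviation(t_tau, t_2tau_estimated, 2), 6))  # 0.03
```

Small deviations across several multiples of the lag time indicate that the model reproduces the observed kinetics within the sampled timescales.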

S11 Fig. The per-residue importance in discerning the S5, S6 and S7 macrostates calculated using the supervised KL Divergence method.

Plots represent the mean values calculated from five-fold cross-validation and the standard deviations are plotted as error bars. Physiologically important residues and those with high importance values are illustrated using red dotted lines.

(TIFF)

S12 Fig. Distance between center-of-mass of the D129 and D133 acidic residues making up the second Ca2+ binding site within the S5, S6 and S7 macrostates.

Each violinplot spans the 5th and 95th percentile of distances and is weighted by the Markov state model probabilities. The median value for each macrostate is represented as a white dot. The inter-residue distances calculated from apo- (PDB: 1CFD) and holo- (PDB: 1CLL) calmodulin structures are shown as grey and black dotted lines respectively.

(TIFF)

S13 Fig. Transition in the stacking of aromatic residues between the (A) apo- (PDB: 1CFD) and (B) holo- (PDB: 1CLL) states of calmodulin induced by the binding of Ca2+ ions.

(TIFF)

S14 Fig. Distances between the sidechains of the M109, M124 and M144 Methionine residues making up the hydrophobic binding pocket within the S5, S6 and S7 macrostates.

Each violinplot spans the 5th and 95th percentile of distances and is weighted by the Markov state model probabilities. The median value for each macrostate is represented as a white dot. The inter-residue distances calculated from apo- (PDB: 1CFD) and holo- (PDB: 1CLL) calmodulin structures are shown as grey and black dotted lines respectively.

(TIFF)

Acknowledgments

The MD simulations were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) on Beskow at the PDC Center for High Performance Computing (PDC-HPC). We further wish to thank Magnus Lundborg for assisting in ligand parameterization.

Data Availability

Code (Jupyter notebook) and input data needed to reproduce the results presented herein can be found on Zenodo with DOI 10.5281/zenodo.7045222.

Funding Statement

AMW was funded by a Swedish e-research center (SeRC) COVID-19 transition grant. AS was funded by Marie-Sklodowska Curie Fellowship Lipopeutics (Grant Number 898762). LDe acknowledges the Science for Life Laboratory (SciLifeLab), the Göran Gustafsson foundation and the Swedish Research Council (VR 2018-04905 and 2019-02433) for funding. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: New perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017;45. doi: 10.1093/nar/gkw1092
  • 2. Berridge MJ, Bootman MD, Roderick HL. Calcium signalling: Dynamics, homeostasis and remodelling. Nature Reviews Molecular Cell Biology. 2003. doi: 10.1038/nrm1155
  • 3. Tidow H, Nissen P. Structural diversity of calmodulin binding to its target sites. FEBS Journal. 2013. doi: 10.1111/febs.12296
  • 4. Mahling R, Rahlf CR, Hansen SC, Hayden MR, Shea MA. Ca2+-saturated calmodulin binds tightly to the N-terminal domain of A-type fibroblast growth factor homologous factors. Journal of Biological Chemistry. 2021;296. doi: 10.1016/j.jbc.2021.100458
  • 5. Westerlund AM, Delemotte L. Effect of Ca2+ on the promiscuous target-protein binding of calmodulin. PLoS Comput Biol. 2018;14. doi: 10.1371/journal.pcbi.1006072
  • 6. Smith DMA, Straatsma TP, Squier TC. Retention of conformational entropy upon calmodulin binding to target peptides is driven by transient salt bridges. Biophys J. 2012;103. doi: 10.1016/j.bpj.2012.08.037
  • 7. Villarroel A, Taglialatela M, Bernardo-Seisdedos G, Alaimo A, Agirre J, Alberdi A, et al. The ever changing moods of calmodulin: How structural plasticity entails transductional adaptability. Journal of Molecular Biology. 2014. doi: 10.1016/j.jmb.2014.05.016
  • 8. Chattopadhyaya R, Meador WE, Means AR, Quiocho FA. Calmodulin structure refined at 1.7 Å resolution. J Mol Biol. 1992;228. doi: 10.1016/0022-2836(92)90324-D
  • 9. Kuboniwa H, Tjandra N, Grzesiek S, Ren H, Klee CB, Bax A. Solution structure of calcium-free calmodulin. Nat Struct Biol. 1995;2. doi: 10.1038/nsb0995-768
  • 10. Linse S, Helmersson A, Forsen S. Calcium binding to calmodulin and its globular domains. Journal of Biological Chemistry. 1991;266. doi: 10.1016/s0021-9258(18)92938-8
  • 11. Siivari K, Zhang M, Palmer AG, Vogel HJ. NMR studies of the methionine methyl groups in calmodulin. FEBS Lett. 1995;366. doi: 10.1016/0014-5793(95)00504-3
  • 12. Gopalakrishna R, Anderson WB. The effects of chemical modification of calmodulin on Ca2+-induced exposure of a hydrophobic region. Separation of active and inactive forms of calmodulin. BBA—Molecular Cell Research. 1985;844. doi: 10.1016/0167-4889(85)90099-0
  • 13. Shukla D, Peck A, Pande VS. Conformational heterogeneity of the calmodulin binding interface. Nat Commun. 2016;7. doi: 10.1038/ncomms10910
  • 14. Westerlund AM, Delemotte L. Inflecs: Clustering free energy landscapes with gaussian mixtures. J Chem Theory Comput. 2019. doi: 10.1021/acs.jctc.9b00454
  • 15. Jeon J, Yau WM, Tycko R. Millisecond Time-Resolved Solid-State NMR Reveals a Two-Stage Molecular Mechanism for Formation of Complexes between Calmodulin and a Target Peptide from Myosin Light Chain Kinase. J Am Chem Soc. 2020;142. doi: 10.1021/jacs.0c11156
  • 16. Tjandra N, Kuboniwa H, Ren H, Bax A. Rotational Dynamics of Calcium-Free Calmodulin Studied by 15N-NMR Relaxation Measurements. Eur J Biochem. 1995;230. doi: 10.1111/j.1432-1033.1995.tb20650.x
  • 17. Westerlund AM, Harpole TJ, Blau C, Delemotte L. Inference of Calmodulin’s Ca2+-Dependent Free Energy Landscapes via Gaussian Mixture Model Validation. J Chem Theory Comput. 2018;14. doi: 10.1021/acs.jctc.7b00346
  • 18. Lucchesi PA, Scheid CR. Effects of the anti-calmodulin drugs calmidazolium and trifluoperazine on 45Ca transport in plasmalemmal vesicles from gastric smooth muscle. Cell Calcium. 1988;9. doi: 10.1016/0143-4160(88)90028-0
  • 19. Weiss B, Prozialeck WC, Wallace TL. Interaction of drugs with calmodulin. Biochemical, pharmacological and clinical implications. Biochem Pharmacol. 1982;31. doi: 10.1016/0006-2952(82)90104-6
  • 20. Manoharan GB, Okutachi S, Abankwa D. Potential of phenothiazines to synergistically block calmodulin and reactivate PP2A in cancer cells. PLoS One. 2022;17: e0268635. doi: 10.1371/journal.pone.0268635
  • 21. Weiss B, Prozialeck W, Cimino M, Sellinger Barnette M, Wallace TL. PHARMACOLOGICAL REGULATION OF CALMODULIN. Ann N Y Acad Sci. 1980;356. doi: 10.1111/j.1749-6632.1980.tb29621.x
  • 22. Cook WJ, Walter LJ, Walter MR. Drug Binding by Calmodulin: Crystal Structure of a Calmodulin-Trifluoperazine Complex. Biochemistry. 1994;33. doi: 10.1021/bi00255a006
  • 23. Vandonselaar M, Hickie RA, Quail JW, Delbaere LTJ. Trifluoperazine-induced conformational change in Ca2+-calmodulin. Nat Struct Biol. 1994;1. doi: 10.1038/nsb1194-795
  • 24. Vertessy BG, Harmat V, Böcskei Z, Náray-Szabó G, Orosz F, Ovádi J. Simultaneous binding of drugs with different chemical structures to Ca2+-calmodulin: Crystallographic and spectroscopic studies. Biochemistry. 1998;37. doi: 10.1021/bi980795a
  • 25. Feldkamp MD, Gakhar L, Pandey N, Shea MA. Opposing orientations of the anti-psychotic drug trifluoperazine selected by alternate conformations of M144 in calmodulin. Proteins: Structure, Function and Bioinformatics. 2015;83. doi: 10.1002/prot.24781
  • 26. Matsushima N, Hayashi N, Jinbo Y, Izumi Y. Ca2+-bound calmodulin forms a compact globular structure on binding four trifluoperazine molecules in solution. Biochemical Journal. 2000;347. doi: 10.1042/0264-6021:3470211
  • 27. Osawa M, Kuwamoto S, Izumi Y, Yap KL, Ikura M, Shibanuma T, et al. Evidence for calmodulin inter-domain compaction in solution induced by W-7 binding. FEBS Lett. 1999;442. doi: 10.1016/s0014-5793(98)01637-8
  • 28. Matsushima N, Hayashi N, Watanabe N, Jinbo Y, Izumi Y. Binding of trifluoperazine to apocalmodulin revealed by a combination of small-angle X-ray scattering and nuclear magnetic resonance. J Appl Crystallogr. 2007;40. doi: 10.1107/S0021889807002117
  • 29. Marshak DR, Lukas TJ, Watterson DM. Drug-Protein Interactions: Binding of Chlorpromazine to Calmodulin, Calmodulin Fragments, and Related Calcium Binding Proteins. Biochemistry. 1985;24. doi: 10.1021/bi00322a020
  • 30. Pande VS, Beauchamp K, Bowman GR. Everything you wanted to know about Markov State Models but were afraid to ask. Methods. 2010;52: 99–105. doi: 10.1016/j.ymeth.2010.06.002
  • 31. Chodera JD, Noé F. Markov state models of biomolecular conformational dynamics. Curr Opin Struct Biol. 2014;25: 135–144. doi: 10.1016/j.sbi.2014.04.002
  • 32. Feldkamp MD, O’Donnell SE, Yu L, Shea MA. Allosteric effects of the antipsychotic drug trifluoperazine on the energetics of calcium binding by calmodulin. Proteins: Structure, Function and Bioinformatics. 2010;78. doi: 10.1002/prot.22739
  • 33. Babu YS, Bugg CE, Cook WJ. Structure of calmodulin refined at 2.2 Å resolution. J Mol Biol. 1988;204. doi: 10.1016/0022-2836(88)90608-0
  • 34. Berendsen HJC, Postma JPM, van Gunsteren WF, DiNola A, Haak JR. Molecular dynamics with coupling to an external bath. Journal of Chemical Physics. 1984;81: 3684. doi: 10.1063/1.448118
  • 35. Wang L, Friesner RA, Berne BJ. Replica exchange with solute scaling: A more efficient version of replica exchange with solute tempering (REST2). Journal of Physical Chemistry B. 2011;115: 9431–9438. doi: 10.1021/jp204407d
  • 36. Bussi G. Hamiltonian replica exchange in GROMACS: A flexible implementation. Mol Phys. 2014;112: 379–384. doi: 10.1080/00268976.2013.824126
  • 37. Patriksson A, van der Spoel D. A temperature predictor for parallel tempering simulations. Physical Chemistry Chemical Physics. 2008;10: 2073–2077. doi: 10.1039/b716554d
  • 38. Weber JK, Pande VS. Characterization and rapid sampling of protein folding Markov state model topologies. J Chem Theory Comput. 2011;7. doi: 10.1021/ct2004484
  • 39. Zimmerman MI, Porter JR, Sun X, Silva RR, Bowman GR. Choice of Adaptive Sampling Strategy Impacts State Discovery, Transition Probabilities, and the Apparent Mechanism of Conformational Changes. J Chem Theory Comput. 2018;14. doi: 10.1021/acs.jctc.8b00500
  • 40. Abraham MJ, Murtola T, Schulz R, Pall S, Smith JC, Hess B, et al. Gromacs: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1: 19–25. doi: 10.1016/j.softx.2015.06.001
  • 41. Bonomi M, Branduardi D, Bussi G, Camilloni C, Provasi D, Raiteri P, et al. PLUMED: A portable plugin for free-energy calculations with molecular dynamics. Comput Phys Commun. 2009;180: 1961–1972. doi: 10.1016/j.cpc.2009.05.011
  • 42. Tribello GA, Bonomi M, Branduardi D, Camilloni C, Bussi G. PLUMED 2: New feathers for an old bird. Comput Phys Commun. 2014;185: 604–613. doi: 10.1016/j.cpc.2013.09.018
  • 43. Darden T, York D, Pedersen L. Particle mesh Ewald: An N·log(N) method for Ewald sums in large systems. Journal of Chemical Physics. 1993;98: 10089. doi: 10.1063/1.464397
  • 44. Hess B. P-LINCS: A parallel linear constraint solver for molecular simulation. J Chem Theory Comput. 2008;4: 116–122. doi: 10.1021/ct700200b
  • 45. Parrinello M, Rahman A. Polymorphic transitions in single crystals: A new molecular dynamics method. J Appl Phys. 1981;52: 7182–7190. doi: 10.1063/1.328693
  • 46. Hoover WG, Holian BL. Kinetic moments method for the canonical ensemble distribution. Physics Letters, Section A: General, Atomic and Solid State Physics. 1996;211. doi: 10.1016/0375-9601(95)00973-6
  • 47. Huang J, Mackerell AD. CHARMM36 all-atom additive protein force field: Validation based on comparison to NMR data. J Comput Chem. 2013;34. doi: 10.1002/jcc.23354
  • 48. Liao J, Marinelli F, Lee C, Huang Y, Faraldo-Gómez JD, Jiang Y. Mechanism of extracellular ion exchange and binding-site occlusion in a sodium/calcium exchanger. Nat Struct Mol Biol. 2016;23. doi: 10.1038/nsmb.3230
  • 49. Lundborg M, Lindahl E. Automatic GROMACS topology generation and comparisons of force fields for solvation free energy calculations. J Phys Chem B. 2015;119: 810–23. doi: 10.1021/jp505332p
  • 50. Noé F, Clementi C. Kinetic Distance and Kinetic Maps from Molecular Dynamics Simulation. J Chem Theory Comput. 2015;11. doi: 10.1021/acs.jctc.5b00553
  • 51. Wu H, Noé F. Variational Approach for Learning Markov Processes from Time Series Data. J Nonlinear Sci. 2020;30. doi: 10.1007/s00332-019-09567-y
  • 52. Pérez-Hernández G, Paul F, Giorgino T, de Fabritiis G, Noé F. Identification of slow molecular order parameters for Markov model construction. Journal of Chemical Physics. 2013;139. doi: 10.1063/1.4811489
  • 53. McGibbon RT, Beauchamp KA, Harrigan MP, Klein C, Swails JM, Hernández CX, et al. MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophys J. 2015;109: 1528–1532. doi: 10.1016/j.bpj.2015.08.015
  • 54. Harrigan MP, Sultan MM, Hernández CX, Husic BE, Eastman P, Schwantes CR, et al. MSMBuilder: Statistical Models for Biomolecular Dynamics. Biophys J. 2017;112. doi: 10.1016/j.bpj.2016.10.042
  • 55. Scherer MK, Trendelkamp-Schroer B, Paul F, Pérez-Hernández G, Hoffmann M, Plattner N, et al. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J Chem Theory Comput. 2015;11: 5525–5542. doi: 10.1021/acs.jctc.5b00743
  • 56. Swope WC, Pitera JW, Suits F. Describing protein folding kinetics by molecular dynamics simulations. 1. Theory. Journal of Physical Chemistry B. 2004;108. doi: 10.1021/jp037421y
  • 57. Swope WC, Pitera JW, Suits F, Pitman M, Eleftheriou M, Fitch BG, et al. Describing protein folding kinetics by molecular dynamics simulations. 2. Example applications to alanine dipeptide and a β-hairpin peptide. Journal of Physical Chemistry B. 2004;108. doi: 10.1021/jp037422q
  • 58. Deuflhard P, Weber M. Robust Perron cluster analysis in conformation dynamics. Linear Algebra Appl. 2005;398. doi: 10.1016/j.laa.2004.10.026
  • 59. Röblitz S, Weber M. Fuzzy spectral clustering by PCCA+: Application to Markov state models and data classification. Adv Data Anal Classif. 2013;7. doi: 10.1007/s11634-013-0134-6
  • 60. Fleetwood O, Kasimova MA, Westerlund AM, Delemotte L. Molecular Insights from Conformational Ensembles via Machine Learning. Biophys J. 2020;118. doi: 10.1016/j.bpj.2019.12.016
  • 61. Pedregosa F, Michel V, Grisel O, Blondel M, Prettenhofer P, Weiss R, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 2011;12.
  • 62. Breiman L. Random forests. Mach Learn. 2001;45. doi: 10.1023/A:1010933404324
  • 63. Polizzi NF, Therien MJ, Beratan DN. Mean First-Passage Times in Biology. Israel Journal of Chemistry. 2016. doi: 10.1002/ijch.201600040
  • 64. Halling DB, Liebeskind BJ, Hall AW, Aldrich RW. Conserved properties of individual Ca2+-binding sites in calmodulin. Proc Natl Acad Sci U S A. 2016;113. doi: 10.1073/pnas.1600385113
  • 65. Wang K, Holt C, Lu J, Brohus M, Larsen KT, Overgaard MT, et al. Arrhythmia mutations in calmodulin cause conformational changes that affect interactions with the cardiac voltage-gated calcium channel. Proc Natl Acad Sci U S A. 2018;115. doi: 10.1073/pnas.1808733115
  • 66. Crotti L, Johnson CN, Graf E, de Ferrari GM, Cuneo BF, Ovadia M, et al. Calmodulin mutations associated with recurrent cardiac arrest in infants. Circulation. 2013;127. doi: 10.1161/CIRCULATIONAHA.112.001216
  • 67. Limpitikul WB, Dick IE, Joshi-Mukherjee R, Overgaard MT, George AL, Yue DT. Calmodulin mutations associated with long QT syndrome prevent inactivation of cardiac L-type Ca2+ currents and promote proarrhythmic behavior in ventricular myocytes. J Mol Cell Cardiol. 2014;74. doi: 10.1016/j.yjmcc.2014.04.022
  • 68. Shimoyama H, Takeda-Shitaka M. Residue-residue interactions regulating the Ca2+-induced EF-hand conformation changes in calmodulin. J Biochem. 2017;162. doi: 10.1093/jb/mvx025
  • 69. Sun J, MacKinnon R. Cryo-EM Structure of a KCNQ1/CaM Complex Reveals Insights into Congenital Long QT Syndrome. Cell. 2017;169. doi: 10.1016/j.cell.2017.05.019
  • 70. Kang PW, Westerlund AM, Shi J, White KMF, Dou AK, Cui AH, et al. Calmodulin acts as a state-dependent switch to control a cardiac potassium channel opening. Sci Adv. 2020;6. doi: 10.1126/sciadv.abd6798
  • 71. Malmendal A, Evenäs J, Forsén S, Akke M. Structural dynamics in the C-terminal domain of calmodulin at low calcium levels. J Mol Biol. 1999;293. doi: 10.1006/jmbi.1999.3188
  • 72. Kelly KL, Dalton SR, Wai RB, Ramchandani K, Xu RJ, Linse S, et al. Conformational Ensembles of Calmodulin Revealed by Nonperturbing Site-Specific Vibrational Probe Groups. Journal of Physical Chemistry A. 2018;122. doi: 10.1021/acs.jpca.8b00475
  • 73. Finn BE, Forsén S. The evolving model of calmodulin structure, function and activation. Structure. 1995;3. doi: 10.1016/s0969-2126(01)00130-7
  • 74. Menyhard D, Keseru G, Naray-Szabo G. Calmodulin in Complex with Proteins and Small Molecule Ligands: Operating with the Element of Surprise; Implications for Structure-Based Drug Design. Current Computer Aided-Drug Design. 2009;5. doi: 10.2174/157340909789577874
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1010583.r001

Decision Letter 0

Nir Ben-Tal, Guanghong Wei

12 Jul 2022

Dear Dr Delemotte,

Thank you very much for submitting your manuscript "Markov State Modeling Reveals Heterogeneous Drug-Inhibition Mechanism of Calmodulin" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Guanghong Wei

Associate Editor

PLOS Computational Biology

Nir Ben-Tal

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors combined molecular dynamics simulations, markov state model and machine learning strategies to investigate the structural dynamics of CaM exerted by the binding of one known drug TFP, which prevents the downstream protein associations. The authors claimed that the TFP binding to CaM can put an influence on its C-terminal region and induce it into a Ca2+-free conformation. This current manuscript provides some structural insights into the regulatory roles of TFP in stabilizing CaM. Despite that, many technical flaws should be fixed and more structural analyses should be provided in the revised version before the final acceptance. See below:

1. To validate the MSM, I suggest that the authors perform GMRQ analyses to select the MSM hyperparameters, including the tICA lag-time, the number of microstates, the number of tICs, etc. Relatedly, the implied timescale curves shown in Fig. S2 have not leveled off well; I suggest that the authors increase the lag-time further to check whether the curves converge, or try alternative models.

2. In Fig. 2 and Fig. 3, the errors of the stationary distribution and MFPTs are not provided; this should be fixed in the revised manuscript.

3. From Fig. 1D, TFP can potentially bind to CaM in different pockets and at varied TFP/CaM ratios. The authors should provide more details regarding the system setup. For example, how many TFP molecules were included in the system? How was the TFP molecule parametrized? A structural model of the complete system is recommended.

4. In Fig. 4, the authors highlight several critical residues in CaM, but provide no detailed structural analyses to support their conclusions. I suggest that the authors conduct more structural analyses and discuss them in relation to their ML results.

5. The authors attempted to compare the TFP-bound CaM with its Ca2+-bound/unbound state; however, neither Ca2+-bound nor Ca2+-unbound CaM has been simulated in the current work. I suggest that the authors perform several additional MD simulations of these systems in order to make a valid comparison.

Reviewer #2: Overall, the authors detail the various mechanisms of TFP interactions with Calmodulin (CaM) by performing molecular dynamics simulations and adaptively sampling the conformational landscape of the CaM-TFP complex. The authors aim to explain the mechanism of TFP inhibition, and use extensive molecular dynamics simulations to do so. The authors use guided machine-learning and quantitative approaches to explain the binding modes of TFP to CaM, which lends further validity to the results. They aim to show that hydrophobic rearrangements of the binding pocket are responsible for the inhibition of CaM. It is an interesting work and I have a long list of minor edits. Further review is not needed.

Introduction:

Paragraph 4: Inconsistent terminology is used – intra-lobe is referred to as within-lobe and vice-versa.

Paragraph 5: ‘MD Simulations may be used?’ Authors could change this line to “MD simulations are routinely used”.

Methods:

Calmodulin-TFP system preparation and equilibration:

Was the N-Terminal Domain present in the simulations, but the TFP only bound to the C-Terminal domain? The manuscript does not make this clear.

The authors say ‘TFP was placed in prominent states’ – can they make the said ‘placement’ clear? What are states 2, 5 and 6 from prior work? Can the authors justify their choice? From how many total states were these three chosen? Were these simulations in the presence of calcium ions? Please make the methodology more clear.

Replica Exchange solute tempering and adaptive sampling simulations:

The authors perform Replica Exchange with Solute Tempering (REST) simulations. However, the rationale behind these simulations is not sufficiently motivated. Why are the authors performing these simulations? A mention of the method in the introduction with some relevant background is needed.

The conformations were chosen uniformly over the grid – what were the criteria for ensuring uniformity?

Could the authors provide more details into the data collected per round of sampling?

Selecting distance-based features and projecting the data onto slow degrees of freedom

“The minimum distances between all C-CaM residues and five TFP atoms distributed across the ligand’s structure (C25, C24, C10, S, C16) were calculated.” Why were these five TFP atoms in particular chosen?

Is the S in (C25, C24, C10, S, C16) the same as SC4 in Fig 1C? If yes, it is used inconsistently.

Fig S1: What do the values in the legend 0.5, 0.6 indicate? Can the authors make the term ‘feature type’ clear?

Construction of a Markov State Model

‘Results were stable’ – can the authors provide a quantitative rather than a qualitative justification?

Fig S2: Have the authors computed the errors for the implied timescales?

Identification of important residues using machine learning

The cutoffs for the MSM construction (6Å) differ from the cutoff used for identification (6.5Å) – can the authors justify this decision?

Results and discussion:

Validation of Markov state model

In the Chapman-Kolmogorov test, the S2-S2 conversion diverges significantly between the predicted and the estimated probabilities.

Why were tIC1 and tIC3 chosen as the slowest metrics? Why not tIC2?

The metastable states represent different binding modes

The slowest process, corresponding to tIC1, shows significant separation between macrostates S5 and S6 in Fig 2B, but these two macrostates represent similar binding modes. Why?

The authors mention an orange circle and a cyan circle in Fig 3, but they seem to be missing.

Is there evidence of the TFP interconverting through the different binding modes?

TFP Binding modes alter the C-CaM binding pocket by affecting local interactions at a Ca2+ binding site

The C_α RMSD of 1.71Å is unclear – is it between S5-S6, S6-S7 or S5-S7?

Can the authors give a more qualitative explanation of the y-axis in Fig 4A – what does RF importance signify? How does RF importance correlate to the dynamics of the protein macrostates S(5-7)?

Flipping of the TFP molecule affects the hydrophobic pocket and is associated with a changed beta-sheet structure content

Have the authors performed simulations in the presence of Ca2+ to explore the reduced Ca2+ binding affinity in the presence of TFP?

Reviewer #3: In this manuscript, the authors have constructed Markov State Models (MSMs) from molecular dynamics (MD) simulations to study the binding of trifluoperazine to calmodulin. They identified important protein residues that are associated with several drug binding poses. Interestingly, they find that upon binding, trifluoperazine can stabilize both apo and holo-like calmodulin conformations, depending on the binding pose. Overall, this is a rigorous study, and the manuscript is also well-written. Their results provide new insights into the understanding of the calmodulin inhibition and may facilitate the drug development in the long term. Therefore, I would like to recommend its publications after minor revision (see my comments below):

1. To obtain the input features for the tICA analysis, the authors simply chose all the pair-wise distances between protein residues and five TFP atoms within a cut-off (6 Å, selected by the VAMP2 score). I am wondering whether the application of Spectral-OASIS, SRVs or other methods (see discussions in JACS Au, 1(9), 1330, (2021)) can help further refine the input feature set. More interestingly, can these methods (Spectral-OASIS, SRVs, etc.) identify the same set of important residues (e.g., distance features between these important residues and TFP) as those obtained from their explainable AI algorithm after the MSM construction?

2. The current discussion of binding poses is very detailed in terms of the importance of protein residues but lacking in terms of their inhibitory effect. Specifically, it is not clear how the binding pose inducing the holo-like conformation contributes to the inhibition of calmodulin. I suggest that the authors expand their discussion of the binding poses with respect to the subsequent inhibitory effect, to benefit a general audience.

3. It will be helpful to include a SI figure to display the efficiency and convergence of their REST simulations (e.g., acceptance probability, coverage of the replica space, etc).

4. Fig S1: it is not obvious to me that “the feature type given by a 6 Å cutoff and inverse distances yielded the highest mean score across lag-times”. Could the authors clarify this point?

5. The choice of hyperparameters, including tICA lag time, number of tICs, and number of microstates, can all impact the quality of an MSM. These hyperparameters can also be optimized using cross-validation based on the VAMP2 score or GMRQ. The authors may consider optimizing some of these parameters or include some discussion of this point. In particular, regarding the choice of the number of microstates, could the authors clarify their criterion: “until the results described herein were stable”?

6. Fig. S2: it’s not obvious to me that the implied timescales are fully converged. These implied timescale plots still display noticeable deviations from being flat even on the logarithm scale. Can the authors rephrase their claim of convergence and explain their choice of a relatively short MSM lag-time of 15ns? I notice that their 7-macrostate MSM (Fig. S6) is well validated by the Chapman-Kolmogorov test. Maybe adding the Chapman-Kolmogorov test of the 200-microstate MSM (on the residence probabilities of the top populated microstates) can help justify their choice of the Markovian lag time?
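
The two diagnostics raised in comments 5 and 6, convergence of the implied timescales and the Chapman-Kolmogorov test on residence probabilities, can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the 3-state transition matrix, the function names and the lag values are hypothetical and are not taken from the manuscript or its accompanying notebook. The relation used is t_i = -τ / ln|λ_i(τ)|, and the test compares T(τ)^k with a matrix estimated directly at lag kτ.

```python
import numpy as np

def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln|lambda_i| of a transition matrix T.

    The largest eigenvalue (1.0, the stationary process) is skipped.
    """
    eigvals = np.linalg.eigvals(T)
    mags = np.sort(np.abs(eigvals))[::-1]  # largest magnitude first
    return -lag / np.log(mags[1:])

def ck_residence_error(T_lag, T_klag, k):
    """Chapman-Kolmogorov check on residence (diagonal) probabilities.

    Returns the largest absolute difference between the model prediction
    T(lag)^k and the matrix estimated directly at lag k*lag.
    """
    predicted = np.linalg.matrix_power(T_lag, k)
    return np.max(np.abs(np.diag(predicted) - np.diag(T_klag)))

# Hypothetical 3-state transition matrix at a lag of 15 (arbitrary units).
T15 = np.array([[0.90, 0.08, 0.02],
                [0.05, 0.90, 0.05],
                [0.02, 0.08, 0.90]])

# Stand-in for a matrix estimated at lag 30. It is built here as T15 @ T15,
# i.e. an exactly Markovian toy case; real data would be re-estimated.
T30 = T15 @ T15

its_15 = implied_timescales(T15, 15.0)
its_30 = implied_timescales(T30, 30.0)
ck_err = ck_residence_error(T15, T30, 2)

print("implied timescales at lag 15:", its_15)
print("implied timescales at lag 30:", its_30)
print("max CK residence error:", ck_err)
```

In this toy case the implied timescales are identical at both lags and the Chapman-Kolmogorov error vanishes by construction. With matrices estimated from simulation data, a residual drift in the timescales or a non-negligible error would indicate that the chosen lag time is too short for the dynamics to be Markovian.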

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: None

Reviewer #2: No: It will be a few GB dataset after removing water so authors could consider sharing the entire MD dataset.

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1010583.r003

Decision Letter 1

Nir Ben-Tal, Guanghong Wei

18 Sep 2022

Dear Dr Delemotte,

We are pleased to inform you that your manuscript 'Markov State Modelling Reveals Heterogeneous Drug-Inhibition Mechanism of Calmodulin' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Guanghong Wei

Academic Editor

PLOS Computational Biology

Nir Ben-Tal

Section Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have addressed all of my raised issues.

Reviewer #2: Authors have addressed all the minor comments outlined in the initial review. Congratulations to the authors on this interesting work.

Reviewer #3: The authors have appropriately addressed my comments in the revised manuscript. I would like to recommend its publication.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: None

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Diwakar Shukla

Reviewer #3: No

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1010583.r004

Acceptance letter

Nir Ben-Tal, Guanghong Wei

2 Oct 2022

PCOMPBIOL-D-22-00880R1

Markov State Modelling Reveals Heterogeneous Drug-Inhibition Mechanism of Calmodulin

Dear Dr Delemotte,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsofia Freund

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Calmodulin states 2, 5 and 6 obtained from Molecular Dynamics simulations of the 3CLN structure that were used as initial configurations for the REST simulations.

    The bound Ca2+ ions included in the simulations are shown as red spheres.

    (TIFF)

    S2 Fig. Efficiency of the Replica-Exchange with Solute Tempering (REST) simulations initiated from the three CaM states assessed by the energy overlap and mean exchange acceptance probability between adjacent replicas.

    (TIFF)

    S3 Fig. VAMP2 scores of the 10 slowest processes for different feature transformations calculated at a variety of lag times τ.

    The mean values from five-fold cross-validation are plotted as bars and error bars represent the standard deviations. The mean value across the lag times calculated for each feature transformation is mentioned within the legend.

    (TIFF)

    S4 Fig. The optimization of MSM hyperparameters through the calculation of GMRQ scores for different feature transformations at varying numbers of microstates and processes.

    The mean values from five-fold cross-validation are plotted as dots and standard deviations are plotted as error bars.

    (TIFF)

    S5 Fig. Top 8 eigenvalues of the transition probability matrix calculated at varying lag-times to identify a memoryless Markovian time.

    The 95% confidence intervals of the eigenvalues are shown as shaded regions. The black solid curve delimits a shaded region where the implied timescales are shorter than the lagtime.

    (TIFF)

    S6 Fig. Spectral analysis of the eigenvalues to identify the number of clusters.

    The cutoff between the sixth and seventh relaxation timescales selected in this work is illustrated as red dotted line.

    (TIFF)

    S7 Fig. Top: Projection of the Markov state model free-energy surface along the three slowest time-lagged independent components (tICs).

    Bottom: S1-S7 macrostate assignments along the three tICs obtained from PCCA++ clustering. Each trajectory frame is represented as a dot within the scatter plot.

    (TIFF)

    S8 Fig. Separation of the S5, S6 and S7 macrostates along the 10 slowest time-lagged independent components (tICs) used for Markov state model construction.

    (TIFF)

    S9 Fig. Transitions between the macrostate basins analyzed by projecting the initial and final configurations of individual trajectories onto the free-energy surface.

    Individual trajectories are represented as subplots with the initial and final configuration shown as black and white dots respectively. Trajectories with transitions between the basins are highlighted with a magenta outline.

    (TIFF)

    S10 Fig. Chapman-Kolmogorov test validating the Markov state model by comparing the probabilities of transiting between the macrostates (blue line) and the calculated probabilities from the constructed model (black line).

    (TIFF)

    S11 Fig. The per-residue importance in discerning the S5, S6 and S7 macrostates calculated using the supervised KL Divergence method.

    Plots represent the mean values calculated from five-fold cross-validation and the standard deviations are plotted as error bars. Physiologically important residues and those with high importance values are illustrated using red dotted lines.

    (TIFF)

    S12 Fig. Distance between center-of-mass of the D129 and D133 acidic residues making up the second Ca2+ binding site within the S5, S6 and S7 macrostates.

    Each violinplot spans the 5th and 95th percentile of distances and is weighted by the Markov state model probabilities. The median value for each macrostate is represented as a white dot. The inter-residue distances calculated from apo- (PDB: 1CFD) and holo- (PDB: 1CLL) calmodulin structures are shown as grey and black dotted lines respectively.

    (TIFF)

    S13 Fig. Transition in the stacking of aromatic residues between the (A) apo- (PDB: 1CFD) and (B) holo- (PDB: 1CLL) states of calmodulin induced by the binding of Ca2+ ions.

    (TIFF)

    S14 Fig. Distances between the sidechains of the M109, M124 and M144 Methionine residues making up the hydrophobic binding pocket within the S5, S6 and S7 macrostates.

    Each violinplot spans the 5th and 95th percentile of distances and is weighted by the Markov state model probabilities. The median value for each macrostate is represented as a white dot. The inter-residue distances calculated from apo- (PDB: 1CFD) and holo- (PDB: 1CLL) calmodulin structures are shown as grey and black dotted lines respectively.

    (TIFF)

    Attachment

    Submitted filename: Reviewer_comments.pdf

    Data Availability Statement

    Code (Jupyter notebook) and input data needed to reproduce the results presented herein can be found on Zenodo with DOI 10.5281/zenodo.7045222.


    Articles from PLoS Computational Biology are provided here courtesy of PLOS
