Abstract
Proteins gain optimal fitness such as foldability and function through evolutionary selection. However, classical studies have found that evolutionarily designed protein sequences alone cannot guarantee foldability, or at least not without considering local contacts associated with the initial folding steps. We previously showed that foldability and function can be restored by removing frustration in the folding energy landscape of a model WW domain protein, CC16, which was designed based on Statistical Coupling Analysis (SCA). Substitutions ensuring the formation of five local contacts identified as “on‐path” were selected using the closest homolog native folded sequence, N21. Surprisingly, the resulting sequence, CC16‐N21, bound to Group I peptides, while N21 did not. Here, we identified single‐point mutations that enable N21 to bind a Group I peptide ligand through structure and dynamic‐based computational design. Comparison of the docked position of the CC16‐N21/ligand complex with the N21 structure showed that residues at positions 9 and 19 are important for peptide binding, whereas the dynamic profiles identified position 10 as allosterically coupled to the binding site and exhibiting different dynamics between N21 and CC16‐N21. We found that swapping these positions in N21 with matched residues from CC16‐N21 recovers nature‐like binding affinity to N21. This study validates the use of dynamic profiles as guiding principles for affecting the binding affinity of small proteins.
Keywords: binding affinity, CC16, CC16‐N21, energy landscape, Group I peptide, N21, SCA, structure and dynamic‐based design, WW domain
1. INTRODUCTION
Protein sequences encode the necessary information for folding and function and are optimized through evolutionary pressure specific to the environment (Butler et al., 2018; Jiang et al., 2021; Russ et al., 2020; Socolich et al., 2005). Within a protein family, multiple sequence alignment (MSA) reveals which residues are crucial through conservation (i.e., amino acid position preference) and co‐evolution statistics, which can be used to predict mutation effects (Bai et al., 2016; Campitelli et al., 2021; Hopf et al., 2017; Kazan et al., 2022; Modi, Campitelli, et al., 2021; Russ et al., 2005; Voelz et al., 2009).
Several classic folding and binding studies have focused on WW domains, one of the most abundant independently folded protein domains in nature, because of their biological importance in regulating transcription, apoptosis, and ubiquitylation by binding to proline‐rich peptides (Chen & Sudol, 1995; Ilsley et al., 2002; Macias et al., 1996; Rotin, 1998; Sudol, 1996; Sudol & Hunter, 2000). Evolutionary inference methods that incorporate co‐evolution and conservation were used successfully to design artificial WW sequences that fold similarly to their natural counterparts. However, a significant proportion (approximately two‐thirds) of the sequences obtained with this approach failed to fold correctly (Russ et al., 2005; Socolich et al., 2005). In previous work, we showed that non‐foldability was due to frustration: the N‐terminal β‐hairpin turn would not form correctly due to strong non‐native local contacts. We restored foldability by explicit consideration of the early folding steps, thus reducing frustration (Zou et al., 2021). We identified five contacts that stabilize the nascent β‐hairpin, and grafted them from a foldable natural homologous sequence, N21, to an unfolded, designed sequence, CC16. This newly designed variant, CC16‐N21, folds and binds a Group I proline‐rich model ligand with a binding affinity comparable with natural WW domains (Kd = 71 μM) (Russ et al., 2005; Zou et al., 2021). Surprisingly, the native sequence N21, which shares 58% sequence similarity with CC16‐N21, shows no affinity to this peptide ligand, even though it is much more stable than CC16‐N21 to thermal denaturation (Tm 46.8°C vs. 22.4°C) (Zou et al., 2021).
Here, we investigate the balance between folding and binding in the context of WW domains. Our goal is to identify mutations that can modulate the binding affinity of the native N21 sequence to Group I peptides. We performed a comprehensive structural and dynamic analysis on N21 and CC16‐N21, using our docking method, Adaptive BP‐Dock, to sample the binding trajectories with Group I peptide (EYPPYPPPPYPSG), and compared the lowest energy bound poses (Bolia & Ozkan, 2016; Bolia, Woodrum, et al., 2014; Kazan et al., 2022). This analysis revealed interactions between the ligand and two tyrosine residues (9Y and 19Y) in CC16‐N21 that are absent in the N21 sequence, suggesting that they might be crucial for binding. We introduced tyrosine residues at corresponding positions in the N21 sequence, generating three mutants: H9Y, H19Y, and the double mutant H9YH19Y (Figure S1).
In parallel, we explored the conformational differences between N21 and CC16‐N21 in the unbound state using dynamic flexibility index (DFI) analysis. DFI is a position‐specific metric that computes each residue position's response fluctuation to external perturbations (i.e., random Brownian kick) occurring on the protein chain and is related to conformational entropy per residue position (Nevin Gerek et al., 2013). Positions exhibiting low DFI values (i.e., a DFI percentile value lower than 0.2) are classified as hinges. Hinges are stable locations within the 3‐D interaction network of a protein, and they do not deviate from their mean when external perturbations occur on the protein. However, due to their extensive interaction network, they can transfer perturbations to the rest of the protein. These rigid hinges can act like joints in a skeleton, mediating the collective motion of the protein, and have been shown to be important for function (Butler et al., 2015; Kolbaba‐kartchner et al., 2021; Kumar et al., 2015; Modi, Risso, et al., 2021; Nevin Gerek et al., 2013). Our previous protein evolution studies showed that proteins modulate function through a hinge‐shift mechanism in which increases in the flexibility of certain hinges (i.e., hinge losses) are compensated by rigidification at other distal flexible sites (i.e., new hinge formation) through mutations during evolution (Kumar et al., 2015; Modi et al., 2018; Modi, Risso, et al., 2021). In the present study, we used this hinge‐shift mechanism as a novel conformational dynamics‐based computational design approach to find distal sites (i.e., allosteric mutation sites) from the binding residues to modulate binding affinity through altering dynamics. 
We rationally mold the flexibility profile of N21 based on changes in hinge location upon mutation, deliberately shifting the dynamics (assessed by DFI profiles) of the designed N21 sequences toward those of the better binder, CC16‐N21, as done in our other studies (Campitelli et al., 2021; Kumar et al., 2015; Larrimore et al., 2017; Modi, Risso, et al., 2021). We compared the DFI profiles of N21 and CC16‐N21 and found a drastic difference in the flexibility of two distal sites (P16, T10), suggesting that they may allosterically modulate binding through alteration of dynamics. Position 16 is a proline in both N21 and CC16‐N21, while position 10 is T in N21 and H in CC16‐N21. We evaluated the contribution of this hinge‐shift location to binding by swapping these residues in variant T10H and examining the resulting binding profile computationally. Finally, all variants designed by either structural or dynamic approaches were expressed and characterized experimentally. These two orthogonal design approaches show that binding of the WW domain is modulated not only by specific interactions of binding‐site residues but also by distal sites, through dynamic allostery.
2. RESULTS AND DISCUSSION
2.1. Structure‐based design of the variants considering the crucial contacts between the N21 and the peptide
We investigated the molecular interactions governing peptide binding using modeled N21 and CC16‐N21 peptide complexes, obtained through homology modeling (Zou et al., 2021). The unbound conformations were subjected to MD simulation and clustered using k‐means to gather highly sampled conformations (Kolbaba‐kartchner et al., 2021). The dominant conformation was used as an input representative structure for docking analyses with Adaptive BP‐Dock to generate the bound complexes with the Group I peptide. The docked pose with the lowest binding energy score was selected as the bound state (Figure 1). The docked pose of CC16‐N21 shows that tyrosines 9 and 19 are in contact with the peptide ligand. These interactions are missing in the N21 bound pose because N21 has histidines at positions 9 and 19. The difference in these interactions is reflected in the computed binding scores of the complexes, −6.91 X‐score energy units (XEUs) for N21 and −7.62 XEU for CC16‐N21 (Table 1), suggesting that these two residue positions are critical for binding to Group I peptide.
FIGURE 1.

Models of ligand bound (CC16‐N21) and unbound (N21). 10Y (green) in the ligand (‐PPxY‐ motif) interacts with residues 9Y and 19Y (cyan) in CC16‐N21. The corresponding positions are H in N21. The sequence alignment of N21, CC16, and CC16‐N21 is shown.
TABLE 1.
Predicted binding scores, thermostability, and binding assay profiles of the WW domain variants are summarized below.
| WW variants | Tm (°C) | Kd (μM) | ∆H (kcal/mol) | ‐T∆S (kcal/mol) | ∆G (kcal/mol) | Binding energy score (XEU) |
|---|---|---|---|---|---|---|
| N21 a | 46.8 | – | – | – | – | −6.91 |
| CC16_N21 b | 22.4 | 71 ± 4.7 | −3.5 ± 0.2 | −1.3 ± 0.0 | −5.2 ± 0.1 | −7.62 |
| H9Y | 40.3 | 86 ± 6.0 | −0.9 ± 0.0 | −4.3 ± 0.0 | −5.2 ± 0.1 | −7.62 |
| H19Y | 54.1 | – | – | – | – | −6.88 |
| H9YH19Y | 56.2 | 50 ± 3.0 | −0.2 ± 0.0 | −5.3 ± 0.0 | −5.5 ± 0.0 | −8.03 |
| T10H | 38.5 | 84 ± 1.0 | −1.2 ± 0.1 | −4.0 ± 0.1 | −5.2 ± 0.1 | −7.63 |
a Naturally occurring variant used as the background sequence for this study.
b Variant used in previous studies (Zou et al., 2021) and used for comparison in this study.
2.2. Dynamic‐based design of the WW variant utilizing flexibility profiles
The flexibility profiles of the N21 and CC16‐N21 exhibit similar dynamics (Figure 2), except at positions 10 (histidine in CC16‐N21 and threonine in N21) and 16 (proline in both), which appeared to be hinge shift positions. Based on other studies suggesting the change in flexibility upon mutation (aka hinge shift mechanism) may impact function, the change in dynamics of these two positions could be responsible for the differences in peptide ligand binding observed between CC16‐N21 and N21 (Zou et al., 2021). To investigate this hinge shift mechanism in detail, we modeled N21‐T10H and computed its dynamics. We found that T10H mutation enhances the flexibility at positions 10 and 16: the profile of N21‐T10H is similar to that of CC16‐N21.
FIGURE 2.

Dynamic‐based design. DFI plots of N21, CC16‐N21, and T10H (top) and color‐coded ribbon diagrams showing the DFI profile of each position (red, highest, and blue, lowest DFI values). Positions 10 and 16, distal to the binding site, show hinge shifts that may contribute to binding. Residue position 16P is conserved. Mutating T10 to H in N21 restores the DFI profile toward that of CC16‐N21, suggesting that T10H could modulate binding dynamics.
2.3. Biophysical characterization of the newly designed variants
The mutants were prepared by recombinant expression and characterized experimentally. We found that the mutations did not interfere with secondary structure formation, as shown by CD spectroscopy. All mutants, H9Y, H19Y, H9YH19Y, and T10H yielded CD spectra typical of the WW fold and similar to WT N21, with a positive peak centered at 227 nm (Figure 3a). Thermal denaturation experiments were carried out by monitoring the loss of CD signal at 227 nm in the 5°C–90°C range (Figure 3b); the corresponding Tm values are summarized in Table 1. Mutants H19Y (54.1°C) and H9YH19Y (56.2°C) are more stable to thermal denaturation than N21 (46.8°C), while mutations T10H (38.5°C) and H9Y (40.3°C) resulted in loss of stability compared to N21; all mutants are within the range of naturally occurring WW sequences (Socolich et al., 2005). The proteins are monomeric at the concentrations used for binding assay as assessed by size exclusion chromatography (SEC) and CD spectroscopy, which show no concentration‐dependent variation in Tm (Figures S2 and S3). Additional CD studies support a two‐state folding process (Figures S6 and S7).
FIGURE 3.

Structure and thermostability of WW domain variants. (a) CD spectra of WW domain variants at 4°C. (b) Thermal denaturation (Tm) curves monitored at 227 nm (5°C to 90°C, ramp rate of 0.3°C/min). [θ] = Mean Residue Ellipticity; Buffer condition: 20 mM NaPO4, pH 7.0.
2.4. Binding to Group I peptide
We assessed whether the designed variants bind Group I proline‐rich peptide by isothermal titration calorimetry (ITC) titrations and compared the dissociation constants with CC16‐N21, for which a Kd of 71 μM ± 4 μM had been measured (Table 1) (Zou et al., 2021). We found that T10H and H9Y are comparable to CC16‐N21 (Kd of 84 μM ± 1 μM and 86 μM ± 6 μM, respectively) (Figure 4a,b). H9YH19Y displayed improved binding (Kd = 50 μM ± 3 μM) (Figure 4c), while H19Y did not show any binding affinity for the ligand, even at 1 mM concentration (Figure S4A,B). For comparison, naturally occurring WW domains show a wide range of affinity toward Group I peptide, from 1 μM to 500 μM (Kato et al., 2004; Russ et al., 2005). We note that H9YH19Y retains the interactions between tyrosine residues and proline residues of the peptide ligand observed in a variety of WW domains (Hu et al., 2004; Kraemer‐Pecore et al., 2009; Macias et al., 2002) and captured in the design of the original CC16 sequence based on MSA analysis, as shown in Figure 5 (Zou et al., 2021).
FIGURE 4.

Isothermal Titration Calorimetry (ITC) of the WW domain variants with Group I peptide measured at 4°C (a) H9Y; (b) T10H; and (c) H9YH19Y. Buffer condition: 20 mM NaPO4 at pH 7.0; experiments were carried out in duplicate, and blank was subtracted (Figure S4C).
FIGURE 5.

Docking poses of the WW domain variants showing H‐bonding. The docked poses of CC16‐N21, H9YH19Y, N21, H9Y, H19Y, and T10H are shown in cartoon representations. Residues making hydrogen bonding interactions with the peptide ligand (gray) are shown in sticks and annotated in the figure. Hydrogen bonds (yellow dashes) between the peptide ligand and the residues of the protein are highlighted. Five hydrogen bonds are observed for CC16‐N21 and H9YH19Y. H9Y and T10H only make four hydrogen bonds with the peptide ligand. Only one hydrogen bond is identified for the nonbinders N21 and H19Y.
Mirroring our previous observation with the parent sequences CC16‐N21 and N21, the experimental binding affinities to Group I peptide do not correlate with thermodynamic stability of the N21 mutant series. Analysis by Adaptive BP‐dock and scoring by X‐score energy units (XEUs) differentiate binders from non‐binders: N21 and H19Y have binding scores higher than −7.00 XEUs, while H9Y, T10H, H9YH19Y, and CC16‐N21 cluster below −7.6 XEUs (Figure 6a) (Bolia & Ozkan, 2016). The predicted binding scores also correlate with the experimental binding equilibrium constants, Kd (Figure 6a and Table 1). Additionally, we examined the MD simulations of the unbound variants to understand the dynamics of residues that form the binding surface in WW domains. We utilized the DFI metric to examine the total change in dynamics of these binding residues with respect to the non‐binder N21 (Figure 6b). Variant H9YH19Y had the largest change in the negative direction, indicating the highest level of rigidification compared to the others. In contrast, H19Y exhibited a change in the positive direction, suggesting that the hydrogen bonding residues are more flexible and, therefore, struggle to maintain interactions with the peptide, resulting in a loss of binding.
FIGURE 6.

Experimental binding data ln(Kd) compared with predicted binding score and change in %DFI of binding residues: (a) Correlation with predicted binding scores obtained from Adaptive BP‐dock (R = 0.94). The double mutant H9YH19Y has the lowest predicted binding energy score and lowest Kd. (b) The change in dynamics (Δ%DFI) of binding residues making hydrogen bonds with the peptide correlates with experimental Kd (R = 0.98). The variant H9YH19Y has the largest change in the negative direction, indicating the largest rigidification at the binding site.
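The correlation in panel (a) can be checked against the values listed in Table 1. The following is a minimal sketch, assuming that only the four variants with a measurable Kd enter the fit (the non‐binders N21 and H19Y have no Kd); the helper name `pearson_r` is ours and is not part of any analysis package used in this study. Under this assumption the coefficient comes out close to the reported R = 0.94.

```python
import math

# Kd (uM) and predicted binding score (XEU) of the binders, copied from Table 1.
# Assumption: the non-binders N21 and H19Y are excluded because no Kd exists.
binders = {
    "CC16-N21": (71.0, -7.62),
    "H9Y": (86.0, -7.62),
    "H9YH19Y": (50.0, -8.03),
    "T10H": (84.0, -7.63),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


scores = [score for _, score in binders.values()]
# The unit of Kd only shifts ln(Kd) by a constant, so it does not affect R.
ln_kd = [math.log(kd) for kd, _ in binders.values()]

r = pearson_r(scores, ln_kd)
print(f"Pearson R between binding score and ln(Kd): {r:.2f}")
```

With only four points, the coefficient is sensitive to any single variant, so it is best read as a trend rather than a calibrated predictor.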
Previous SCA analysis of the WW domain revealed that eight positions show strong mutual co‐evolution with the binding sites (Russ et al., 2005). Some of these positions have no direct interactions with the ligands, suggesting that a distal dynamic allosteric mechanism might be governing the WW domain binding process. We explored whether there are possible allosteric substitutions in CC16‐N21 that modulate binding affinity by applying DFI, a position‐specific metric that measures the relative flexibility of a residue backbone compared with the rest of the protein. Flexible regions identified by the DFI metric tend to have a relatively large residue fluctuation response to a perturbation in other regions, while rigid regions have lower responses. Rigid regions with a DFI score lower than 0.2 are defined as hinges. These hinge sites have a critical network of interactions within the 3‐D fold (Campitelli et al., 2020). They do not exhibit high residue response fluctuations to the perturbations exerted on the protein chain, yet they can transfer the perturbations efficiently to the distal sites of the protein, like joints in a skeleton, and play a critical role in modulating the collective motion of a protein. Hinges are critical to protein function: for example, mutations in these positions alter the conformational dynamics profile and correlate to disease‐associated mutations in ferritin (Kumar et al., 2015). More broadly, DFI profiles are associated with function, and changes in DFI value, particularly in rigid sites, lead to changes in function (Teilum et al., 2009; Xu et al., 2008). Comparative dynamics analysis of ancestral proteins with their corresponding extant homologs revealed that nature manipulates protein function through changes in the DFI profile, particularly through a hinge‐shift mechanism in which the loss of certain hinge locations is compensated by the formation of new hinge sites (Modi, Risso, et al., 2021).
The DFI analysis of Human Pin1 also showed that substrate binding to the WW domain induces a hinge shift mechanism and enhances the catalytic efficiency (Campitelli et al., 2018).
Hydrogen bonds are important interactions in biological systems, as they contribute to the stability and function of proteins and other biomolecules. Thus, we analyzed the hydrogen bond patterns of the docked poses and counted the hydrogen bonds formed between the peptide ligand and the binding residues for each mutant (Figure 5). We found that CC16‐N21 and H9YH19Y formed five hydrogen bonds, H9Y and T10H formed four hydrogen bonds, and only one hydrogen bond was identified for the nonbinders N21 and H19Y. This analysis also aligns with our dynamics analysis and suggests that the enhanced flexibility of binding residues in N21 and H19Y leads to a loss of hydrogen bonds and, in turn, to poor binding affinity.
Since our results strongly support that the dynamics of the WW domain play a critical role in its biophysical properties, we further investigated the equilibration and the relaxation of dynamics of the variants. We studied %DFI in a sequential manner by creating a time series that carries information on the evolution of the dynamics. We prepared the time series %DFI by computing it over three adjacent time windows (0.5 μs–1 μs, 1 μs–1.5 μs, and 1.5 μs–2 μs). Because our earlier works (Butler et al., 2015; Kazan et al., 2022; Larrimore et al., 2017; Modi, Campitelli, et al., 2021) highlight that DFI profiles capture the related function (Butler et al., 2018; Campitelli et al., 2021; Kolbaba‐kartchner et al., 2021; Modi, Risso, et al., 2021; Ose et al., 2022; Stevens et al., 2022; Zou et al., 2015), we clustered the time series of the DFI values of each variant using PCA (see Section 3.4) to compare their dynamics profiles. The first two principal components are responsible for most of the variance in the mutant DFI profiles. Hence, we utilized these first two principal components to analyze the clustering of the mutants based on their similarity in flexibility profiles. The projection of the data on the first and second principal components shows that the second principal component (PC2) clearly separates binders and non‐binders (Figure 7a). All variants exhibiting binding to Group I peptide have positive PC2 scores, indicating that they have similar flexibility profiles associated with binding dynamics. In contrast, H19Y is clustered with the native N21 with negative PC2 scores, suggesting that their unbound dynamics result in poor binding (Figure 7a). The first principal component (PC1) captures folding stability: its value is correlated with melting temperature (Tm) (Figure 7b). This analysis differentiates the role of dynamics in modulating protein stability and binding poses and can help explain why H19Y does not bind the ligand.
FIGURE 7.

(a) Correlation between PC2 and binding affinity: The projections of time series DFI data on the first and second principal components show separation between binders (in red) and non‐binders (in blue) on the second principal component score, where binders have PC2 > 0 and non‐binders have PC2 < 0. (b) Correlation between PC1 and melting temperature (Tm): The plot of PC1 versus Tm shows a positive correlation (R = 0.79, p = 0.11), suggesting that PC1 reflects the stability of N21 and its mutants.
2.5. Conclusion
We conducted a comparative analysis of the WW domains N21 and CC16‐N21 to explore why N21 exhibits poor binding, while its close homolog, artificially designed CC16‐N21, showed high affinity to Group I peptides. The peptide‐bound structure obtained by adaptive BP‐dock highlighted differences in the binding domains between N21 and CC16‐N21. Binding in CC16‐N21 is mediated by two tyrosine residues (9Y and 19Y) that contact the peptide, whereas in the N21 sequence, the equivalent positions are occupied by histidine. We explored whether unbound dynamics (i.e., the unbound conformational ensemble) played a major role in binding by computing DFI profiles, which provide position‐specific metrics related to conformational entropy per site. The comparison of the DFI profiles suggested a hinge‐shift point at a distal position 10, which is a histidine in CC16‐N21 but a threonine in N21. Based on these observations, we generated four mutants of N21 by substituting these residues: H9Y, H19Y, T10H, and the double mutant H9YH19Y. The bound forms were modeled using Adaptive BP‐Dock and ranked according to their docking energy scores. The variants were experimentally characterized: all formed secondary structures comparable to N21, although mutations modulated the stability to thermal denaturation. Furthermore, the binding affinities to Group I peptide correlated with the predicted docking energy scores. When we coupled this analysis with the computed DFI values of the positions that formed hydrogen bonds with the peptide, we observed that enhanced flexibility at these binding residue positions correlated with impaired binding in N21 and H19Y. Principal component analysis of time‐series DFI sheds light on the role of unbound dynamics in governing binding and stability. These results suggest that dynamics govern WW domain binding and that sites that do not directly interact but distally modulate the dynamics of binding may also be crucial and fundamental for binding. 
We hope that our structure and dynamics‐based protein design approach can be used to predict protein binding in general and to study protein‐ligand interactions.
3. METHODS AND MATERIALS
3.1. Molecular dynamics simulation
Molecular dynamics simulations of the wild‐type and mutants were performed using AMBER 20 (Salomon‐Ferrer et al., 2013). The mutants were modeled by PyMOL Mutagenesis Wizard (DeLano, 2002). Topology files were prepared based on ff99SB forcefield and the solvation box was modeled by explicit water model TIP3P (Mark & Nilsson, 2001) with a 14 Å minimum distance from the boundary to protein. The systems were neutralized by adding sodium and chloride ions and then minimized with the steepest descent algorithm followed by the conjugate gradient method for 5000 steps. The systems were then heated up to 300 K. Each system was then simulated for 2 μs with 2 fs time‐step at constant temperature (300 K) and pressure (1 bar) with Langevin thermostat and barostat.
3.2. Adaptive BP‐dock
Adaptive BP‐dock (Bolia & Ozkan, 2016; Kazan et al., 2022) is an iterative docking approach that utilizes perturbation response scanning (PRS) (Atilgan et al., 2010) combined with the RosettaLigand program (version 3.5) (Meiler & Baker, 2006) to model the interactions between the WW domain and the peptide ligand. The induced fit that emerged from the binding event is challenging to model with docking tools with static protein backbone and peptide movement. In Adaptive BP‐Dock, we include both the backbone flexibility of the receptor and the ligand. Before each docking step, a new conformation of the protein receptor is calculated based on the residue response fluctuation profile upon force perturbations on the binding pocket residues using PRS. This approach mimics the peptide ligand's forces acting on the receptor and generates a new conformation that samples binding‐induced conformations. This conformation is then docked with the peptide ligand using RosettaLigand. Adaptive BP‐dock which includes binding‐induced backbone conformational changes improves the modeling of binding interactions and can predict binding scores that capture the binding trends seen in experiments. The predicted binding scores are evaluated by X‐score empirical scoring function. X‐score energy units (XEUs) have been shown previously to provide a good correlation with experimental results (Bolia, Gerek, & Ozkan, 2014; Bolia & Ozkan, 2016; Wang et al., 2002). Thus, we applied Adaptive BP‐dock to each of the most representative clusters sampled during unbound MD simulations. We conducted three separate docking simulations to ensure the binding interactions between the WW domain and peptide ligand are captured accurately.
3.3. Dynamic flexibility index
Dynamic flexibility index (Butler et al., 2018; Gerek & Ozkan, 2011; Kazan et al., 2022; Kumar et al., 2015; Larrimore et al., 2017; Modi, Risso, et al., 2021; Ose et al., 2022; Stevens et al., 2022) uses the PRS technique that combines Elastic Network Model (ENM) and Linear Response Theory (LRT) (Nevin Gerek et al., 2013). In PRS, Brownian‐like unit forces are applied sequentially to each residue as perturbations (Atilgan et al., 2010; Kumar et al., 2015). According to LRT, the linear response vector $\Delta R$ due to the perturbation $F$ is calculated as follows:
$$[\Delta R]_{3N \times 1} = [H]^{-1}_{3N \times 3N}\,[F]_{3N \times 1} \tag{1}$$
where $H^{-1}$ is the inverse of the Hessian matrix.
In this work, instead of using the Hessian matrix calculated via ENM, we used the covariance matrix $G$ of the C‐alpha atoms calculated from MD trajectories, which is proportional to the inverse of the Hessian matrix ($H^{-1}$). This is because MD provides more precise residue–residue interactions, such as long‐range interactions and solvation effects, via atomistic force fields.
$$[\Delta R]_{3N \times 1} = [G]_{3N \times 3N}\,[F]_{3N \times 1} \tag{2}$$
To compute DFI, we perturbed each residue sequentially by applying random unit forces on each residue. We then generated the Perturbation Response Matrix $A$ as follows.
$$A_{N \times N} = \begin{bmatrix} |\Delta R^{1}|_{1} & \cdots & |\Delta R^{N}|_{1} \\ \vdots & \ddots & \vdots \\ |\Delta R^{1}|_{N} & \cdots & |\Delta R^{N}|_{N} \end{bmatrix} \tag{3}$$
where $|\Delta R^{j}|_{i}$ denotes the average response at position $i$ due to perturbations on position $j$.
This procedure is repeated several times in different directions for each position, to ensure that forces are isotropically sampled. Then the averaged Perturbation Response Matrix $A$ is used to calculate the DFI per residue.
$$\mathrm{DFI}_{i} = \frac{\sum_{j=1}^{N} |\Delta R^{j}|_{i}}{\sum_{i=1}^{N} \sum_{j=1}^{N} |\Delta R^{j}|_{i}} \tag{4}$$
This index is often more useful as a percentile since the DFI range varies for different proteins. Therefore, the DFI percentile is calculated as
$$\%\mathrm{DFI}_{i} = \frac{n_{\leq i}}{N} \tag{5}$$
where $N$ is the total number of residues and $n_{\leq i}$ is the number of residues with DFI value $\leq \mathrm{DFI}_{i}$.
To understand and capture the converged dynamics of the protein system, we calculated time series based on MD covariance matrices from the three sequential time windows: (i) 0.5 μs‐1 μs, (ii) 1 μs‐1.5 μs, and (iii) 1.5 μs‐2 μs. To improve the accuracy, the loose ends of the proteins were excluded.
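The procedure described in this section (perturbing each residue with isotropically sampled unit forces, collecting the response magnitudes, normalizing them into DFI, and ranking them as percentiles) can be sketched as follows. This is an illustrative outline in plain Python, not the code used in this study: the function names, the number of force directions, and the toy covariance matrix are assumptions, and a real analysis would start from the C‐alpha covariance matrix of an MD time window. The 0.2 cutoff for hinges follows the definition given in the text.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a direction uniformly on the unit sphere (isotropic force)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def perturbation_response_matrix(cov, n_directions=10, seed=0):
    """A[i][j]: mean displacement magnitude of residue i when residue j is kicked.

    cov is a 3N x 3N covariance matrix (nested lists) of C-alpha coordinates.
    """
    n_res = len(cov) // 3
    rng = random.Random(seed)
    A = [[0.0] * n_res for _ in range(n_res)]
    for j in range(n_res):
        for _ in range(n_directions):
            force = random_unit_vector(rng)
            for i in range(n_res):
                # Response of residue i: the (i, j) 3x3 block of cov times the force.
                dr = [
                    sum(cov[3 * i + a][3 * j + b] * force[b] for b in range(3))
                    for a in range(3)
                ]
                A[i][j] += math.sqrt(sum(c * c for c in dr)) / n_directions
    return A


def dfi(A):
    """Row sums of A divided by the sum of all entries."""
    total = sum(sum(row) for row in A)
    return [sum(row) / total for row in A]


def percentile_rank(values):
    """Fraction of residues whose DFI is lower than or equal to each residue's DFI."""
    n = len(values)
    return [sum(1 for v in values if v <= x) / n for x in values]


def hinges(pct, cutoff=0.2):
    """Indices of residues whose %DFI is below the hinge cutoff."""
    return [i for i, p in enumerate(pct) if p < cutoff]


def toy_covariance(variances):
    """Hypothetical test input: uncoupled residues with the given variances."""
    n = len(variances)
    cov = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i, var in enumerate(variances):
        for a in range(3):
            cov[3 * i + a][3 * i + a] = var
    return cov


A = perturbation_response_matrix(toy_covariance([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
dfi_values = dfi(A)
pct_values = percentile_rank(dfi_values)
print("DFI:", [round(x, 3) for x in dfi_values])
print("hinge indices:", hinges(pct_values))
```

In the toy input the residues are uncoupled, so each DFI value is simply that residue's share of the total variance; with a covariance matrix from MD, the off‐diagonal blocks carry the coupling that lets a perturbation at one site move another.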
3.4. Principal component analysis
Principal component analysis (PCA) was applied to time series DFI. This dimensionality‐reduction method is used to reduce the variables in a high‐dimensional dataset while retaining most of the information from the dataset, therefore making the data more interpretable (Jolliffe & Cadima, 2016). For N21 and its mutants, the time series DFI profiles were merged into a matrix $X_{m \times n}$, where $m$ is the total number of DFI profiles and $n$ is the dimension of the time series DFI (Kolbaba‐kartchner et al., 2021). Singular value decomposition of $X$ was conducted as follows:
$$X = U \Sigma V^{T} \tag{6}$$
Here, $U$ and $V$ are unitary matrices with orthonormal columns, which are called left singular vectors and right singular vectors, respectively. $\Sigma$ is a rectangular diagonal matrix of positive numbers $\sigma_i$, called the singular values of $X$. The $\sigma_i$ were arranged, by convention, in decreasing order of magnitude and represent the variances in the corresponding left and right singular vectors.
The column vectors of $V$ are called the principal components. They are new variables that are constructed from the initial variables, where the first principal component is a direction that maximizes the variance of the projected data and therefore preserves most of the data's variation. The score matrix $T$ is defined as follows:
$$T = X V \tag{7}$$
Each row vector of $T$ is the projection of the corresponding data vector from matrix $X$ on every principal component. In this study, we utilized the projections of our original data vectors on the first and second principal components and discovered their relations with the protein functions.
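The projection step can be illustrated with a small, dependency‐free sketch. It does not call an SVD routine: instead it uses power iteration with deflation on the Gram matrix of the data, which yields the same leading right singular vectors and singular values for well‐separated components. Column centering, the function names, and the toy data are assumptions made for illustration and are not taken from the analysis in this study.

```python
import math
import random


def center_columns(X):
    """Subtract each column mean (assumed here; not stated in the text)."""
    m = len(X)
    means = [sum(col) / m for col in zip(*X)]
    return [[x - mu for x, mu in zip(row, means)] for row in X]


def principal_components(X, k=2, n_iter=1000, seed=0):
    """Top-k right singular vectors and singular values via power iteration."""
    n = len(X[0])
    # Gram matrix X^T X; its eigenvectors are the right singular vectors of X.
    C = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    components, singular_values = [], []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(n_iter):
            w = [sum(c * x for c, x in zip(row, v)) for row in C]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        Cv = [sum(c * x for c, x in zip(row, v)) for row in C]
        eigenvalue = sum(a * b for a, b in zip(v, Cv))
        components.append(v)
        singular_values.append(math.sqrt(max(eigenvalue, 0.0)))
        # Deflation: remove the component just found before the next search.
        C = [[C[i][j] - eigenvalue * v[i] * v[j] for j in range(n)] for i in range(n)]
    return components, singular_values


def scores(X, components):
    """Project every data row on every principal component."""
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components] for row in X]


# Hypothetical toy data: four "profiles" with three features each.
X = center_columns([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, -0.5, 0.0]])
components, singular_values = principal_components(X, k=2)
T = scores(X, components)
for row in T:
    print([round(value, 3) for value in row])
```

The sign of each component is arbitrary, so only the relative positions of the projected points are meaningful; whether binders fall on the positive or negative side of an axis depends on that sign convention.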
3.5. Plasmid sequencing and protein expression
The sequences encoding for the designed WW domain proteins, containing point mutation(s) on N21 native sequence, were ordered from Genscript. All mutants were fused to Maltose Binding Protein (MBP) and cloned in pMAL‐c5x vector for expression. Each gene contained an N‐terminal poly‐histidine tag and the TEV cleavage site ENLYFQG to facilitate purification. The plasmids were transformed via heat shock into competent E. coli BL‐21 cells (NEB) and the mix was plated on LB agar plates containing ampicillin overnight at 37°C. Single colonies were used to inoculate 5 mL LB liquid cultures containing ampicillin and were grown overnight at 37°C shaking at 200 RPM. 10 mL of each culture was transferred to a 2 L flask containing 1 L LB media with ampicillin for growth and expression. The rest of the cells were centrifuged down, and the plasmid DNA was extracted using Promega Wizard® Plus SV Miniprep kits. Sequences were verified using GeneWiz Sanger Sequencing. The 1 L cultures were grown to OD600 of 0.6–0.8 and protein expression was induced by addition of 1 mM IPTG. Proteins were expressed for 6 h at 37°C shaking at 200 RPM. The total protein yield for these conditions was roughly 20 mg/L.
3.6. Protein isolation and purification
Cells were harvested by centrifugation at 5000 RPM for 20 min and resuspended in 30 mL 20 mM NaPO4 at pH 7.4, 0.5 M NaCl, and 20 mM imidazole buffer. Cells were lysed by sonication for 20 min using 30 s ON/OFF cycles, and then spun down at 5000 RPM at 4°C for 1 h. The supernatant was purified on a 5 mL Amersham Bioscience HisTrap column by FPLC (AKTApure). Fractions containing the protein were dialyzed in 20 mM NaPO4 at pH 7.4, 0.5 M NaCl, and 10 mM imidazole at 4°C. The His‐tag was cleaved by digesting the proteins with TEV at a ratio of 1:20 TEV to fusion protein, followed by purification by HisTrap column; WW proteins were collected in the flowthrough. The proteins were further purified by RP‐HPLC on a 250 × 10 mm Phenomenex C18 Semi‐prep column by gradient elution starting with 0.01% TFA in water (solvent A) to 95% acetonitrile with 0.01% TFA (solvent B). Purified proteins were verified by MALDI (Figure S5) and stored at −20°C after being lyophilized.
3.7. Group I peptide synthesis and purification
The proline‐rich Group I peptide was synthesized on a CEM Liberty automated peptide synthesizer using Wang resin and FMOC‐protected amino acids. Deprotection conditions: 20% piperidine, 0.1 M HOBT in DMF. Activation and coupling solutions: 0.5 M HBTU and 2 M DIEA in NMP. After completion of the synthesis, the peptide was cleaved from the resin by shaking for 2 h in a cleavage cocktail containing 95% TFA, 2.5% triisopropylsilane, and 2.5% distilled water. After 2 h, the mixture was filtered, excess TFA was removed, and the product was lyophilized. The crude peptide was purified by RP‐HPLC on a 250 × 10 mm Phenomenex C18 Semi‐prep column by gradient elution starting with 0.01% TFA in water (solvent A) to 95% acetonitrile with 0.01% TFA (solvent B), and verified by MALDI (Bruker).
3.8. Circular dichroism
Protein stability and folding were assessed by Circular Dichroism (CD) using a JASCO J‐815 CD Spectrophotometer (JASCO, Easton, MD). Full scans were measured from 280 to 200 nm at 5°C with a 1 cm (or 1 mm) quartz cuvette, at a protein concentration of 40 μM in 20 mM NaPO4 buffer at pH 7.4. Spectra were collected in triplicate, averaged, and converted to mean residue ellipticity (Greenfield, 2006).
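The averaging and conversion to mean residue ellipticity can be sketched as follows. This is a minimal illustration, assuming the common convention [θ] = θ_obs / (10 · l · c · n), with θ_obs in mdeg, path length l in cm, molar concentration c, and n the number of peptide bonds (residues minus one). The function names and the 34‐residue example length are hypothetical and are not taken from this work.

```python
def average_scans(scans):
    """Average replicate CD scans point by point (all scans share one wavelength grid)."""
    return [sum(values) / len(values) for values in zip(*scans)]


def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to MRE in deg cm^2 dmol^-1.

    Assumed convention: MRE = theta / (10 * l * c * n_peptide_bonds).
    """
    if conc_molar <= 0 or path_cm <= 0 or n_residues < 2:
        raise ValueError("concentration, path length, and residue count must be positive")
    n_peptide_bonds = n_residues - 1
    return theta_mdeg / (10.0 * path_cm * conc_molar * n_peptide_bonds)


if __name__ == "__main__":
    # Three hypothetical replicate scans (mdeg) at three wavelengths.
    replicates = [[6.5, 3.1, -2.0], [6.6, 3.0, -2.1], [6.7, 2.9, -1.9]]
    mean_scan = average_scans(replicates)
    # 40 uM protein in a 1 mm (0.1 cm) cuvette; 34 residues is a placeholder length.
    mre = [mean_residue_ellipticity(t, 40e-6, 0.1, 34) for t in mean_scan]
    print(mre)
```

Because the path length enters the denominator, scans taken in the 1 cm and 1 mm cuvettes should give comparable mean residue ellipticity once converted, provided the concentration is the same.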
Denaturation temperature (Tm) for all the WW domain peptides was calculated by monitoring ellipticity at 227 nm while increasing temperature from 5°C to 90°C at a ramp rate of 0.3°C/min.
Data were analyzed to extract Tm according to established methods in OriginPro 2018 (Greenfield, 2006).
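For readers who want to see the shape of such an analysis, the sketch below fits a two‐state unfolding model to a thermal melt. It is not the OriginPro procedure used here: it assumes flat folded and unfolded baselines supplied by the user, a temperature‐independent van 't Hoff enthalpy (ΔCp = 0), and a simple grid search instead of nonlinear least squares. All function names and the grid ranges are hypothetical.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def fraction_unfolded(temp_c, tm_c, dh):
    """Two-state fraction unfolded from the van 't Hoff relation (dCp assumed 0)."""
    t_k, tm_k = temp_c + 273.15, tm_c + 273.15
    exponent = (dh / R) * (1.0 / tm_k - 1.0 / t_k)
    exponent = max(-700.0, min(700.0, exponent))  # guard against overflow in exp
    k_unfold = math.exp(exponent)
    return k_unfold / (1.0 + k_unfold)


def model_signal(temp_c, tm_c, dh, folded, unfolded):
    """Signal as a population-weighted mix of flat folded/unfolded baselines."""
    fu = fraction_unfolded(temp_c, tm_c, dh)
    return folded + (unfolded - folded) * fu


def fit_tm(temps_c, signals, folded, unfolded, tm_step=0.1):
    """Grid search over Tm (deg C) and dH (50-400 kJ/mol) minimizing squared error."""
    t_min, t_max = min(temps_c), max(temps_c)
    n_tm = int(round((t_max - t_min) / tm_step))
    best = None
    for i in range(n_tm + 1):
        tm_c = t_min + i * tm_step
        for dh_kj in range(50, 401, 10):
            dh = dh_kj * 1000.0
            sse = sum(
                (model_signal(t, tm_c, dh, folded, unfolded) - s) ** 2
                for t, s in zip(temps_c, signals)
            )
            if best is None or sse < best[0]:
                best = (sse, tm_c, dh)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic melt sampled from 5 to 90 deg C, mimicking the range scanned at 227 nm.
    temps = list(range(5, 91, 5))
    data = [model_signal(t, 55.0, 150000.0, 10.0, 0.0) for t in temps]
    print(fit_tm(temps, data, folded=10.0, unfolded=0.0))
```

In practice, sloping baselines and a nonzero ΔCp are usually fitted as additional parameters, as described by Greenfield (2006); the flat‐baseline form is used here only to keep the example short.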
3.9. Isothermal titration calorimetry (ITC)
WW domain and Group I peptides were sent to Sanford‐Burnham Medical Research Institute (La Jolla, CA) for ITC using an ITC200 calorimeter from Microcal (North Hampton, MA). In short, aliquots of Group I peptide from a 5 mM stock were titrated into 100 μM WW domain peptides in 20 mM NaPO4 buffer, pH 7.0. Data were fitted to a one‐binding site model using the Origin software package provided by Microcal. Titrations were carried out in duplicate, using phosphate buffer as blank.
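The equilibrium underlying a one‐binding site model can be illustrated with the short sketch below. It is not the Origin fitting routine: it only evaluates the standard quadratic solution for the 1:1 complex concentration, [PL] = ((Pt + Lt + Kd) − sqrt((Pt + Lt + Kd)² − 4·Pt·Lt)) / 2, and ignores dilution and injection heats. The function names and the Kd value in the example are hypothetical.

```python
import math


def complex_conc(p_total, l_total, kd):
    """Equilibrium [PL] for P + L <-> PL; all quantities in the same molar units."""
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


def fraction_bound(p_total, l_total, kd):
    """Fraction of protein sites occupied by ligand at equilibrium."""
    return complex_conc(p_total, l_total, kd) / p_total


if __name__ == "__main__":
    protein = 100e-6   # 100 uM WW domain in the cell
    kd = 50e-6         # hypothetical dissociation constant, for illustration only
    for ratio in (0.5, 1.0, 2.0, 4.0):
        ligand = ratio * protein
        print(ratio, round(fraction_bound(protein, ligand, kd), 3))
```

A full ITC fit would additionally scale the change in [PL] per injection by the binding enthalpy and cell volume, and would correct for the volume displaced by each aliquot.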
Supporting information
Figure S1. Docked state of mutant residues on the N21 background (left). Sequences of the newly designed WW domain variants with mutations are highlighted in red (right).
Figure S2. SEC chromatogram to check possible dimerization at the concentrations used in ITC. Superdex 30 Increase 10/300 GL column was used in AKTApure and calibrated using appropriate standards (inset). Buffer condition: 20 mM NaPO4 at pH 7.0.
Figure S3. CD scans of the variants using cuvette of different path lengths. Quartz cuvettes with path lengths of 1 cm and 1 mm were used to scan the variants. Buffer condition: 20 mM NaPO4 at pH 7.0.
Figure S4. Duplicate ITC titrations for H19Y variants titrated with Group I peptide. Buffer condition: 20 mM NaPO4 at pH 7.0.
Figure S5. Characterization of the WW domain variants (H9Y, H9YH19Y, and T10H) and Group I peptide using molecular weight verification by MALDI‐ToF. H9Y and H19Y have the same molecular weight.
ACKNOWLEDGMENTS
We acknowledge support from the Gordon and Betty Moore Foundation (S. Banu Ozkan) and NIH award 1R21CA207832‐01 (Giovanna Ghirlanda and S. Banu Ozkan). I. Can Kazan, Jin Lu, and Mohammad Imtiazur Rahman were supported, in part, by the National Science Foundation (Awards 1901709 and 1935105).
Lu J, Rahman MI, Kazan IC, Halloran NR, Bobkov AA, Ozkan SB, et al. Engineering gain‐of‐function mutants of a WW domain by dynamics and structural analysis. Protein Science. 2023;32(9):e4759. 10.1002/pro.4759
Jin Lu, Mohammad Imtiazur Rahman and I. Can Kazan contributed equally to the paper.
Review Editor: Nir Ben‐Tal
Contributor Information
S. Banu Ozkan, Email: banu.ozkan@asu.edu.
Giovanna Ghirlanda, Email: gghirlanda@asu.edu.
REFERENCES
- Atilgan C, Gerek ZN, Ozkan SB, Atilgan AR. Manipulation of conformational change in proteins by single‐residue perturbations. Biophys J. 2010;99(3):933–943. 10.1016/j.bpj.2010.05.020
- Bai F, Morcos F, Cheng RR, Jiang H, Onuchic JN. Elucidating the druggable interface of protein−protein interactions using fragment docking and coevolutionary analysis. Proc Natl Acad Sci. 2016;113(50):E8058. 10.1073/pnas.1615932113
- Bolia A, Gerek ZN, Ozkan SB. BP‐dock: a flexible docking scheme for exploring protein–ligand interactions based on unbound structures. J Chem Inf Model. 2014;54(3):913–925. 10.1021/ci4004927
- Bolia A, Ozkan SB. Adaptive BP‐dock: an induced fit docking approach for full receptor flexibility. J Chem Inf Model. 2016;56(4):734–746. 10.1021/acs.jcim.5b00587
- Bolia A, Woodrum BW, Cereda A, Ruben MA, Wang X, Ozkan SB, et al. A flexible docking scheme efficiently captures the energetics of glycan‐cyanovirin binding. Biophys J. 2014;106(5):1142–1151. 10.1016/j.bpj.2014.01.040
- Butler BM, Gerek ZN, Kumar S, Ozkan SB. Conformational dynamics of nonsynonymous variants at protein interfaces reveals disease association. Proteins Struct Funct Bioinforma. 2015;83(3):428–435. 10.1002/prot.24748
- Butler BM, Kazan IC, Kumar A, Ozkan SB. Coevolving residues inform protein dynamics profiles and disease susceptibility of NSNVs. PLoS Comput Biol. 2018;14(11):e1006626. 10.1371/journal.pcbi.1006626
- Campitelli P, Guo J, Zhou H‐X, Ozkan SB. Hinge‐shift mechanism modulates allosteric regulations in human Pin1. J Phys Chem B. 2018;122(21):5623–5629. 10.1021/acs.jpcb.7b11971
- Campitelli P, Modi T, Kumar S, Ozkan SB. The role of conformational dynamics and allostery in modulating protein evolution. Annu Rev Biophys. 2020;49(1):267–288. 10.1146/annurev-biophys-052118-115517
- Campitelli P, Swint‐Kruse L, Ozkan SB. Substitutions at nonconserved rheostat positions modulate function by rewiring long‐range, dynamic interactions. Mol Biol Evol. 2021;38(1):201–214. 10.1093/molbev/msaa202
- Chen HI, Sudol M. The WW domain of yes‐associated protein binds a proline‐rich ligand that differs from the consensus established for Src homology 3‐binding modules. Proc Natl Acad Sci. 1995;92(17):7819–7823. 10.1073/pnas.92.17.7819
- DeLano WL. The PyMOL molecular graphics system. PyMOL Mol Graph Syst. 2002;40:44–53. http://www.pymol.org
- Gerek ZN, Ozkan SB. Change in allosteric network affects binding affinities of PDZ domains: analysis through perturbation response scanning. PLoS Comput Biol. 2011;7(10):e1002154. 10.1371/journal.pcbi.1002154
- Greenfield NJ. Using circular dichroism collected as a function of temperature to determine the thermodynamics of protein unfolding and binding interactions. Nat Protoc. 2006;1(6):2527–2535. 10.1038/nprot.2006.204
- Hopf TA, Ingraham JB, Poelwijk FJ, Schärfe CPI, Springer M, Sander C, et al. Mutation effects predicted from sequence co‐variation. Nat Biotechnol. 2017;35(2):128–135. 10.1038/nbt.3769
- Hu H, Columbus J, Zhang Y, Wu D, Lian L, Yang S, et al. A map of WW domain family interactions. Proteomics. 2004;4(3):643–655. 10.1002/pmic.200300632
- Ilsley JL, Sudol M, Winder SJ. The WW domain: linking cell signalling to the membrane cytoskeleton. Cell Signal. 2002;14(3):183–189. 10.1016/S0898-6568(01)00236-4
- Jiang X‐L, Dimas RP, Chan CTY, Morcos F. Coevolutionary methods enable robust design of modular repressors by reestablishing intra‐protein interactions. Nat Commun. 2021;12(1):5592. 10.1038/s41467-021-25851-6
- Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments. Philos Trans R Soc Math Phys Eng Sci. 2016;374(2065):20150202. 10.1098/rsta.2015.0202
- Kato Y, Nagata K, Takahashi M, Lian L, Herrero JJ, Sudol M, et al. Common mechanism of ligand recognition by group II/III WW domains. J Biol Chem. 2004;279(30):31833–31841. 10.1074/jbc.M404719200
- Kazan IC, Sharma P, Rahman MI, Bobkov A, Fromme R, Ghirlanda G, et al. Design of novel cyanovirin‐N variants by modulation of binding dynamics through distal mutations. eLife. 2022;11:e67474. 10.7554/eLife.67474
- Kolbaba‐Kartchner B, Can Kazan I, Mills JH, Banu Ozkan S. The role of rigid residues in modulating TEM‐1 β‐lactamase function and thermostability. Int J Mol Sci. 2021;22(6):1–19. 10.3390/ijms22062895
- Kraemer‐Pecore CM, Lecomte JTJ, Desjarlais JR. A de novo redesign of the WW domain. Protein Sci. 2009;12(10):2194–2205. 10.1110/ps.03190903
- Kumar A, Glembo TJ, Ozkan SB. The role of conformational dynamics and allostery in the disease development of human ferritin. Biophys J. 2015;109(6):1273–1281. 10.1016/j.bpj.2015.06.060
- Larrimore KE, Kazan IC, Kannan L, Kendle RP, Jamal T, Barcus M, et al. Plant‐expressed cocaine hydrolase variants of butyrylcholinesterase exhibit altered allosteric effects of cholinesterase activity and increased inhibitor sensitivity. Sci Rep. 2017;7(1):10419. 10.1038/s41598-017-10571-z
- Macias MJ, Hyvönen M, Baraldi E, Schultz J, Sudol M, Saraste M, et al. Structure of the WW domain of a kinase‐associated protein complexed with a proline‐rich peptide. Nature. 1996;382(6592):646–649. 10.1038/382646a0
- Macias MJ, Wiesner S, Sudol M. WW and SH3 domains, two different scaffolds to recognize proline‐rich ligands. FEBS Lett. 2002;513(1):30–37. 10.1016/S0014-5793(01)03290-2
- Mark P, Nilsson L. Structure and dynamics of the TIP3P, SPC, and SPC/E water models at 298 K. J Phys Chem A. 2001;105(43):9954–9960. 10.1021/jp003020w
- Meiler J, Baker D. ROSETTALIGAND: protein‐small molecule docking with full side‐chain flexibility. Proteins Struct Funct Bioinforma. 2006;65(3):538–548. 10.1002/prot.21086
- Modi T, Campitelli P, Kazan IC, Ozkan SB. Protein folding stability and binding interactions through the lens of evolution: a dynamical perspective. Curr Opin Struct Biol. 2021;66:207–215. 10.1016/j.sbi.2020.11.007
- Modi T, Huihui J, Ghosh K, Ozkan SB. Ancient thioredoxins evolved to modern‐day stability–function requirement by altering native state ensemble. Philos Trans R Soc B Biol Sci. 2018;373(1749):20170184. 10.1098/rstb.2017.0184
- Modi T, Risso VA, Martinez‐Rodriguez S, Gavira JA, Mebrat MD, Van Horn WD, et al. Hinge‐shift mechanism as a protein design principle for the evolution of β‐lactamases from substrate promiscuity to specificity. Nat Commun. 2021;12(1):1852. 10.1038/s41467-021-22089-0
- Nevin Gerek Z, Kumar S, Banu Ozkan S. Structural dynamics flexibility informs function and evolution at a proteome scale. Evol Appl. 2013;6(3):423–433. 10.1111/eva.12052
- Ose NJ, Butler BM, Kumar A, Kazan IC, Sanderford M, Kumar S, et al. Dynamic coupling of residues within proteins as a mechanistic foundation of many enigmatic pathogenic missense variants. PLoS Comput Biol. 2022;18(4):e1010006. 10.1371/journal.pcbi.1010006
- Rotin D. WW (WWP) Domains: From Structure to Function. Protein Modules in Signal Transduction; NY, USA: Springer, 1998. p. 115–133. 10.1007/978-3-642-80481-6_5
- Russ WP, Figliuzzi M, Stocker C, Barrat‐Charlaix P, Socolich M, Kast P, et al. An evolution‐based model for designing chorismate mutase enzymes. Science. 2020;369(6502):440–445. 10.1126/science.aba3304
- Russ WP, Lowery DM, Mishra P, Yaffe MB, Ranganathan R. Natural‐like function in artificial WW domains. Nature. 2005;437(7058):579–583. 10.1038/nature03990
- Salomon‐Ferrer R, Götz AW, Poole D, Le Grand S, Walker RC. Routine microsecond molecular dynamics simulations with AMBER on GPUs. 2. Explicit solvent particle mesh Ewald. J Chem Theory Comput. 2013;9(9):3878–3888. 10.1021/ct400314y
- Socolich M, Lockless SW, Russ WP, Lee H, Gardner KH, Ranganathan R. Evolutionary information for specifying a protein fold. Nature. 2005;437(7058):512–518. 10.1038/nature03991
- Stevens AO, Kazan IC, Ozkan B, He Y. Investigating the allosteric response of the PICK1 PDZ domain to different ligands with all‐atom simulations. Protein Sci. 2022;31(12):e4474. 10.1002/pro.4474
- Sudol M. Structure and function of the WW domain. Prog Biophys Mol Biol. 1996;65(1–2):113–132. 10.1016/S0079-6107(96)00008-9
- Sudol M, Hunter T. NeW wrinkles for an old domain. Cell. 2000;103(7):1001–1004. 10.1016/S0092-8674(00)00203-8
- Teilum K, Olsen JG, Kragelund BB. Functional aspects of protein flexibility. Cell Mol Life Sci. 2009;66(14):2231–2247. 10.1007/s00018-009-0014-6
- Voelz VA, Shell MS, Dill KA. Predicting peptide structures in native proteins from physical simulations of fragments. PLoS Comput Biol. 2009;5(2):e1000281. 10.1371/journal.pcbi.1000281
- Wang R, Lai L, Wang S. Further development and validation of empirical scoring functions for structure‐based binding affinity prediction. J Comput Aided Mol Des. 2002;16(1):11–26. 10.1023/A:1016357811882
- Xu Y, Colletier J‐P, Weik M, Jiang H, Moult J, Silman I, et al. Flexibility of aromatic residues in the active‐site gorge of acetylcholinesterase: x‐ray versus molecular dynamics. Biophys J. 2008;95(5):2500–2511. 10.1529/biophysj.108.129601
- Zou T, Risso VA, Gavira JA, Sanchez‐Ruiz JM, Ozkan SB. Evolution of conformational dynamics determines the conversion of a promiscuous generalist into a specialist enzyme. Mol Biol Evol. 2015;32(1):132–143. 10.1093/molbev/msu281
- Zou T, Woodrum BW, Halloran N, Campitelli P, Bobkov AA, Ghirlanda G, et al. Local interactions that contribute minimal frustration determine foldability. J Phys Chem B. 2021;125(10):2617–2626. 10.1021/acs.jpcb.1c00364
