Proceedings of the National Academy of Sciences of the United States of America. 2011 Mar 28;108(15):6133–6138. doi: 10.1073/pnas.1017311108

Mapping allostery through the covariance analysis of NMR chemical shifts

Rajeevan Selvaratnam a, Somenath Chowdhury a, Bryan VanSchouwen a, Giuseppe Melacini a,b,1
PMCID: PMC3076865  PMID: 21444788

Abstract

Allostery is a fundamental mechanism of regulation in biology. The residues at the end points of long-range allosteric perturbations are commonly identified by the comparative analyses of structures and dynamics in apo and effector-bound states. However, the networks of interactions mediating the propagation of allosteric signals between the end points often remain elusive. Here we show that the covariance analysis of NMR chemical shift changes caused by a set of covalently modified analogs of the allosteric effector (i.e., agonists and antagonists) reveals extended networks of coupled residues. Unexpectedly, such networks reach not only sites subject to effector-dependent structural variations, but also regions that are controlled by dynamically driven allostery. In these regions the allosteric signal is propagated mainly by dynamic rather than structural modulations, which result in subtle but highly correlated chemical shift variations. The proposed chemical shift covariance analysis (CHESCA) identifies interresidue correlations based on the combination of agglomerative clustering (AC) and singular value decomposition (SVD). AC results in dendrograms that define functional clusters of coupled residues, while SVD generates score plots that provide a residue-specific dissection of the contributions to binding and allostery. The CHESCA approach was validated by applying it to the cAMP-binding domain of the exchange protein directly activated by cAMP (EPAC) and the CHESCA results are in full agreement with independent mutational data on EPAC activation. Overall, CHESCA is a generally applicable method that utilizes a selected chemical library of effector analogs to quantitatively decode the binding and allosteric information content embedded in chemical shift changes.


Long-range allosteric perturbations are propagated not only by structural changes but also by effector-dependent modulations in dynamics (1–23). The end points of these long-range allosteric signal propagations are effectively characterized by the comparative analysis of the structural and dynamic profiles of apo and effector-bound states (2, 7). However, what often remains experimentally challenging is defining the networks of residues that mediate the cross-talk between distal sites. Such clusters of coupled residues are particularly elusive in allosteric processes with a significant dynamically driven component (11–17), because in this case the allosteric signal propagation relies on subtle, but critical, conformational and side-chain packing rearrangements that often fall below the resolution of common X-ray or NMR structure determination methods (2, 7, 24).

Here we introduce a general experimental method to map allosteric networks based on the covariance analysis of NMR chemical shifts. The chemical shift covariance analysis (CHESCA) rests on two simple but general assumptions. The first is that the subtle but functionally relevant structural changes that underlie the allosteric modulations of dynamics are effectively probed by accurately measured NMR chemical shift variations, even when they escape detection through traditional structure determination methods (24–27). The second is that when a system is subject to a set of perturbations, residues that belong to the same effector-dependent allosteric network exhibit a concerted response to the perturbation set. This collective response is effectively sensed through correlations between the chemical shift variations experienced by different residues as a result of the perturbations. Such perturbations may arise from mutations or from chemical modification of the ligand effector. In either case, the perturbation set includes both active and inactive states, enabling the identification of networks of correlated residues linked to activation.

The CHESCA method is illustrated and validated here through its application to the multidomain signaling protein EPAC (Fig. 1A) (28). EPAC is a guanine exchange factor controlled by the second messenger cAMP and epitomizes signaling proteins that function as molecular switches and signal transducers (28). In such systems, the ligand-dependent activation is often adequately explained by the coupling of binding and activation equilibria, resulting in the four-state thermodynamic cycle illustrated in Fig. 1B (29). According to this model, in the apo state the activation equilibrium is shifted toward inactive conformations, while in the effector-bound (holo) state the activation equilibrium is shifted toward the active conformations (29) (Fig. 1B). The structures of EPAC in both apo and holo states have been recently solved (30, 31), revealing that the main difference between the inactive and active conformations is in their overall topology. The inactive state is defined by a “closed” topology, whereby the regulatory region of EPAC (Fig. 1A) sterically occludes access of substrate Rap proteins to the catalytic site (Fig. 1B) (30). In the absence of cAMP, the steric occlusion in the inactive state is secured by a cluster of salt bridges between the catalytic segment of EPAC and the N-terminal α1,2 helices, which are part of the EPAC cAMP-binding domain (CBD) (Fig. 1B) (30). This inhibitory salt bridge cluster is commonly referred to as the ionic latch (IL) (Fig. 1B).

Fig. 1.

(A) Domain and region organization of EPAC1. The cAMP-binding domain (CBD) analyzed by NMR is highlighted in orange. (B) Allosteric model of EPAC regulation based on coupled activation and cAMP-binding equilibria. CBD:β denotes the β-subdomain and IL stands for ionic latch. (C) Local rmsd between the structures of the EPAC CBD in the apo and holo states (PDB ID codes 2BYV and 3CF6, respectively). The secondary structure is reported with dashed lines. The regions undergoing significant conformational changes (i.e., PBC and hinge helix) are highlighted in green, while the IL zone is marked in red. The IL residues are indicated by vertical bars. Residues in the N-terminal helical bundle (NTHB) that display enhanced dynamics in the ps–ns and/or ms–μs time scale upon cAMP binding are indicated by red dots. PBC denotes the cAMP phosphate binding cassette. (D) Ribbon diagrams of the superimposed apo (gray) and holo (green) EPAC CBD structures (PDB ID codes as in C). The holo ribbon is shown only for the regions where it differs significantly from the apo state as judged based on the local rmsd reported in C. cAMP is shown as black spheres. The IL residues are shown in red. The dashed contours define the PBC and IL regions.

Upon cAMP binding, no significant changes in local structure are observed for the IL region as assessed based on the rmsd values shown in Fig. 1C (red region) (30, 31). However, cAMP enhances the ps–ns and ms–μs dynamics in the IL zone (red dots in Fig. 1C), increasing the entropic penalty for the IL and thus contributing to the weakening of the inhibitory IL interactions between the regulatory and the catalytic regions (15, 16). The dynamically promoted release of the IL induced by cAMP assists the transition to “open” active structures (Fig. 1B), in which a hinge rotation of a helix C-terminal to the CBD (α6, Fig. 1C) moves the regulatory region away from the catalytic region, providing Rap unhindered access to the catalytic site of EPAC (Fig. 1B) (15, 31, 32).

The enhancement of IL dynamics linked to cAMP binding provides an initial explanation for how the inhibitory IL is remotely controlled by cAMP in the framework of dynamically driven allostery, but it is still unclear how the cAMP signal propagates from the cAMP phosphate binding cassette (PBC) and the adjacent α6 hinge helix (green regions in Fig. 1D) to the distal IL region, where dynamics is enhanced (red sites in Fig. 1D). Here we address this question by applying the CHESCA method, which results in a PBC-to-IL signaling pathway independently supported through mutagenesis (33) and conserved coevolutionary patterns reported for known CBDs (34). The application of the chemical shift covariance approach to EPAC illustrates therefore the effectiveness of the CHESCA principle for defining intramolecular allosteric networks that control signal transduction.

Results and Discussion

Selection of Perturbations.

The first step in the implementation of the chemical shift covariance method is a careful choice of the states and of the associated perturbations used to identify correlations between residues. In general, a primary criterion for the selection of perturbations useful in mapping networks of coupled residues that control biological function, such as enzyme activity, is the inclusion of both active and inactive states that sample a diverse activation spectrum. In the case of the EPAC CBD, such perturbations are provided by the binding of the endogenous activator (i.e., cAMP, Fig. 2A) and of three related analogs: 2′-OMe-cAMP and the diastereoisomeric phosphorothioate cAMP analogs Rp- and Sp-cAMPS (Fig. 2A). The covalent modifications of these three cAMP surrogates perturb key hydrogen bonds that anchor cAMP to the phosphate binding cassette (Fig. 2B) and result in different degrees of EPAC activation, as indicated by the wide range of relative kmax values spanned by the selected cAMP analogs (Fig. 2A) (33). The relative kmax value quantifies the guanine exchange activity of EPAC at saturating concentrations of effector ligand and is a useful indicator of the position of the activation equilibrium (Fig. 1B) (33). In the inactive states, such as the apo or the antagonist Rp-cAMPS-bound states, kmax,relative = 0, corresponding to at least partial guanine nucleotide exchange inhibition, while at saturating concentrations of the cAMP activator kmax,relative = 1 (Fig. 2A) (33). However, even at saturating concentrations, cAMP does not fully shift the activation equilibrium to the active state and therefore other ligands with enhanced active vs. inactive state selectivity cause superactivation (kmax,relative > 1) by further increasing the population of the active state. This is the case for the superagonists Sp-cAMPS and 2′-OMe-cAMP, which display a kmax,relative > 1 (Fig. 2A) (33). This wide kmax,relative range from 0 to > 1 indicates that the chosen perturbations are able to modulate the populations of inactive and active conformations, locking the activation equilibrium at different positions and making the chosen ligands an effective tool to map functional allosteric networks. In summary, the five states used for CHESCA are the apo and the four holo states (Fig. 2A), all at saturating concentrations (Fig. 2C and Fig. S1). The apo and Rp-cAMPS-bound states are inactive, while the other three states are active.

Fig. 2.

(A) Covalent structure and related modifications of cAMP utilized to modulate the activation equilibrium of EPAC. For each ligand the relative kmax value is reported (kmax,rel). (B) Key residues and hydrogen bonds that anchor cAMP to the PBC. cAMP sites that are modified in the cAMP analogs used in this work are highlighted with colored circles (color coding as in A). (C) Binding isotherms for high and low affinity ligands measured through relative chemical shift changes of L207. (D) Hypothetical scheme showing correlated chemical shift variations for two different residues i and j. (E) For residues subject to a two-state active/inactive fast exchange, a linear pattern is often observed. (F–H) Representative interresidue combined chemical shift correlations (SI Text).

The Chemical Shift Covariance Analysis (CHESCA).

Once the 1H-15N HSQC spectra of the five selected states are fully assigned, pairwise correlations between chemical shift variations experienced by different residues are analyzed for the purpose of identifying networks of coupled residues. A simple proof of the linear correlations between chemical shift variations is obtained when the allosteric activation is adequately described by a single two-state and fast exchanging equilibrium (Fig. 2E and SI Text). In this respect it should be noted that the inactive/active equilibrium can still be and often is in the fast exchange regime even when the apo/holo exchange is slow in the chemical shift time scale. According to the two-state activation model in the fast exchange regime, the chemical shifts measured for residues far removed from the effector binding site are modeled simply as population weighted linear averages of the chemical shifts in the active and inactive states, implying that the chemical shifts observed for different residues sensing the same inactive/active equilibrium are linearly correlated.

The interresidue linear chemical shift correlation can be proven in more formal terms (SI Text) if we denote as δi,Ac (and δj,Ac), δi,In (and δj,In), and δik (and δjk) the combined chemical shifts of residue i (and j) in the active, inactive, and kth perturbed states, respectively. The combined chemical shift, δNHcomb, typically refers to weighted contributions from the ppm values of the amide proton and nitrogen, with weights wH = 1 and wN = 0.2 (35). If amino acids i and j are sufficiently distant from the sites of the perturbations, their chemical shifts sense exclusively the perturbation-linked changes in the active/inactive conformational equilibrium. In this case, δik and δjk are related through the linear equation (SI Text):

$$\delta_{ik} = \alpha\,\delta_{jk} + \beta \qquad [1]$$

where α = Δδi/Δδj, Δδi = (δi,Ac − δi,In), Δδj = (δj,Ac − δj,In), and β = δi,In − αδj,In.
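
The steps leading to Eq. 1 can be sketched as follows. This is a reconstruction from the two-state fast-exchange assumption stated above, not a copy of the SI derivation; the symbol p_k (the fractional population of the active state in the kth perturbed state) is introduced here for illustration, and the argument requires Δδj ≠ 0.

```latex
\begin{align}
\delta_{ik} &= p_k\,\delta_{i,Ac} + (1-p_k)\,\delta_{i,In}
             = \delta_{i,In} + p_k\,\Delta\delta_i \\
\delta_{jk} &= \delta_{j,In} + p_k\,\Delta\delta_j
  \quad\Longrightarrow\quad
  p_k = \frac{\delta_{jk}-\delta_{j,In}}{\Delta\delta_j} \\
\delta_{ik} &= \delta_{i,In}
  + \frac{\Delta\delta_i}{\Delta\delta_j}\,\bigl(\delta_{jk}-\delta_{j,In}\bigr)
  = \alpha\,\delta_{jk} + \beta ,
\qquad
\alpha = \frac{\Delta\delta_i}{\Delta\delta_j},\quad
\beta = \delta_{i,In} - \alpha\,\delta_{j,In}
\end{align}
```

Because p_k is eliminated, the slope depends only on the ratio of the two shift differences, while the degree of correlation does not depend on their sizes.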

Eq. 1 has two key implications. First, it should be noted that Eq. 1 is valid irrespective of the size of Δδi and Δδj, indicating that, if residues i and j belong to the same allosteric network, their perturbation-dependent chemical shift variations are expected to be highly linearly correlated regardless of their magnitude. This observation suggests that even minor ppm variations are potentially significant for the definition of allosteric networks, if measured accurately and precisely (SI Text). Second, because Eq. 1 was derived based on the assumption that the chemical shifts observed in the different perturbed states are a weighted average of the ppm values in the active and inactive conformations (Eqs. S1 and S2), it is expected that when a network of residues is functionally related to allostery, the chemical shifts of the active states (e.g., agonist bound) should be well separated from those of the inactive states (e.g., apo and antagonist bound); i.e., the active and inactive states should cluster into two distinct groups in the chemical shift correlation plots, as shown for the representative correlations of Fig. 2 F–H. Taken together, these two implications of Eq. 1 (i.e., the high degree of correlation between the δik and δjk chemical shifts, independently of the magnitudes of Δδi and of Δδj, and the separate clustering of active vs. inactive states) define the foundation of the CHESCA approach and provide two simple, but effective, criteria for the identification of allosteric networks. These considerations are not necessarily limited to a two-state model, because in the presence of multistate equilibria the interresidue chemical shift correlations are still linear when the relative chemical shift changes experienced by the two correlated residues form isomorphic patterns (SI Text and Fig. 2D).
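
The first implication can be checked numerically with a minimal sketch. The active-state populations and the per-residue shifts below are invented for illustration and are not EPAC data; NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical active-state populations for five perturbed states
# (two mostly inactive, three mostly active); illustrative values only.
p_active = np.array([0.05, 0.00, 0.70, 0.90, 0.95])

def fast_exchange_shift(delta_in, delta_ac, p):
    """Population-weighted average shift for a two-state fast exchange."""
    return (1.0 - p) * delta_in + p * delta_ac

# Residue i: large shift difference between inactive and active states.
delta_i = fast_exchange_shift(8.20, 8.80, p_active)
# Residue j: shift difference 30 times smaller and of opposite sign.
delta_j = fast_exchange_shift(7.50, 7.48, p_active)

# Pearson correlation between the two residues across the five states.
r_ij = np.corrcoef(delta_i, delta_j)[0, 1]
# Least-squares line delta_i = alpha * delta_j + beta, as in Eq. 1.
alpha, beta = np.polyfit(delta_j, delta_i, 1)

print(f"r = {r_ij:.6f}, slope = {alpha:.2f}, intercept = {beta:.2f}")
```

With these inputs Δδi = 0.6 ppm and Δδj = −0.02 ppm, so Eq. 1 predicts a slope of −30 and a perfect (anti)correlation even though one residue barely moves.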

Identification of Networks.

In order to facilitate the systematic implementation of the two Eq. 1-based criteria for allosteric networks, the combined 1H and 15N chemical shifts of the EPAC CBD construct EPAC1h (149–318) in the apo and in the four holo states (i.e., under saturating conditions of Rp-cAMPS, cAMP, Sp-cAMPS, and 2′-OMe-cAMP) were compiled into an nr × 5 data matrix M, where nr is the number of residues for which assignments are available in all five states (Fig. S2). The correlation matrix (R) of M transpose was then computed, whereby the rij elements of R represent the Pearson’s correlation coefficients between residues i and j (SI Text, Eq. S9, and Fig. S2). The off-diagonal elements of R corresponding to |rij| values ≥0.98 are displayed in Fig. 3A and three representative correlations are shown in Fig. 2 F–H. While for all three correlations of Fig. 2 F–H rij > 0.98, their slopes vary by more than one order of magnitude (Fig. 2 F–H), confirming that the chemical shift variations of two residues can be highly correlated even when their magnitudes are markedly different, as anticipated based on Eq. 1. This observation explains why in the R matrix of Fig. 3A several cross-peaks connect regions undergoing changes in conformation, and subject to major chemical shift variations (green highlights in Figs. 1C and 3 A and F), to regions undergoing changes in local dynamics rather than in local structure, and subject only to minor chemical shift variations such as the IL zone (red highlights in Figs. 1C and 3 A and F). The R “cross-peaks” of Fig. 3A provide therefore an initial appreciation of the potential of chemical shift correlations as a reliable tool to map the interaction networks underlying allosteric processes, including dynamically driven allostery. However, based on simple visual inspection of R (Fig. 3A) it is difficult to reliably and systematically identify such allosteric networks. For this purpose it is necessary to analyze the R matrix using agglomerative clustering (AC) methods, which are aimed at effectively identifying groups of coupled residues.
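
The construction of M and R described above can be sketched with NumPy. The shift values are synthetic, the function names are hypothetical, and the linear form of the combined shift (wH·δH + wN·δN) is an assumption made for this sketch; the exact definition is given in the original equation and ref. 35.

```python
import numpy as np

def combined_shift(delta_h, delta_n, w_h=1.0, w_n=0.2):
    # ASSUMPTION: linear weighted sum of 1H and 15N ppm values.
    return w_h * delta_h + w_n * delta_n

def correlation_matrix(M):
    """Pearson correlations between rows (residues) of an n_r x n_states matrix."""
    return np.corrcoef(M)

def correlated_pairs(R, cutoff=0.98):
    """Residue index pairs (i, j), i < j, with |r_ij| >= cutoff."""
    i, j = np.where(np.triu(np.abs(R) >= cutoff, k=1))
    return list(zip(i.tolist(), j.tolist()))

# Synthetic patterns over five states: apo, Rp-cAMPS, cAMP, Sp-cAMPS, 2'-OMe-cAMP.
p_active = np.array([0.05, 0.00, 0.70, 0.90, 0.95])  # hypothetical populations
bound = np.array([0.0, 1.0, 1.0, 1.0, 1.0])          # ligand present or not

# Residues 0 and 1 follow activation; residues 2 and 3 follow binding.
patterns = np.array([p_active, p_active, bound, bound])
H = np.array([8.2, 7.5, 9.0, 8.0])[:, None] \
    + np.array([0.6, -0.02, 0.3, -0.05])[:, None] * patterns
N = np.array([120.0, 118.0, 125.0, 110.0])[:, None] \
    + np.array([1.0, -0.1, 0.5, 0.0])[:, None] * patterns

M = combined_shift(H, N)      # n_r x 5 data matrix
R = correlation_matrix(M)     # n_r x n_r correlation matrix
print(correlated_pairs(R))
```

In this toy case only the pairs that share a response pattern pass the 0.98 cutoff, irrespective of how large their individual shift changes are.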

Fig. 3.

(A) Chemical shift correlation matrix for the EPAC CBD. If the absolute value of the Pearson’s correlation coefficient between a pair of residues is ≥0.98 the corresponding cross-peak is marked with a dot. The two clusters of residues identified through the agglomerative algorithm illustrated in B and C are marked with blue and black lines, respectively. The secondary structure is reported as in Fig. 1C. (B and C) Dendrograms for clusters I and II, respectively. Residues linked through the dendrograms are reported at the bottom of each tree. All nodes shown correspond to correlation coefficients with absolute values ≥0.98. (D) Dendrogram for the agglomerative clustering of the five states (apo and four holo) obtained using the chemical shift submatrix containing only the residues of cluster I. (E) As D but for cluster II. (F) For each residue the maximum variation in δNHcomb. among the five states is reported as absolute value. Residues belonging to clusters I and II are reported as blue and black bars, respectively. Residues that do not belong to clusters I or II are indicated by dotted lines. Residues for which no line is reported are those for which unambiguous assignments in all five states are unavailable. The ppm changes for G269 and A280 are off-scale as indicated by the arrows.

Agglomerative Clustering (AC).

AC is a method commonly used to identify gene networks from microarray data, and it can be generally applied to identify clusters based on correlation matrices, such as R (Fig. S2) (36). Specifically, AC assigns a first intracluster link to the residue pair with the highest absolute value of the correlation coefficient within R. A second intracluster link is then assigned between the first residue pair and the so-called “nearest neighbor” among the remaining residues, i.e., the residue with the highest absolute value of the correlation coefficient to either of the residues in the original pair. Subsequent links in this hierarchical clustering process are created in a similar way and are graphically represented through dendrograms. Dendrograms are tree-like diagrams in which the linked residues are aligned in a horizontal row, while the vertical axis reports the magnitude of the correlation coefficient at which each successive residue is added to the previous groups. For instance, Fig. 3B shows the dendrogram for the largest cluster found in R with a cutoff of |rij|≥0.98 (SI Text). This cluster will be referred to as cluster I and consists of 38 residues (Fig. 3B and Table S4), which are connected by a blue grid in Fig. 3A. The second largest cluster with |rij|≥0.98 is denoted as cluster II and contains 10 residues (Fig. 3C; black grid in Fig. 3A). All other clusters within the 0.98 |rij| cutoff have only two or a maximum of three residues and will not be considered further here because they are not large enough to define a network.
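
The nearest-neighbor linking described above can be sketched as a greedy procedure on |rij|. This sketch assumes that the rule corresponds to what is usually called single linkage; the linkage actually used may be specified differently in SI Text. The function name and the toy matrix of correlation-like values are hypothetical.

```python
import numpy as np

def single_linkage_clusters(R, cutoff=0.98):
    """Greedy nearest-neighbour (single-linkage) clustering on |r_ij|.

    Returns (clusters, merges): clusters holds sorted index lists with at
    least two members, largest first; merges lists (i, j, |r_ij|) in the
    order in which the links are formed.
    """
    A = np.abs(np.asarray(R, dtype=float))
    n = A.shape[0]
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Visit residue pairs from the strongest to the weakest correlation.
    pairs = [(A[i, j], i, j) for i in range(n) for j in range(i + 1, n)]
    pairs.sort(key=lambda t: -t[0])
    merges = []
    for r, i, j in pairs:
        if r < cutoff:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            merges.append((i, j, float(r)))

    groups = {}
    for k in range(n):
        groups.setdefault(find(k), []).append(k)
    clusters = sorted((g for g in groups.values() if len(g) > 1),
                      key=len, reverse=True)
    return clusters, merges

# Toy symmetric matrix of correlation-like values (not EPAC data).
R_toy = np.array([
    [1.000, 0.995, 0.990, 0.200,  0.100],
    [0.995, 1.000, 0.970, 0.300,  0.200],
    [0.990, 0.970, 1.000, 0.100,  0.000],
    [0.200, 0.300, 0.100, 1.000, -0.985],
    [0.100, 0.200, 0.000, -0.985, 1.000],
])
clusters, merges = single_linkage_clusters(R_toy)
print(clusters)
print(merges)
```

Residues 1 and 2 end up in the same cluster although their direct correlation (0.97) is below the cutoff, because both are linked to residue 0; the recorded merge heights are the values a dendrogram would plot on its vertical axis.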

The AC analysis and the related dendrograms illustrated in Fig. 3 A–C indicate that the residues in each of the identified networks I and II display a highly concerted response to the perturbations used to build the correlation matrix R; i.e., they are highly cross-correlated within each cluster and in this respect they fulfill the first criterion for a network, as proposed above based on Eq. 1 (i.e., the high degree of correlation between the δik and δjk chemical shifts). However, the clustering analysis of Fig. 3 A–C does not provide any information on the second criterion for an allosteric network mentioned above, i.e., the separate clustering of active and inactive states. Namely, the dendrograms of Fig. 3 B and C have the merit of identifying the residues belonging to clusters I and II (Fig. S2), but they do not provide any information on the function of these networks in terms of their role in allostery or effector binding (i.e., functional assignment).

Functional Assignment of Networks.

In order to assign a function to clusters I and II, we propose two independent approaches. The first method is still largely based on correlation matrices and subsequent hierarchical clustering, while the second relies on singular value decomposition (SVD). Specifically, the first approach starts with the selection of the submatrix of M corresponding to cluster I residues (i.e., matrix MI in Fig. S2) and the computation of the correlation matrix of MI (i.e., RI). RI is then analyzed according to the same agglomerative clustering algorithm discussed above for Fig. 3 B and C, but this time to group states based on residues rather than residues based on states (Fig. S2). The resulting dendrogram is shown in Fig. 3D, which clearly indicates that cluster I correctly separates the active from the inactive states, confirming that cluster I functions as an allosteric activation network. For example, Fig. 2 F–H displays three representative correlations between cluster I residues with well-separated active and inactive states. Interestingly, when a similar protocol is applied to the cluster II submatrix of M (Fig. S2), the separation is not between active and inactive states but between the apo and the bound states (Fig. 3E), strongly suggesting that the function of cluster II, unlike cluster I, is linked more to binding than activation.
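
Grouping states rather than residues amounts to correlating the columns of the cluster submatrix instead of its rows. A minimal sketch on a toy submatrix follows; the populations and shifts are invented, and the nearest-partner readout is a simplification of a full dendrogram.

```python
import numpy as np

states = ["apo", "Rp-cAMPS", "cAMP", "Sp-cAMPS", "2'-OMe-cAMP"]
p_active = np.array([0.05, 0.00, 0.70, 0.90, 0.95])  # hypothetical populations

# Toy cluster submatrix M_I (4 residues x 5 states), two-state fast exchange.
delta_in = np.array([7.5, 8.5, 7.5, 8.5])       # inactive-state shifts (ppm)
delta_diff = np.array([0.1, 0.1, -0.1, -0.1])   # active minus inactive (ppm)
M_I = delta_in[:, None] + delta_diff[:, None] * p_active[None, :]

# Correlate columns (states) instead of rows (residues).
R_I = np.corrcoef(M_I, rowvar=False)

# For each state, report the most strongly correlated other state.
masked = R_I - 2.0 * np.eye(len(states))  # push the diagonal out of the way
nearest = {s: states[k] for s, k in zip(states, masked.argmax(axis=1))}
for s, t in nearest.items():
    print(f"{s:12s} -> {t}")
```

In this toy case the two inactive states pair with each other and every active state pairs with another active state, which is the separation expected for an activation-linked cluster.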

A second independent method used to validate the functional assignments of networks I and II to allostery and binding, respectively, is based on SVD (36, 37). Considering that SVD is designed to search for orthogonal directions of maximum variance, an effective assignment of allosteric vs. binding functions is obtained when SVD is employed to factorize matrix M′ rather than M (SI Text and Fig. S2). M′, unlike M, is a matrix that compiles relative rather than absolute combined chemical shifts. Specifically, M′ is calculated using the antagonist-bound state (i.e., Rp-cAMPS) as reference. This means that for each residue M′ reports four δNHcomb differences: Apo vs. Rp-cAMPS, cAMP vs. Rp-cAMPS, Sp-cAMPS vs. Rp-cAMPS, and 2′-OMe-cAMP vs. Rp-cAMPS (Fig. S2). The matrix M′ is then column mean centered and factorized through SVD (Fig. S2). The first two principal components (PCs) identified through SVD account for >96% of the total variance (Table S2) and therefore the other PCs can be safely discarded. The resulting loading and score plots for the first two PCs are displayed in Fig. 4.
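
The SVD step can be sketched as follows. The function name is hypothetical, the data are synthetic, and "column mean centered" is taken here to mean subtracting from each column its mean over residues, which is an assumption about the preprocessing.

```python
import numpy as np

def chesca_svd(M, ref_col):
    """Reference-subtracted, column-centred SVD of an n_r x n_states matrix.

    Returns scores (one row per residue), loadings (one row per state
    difference) and the fraction of variance carried by each component.
    """
    keep = [k for k in range(M.shape[1]) if k != ref_col]
    M_rel = M[:, keep] - M[:, [ref_col]]                 # differences vs. reference
    M_c = M_rel - M_rel.mean(axis=0, keepdims=True)      # centre each column
    U, s, Vt = np.linalg.svd(M_c, full_matrices=False)
    scores = U * s
    loadings = Vt.T
    explained = s**2 / np.sum(s**2)
    return scores, loadings, explained

# Synthetic data: states are apo, Rp-cAMPS, cAMP, Sp-cAMPS, 2'-OMe-cAMP.
p_active = np.array([0.05, 0.00, 0.70, 0.90, 0.95])  # hypothetical populations
bound = np.array([0.0, 1.0, 1.0, 1.0, 1.0])          # ligand present or not
allo_amp = np.array([0.6, -0.3, 0.4, 0.0, 0.0, 0.05])   # response to activation
bind_amp = np.array([0.0, 0.02, 0.0, 0.8, -0.5, 0.3])   # response to binding
base = np.array([30.1, 31.4, 32.0, 29.8, 33.3, 30.9])   # arbitrary offsets

M = (base[:, None]
     + allo_amp[:, None] * p_active[None, :]
     + bind_amp[:, None] * bound[None, :])

scores, loadings, explained = chesca_svd(M, ref_col=1)   # Rp-cAMPS as reference
print(np.round(explained, 4))
```

Because the synthetic shifts are built from only two response patterns, the first two components carry essentially all of the variance; with experimental data the retained fraction has to be checked, as done above with Table S2.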

Fig. 4.

Score and loading plots for the first two principal components derived from the SVD of matrix M′, the matrix of combined chemical shift differences relative to the Rp-cAMPS-bound state (Fig. S2). Circles correspond to scores and diamonds to loadings. Selected scores are labeled with the related residue number (i.e., M′ row) and each loading is labeled with the respective state difference (i.e., M′ column). The scores for residues in clusters I and II (Fig. 3) are shown as blue and black filled circles, respectively. Scores for residues that do not belong to cluster I or II are shown as open circles. Ellipsoids at one and two standard deviations for the first two principal components are displayed with solid black lines. The inset shows an expansion of the score plot within one standard deviation.

The loading plot is useful to decode the chemical meaning of each PC, while the residue-specific scores quantify the contribution of each residue to each PC. For example, the loading plot (Fig. 4, diamonds) clearly indicates that the major contribution to the first PC (PC1) is from the Apo vs. Rp-cAMPS column of the difference matrix M′. Because the Rp-cAMPS antagonist binds EPAC but does not activate it, this means that PC1 mainly quantifies binding as opposed to allosteric contributions. Fig. 4 also shows that for PC2, unlike PC1, the most significant contributions arise from the cAMP vs. Rp-cAMPS, Sp-cAMPS vs. Rp-cAMPS, and 2′-OMe-cAMP vs. Rp-cAMPS columns of M′. Because cAMP, Sp-cAMPS, 2′-OMe-cAMP, and Rp-cAMPS all bind EPAC but only the first three ligands are able to activate it, PC2 reflects mostly contributions from allosteric activation rather than binding. In summary, the SVD loadings of Fig. 4 indicate that PC1 and PC2 are mainly associated with binding and allostery, respectively.

Based on the loading analysis, the position of the score for a given residue relative to the PC1 and PC2 coordinates (Fig. 4, circles) reflects mainly the contributions of that specific residue to binding and allostery, respectively. We should then expect that the SVD scores for residues identified by AC as belonging to the allosteric cluster I (Fig. 3 B and D) are aligned along PC2, while the SVD scores for residues assigned by AC to the binding-related cluster II (Fig. 3 C and E) are distributed along PC1. This prediction is remarkably well confirmed by Fig. 4, which shows that the scores for residues in clusters I and II (blue and black solid circles, correspondingly) tend to be confined along PC2 and PC1, respectively, with only a minimal spread over the rest of the PC1/PC2 plane. The SVD factorization (Fig. 4) therefore corroborates the AC-based functional assignment of these two networks (Fig. 3 D and E). Further details on the SVD vs. AC comparison are available in the SI Text. Overall, taken together the SVD and AC methods provide a robust protocol (Fig. S2) to evaluate the binding and allosteric relevance of the clusters identified through CHESCA. A further and NMR-independent confirmation of the AC/SVD-based functional assignments of networks I and II to allostery and binding, respectively, is provided by mutational analyses, as explained in SI Text. Once the identification and the functional assignment of networks I and II is validated, further insight into the significance of these clusters is obtained by analyzing how they relate to the residue-specific chemical shift changes (Fig. 3F) and to the known structure of EPAC (Fig. 5).

Fig. 5.

(A) Map of cluster I residues (gray stick and surface representation) onto the ribbon diagram of the EPAC CBD using the same color coding as in Fig. 1D and with the NTHB sites displaying increased dynamics upon cAMP-dependent activation marked in red. Cluster I residues that appear as isolated, noncontiguous pairs are shown separately in Fig. S3 C and D. The dashed red arrow outlines a possible signal propagation pathway from the PBC to the IL based on cluster I. (B and C) Key interactions within cluster I: (B) two spines of hydrophobic residues at the interfaces between α4, α5, and α6; (C) an interaction node centered on R186 and bridging α1, α2, α4, and β1. (D) Cluster II (orange surface and sticks).

Significance of Minor Chemical Shift Changes.

Fig. 3F reports the maximum ppm variations observed for each residue across all the five states of Fig. 2A. One notable result evident from Fig. 3F is that the CHESCA-based allosteric cluster I includes also several residues subject to minimal ppm changes. For instance, Fig. 3F shows that the allosteric cluster I comprises several residues in the IL-spanning N-terminal helices α1 and α2, which display minor ppm differences. Consistently with the minor chemical shift variations (Fig. 3F), this region of EPAC does not change significantly in local conformation upon cAMP-binding (Fig. 1C), but it exhibits enhanced local dynamics upon activation (Fig. 1C). This example illustrates a more general scenario in which minor chemical shift changes are associated with subtle variations in the local environment that fall below the resolution of currently available structure determination methods. Such subtle changes may include minor readjustments in backbone and/or side-chain structure and/or dynamics, e.g., in the energy landscape of the interconverting states within the conformational ensemble. Although these variations often escape direct structural observation, they are sensed through chemical shift differences and underlie the modulations in dynamics that are critical for the allosteric signal propagation (7, 24). It is therefore essential to not only identify such allosterically relevant minor chemical shift changes through AC of the R matrix (Fig. 3 and Fig. S2) but to also ensure that these minor variations in ppm values are reliably measured (SI Text). Further insight into the interactions mediating these chemical shift correlations is gained by mapping clusters I and II onto the structures of the EPAC CBD (Fig. 5).

Structural Analysis of the Allosteric Network Defined by Cluster I.

Fig. 5A shows that the majority of the residues in cluster I define a continuous surface that bridges the gap between the region with most significant conformational changes, i.e., the PBC and the hinge helix (Fig. 5A, green ribbon), and the region that is not subject to any significant local conformational change but displays enhanced dynamics in the active state, i.e., the IL zone in the distal N-terminal helical bundle (NTHB) (Figs. 1C and 5A, red ribbon). Interestingly, these bridging residues of cluster I are almost entirely confined to the noncontiguous α-subdomain of the EPAC CBD, originating from the α5 helix in the PBC and extending into the C-terminal α6 hinge helix and α4-1 in the NTHB, ultimately reaching the inhibitory IL zone (Fig. 5A, red dashed arrow). Within this extensive allosteric network bridging α5 to α1, two subsets of interactions are particularly notable.

First, residues L273 in α5 and F300, I303, and V307 in α6 define a hydrophobic spine (Fig. 5B) that interfaces with another hydrophobic spine in α4 including L207, V211, and V218 (Fig. 5B). These coupled spines within cluster I indicate that the previously identified contacts between L273 and F300 (i.e., “hydrophobic hinge”) (31) are part of a significantly more extended allosteric network of interactions that couples α5 in the PBC to both α6 and α4 (Fig. 5B). Second, α4 is also part of another subset of cluster I residues that connect the α4/β1 region to helices α2 and α1 (Fig. 5C). Although composed mostly of hydrophobic amino acids (Fig. 5C), this second subcluster of cluster I is centered on R186, which serves as an interaction hub stabilized by multiple hydrogen bonds between its guanidinium group and adjacent carbonyl oxygen atoms (Fig. 5C). Together the interactions formed by the hydrophobic spines (Fig. 5B) and those that nucleate around R186 (Fig. 5C) provide an effective allosteric network to propagate the cAMP signal from the PBC to the distal IL region. Such cAMP signal propagation relies on subtle structural repacking and/or rearrangements that escape detection by conventional structure determination methods but are effectively sensed by NH chemical shift changes.

The bridging role from the PBC to the IL outlined above for cluster I helps explain how the IL dynamics are enhanced upon cAMP binding and rationalizes how the IL residues mediate several critical inhibitory salt bridges in a cAMP-dependent manner. Furthermore, cluster I also includes residues that do not bridge the PBC to the IL (i.e., nine out of 38 cluster I residues) and reside in the β-subdomain of the EPAC CBD (Fig. 5A and Fig. S3). Interestingly, these cluster I residues are not part of the β-strands but of the interstrand loops of the typical β-barrel of CBDs (Fig. S3), suggesting that the β-strands serve primarily as a passive scaffold, while the loops play a more active allosteric role. This observation applies both to the isolated interstrand regions located opposite to the PBC (Fig. S3 C and D) and to those adjacent to the PBC (Fig. S3B). The structural analysis of cluster II (Fig. 5D) is available as SI Text.

Conclusions.

We have shown that the covariance analysis of subtle but significant NMR chemical shift changes caused by a set of covalently modified agonists and antagonists reveals extensive allosteric networks that bridge the site of effector binding to distant regions mediating inhibitory interactions, including those that are dynamically controlled. The chemical shift covariance analysis (CHESCA) relies on agglomerative clustering (AC) and singular value decomposition (SVD) of the chemical shift matrix. The combined analysis of the dendrograms and score plots generated by AC and SVD, respectively, not only identifies networks of coupled residues but also provides a residue-specific dissection of the relative contributions to binding and allostery. These CHESCA results are independently validated by several known EPAC mutants as well as by previous assessments of allostery based on coevolutionary patterns (34). Overall, the CHESCA method employs a small but carefully selected chemical library of effector analogs to decode the binding and allosteric information content embedded in chemical shifts. We anticipate CHESCA to be of general applicability for mapping allosteric networks in proteins amenable to NMR, complementing results from mutant cycles (38) and analyses of functional couplings between coevolving sites (39).

Materials and Methods

Sample Preparation.

Isotopically labeled samples were prepared as explained in SI Text.

NMR Spectroscopy.

All spectra were acquired at 306 K with a Bruker Avance 700-MHz spectrometer equipped with a 5 mm TCI cryoprobe. Ligand titrations were monitored through [15N-1H] HSQC spectra. The 1H and 15N chemical shifts employed for CHESCA were referenced relative to 15N-Ac-Gly and measured using HSQC spectra acquired at saturating ligand concentrations. Ligand saturation is critical because the four ligands selected as perturbations (Fig. 2A) span a wide range of affinities with KD values varying up to two orders of magnitude (Table S1). This means that when nonsaturating concentrations of ligands are used, the differences between the different ligand-bound states (Fig. 2A) reflect not only the different degrees of activation, as desired for the purpose of mapping allosteric networks, but also the varying extents of binding. To avoid this bias, it is important to ensure that saturating ligand concentrations are reached, as verified by the binding isotherms displaying a clear dose-response pattern (Fig. 2C and Fig. S1). The plateau region of the binding isotherms (Fig. 2C and Fig. S1) defines the ligand concentration range recommended for CHESCA. The 1H and 15N chemical shifts used as input for CHESCA are the average of the values measured through Gaussian peak fitting for the last three points in the titration (i.e., at ligand saturation). The standard deviation for these three measured chemical shift values at saturation is assumed to represent the error in the chemical shifts for the bound states. For the apo state, the error of the chemical shifts was estimated based on the differences between the ppm values measured for two different apo samples. The measurement of minor chemical shift changes is further detailed in SI Text.
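The data reduction described above can be illustrated with a minimal Python sketch. It is not the authors' processing script: the function names, the simple 1:1 binding model with ligand in excess, the use of the sample standard deviation, the use of the absolute difference between two apo samples as the apo error, and all numerical values are assumptions made for illustration only.

```python
from statistics import mean, stdev


def fraction_bound(ligand_conc, kd):
    """Fraction of protein bound for 1:1 binding.

    Assumes the ligand is in excess, so free ligand ~ total ligand.
    Both arguments must be in the same concentration units.
    """
    return ligand_conc / (kd + ligand_conc)


def bound_state_shift(titration_ppm, n_last=3):
    """Average and error of the chemical shift at ligand saturation.

    Uses the last n_last titration points (the plateau region).
    The error is the sample standard deviation (an assumption; the
    text does not say whether sample or population SD was used).
    """
    plateau = titration_ppm[-n_last:]
    return mean(plateau), stdev(plateau)


def apo_shift(sample_a_ppm, sample_b_ppm):
    """Average apo shift and an error estimate from two apo samples.

    The error is taken as the absolute difference between the two
    measurements (one possible reading of the text).
    """
    return mean([sample_a_ppm, sample_b_ppm]), abs(sample_a_ppm - sample_b_ppm)


if __name__ == "__main__":
    # Hypothetical affinities two orders of magnitude apart (uM), at the
    # same hypothetical ligand concentration of 100 uM.
    for kd in (1.0, 100.0):
        print(f"KD = {kd:6.1f} uM -> fraction bound = {fraction_bound(100.0, kd):.3f}")

    # Hypothetical 1H shifts (ppm) along a titration of one residue.
    shift, error = bound_state_shift([8.10, 8.20, 8.30, 8.31, 8.32])
    print(f"bound-state shift = {shift:.3f} +/- {error:.3f} ppm")
```

With these hypothetical numbers, the same ligand concentration gives roughly 99% occupancy for the tighter ligand but only 50% for the weaker one, which is the bias that working in the plateau region is meant to avoid.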

Chemical Shift Analysis.

The protocols for the AC and SVD analyses are available as SI Text and Fig. S2.
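As the actual protocols are given in SI Text, the following Python sketch only illustrates the general idea of combining correlation-based agglomerative clustering with SVD on a residues-by-states chemical shift matrix. Everything specific in it is an assumption: the function names, the use of the absolute Pearson correlation with an illustrative cutoff `r_min`, single linkage (implemented as connected components, which is equivalent to cutting a single-linkage dendrogram at a fixed distance), centering each residue on its mean across states before SVD, and the toy data.

```python
import numpy as np


def correlation_clusters(shifts, r_min=0.98):
    """Group residues (rows) whose shifts correlate across states (columns).

    Residues are linked when |Pearson r| >= r_min. Taking connected
    components of that graph is equivalent to cutting a single-linkage
    dendrogram at distance 1 - r_min. The cutoff value is illustrative.
    """
    r = np.corrcoef(shifts)              # residue-by-residue correlation matrix
    linked = np.abs(r) >= r_min
    n = shifts.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:                     # depth-first walk over linked residues
            i = stack.pop()
            for j in np.flatnonzero(linked[i]):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(int(j))
        current += 1
    return labels


def svd_scores(shifts):
    """Residue scores along each principal component.

    Each row is centered on its mean across states (an assumption),
    then decomposed as centered = U @ diag(s) @ Vt. Returns the scores
    U * s and the fraction of total variance carried by each component.
    """
    centered = shifts - shifts.mean(axis=1, keepdims=True)
    u, s, _vt = np.linalg.svd(centered, full_matrices=False)
    return u * s, s**2 / np.sum(s**2)


if __name__ == "__main__":
    # Toy matrix: 4 hypothetical residues (rows) x 5 hypothetical states (columns).
    toy = np.array([
        [0.0, 1.0, 2.0, 3.0, 4.0],
        [0.1, 2.1, 4.1, 6.1, 8.1],
        [1.0, 0.0, 1.0, 0.0, 1.0],
        [2.0, 0.0, 2.1, 0.0, 1.9],
    ])
    print("cluster labels:", correlation_clusters(toy))
    scores, explained = svd_scores(toy)
    print("variance fractions:", np.round(explained, 3))
```

In this toy matrix the first two rows respond to the states in one pattern and the last two rows in another, so they fall into two separate groups. Because the absolute value of the correlation is used, anticorrelated residues would be grouped together as well.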

Supplementary Material

Supporting Information

Acknowledgments.

We thank P. Britz-McKibbin and F. Fogolari for helpful discussions, and the Canadian Institutes of Health Research and the Natural Sciences and Engineering Research Council (NSERC) for financial support. We are also indebted to the Heart and Stroke Foundation of Canada for a Maureen Andrew New Investigator Award to G.M. and to NSERC for a graduate scholarship to B.V.S.

Footnotes

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1017311108/-/DCSupplemental.

References

  • 1. Smock RG, Gierasch LM. Sending signals dynamically. Science. 2009;324:198–203. doi: 10.1126/science.1169377.
  • 2. Tsai CJ, del Sol A, Nussinov R. Allostery: Absence of a change in shape does not imply that allostery is not at play. J Mol Biol. 2008;378:1–11. doi: 10.1016/j.jmb.2008.02.034.
  • 3. Kuriyan J, Eisenberg D. The origin of protein interactions and allostery in colocalization. Nature. 2007;450:983–990. doi: 10.1038/nature06524.
  • 4. Goodey NM, Benkovic SJ. Allosteric regulation and catalysis emerge via a common route. Nat Chem Biol. 2008;4:474–482. doi: 10.1038/nchembio.98.
  • 5. Changeux JP, Edelstein SJ. Allosteric mechanisms of signal transduction. Science. 2005;308:1424–1428. doi: 10.1126/science.1108595.
  • 6. Kern D, Zuiderweg ER. The role of dynamics in allosteric regulation. Curr Opin Struct Biol. 2003;13:748–757. doi: 10.1016/j.sbi.2003.10.008.
  • 7. Daily MD, Gray JJ. Local motions in a benchmark of allosteric proteins. Proteins. 2007;67:385–399. doi: 10.1002/prot.21300.
  • 8. Boehr DD, Nussinov R, Wright PE. The role of dynamic conformational ensembles in biomolecular recognition. Nat Chem Biol. 2009;5:789–796. doi: 10.1038/nchembio.232.
  • 9. Velyvis A, Yang YR, Schachman HK, Kay LE. A solution NMR study showing that active site ligands and nucleotides directly perturb the allosteric equilibrium in aspartate transcarbamoylase. Proc Natl Acad Sci USA. 2007;104:8815–8820. doi: 10.1073/pnas.0703347104.
  • 10. Hilser VJ, Garcia-Moreno EB, Oas TG, Kapp G, Whitten ST. A statistical thermodynamic model of the protein ensemble. Chem Rev. 2006;106:1545–1558. doi: 10.1021/cr040423+.
  • 11. Popovych N, Sun S, Ebright RH, Kalodimos CG. Dynamically driven protein allostery. Nat Struct Mol Biol. 2006;13:831–838. doi: 10.1038/nsmb1132.
  • 12. Tzeng SR, Kalodimos CG. Dynamic activation of an allosteric regulatory protein. Nature. 2009;462:368–372. doi: 10.1038/nature08560.
  • 13. Cooper A, Dryden DT. Allostery without conformational change. A plausible model. Eur Biophys J. 1984;11:103–109. doi: 10.1007/BF00276625.
  • 14. Fuentes EJ, Der CJ, Lee AL. Ligand-dependent dynamics and intramolecular signaling in a PDZ domain. J Mol Biol. 2004;335:1105–1115. doi: 10.1016/j.jmb.2003.11.010.
  • 15. Das R, et al. Entropy-driven cAMP-dependent allosteric control of inhibitory interactions in exchange proteins directly activated by cAMP. J Biol Chem. 2008;283:19691–19703. doi: 10.1074/jbc.M802164200.
  • 16. Das R, et al. Dynamically driven ligand selectivity in cyclic nucleotide binding domains. J Biol Chem. 2009;284:23682–23696. doi: 10.1074/jbc.M109.011700.
  • 17. McNicholl ET, Das R, SilDas S, Taylor SS, Melacini G. Communication between tandem cAMP binding domains in the regulatory subunit of protein kinase A-Ialpha as revealed by domain-silencing mutations. J Biol Chem. 2010;285:15523–15537. doi: 10.1074/jbc.M110.105783.
  • 18. Lipchock JM, Loria JP. Nanometer propagation of millisecond motions in V-type allostery. Structure. 2010;18:1596–1607. doi: 10.1016/j.str.2010.09.020.
  • 19. Masterson LR, Mascioni A, Traaseth NJ, Taylor SS, Veglia G. Allosteric cooperativity in protein kinase A. Proc Natl Acad Sci USA. 2008;105:506–511. doi: 10.1073/pnas.0709214104.
  • 20. Masterson LR, et al. Dynamics connect substrate recognition to catalysis in protein kinase A. Nat Chem Biol. 2010;6:821–828. doi: 10.1038/nchembio.452.
  • 21. Jarymowycz VA, Stone MJ. Fast time scale dynamics of protein backbones: NMR relaxation methods, applications, and functional consequences. Chem Rev. 2006;106:1624–1671. doi: 10.1021/cr040421p.
  • 22. Ma B, Nussinov R. Enzyme dynamics point to stepwise conformational selection in catalysis. Curr Opin Chem Biol. 2010;14:652–659. doi: 10.1016/j.cbpa.2010.08.012.
  • 23. Cui Q, Karplus M. Allostery and cooperativity revisited. Protein Sci. 2008;17:1295–1307. doi: 10.1110/ps.03259908.
  • 24. Zhuravleva A, et al. Propagation of dynamic changes in barnase upon binding of barstar: An NMR and computational study. J Mol Biol. 2007;367:1079–1092. doi: 10.1016/j.jmb.2007.01.051.
  • 25. Revington M, Zhang Y, Yip GN, Kurochkin AV, Zuiderweg ER. NMR investigations of allosteric processes in a two-domain Thermus thermophilus Hsp70 molecular chaperone. J Mol Biol. 2005;349:163–183. doi: 10.1016/j.jmb.2005.03.033.
  • 26. Shen Y, et al. Consistent blind protein structure generation from NMR chemical shift data. Proc Natl Acad Sci USA. 2008;105:4685–4690. doi: 10.1073/pnas.0800256105.
  • 27. Cavalli A, Salvatella X, Dobson CM, Vendruscolo M. Protein structure determination from NMR chemical shifts. Proc Natl Acad Sci USA. 2007;104:9615–9620. doi: 10.1073/pnas.0610313104.
  • 28. Gloerich M, Bos JL. Epac: Defining a new mechanism for cAMP action. Annu Rev Pharmacol Toxicol. 2010;50:355–375. doi: 10.1146/annurev.pharmtox.010909.105714.
  • 29. Kraemer A, et al. Dynamic interaction of cAMP with the rap guanine-nucleotide exchange factor Epac1. J Mol Biol. 2001;306:1167–1177. doi: 10.1006/jmbi.2001.4444.
  • 30. Rehmann H, Das J, Knipscheer P, Wittinghofer A, Bos JL. Structure of the cyclic-AMP-responsive exchange factor Epac2 in its auto-inhibited state. Nature. 2006;439:625–628. doi: 10.1038/nature04468.
  • 31. Rehmann H, et al. Structure of Epac2 in complex with a cyclic AMP analogue and RAP1B. Nature. 2008;455:124–127. doi: 10.1038/nature07187.
  • 32. Berman HM, et al. The cAMP binding domain: An ancient signaling module. Proc Natl Acad Sci USA. 2005;102:45–50. doi: 10.1073/pnas.0408579102.
  • 33. Rehmann H, Schwede F, Doskeland SO, Wittinghofer A, Bos JL. Ligand-mediated activation of the cAMP-responsive guanine nucleotide exchange factor epac. J Biol Chem. 2003;278:38548–38556. doi: 10.1074/jbc.M306292200.
  • 34. Kannan N, et al. Evolution of allostery in the cyclic nucleotide binding module. Genome Biol. 2007;8:R264. doi: 10.1186/gb-2007-8-12-r264.
  • 35. Aden J, Wolf-Watz M. NMR identification of transient complexes critical to adenylate kinase catalysis. J Am Chem Soc. 2007;129:14003–14012. doi: 10.1021/ja075055g.
  • 36. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA. 1998;95:14863–14868. doi: 10.1073/pnas.95.25.14863.
  • 37. Sakurai K, Goto Y. Principal component analysis of the pH-dependent conformational transitions of bovine beta-lactoglobulin monitored by heteronuclear NMR. Proc Natl Acad Sci USA. 2007;104:15346–15351. doi: 10.1073/pnas.0702112104.
  • 38. Horovitz A, Fersht AR. Strategy for analysing the co-operativity of intramolecular interactions in peptides and proteins. J Mol Biol. 1990;214:613–617. doi: 10.1016/0022-2836(90)90275-Q.
  • 39. Lockless SW, Ranganathan R. Evolutionarily conserved pathways of energetic connectivity in protein families. Science. 1999;286:295–299. doi: 10.1126/science.286.5438.295.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences