Author manuscript; available in PMC: 2011 Sep 8.
Published in final edited form as: Structure. 2010 Sep 8;18(9):1140–1148. doi: 10.1016/j.str.2010.06.013

Metal-binding sites are designed to achieve optimal mechanical and signaling properties

Anindita Dutta 1, Ivet Bahar 1,*
PMCID: PMC2937013  NIHMSID: NIHMS228751  PMID: 20826340

Abstract

Many proteins require bound metals to achieve their function. We take advantage of increasing structural data on metal-binding proteins to elucidate three properties: the involvement of metal-binding sites in the global dynamics of the protein, predicted by elastic network models; their exposure/burial to solvent; and their signal-processing properties, indicated by Markovian stochastics analysis. Systematic analysis of a dataset of 145 structures reveals that the residues that coordinate metal ions enjoy remarkably efficient and precise signal transduction properties. These properties are rationalized in terms of the physical characteristics of the sites: participation in hinge sites that control the softest modes collectively accessible to the protein, and occupancy of central positions minimally exposed to solvent. Our observations suggest that metal-binding sites may have been evolutionarily selected to achieve optimum allosteric communication. They also provide insights into basic principles for designing metal-binding sites, which are verified to be met by recently designed de novo metal-binding proteins.

Keywords: metal-binding sites, elastic network models, GNM, signal propagation, allosteric communication

Introduction

Metal-binding proteins are associated with a variety of cellular functions. Some of them play roles in transport and cell signaling; others act as cofactors that are incorporated into enzymes to provide structural support while stabilizing functional conformations, or they directly participate in chemical reactions during enzyme catalysis (Kendrik et al., 1992; Tainer et al., 1992). A number of web servers and databases (DBs) have been developed to annotate metal-binding proteins (Babor et al., 2008), and tools based on machine learning techniques such as support vector machines, neural networks and Bayesian classifiers have been developed for predicting metal-binding sites (Lin et al., 2005; Lin et al., 2006; Passerini et al., 2006; Passerini et al., 2007; Ebert and Altman, 2008) using the known structures of bound or ‘holo’ proteins as the training dataset. Edelman, Sobolev and coworkers (2008) developed such an algorithm for predicting binding sites for transition metals, i.e., those associated with “catalytic, co-catalytic or structural” roles (Babor et al., 2008). These metals are coordinated by three or more amino acids. Cysteine (C), histidine (H), glutamic acid (E) and aspartic acid (D) are the most frequently observed metal-binding residues (Auld, 2001; Golovin et al., 2005; Babor et al., 2008), and are referred to as the “CHED” category of metal-binding residues.

Many recent studies show that the collective dynamics of proteins is tied to their mechanism of function. Proteins possess the ability to undergo a distribution of collective changes in conformation, or modes of motions, at their equilibrium or native state, which accommodate, if not facilitate, their function (Bahar et al., 2007). Coarse-grained normal mode analysis (NMA) methods have found widespread use in characterizing these collective motions. In particular, elastic network models (ENMs) have found broad utility in conjunction with NMA, after the work of Tirion (Tirion, 1996), Bahar and coworkers (Bahar et al., 1997; Haliloglu et al., 1997) and Hinsen (Hinsen, 1998; Hinsen et al., 2000) that showed that cooperative movements that underlie many activities can be captured by network models. The three major reasons behind the use of the ENMs are their simplicity, the robustness of the predicted modes of motions in the low frequency regime, also called the softest modes, and the functional significance of these modes indicated by numerous applications (Cui and Bahar, 2006; Bahar et al., 2010).

These studies reveal that functional residues possess unique, structure-encoded properties that enable their function. For example, in a study of the relation between soft modes and catalytic activity for a series of enzymes, Yang and Bahar (2005) found that catalytic residues tend to be located near key mechanical sites. Here, key mechanical sites refer to hinge sites or anchors that mediate the softest modes of motion, also called global modes, as predicted by the Gaussian Network Model (GNM) (Bahar et al., 1997; Haliloglu et al., 1997). The probability of finding a catalytic residue among these key mechanical sites turns out to be 3–4 times higher than that from a random search. Such spatial proximity has been proposed to be a prerequisite for efficient coupling between chemical and mechanical activities (Yang and Bahar, 2005). A further study of signal propagation pathways using a Markovian stochastic model showed that allosteric structures are predisposed to instigate efficient communication mechanisms favored by their inherent network topology (Chennubhotla and Bahar, 2007).

Following a similar premise, we have analyzed the collective dynamics of a representative set of metal-binding proteins. The questions we asked were: do metal-binding sites possess structure-induced dynamic properties that enable their involvement/assistance in many activities? Are they distinguished by specific features or specific positions in the structure, which enable them to achieve their role in the activity of the proteins that bind them? Do they share some common design features (apart from those structurally known) that confer efficient cooperation and communication across the proteins?

A representative dataset of metal-binding proteins previously used by Babor et al. (2008) has been adopted here for a systematic analysis. Figure 1 shows the broad range of functions represented by these proteins. 64% of them are enzymes with a diversity of biochemical activities, and the remaining 36% are almost equally distributed among other functions. We examined several properties of metal-binding residues, and compared them to those of the non-metal-binding amino acids of the same type. First, we focused on the soft modes predicted by the GNM, to see whether any dynamic role is assumed by metal-binding sites (apart from their chemical role of coordinating the ligand). Second, we examined their solvent accessibility. Third, using information-theoretic spectral methods, we mapped the signal flow pathways inherently accessible to these structures, to see if/how metal-binding sites are involved in establishing allosteric communication. The results presented here for a set of 145 holo proteins demonstrate that metal-binding residues occupy low mobility regions in the global modes, i.e., the stabilization of the bound metal results not only from local geometry and energetics, but also from a global optimization of the intrinsic dynamics of the overall protein. These residues also tend to be buried in the structure, despite being polar or charged. Our study further shows that metal-binding sites serve as efficient signal transduction centers, suggesting that their particular location in the 3-dimensional structure has been evolutionarily optimized to achieve the most cooperative effects. The observed propensities provide guidelines for designing potential metal-binding sites in proteins, which are verified to be fulfilled by de novo metal-binding proteins.

Figure 1. Functional distribution of holo proteins in the database (Babor et al., 2008) of metal-binding proteins presently examined.

The dataset contains 145 holo proteins. The majority of these structures are enzymes, and their distribution among different enzymatic classes is shown in the lower pie chart. Functional annotation was done using UniProt and PDB (Berman et al., 2000; Jain et al., 2009). See also Table S1, Table S2, and Table S3.

Results and Discussion

DATASETS

The analysis has been performed using 175 structures of metal-binding proteins deposited in the Protein Data Bank (PDB) (Berman et al., 2000). Sixty of these structures refer to 30 metal-binding proteins that have been resolved in both apo and holo forms (Dataset I). Table S1 in the Supplementary Information (SI) lists the PDB codes, chain identifiers, and lengths (number of resolved residues, N) of these structures, along with the identities of bound metals and metal-binding residues, and the root-mean-square deviations (RMSDs) between the two forms, both for the backbone and the metal-binding site. The RMSDs averaged over all pairs are 0.389 ± 0.351 Å and 0.221 ± 0.332 Å for the backbone and metal-binding sites, respectively (last row in Table S1), indicating that the proteins exhibit minor changes in structure upon metal binding. Datasets II and III include an additional 115 metal-binding proteins structurally resolved in holo form only (SI Tables S2 and S3). The complete set of 145 holo structures includes all those compiled by Edelman et al. (Babor et al., 2008), except for those whose ligand-binding sites could not be identified/verified using the MetalloProtein Database (Castagnetto et al., 2002), or those which have more than 90% sequence similarity with respect to a member of the dataset. The bound metals include Zn (most frequent), Co, Ni, Fe, Mn and Cu.

APO AND HOLO FORMS EXHIBIT SIMILAR GLOBAL DYNAMICS

Figure 2A-D illustrates the global mobility profiles of a few metal-binding proteins in the softest modes of motion. Global mobility profiles refer to the normalized distributions of the square displacements of residues in the lowest frequency GNM mode. Mobile regions appear as peaks, whereas minima are regions with restricted movements that often pack functional residues in well-defined geometries. Panels A′-D′ display the color-coded ribbon diagrams for the respective proteins. Metal-binding residues are indicated by filled circles (panels A-D) and displayed in space-filling representation (panels A′-D′).

Figure 2. Global dynamics of metal-binding proteins illustrated for four cases.

A-D. Fluctuation profiles obtained by the GNM for four metal-binding proteins in holo form: A. 1MUC, a muconate lactonizing enzyme with bound Mn2+; B. 1VLX, an electron transport protein with bound Co2+; C. 1JFZ, an RNase III endonuclease with bound Mn2+; and D. 1HP7, an anti-trypsin binding Zn2+. The curves represent the normalized distributions, or histograms, of square fluctuations, as a function of residue number, in the softest modes accessible to each structure. The yellow markers show the loci of metal-binding residues. These tend to occupy positions near local or global minima. Panels A and B compare the profiles for the holo (red dashed curve) and apo (blue curve) forms and illustrate that the two forms show minimal, if any, change in their global mode profile. A′, B′, C′, D′. Ribbon diagrams of the four proteins in their holo forms, color-coded according to GNM softest mode profiles in panels A-D, from blue (most rigid) to red (most mobile). The metal-binding sites are shown in space-filling representation, and the metal ions in pink. Note that metal-binding sites are highly constrained in general (shown in blue), except for the structure in panel A/A′. See also Table S1, Table S2, and Table S3.

Panels A and B of Figure 2 each compare the global mobility profiles obtained for the holo and apo forms of the same protein. The close superposition of the pairs of curves in each panel suggests that there is no observable difference between the global mode profiles of the metal-bound and metal-free forms. This trend is seen in practically all of the 30 pairs of structures resolved in the presence and absence of bound metal (Dataset I). The last two columns in Table S1 list (i) the correlations between the global mobility profiles of the two forms, and (ii) the correlations between the X-ray crystallographic B-factors experimentally observed for the two forms. An average correlation of 0.936 is obtained between the global mobilities of the apo and holo forms, whereas their B-factors, which scale with the mean-square fluctuations (MSFs) of residues as $B_i = (8\pi^2/3)\langle(\Delta R_i)^2\rangle$, exhibit an average correlation of 0.745. These results indicate that there are some differences in the MSFs of residues in the two forms, which may arise from minor structural differences between the two forms as well as from different packing geometries (intermolecular contacts) and other effects such as static disorder in the crystal structures. The global mobilities, on the other hand, are insensitive to small differences in structure, consistent with the well-established robustness of the softest modes (Nicolay and Sanejouand, 2006; Tama and Brooks, 2006).
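As a rough illustration of the two comparisons described above, the following sketch converts MSFs to B-factors with the relation quoted in the text and correlates two per-residue profiles. The function names and the toy numbers are hypothetical, and the use of the Pearson coefficient is an assumption, since the text does not state which correlation measure was used.

```python
import math

def msf_to_bfactor(msf):
    """Apply B_i = (8*pi^2/3) * <(dR_i)^2> to a list of MSFs (in square angstroms)."""
    return [(8.0 * math.pi ** 2 / 3.0) * x for x in msf]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up MSF profiles for the apo and holo forms of one five-residue chain.
apo_msf = [0.9, 0.4, 0.2, 0.5, 1.1]
holo_msf = [0.8, 0.4, 0.1, 0.6, 1.0]
r = pearson(msf_to_bfactor(apo_msf), msf_to_bfactor(holo_msf))
```

Because the conversion is a constant rescaling, the correlation between B-factor profiles equals the correlation between the underlying MSF profiles.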

These results confirm that the global dynamics of a given protein is a collective property of its overall architecture, and that ligand/metal binding has minimal, if any, effect on its intrinsically accessible soft motions, in accord with previous experimental and computational observations made for liganded and unliganded forms of enzymes (Eisenmesser et al., 2005; Tobi and Bahar, 2005; Yang and Bahar, 2005; Lange et al., 2008; Bakan and Bahar, 2009). In view of the insensitivity of the global modes to the presence/absence of bound metal, we focus on the dynamics of the 145 holo proteins listed in Datasets I-III.

METAL-BINDING SITES HAVE RESTRICTED FLUCTUATIONS

Next, we examined whether metal-binding sites occupy positions coinciding with, or close to, key mechanical sites in the 3-dimensional structure of the proteins. Key mechanical sites serve as hinges/anchors in the global modes, and as such they appear as minima in the global mode profiles. Panels A-D in Figure 2 indicate a tendency of metal-binding sites (indicated by yellow circles) to be located near minima (local or global), although this trend is not that distinctive. Likewise, the color-coded diagrams in panels A′-D′ also indicate relatively low mobilities (blue regions) at metal-binding residues (shown in space-filling representation), although departures from this behavior are also observable (e.g., panel A′).

Toward a more critical assessment of the relationship, if any, between metal-binding sites and key mechanical sites, we performed a comparative analysis of the mobilities of three groups of residues: all residues, metal-binding CHED (Cys, His, Glu and Asp) residues and all other CHED residues in the Datasets I-III.

The results are presented in Figure 3. Panel A displays the histograms of mobilities for the three subsets. For comparative purposes, the mobilities were normalized in the range [0, 1] for each protein. Metal-binding CHED residues, indicated by the red bars, exhibit a clear bias towards lower mobilities compared to the residues in the other two groups. This is despite the fact that CHED residues are charged or polar, and tend to be positioned on the surface of the protein, thus enjoying higher mobility as compared to other residues. The mean values and variances corresponding to the three distributions are 0.24 ± 0.27 for all residues, 0.26 ± 0.28 for non-metal-binding CHED, and 0.17 ± 0.22 for metal-binding CHED (Table S4). The variance is high because of the long tail of the distributions. Panel B presents the same results as cumulative distributions. Almost 65% of metal-binding CHED residues have mobilities lower than 0.1.
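The normalization and binning described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text says only that mobilities were normalized to [0, 1] per protein, so the min-max rescaling used here is an assumption, and the function names and input values are hypothetical.

```python
def normalize_profile(sq_fluct):
    """Linearly rescale one protein's mode profile to span [0, 1] (assumed min-max form)."""
    lo, hi = min(sq_fluct), max(sq_fluct)
    if hi == lo:
        return [0.0 for _ in sq_fluct]
    return [(x - lo) / (hi - lo) for x in sq_fluct]

def binned_frequencies(values, width=0.05, n_bins=21):
    """Fraction of values per bin; the last bin also collects the value 1.0."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v / width), n_bins - 1)] += 1
    return [c / len(values) for c in counts]

def fraction_below(values, threshold):
    """One point of the cumulative distribution: share of values under a threshold."""
    return sum(1 for v in values if v < threshold) / len(values)

# Made-up square fluctuations for a five-residue chain.
toy = normalize_profile([0.8, 0.1, 0.05, 0.3, 2.0])
low_share = fraction_below(toy, 0.1)
```

Pooling the normalized values of a residue subset (for example, metal-binding CHED) across proteins and passing them to `binned_frequencies` or `fraction_below` would give histograms and cumulative fractions of the kind shown in Figure 3.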

Figure 3. Comparison of the global mobilities of different types of residues.

A. Histograms of mobilities for three different groups of residues: metal-binding CHED (red), all CHED (green) and all (blue) residues. Mobilities are normalized in the range [0, 1], and results are shown for 21 bins at an interval of 0.05; the first bin refers to the count of residues having mobilities in the range from 0.00 to 0.05 and so on, expressed as probabilistic occurrence on the ordinate. B. Cumulative distributions of mobilities for the same three groups of residues. C-F. Same as in panel A, for specific amino acids at metal-binding (red) and other (green) locations. Metal-binding ‘HED’ residues exhibit mobilities significantly lower than their non-metal-binding counterparts, while cysteines (‘C’) show the opposite trend. See Table S4 for the number of residues in each subset, along with the mean and variance values corresponding to the histograms in panels A, C-F. The numbers in parentheses in the insets show average mobilities. See also Table S4, Table S5, Figure S1, and Figure S2.

In Figure 3 panels C-F, we take a closer look at each type of CHED residue and compare the mobilities of the metal-binding and non-metal-binding subsets. Among these four types of amino acids, it is interesting to note that cysteines exhibit a fundamentally different behavior: while metal-binding His, Glu and Asp possess a significantly lower mobility (in the holo forms) compared to their non-metal-binding counterparts, metal-binding cysteines enjoy a higher mobility than those not involved in metal binding. Table S4 summarizes the mean mobilities and their standard deviations for each group of amino acids. Essentially, the metal-binding HED residues display mean mobilities around 0.12 (smaller than the average mobility of all residues by a factor of 2), while non-metal-binding HED residues are almost indistinguishable from all other residues. Cysteines exhibit the opposite pattern: metal-binding cysteines are even more mobile than an average non-metal-binding residue.

The statistical significance of the results has been tested using three methods: (i) examining the distribution of the mobilities of an ensemble of randomly selected residues (of the same size as the set of metal-binding CHED residues), which showed that the mean and variance (0.24 ± 0.27) are comparable to those of the entire ensemble, a result repeated and confirmed in five independent runs; (ii) performing the same type of analysis for each of the CHED residues, which indicated the distinctive properties of metal-binding CHED (Table S4); and (iii) performing a Kolmogorov-Smirnov (KS) test at the 5% level of significance to verify that metal-binding CHED residues have a statistically significant difference in their mobility distribution compared to their non-metal-binding counterparts (Table S5), except for the subset of Cys residues.
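For readers unfamiliar with the KS test, the sketch below computes the two-sample KS statistic (the largest gap between two empirical cumulative distributions) and applies the standard large-sample critical value. It is an assumption that a two-sample test of this form was used; the function names and the synthetic samples are hypothetical and do not reproduce Table S5.

```python
import math

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: maximum distance between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:   # advance past all values <= x in each sample
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def ks_rejects(sample_a, sample_b, alpha=0.05):
    """Asymptotic test: reject equality if D > c(alpha) * sqrt((n + m) / (n * m))."""
    n, m = len(sample_a), len(sample_b)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))  # about 1.358 for alpha = 0.05
    return ks_statistic(sample_a, sample_b) > c_alpha * math.sqrt((n + m) / (n * m))

# Synthetic mobilities: one group shifted toward low values, one toward high values.
low_group = [k / 100.0 for k in range(50)]
high_group = [0.5 + k / 100.0 for k in range(50)]
different = ks_rejects(low_group, high_group)
```

The asymptotic critical value is only a reasonable approximation for fairly large samples; for small subsets an exact or permutation-based p-value would be preferable.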

To further elucidate the distinctive properties, if any, of specific metal-binding proteins, we analyzed the distribution of mobilities in zinc-binding and manganese-binding proteins since they represent the majority in the dataset (71/145 and 32/145, respectively). The results for these two sets of metal-binding proteins are presented in the respective SI Figures S1 and S2. Notably, a large number of zinc-coordinating residues are cysteines, which increased the sample size for Cys residues, and diminished the distinctive behavior of the CHED group: the average mobility of metal-binding CHED residues in Zn-binding proteins was indeed found to be 0.21 (Figure S1-A). This is in contrast to Mn-binding proteins that display an average mobility of 0.097 (Figure S2-A), as there are no cysteines ligating the manganese ions (Figure S2-C). The remaining “HED” residues show the same trend as seen in Figure 3.

METAL-BINDING RESIDUES SHOW DECREASED SOLVENT ACCESSIBILITY

A comparative analysis of solvent accessibility (McConkey et al., 2002; Eyal et al., 2004) sheds light on the distinctive properties of the two subgroups of CHED residues: metal-binding CHED residues have much smaller solvent accessible surface areas (SASAs) than non-metal-binding CHED residues, as may be seen in panels A and B of Figure 4. Panels D-F show that there is a drastic difference in the solvent exposure of histidines, glutamates and aspartates belonging to the metal-binding and non-metal-binding subgroups: the former subgroup exhibits a distinctive preference for more buried positions. Notably, ~80% of metal-binding glutamates have normalized SASAs smaller than 0.05, as compared to 10% of the non-metal-binding glutamates. Cysteines (panel C), on the other hand, do not appear to differentiate between the two subgroups: they tend to occupy buried positions irrespective of metal binding. About 60% of both metal-binding and non-metal-binding cysteines have normalized SASAs lower than 0.05.

Figure 4. Comparison of the solvent accessibilities of metal binding and other (blue) residues.

A. Histograms of solvent accessibilities for CHED residues in the two groups. B. Cumulative distributions for the two groups of CHED residues, indicating the strong propensity of metal-binding CHED residues to be buried. C-F. Same as panel A, for each of the CHED residues. Solvent accessibility was calculated using the constrained Voronoi procedure described in McConkey et al. (2002) and Eyal et al. (2004). The numbers in parentheses in the legend show average solvent accessibility.

Thus we see a predisposition of metal-coordinating histidines, glutamates and aspartates to occupy regions with limited, if any, exposure to solvent. This property, along with the low mobility noted above, can be used as ‘features’ in facilitating the identification of potentially metal-coordinating HED residues in putative metal-binding proteins.

METAL-BINDING SITES HAVE ENHANCED SIGNAL-PROPAGATION PROPERTIES

Chennubhotla and Bahar (2007) introduced a Markov model for describing the stochastics of signal transmission in proteins modeled as networks of nodes and springs. Two quantities are defined as metrics of communication propensities: (1) hitting time, H(i,j), which represents the number of steps (network edges) involved in transferring a signal from residue j to residue i, averaged over all possible paths connecting these two endpoints, and (2) commute time, C(i,j) = H(i,j) + H(j,i). The former depends on the direction of signal flow; the latter is independent of the direction. These two quantities are conveniently represented in terms of 2-dimensional maps, representative of the communication efficiency of all pairs of amino acids in a given protein. Notably, these two information-theoretic concepts have been shown to correlate with physics-based properties obtainable by the GNM: commute time is directly proportional to the MSFs in inter-residue distances (Chennubhotla and Bahar, 2007), so that residue pairs subject to small distance fluctuations are able to communicate efficiently. Hitting time, on the other hand, may be conveniently expressed in terms of the MSFs and cross-correlations in the positions of residues (see Methods).
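To make the two definitions concrete, the sketch below computes hitting and commute times for a random walk on a small network by solving the first-step equations H(i,j) = 1 + Σ_k P(j→k) H(i,k) with H(i,i) = 0. It is a minimal illustration under stated assumptions: edges carry uniform weights (the edge weighting of the original Markov model is not given in this section), the helper names are hypothetical, and the three-node path network is a toy example rather than a protein.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def hitting_times(adj):
    """H[i][j]: expected number of steps for a walk started at j to first reach i."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # First-step equations for all starting nodes j != i (target i is absorbing).
        A = [[(1.0 if j == k else 0.0) - adj[j][k] / deg[j] for k in others]
             for j in others]
        for j, h in zip(others, solve_linear(A, [1.0] * len(others))):
            H[i][j] = h
    return H

def commute_times(H):
    """C(i,j) = H(i,j) + H(j,i), symmetric by construction."""
    n = len(H)
    return [[H[i][j] + H[j][i] for j in range(n)] for i in range(n)]

# Toy network: three nodes connected in a line, 0 - 1 - 2.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H = hitting_times(path)
C = commute_times(H)
```

In this toy case a signal reaches the central node from either end in one step, whereas reaching an end node from the center takes three steps on average, which shows how H(i,j) depends on direction while C(i,j) does not.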

Calculations performed for metal-binding proteins revealed that metal-binding sites possess uniquely efficient communication propensities. For illustrative purposes, Figure 5 displays the results for a zinc-binding protein (PDB ID: 1I6N). The top two maps describe the hitting times (A) and commute times (B) evaluated for all pairs. Notably, residues seem to have more or less uniform signal-sending properties while they differ in their ability to receive signals, as evidenced by the insensitivity of H(i,j) to residue j. Panel C displays the average receiving properties of residues, found from $\langle H(i)\rangle = \sum_j H(i,j)/N$. The red circles indicate metal-binding residues. Clearly, these residues have very low $\langle H(i)\rangle$ values, i.e., they are distinguished by their fast communication with all other residues. In addition to the mean hit times per residue, we also calculated the corresponding variance. The plot of the average hit time vs its standard deviation (or variance) for each residue in panel D clearly shows that metal-binding residues have minimal hit times and minimal variance, i.e., they are “fast and precise” insofar as their signal transmission properties are concerned. Note that similar features were observed for catalytic residues in our previous work (Chennubhotla and Bahar, 2007), suggesting that protein structures are designed to position their functional residues at key sites enabling cooperative response. Additional results for more proteins may be found in the Supplementary Material (Figures S3 and S4), which confirm the same distinctive features.
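The per-residue averages plotted in panels C and D can be sketched as below. The function name is hypothetical, the hitting-time matrix is a hand-worked toy example for a three-node line network (not protein data), and including the diagonal term H(i,i) = 0 in the average is an assumption based on the division by N in the formula above.

```python
import math

def receiving_profile(H):
    """Mean and standard deviation of H(i, j) over all senders j, per receiver i."""
    n = len(H)
    means = [sum(row) / n for row in H]
    stds = [math.sqrt(sum((h - m) ** 2 for h in row) / n)
            for row, m in zip(H, means)]
    return means, stds

# H_toy[i][j]: hitting time from node j to node i on the line network 0 - 1 - 2.
H_toy = [[0.0, 3.0, 4.0],
         [1.0, 0.0, 1.0],
         [4.0, 3.0, 0.0]]
means, stds = receiving_profile(H_toy)
fastest = min(range(len(means)), key=lambda i: means[i])
```

Here the central node has both the smallest mean hit time and the smallest spread, the same "fast and precise" signature the text reports for metal-binding residues.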

Figure 5. Signal propagation properties illustrated for a Zn2+ binding protein with endonuclease fold.

A. Hitting time H(i,j) map as a metric of the average duration of time, or number of steps, required to transmit signals from residue j to residue i, predicted by a Markovian model of communication (see Eq. 5). Blue and red regions correspond to shortest and longest hit times, respectively. B. Commute times C(i,j) = H(i,j) + H(j,i) (see Eq. 6). C. Average hit time vs residue number, evaluated from the average of H(i,j) over all starting points j. Red markers show the metal-binding residues. Almost all of them occur at the minima of the curve, pointing to the efficient signal transmission properties of metal-binding sites. D. Average hit time vs its variance for each residue. The residues that participate in metal binding (shown as red markers) exhibit small average hit times along with small standard deviations. See also Figure S3 and Figure S4.

The results for all 145 metal-binding proteins are presented in Figure 6. As in Figure 5D, the points represent the mean (abscissa) and variance (ordinate) of the hit times <H(i)> evaluated for each residue in these proteins. In order to display the results for all proteins on the same plot, hit times have been normalized with respect to the cumulative degree of the proteins (see Methods). Black dots correspond to CHED residues involved in metal binding (panel A) and other CHED residues (panel B). The comparison of the dispersion of black dots in the two panels demonstrates that metal-binding CHED residues (panel A) exhibit minimal hit time and variance compared to their non-metal-binding counterparts (panel B). We notice, however, two clusters of outliers in panel A. The first (enclosed by a red circle) refers to a Ni2+-binding transcription factor (PDB ID: 1B9N), also noted in Table S1 to be an outlier. The apo form of this protein has been resolved in the presence of molybdate ion, which elicited a significant change in its quaternary structure (Gourley et al., 2001), hence the difference in the global dynamics of the apo and holo forms. The second cluster (enclosed by the blue circle) in Figure 6 panel A refers to a DNA-binding protein that might undergo atypical conformational changes upon DNA binding.

Figure 6. Comparison of the communication propensities of metal-binding (panel A) and non metal-binding (panel B) CHED residues.

Results are displayed for all residues in 145 holo metal-binding proteins. The abscissa shows the average hit-time, and the ordinate shows the variance in the hit-times, both quantities being normalized with respect to the cumulative degree of the proteins. A. The black markers show the position of metal-binding CHED residues; colored points refer to all other residues. B. The black markers show the position of non-metal-binding CHED residues. A few outliers are highlighted in panel A. See also Figure S4 and Figure S5.

The above results point to the fast and effective communication property of metal binding residues and their surroundings. Thus it appears that based on the inherent network topology, proteins are intrinsically wired such that metal-binding residues are optimally positioned to enable efficient communication.

INSIGHTS INTO DE NOVO DESIGN OF METAL-BINDING PROTEINS

Assessment of collective motions intrinsically accessible to a given architecture can assist in designing proteins with suitable dynamics. Metal-binding residues emerge here as efficient signal propagators based on equilibrium state fluctuations available to the protein and show specific and fast communication patterns. These properties provide meaningful insights into the architectural design of metal-binding sites. These sites ought to be co-localized or near-neighboring with global hinges; and they should include charged and polar residues (e.g. CHED residues) that are buried, such that they will effectively ligate the metal and mediate the concerted motions of the surrounding structural elements, or the signal propagation between them.

To test the validity of this conjecture, we examined two de novo designed metal-binding proteins (Figure 7): a cobalt-bound metalloprotein in the form of a four-helical bundle (PDB ID: 1OVU) (Geremia et al., 2005) and a dimetal-binding protein with Zn2+ (PDB ID: 2HZ8) (Calhoun et al., 2008). The global mode shape predicted by the GNM (panel A) as well as the signal propagation properties reflected by the mean value and variance of hitting times for each residue (panel B) clearly show that the metal-binding residues occupy key mechanical positions (minima in the global mode) and have fast and precise communication capacities. Thus such inherent properties are also present, perhaps inadvertently, in the design of these proteins.

Figure 7. Global dynamics and signal transduction properties of metal-binding de novo designed proteins.

A. Global fluctuation profiles predicted by the GNM for 1OVU, a four-helix bundle binding Co2+ (top), and 2HZ8, a dimetal Zn2+-binding protein (bottom). Metal-binding sites are shown by the red markers. B. Plot of average hitting times vs their variance, for all residues in these two cases. Metal-binding residues (red circles) lie in the region of shortest hitting times and smallest variance. C. Ribbon diagram color-coded by signal propagation properties, with the blue regions indicating fast and efficient communication, and red regions the poorest communication properties. Metal-binding residues are shown in space-filling representation, and metals are shown as red spheres. The metal-binding sites were identified using Ligand Explorer in the PDB.

Conclusion

With increasing structural data on metal-binding proteins, we are now in a position to gain insights into the structural and dynamic features of metal-binding sites, and their significance with regard to the catalytic, transport, and signaling properties of the metal-binding proteins. Our results show that metal-binding sites might have an inherent preference to undergo minimal fluctuations in their positions, occupy central/buried positions despite being polar or charged, and possess unique signal transduction properties. These three properties are not necessarily independent: more buried residues usually tend to have more restricted mobilities, and their tight packing confers efficient signal propagation properties. The fluctuations and signaling properties derived here are both based on network models: the GNM for collective dynamics, and a Markovian network model for allosteric communication. As described in the Methods, the residue fluctuations derived from the GNM relate to commute/hit times. Notably, the distinctive behavior of metal-binding His, Asp and Glu becomes more prominent when their signaling properties are examined, suggesting that these sites might be evolutionarily selected to optimize the allosteric communication across the protein.

The study provides us with insights into simple design principles: the protein architecture uniquely encodes an ensemble of equilibrium motions, some being more probable than others. Functional residues/sites are usually implicated in some major way (e.g. hinge-bending, redistribution of salt bridges, conformational switches) in the softest motions, which are readily triggered by external perturbations (e.g. ligand binding). Metal-binding residues are indeed shown here to be located at/near such key mechanical positions (global hinge centers) to readily elicit cooperative responses.

The above arguments are exclusively based on topological properties of network models representative of protein structures. As such, they provide information on purely entropic driving forces. The results obtained here suggest that the entropic driving forces inherent to the geometry/architecture of the metal-binding proteins confer efficient mechanical and signal transduction properties on metal-binding sites.

Methods

GNM

GNM is a coarse-grained model for predicting the fluctuation dynamics of proteins. The α-carbons of the folded protein are chosen as the nodes of the network, and pairs of nodes are connected by springs of uniform force constant γ if they are located within a cutoff distance rc, generally taken to be 7.0 ± 5 Å (Bahar et al., 1997; Kundu et al., 2002). The dynamics of the network is fully defined by the N×N Kirchhoff (connectivity) matrix Γ, whose off-diagonal elements Γij are −1 if the distance between residues i and j is less than rc and 0 otherwise. The ith diagonal element is the degree of the ith residue, i.e., its number of contacts, so that each row and each column of Γ sums to zero. The MSFs of residues and the cross-correlations between residue fluctuations are given by (Bahar et al., 1997; Haliloglu et al., 1997)

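$$\langle (\Delta R_i)^2 \rangle = \frac{3kT}{\gamma}\,[\Gamma^{-1}]_{ii} \quad \text{and} \quad \langle \Delta R_i \cdot \Delta R_j \rangle = \frac{3kT}{\gamma}\,[\Gamma^{-1}]_{ij} \tag{1}$$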
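The construction of Γ and the evaluation of Eq. 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `kirchhoff` and `gnm_covariance`, the toy coordinates, and the choice kT/γ = 1 are assumptions made here. Because Γ has one zero eigenvalue (its rows sum to zero), the inverse in Eq. 1 is taken as a pseudo-inverse.

```python
import numpy as np

def kirchhoff(coords, cutoff=7.0):
    """Build the N x N Kirchhoff (connectivity) matrix from node coordinates.

    Off-diagonal elements are -1 for node pairs closer than `cutoff` (in Å)
    and 0 otherwise; each diagonal element is the number of contacts of that node.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    gamma = -(dist < cutoff).astype(float)
    np.fill_diagonal(gamma, 0.0)                  # remove self-contacts
    np.fill_diagonal(gamma, -gamma.sum(axis=1))   # degree on the diagonal
    return gamma

def gnm_covariance(gamma, kT_over_gamma=1.0):
    """Return (3kT/gamma) * pseudo-inverse of the Kirchhoff matrix (Eq. 1)."""
    return 3.0 * kT_over_gamma * np.linalg.pinv(gamma, rcond=1e-10, hermitian=True)

# Hypothetical toy "protein": six nodes on a line, 3.8 Å apart,
# so that only sequential neighbors fall within the 7.0 Å cutoff.
coords = np.array([[3.8 * i, 0.0, 0.0] for i in range(6)])
G = kirchhoff(coords)
cov = gnm_covariance(G)
msf = np.diag(cov)   # mean-square fluctuation of each node, in units of kT/gamma
```

In this toy chain the terminal nodes have the largest predicted fluctuations and the central nodes the smallest, which is the qualitative behavior expected of loosely versus tightly connected sites.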

The network composed of N nodes has N−1 independent modes of motion. These are extracted by the eigenvalue decomposition $\Gamma = U \Lambda U^T$, where U is the orthogonal matrix whose kth column $u_k$ is the kth mode eigenvector, and Λ is the diagonal matrix of eigenvalues $\lambda_k$. The cross-correlation $\langle \Delta R_i \cdot \Delta R_j \rangle$ can be written as the sum of the contributions of the individual modes as

$$\langle \Delta R_i \cdot \Delta R_j \rangle = \frac{3kT}{\gamma} \sum_{k} [\lambda_k^{-1} u_k u_k^T]_{ij} \tag{2}$$

Thus $\langle (\Delta R_i)^2 \rangle_k = (3kT/\gamma)\,[\lambda_k^{-1} u_k u_k^T]_{ii}$ gives the square fluctuation of residue i in mode k, and since the eigenvectors are normalized, $[u_k u_k^T]_{ii}$ as a function of residue index i represents the probability distribution of residue fluctuations in mode k (Bahar et al., 1997; Haliloglu et al., 1997). In the application of the GNM to metal-binding proteins, the metal ions of the 30 holo proteins in Dataset I have been accounted for by including an additional node/site, the coordinates of which coincide with those of the bound metal. Comparison of the softest modes computed with and without such additional nodes confirmed that the global dynamics of the proteins are maintained, with an average correlation coefficient of 0.9975. Further calculations with holo proteins were performed without including the metal ion node in the GNM.
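The mode decomposition can be checked numerically with a short sketch. The five-node chain used here is a hypothetical toy network, and the variable names are illustrative only. The zero eigenvalue of Γ is discarded, leaving N−1 modes whose contributions sum to the diagonal of the pseudo-inverse.

```python
import numpy as np

# Kirchhoff matrix of a hypothetical five-node linear chain
N = 5
gamma = np.zeros((N, N))
for i in range(N - 1):
    gamma[i, i + 1] = gamma[i + 1, i] = -1.0
np.fill_diagonal(gamma, -gamma.sum(axis=1))

# eigh returns eigenvalues in ascending order; the first one is the zero mode
eigvals, eigvecs = np.linalg.eigh(gamma)
lam = eigvals[1:]       # the N-1 nonzero eigenvalues
U = eigvecs[:, 1:]      # the corresponding eigenvectors, one per column

# Column k holds [lam_k^-1 u_k u_k^T]_ii for every residue i
mode_msf = (U ** 2) / lam
total_msf = mode_msf.sum(axis=1)   # sum over modes, in units of 3kT/gamma

# Reference value: diagonal of the pseudo-inverse of gamma
pinv_diag = np.diag(np.linalg.pinv(gamma, rcond=1e-10, hermitian=True))

# Normalized distribution of residue fluctuations in the softest mode
softest_shape = U[:, 0] ** 2
hinge = int(np.argmin(softest_shape))   # node with the smallest displacement
```

For this symmetric chain the softest mode has its minimum at the central node, which plays the role of a hinge between the two halves moving in opposite directions.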

We provide in Table S2 the correlation coefficient between the mean-square fluctuations of residues, <(ΔRi)²>, predicted by the GNM for each protein and those indicated by the corresponding X-ray crystallographic B-factors, Bi = (8π²/3) <(ΔRi)²>. The average correlation coefficient is found to be 0.61 provided that all subunits/monomers in the PDB entry are considered in the GNM. Note that the correlation was improved (from 0.53 to 0.61) upon performing GNM calculations for all chains of multimeric proteins.
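The comparison with B-factors can be illustrated with a few lines of code. The numbers below are made up for illustration and are not taken from Table S2; the helper names are likewise hypothetical. Because the Pearson correlation is insensitive to a uniform scaling, the unknown factor kT/γ does not affect the result.

```python
import numpy as np

def predicted_bfactors(msf):
    """Convert mean-square fluctuations to B-factors: B_i = (8 pi^2 / 3) <(dR_i)^2>."""
    return (8.0 * np.pi ** 2 / 3.0) * np.asarray(msf, dtype=float)

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    return float(np.corrcoef(x, y)[0, 1])

# Made-up example profiles for a six-residue fragment (illustration only)
msf_pred = np.array([0.90, 0.55, 0.40, 0.45, 0.70, 1.10])   # GNM, arbitrary units
b_exp = np.array([32.0, 21.0, 15.0, 19.0, 24.0, 41.0])      # "experimental", Å^2

r = pearson(predicted_bfactors(msf_pred), b_exp)
```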

Markov Process and Equilibrium Distribution

According to the discrete time, discrete state Markov model developed for exploring the propagation of perturbations in proteins (Chennubhotla and Bahar, 2007), the strength of the interaction between pairs of residues is given by the ijth element,

$$a_{ij} = \frac{N_{ij}}{\sqrt{N_i N_j}} \tag{3}$$

of the affinity matrix A. Here Nij is the total number of heavy atom contacts between residues i and j, and Ni is the number of side chain atoms in residue i. The local interaction density dj is given by dj = Σi aij, and these densities form the diagonal elements of the degree matrix D = diag{dj}. Residue pairs with higher affinity undergo more efficient communication. The conditional probability of transmitting a signal from residue j to residue i in one time step is given by the ijth element

$$m_{ij} = d_j^{-1} a_{ij} \tag{4}$$

of the conditional probability matrix $M = A D^{-1}$. M fully defines the stochastics of information diffusion across the structure. The hitting time for the transfer of a signal from residue j to residue i is given by (Chennubhotla and Bahar, 2007; Chennubhotla et al., 2008)
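A minimal sketch of the affinity and conditional probability matrices is given below. It assumes that the affinity has the form a_ij = N_ij / sqrt(N_i N_j) used by Chennubhotla and Bahar (2007); the contact counts, atom counts and function names are hypothetical values chosen for illustration.

```python
import numpy as np

def affinity_matrix(contacts, n_atoms):
    """a_ij = N_ij / sqrt(N_i * N_j), from contact counts and per-residue atom counts."""
    contacts = np.asarray(contacts, dtype=float)
    n_atoms = np.asarray(n_atoms, dtype=float)
    return contacts / np.sqrt(np.outer(n_atoms, n_atoms))

def transition_matrix(A):
    """M = A D^-1, i.e. m_ij = a_ij / d_j with d_j = sum_i a_ij."""
    d = A.sum(axis=0)
    return A / d     # broadcasting divides column j by d_j

# Hypothetical four-residue example: symmetric atom-atom contact counts
contacts = np.array([[0, 6, 2, 0],
                     [6, 0, 5, 1],
                     [2, 5, 0, 4],
                     [0, 1, 4, 0]])
n_atoms = np.array([4, 9, 6, 8])   # assumed atom counts per residue

A = affinity_matrix(contacts, n_atoms)
M = transition_matrix(A)

# Each column of M is a probability distribution over the receiving residue i.
# Because A is symmetric, pi_j = d_j / sum(d) is left unchanged by M.
d = A.sum(axis=0)
pi = d / d.sum()
```

Each column of M sums to one, so a signal leaving residue j is always passed to some residue i in one time step.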

$$H(i,j) = \sum_{k=1}^{N} \left\{ [\Gamma^{-1}]_{kj} - [\Gamma^{-1}]_{ij} - [\Gamma^{-1}]_{ki} + [\Gamma^{-1}]_{ii} \right\} d_k \tag{5}$$

where Γ is readily evaluated from Γ = D − A. We note that H(i, j) is a function of the cross-correlations between residue fluctuations (see Eq. 1). Likewise, the commute time C(i,j) = H(i,j) + H(j,i) is expressed in terms of residue fluctuations, using

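$$C(i,j) = \left( [\Gamma^{-1}]_{ii} + [\Gamma^{-1}]_{jj} - 2[\Gamma^{-1}]_{ij} \right) \sum_{k=1}^{N} d_k = \langle \Delta R_{ij} \cdot \Delta R_{ij} \rangle \left[ \frac{\gamma}{3kT} \sum_{k=1}^{N} d_k \right] \tag{6}$$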

We note that the last term in square brackets is a constant, and increases linearly with the size of the protein (Figure S5). This quantity was used to normalize average hit-time values and standard deviations, to bring the values in all proteins to the same scale.
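The hitting and commute times can be sketched numerically as follows. This assumes Γ = D − A and the sign pattern H(i,j) = Σk {[Γ⁻¹]kj − [Γ⁻¹]ij − [Γ⁻¹]ki + [Γ⁻¹]ii} dk, with the inverse taken as a pseudo-inverse; the function name and the three-node toy network are illustrative assumptions, not the authors' code.

```python
import numpy as np

def hitting_times(A):
    """Return H with H[i, j] = expected number of steps for a signal from j to reach i."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=0)                    # local interaction densities d_k
    gamma = np.diag(d) - A               # Kirchhoff matrix, Gamma = D - A
    G = np.linalg.pinv(gamma, rcond=1e-10, hermitian=True)
    s = G.T @ d                          # s[j] = sum_k G[k, j] * d_k
    total = d.sum()
    # Terms in the order of the sum: G_kj, -G_ij, -G_ki, +G_ii, each weighted by d_k
    H = s[None, :] - G * total - s[:, None] + np.diag(G)[:, None] * total
    return H, G, total

# Hypothetical toy network: three nodes in a line with unit affinities (0 - 1 - 2)
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
H, G, total = hitting_times(A)

# Commute time from the sum of the two hitting times
C = H + H.T
# Same quantity written in terms of the diagonal and off-diagonal elements of G
C_direct = (np.diag(G)[:, None] + np.diag(G)[None, :] - 2.0 * G) * total
```

In this toy network a signal starting at an end node reaches the central node in one step, whereas the reverse transfer takes three steps on average, which illustrates that hitting times are not symmetric while commute times are.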

Acknowledgements

Support from NIH grant # 5R01LM007994-06 is gratefully acknowledged by IB. AD acknowledges useful discussions with Dr. Chennubhotla.

REFERENCES

  1. Auld DS. Zinc coordination sphere in biochemical zinc sites. Biometals. 2001;14:271–313. doi: 10.1023/a:1012976615056.
  2. Babor M, Gerzon S, Raveh B, Sobolev V, Edelman M. Prediction of transition metal-binding sites from apo protein structures. Proteins. 2008;70:208–217. doi: 10.1002/prot.21587.
  3. Bahar I, Atilgan AR, Erman B. Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Fold Des. 1997;2:173–181. doi: 10.1016/S1359-0278(97)00024-2.
  4. Bahar I, Chennubhotla C, Tobi D. Intrinsic dynamics of enzymes in the unbound state and relation to allosteric regulation. Curr Opin Struct Biol. 2007;17:633–640. doi: 10.1016/j.sbi.2007.09.011.
  5. Bahar I, Lezon TR, Bakan A, Shrivastava IH. Normal mode analysis of biomolecular structures: functional mechanisms of membrane proteins. Chem Rev. 2010;110:1463–1497. doi: 10.1021/cr900095e.
  6. Bakan A, Bahar I. The intrinsic dynamics of enzymes plays a dominant role in determining the structural changes induced upon inhibitor binding. Proc Natl Acad Sci U S A. 2009;106:14349–14354. doi: 10.1073/pnas.0904214106.
  7. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
  8. Calhoun JR, Liu W, Spiegel K, Dal Peraro M, Klein ML, Valentine KG, Wand AJ, DeGrado WF. Solution NMR structure of a designed metalloprotein and complementary molecular dynamics refinement. Structure. 2008;16:210–215. doi: 10.1016/j.str.2007.11.011.
  9. Carafoli E. Calcium signaling: a tale for all seasons. Proc Natl Acad Sci U S A. 2002;99:1115–1122. doi: 10.1073/pnas.032427999.
  10. Castagnetto JM, Hennessy SW, Roberts VA, Getzoff ED, Tainer JA, Pique ME. MDB: the Metalloprotein Database and Browser at The Scripps Research Institute. Nucleic Acids Res. 2002;30:379–382. doi: 10.1093/nar/30.1.379.
  11. Chennubhotla C, Bahar I. Signal propagation in proteins and relation to equilibrium fluctuations. PLoS Comput Biol. 2007;3:1716–1726. doi: 10.1371/journal.pcbi.0030172.
  12. Chennubhotla C, Yang Z, Bahar I. Coupling between global dynamics and signal transduction pathways: a mechanism of allostery for chaperonin GroEL. Mol Biosyst. 2008;4:287–292. doi: 10.1039/b717819k.
  13. Cui Q, Bahar I. Normal Mode Analysis: Theory and Applications to Biological and Chemical Systems. Chapman & Hall/CRC; London, UK: 2006.
  14. Ebert JC, Altman RB. Robust recognition of zinc binding sites in proteins. Protein Sci. 2008;17:54–65. doi: 10.1110/ps.073138508.
  15. Eisenmesser EZ, Millet O, Labeikovsky W, Korzhnev DM, Wolf-Watz M, Bosco DA, Skalicky JJ, Kay LE, Kern D. Intrinsic dynamics of an enzyme underlies catalysis. Nature. 2005;438:117–121. doi: 10.1038/nature04105.
  16. Eyal E, Najmanovich R, McConkey BJ, Edelman M, Sobolev V. Importance of solvent accessibility and contact surfaces in modeling side-chain conformations in proteins. J Comput Chem. 2004;25:712–724. doi: 10.1002/jcc.10420.
  17. Geremia S, Di Costanzo L, Randaccio L, Engel DE, Lombardi A, Nastri F, DeGrado WF. Response of a designed metalloprotein to changes in metal ion coordination, exogenous ligands, and active site volume determined by X-ray crystallography. J Am Chem Soc. 2005;127:17266–17276. doi: 10.1021/ja054199x.
  18. Golovin A, Dimitropoulos D, Oldfield T, Rachedi A, Henrick K. MSDsite: a database search and retrieval system for the analysis and viewing of bound ligands and active sites. Proteins. 2005;58:190–199. doi: 10.1002/prot.20288.
  19. Gourley DG, Schuttelkopf AW, Anderson LA, Price NC, Boxer DH, Hunter WN. Oxyanion binding alters conformation and quaternary structure of the C-terminal domain of the transcriptional regulator ModE. Implications for molybdate-dependent regulation, signaling, storage, and transport. J Biol Chem. 2001;276:20641–20647. doi: 10.1074/jbc.M100919200.
  20. Haliloglu T, Bahar I, Erman B. Gaussian Dynamics of Folded Proteins. Phys Rev Lett. 1997;79:3090–3093.
  21. Hantke K. Iron and metal regulation in bacteria. Curr Opin Microbiol. 2001;4:172–177. doi: 10.1016/s1369-5274(00)00184-3.
  22. Hinsen K. Analysis of domain motions by approximate normal mode calculations. Proteins. 1998;33:417–429. doi: 10.1002/(sici)1097-0134(19981115)33:3<417::aid-prot10>3.0.co;2-8.
  23. Hinsen K, Petrescu A-J, Dellerue S, Bellissent-Funel M-C, Kneller GR. Harmonicity in slow protein dynamics. Chemical Physics. 2000;261:25–37.
  24. Jain E, Bairoch A, Duvaud S, Phan I, Redaschi N, Suzek BE, Martin MJ, McGarvey P, Gasteiger E. Infrastructure for the life sciences: design and implementation of the UniProt website. BMC Bioinformatics. 2009;10:136. doi: 10.1186/1471-2105-10-136.
  25. Kendrik MJ, Plishka MJ, Robinson KD. Metals in Biological Systems. Prentice Hall Professional Technical Reference; New York: 1992.
  26. Kundu S, Melton JS, Sorensen DC, Phillips GN Jr. Dynamics of proteins in crystals: comparison of experiment with simple models. Biophys J. 2002;83:723–732. doi: 10.1016/S0006-3495(02)75203-X.
  27. Lange OF, Lakomek NA, Fares C, Schroder GF, Walter KF, Becker S, Meiler J, Grubmuller H, Griesinger C, de Groot BL. Recognition dynamics up to microseconds revealed from an RDC-derived ubiquitin ensemble in solution. Science. 2008;320:1471–1475. doi: 10.1126/science.1157092.
  28. Lin CT, Lin KL, Yang CH, Chung IF, Huang CD, Yang YS. Protein metal binding residue prediction based on neural networks. Int J Neural Syst. 2005;15:71–84. doi: 10.1142/S0129065705000116.
  29. Lin HH, Han LY, Zhang HL, Zheng CJ, Xie B, Cao ZW, Chen YZ. Prediction of the functional class of metal-binding proteins from sequence derived physicochemical properties by support vector machine approach. BMC Bioinformatics. 2006;7(Suppl 5):S13. doi: 10.1186/1471-2105-7-S5-S13.
  30. McConkey BJ, Sobolev V, Edelman M. Quantification of protein surfaces, volumes and atom-atom contacts using a constrained Voronoi procedure. Bioinformatics. 2002;18:1365–1373. doi: 10.1093/bioinformatics/18.10.1365.
  31. Nicolay S, Sanejouand YH. Functional modes of proteins are among the most robust. Phys Rev Lett. 2006;96:078104. doi: 10.1103/PhysRevLett.96.078104.
  32. Passerini A, Andreini C, Menchetti S, Rosato A, Frasconi P. Predicting zinc binding at the proteome level. BMC Bioinformatics. 2007;8:39. doi: 10.1186/1471-2105-8-39.
  33. Passerini A, Punta M, Ceroni A, Rost B, Frasconi P. Identifying cysteines and histidines in transition-metal-binding sites using support vector machines and neural networks. Proteins. 2006;65:305–316. doi: 10.1002/prot.21135.
  34. Tainer JA, Roberts VA, Getzoff ED. Protein metal-binding sites. Curr Opin Biotechnol. 1992;3:378–387. doi: 10.1016/0958-1669(92)90166-g.
  35. Tama F, Brooks CL. Symmetry, form, and shape: guiding principles for robustness in macromolecular machines. Annu Rev Biophys Biomol Struct. 2006;35:115–133. doi: 10.1146/annurev.biophys.35.040405.102010.
  36. Tirion MM. Large Amplitude Elastic Motions in Proteins from a Single-Parameter, Atomic Analysis. Phys Rev Lett. 1996;77:1905–1908. doi: 10.1103/PhysRevLett.77.1905.
  37. Tobi D, Bahar I. Structural changes involved in protein binding correlate with intrinsic motions of proteins in the unbound state. Proc Natl Acad Sci U S A. 2005;102:18908–18913. doi: 10.1073/pnas.0507603102.
  38. Yang LW, Bahar I. Coupling between catalytic site and collective dynamics: a requirement for mechanochemical activity of enzymes. Structure. 2005;13:893–904. doi: 10.1016/j.str.2005.03.015.
