Abstract
Normal mode analysis offers an efficient way of modeling the conformational flexibility of protein structures. We use anisotropic displacement parameters from crystallography to test the quality of prediction of both the magnitude and directionality of conformational flexibility. Normal modes from four simple elastic network model potentials and from the CHARMM forcefield are calculated for a data set of 83 diverse, ultrahigh resolution crystal structures. While all five potentials provide good predictions of the magnitude of flexibility, all-atom potentials have a clear edge at prediction of directionality, and the CHARMM potential has the highest prediction quality. The low-frequency modes from different potentials are similar, but those computed from the CHARMM potential show the greatest difference from the elastic network models. The comprehensive evaluation demonstrates the costs and benefits of using normal mode potentials of varying complexity.
Keywords: normal mode analysis, protein dynamics, anisotropic displacement parameters, elastic network models
Introduction
The native state of a protein is an ensemble of conformers, deviating to some extent from the average coordinates reported as the experimental structure. Knowledge of the static structure is not sufficient for understanding the functional mechanisms, which often depend on the flexibility of protein structures. Experimental observation of conformational motion of biomolecules is becoming possible, thanks to experimental innovation, but remains a formidable challenge. Crystals can be subjected to time-resolved experiments (Moffat, 2001), but the range of applications is limited to reactions that can be triggered by light or trapped by clever manipulations. NMR spectroscopy can be used to determine both the structure and the dynamics of proteins (Lindorff-Larsen et al., 2005), but it is limited both by the maximum size of protein structures and by the difficulty of discrimination of slowly or quickly exchanging dynamics (Palmer et al., 2001). Mass spectrometry coupled with hydrogen/deuterium exchange and proteolysis has been used to determine changes in the relative solvent accessibility of amide hydrogens (Lanman and Prevelige, 2004) and single-molecule experiments using optical trapping have resulted in spectacular observations of the motion of motor proteins (Abbondanzieri et al., 2005). In general, direct measurement of molecular motion remains laborious and limited.
Computer simulations of biological macromolecules enable detailed explorations of the conformational ensemble near the native state (Karplus and Kuriyan, 2005). However, the computational cost of molecular dynamics with all-atom forcefields limits the accessible timescale of simulations, particularly of large molecular assemblies. Thus, approximate methods, such as normal mode analysis (NMA), are often used to efficiently describe the allowed conformational ensemble of protein structures (Brooks and Karplus, 1983; Go et al., 1983; Levitt et al., 1985). The decomposition into modes with different frequencies reduces the dimensionality of the problem, since a few lowest-frequency modes describe the most dominant directions of motion (Teodoro et al., 2003). These global modes have been used to predict protein flexibility (Cui et al., 2004; Van Wynsberghe et al., 2004) and to study the mechanism of conformational transitions necessary for protein function (Ma and Karplus, 1997). Simple coarse-grained potentials, such as Elastic Network Models, provide an efficient description of a protein structure by connecting atoms or residues within a certain distance with identical harmonic potentials (Tirion, 1996). Despite the extreme simplification, these models capture the basic topology of a structure and generate predictions on the flexibility and preferred modes of motion of proteins that are in general agreement with experimental data (Bahar and Rader, 2005).
The study of protein conformational dynamics requires interplay between experiment and computation. A readily available measure of conformational mobility is the Debye-Waller temperature factor, or B-factor, which models the variance in atomic position from the scattering data. It has been used as a source of information on protein flexibility for decades (Frauenfelder et al., 1979), and as computational methodologies have matured, studies over large numbers of crystal structures have shown good agreement with computations, specifically with ENM results (Kundu et al., 2002). While the classic B-factor has long been a routine parameter in protein structure refinement, until recently few crystal data sets contained sufficiently many observations (unique reflections) to allow determination of anisotropic displacement parameters (ADPs). These parameters model the probability distribution of atomic positions as a Gaussian function with ellipsoidal contours, and have been shown to significantly improve the refinement statistics for crystal structures of biological macromolecules (Dauter et al., 1997; Esposito et al., 2000; Longhi et al., 1997) at resolution better than 1.2 Å. ADPs have been used in a few studies as a qualitative indicator of the directionality of prevalent motion in a protein structure (Wilson and Brunger, 2000), but this source of experimental information has not been systematically exploited.
The proliferation of various simple ENM-like models for macromolecular fluctuation raises the question of their relative fidelity and reliability, but no systematic comparison of the methods has been undertaken, to the best of our knowledge. Recently, systematic assessments of individual ENM potentials were published: a validation of a model based on Cα coordinates (Anisotropic Network Model) using B-factors from a diverse set of crystal structures (Eyal et al., 2006), and a study showing that adding residue-specific parameters into the same model led to large improvement in B-factor prediction (Hamacher and McCammon, 2006). Our group had used the 98 highest resolution crystal structures in the PDB for systematic evaluation of prediction of the magnitude of motion in protein structures using an isotropic ENM model, and demonstrated that using atomic information and strengthening the model parameter for covalent interaction resulted in better prediction quality (Kondrashov et al., 2006). In the present work, we compare the quality of prediction of the magnitude and direction of structural variance for the most commonly used anisotropic ENM potentials, and introduce a new one to better model different chemical interactions. Comparison between ADPs and computational variance matrices allows a quantitative evaluation of the merits and drawbacks of different potentials. We also investigate the effect of the choice of potential on the global dynamic properties, such as the correlation matrix.
Results
Analysis of crystallographic data
The present study evaluates predictions of five coarse-grained normal mode potentials using a set of anisotropic displacement parameters from ultra-high resolution crystal structures. The Protein Data Bank (Berman et al., 2000) was searched for all X-ray crystal structures of proteins with chain length of at least 50 residues, with resolution at or beyond 1 Å, with the restriction that the structures have less than 50% sequence identity. 83 such structures were deposited with anisotropic displacement parameters, containing a total of 17763 protein residues. Excluding those with disordered Cα atoms or those involved in intermolecular crystal contacts, both of which have an effect on the ADP, left 12348 residues with usable ADPs. The anisotropic displacement parameters are commonly represented as ellipsoids in crystal structures, as shown in Figure 1, and contain information about both the magnitude and the preferred direction of atomic variation in the crystal. The anisotropy of the ellipsoid, defined as the ratio of the smallest to the largest eigenvalue of the ellipsoid matrix (Trueblood et al., 1996), is a measure of deviation from spherical shape. We separated the structures by refinement software used, and found different distributions of anisotropy for the Cα ADPs. 68 structures were refined using SHELX (Sheldrick and Schneider, 1997), and the remaining 15 were refined using Refmac from the CCP4 suite (CCP4, 1994). The Cα ADPs in the Refmac set had a mean anisotropy of 0.64, compared with 0.51 for the SHELX set (Figure 2), suggesting that the crystallographic restraints used in the two programs have significant effects on the resulting ADPs. Since sphere-like ellipsoids contain little directional information, a subset of ADPs with anisotropy of less than 0.5 was chosen, leaving 4642 ADPs to compare with the computational predictions of directionality of variance.
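As a minimal sketch of the anisotropy measure defined above, the following Python computes the ratio of the smallest to the largest eigenvalue of a symmetric 3×3 ADP matrix. The function names and the closed-form eigenvalue routine are illustrative choices made here, not part of the original analysis pipeline.

```python
import math

def symmetric_3x3_eigenvalues(u):
    """Eigenvalues (ascending) of a symmetric 3x3 matrix, trigonometric closed form."""
    off_diag = u[0][1] ** 2 + u[0][2] ** 2 + u[1][2] ** 2
    q = (u[0][0] + u[1][1] + u[2][2]) / 3.0          # mean of the eigenvalues
    if off_diag == 0.0:                               # already diagonal
        return sorted([u[0][0], u[1][1], u[2][2]])
    p2 = sum((u[i][i] - q) ** 2 for i in range(3)) + 2.0 * off_diag
    p = math.sqrt(p2 / 6.0)
    # b = (u - q*I) / p has eigenvalues 2*cos(angle); recover them from det(b).
    b = [[(u[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))              # guard against round-off
    phi = math.acos(r) / 3.0
    e_max = q + 2.0 * p * math.cos(phi)
    e_min = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e_mid = 3.0 * q - e_max - e_min                   # trace is conserved
    return [e_min, e_mid, e_max]

def anisotropy(u):
    """Ratio of smallest to largest eigenvalue: 1 for a sphere, near 0 for a needle."""
    eigenvalues = symmetric_3x3_eigenvalues(u)
    return eigenvalues[0] / eigenvalues[2]
```

With such a function, the selection of directional ADPs described above amounts to keeping the Cα atoms for which `anisotropy(u) < 0.5`.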
Normal mode potentials
Elastic Network Models are dependent on two parameters: the cutoff distance (Tirion, 1996), which separates atom pairs deemed in contact from those which are not interacting, and a force constant for the interaction between contacting atoms. We have shown recently that a stronger force constant between covalently bound residues resulted in greatly improved variance prediction quality for an isotropic ENM (Kondrashov et al., 2006), compared with the single force constant GNM (Bahar et al., 1997). In this work, we introduce a new ENM method, called distance-based network model (DNM), with multiple force constants for atomic contacts, as described in Methods. It is clear that atom pairs closer than 2.3 Å are covalently bound and thus have stronger interactions than those 5 Å apart. To mimic the chemistry, several discrete distance ranges were defined, and force constants were varied to optimize the agreement with ADPs. We found that similar results were obtained if the force constant for each category was set to the reciprocal of the total number of contacts in that range. Since the number of atomic contacts grows with distance, this ensures that interactions between atoms farther away are represented by weaker force constants than those in close proximity. The atomic interactions are added up with the appropriate force constants for each residue, producing a residue-level model based on atomic interactions, with no additional free parameters, since the force constants are defined based on the contact matrices. The only parameter not defined from the structure is the maximum cutoff distance considered, and we optimized it by comparing prediction quality in calculations with a range of cutoffs from 5 Å to 11 Å. The agreement of the model predictions with the magnitudes and directions of the ADP ellipsoids is shown in Table 1 of SI. With the exception of the 5 Å cutoff, the results were very similar, and 9 Å was selected as the optimal cutoff distance.
We also tested four existing normal mode potentials, three of the ENM variety, and one based on the CHARMM forcefield. The ElNemo method (Suhre and Sanejouand, 2004) depends on the cutoff distance between atoms, and the atomic contact matrices are combined into rigid-motion blocks on a residue level. We varied the cutoff distance from 5 to 11 Å (Table 2 of SI), and obtained the best results at 5 Å rather than at the default value of 8 Å. The anisotropic network model (ANM) (Atilgan et al., 2001) depends on a cutoff distance between Cα atoms, and we evaluated the results for a range from 10 to 16 Å (see Table 3 of SI). The variation is also relatively small, but there is an opposite trend between quality of prediction of direction and magnitude of motion. The best cutoff for directionality prediction was at 10 Å, while the best agreement in magnitude was with the 16 Å cutoff, in contrast to the previously used value of 13 Å (Atilgan et al., 2001). Normal modes were computed using an atomistic forcefield (CHARMM) with rigid-body blocks for residues, referred to as Block Normal Modes (BNM) (Li and Cui, 2002). The inclusion of non-protein ligands and cofactors in the CHARMM force-field resulted in significant increases in quality of prediction, and thus all the non-protein residues for which CHARMM libraries could be found were added to the models. The last method used is the Harmonic Cα potential (HCA) with distance-dependent force constant (Hinsen et al., 2000), as implemented in the Molecular Modeling Toolkit (MMTK) (Hinsen, 2000).
Comparison of crystallographic and computational variance
Anisotropic covariance tensors were computed from the 100 lowest frequency normal modes (excluding the trivial rotation and translation modes, see Methods) from 83 structures, and fidelity of both magnitude and direction prediction was assessed. Magnitude prediction quality was measured by linear correlation between isotropic ADPs (B-factors) and the predicted isotropic variances over each structure. Two different measures were used for directional agreement: the absolute value of the dot product between the longest axes of the experimental and predicted ellipsoids, and the volume overlap fraction, as defined in Methods. These two measures were employed to compare pairs of corresponding residues, and the reported numbers are the statistics over all sufficiently anisotropic ellipsoids from all 83 structures. Table 1 shows that prediction quality was markedly different for the magnitude and direction of motion. All the models had average isotropic correlations of 0.66–0.68, with the exception of 0.61 for HCA. On the other hand, there was considerable variation in the directional agreement of ADP ellipsoids. The two measures of directional agreement, the dot product and the overlap fraction, largely showed the same trend, with HCA and ANM displaying relatively weak agreement, while ElN, DNM and BNM showed considerably higher prediction quality, with CHARMM-based BNM having an edge over the ENM methods.
Table 1.
Method | Dot product (a) | Overlap fraction (a) | Isotropic correlation (b) |
---|---|---|---|
Random (c) | 0.5 | 0.3 | 0 |
HCA | 0.599 (0.199) | 0.520 (0.167) | 0.617 (0.131) |
ANM (16 Å) (d) | 0.556 (0.190) | 0.525 (0.172) | 0.676 (0.111) |
ElN (5 Å) (d) | 0.641 (0.208) | 0.583 (0.184) | 0.680 (0.128) |
DNM (9 Å) (d) | 0.650 (0.209) | 0.575 (0.181) | 0.655 (0.136) |
CHARMM BNM | 0.655 (0.211) | 0.608 (0.188) | 0.658 (0.128) |
(a) Mean and standard deviation (in parentheses) of comparisons with ADPs from individual Cα atoms.
(b) Mean and standard deviation (in parentheses) of the isotropic correlation over 83 entire structures.
(c) Computed for randomly oriented ADPs with anisotropy 0.5.
(d) Choice of optimal cutoff parameter in parentheses.
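The isotropic column of Table 1 is a per-structure linear correlation between B-factors and predicted isotropic variances. A minimal sketch of that calculation is given below, assuming the predicted isotropic variance of an atom is one third of the trace of its 3×3 variance block; the function names are hypothetical.

```python
import math

def isotropic_variance(block):
    """Isotropic equivalent of a 3x3 variance block: one third of its trace."""
    return (block[0][0] + block[1][1] + block[2][2]) / 3.0

def pearson(xs, ys):
    """Linear (Pearson) correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Because the correlation coefficient is invariant to scaling, the unknown overall prefactor of the predicted variances does not need to be fitted for this comparison.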
The mean absolute value of the dot product is easy to interpret as a measure of the angle between the preferred directions of the experimental and computed ellipsoids. The average value of 0.65 for BNM corresponds to an angle of about 49°, while the average of 0.56 for ANM corresponds to an angle of 56°, but this does not tell the whole story because it only compares one principal axis of the ellipsoids. The overlap coefficient is the volume fraction shared by two ellipsoids of unit volume, and this quantity varies appreciably from 0.52 for HCA to 0.61 for BNM. We tested the hypothesis that predictions agree no better than expected from a random uniform distribution of ellipsoid direction, for which the mean dot product is 0.5, and the mean overlap fraction is 0.3 (when anisotropy is fixed at 0.5). Almost all of the structures with a reasonable sample of usable ADPs (with anisotropy < 0.5) showed better than random agreement in overlap fraction (P<0.01, see SI). For the method with lowest agreement, HCA, 11 structures did not meet this criterion, and only four had more than 10 sufficiently anisotropic ADPs, with the highest at 24. The results for the best-performing BNM method had only four structures where the null hypothesis could not be rejected, all of which had only 5 or fewer usable ADPs.
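The two dot-product statements above (the conversion to an angle and the random baseline of 0.5) can be checked numerically. The sketch below is illustrative only: the function names, the sample size and the seed are assumptions, and the random baseline is estimated by Monte Carlo sampling of uniformly distributed directions rather than taken from the original analysis.

```python
import math
import random

def angle_from_dot(abs_dot):
    """Angle in degrees whose cosine equals the given absolute dot product."""
    return math.degrees(math.acos(abs_dot))

def random_mean_abs_dot(n_samples=100_000, seed=0):
    """Monte Carlo estimate of the mean |cos(angle)| between a fixed axis
    and directions drawn uniformly on the sphere."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # A normalized Gaussian vector is uniformly distributed in direction.
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        total += abs(z) / math.sqrt(x * x + y * y + z * z)
    return total / n_samples
```

Note that `angle_from_dot` applied to a mean dot product gives the angle corresponding to that mean, which is not identical to the mean of the individual angles.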
To illustrate the importance of including a large subset of normal modes for accurate variance prediction, we performed the computations with different numbers of modes from CHARMM BNM, shown in Table 2. In NMA, the reciprocal of the eigenvalue (frequency squared) represents the contribution of the mode to the total variance, and the first column shows the cumulative fraction of variance of the first 100 modes represented by the subset. The first 10 modes account for nearly half of the variance, but the prediction quality for all three measures is considerably lower than for the full 100 modes, and shows monotonic improvement with inclusion of additional modes. The effect is dramatic for the overlap fraction, largely due to the contribution of higher frequency modes to “rounding” of the computed ellipsoids, leading to higher overlap volume with the relatively isotropic ADPs. However, the dot product and the isotropic correlation, which are independent of anisotropy, show consistent improvement with inclusion of additional higher-frequency modes, showing that calculations using only a handful of lowest-frequency modes are likely imprecise.
Table 2.
Modes included | Variance fraction (a) | Dot product (b) | Overlap fraction (b) | Isotropic correlation (c) |
---|---|---|---|---|
3 modes | 0.272 | 0.632 | 0.153 | 0.531 |
5 modes | 0.348 | 0.634 | 0.305 | 0.557 |
10 modes | 0.473 | 0.640 | 0.451 | 0.600 |
20 modes | 0.623 | 0.643 | 0.536 | 0.622 |
30 modes | 0.720 | 0.645 | 0.563 | 0.632 |
40 modes | 0.791 | 0.646 | 0.578 | 0.640 |
50 modes | 0.847 | 0.648 | 0.589 | 0.645 |
60 modes | 0.894 | 0.650 | 0.594 | 0.650 |
(a) Variance represented by the indicated subset of modes as a fraction of the variance from the 100 lowest frequency modes, excluding rigid-body modes.
(b) Mean of comparisons with ADPs from individual Cα atoms.
(c) Mean isotropic correlation over 83 entire structures.
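The variance-fraction column of Table 2 follows from the statement that each mode contributes to the total variance in proportion to the reciprocal of its eigenvalue. A minimal sketch, assuming the input list holds only the non-trivial eigenvalues (rigid-body modes already removed) and using a hypothetical function name:

```python
def variance_fraction(eigenvalues, n_subset):
    """Fraction of the total variance carried by the n_subset lowest modes.

    Each mode contributes 1/eigenvalue (eigenvalue = frequency squared), so
    the lowest-frequency modes dominate the sum.
    """
    contributions = [1.0 / lam for lam in sorted(eigenvalues)]
    return sum(contributions[:n_subset]) / sum(contributions)
```

For example, with eigenvalues 1, 2 and 4 the contributions are 1, 0.5 and 0.25, so the single lowest mode carries 1/1.75 of the variance.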
Effect of potential on global dynamic ensembles
Comparison of the normal modes produced by different methods revealed a clear distinction between the harmonic ENM models and the CHARMM forcefield BNM. We used the modes computed from all 83 structures to investigate how the dynamic ensemble predictions depend on the use of the potential. The overlap measure described in Methods was used to compare the 17 lowest-frequency modes from all 5 models. Figure 3 shows the agreement between individual modes for all 10 pairs of potentials, averaged over all 83 structures. The highest agreement was observed for the lowest-frequency modes, but the overlap measure dropped below 0.5, depending on the pair of methods, at some point in the first 15 modes. This demonstrated that the details of potential play a secondary role at lowest-frequency modes, which are dominated by the contact topology and shape of the molecular structure. A second observation is the distinctiveness of modes derived from CHARMM-based BNM, which showed much lower overlap with ENM-based methods (dotted lines) than overlap among modes from ENM-type potentials (solid lines), with the single exception of the overlap between ANM and ElN. Since the latter is an all-atom potential, it is reasonable that it should be closer to chemistry-based BNM than to Cα-based ANM. We tested the possibility that minimization of structures prior to BNM is responsible for the difference in BNM modes, by calculating DNM modes from the minimized structures. The resulting average overlap with BNM was 0.76 as opposed to 0.75 for BNM with DNM from unminimized structures, still much lower than DNM agreement with other methods. This suggests that the chemical information present in the all-atom CHARMM potential plays a role in determining the lowest-frequency modes, in addition to the topology of the structure.
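The overlap measure itself is defined in a part of Methods not reproduced here. A common choice, assumed in the sketch below, is the absolute value of the normalized dot product between two mode vectors expressed on the same set of coordinates (for instance, the Cα components of each mode); the function name is hypothetical.

```python
import math

def mode_overlap(u, v):
    """Absolute normalized dot product between two mode vectors.

    Returns 1 for identical directions (regardless of sign) and 0 for
    orthogonal modes. Both vectors must describe the same coordinates.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return abs(dot) / (norm_u * norm_v)
```

Taking the absolute value reflects the fact that the sign of an eigenvector is arbitrary.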
To illustrate the differences between the chemical forcefield and ENM, we picked a small, well-studied structure from the data set, a PDZ2 domain from syntenin (PDB ID 1R6J) and computed the correlation matrices (see Methods) from the 90 low-frequency modes of ANM and BNM. Figure 4 shows correlation matrices computed from ANM and BNM modes. In general, they look quite similar, with major features determined by the secondary structure elements: anti-parallel beta sheets appear as positive bands perpendicular to the diagonal, and the two helices result in a thickening of the diagonal band. While the pattern of secondary structures is clear in both potentials, there are evident differences. First, the magnitude of correlation is at least two times weaker in ANM (see the colorbar), and the secondary structure features are not as clear, due to the inclusion of residues as far as 16 Å away. Second, due to identical force constants for distant and proximal interactions, the diagonal band is considerably weaker in the ANM plot than in BNM, which has a more realistic representation of covalent bonds and other main-chain interactions. Both potentials capture the effect of gross topology, but the effects of specific chemistry are hidden in the fine details of the BNM correlation matrix.
Discussion
We analyzed five different coarse-grained potentials used to model the conformational flexibility of protein structures. These were evaluated both by validation against experimental data and by comparison among the different potentials. To our knowledge this is the first systematic attempt to use anisotropic displacement parameters to validate computational predictions, and it behooves us to note the challenges arising from using this data source. The reliability of ADPs has been tested before (Merritt, 1999), with good agreement in ellipsoid shape observed between independently determined structures of the same protein; we found the same to be true for structures of myoglobin in four different crystal forms (Kondrashov et al., submitted). This shows that ADPs are robust experimental parameters, and to minimize the noise contributions we used the highest resolution crystal structures available. However, quantitative comparison between ADPs and computational predictions is not straightforward, due to contributions of experimental noise, model error (Kuriyan et al., 1986), rigid-body motion of the entire molecules (Kuriyan and Weis, 1991), and specifics of crystal environment, such as crystal contacts between copies of the protein packed in the lattice (Phillips, 1990) and collective lattice modes (Clarage et al., 1992). Further, the ADP represents the best fit of a Gaussian distribution to the electron density of an atom, but anharmonic and multimodal positional distributions are expected for protein atoms, especially in mobile regions, such as the surface. Only atoms with pronounced anisotropy are used for directional comparison, and these tend to lie in mobile regions with poorer electron density (see Figure 1), which are not adequately modeled with a single conformer (DePristo et al., 2004). Thus it is likely that many atoms in our directional dataset are not adequately modeled by the Gaussian ADP model.
Despite these caveats, our results show good agreement between the predicted and experimental ADPs: for virtually all structures, the overlap fraction between BNM predictions and ADPs is significantly higher than the expectation from a uniform random variable. This suggests that the influence of the factors listed above is not sufficient to overwhelm the important contribution of intramolecular conformational flexibility. This is consistent with a recent comparison of MD simulations with crystallographic B-factors which estimated that rigid-body motions contribute only 20–30% of total positional variance in B-factors (Meinhold and Smith, 2005). The agreement between computation and experiment serves to validate both the interpretation of the experimental data and the reliability of computational predictions.
In our analysis we combined multiple low-frequency normal modes, weighted by the calculated frequencies, to generate the anisotropic variance for each residue, and compared the result with the crystallographic variation. This method has been used in previous work applying normal modes to crystallographic refinement (Kidera and Go, 1990), but is not in common use for validating normal modes with experimental displacements. Instead, the procedure is often to project low-frequency modes individually onto a conformational change, and to obtain a cumulative projection coefficient. This, however, is impossible to do without prior knowledge of the conformational change in the structure, and gives only an agreement between the subspace spanned by several modes and the conformational change. Our approach does not presume any knowledge beyond the initial structure, and measures agreement with the entire normal mode ensemble, rather than with individual modes.
This is also, as far as we know, the first large-scale comparative study of coarse-grained normal mode methods. Comparison of the modes from different potentials reveals a distinct split between the ENM methods and BNM, as seen in Figure 3. This suggests that the chemical information absent in the ENM potential is observable in the BNM results, although there is significant similarity at low frequency modes due to the shape of the structure reflected in both potential types. The observation opens up a possibility of separating the effect of gross protein structure from that of detailed residue chemistry as reflected by the CHARMM forcefield. A careful comparison of ENM predictions with those from normal modes with chemical forcefield could potentially be used to determine residues whose chemistry plays a key role in the dynamic coupling in the structure, and which would therefore be especially sensitive to mutation. The visual comparison of the correlation patterns from ANM and BNM demonstrates that the chemical effects are subtle in comparison to the topological features captured by both BNM and ANM, and all the other methods.
The choice of computational strategy to address a given problem involves balancing computational efficiency against model detail. Fast calculations are meaningless if they give unreliable results, and extremely accurate calculations are of no use if they cannot be completed in a reasonable time frame. Normal mode analysis is based on a choice to limit the model to the neighborhood of the potential minimum. Further simplification of using an elastic network model potential instead of a physical, all-atom potential is another concession towards efficient calculation and away from physical reality. We found that prediction quality of the magnitude of flexibility is similar for CHARMM BNM and all ENM models, with the exception of HCA. This is consistent with a recent comparison of different levels of ENM potentials, which found that addition of all-atom coordinates resulted in only small improvement in B-factor agreement (Sen et al., 2006). The results once again demonstrate the robustness of the elastic network models, and suggest that the main factor in determining macromolecular flexibility is the number of local contacts, determined by the shape of the molecule (Halle, 2002). In prediction of directionality of motion, there is a clear difference between methods that are based only on Cα coordinates (HCA and ANM) and those that consider all atoms. CHARMM-based BNM has the best directional agreement as measured by the overlap fraction, while our new method, DNM, and ElN come close to matching this standard. This suggests that an all-atom ENM potential can give an accurate representation of the conformational ensemble of a protein near the native state, but the inclusion of chemical forces improves the model.
We must also consider the cost, both computational and human, required by the different methods. One of the main differences between elastic network model techniques and BNM is that the latter requires an initial minimization step (see Methods). If minimization is not complete, subsequent diagonalization will lead to spurious modes with large, negative frequencies; one must be careful to pick only productive modes when using results from BNM, while elastic network models are at a local minimum by construction. Further, the initial setup with an all-atom potential requires attention to the individual oddities of each structure: disulfide bonds, non-standard residues, bound ligands or cofactors. Each of these issues must be dealt with individually, thus making automation of the calculations more difficult. Compared with ENM models, in which most of these details are ignored, CHARMM-based normal modes require a great deal of human effort.
The present results indicate that anisotropic temperature factors from high resolution crystal structures contain a measure of internal molecular flexibility, and can be used as a source of dynamic information and as a test for computational methodologies. Comparison of different methods indicates that elastic network models can describe the conformational ensemble of protein structures with accuracy approaching that of CHARMM, but that there is a substantial spread in prediction quality of different ENM potentials. Using an exclusively Cα-based potential results in a large sacrifice in prediction quality of directionality, but the lowest frequency modes are robust across the methods. The information may help those studying interactions within biological molecules choose the appropriate level of complexity for the system of interest and for the level of detail required of the prediction.
Methods
We use normal mode analysis (Brooks and Karplus, 1983; Go et al., 1983; Levitt et al., 1985) to predict the positional ensemble of protein structures. The different models use distinct potentials, all of which require the knowledge of protein structure. The Hessian matrix of the potential is diagonalized to find the normal modes, or eigenvectors $u^k$, and the corresponding frequencies $\omega_k$. The decomposition allows us to compute the covariance matrix, which is proportional to the pseudo-inverse of the Hessian. Let $\delta_i$ be the deviation from the mean for component i, then the covariance between two deviations is:

$$\langle \delta_i \delta_j \rangle = k_B T \sum_k \frac{u_i^k u_j^k}{\omega_k^2}$$

where brackets denote mean value, $k_B T$ is the thermal energy, and $u_i^k$ is the i-th component of the k-th normal mode with frequency $\omega_k$. Note that the modes with the lowest frequencies make the greatest contribution to residue mobility, so a small fraction of all the modes is sufficient to obtain a good approximation of the sum. This allows us to compute anisotropic variances as 3×3 blocks around the diagonal of the covariance matrix (Kidera and Go, 1990).
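The mode sum described above (each mode weighted by the inverse square of its frequency) can be sketched for a single atom as follows. This is an illustration under stated assumptions: modes are given as flat lists of 3N components with rigid-body modes already removed, the prefactor `kT` is set to 1 in arbitrary units, and the function name is hypothetical.

```python
def covariance_block(modes, omegas, atom, kT=1.0):
    """3x3 anisotropic variance block of one atom from a set of normal modes.

    modes  : list of eigenvectors, each a flat list [x0, y0, z0, x1, ...]
    omegas : matching list of mode frequencies
    atom   : zero-based index of the atom (or residue) of interest
    """
    block = [[0.0] * 3 for _ in range(3)]
    for u, omega in zip(modes, omegas):
        weight = kT / (omega * omega)          # low frequencies dominate
        for a in range(3):
            for b in range(3):
                block[a][b] += weight * u[3 * atom + a] * u[3 * atom + b]
    return block
```

The resulting block plays the same role as an experimental ADP matrix, so its eigenvalues and principal axes can be compared directly with the crystallographic ellipsoid.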
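We may also compute the correlation coefficient between the deviations of any two atoms, to generate the global correlation matrix:

$$C_{ab} = \frac{\langle \boldsymbol{\delta}_a \cdot \boldsymbol{\delta}_b \rangle}{\sqrt{\langle \boldsymbol{\delta}_a^2 \rangle \langle \boldsymbol{\delta}_b^2 \rangle}}$$

where $\boldsymbol{\delta}_a$ denotes the vector of the three Cartesian deviations of atom a.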
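A minimal sketch of how a global correlation matrix can be obtained from a full 3N×3N covariance matrix is shown below. It assumes the usual normalized form, in which the trace of each off-diagonal 3×3 block is divided by the square root of the product of the traces of the two diagonal blocks; the function name is hypothetical.

```python
import math

def correlation_matrix(cov):
    """N x N matrix of correlation coefficients from a 3N x 3N covariance matrix."""
    n_atoms = len(cov) // 3

    def block_trace(i, j):
        # Trace of the 3x3 block coupling atoms i and j, i.e. <delta_i . delta_j>.
        return sum(cov[3 * i + a][3 * j + a] for a in range(3))

    corr = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        for j in range(n_atoms):
            norm = math.sqrt(block_trace(i, i) * block_trace(j, j))
            corr[i][j] = block_trace(i, j) / norm
    return corr
```

Values range from -1 (perfectly anti-correlated motion) to 1 (perfectly correlated motion), with 1 on the diagonal.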
Elastic Network Models
Anisotropic Network Model (ANM) (Atilgan et al., 2001) is a version of Elastic Network Model (ENM), based on connecting residues with Cα atoms within a cutoff distance Rc with spring-like interactions. The Hessian matrix is a 3N×3N matrix, where N is the number of residues, consisting of 3×3 submatrices Hij which depend on the direction of the vector between Cα atoms i and j, and are 0 if the Cα atoms are more than Rc apart. The diagonal submatrices are defined as follows:

$$H_{ii} = -\sum_{j \neq i} H_{ij}$$

This defines a coarse-grained Elastic Network Model of a protein structure with directional information. We implemented this algorithm using Perl code to read PDB files and construct the Hessian, with MATLAB (The Mathworks, Inc., Natick, MA) scripts used for diagonalization.
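The construction of the ANM Hessian can be sketched as follows. This is a plain Python illustration rather than the Perl and MATLAB code used in the study; it assumes the standard ANM form in which each off-diagonal block equals minus the spring constant times the outer product of the inter-residue vector with itself, divided by the squared distance. The function name and the default `gamma` are assumptions.

```python
def anm_hessian(coords, cutoff, gamma=1.0):
    """3N x 3N ANM Hessian for a list of (x, y, z) C-alpha coordinates."""
    n = len(coords)
    hessian = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [coords[j][a] - coords[i][a] for a in range(3)]
            r2 = sum(x * x for x in d)
            if r2 > cutoff * cutoff:
                continue                      # residues not in contact
            for a in range(3):
                for b in range(3):
                    value = -gamma * d[a] * d[b] / r2
                    hessian[3 * i + a][3 * j + b] += value   # block H_ij
                    hessian[3 * j + a][3 * i + b] += value   # block H_ji
                    hessian[3 * i + a][3 * i + b] -= value   # diagonal H_ii
                    hessian[3 * j + a][3 * j + b] -= value   # diagonal H_jj
    return hessian
```

Each diagonal block accumulates the negative sum of the off-diagonal blocks in its row, so every row of the Hessian sums to zero, which is what produces the six zero-frequency rigid-body modes.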
We introduce two modifications to ANM, analogous to those we had previously proposed for isotropic models (Kondrashov et al., 2006), and term the new model Distance Network Model (DNM). First, the connectivity of the elastic potential is based on distances between nonhydrogen atoms of residue pairs, instead of only the Cα atoms. The contacts from all atoms are added for each residue to yield an interaction potential at the residue level. Second, we introduce different classes of residue interactions based on interatomic distances, with distinct Hookean spring constants. We use distance bins to define the interaction classes: covalent interactions are identified by distances of less than 2.3 Å, the next shell extends to 3.3 Å, followed by shells up to 5, 7, 9, and 11 Å. The Hessian matrix for each bin is defined exactly as for ANM above, with the difference that the equilibrium distance between two atoms has to be in the distance bin, while the coordinates (xi, yi, zi) for residue i remain the Cα coordinates. If Ha is the contact matrix for class a, the total Hessian matrix for DNM is a linear combination of the matrices, with ka as the interaction constant for each class:

$$H_{\mathrm{DNM}} = \sum_a k_a H_a$$
The constants ka define the strength of interactions, and we choose to use the total number of contacts in each class as a normalization constant, . Thus, although DNM introduces several different interaction constants, but these are defined from the contact matrices, and thus are not free parameters to be optimized. The only free parameter, as in other ENM, is the cutoff distance for atomic contacts, which we vary from 5 to 11 Å, as described in Results. The implementation again used a combination of perl and MATLAB scripts.
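One way to read the DNM construction is sketched below. The bin edges come from the text; everything else is an assumption made for illustration: the function names are hypothetical, each residue-pair spring is weighted by the number of heavy-atom pairs falling in each bin, and the class constants `k` are passed in explicitly because their exact normalization is not reproduced here.

```python
import numpy as np

# Distance bins in Angstrom, taken from the text
BIN_EDGES = np.array([0.0, 2.3, 3.3, 5.0, 7.0, 9.0, 11.0])

def contact_counts(atoms_by_residue, edges=BIN_EDGES):
    """counts[a, i, j] = number of heavy-atom pairs between residues
    i and j whose distance falls in bin a."""
    n = len(atoms_by_residue)
    counts = np.zeros((len(edges) - 1, n, n))
    for i in range(n):
        for j in range(i + 1, n):
            diff = atoms_by_residue[i][:, None, :] - atoms_by_residue[j][None, :, :]
            dist = np.linalg.norm(diff, axis=-1).ravel()
            hist, _ = np.histogram(dist, bins=edges)
            counts[:, i, j] = counts[:, j, i] = hist
    return counts

def dnm_hessian(ca, counts, k):
    """Combine per-class contacts into one Hessian, H = sum_a k_a H_a,
    using Calpha coordinates for the direction of each spring."""
    ca = np.asarray(ca, dtype=float)
    n = len(ca)
    # Effective spring weight per residue pair, summed over classes
    weights = np.tensordot(np.asarray(k, dtype=float), counts, axes=1)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            if weights[i, j] == 0:
                continue
            d = ca[j] - ca[i]
            block = -weights[i, j] * np.outer(d, d) / (d @ d)
            hess[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hess[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            hess[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hess[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hess
```

Because the Hessian is linear in the class constants, building the weighted sum per residue pair is equivalent to building one Hessian per class and combining them afterwards.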
The details of the normal mode analysis implementation in the Molecular Modeling ToolKit (MMTK) have been described elsewhere (Hinsen, 2000). For this study, we used the harmonic Cα forcefield (HCA) (Hinsen et al., 2000), which defines different interaction constants for covalently bonded and non-covalently bonded Cα atoms. The model uses the reciprocal of distance to weight the harmonic interaction constants, and no parameters are varied from the default values. The MMTK calculations for all 83 structures in the dataset were carried out in 2 hours on a single 2 GHz AMD Athlon processor with 2 GB of RAM.
ElNemo (ElN) (Suhre and Sanejouand, 2004) is an all-atom ENM, which constructs a contact matrix for all atoms within a certain radius, and then treats blocks of one or more residues as rigid bodies using the Rotation-Translation blocking algorithm (Tama et al., 2000). The two main programs that constitute ElNemo, pdbmat and diagrtb, were kindly provided by the authors and installed on the local cluster. All blocking was done on a residue-by-residue basis and the interaction cutoff distance was varied from 4 Å to 11 Å. Running ElN on all 83 structures in the dataset using 8 different cutoff distances took roughly 1 day to complete on a 100-node cluster of 2.2 GHz Apple G5 processors with 4 GB of RAM.
Block Normal Modes with CHARMM
Block normal-mode analysis (BNM), originally suggested by Tama et al. (Tama et al., 2000) and subsequently improved by Li and Cui (Li and Cui, 2002), computes an all-atom Hessian which is then projected onto a blocked space spanned by the rotational and translational degrees of freedom of predefined blocks; in this work each residue is treated as a rotation-translation block, as in the ElNemo method above. For this level of coarse-graining, the procedure reduces the Hessian storage space by approximately a factor of 25 and the diagonalization time by a factor of 125. The resulting blocked eigenvectors were then projected back to the all-atom space to give all-atom eigenvectors. This procedure perturbs the magnitudes of the eigenvalues, but in a linear fashion for the low-frequency modes (Li and Cui, 2002; Tama et al., 2000). The appropriate scale factor of 1.7 has been used in this work. Local minimization is performed to ensure that the linear term in the Taylor expansion of the potential is zero (Hayward, 2001). This minimization is completed using cycles of the adapted-basis Newton-Raphson method with gradually decreasing harmonic constraints to remove local steric clashes without perturbing the structure significantly. A final minimization with no harmonic constraints is performed until the RMS energy gradient reaches 0.01 kcal/mol/Å. The average minimization time for this set is approximately eight minutes, but minimization times vary widely with protein size: 54 seconds for the 52-residue 1RB9, and 73 minutes for the 325-residue 1O7J, all computed on a 1.8 GHz Athlon single-processor station with 1 GB of memory, running Red Hat Linux 7.2. In some of the systems studied, this level of minimization resulted in modes with large negative frequencies in addition to the normal six rotational/translational modes. In these cases, these modes were ignored for all subsequent calculations.
The average diagonalization time for BNM was approximately six minutes, again varying widely: 48 seconds for 1RB9, and 51 minutes for 1O7J. All calculations are completed using the CHARMM suite of programs (Brooks et al., 1983; Neria et al., 1996). The extended atom CHARMM19 force field (Neria et al., 1996) modified for use with the EEF1 solvation model (Lazaridis and Karplus, 1999) is used for both minimizations and the BNM.
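The storage and timing factors quoted for residue-level blocking are consistent with a fivefold reduction in matrix dimension. As a rough check, assume about ten extended-atom sites per residue (an illustrative figure, not taken from the text). An all-atom Hessian for n atoms in N residues then has dimension 3n ≈ 30N, whereas the blocked Hessian has dimension 6N, since each block carries three rotations and three translations. Storage scales with the square of the dimension and dense diagonalization with its cube:

```latex
\frac{3n}{6N} \approx \frac{30N}{6N} = 5,
\qquad
\frac{\text{storage}_{\text{all-atom}}}{\text{storage}_{\text{blocked}}} \approx 5^{2} = 25,
\qquad
\frac{t_{\text{all-atom}}}{t_{\text{blocked}}} \approx 5^{3} = 125 .
```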
Measures of agreement with crystallographic data
The data set was obtained by searching the Protein Data Bank (Berman et al., 2000) for protein structures determined by X-ray crystallography to at least 1.0 Å resolution, and containing at least 50 residues in a single chain. Structures with more than 50% identity were discarded, leaving 98 non-redundant proteins, of which 87 contained ANISOU cards (anisotropic displacement parameters); 4 more structures were discarded because they contained modified protein residues for which CHARMM libraries are not available. The resultant set is structurally diverse, with all major SCOP superfamilies (Murzin et al., 1995) represented, as shown in Table 1 in Supplemental Materials. All protein chains in the PDB files were kept in the model in order to best represent the crystal environment. Copies of the protein molecule surrounding the structure in the crystal were generated using the symexp command in PyMOL (DeLano, 2002). Residues with at least one atom less than 4 Å from an atom in a crystal copy were considered to be involved in crystal contacts, and were excluded from the comparison set. Further, ADPs from Cα atoms with multiple conformations, identified by an occupancy value other than 1, were also excluded.
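The crystal-contact filter can be sketched as a simple distance test. The snippet assumes the symmetry-copy coordinates have already been generated (for example, exported after running symexp) and loaded into arrays; the function and argument names are hypothetical, and the brute-force distance matrix is chosen for readability rather than speed.

```python
import numpy as np

def crystal_contact_residues(atom_xyz, atom_resid, copy_xyz, cutoff=4.0):
    """Return the sorted residue ids having at least one atom closer than
    `cutoff` Angstrom to any atom of a symmetry copy."""
    atom_xyz = np.asarray(atom_xyz, dtype=float)
    copy_xyz = np.asarray(copy_xyz, dtype=float)
    atom_resid = np.asarray(atom_resid)
    # Pairwise distances between model atoms and symmetry-copy atoms
    diff = atom_xyz[:, None, :] - copy_xyz[None, :, :]
    min_dist = np.linalg.norm(diff, axis=-1).min(axis=1)
    return sorted(set(atom_resid[min_dist < cutoff].tolist()))
```

Residues returned by this function would be dropped from the comparison set; for large structures a spatial index such as a k-d tree would be a natural replacement for the full distance matrix.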
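The anisotropic parameters are 3×3 matrices that define the variance of a 3-dimensional Gaussian probability distribution for the position of each atom: $U = \begin{pmatrix} U_{xx} & U_{xy} & U_{xz} \\ U_{xy} & U_{yy} & U_{yz} \\ U_{xz} & U_{yz} & U_{zz} \end{pmatrix}$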
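As a sketch of how such a matrix is obtained in practice, the function below reads one ANISOU record and returns the symmetric 3×3 matrix in Å². It relies on the fixed-column PDB layout, in which the six integer fields occupy columns 29 to 70 in the order U11, U22, U33, U12, U13, U23 and are scaled by 10^4; the function name is hypothetical.

```python
import numpy as np

def parse_anisou(line):
    """Convert one PDB ANISOU record into a symmetric 3x3 matrix (A^2)."""
    if not line.startswith("ANISOU"):
        raise ValueError("not an ANISOU record")
    # Six 7-character integer fields starting at column 29 (0-based index 28)
    vals = [int(line[28 + 7 * k:35 + 7 * k]) * 1e-4 for k in range(6)]
    u11, u22, u33, u12, u13, u23 = vals
    return np.array([[u11, u12, u13],
                     [u12, u22, u23],
                     [u13, u23, u33]])
```

The trace of the returned matrix gives the magnitude of the displacement, and its eigenvectors give the principal axes of the ellipsoid.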
The six components of ADPs, Uxx, etc., are reported in PDB files in ANISOU cards (Berman et al., 2000). We compare the computationally predicted anisotropic parameters V with those from the crystal structures U. Prediction quality of the magnitude of variation is measured by linear correlation of the traces of the matrices U and V over the whole structure, which we call isotropic correlation. To compare directions of ellipsoids, we first divide all the matrices by their trace, to set all magnitudes to unity. Ellipsoids are described by their principal axes (eigenvectors) and the associated lengths (inverse eigenvalues); the ratio of the smallest to the largest eigenvalue is called its anisotropy (Trueblood et al., 1996). Directionality comparison was restricted to ellipsoids with anisotropy of less than 0.5, since directional comparison of near-spherical ellipsoids is meaningless. The simplest comparison of directionality is the absolute value of the dot product between the major directions. It is a rough estimate of agreement for two ellipsoids whose major axes are dominant, but has the virtue of simplicity. A more systematic measure of ellipsoid similarity was proposed by Merritt (Merritt, 1999), based on computation of the overlap integral between two probability densities. This measure, known to crystallographers as the real-space correlation coefficient, is defined for two three-dimensional Gaussian distributions with covariance matrices U and V as follows: $cc(U,V) = \dfrac{\left(\det U^{-1} V^{-1}\right)^{1/4}}{\left[\tfrac{1}{8}\det\left(U^{-1} + V^{-1}\right)\right]^{1/2}}$
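The measures described in this paragraph can be sketched in a few lines. The function names are hypothetical, and the correlation coefficient uses the standard closed form for the normalized overlap of two zero-mean Gaussians, which should be checked against Merritt (1999) before being relied upon.

```python
import numpy as np

def unit_trace(u):
    """Scale an ADP matrix so that its trace is 1."""
    return u / np.trace(u)

def anisotropy(u):
    """Ratio of the smallest to the largest eigenvalue of u."""
    w = np.linalg.eigvalsh(u)  # eigenvalues in ascending order
    return w[0] / w[-1]

def major_axis_dot(u, v):
    """Absolute dot product between the major axes of two ellipsoids."""
    axis_u = np.linalg.eigh(u)[1][:, -1]
    axis_v = np.linalg.eigh(v)[1][:, -1]
    return abs(axis_u @ axis_v)

def gaussian_overlap_cc(u, v):
    """Normalized overlap of two Gaussians with covariances u and v:
    det(U^-1 V^-1)^(1/4) / [det(U^-1 + V^-1) / 8]^(1/2)."""
    ui, vi = np.linalg.inv(u), np.linalg.inv(v)
    num = np.linalg.det(ui @ vi) ** 0.25
    den = (np.linalg.det(ui + vi) / 8.0) ** 0.5
    return num / den

def isotropic_correlation(us, vs):
    """Linear correlation between the traces of two lists of ADP matrices."""
    tu = [np.trace(u) for u in us]
    tv = [np.trace(v) for v in vs]
    return np.corrcoef(tu, tv)[0, 1]
```

The overlap equals 1 for identical matrices and decreases as the ellipsoids differ in shape or orientation; the anisotropy filter (less than 0.5) would be applied before calling `major_axis_dot`.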
We also compare modes produced by the different normal mode potentials. To compare mode i (as ordered by frequency) from two methods, we take the average of the best agreement for mode i from method a with modes from method b, and the best agreement for mode i from method b with the modes from method a. We compare only modes that are close in frequency ordering, specifically, modes no more than 3 indices higher or lower. The formula for overlap for mode i between methods a and b is:
If the best agreement is between modes of the same index, then the two maxima are the same. Figure 2, Figure 3, and Figure 4 were prepared using MATLAB (The Mathworks, Inc., Natick, MA) and Figure 1 with rastep and Raster3D (Merritt and Bacon, 1997).
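Read literally, this description corresponds to O_i = ½ [ max_j |v_i^a · v_j^b| + max_j |v_i^b · v_j^a| ], with j restricted to |j − i| ≤ 3. The sketch below implements that reading; taking "agreement" to be the absolute dot product of unit-normalized mode vectors is an assumption, as are the function and argument names.

```python
import numpy as np

def mode_overlap(modes_a, modes_b, i, window=3):
    """Symmetrized best overlap for mode i between two methods.

    modes_a, modes_b: arrays of shape (n_modes, 3N), rows unit-normalized
    and ordered by frequency. Only modes within `window` indices of i
    are considered as partners.
    """
    lo = max(0, i - window)
    hi = min(len(modes_a), len(modes_b), i + window + 1)
    # Best partner in b for mode i of a, and vice versa
    best_ab = np.abs(modes_b[lo:hi] @ modes_a[i]).max()
    best_ba = np.abs(modes_a[lo:hi] @ modes_b[i]).max()
    return 0.5 * (best_ab + best_ba)
```

The absolute value makes the measure insensitive to the arbitrary sign of each eigenvector.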
Supplementary Material
Acknowledgments
We thank Karsten Suhre and Yves-Henri Sanejouand for providing the source code for ElNemo software. D.A.K. and R.M.B. were supported through a National Library of Medicine training grant to the Computation and Informatics in Biology and Medicine program at UW-Madison (NLM 5T15LM007359), with R.M.B. also receiving support from a training grant from the Department of Energy, Genomes to Life project (DE-FG2-04ER25627). A.W.VW. was supported by an NSF pre-doctoral fellowship. Q.C. is an Alfred P. Sloan Research Fellow.
Abbreviations
- ENM: elastic network model
- ANM: anisotropic network model
- DNM: distance network model
- BNM: block normal modes
- ElN: ElNemo
- HCA: harmonic Cα potential
- ADP: anisotropic displacement parameter
References
- Abbondanzieri EA, Greenleaf WJ, Shaevitz JW, Landick R, Block SM. Direct observation of base-pair stepping by RNA polymerase. Nature. 2005;438:460–465. doi: 10.1038/nature04268.
- Atilgan AR, Durell SR, Jernigan RL, Demirel MC, Keskin O, Bahar I. Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys J. 2001;80:505–515. doi: 10.1016/S0006-3495(01)76033-X.
- Bahar I, Atilgan AR, Erman B. Direct evaluation of thermal fluctuations in proteins using a single-parameter harmonic potential. Fold Des. 1997;2:173–181. doi: 10.1016/S1359-0278(97)00024-2.
- Bahar I, Rader AJ. Coarse-grained normal mode analysis in structural biology. Curr Opin Struct Biol. 2005;15:586–592. doi: 10.1016/j.sbi.2005.08.007.
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucl Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
- Brooks B, Karplus M. Harmonic dynamics of proteins: normal modes and fluctuations in bovine pancreatic trypsin inhibitor. Proc Natl Acad Sci U S A. 1983;80:6571–6575. doi: 10.1073/pnas.80.21.6571.
- Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. Charmm - a Program for Macromolecular Energy, Minimization, and Dynamics Calculations. Journal of Computational Chemistry. 1983;4:187–217.
- CCP4. The CCP4 Suite: Programs for Protein Crystallography. Acta Crystallogr. 1994;D50:760–763. doi: 10.1107/S0907444994003112.
- Clarage JB, Clarage MS, Phillips WC, Sweet RM, Caspar DL. Correlations of atomic movements in lysozyme crystals. Proteins. 1992;12:145–157. doi: 10.1002/prot.340120208.
- Cui Q, Li GH, Ma JP, Karplus M. A normal mode analysis of structural plasticity in the biomolecular motor F-1-ATPase. Journal of Molecular Biology. 2004;340:345–372. doi: 10.1016/j.jmb.2004.04.044.
- Dauter Z, Wilson KS, Sieker LC, Meyer J, Moulis JM. Atomic resolution (0.94 A) structure of Clostridium acidurici ferredoxin. Detailed geometry of [4Fe-4S] clusters in a protein. Biochemistry. 1997;36:16065–16073. doi: 10.1021/bi972155y.
- DeLano WL. The PyMOL Molecular Graphics System. 2002. www.pymol.org.
- DePristo MA, de Bakker PI, Blundell TL. Heterogeneity and inaccuracy in protein structures solved by X-ray crystallography. Structure. 2004;12:831–838. doi: 10.1016/j.str.2004.02.031.
- Esposito L, Vitagliano L, Sica F, Sorrentino G, Zagari A, Mazzarella L. The ultrahigh resolution crystal structure of ribonuclease A containing an isoaspartyl residue: hydration and sterochemical analysis. J Mol Biol. 2000;297:713–732. doi: 10.1006/jmbi.2000.3597.
- Eyal E, Yang LW, Bahar I. Anisotropic network model: systematic evaluation and a new web interface. Bioinformatics. 2006;22:2619–2627. doi: 10.1093/bioinformatics/btl448.
- Frauenfelder H, Petsko GA, Tsernoglou D. Temperature-dependent X-ray diffraction as a probe of protein structural dynamics. Nature. 1979;280:558–563. doi: 10.1038/280558a0.
- Go N, Noguti T, Nishikawa T. Dynamics of a small globular protein in terms of low-frequency vibrational modes. Proc Natl Acad Sci U S A. 1983;80:3696–3700. doi: 10.1073/pnas.80.12.3696.
- Halle B. Flexibility and packing in proteins. Proc Natl Acad Sci U S A. 2002;99:1274–1279. doi: 10.1073/pnas.032522499.
- Hamacher K, McCammon JA. Computing the amino acid specificity of fluctuations in biomolecular systems. Journal of Chemical Theory and Computation. 2006;2:873–878. doi: 10.1021/ct050247s.
- Hayward S. Normal Mode Analysis of Biological Molecules. In: Becker OM, MacKerell A J, Roux B, Watanabe M, editors. Computational Biochemistry and Biophysics. New York: Marcel Dekker; 2001. pp. 153–167.
- Hinsen K. The molecular modeling toolkit: A new approach to molecular simulations. J Comp Chem. 2000;21:79–85.
- Hinsen K, Petrescu A-J, Dellerue S, Bellissent-Funel M-C, Kneller GR. Harmonicity in slow protein dynamics. Chemical Physics. 2000;261:25–37.
- Karplus M, Kuriyan J. Molecular dynamics and protein function. Proc Natl Acad Sci U S A. 2005;102:6679–6685. doi: 10.1073/pnas.0408930102.
- Kidera A, Go N. Refinement of protein dynamic structure: normal mode refinement. Proceedings of the National Academy of Sciences of the United States of America. 1990;87:3718–3722. doi: 10.1073/pnas.87.10.3718.
- Kondrashov DA, Cui Q, Phillips GN., Jr Optimization and evaluation of a coarse-grained model of protein motion using x-ray crystal data. Biophys J. 2006;91:2760–2767. doi: 10.1529/biophysj.106.085894.
- Kundu S, Melton JS, Sorensen DC, Phillips GN., Jr Dynamics of proteins in crystals: comparison of experiment with simple models. Biophys J. 2002;83:723–732. doi: 10.1016/S0006-3495(02)75203-X.
- Kuriyan J, Petsko GA, Levy RM, Karplus M. Effect of Anisotropy and Anharmonicity on Protein Crystallographic Refinement - an Evaluation by Molecular-Dynamics. Journal of Molecular Biology. 1986;190:227–254. doi: 10.1016/0022-2836(86)90295-0.
- Kuriyan J, Weis WI. Rigid Protein Motion as a Model for Crystallographic Temperature Factors. Proceedings of the National Academy of Sciences of the United States of America. 1991;88:2773–2777. doi: 10.1073/pnas.88.7.2773.
- Lanman J, Prevelige PE., Jr High-sensitivity mass spectrometry for imaging subunit interactions: hydrogen/deuterium exchange. Curr Opin Struct Biol. 2004;14:181–188. doi: 10.1016/j.sbi.2004.03.006.
- Lazaridis T, Karplus M. Effective energy function for proteins in solution. Proteins. 1999;35:133–152. doi: 10.1002/(sici)1097-0134(19990501)35:2<133::aid-prot1>3.0.co;2-n.
- Levitt M, Sander C, Stern PS. Protein normal-mode dynamics: trypsin inhibitor, crambin, ribonuclease and lysozyme. J Mol Biol. 1985;181:423–447. doi: 10.1016/0022-2836(85)90230-x.
- Li G, Cui Q. A coarse-grained normal mode approach for macromolecules: an efficient implementation and application to Ca(2+)-ATPase. Biophys J. 2002;83:2457–2474. doi: 10.1016/S0006-3495(02)75257-0.
- Lindorff-Larsen K, Best RB, DePristo MA, Dobson CM, Vendruscolo M. Simultaneous determination of protein structure and dynamics. Nature. 2005;433:128–132. doi: 10.1038/nature03199.
- Longhi S, Czjzek M, Lamzin V, Nicolas A, Cambillau C. Atomic resolution (1.0 angstrom) crystal structure of Fusarium solani cutinase: Stereochemical analysis. Journal of Molecular Biology. 1997;268:779–799. doi: 10.1006/jmbi.1997.1000.
- Ma JP, Karplus M. Ligand-induced conformational changes in ras p21: A normal mode and energy minimization analysis. Journal of Molecular Biology. 1997;274:114–131. doi: 10.1006/jmbi.1997.1313.
- Meinhold L, Smith JC. Fluctuations and correlations in crystalline protein dynamics: a simulation analysis of staphylococcal nuclease. Biophysical journal. 2008;88:2554–2563. doi: 10.1529/biophysj.104.056101.
- Merritt EA. Comparing anisotropic displacement parameters in protein structures. Acta Crystallographica. 1999;D55:1997–2004. doi: 10.1107/s0907444999011853.
- Merritt EA, Bacon DJ. Raster3D Photorealistic Molecular Graphics. Methods in Enzymology. 1997;277:505–524. doi: 10.1016/s0076-6879(97)77028-9.
- Moffat K. Time-resolved biochemical crystallography: A mechanistic perspective. Chemical Reviews. 2001;101:1569–1581. doi: 10.1021/cr990039q.
- Murzin AG, Brenner SE, Hubbard T, Chothia C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology. 1995;247:536–540. doi: 10.1006/jmbi.1995.0159.
- Neria E, Fischer S, Karplus M. Simulation of activation free energies in molecular systems. The Journal of Chemical Physics. 1996;105:1902–1921.
- Palmer AG, Kroenke CD, Loria JP. Nuclear Magnetic Resonance of Biological Macromolecules, Pt B. San Diego: ACADEMIC PRESS INC; 2001. Nuclear magnetic resonance methods for quantifying microsecond-to-millisecond motions in biological macromolecules; pp. 204–238.
- Phillips GN., Jr Comparison of the dynamics of myoglobin in different crystal forms. Biophys J. 1990;57:381–383. doi: 10.1016/S0006-3495(90)82540-6.
- Sen TZ, Feng YP, Garcia JV, Kloczkowski A, Jernigan RL. The extent of cooperativity of protein motions observed with elastic network models is similar for atomic and coarser-grained models. Journal of Chemical Theory and Computation. 2006;2:696–704. doi: 10.1021/ct600060d.
- Sheldrick GM, Schneider TR. SHELXL: High -Resolution Refinement. Methods in Enzymology. 1997;277:319–343.
- Suhre K, Sanejouand YH. ElNemo: a normal mode web server for protein movement analysis and the generation of templates for molecular replacement. Nucleic Acids Res. 2004;32:W610–W614. doi: 10.1093/nar/gkh368.
- Tama F, Gadea FX, Marques O, Sanejouand YH. Building-block approach for determining low-frequency normal modes of macromolecules. Proteins. 2000;41:1–7. doi: 10.1002/1097-0134(20001001)41:1<1::aid-prot10>3.0.co;2-p.
- Teodoro ML, Phillips GN, Jr, Kavraki LE. Understanding protein flexibility through dimensionality reduction. J Comput Biol. 2003;10:617–634. doi: 10.1089/10665270360688228.
- Tirion MM. Large Amplitude Elastic Motions in Proteins from a Single-Parameter, Atomic Analysis. Physical Review Letters. 1996;77:1905–1908. doi: 10.1103/PhysRevLett.77.1905.
- Trueblood KN, Burgi HB, Burzlaff H, Dunitz JD, Gramaccioli CM, Schulz HH, Shmueli U, Abrahams SC. Atomic displacement parameter nomenclature - Report of a subcommittee on atomic displacement parameter nomenclature. Acta Crystallographica Section A. 1996;52:770–781.
- Van Wynsberghe A, Li GH, Cui Q. Normal-mode analysis suggests protein flexibility modulation throughout RNA polymerase's functional cycle. Biochemistry. 2004;43:13083–13096. doi: 10.1021/bi049738+.
- Wilson MA, Brunger AT. The 1.0 A crystal structure of Ca(2+)-bound calmodulin: an analysis of disorder and implications for functionally relevant plasticity. J Mol Biol. 2000;301:1237–1256. doi: 10.1006/jmbi.2000.4029.