Using the COREX/BEST Server to Model the Native-State Ensemble

Vincent J Hilser; Steven T Whitten

doi:10.1007/978-1-62703-658-0_14

Author manuscript; available in PMC: 2022 Jan 24.

Published in final edited form as: Methods Mol Biol. 2014;1084:255–269. doi: 10.1007/978-1-62703-658-0_14

PMCID: PMC8785428 NIHMSID: NIHMS1035308 PMID: 24061926

Abstract

Protein structures under normal conditions exist as ensembles of interconverting, transient microstates. A computer algorithm known as COREX/BEST (Biology using Ensemble-based Structural Thermodynamics) was developed to model microstate structures and describe the native ensembles of proteins in statistical thermodynamic terms. This algorithm has been tested extensively and validated through experimental comparisons examining a range of biophysical and functional phenomena, such as structural cooperativity, pH-dependent stability, and cold denaturation. Here, we describe a Web-based implementation of the COREX/BEST algorithm, called the COREX/BEST Server, and demonstrate how to use this online resource to characterize the structural and thermodynamic properties of the native protein ensemble.

Keywords: Ensemble, Dynamics, Temperature, Electrostatics, pH

1. Introduction

Protein macromolecules often couple structural rearrangements to biological function [1–3] and show significant conformational heterogeneity under normal solution conditions [4–10]. Characterizations of these structural fluctuations are thus needed to understand the physicochemical properties of proteins and the molecular origins of their biological roles. In this chapter, we discuss a statistical thermodynamic model of the protein conformational ensemble known as COREX/BEST [11, 12]. This model uses an algorithm to compute the distribution of states that a protein structure may populate under a given set of conditions (e.g., pH and temperature) to provide an ensemble description of protein dynamics. By calculating state probabilities and quantitatively monitoring the population distributions within an ensemble, COREX/BEST has shown an ability to reproduce diverse phenomena such as regional stability variations within protein structures [13–15], long-range intramolecular signaling [16, 17], electrostatic contributions to cooperative transitions [18, 19], non-cooperative cold denaturation [20, 21], and functional adaptation [22]. A Web-based implementation of the COREX/BEST algorithm [23] is freely available to the scientific community and may be accessed by visiting http://best.bio.jhu.edu. A demonstration of using the COREX/BEST Server to characterize the structural and energetic properties of a protein conformational ensemble is presented here.

Detailed descriptions of the COREX/BEST algorithm can be found elsewhere [11, 12]. Briefly, this algorithm uses a user-supplied structure (e.g., a PDB file) as a template to generate a large number of partially folded “microstates” that possess a dual structural-thermodynamic character [21]. Each microstate is generated by treating local structural fluctuations as folding-unfolding reactions that occur in an otherwise folded and native-like protein. By combinatorial unfolding of “folding units”, which are contiguous blocks of residues that fold and unfold independently, and an incremental shift in the boundaries of the folding units, an exhaustive enumeration of partially folded states is achieved. The relative Gibbs free energy for each microstate i, ΔG_i, is determined by structure-based parameterization of the intrinsic energetics: ΔC_p,i, ΔH_i, and ΔS_i (see Note 1). This parameterization gives temperature-dependent stabilities for the microstates that can be converted to probabilities (P_i) by the Boltzmann relation,

$P_{i} = \frac{K_{i}}{\sum_{i} K_{i}},$ (1)

where the statistical weight of each microstate (K_i) is determined by the relative Gibbs free energy of that microstate ($K_{i} = e^{-Δ G_{i} / R T}$, where R is the gas constant and T is absolute temperature), and the summation is over all ensemble states. Figure 1 shows a representative ensemble calculated for the staphylococcal nuclease protein [24].
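As a minimal sketch of Eq. 1, the Python below converts a list of relative free energies into state probabilities. The function name `boltzmann_probabilities`, the choice of cal/mol as the energy unit, and the three ΔG values in the usage lines are illustrative assumptions; they are not output of the COREX/BEST Server.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol*K)


def boltzmann_probabilities(delta_g, temperature_k=298.15):
    """Eq. 1: P_i = K_i / sum(K_i), with K_i = exp(-dG_i / RT).

    delta_g holds the free energy of each microstate relative to the
    fully folded state, in cal/mol (assumed unit).
    """
    weights = [math.exp(-dg / (R_CAL * temperature_k)) for dg in delta_g]
    partition = sum(weights)
    return [w / partition for w in weights]


# Hypothetical three-state ensemble: the fully folded reference state
# (dG = 0) and two partially unfolded states.
probs = boltzmann_probabilities([0.0, 1500.0, 5000.0])
print(probs)
```

Because only ratios of statistical weights enter Eq. 1, shifting every ΔG_i by the same constant leaves the probabilities unchanged, which is why the fully folded state can serve as the zero of the energy scale.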

Fig. 1. The nuclease conformational ensemble as calculated by the COREX/BEST Server. Calculations in this figure used 25°C as the temperature setting and the PDB file 1STN [24]. (a) Relative stability (ΔG_i) of each microstate plotted as a function of the fraction native (number of residues in folded segments/total number of residues). Each point corresponds to a particular microstate, and the ΔG values were calculated relative to the fully folded state (i.e., the relative stability of the fully folded state in the ensemble was 0). (b) Relative populations (P_i) of the 20 microstates with highest computed probability. The *filled black circles* represent the positions of unfolded residues in a microstate. The fully folded native state, which has no unfolded residues, is shown by the *red circle* located at residue position = 0. The *line of black circles* stretching uninterrupted from residue position 6 to residue position 141 represents the fully unfolded state, which had a computed probability of P = 0.00015 at this temperature. The locations of secondary structural elements among the residue positions are as indicated. (c) Cartoon representation of the most probable microstate of the nuclease ensemble. Segments of protein colored *red* indicate regions that would be folded, and regions colored *yellow* would be unfolded (positions 44–51). To illustrate the concept that the unfolded segment freely samples accessible conformational space, this region was modeled in the figure as multiple flexible loops.

The statistical formula to calculate state probabilities (Eq. 1) can be expanded to account for additional system perturbations [25], such as proton-binding energies,

$P(pH)_{i} = \frac{K_{i} \cdot \prod_{j} (1 + 10^{pK_{a, i, j} - pH})}{\sum_{i} (K_{i} \cdot \prod_{j} (1 + 10^{pK_{a, i, j} - pH}))},$ (2)

where pK_a,i,j is the pK_a value of residue j in microstate i (see Note 2). The addition of the proton-binding energies to the state probabilities yields a structural ensemble sensitive to pH changes and can be used to investigate electrostatic contributions to protein energetics [18, 19]. Currently, the COREX/BEST Server can perform the following tasks: (1) calculate a conformational ensemble based upon an input protein structure, (2) determine the temperature-sensitivity of the computed ensemble, and (3) determine the pH-sensitivity of the ensemble if pK_a values for the ionizable protein groups are supplied. Our discussions in this chapter are limited to the current capabilities of the COREX/BEST Server.
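Eq. 2 can be sketched in the same way as Eq. 1. The code below is an illustration only: the function name, the cal/mol energy unit, and the two-state example (one ionizable group with a hypothetical pK_a of 4.0 in the folded state and 6.3 in the unfolded state) are assumptions made for the example, not values taken from the Server.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol*K)


def ph_dependent_probabilities(delta_g, pka_by_state, ph, temperature_k=298.15):
    """Eq. 2: weight each K_i by the product over ionizable groups j
    of (1 + 10**(pKa_ij - pH)), then normalize over all states."""
    weights = []
    for dg, pkas in zip(delta_g, pka_by_state):
        k_i = math.exp(-dg / (R_CAL * temperature_k))
        proton_term = math.prod(1.0 + 10.0 ** (pka - ph) for pka in pkas)
        weights.append(k_i * proton_term)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-state ensemble: folded (dG = 0) and unfolded (dG = 3000 cal/mol),
# each with a single ionizable group.
delta_g = [0.0, 3000.0]
pka_by_state = [[4.0], [6.3]]
p_neutral = ph_dependent_probabilities(delta_g, pka_by_state, ph=7.0)
p_acidic = ph_dependent_probabilities(delta_g, pka_by_state, ph=3.0)
print(p_neutral[1], p_acidic[1])
```

In this toy case the unfolded state binds protons more tightly (higher pK_a), so lowering the pH raises its population. This is the kind of proton-linked shift that Eq. 2 is meant to capture.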

2. Materials

The COREX/BEST Server [23] provides online access to the COREX/BEST algorithm [12]. A computer with a Web browser and internet access is required. The Web site, which can be found at http://best.bio.jhu.edu, was designed to work with standards-compliant browsers (e.g., Mozilla’s Firefox). Most modern browsers, however, should be sufficient. Adequate data storage space is required to download the results of COREX/BEST calculations to a user’s computer.

2.1. Input Structure for Generating an Ensemble

The calculations performed by the COREX/BEST Server require uploading a protein structure file written in plain text and in standard PDB file format [26]. Refer to Note 3 for common errors encountered by COREX/BEST calculations due to protein structure files that are not properly prepared.

2.2. Native pK_a Values to Calculate pH Sensitivity

Calculating the pH sensitivity of an ensemble using Eq. 2 requires uploading a plain text file that lists the pK_a values for the ionizable groups in the uploaded PDB structure file (i.e., a list of “native” pK_a values). Each ionizable group should be listed on a separate line in this file, with each line containing: (1) the three-letter code for the amino acid type that has the ionizable group, (2) the residue number identifying the amino acid position in the structure file, and (3) the pK_a value of the ionizable group in the uploaded structure. For example, consider a protein containing an ε-amino group that belongs to a lysine at residue position 25 and ionizes under native conditions with a pK_a value of 11.55. The uploaded file of pK_a values should contain a line specifying “LYS 25 11.55”, as well as an additional line of text for each additional ionizable group in the structure file. Native pK_a values for the carboxyl and amino ends of a protein chain should not be listed in the uploaded file; the COREX/BEST algorithm assumes that the C- and N-termini ionize with the same pK_a values as model compounds (i.e., 3.5 and 7.4, respectively) in all ensemble states, and thus the two end groups do not contribute to the pH sensitivity of the ensemble in COREX/BEST calculations.
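Before uploading, it may help to check a pK_a file locally. The sketch below parses lines of the form shown above and rejects malformed ones. The function name `parse_pka_lines`, the assumption that fields are separated by whitespace, and the second example line (HIS 8 6.50) are hypothetical; the Server's own validation rules are not described here.

```python
def parse_pka_lines(text):
    """Parse 'RES NUM PKA' lines (e.g. 'LYS 25 11.55') into tuples.

    Blank lines are skipped; any other line must have exactly three
    whitespace-separated fields (an assumed convention).
    """
    entries = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(fields)}")
        res_name, res_num, pka = fields
        if len(res_name) != 3:
            raise ValueError(f"line {line_no}: residue code must have three letters")
        entries.append((res_name.upper(), int(res_num), float(pka)))
    return entries


# The first line is the example from the text; the second is made up.
example = "LYS 25 11.55\nHIS 8 6.50\n"
print(parse_pka_lines(example))
```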

3. Methods

To perform ensemble calculations using the COREX/BEST Server, a user must first register with the Web site. Registration involves providing a name, a valid e-mail address, and a login password. The e-mail address is required so that users can be contacted when calculations have completed. Following successful registration, a typical user session consists of the following: (1) the user logs into the COREX/BEST Server through their Web browser, (2) the user defines workspaces for each project, (3) the user uploads protein structure file(s) to appropriate workspace(s), (4) the user submits jobs (i.e., calculations) to the COREX/BEST Server, and (5) the user may download all or a subset of the calculated data contained within that user’s workspace. Below is an example of performing calculations on the staphylococcal nuclease protein (PDB ID 1STN [24]) using the COREX/BEST Server.

3.1. Creating a Workspace

After logging in to the COREX/BEST Server, a workspace can be created by clicking on the “Create Workspace” link on the left side of the webpage under the heading “WORKSPACE OPTIONS”. The user is required to provide a name for this workspace and creates the workspace by clicking on the “Create!” button. Created workspaces can be accessed by clicking on links on the left side of the webpage under the heading “MY WORKSPACES”. On accessing a newly created workspace, the user is presented with the options to: (1) upload a new PDB file, or (2) delete the workspace. Click on the link “Upload New PDB” to upload a plain text PDB file. If the file was uploaded correctly, the server displays a cartoon picture of the structure file in the workspace. The user can also click on an “Upload Report” link in the workspace, which lists potential errors detected in the PDB file. For example, uploading 1STN gives the following report because the first five and last eight residues are missing from the crystallographic structure:

PDB Upload Report for 1STN on 2012/11/20, 16:55:15:

* SEQRES Chain ‘A’ contains 149 residues

* missing coordinates for residues 1 thru 5, line number 323

* missing coordinates for residues 142 thru 149 (end of sequence)

Upload Recap:

* structure contains 2 atom/residue sequence errors or missing coordinates

Upload complete: coordinates for 136 residues with 1092 atoms

Warnings:

* 2 Discrepancies in uploaded file may cause unexpected results in calculations

END PDB Upload Report for 1STN

3.2. Generating a Conformational Ensemble

Next, a user would calculate a conformational ensemble for nuclease based upon the 1STN structure. To do this, click on the “Perform COREX/BEST Calculations” link under the “Protein Options” heading of the workspace. The user is directed to a webpage titled “Choosing COREX/BEST Jobs”. On this webpage, click the link “Run the Ensemble Generator!”. Next, the user is required to specify the size of the folding unit, referred to as the “Window Size”, which is used to calculate the ensemble. Briefly, the window size is the number of contiguous residues that constitute a folding unit. A window size of 8 means that the protein is folded and unfolded in blocks of eight residues. Smaller window sizes create larger ensembles, since a greater number of folding units can fit along a given chain length. Smaller window sizes describe the chain in finer detail, but the larger ensembles they produce require longer run times on the server, and COREX/BEST jobs are limited to 24 h each. Any job estimated to run longer than 24 h will not be performed. For reference, the PDB file upload report also contains a table that lists the total number of states (i.e., microstates) that would be generated and estimated run times for ensembles calculated using different window sizes. The table generated for 1STN in its upload report is reproduced in Table 1.

Table 1.

Estimates for fully enumerated ensemble generation

Window size	Total states	Estimated time (h:min:s)	MB data
6	37,748,724	1 days 17:22:40	2,208
7	5,242,866	5:44:48	306
8	1,179,632	1:17:34	69
9	393,198	0:25:51	23

In our experience, COREX/BEST ensembles seem to be able to reproduce experimental results with better accuracy when window sizes of eight to ten residues are used to generate the microstates. For proteins larger than 150 residues in length, using a recommended window size can create ensembles that exceed the run-time limits of the Server. To handle large proteins, a Monte Carlo option is available when generating ensembles [16]. Using this option, ensemble microstates are randomly generated and selected or “sampled” using standard Monte Carlo methods [27] via the computed state statistical weight (Eq. 1). Briefly, microstates with higher probabilities are more likely to be sampled in a Monte Carlo generated ensemble (see Note 4).
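To make the idea of probability-biased sampling concrete, the sketch below runs a Metropolis-style walk over microstates represented as tuples of folding-unit states. It is a toy model and not the Server's sampling routine: the function name, the additive per-unit free-energy cost, the kcal/mol unit, and the example energies are all assumptions chosen to keep the example short.

```python
import math
import random

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def metropolis_sample(unit_dg, n_steps, temperature_k=298.15, seed=0):
    """Sample microstates as tuples (0 = folded, 1 = unfolded per folding unit).

    Toy energy model (an assumption): each unfolded unit adds its own
    free-energy cost from unit_dg, independent of its neighbours.
    """
    rng = random.Random(seed)
    state = [0] * len(unit_dg)  # start from the fully folded state
    visits = {}
    for _ in range(n_steps):
        k = rng.randrange(len(unit_dg))
        # Free-energy change for flipping folding unit k.
        delta = unit_dg[k] if state[k] == 0 else -unit_dg[k]
        # Metropolis criterion: accept downhill moves, and uphill moves
        # with probability exp(-delta / RT).
        if delta <= 0 or rng.random() < math.exp(-delta / (R_KCAL * temperature_k)):
            state[k] ^= 1
        key = tuple(state)
        visits[key] = visits.get(key, 0) + 1
    return visits


# Three hypothetical folding units, each costing 2 kcal/mol to unfold.
visits = metropolis_sample([2.0, 2.0, 2.0], n_steps=20000)
print(max(visits, key=visits.get))
```

States with higher statistical weight are visited more often, so a sampled ensemble concentrates on the microstates that dominate Eq. 1 without enumerating every combination.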

The user is also required to specify a “Minimum Window Size” for an ensemble. Since a protein chain isn’t necessarily an integer multiple of the user-defined window size, residues at the C-terminal end often don’t fit neatly into a standard folding unit. For example, the 1STN structure contains 136 residues. If the user chooses a window size of 9, this would mean that 15 folding units could be overlaid on the first 135 residue positions of the chain, with a C-terminal residue (i.e., the 136th residue) left by itself. For any protein, the C-terminal “left over” residues are given their own folding unit if their number is equal to or exceeds the minimum window size, otherwise those residues are added to the preceding folding unit. This parameter is also needed for the N-terminal end, as the folding unit boundaries are incrementally shifted one residue at a time toward the C-terminal end to exhaustively generate the ensemble microstates, leaving “left over” N-terminal residues [12]. The incremental shifts in folding unit boundaries are used to remove any boundary-related bias in COREX/BEST results.
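The partitioning rule can be sketched in Python. This is an illustration under stated assumptions and not the Server's code. The function names are hypothetical. The N-terminal leftover is treated with the same rule as the C-terminal leftover. Each boundary offset is assumed to contribute 2^n − 2 states for n folding units, that is, every combination except the fully folded and fully unfolded ones. A minimum window size of 4 is assumed for Table 1, which the text does not state. Under these assumptions, the totals for a 136-residue chain agree with the “Total states” column of Table 1.

```python
def folding_units_per_offset(n_residues, window, min_window):
    """Number of folding units for each one-residue shift of the boundaries."""
    counts = []
    for offset in range(window):
        # N-terminal residues left over after shifting the boundaries by
        # `offset` form their own unit only if there are enough of them.
        n_units = 1 if offset > 0 and offset >= min_window else 0
        full_units, c_leftover = divmod(n_residues - offset, window)
        n_units += full_units
        # C-terminal leftovers form their own unit only if long enough;
        # otherwise they are merged into the preceding unit.
        if c_leftover > 0 and c_leftover >= min_window:
            n_units += 1
        counts.append(n_units)
    return counts


def total_states(n_residues, window, min_window):
    """Sum of partially folded states over all boundary offsets (assumed rule)."""
    counts = folding_units_per_offset(n_residues, window, min_window)
    return sum(2 ** n - 2 for n in counts)


for w in (6, 7, 8, 9):
    print(w, total_states(136, w, 4))
# 6 37748724
# 7 5242866
# 8 1179632
# 9 393198
```

For a window size of 9, the unshifted partition gives 15 folding units, as in the example above: the single leftover C-terminal residue is shorter than the assumed minimum window size and is merged into the preceding unit.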

3.3. Determining Entropy Weighting

The structure-based energy function utilized in COREX/BEST is coarse and provides a rough estimate of the thermodynamic parameters for each microstate (i.e., ΔH_i, ΔC_p,i, and ΔS_i). Coarse approximation of the conformational entropy of a microstate (ΔS_conf,i), which represents the number of conformational variations that a particular energetic state can occupy, is especially problematic for COREX/BEST simulations. Variations of ±5% in the computed ΔS_conf term can shift the unfolding temperature by ~20°C [12]. The conformational entropy term of each microstate is thus normalized using an entropy-weighting factor,

$Δ S_{i} = Δ S_{solv, i} + W \cdot Δ S_{conf, i},$ (3)

where ΔS_solv is the solvent entropy (owing to water ordering at the protein surface) and W is the applied weighting factor, typically a value between 0.95 and 1.05. In practice, a common entropy-weighting factor is applied to each microstate of an ensemble. The COREX/BEST algorithm determines an appropriate weighting factor by calculating the value of W that is needed to set the energetic separation between the native and fully unfolded states in the simulation equal to an experimentally determined free energy difference. For example, the free energy difference between native and unfolded nuclease has been measured to be ~5 kcal/mol at 25°C via guanidine hydrochloride-induced unfolding [28]. To determine an appropriate weighting factor to use in COREX/BEST calculations with 1STN, the user would click on the link “Perform COREX/BEST Calculations” under the “Protein Options” heading of the 1STN workspace. The user will be directed to a webpage titled “Choosing COREX/BEST Jobs”. On this webpage, click on the link “Determining the Entropy Weighting Factor”. Next, the user is asked to specify the overall free energy difference measured and the temperature at which the measurement was made. For 1STN, a value of 5.0 would be input into the “Target Delta G” field, and 25°C entered for the temperature. Next, click on “Calc. W”. For 1STN, COREX/BEST calculates that a value of 0.968 should be used for the weighting factor, W.
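The weighting factor can be understood by combining Eq. 3 with the standard relation ΔG = ΔH − TΔS for the fully unfolded state and solving for W. The sketch below does this algebra; it is not the Server's routine, and the function name and the numerical inputs (ΔH, ΔS_solv, and ΔS_conf of the fully unfolded state) are hypothetical placeholders rather than values computed for 1STN.

```python
def entropy_weight(delta_h, delta_s_solv, delta_s_conf, target_dg, temperature_k):
    """Solve target_dg = dH - T * (dS_solv + W * dS_conf) for W.

    Energies in cal/mol, entropies in cal/(mol*K); all values refer to
    the fully unfolded state relative to the fully folded state.
    """
    return (delta_h - temperature_k * delta_s_solv - target_dg) / (
        temperature_k * delta_s_conf
    )


# Hypothetical thermodynamic terms for a fully unfolded state at 25 °C,
# with a target free energy difference of 5 kcal/mol (5000 cal/mol).
T = 298.15
w = entropy_weight(
    delta_h=20000.0,
    delta_s_solv=-350.0,
    delta_s_conf=410.0,
    target_dg=5000.0,
    temperature_k=T,
)
# Substituting W back into the Gibbs relation recovers the target.
recovered_dg = 20000.0 - T * (-350.0 + w * 410.0)
print(w, recovered_dg)
```

Because ΔS_conf is multiplied by T, a change of a few percent in W moves the computed stability by several kcal/mol, which is consistent with the sensitivity described above.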

3.4. Calculating the Temperature-Dependent Stability of an Ensemble

To test the ability of the COREX/BEST algorithm to accurately describe a conformational ensemble, this algorithm was initially designed to reproduce regional stability variations within protein structures as observed by site-specific hydrogen exchange data [13–15]. As demonstrated [11], the propensity of any region in a protein to undergo structural fluctuations can be determined from the microstate probabilities (Eq. 1). Defined as the residue stability constant, k_f, the local stability of each residue for the native fold is calculated as the ratio of the summed probabilities of all microstates in the ensemble in which a particular residue j is in a folded conformation, P_f,j, to the summed probability of all microstates in which residue j is in an unfolded (or nonfolded) conformation, P_nf,j,

$k_{f, j} = \frac{P_{f, j}}{P_{nf, j}} .$ (4)

Residue stability constants for 1STN can be calculated using the COREX/BEST server by the following: After generating an ensemble for 1STN and determining an appropriate entropy weighting (Subheadings 3.2 and 3.3), click on the link “Perform COREX/BEST Calculations” under the “Protein Options” heading of the 1STN workspace. The user will be directed to a webpage titled “Choosing COREX/BEST Jobs”. On this webpage, click on “Calculate the Residue Specific Stability Constants”. Next, the user will need to specify the entropy weighting factor and the temperature (in °C) to use in the calculation. Also, the user will need to specify which previously generated ensemble should be used. Ensembles are classified by the parameters that were defined in the generation step, namely, window size, minimum window size, and if Monte Carlo sampling was employed. Shown in Fig. 2 are the stability constants calculated for 1STN using a temperature setting of 25°C, an entropy weighting factor of 0.968, and an ensemble generated with a window size of 8 and a minimum window size of 4. High stability constants identify residues that are folded in the majority of the most probable microstates at this temperature, whereas lower stability constants identify residues that are unfolded in many of those microstates. In nuclease, most residues with higher stability constants were located in β-strands 3–6 and in the α-helices. Lower stability constants were found in residues in β-strands 1 and 2, and, most dramatically, the loop connecting β-strand 4 to α-helix 1. In addition to providing residue stability constants, the COREX/BEST Server also outputs a file listing the 50 most probable microstates, giving data that can be used to generate Fig. 1b.
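Eq. 4 amounts to partitioning the microstate probabilities by whether a given residue is folded. A minimal sketch follows; the function name, the boolean encoding of microstates, and the three-state example with made-up probabilities are assumptions for illustration only.

```python
def residue_stability_constants(microstates, probabilities):
    """Eq. 4: k_f,j = P_f,j / P_nf,j for each residue j.

    microstates: list of sequences of booleans, True where the residue
    is folded in that microstate. probabilities: matching P_i values.
    """
    n_residues = len(microstates[0])
    constants = []
    for j in range(n_residues):
        p_folded = sum(p for s, p in zip(microstates, probabilities) if s[j])
        p_unfolded = sum(p for s, p in zip(microstates, probabilities) if not s[j])
        # A residue folded in every state has no finite stability constant.
        constants.append(p_folded / p_unfolded if p_unfolded > 0 else float("inf"))
    return constants


# Hypothetical four-residue protein with three microstates.
states = [
    (True, True, True, True),      # fully folded
    (True, True, False, False),    # C-terminal half unfolded
    (False, False, False, False),  # fully unfolded
]
k_f = residue_stability_constants(states, [0.7, 0.2, 0.1])
print(k_f)
```

In this toy ensemble the first two residues are folded in two of the three states and so have a larger k_f than the last two, mirroring how regional stability differences emerge from the population distribution.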

Fig. 2. The effect of temperature on the nuclease ensemble as calculated by the COREX/BEST Server. (a) Natural logarithm of the residue stability constants, k_f, for 1STN calculated using the temperature settings of 25°C (*filled circles*) and −50°C (*open circles*). The locations of secondary structural elements are as indicated. (b) Projection of the residue stability constants onto the nuclease structure where *yellow* represents residue positions with the lowest k_f values and *red* the highest. *Red colored* residue positions were calculated as stable at 25°C, having large and positive k_f values, whereas *yellow* residues were calculated to have the least stability and lowest k_f values. (c) The nuclease structure color-coded the same as in panel b, except the stability constants were calculated using a temperature setting of −50°C.

Notable to the COREX/BEST calculations is the recent suggestion that protein cold denaturation offers experimental access to the native ensemble, allowing for its structural characterization [21]. It’s thought that cold temperatures promote non-cooperative unfolding of a protein [29, 30] by minimizing the favorable energetics related to the nonspecific burial of hydrophobic groups in aqueous solution [31–33]. As such, the nonspecific hydrophobic packing that favors compact protein structures appears to be minimized at low temperatures in a manner that disrupts globular folds into sub-global structural units. The COREX/BEST Server can be used to simulate protein cold denaturation. For example, repeating the calculation of stability constants at −50°C demonstrates that at low temperatures nuclease prefers partially folded states that are mostly unfolded except for the two C-terminal helices (Fig. 2c).

3.5. Calculating the pH Sensitivity of the Conformational Ensemble

The COREX/BEST algorithm can also model the coupling between H⁺ binding reactions, local fluctuations in structure, and global conformational transitions [18, 19]. COREX/BEST performs this calculation by quantifying how the population distribution of the ensemble of microstates is affected by proton binding via Eq. 2. In this calculation, two sets of pK_a values are used to describe the affinity of each H⁺ binding site (see Note 2). One set describes H⁺ binding to sites in folded and native-like regions of the protein. The pK_a values in this set are referred to as “native” pK_a values and are supplied by the user (see Subheading 2.2). In practice, standard continuum electrostatics methods have been used to generate the set of pK_a,native values for use with COREX/BEST [18]. The second set of pK_a values describes H⁺ binding to sites in unfolded regions, using the intrinsic pK_a values of model compounds [34, 35]. pK_a,intrinsic values are supplied by the COREX/BEST algorithm.

Figure 3 shows the pH sensitivity of the 1STN ensemble calculated by the COREX/BEST Server and identifies which of the ionizable residues were responsible for its pH-dependent stability in the calculation. For nuclease, mutagenic tests involving alanine substitutions of the ionizable residues provided experimental support for the COREX/BEST results [18]. To perform this simulation using the COREX/BEST Server, a user should first generate an ensemble by the steps outlined above, but also check the “pH Ensemble?” box when specifying window size and minimum window size. The COREX/BEST Server saves structural information needed to assign “native” or “intrinsic” pK_a values to each ionizable group in each microstate only when the “pH Ensemble?” box is checked. After the ensemble has been generated, the user should click on the “Calculate the pH Sensitivity an Ensemble” on the “Choosing COREX/BEST Jobs” page.

Fig. 3. The effect of pH on the nuclease ensemble as calculated by the COREX/BEST Server. (a) Natural logarithm of the residue stability constants, k_f (*black*), and the residue-specific protection constants, k_p (*blue dashed line*). Protection constants identify which ionizable residues are folded and in a native-like environment in the most probable microstates [19]. In general, ionizable residues with high protection constant values (labeled in the figure) contribute to the pH-dependent population shifts of the ensemble (if pK_a,native ≠ pK_a,intrinsic), whereas those with low protection constant values do not. (b) pH-dependent folding of the nuclease ensemble determined by 〈fraction native〉 = Σ_i (fraction native)_i · P(pH)_i, where P(pH)_i was calculated by Eq. 2 and the fraction native value of each microstate was calculated as the number of residues in folded segments divided by the total number of residues. These data show that solution changes to acidic or basic conditions promote cooperative unfolding of the 1STN ensemble. (c) Cartoon representation of the nuclease structure. The ionizable residues with high k_p values (labeled in a) are highlighted by showing the positions of their side chain atoms.

4. Notes

1. The relative Gibbs free energy for each microstate, ΔG_i, is determined by structure-based parameterization of ΔC_p,i, ΔH_i, and ΔS_i [12]. The heat capacity, ΔC_p,i, is known to originate primarily from changes in hydration and has been parameterized in terms of changes in solvent accessible apolar (ΔASA_apol) and polar (ΔASA_pol) surface area,
$Δ C_{p, i} = Δ C_{p, apol, i} + Δ C_{p, pol, i},$ (5)
and
$Δ C_{p, i} = a_{Δ C_{p}} \cdot Δ {ASA}_{apol, i} + b_{Δ C_{p}} \cdot Δ {ASA}_{pol, i},$ (6)
where a_ΔCp = 0.44 cal/(mol·K·Å²) and b_ΔCp = −0.26 cal/(mol·K·Å²) [36–38]. The enthalpy change, ΔH_i, also scales with accessible surface areas and can be written as,
$Δ H {(60^{°} C)}_{i} = a_{Δ H} \cdot Δ {ASA}_{apol, i} + b_{Δ H} \cdot Δ {ASA}_{pol, i},$ (7)
and
$Δ H {(T)}_{i} = Δ H {(60^{°} C)}_{i} + Δ C_{p, i} (T - 60^{°} C),$ (8)
where T is the temperature, a_ΔH = −8.44 cal/(mol·Å²), and b_ΔH = 31.4 cal/(mol·Å²) [36, 39]. The entropy change, ΔS_i, includes two contributions, one from changes in solvation and the other from changes in the conformational entropy,
$Δ S_{i} = Δ S_{solv, i} + Δ S_{conf, i}.$ (9)
The solvation contribution can be written in terms of the polar and apolar values of ΔC_p if the temperatures at which ΔS_solv,apol = 0 and ΔS_solv,pol = 0 are known (T_s,apol and T_s,pol, respectively),
$Δ S {(T)}_{solv, i} = Δ C_{p, apol, i} \ln (T / T_{s, apol}) + Δ C_{p, pol, i} \ln (T / T_{s, pol}) .$ (10)

T_s,apol has been shown to be 385 K [31], while T_s,pol has been shown to be 335 K [40]. The conformational entropies are calculated by considering three contributions,
$Δ S_{conf, i} = Δ S_{bu - ex, i} + Δ S_{ex - un, i} + Δ S_{bb, i},$ (11)
where ΔS_bu-ex,i is the summed entropy change for all side chains that are buried in the fully folded state and become exposed in a microstate, ΔS_ex-un,i is the summed entropy change of solvent-exposed side chains upon unfolding of the peptide backbone, and ΔS_bb,i is the backbone entropy change for residues that become unfolded in a microstate. The magnitudes of the conformational entropy contributions for each amino acid type have been determined from computational analysis of the probabilities of the different dihedral and torsion angles and are reported elsewhere [40, 41]. The temperature-dependent Gibbs free energy for each ensemble state, ΔG(T)_i, is then expressed in terms of the standard thermodynamic equation,
$Δ G {(T)}_{i} = Δ H {(T)}_{i} - T (Δ S {(T)}_{solv, i} + Δ S_{conf, i}) .$ (12)
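Eqs. 5, 7, 8, 10, and 12 can be chained into a single function, sketched below. This is an illustration, not the COREX/BEST source. The function name is hypothetical, temperatures are handled in kelvin (so 60°C becomes 333.15 K), the polar and apolar heat capacity terms are passed in directly rather than computed from Eq. 6, and the conformational entropy is taken as an input without the weighting factor of Eq. 3.

```python
import math

T_REF_K = 333.15    # 60 °C, the reference temperature of Eqs. 7 and 8
T_S_APOL_K = 385.0  # temperature at which dS_solv,apol = 0
T_S_POL_K = 335.0   # temperature at which dS_solv,pol = 0
A_DH = -8.44        # cal/(mol*Å^2), apolar enthalpy coefficient
B_DH = 31.4         # cal/(mol*Å^2), polar enthalpy coefficient


def microstate_delta_g(d_asa_apol, d_asa_pol, dcp_apol, dcp_pol, ds_conf, temperature_k):
    """Gibbs free energy of a microstate (cal/mol) from surface-area terms."""
    dh_ref = A_DH * d_asa_apol + B_DH * d_asa_pol       # Eq. 7
    dcp = dcp_apol + dcp_pol                            # Eq. 5
    dh = dh_ref + dcp * (temperature_k - T_REF_K)       # Eq. 8
    ds_solv = (dcp_apol * math.log(temperature_k / T_S_APOL_K)
               + dcp_pol * math.log(temperature_k / T_S_POL_K))  # Eq. 10
    return dh - temperature_k * (ds_solv + ds_conf)     # Eq. 12


# At the reference temperature, with no heat capacity or conformational
# entropy terms, dG reduces to the enthalpy of Eq. 7.
print(microstate_delta_g(100.0, 50.0, 0.0, 0.0, 0.0, T_REF_K))
```

Writing the calculation this way makes the temperature dependence explicit: ΔC_p enters both the enthalpy (Eq. 8) and the solvation entropy (Eq. 10), so the computed stability is a curved function of temperature rather than a straight line.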
2. For each microstate, each ionizable group can have one of two pK_a values: the value in the native conformation (pK_a,native, which is user-supplied) or that of the unfolded state (pK_a,intrinsic, which is supplied by COREX/BEST). The set of pK_a,intrinsic values are those of model compounds in water [34, 35]. It is recommended to calculate pK_a,native values using continuum electrostatic methods based on solution of the linearized Poisson-Boltzmann equation [42] applied to the high-resolution structure (e.g., 1STN). For each microstate, titratable residues in unfolded regions are assigned the model compound pK_a values (pK_a,intrinsic). Titratable groups in folded regions are assigned pK_a values based on the solvent accessibility of the titratable atoms. This corrects for the case in which a residue resides in a folded region but its ionizable group is exposed to solvent because segments of the protein that pack against the titratable site have unfolded. To apply this correction in a general manner, a cutoff threshold was determined by comparison of COREX/BEST calculated and experimentally measured proton binding curves [18, 19]. When the averaged solvent accessibility of the ionizable atoms of a titratable group is <31% (for Glu, Asp, Tyr, Lys, and Arg) or <45% (for His), these groups are assigned pK_a,native values. Otherwise, groups are assigned pK_a,intrinsic values. The COREX/BEST Server allows the user to change these two cutoff thresholds when calculating the pH sensitivity of an ensemble.
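The assignment rule in this note can be written as a short decision function. The sketch below is an interpretation of the text, not the Server's code: the function and constant names are hypothetical, accessibility is assumed to be expressed as a percentage, and the pK_a values in the usage lines are made up.

```python
DEFAULT_CUTOFF = 31.0  # percent accessibility; Glu, Asp, Tyr, Lys, Arg
HIS_CUTOFF = 45.0      # percent accessibility; His


def assign_pka(res_name, in_folded_region, percent_accessibility,
               pka_native, pka_intrinsic,
               cutoff=DEFAULT_CUTOFF, his_cutoff=HIS_CUTOFF):
    """Choose the pK_a of one ionizable group in one microstate."""
    # Groups in unfolded regions always take the model compound value.
    if not in_folded_region:
        return pka_intrinsic
    # Groups in folded regions keep the native value only while their
    # ionizable atoms remain sufficiently buried.
    threshold = his_cutoff if res_name.upper() == "HIS" else cutoff
    return pka_native if percent_accessibility < threshold else pka_intrinsic


# Hypothetical groups at 40% accessibility in a folded region: the His is
# still below its cutoff, but the Glu is not.
print(assign_pka("HIS", True, 40.0, pka_native=5.0, pka_intrinsic=6.3))
print(assign_pka("GLU", True, 40.0, pka_native=3.0, pka_intrinsic=4.5))
```

Exposing the two cutoffs as parameters mirrors the Server option that lets the user change them.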
3. All lines of an uploaded PDB file that begin with the characters “ATOM” are used for COREX/BEST calculations; the algorithm is unaware of the possibility that a PDB file may contain multiple models of the same structure. If a PDB file contains multiple models, either in terms of the whole chain (e.g., NMR model structures) or for the side chain conformations of individual residues, the Server will incorrectly assume that each model represents new atoms. Accordingly, additional atoms that are not physically present in a structure produce nonsensical COREX/BEST results. Also, PDB files that are missing side chain atoms should be avoided. The energy function used by COREX/BEST (see Note 1) was parameterized and trained using protein structures that contained side chain atoms for each residue.
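A quick local check can flag the two multiple-model cases described in this note before a file is uploaded. The sketch below is a hypothetical helper, not part of the Server. It relies only on the standard PDB fixed-column layout, in which the record name occupies columns 1–6 and the alternate location indicator is column 17; it does not check for missing side chain atoms.

```python
def pdb_preflight(pdb_text):
    """Count MODEL records and ATOM lines carrying an alternate location flag."""
    n_models = 0
    n_atoms = 0
    n_altloc = 0
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            n_models += 1
        elif record == "ATOM":
            n_atoms += 1
            # Column 17 (index 16) holds the alternate location indicator.
            if len(line) > 16 and line[16] != " ":
                n_altloc += 1
    return {
        "models": n_models,
        "atom_lines": n_atoms,
        "altloc_atoms": n_altloc,
        "looks_single_model": n_models <= 1 and n_altloc == 0,
    }


# Two truncated ATOM lines; the second carries alternate location 'A'.
sample = (
    "ATOM      1  N   ALA A   1\n"
    "ATOM      2  CA AALA A   1\n"
)
print(pdb_preflight(sample))
```

A file that fails this check would need to be reduced to a single model and a single conformation per residue before it is used for ensemble calculations.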
4. The COREX/BEST algorithm can use a Metropolis Monte Carlo method of sampling [27] to decrease the run-time required to generate an ensemble by decreasing the number of microstates generated. By design, this method uses the microstate probabilities calculated from the COREX/BEST energy function (see Note 1 and Eq. 1) to bias sampling in favor of the more probable microstates. The COREX/BEST energy function, however, depends on temperature and pH. The Monte Carlo sampling routine uses a default temperature setting of 25°C and includes no proton binding energies when calculating microstate probabilities. Thus, Monte Carlo sampled ensembles work best (i.e., produce equivalent results compared to fully enumerated and non-sampled ensembles) for calculations designed to simulate temperatures at or near 25°C and that include no proton binding energies.

Acknowledgments

This work was supported by NIH grant R01-GM63747 to V.J.H. and the Texas Higher Education Coordinating Board grant 003615-0003-2011 to S.T.W.

References

1. Changeux JP, Edelstein SJ (2005) Allosteric mechanisms of signal transduction. Science 308:1424–1428
2. Karplus M, Kuriyan J (2005) Molecular dynamics and protein function. Proc Natl Acad Sci USA 102:6679–6685
3. Eisenmesser EZ, Millet O, Labeikovsky W, Korzhnev DM, Wolf-Watz M, Bosco DA, Skalicky JJ, Kay LE, Kern D (2005) Intrinsic dynamics of an enzyme underlies catalysis. Nature 438:117–121
4. Bai YW, Sosnick TR, Mayne L, Englander SW (1995) Protein folding intermediates: native-state hydrogen exchange. Science 269:192–197
5. Swint-Kruse L, Robertson AD (1996) Temperature and pH dependences of hydrogen exchange and global stability for ovomucoid third domain. Biochemistry 35:171–180
6. Chamberlain AK, Handel TM, Marqusee S (1996) Detection of rare partially folded molecules in equilibrium with the native conformation of RNaseH. Nat Struct Biol 3:782–787
7. Fuentes EJ, Wand AJ (1998) Local dynamics and stability of apocytochrome b(562) examined by hydrogen exchange. Biochemistry 37:3687–3698
8. Itzhaki LS, Neira JL, Fersht AR (1997) Hydrogen exchange in chymotrypsin inhibitor 2 probed by denaturants and temperature. J Mol Biol 270:89–98
9. Yang DW, Kay LE (1996) Contributions to conformational entropy arising from bond vector fluctuations measured from NMR-derived order parameters: application to protein folding. J Mol Biol 263:369–382
10. Li ZG, Raychaudhuri S, Wand AJ (1996) Insights into the local residual entropy of proteins provided by NMR relaxation. Protein Sci 5:2647–2650
11. Hilser VJ, Freire E (1996) Structure-based calculation of the equilibrium folding pathway of proteins. Correlation with hydrogen exchange protection factors. J Mol Biol 262:756–772
12. Hilser VJ (2001) Modeling the native state ensemble. Methods Mol Biol 168:93–116
13. Hilser VJ, Freire E (1997) Predicting the equilibrium protein folding pathway: structure-based analysis of Staphylococcal nuclease. Proteins 27:171–183
14. Hilser VJ, Townsend BD, Freire E (1997) Structure-based statistical thermodynamic analysis of T4 lysozyme mutants: structural mapping of cooperative interactions. Biophys Chem 64:69–79
15. Hilser VJ, Dowdy D, Oas TG, Freire E (1998) The structural distribution of cooperative interactions on proteins: analysis of the native state ensemble. Proc Natl Acad Sci USA 95:9903–9908
16. Pan H, Lee JC, Hilser VJ (2000) Binding sites in Escherichia coli dihydrofolate reductase communicate by modulating the conformational ensemble. Proc Natl Acad Sci USA 97:12020–12025
17. Liu T, Whitten ST, Hilser VJ (2007) Functional residues serve a dominant role in mediating the cooperativity of the protein ensemble. Proc Natl Acad Sci USA 104:4347–4352
18. Whitten ST, García-Moreno EB, Hilser VJ (2005) Local conformational fluctuations can modulate the coupling between proton binding and global structural transitions in proteins. Proc Natl Acad Sci USA 102:4282–4287
19. Whitten ST, García-Moreno EB, Hilser VJ (2008) Ligand effects on the protein ensemble: unifying the descriptions of ligand binding, local conformational fluctuations, and protein stability. Methods Cell Biol 84:871–891
20. Babu CR, Hilser VJ, Wand AJ (2004) Direct access to the cooperative substructure of proteins and the protein ensemble via cold denaturation. Nat Struct Mol Biol 11:352–357
21. Whitten ST, Kurtz AJ, Pometun MS, Wand AJ, Hilser VJ (2006) Revealing the nature of the native state ensemble through cold denaturation. Biochemistry 45:10163–10174
22. Schrank TP, Bolen DW, Hilser VJ (2009) Rational modulation of conformational fluctuations in adenylate kinase reveals a local unfolding mechanism for allostery and functional adaptation in proteins. Proc Natl Acad Sci USA 106:16984–16989
23. Vertrees J, Barritt P, Whitten ST, Hilser VJ (2005) COREX/BEST server: a web browser-based program that calculates regional stability variations within protein structures. Bioinformatics 21:3318–3319
24. Hynes TR, Fox RO (1991) The crystal structure of staphylococcal nuclease refined at 1.7 Å resolution. Proteins 10:92–105
25. Hilser VJ, García-Moreno EB, Oas TG, Kapp G, Whitten ST (2006) A statistical thermodynamic model of the protein ensemble. Chem Rev 106:1545–1558
26. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) The protein data bank. Nucleic Acids Res 28:235–242
27. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1092
28. Whitten ST, García-Moreno EB (2000) pH dependence of stability of staphylococcal nuclease: evidence of substantial electrostatic interactions in the denatured state. Biochemistry 39:14292–14304
29. Freire E, Murphy KP, Sanchez-Ruiz JM, Galisteo ML, Privalov PL (1992) The molecular basis of cooperativity in protein folding: thermodynamic dissection of interdomain interactions in phosphoglycerate kinase. Biochemistry 31:250–256
30. Griko YV, Venyaminov SY, Privalov PL (1989) Heat and cold denaturation of phosphoglycerate kinase (interaction of domains). FEBS Lett 244:276–278
31. Baldwin RL (1986) Temperature dependence of the hydrophobic interaction in protein folding. Proc Natl Acad Sci USA 83:8069–8072
32. Lopez CF, Darst RK, Rossky PJ (2008) Mechanistic elements of protein cold denaturation. J Phys Chem B 112:5961–5967
33. Privalov PL (1990) Cold denaturation of proteins. Crit Rev Biochem Mol Biol 25:281–305
34. Matthew JB, Gurd FR, García-Moreno EB, Flanagan MA, March KL, Shire SJ (1985) pH-dependent processes in proteins. CRC Crit Rev Biochem 18:91–197
35. Schaefer M, van Vlijmen HWT, Karplus M (1998) Electrostatic contributions to molecular free energies in solution. Adv Protein Chem 51:1–57
36. Murphy KP, Freire E (1992) Thermodynamics of structural stability and cooperative folding behavior in proteins. Adv Protein Chem 43:313–361
37. Gomez J, Hilser VJ, Xie D, Freire E (1995) The heat-capacity of proteins. Proteins 22:404–412
38. Habermann SM, Murphy KP (1996) Energetics of hydrogen bonding in proteins: a model compound study. Protein Sci 5:1229–1239
39. Xie D, Freire E (1994) Structure-based prediction of protein-folding intermediates. J Mol Biol 242:62–80
40. D’Aquino JA, Gomez J, Hilser VJ, Lee KH, Amzel LM, Freire E (1996) The magnitude of the backbone conformational entropy change in protein folding. Proteins 25:143–156
41. Lee KH, Xie D, Freire E, Amzel LM (1994) Estimation of changes in side chain configurational entropy in binding and folding: general methods and application to helix formation. Proteins 20:68–84
42. Fitch CA, Karp DA, Lee KK, Stites WE, Lattman EE, García-Moreno EB (2002) Experimental pKa values of buried residues: analysis with continuum methods and role of water penetration. Biophys J 82:3289–3304

