Entropy and Free Energy of a Mobile Protein Loop in Explicit Water

Srinath Cheluvaraja; Mihail Mihailescu; Hagai Meirovitch

doi:10.1021/jp801827f

Author manuscript; available in PMC: 2009 Apr 21.

Published in final edited form as: J Phys Chem B. 2008 Jul 10;112(31):9512–9522. doi: 10.1021/jp801827f

PMCID: PMC2671085 NIHMSID: NIHMS97309 PMID: 18613721

Abstract

Estimation of the energy from a given Boltzmann sample is straightforward since one just has to average the contributions of the individual configurations. On the other hand, calculation of the absolute entropy, S (hence the absolute free energy F), is difficult because it depends on the entire (unknown) ensemble. We have developed a new method, called “the hypothetical scanning molecular dynamics” (HSMD), for calculating the absolute S from a given sample (generated by any simulation technique). In other words, S (like the energy) is “written” on the sample configurations, and HSMD provides a prescription for how to “read” it. In practice, each sample conformation, i, is reconstructed with transition probabilities, and their product leads to the probability of i, hence to the entropy. HSMD is an exact method in which all interactions are considered and the only approximation is due to insufficient sampling. In previous studies HSMD (and HS Monte Carlo, HSMC) has been extended systematically to systems of increasing complexity, the most recent being the 7-residue mobile loop, 304–310 (Gly-His-Gly-Ala-Gly-Gly-Ser), of the enzyme porcine pancreatic α-amylase modeled by the AMBER force field and by AMBER with the implicit solvation GB/SA (paper I). In the present paper we go a step further and extend HSMD to the same loop capped with TIP3P explicit water at 300 K. As in paper I, we are mainly interested in entropy and free energy differences between the free and bound microstates of the loop, which are obtained from two separate MD samples of these microstates. The contribution of the loop to S and F is calculated by HSMD and that of water by a particular thermodynamic integration procedure. As expected, the free microstate is more stable than the bound microstate by a total free energy difference, F_free − F_bound = −4.8 ± 1 kcal/mol, as compared to −25.5 kcal/mol obtained with GB/SA.
We find that relatively large systematic errors in the loop entropies, S_free(loop) and S_bound(loop) are cancelled in their difference which is thus obtained efficiently and with high accuracy, i.e. with a statistical error of 0.1 kcal/mol. This cancellation, which has been observed in previous HSMD studies, is in accord with theoretical arguments given in paper I.

I. Introduction

I.1. The difficulty in calculating the absolute entropy

The commonly used simulation methods, Metropolis Monte Carlo (MC)¹ and molecular dynamics²^,³ (MD), enable direct calculation of quantities such as the energy E_i that are “written” on a simulated configuration i. However, these methods (due to their dynamic character) do not provide the absolute entropy S (hence the absolute Helmholtz free energy F, F=E−TS, where T is the absolute temperature) in a straightforward manner. More specifically, $S = - k_{B} \sum_{i} P_{i}^{B} \ln P_{i}^{B}$, where k_B is the Boltzmann constant and $P_{i}^{B}$ is the Boltzmann probability of configuration i,

$$P_{i}^{B} = \exp [- E_{i} / k_{B} T] / Z . \qquad (1)$$

However, $P_{i}^{B}$ depends not only on i but also on the whole ensemble through the partition function Z, (Z=Σexp[−E_i/k_BT]), which cannot be obtained directly from a finite sample. Therefore, in spite of the progress achieved during the years, calculation of S from a single MC or MD sample still remains a difficult problem.
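To make the role of Z concrete, the following minimal sketch evaluates eq 1 and the entropy sum for a tiny system whose configurations can all be enumerated. The four energies are hypothetical values chosen only for illustration; for a real peptide or protein this enumeration is impossible, which is exactly the difficulty described above.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def boltzmann_stats(energies, temperature):
    """Return (Z, probabilities, <E>, S, F) for a fully enumerable system.

    energies: configuration energies in kcal/mol (hypothetical toy values).
    """
    kt = K_B * temperature
    weights = [math.exp(-e / kt) for e in energies]
    z = sum(weights)                      # partition function Z
    probs = [w / z for w in weights]      # Boltzmann probabilities, eq 1
    mean_e = sum(p * e for p, e in zip(probs, energies))
    entropy = -K_B * sum(p * math.log(p) for p in probs)  # S = -kB sum P ln P
    free_energy = mean_e - temperature * entropy          # F = E - TS
    return z, probs, mean_e, entropy, free_energy


energies = [0.0, 0.5, 1.0, 2.0]  # hypothetical configuration energies
z, probs, mean_e, s, f = boltzmann_stats(energies, 300.0)

# F obtained as <E> - TS coincides with -kB T ln Z when Z is known exactly.
assert abs(f - (-K_B * 300.0 * math.log(z))) < 1e-12
```

The mean energy needs only the sampled E_i, whereas every P_i (and hence S) requires Z, which is a sum over the whole ensemble.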

In recent years we have developed a new method for calculating the absolute S and F from a single sample called the hypothetical scanning Monte Carlo (HSMC) (or HSMD, where MD is used). HSMC(D) is based on ideas of two earlier methods suggested by Meirovitch, the local states (LS)⁴^–⁶ and the hypothetical scanning (HS)⁷^–⁹ methods. HSMC(D) has been developed systematically as applied to liquid argon, TIP3P water,¹⁰^,¹¹ self-avoiding walks on a square lattice,¹² and peptides,¹³^–¹⁵ where for the first three models HSMC(D) results have been found to agree within error bars with thermodynamic integration results obtained by extensive MC or MD simulations. Also, for polyglycine molecules the differences ΔF_mn and ΔS_mn for α-helix, extended, and hairpin microstates were calculated very reliably by HSMC^118–120 (IV.C–E).

Very recently HSMD has been applied successfully to a mobile loop of the protein α-amylase,¹⁶ where the system was modeled by the AMBER96 force field¹⁷ alone and by the AMBER96 force field with the GB/SA implicit solvent of Still and coworkers;¹⁸ this paper (Ref. 16) is referred to here as paper I. In the present paper we go a step further in the development of HSMD, where GB/SA is replaced by explicit water, i.e., the same loop is capped with TIP3P¹⁹ water molecules.

I.2. Microstates of biological macromolecules

Before discussing the loop and HSMC(D) further, one should emphasize the importance of the free energy in structural biology as a criterion of stability. Bio-macromolecules such as proteins have a rugged potential energy surface, E(x) (x is the 3N-dimensional vector of the Cartesian coordinates of the molecule’s N atoms), which is “decorated” by a tremendous number of localized wells and “wider” wells, defined over regions, Ω_m, which we call microstates; thus, each microstate consists of many localized wells (an example of a microstate is the α-helical region of a peptide). A microstate Ω_m, which typically constitutes only a tiny part of the entire conformational space Ω, can be represented by a sample (trajectory) generated by a local MD simulation starting from a structure that belongs to Ω_m. MD studies have shown that a molecule will visit a localized well only for a very short time [several femtoseconds (fs)] while staying for a much longer time within a microstate,²⁰^,²¹ meaning that the microstates are of greater physical significance than the localized wells (see also I.4 below).

A central aim of computational structural biology is to identify the most stable microstates, i.e., those with the largest conformational partition function Z_m (or equivalently with lowest Helmholtz free energy, F_m)

$$F_{m} = - k_{B} T \ln Z_{m} = - k_{B} T \ln \int_{m} \exp [- E (x) / k_{B} T] \, d x \qquad (2)$$

where the integration is carried out over the limited microstate Ω_m, rather than over Ω (for simplicity we shall denote in most cases a microstate Ω_m by m). Thus, the protein folding problem is the notoriously difficult task of identifying the microstate with the global minimum F_m, which practically might be achieved by two challenging stages: (1) identifying an initial set of microstates with expected high stability (e.g., based on an energetic criterion), and (2) calculating their relative populations, p_m/p_n [p_m = exp[−F_m/k_BT]/Z], which leads to the minimum F_m,

$$p_{m} / p_{n} = Z_{m} / Z_{n} = \exp [- Δ F_{mn} / k_{B} T] \qquad (3)$$

where ΔF_mn= F_m−F_n.

Calculation of relative populations is also required in problems which are less challenging than protein folding, i.e., in cases of intermediate flexibility, where a flexible protein segment (e.g., a side chain or a surface loop), a cyclic peptide, or a ligand bound to an enzyme populates significantly several microstates in thermodynamic equilibrium. It is of interest to know whether the conformational change adopted by a loop (a side chain, ligand, etc.) upon binding has been induced by the other protein (induced fit ²²^,²³) or alternatively the free loop already interconverts among different microstates where one of them is selected upon binding (selected fit²⁴). This analysis requires calculating p_m values, which are also needed for a correct analysis of NMR and x-ray data of flexible macromolecules.²⁵^–²⁸ Calculation of F is essential in many other biological processes. Thus, F determines the binding affinities of protein-protein interactions, it is an important factor in enzymatic reactions, electron transfer, and ion transport through membranes, and it leads to the solubilities of small molecules.

I.3. Advantages of the absolute F and S

The examples discussed above require calculating only the difference in free energy ΔF_mn (rather than the absolute F) which can be obtained, in principle, by applying thermodynamic integration (TI) techniques or a counting method which leads to ΔF_mn= − k_BTln[(#m)/(#n)], where #m (#n) are the populations of m and n obtained from a long MD trajectory.²⁹^–³⁵ In paper I¹⁶ we discuss these methods, emphasizing their limitations where m and n are separated by high energy barriers, which demonstrates the need for methods for calculating the absolute F_m from a given sample; with such methods, one will be able to carry out (only) two separate local MD simulations of microstates m and n, calculating directly the absolute F_m and F_n hence their difference ΔF_mn = F_m − F_n, where the complex TI process or the long runs needed for the counting method are avoided.
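The counting relation quoted above, together with eq 3, can be illustrated with a short sketch. The trajectory labels below are hypothetical (900 frames assigned to m and 100 to n); in practice the assignment of frames to microstates is itself a nontrivial step, as discussed in I.4.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def delta_f_from_counts(count_m, count_n, temperature):
    """Counting method: Delta F_mn = F_m - F_n = -kB T ln(#m / #n)."""
    if count_m <= 0 or count_n <= 0:
        raise ValueError("both microstates must be visited at least once")
    return -K_B * temperature * math.log(count_m / count_n)


def population_ratio(delta_f_mn, temperature):
    """Eq 3: p_m / p_n = exp(-Delta F_mn / kB T)."""
    return math.exp(-delta_f_mn / (K_B * temperature))


# Hypothetical labelled trajectory: each frame assigned to microstate m or n.
labels = ["m"] * 900 + ["n"] * 100
d_f = delta_f_from_counts(labels.count("m"), labels.count("n"), 300.0)
ratio = population_ratio(d_f, 300.0)
```

The function raises an error when one microstate is never visited, which mirrors the limitation mentioned in the text: with high barriers between m and n, a single trajectory may not sample both, and the counting estimate is then undefined.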

The absolute S and F can be calculated by the harmonic³⁶^–³⁸ and quasi-harmonic³⁹ approximations and can also be obtained by TI provided that a reference state R with known F_R is available and an efficient integration path R→m can be defined. However, for non-homogeneous systems such integration might not be trivial, and in models of peptides and proteins defining adequate reference states is a difficult problem (for further discussions on these and other methods, see paper I¹⁶ and Ref. 35).

With the LS, HS and HSMC(D) techniques mentioned earlier, each conformation i of a sample (generated by MC, MD, or any other technique) is reconstructed step-by-step (from nothing) using transition probabilities (TPs). The product of these TPs leads to an approximation for the correct Boltzmann probability $P_{i}^{B}$ (eq 1) from which various free energy functionals can be defined. The TPs of HSMC(D) are stochastic in nature, calculated by MC or MD simulations in which all interactions are taken into account. In this respect HSMC(D) (unlike HS and LS) can be viewed as exact;¹⁰ the only approximation involved is due to insufficient MC (MD) sampling. HSMC(D) has unique features: it provides rigorous lower and upper bounds for F, which enable one to determine the accuracy from HSMC(D) results alone without the need to know the correct answer. Furthermore, F can be obtained from a very small sample and in principle even from any single conformation (e.g., see results for argon in Ref. 10).

I.4. Problems to define microstates by computer simulation

Thus far we have dealt with microstates (and their populations) without providing a practical definition for them. Defining them, however, is not a straightforward task; it has been ignored to a large extent in the literature but has been given considerable thought by us over the years⁵^,⁶^,²⁶^,⁴⁰^–⁴³ (see also paper I¹⁶). To illustrate the problem, assume a peptide model based on constant bond lengths and bond angles in a helical microstate Ω_h, i.e., the dihedral angles φ_i and ψ_i are expected to vary within relatively small ranges Δφ_i and Δψ_i around φ_i = −60° and ψ_i = −50° (we ignore for a moment the side chains). However, the correct limits of Ω_h in terms of [φ_i, ψ_i] are unknown because the strongly correlated angles define a complicated narrow “pipe” within the region, Δφ₁×Δψ₁×Δφ₂×Δψ₂·····Δφ_N ×Δψ_N. Obviously, these correlations are taken into account by an exact simulation method and thus, in practice, Ω_h can be defined (or more correctly, represented) by a local MC (MD) sample of conformations initiated from an α-helical structure.

However, this definition should be used with caution. Thus, a short simulation will span only a small part of Ω_h and this part will grow constantly as the simulation continues; correspondingly, the calculated average potential energy, E_h, and the entropy, S_h (obtained by any method), will both increase and the free energy, F_h, is expected to change as well. As the simulation time is increased further, side chain dihedrals will “jump” to different rotamers, which according to our definition should also be included within Ω_h; for a long enough simulation the peptide is expected to “leave” the α-helical region, moving to a different microstate. Thus, in practice, the microstate size and the corresponding thermodynamic quantities depend on the simulation time t. Therefore, there is always some arbitrariness in the definition of a microstate, which affects the calculated averages. This arbitrariness is severe with some methods and can be controlled (minimized) by others.

Because the size of m (n) depends on t, calculation of differences S_m − S_n (F_m − F_n) from their absolute values [with QH, LS, or HSMC(D)] will depend on t as well where (as mentioned above) typically m and its energy and entropy all grow with t. To be able to carry out reliable estimation of ΔS_mn (ΔF_mn, etc.) we simulate both m and n for the same t looking for a range of t values where ΔF_mn(t), ΔS_mn(t) and ΔE_mn(t) are stable within the statistical errors [due to simultaneous increase of E_m(t), E_n(t), etc.]. With HSMC(D) one can calculate a series of improving approximations denoted S^A(t_r) by increasing the reconstruction time t_r which leads to improving differences $Δ S_{m n}^{A} (t_{r})$ [and $Δ F_{m n}^{A} (t_{r})$ ]; if these differences converge within the statistical errors, the converged values are considered to be the correct differences due to cancellation of equal systematic errors in $S_{m}^{A} (t_{r})$ and $S_{n}^{A} (t_{r})$ (a similar procedure is applicable with LS) (see a detailed discussion in II.10 of paper I¹⁶).

Obviously, if m is less stable than n the t values should be adjusted (i.e., decreased) to fit the stability of m. If m is significantly larger than n, t_m should be large enough to allow an adequate coverage of m. However, if ΔS_mn(t) increases monotonically it constitutes a lower bound. If the microstate is restrictive, e.g., side chains should populate a single rotamer, the MD sample can be composed of several smaller samples, each starting from the same structure with a different set of velocities. One should always verify that the samples remain in the original microstates and have not “escaped” to neighboring ones. We have developed methods for analyzing the stability of a microstate by calculating distribution profiles of dihedral angles.²⁶^,⁴¹^,⁴³

I.5. A mobile loop in porcine pancreatic α-amylase

As in paper I¹⁶, we apply HSMD to two structures (microstates) of the flexible surface loop 304–310 (Gly-His-Gly-Ala-Gly-Gly-Ser) of the enzyme porcine pancreatic α-amylase (PPA). PPA is a single polypeptide chain of 496 amino acid residues⁴⁴^–⁴⁷ consisting of three structural domains, domain A (residues 1–99, 170–404), domain B (residues 100–169) and domain C (residues 405–496). Domain A adopts a (β/α)₈ barrel structure and contains the three catalytic residues Asp197, Glu233 and Asp300. A deep cleft in this domain is accepted to be the substrate-binding site.⁴⁴^–⁵⁰ An essential chloride ion and a calcium ion are located close to this V-shaped depression and have been suggested to enhance the catalytic activity⁵⁰^–⁵³ (for more details about PPA see paper I).

In the crystal structures of the free protein (PPA I⁴⁴ and II⁴⁷ which differ by two residues) the above loop has larger B-factors than the average B-factors of the atoms in the protein. However, in the crystal structures of PPA I complexed with acarbose⁴⁵ and PPA II complexed with V-1532⁴⁷ the B-factors of this loop are close to the average value in the protein where the loop has moved toward the active site. The maximum main-chain movement is ~5 Å at His305, which approaches the inhibitor from the solvent side to make a hydrogen bond with a glucose residue. The outcome of this movement is an apparent closure of the surface edge of the cleft.⁴⁵ Subsequently, several hypotheses have been put forward with respect to the function of the mobile loop in α-amylases, such as providing assistance in holding the glucose residues in a favorable orientation during catalysis,⁴⁵ or assisting in the transition state,⁵⁴ or inducing a trap-release mechanism of substrate and products.⁵⁵

In paper I we carried out two MD simulations, starting from the x-ray structures of the free and complexed PPA II, 1pif and 1pig, respectively,⁴⁷ which spanned the corresponding microstates, and the entropy and free energy were calculated by HSMD. In this initial study the loop was modeled by the AMBER96 force field¹⁷ alone (where solvation effects are not considered) and by AMBER96 with the highly approximate GB/SA implicit solvent.¹⁸ In the present work we make an additional step in the development of HSMD by extending it to the above loop capped by TIP3P water molecules.¹⁹

II. Theory and methodology

II.1. The loop and the protein’s template

As was pointed out above we study the 7-residue loop, 304–310 of PPA in two microstates related to the free and bound loop structures; the starting point is the available crystal structures of PPA II, 1pif and 1pig,⁴⁷ respectively. Because the structures of these proteins are almost identical, we have chosen (as in paper I) to carry out the calculations with the 1pif structure, where the loop structure of 1pig is attached to the 1pif structure by superimposing the structure of 1pig on that of 1pif (the ligand was discarded). This might indicate whether the transition of the loop to the bound microstate constitutes a selected fit, i.e., whether this microstate is reachable in the free protein. PPA is a relatively large protein and it would be computationally unfeasible to include all of its atoms in the calculations. Therefore, exactly as in paper I, we consider only a template of 700 atoms (the same atoms for the bound and free structures) that are close to the loop where the rest of the protein’s atoms are ignored. Using the same template as in paper I will allow comparing the present results to those obtained in paper I with implicit solvent. The construction of the template is described in some detail below.

Thus (see Figure 1 and Ref. 56), the center of mass of the loop backbone atoms (in the x-ray structure of 1pif) is calculated as a reference point denoted x_cmb. A distance (R_temp) is chosen such that if the distance of any atom of a residue from x_cmb is less than R_temp, the entire residue is included in the template. Otherwise, the residue is eliminated. As in paper I, R_temp = 12 Å (which leads to a template of 700 atoms). Then, the loop and template atoms are relaxed to a nearby geometry. This minimization is carried out using harmonic positional restraints with a force constant of 5 kcal mol⁻¹Å⁻², which are applied to all heavy atoms. This eliminates bad atomic overlaps and strains in the original structure, while keeping the atoms reasonably close to the PDB coordinates. All these procedures are exactly the same as in paper I.
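The residue-selection rule described above can be sketched in a few lines. This is only an illustration of the distance criterion, not the authors' actual tooling: the input format (tuples of mass or residue id with coordinates), the function names, and the toy coordinates are all assumptions, and the subsequent restrained minimization is not shown.

```python
import math
from collections import defaultdict


def center_of_mass(atoms):
    """atoms: iterable of (mass, (x, y, z)); returns the mass-weighted center."""
    atoms = list(atoms)
    total = sum(m for m, _ in atoms)
    return tuple(sum(m * xyz[i] for m, xyz in atoms) / total for i in range(3))


def select_template_residues(atoms, x_cmb, r_temp=12.0):
    """Keep a whole residue if ANY of its atoms lies within r_temp of x_cmb.

    atoms: iterable of (residue_id, (x, y, z)) for the non-loop protein atoms.
    Returns the sorted ids of the residues that form the template.
    """
    min_dist = defaultdict(lambda: math.inf)
    for res_id, xyz in atoms:
        min_dist[res_id] = min(min_dist[res_id], math.dist(xyz, x_cmb))
    return sorted(r for r, d in min_dist.items() if d < r_temp)


# Hypothetical toy coordinates (Angstrom), for illustration only.
loop_backbone = [(12.0, (0.0, 0.0, 0.0)), (14.0, (2.0, 0.0, 0.0))]
x_cmb = center_of_mass(loop_backbone)
protein_atoms = [
    (1, (5.0, 0.0, 0.0)), (1, (20.0, 0.0, 0.0)),  # residue 1: one atom close
    (2, (30.0, 0.0, 0.0)),                        # residue 2: too far
    (3, (0.0, 11.0, 0.0)),                        # residue 3: just inside
]
template = select_template_residues(protein_atoms, x_cmb, r_temp=12.0)
```

Because the criterion is applied per residue rather than per atom, residue 1 is kept in full even though one of its atoms lies outside R_temp.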

Figure 1. A two-dimensional diagram of the spherical water restraining region. The loop is represented as the heavy black curve, and the protein template is the region shown in gray. The dashed circle (radius = R_temp) defines the edge of the template. Three positions are marked with the symbol ⊗ in the figure. These are, starting from the bottom, x_cm, x_cmb, and x_sph. x_cm is the center of mass of the loop and template atoms, while x_cmb is the center of mass of the loop backbone. x_cm and x_cmb are connected by a dotted line which defines the vector direction (pointing from x_cm to x_cmb) that is used to determine the position of x_sph. (That is, x_sph is shifted away from the template by r_shift, see eq 4.) Water molecules are contained within a spherical region defined by the distance, R_cap, measured from x_sph. This containment region is represented by the large outer circle. Note that generally, R_cap > R_temp, and therefore the edge of this circle (sphere in 3D) is shifted to keep the water molecules on the “loop side” of the model system.

The treatment of water has been discussed in Ref. 56, where the effect of minimalist explicit solvation models for surface loops in proteins has been studied, following the original work of Steinbach and B. Brooks.⁵⁷ In Ref. 56 we performed MD simulations of surface loops capped by TIP3P water,¹⁹ in the presence of fixed templates, where the prime focus of the study was to check the performance of the small numbers (e.g., ~100) of water molecules employed. The number of water molecules, N was systematically varied, and convergence with large N was monitored to reveal the minimum number required for the loop to exhibit realistic (fully hydrated) behavior. It was found that the loop backbone can stabilize with a surprisingly small number of water molecules (as low as 5 molecules per amino acid residue). The side chains require somewhat larger N, roughly 12 water molecules per residue. The importance of this result lies in the fact that at this hydration level, computational times are comparable to those required for GB/SA,¹⁸ while the “minimalist explicit models” are expected to provide a viable and potentially more accurate alternative. In the present paper we have chosen to apply initially N=70 due to the fact that five out of the seven residues are Gly and Ala, which practically are without side chains. However, we have also carried out calculations using N=120 (see III.4).

To hold these waters around the loop they are restrained with a flat-welled half-harmonic potential (a force constant of 10 kcal mol⁻¹Å⁻²), based on the distance from the “center” of the loop region. That is, the distance of each water molecule (in practice, the oxygen atom) is measured from a restraining center denoted x_sph. If this distance is greater than a prescribed distance, R_cap, a harmonic restoring force is applied, otherwise the restraining force is zero. A reasonable restraining center could be, for example, the center of mass of the loop backbone atoms (i.e. x_sph = x_cmb). However, we have found that our template is too “thin” and during MD simulations water molecules percolate through cavities in the template to its “back side”. To avoid this undesired situation we have chosen R_cap=14 Å (i.e. R_cap is larger than R_temp) and as suggested in Ref 56, x_sph was defined by,

$$x_{sph} = x_{cmb} + r_{shift} \, (x_{cmb} - x_{cm}) / | x_{cmb} - x_{cm} | \qquad (4)$$

where x_cm is the overall center of mass of the loop-template system. Here, the effect is to shift the center of the restraining sphere (x_sph) toward the “loop side” of the loop-template system by r_shift Å (see Figure 1). However, because changing r_shift affects the energy of the system, one would seek to adopt its smallest value that still avoids the percolation of water. We carried out several MD simulations (500 and 1000 ps long) for R_cap=14 Å and r_shift = 2 – 7, 9, 15, and 25 Å, calculating the various energy components; it has been found that water seeping is eliminated for r_shift ≥ 4 Å, but as a precaution we have decided to use r_shift=5 Å, for which the energy results are similar to those obtained with r_shift = 4 Å.
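The shifted restraining center of eq 4 and the flat-welled half-harmonic potential can be sketched as follows. The function names and the toy coordinates are hypothetical, and the functional form k(r − R_cap)² outside the cap (without a factor of 1/2) is an assumption; the text specifies only the force constant, R_cap, and that the restraint vanishes inside the sphere.

```python
import math


def restraining_center(x_cmb, x_cm, r_shift):
    """Eq 4: move x_cmb by r_shift along the unit vector from x_cm to x_cmb."""
    diff = [b - c for b, c in zip(x_cmb, x_cm)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("x_cmb and x_cm coincide; the shift direction is undefined")
    return tuple(b + r_shift * d / norm for b, d in zip(x_cmb, diff))


def cap_restraint_energy(x_oxygen, x_sph, r_cap=14.0, k_force=10.0):
    """Flat-welled half-harmonic restraint on a water oxygen.

    Zero inside the sphere of radius r_cap around x_sph; outside it the
    energy is taken as k_force * (r - r_cap)**2 (ASSUMED convention).
    Units: Angstrom and kcal/mol.
    """
    r = math.dist(x_oxygen, x_sph)
    excess = r - r_cap
    return k_force * excess * excess if excess > 0.0 else 0.0


# Hypothetical geometry: template center at the origin, loop backbone 5 A above.
x_sph = restraining_center((0.0, 0.0, 5.0), (0.0, 0.0, 0.0), r_shift=5.0)
e_inside = cap_restraint_energy((0.0, 0.0, 23.0), x_sph)   # 13 A from x_sph
e_outside = cap_restraint_energy((0.0, 0.0, 26.0), x_sph)  # 16 A from x_sph
```

Shifting the center away from the template moves the far edge of the sphere off the “back side” of the template, which is the geometric reason a larger r_shift suppresses water percolation.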

As in paper I, the potential energy is defined by the AMBER96 force field,¹⁷ and the His residue is protonated in the free and bound states. The reconstruction of the loop structure is carried out in internal coordinates; therefore, the conformations simulated by MD should be transformed from Cartesian coordinates to the dihedral angles φ_i, ψ_i, and ω_i (i=1,…,N; N=7), the bond angles θ_i,l (i=1,…,N; l=1,…,3), the side chain angles χ, and the corresponding bond angles. As in paper I we consider only three χ angles, two of His and one of Ser, while the contribution of the side chain of Ala is ignored. Also, because the side chains are much shorter than the backbone and are not restricted by the loop closure condition, the effect of their bond angles on entropy differences is expected to be small and is thus ignored; we have argued in paper I that to a good approximation bond stretching can be ignored as well. For convenience, these angles (ordered along the backbone) are denoted by α_k, k=1,…,K, where K=45 (21 backbone dihedral angles, 21 backbone bond angles, and 3 χ angles). The MD runs are carried out with the package TINKER,⁵⁸ where the loop and the capped TIP3P waters are free to move while the template is kept fixed in its x-ray coordinates, and the total potential energy E is

$$E = E_{loop} + E_{water} \qquad (5)$$

where E_loop includes the loop-loop and loop-template potential energy and E_water consists of the water-water, water-loop, and water-template interactions (the template-template energy is constant and thus is ignored).

II.2. Statistical mechanics of a loop in internal coordinates

The partition function of the loop/water system Z (eq 1) is an integration of exp[−E/k_BT] with respect to x_loop, the Cartesian coordinates of the loop, and x^N, the 3N coordinates of the N water molecules, over a microstate m (for brevity we omit the letter m in most equations). However, it is convenient to change the variables of integration from x_loop to internal coordinates, α_k, k=1,…,K, which makes the integral dependent also on a Jacobian, J, which for a linear chain has been shown to be a simple function of the bond angles and bond lengths, independent of the dihedral angles.³⁶^,³⁷^,³⁹ This transformation is applied under the assumption that the potentials of the bond lengths (“the hard variables”) are strong and therefore their average values can be assigned to J, which to a good approximation can be taken out of the integral (however, see a later discussion in this section). For the same reason one can carry out the integration over the bond lengths (assuming that they are not correlated with the α_k) and the remaining integral becomes a function of the K dihedral and bond angles (α_k)³⁶^,³⁷^,³⁹ and a Jacobian that depends only on the bond angles; the partition function is

$$Z^{'} = D Z = D \int_{m} \exp \left( [- E_{loop} ([α_{k}]) - E_{water} ([α_{k}], x^{N})] / k_{B} T \right) d α_{1} \dots d α_{K} \, d x^{N}, \qquad (6)$$

where [α_k] = [α₁,…,α_K]. D is a product of the integral over the bond lengths and their Jacobian J. The Jacobian [Π_j sin(θ_j)] of the bond angles θ_j, which should appear under the integral, is omitted for simplicity. (In paper I we have shown that this Jacobian cancels out in entropy and free energy differences; therefore, we shall discard it from future discussions.) We assume D to be the same (i.e., constant) for different microstates of the same loop and therefore lnD cancels and can be ignored in calculations of free energy and entropy differences. The Boltzmann probability density corresponding to Z (eq 6) is (E is defined in eq 5)

$$ρ^{B} ([α_{k}], x^{N}) = \exp \{ - E ([α_{k}], x^{N}) / k_{B} T \} / Z, \qquad (7)$$

and the exact entropy S and exact free energy F (defined up to an additive constant) are

$$S = - k_{B} \int_{m} ρ^{B} ([α_{k}], x^{N}) \ln ρ^{B} ([α_{k}], x^{N}) \, d α_{1} \dots d α_{K} \, d x^{N} \qquad (8)$$

and

$$F = \int_{m} ρ^{B} ([α_{k}], x^{N}) \{ E ([α_{k}], x^{N}) + k_{B} T \ln ρ^{B} ([α_{k}], x^{N}) \} \, d α_{1} \dots d α_{K} \, d x^{N} \qquad (9)$$

It should be pointed out that the fluctuation of the exact F is zero;⁵⁹ thus (provided that the above assumptions about the bond lengths are correct) one can substitute the expression for ρ^B ([α_k], x^N) (eq 7) inside the curly brackets of eq 9 to obtain,

$$E ([α_{k}], x^{N}) + k_{B} T \ln ρ^{B} ([α_{k}], x^{N}) = - k_{B} T \ln Z = F, \qquad (10)$$

i.e. the expression in the curly brackets is constant and equal to F for any set ([α_k], x^N) within m. This means that the free energy can be obtained from any single conformation if its Boltzmann probability density is known. However, the fluctuation of an approximate free energy (i.e., which is based on an approximate probability density) is finite and it is expected to decrease as the approximation improves.⁹^,⁴⁰^,⁵⁹^–⁶¹ Because HSMC(D) provides an approximation for ρ^B ([α_k], x^N), it enables one, in principle, to estimate the free energy of the system from any single structure [Notice, however, that calculation of ρ^B ([α_k], x^N) for a single conformation depends on the entire microstate as is also evident from the HSMC(D) procedure discussed later].
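The zero-fluctuation property of eq 10 is easy to verify numerically on a small discrete system, which stands in here for the continuous loop/water system. The energies and the approximate distribution below are hypothetical. The last check uses a general property (Gibbs' inequality): averaging E + k_BT ln P^A over the exact Boltzmann distribution cannot exceed the exact F.

```python
import math

K_B = 0.0019872041          # Boltzmann constant in kcal/(mol K)
KT = K_B * 300.0            # kB T at 300 K


def free_energy_functional(energy, prob):
    """Per-configuration value E_i + kB T ln P_i (curly brackets of eq 9)."""
    return energy + KT * math.log(prob)


def mean_and_std(values, weights):
    """Weighted mean and standard deviation (weights must sum to 1)."""
    mean = sum(w * v for w, v in zip(weights, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, math.sqrt(max(var, 0.0))


energies = [0.0, 0.4, 0.9, 1.5]                 # hypothetical toy energies
boltz = [math.exp(-e / KT) for e in energies]
z = sum(boltz)
p_exact = [w / z for w in boltz]                # exact Boltzmann probabilities
p_approx = [0.4, 0.3, 0.2, 0.1]                 # hypothetical approximation

f_exact_vals = [free_energy_functional(e, p) for e, p in zip(energies, p_exact)]
f_approx_vals = [free_energy_functional(e, p) for e, p in zip(energies, p_approx)]

# Both averages are taken over the exact Boltzmann distribution.
f_exact, std_exact = mean_and_std(f_exact_vals, p_exact)
f_approx, std_approx = mean_and_std(f_approx_vals, p_exact)
```

With the exact probabilities every configuration returns the same value, −k_BT ln Z, so the standard deviation vanishes up to rounding; with the approximate probabilities the per-configuration values differ and the fluctuation is finite.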

With MD the bond stretching energy is taken into account in eq 9 (and in free energy functionals defined later) while the corresponding entropy is ignored. The contribution of this energy to the free energy becomes an additive constant if one accepts the assumptions about the stretching energy and the corresponding Jacobian made prior to eq 6. This is a very good approximation; however, if the bond stretching entropy should be considered, we have argued in paper I, II.6 that it can be estimated approximately within the framework of HSMD.

II.3. Exact future scanning procedure

HSMC(D) (as well as HS and LS) is based on the ideas of the exact scanning method where a system is constructed (from nothing) step-by-step using transition probabilities (TPs). The product of these TPs is equal to the Boltzmann probability (eqs 1 and 7) from which the entropy and free energy can be calculated. Practically, a loop/water configuration is generated by initially building a loop structure followed by the construction of a configuration of the water molecules. In this way a sample of statistically independent system configurations can be obtained.

For simplicity this construction is described for a loop consisting of M Gly residues (with dihedral and bond angles denoted α_k, 1 ≤ k ≤ 6M = K) in microstate m; the loop is surrounded by N water molecules moving within the volume defined by the sphere of radius R_cap, the template, and the loop. Starting from nothing, a conformation of the loop is built first by defining the angles α_k step-by-step using transition probabilities (TPs) and adding the related atoms;⁶² for example, the angle φ determines the coordinates of the two hydrogens connected to C^α, while the bond angle N-C^α-C′ determines the position of C′. Thus, at step k, k−1 angles α₁,···,α_k₋₁ have already been determined; these angles and the related structure (the past) are kept constant, and α_k is defined with the exact TP density ρ(α_k|α_k₋₁···α₁),

$$ρ (α_{k} \mid α_{k - 1}, \dots, α_{1}) = Z_{future} (α_{k}, \dots, α_{1}) / [Z_{future} (α_{k - 1}, \dots, α_{1})] \qquad (11)$$

where Z_future (α_k,···,α₁) is a future partition function. The term “future” indicates that the integration defining Z_future is carried out over the variables α_k₊₁,···,α_K and the 3N coordinates x^N of the water molecules, which will be determined in future steps of the build-up process. In this integration the atoms treated in the past are held fixed in their coordinates (which are determined by α₁···α_k), while α_k₊₁,···,α_K are varied in a restrictive way such that the corresponding conformations of the “future” part of the loop remain in microstate m. Thus

$$Z_{future} (α_{k}, \dots, α_{1}) = \int_{m} \exp [- E (α_{K}, \dots, α_{1}, x^{N}) / k_{B} T] \, d α_{k + 1} \dots d α_{K} \, d x^{N} \qquad (12)$$

where E (eq 5) is the total potential energy of the loop/template/water system, which also imposes the loop closure condition. The product of the TPs (eq 11) leads to the (Boltzmann) probability density of the entire loop conformation,

$$ρ_{loop}^{B} (α_{K}, \dots, α_{1}) = \prod_{k = 1}^{K} ρ (α_{k} \mid α_{k - 1}, \dots, α_{1}) . \qquad (13)$$

After the loop structure has been constructed a configuration of water molecules is generated step-by-step, where the TP density for placing water molecule k at x_k is

$$ρ_{water} (x_{k} \mid α_{K}, \dots, α_{1}, x^{k - 1}) = Z_{future} (α_{K}, \dots, α_{1}, x^{k}) / [Z_{future} (α_{K}, \dots, α_{1}, x^{k - 1})] \qquad (14)$$

where the loop conformation is kept constant, the k−1 water molecules that have already been treated are fixed at their coordinates, x^k⁻¹, and the summation in Z(x^k) is over the as yet undecided N−k+1 water molecules. (Notice that x_k denotes the 3 Cartesian coordinates of water molecule k, while x^k denotes the set of Cartesian coordinates of the k molecules 1, 2, …, k.) The Boltzmann probability density of the water is

ρ_{water}^{B} (α_{K}, \dots, α_{1}, x^{N}) = \prod_{k = 1}^{N} ρ_{water} (x_{k} ∣ α_{K}, \dots, α_{1}, x^{k - 1})

(15)

and the probability density of the loop/water configuration is,

ρ^{B} ([α_{k}], x^{N}) = ρ_{loop}^{B} ([α_{k}]) ρ_{water}^{B} ([α_{k}], x^{N}) = exp {- E ([α_{k}], x^{N}) / k_{B} T} / Z_{future} (α_{1})

(16)

where Z_future (α₁)(= Z) is the partition function of the entire loop/water system for microstate m. Because ρ^B ([α_k], x^N) is known one can obtain the free energy from any single loop/water configuration (see eq 10). In addition to S (eq 8), one can define for m “the loop entropy of mean force”, S_loop,

S_{loop} = - k_{B} \int_{m} ρ^{B} ([α_{k}]) ln ρ^{B} ([α_{k}]) d [α_{k}]

(17)

where d[α_k] ≡ dα₁ ···dα_K; S_loop is defined up to an additive constant. Extending the exact scanning procedure to side chains is straightforward.

This construction procedure (which is not feasible for a large loop/water system) provides the theoretical basis for HSMC(D). Thus, the exact scanning method is equivalent to any other exact simulation technique (in particular Metropolis MC and MD) in the sense that large samples generated by such methods lead to the same averages and fluctuations. Therefore, one can assume that a given MC or MD sample has rather been generated by the exact scanning method, which enables one to reconstruct each conformation i by calculating the TP densities that hypothetically were used to create it step-by-step; this is the basis for HSMC(D) (as well as the HS and LS methods).

II.4. The HSMC(D) method

The theory of HSMD is again described as applied to a loop consisting of M Gly residues. One starts by generating an MD sample of microstate m with water molecules; the conformations are then represented in terms of the dihedral and bond angles α_k, 1 ≤ k ≤ 6M = K, and the variability range Δα_k is calculated,

Δ α_{k} = α_{k} (max) - α_{k} (min),

(18)

where α_k(max) and α_k(min) are the maximum and minimum values of α_k found in the sample, respectively. Δα_k, α_k(max), and α_k(min) enable one to verify that the sample spans correctly the microstate m.
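Eq 18 amounts to a column-wise range over the sample. A minimal sketch follows, assuming the sample is stored as a list of conformations, each a list of K angles in degrees; the function name `variability_ranges` is hypothetical, and the sketch assumes that no angle wraps around the ±180° periodic boundary within the sample (otherwise the angles would have to be shifted first).

```python
def variability_ranges(sample):
    """Return (minima, maxima, ranges) of each angle over the sample (eq 18).

    sample: list of conformations, each a sequence of K angles in degrees.
    Assumes no angle crosses the periodic boundary within the sample.
    """
    columns = list(zip(*sample))          # one tuple of values per angle k
    minima = [min(col) for col in columns]
    maxima = [max(col) for col in columns]
    ranges = [hi - lo for lo, hi in zip(minima, maxima)]
    return minima, maxima, ranges
```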

System configuration ([α_k], x^N) (denoted i for brevity) is reconstructed in two stages, where the loop structure is reconstructed first followed by the reconstruction of the water configuration. Thus, at step k of stage 1, k−1 angles α_k₋₁···α₁ have already been reconstructed and the TP density of α_k, ρ(α_k|α_k₋₁,···, α₁) is calculated from an MD sample of n_f conformations (generated in Cartesian coordinates), where the entire future of the loop and water is moved [i.e., the loop atoms defined by α_k, ···, α_K and the water coordinates (x^N)] while the past (the loop atoms defined by α₁, ···, α_k₋₁) are held fixed at their values in conformation i. A small segment (bin) δα_k is centered at α_k(i) and the number of visits of the future chain to this bin during the simulation, n_visit, is calculated; one obtains,

ρ_{loop} (α_{k} ∣ α_{k - 1}, \dots, α_{1}) \approx ρ^{HS} (α_{k} ∣ α_{k - 1}, \dots, α_{1}) = n_{visit} / [n_{f} δ α_{k}]

(19)

where ρ^HS(α_k|α_k₋₁,···,α₁) becomes exact for very large n_f (n_f → ∞) and a very small bin (δα_k → 0). This means that in practice ρ^HS(α_k|α_k₋₁,···,α₁) will be somewhat approximate due to insufficient future sampling (finite n_f), a relatively large bin size δα_k, an imperfect random number generator, etc. This equation is suitable for HSMC. However, for practical reasons, with HSMD a pair of angles should be treated simultaneously, where each pair consists of a dihedral angle and its successive bond angle (e.g., φ and the bond angle N-C^α-C′). Thus, at each step both α_k and α_k₊₁ are considered and n_visit is increased by 1 only if α_k and α_k₊₁ are located within the limits of δα_k and δα_k₊₁, respectively; therefore eq 19 becomes

ρ^{HS} (α_{k + 1}, α_{k} ∣ α_{k - 1}, \dots, α_{1}) = n_{visit} / [n_{f} δ α_{k} δ α_{k + 1}],

(20)

where in paper I we have shown that δα_k and δα_k₊₁ can be optimized. Notice that with HSMD the future loop conformations generated by MD at each step k remain in general within the limits of m, which is represented by the analyzed MD sample. The corresponding probability density is

ρ^{HS} (α_{K}, \dots, α_{1}) = \prod_{k = 1}^{K} ρ^{HS} (α_{k + 1}, α_{k} ∣ α_{k - 1}, \dots, α_{1}),

(21)

where in the product only odd values of k are used. ρ^HS ([α_k]) defines an approximate entropy functional, denoted $S_{loop}^{A}$ which can be shown using Jensen’s inequality to constitute a rigorous upper bound for S_loop (eq 17),¹⁰

S_{loop}^{A} = - k_{B} \int_{m} ρ^{B} ([α_{k}]) ln ρ^{HS} ([α_{k}]) d [α_{k}] .

(22)

Here $ρ_{loop}^{B}$ (eq 13) is the Boltzmann probability density of [α_k] in m. Thus, for microstate m, $S_{loop}^{A}$ can be estimated from a Boltzmann sample (of size n_s) generated by MD using the arithmetic average,

{\bar{S}}_{loop}^{A} (m) = - \frac{k_{B}}{n_{s}} \sum_{t = 1}^{n_{s}} ln ρ^{HS} (t, m)

(23)

where ρ^HS (t, m) is the value of ρ^HS ([α_k]) obtained for configuration t of the sample of m. $S_{loop}^{A}$ constitutes a measure of the loop flexibility of a pure geometrical character, i.e. with no direct dependence on the interaction energy. We denote the difference in the loop entropies obtained for a specific set of parameters by $Δ S_{loop}^{A}$ but the converged difference, which is expected to be exact within the statistical errors, is denoted by ΔS_loop

Δ S_{loop} = Δ S_{loop}^{A} = {\bar{S}}_{loop}^{A} (m) - {\bar{S}}_{loop}^{A} (n)

(24)
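Eqs 23 and 24 reduce to simple sample averages once ln ρ^HS is available for each configuration. The sketch below assumes that these logarithms have already been obtained by the reconstruction procedure and are passed in as plain lists; the function names and the value of k_B in kcal/(mol K) are supplied here for illustration. Because ρ^HS is a density in the angles, the absolute value depends on the angular units through an additive constant, which cancels in the difference.

```python
K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def ts_loop_estimate(log_rho_hs, temperature=300.0):
    """T * S^A_loop (eq 23) from ln rho^HS of each of the n_s sample configurations."""
    n_s = len(log_rho_hs)
    return -K_B * temperature * sum(log_rho_hs) / n_s


def ts_loop_difference(log_rho_m, log_rho_n, temperature=300.0):
    """T * Delta S^A_loop between microstates m and n (eq 24)."""
    return (ts_loop_estimate(log_rho_m, temperature)
            - ts_loop_estimate(log_rho_n, temperature))
```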

In the same way one calculates the arithmetic averages of the energies over the n_s system configurations

{\bar{E}}_{loop} (m) = \frac{1}{n_{s}} \sum_{t = 1}^{n_{s}} E_{loop} (t, m)

(25)

where E_loop (t, m) is the loop-loop and loop-template interaction energy of loop conformation t. The corresponding difference is,

Δ E_{loop} = {\bar{E}}_{loop} (m) - {\bar{E}}_{loop} (n)

(26)

One can also define a free energy difference, ΔF_loop for the loop,

Δ F_{loop} = Δ E_{loop} - T Δ S_{loop}

(27)

To reconstruct the water configuration one can use the HSMC(D) procedure for fluids developed previously, which would lead to $ρ_{water}^{HS} ([α_{k}], x^{N})$ [as an approximation for $ρ_{water}^{B} ([α_{k}], x^{N})$ (eq 15)] and then to the contribution of the water configuration to the free energy $F_{water} ([α_{k}], x^{N}) = E_{water} ([α_{k}], x^{N}) + k_{B} T ln ρ_{water}^{HS} ([α_{k}], x^{N})$ . However, this procedure for fluids has not been optimized yet and it is relatively time consuming. Alternatively, one can obtain F_water([α_k], x^N) by a thermodynamic integration (TI) procedure, where the water molecules are integrated from an ideal gas to their TIP3P form within the spherical volume (R_cap) and in the presence of constant template and loop structure; however, this would be a complex procedure as well. Since the free and bound loop structures have the same template and because we are mainly interested in free energy differences, we have applied a much simpler TI procedure based on the same reference state for the two microstates. In this state the water-water and water-template interactions are preserved but the (fixed) loop structure [α_k] does not “see” the surrounding waters, i.e. the loop-water interactions (electrostatic and Lennard-Jones) are switched off. These interactions are gradually increased (from zero) during an MD simulation of water [while the loop structure remains fixed at ([α_k])]; for [α_k] of microstate m one obtains from the integration $F_{water}^{TI} ([α_{k}], m)$ which is then averaged over the n_s sample configurations (see eq 23),

{\bar{F}}_{water}^{TI} (m) = \frac{1}{n_{s}} \sum_{t = 1}^{n_{s}} F_{water}^{TI} (t, m)

(28)

and the difference in the free energy of water between m and n denoted ΔF_water is

Δ F_{water} = {\bar{F}}_{water}^{TI} (m) - {\bar{F}}_{water}^{TI} (n)

(29)

In the same way one calculates the arithmetic averages of the water energy over the n_s system configurations

{\bar{E}}_{water} (m) = \frac{1}{n_{s}} \sum_{t = 1}^{n_{s}} E_{water} (t, m)

(30)

where E_water (t, m) is the water-water, water-template, and water-loop interaction energy of system configuration t. The corresponding differences are,

Δ E_{water} = {\bar{E}}_{water} (m) - {\bar{E}}_{water} (n)

(31)

and

Δ E_{total} = Δ E_{water} + Δ E_{loop}

(32)

The difference in the total free energy between microstates m and n is

Δ F_{total} = Δ E_{loop} - T Δ S_{loop} + Δ F_{water}

(33)

The difference in the water entropy between m and n is

T Δ S_{water} = T [{\bar{S}}_{water}^{TI} (m) - {\bar{S}}_{water}^{TI} (n)] = Δ E_{water} - Δ F_{water},

(34)

where the corresponding difference in the total entropy is

T Δ S_{total} = T Δ S_{water} + T Δ S_{loop} .

(35)

It should be pointed out again that the dependence of ΔF_total (eq 33) (and TΔS_total, eq 35) on the bond stretching energy is through E_loop while this interaction is ignored in $S_{loop}^{A}$ (eqs 22 and 23). However, under the assumptions leading to eq 6 this is not expected to affect differences in free energy which are our main interest; see also paper I (II.6).

II.5. The reconstruction procedure with HSMD

The HSMD reconstruction procedure needs further discussions. Thus, the MD simulation of the future chain at step k starts from the reconstructed conformation i, and every g fs the current conformation is considered, where the n_init initial considered conformations are discarded for equilibration. The next n_f (considered) future conformations are represented in internal coordinates and their contribution to n_visit (eq 20) is calculated. An essential issue is how to guarantee an adequate coverage of microstate m, i.e., that the future chains will span its entire region (in particular the side chain rotamers) while avoiding their “overflow” to neighboring microstates, conditions that will occur for a too small and a too large n_f, respectively. (Note that even at step k, where the “past” of the loop is kept fixed, the (future) unfixed part can leave the microstate during long MD simulations. Such “overflow” is more likely to happen for small residues such as Gly and for small k.) To be able to control the extent of coverage of m the following procedure has been applied: n_f has been divided into several (j) shorter repetitive procedures (“units”), each based on n′_f < n_f conformations where n_f=jn′_f, and each unit starts from the reconstructed structure i with a different set of velocities followed by equilibration of size, n_init; obviously, one would seek to determine the minimal values for n′_f, j, and n_init, which would keep the future chains within m while allowing its adequate sampling. A similar procedure was first suggested by Brady & Karplus⁶³ within the framework of the quasi-harmonic method, and was also used in implementations of the LS method to peptides.²⁶^,⁴¹

In paper I (II.6) we have discussed (and applied) several measures which enable one to estimate the extent of coverage of the reconstructed samples of the future chains. Because the present Δα_k values are smaller than those in paper I (see Table 1) and we use the same n_f values, we did not consider it necessary to apply these measures here. However, it should be emphasized again that we are interested in an entropy difference, $Δ S_{m, n}^{A}$ between two microstates, where $Δ S_{m, n}^{A}$ is considered to be reliable (i.e., to lead to ΔS_loop, eq 24) if its results are found to be stable for a large range of the parameters n_init, n′_f, and j. From now on we shall replace in most cases n′_f by the word unit.

TABLE 1.

Two sets of differences Δα_k (in degrees) between the minimum and maximum values of dihedral angles in the free and bound samples^a

	Explicit solvent				Implicit solvent
	Free		Bound		Free		Bound
Residue	Δφ	ΔΨ	Δφ	ΔΨ	Δφ	ΔΨ	Δφ	ΔΨ
Gly 1	47	90	75	121	76	153	92	148
His 2	86	104	107	70	139	130	125	105
Gly 3	83	99	173	177	175	124	80	95
Ala 4	133	87	96	132	131	94	143	288
Gly 5	73	88	95	88	107	100	199	360
Gly 6	133	98	116	81	126	109	285	267
Ser 7	65	59	84	71	83	64	243	109
χ¹ (His)	78		89		55		53
χ²(His)	107		106		130		108
χ¹ (Ser)	166		163		317		321


Δα_k are defined in eq 18. The explicit solvent results were calculated in the present study based on a sample of n_s=600 configurations of the loop and 70 TIP3P water molecules using the AMBER force field. The implicit solvent results have been obtained in paper I¹⁶ from the entire sample of 500 conformations using the AMBER force field and the GB/SA implicit solvent.

II.6. The Local States (LS) and the quasi-harmonic (QH) methods

With the LS method⁴^–⁶ (applied to an N-residue polyglycine with 6N=K backbone angles, α_k) the ranges Δα_k (eq 18) are divided into l equal segments, where l is the discretization parameter. These segments are denoted by ν_k (ν_k = 1, ···, l), where an angle α_k is represented by the segment ν_k to which it belongs, and a conformation i is expressed by the corresponding vector of segments [ν₁(i), ν₂(i), …, ν_K(i)]. The TP ρ(α_k|α_k₋₁,···,α₁) can be estimated only approximately by n(ν_k, ···, ν_{k−b})/{n(ν_{k−1}, ···, ν_{k−b})[Δα_k/l]}, where n(ν_k, ···, ν_{k−b}) is the number of times the local state [i.e., the vector (ν_k, ···, ν_{k−b})] appears in the sample; b is the correlation parameter. One obtains the approximate probability density, $ρ_{i} (b, l) = \prod_{k = 1}^{K} p (ν_{k} ∣ ν_{k - 1}, \dots, ν_{k - b}) / (Δ α_{k} / l)$; the larger b and l are, the better the approximation (given enough statistics). ρ_i (b, l) defines a rigorous upper bound, $S_{loop}^{A}$ (eq 22), where ρ_i (b, l) replaces ρ^HS; $S_{loop}^{A}$ can be estimated by eq 23.
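The counting scheme of the LS method can be sketched as follows. This is a simplified illustration rather than the implementation used for Table 2: it assumes the sample is a list of conformations (lists of K angles), that the minima and ranges of eq 18 are supplied and nonzero, and that segments are indexed from 0 to l−1; the function name `local_states_log_density` is hypothetical. For the first b angles the local state is simply truncated at the start of the chain.

```python
import math
from collections import Counter


def local_states_log_density(sample, minima, ranges, b=1, l=10):
    """Return ln rho_i(b, l) for every conformation i in the sample."""
    n_angles = len(ranges)

    def segment(value, k):
        # Index (0..l-1) of the segment of angle k that contains `value`;
        # the maximum value is assigned to the last segment.
        idx = int((value - minima[k]) / ranges[k] * l)
        return min(idx, l - 1)

    discrete = [tuple(segment(c[k], k) for k in range(n_angles)) for c in sample]

    # Occurrence counts of local states with and without the current segment.
    with_current, past_only = Counter(), Counter()
    for v in discrete:
        for k in range(n_angles):
            start = max(0, k - b)
            with_current[(k, v[start:k + 1])] += 1
            past_only[(k, v[start:k])] += 1

    log_densities = []
    for v in discrete:
        total = 0.0
        for k in range(n_angles):
            start = max(0, k - b)
            p = with_current[(k, v[start:k + 1])] / past_only[(k, v[start:k])]
            total += math.log(p) - math.log(ranges[k] / l)
        log_densities.append(total)
    return log_densities
```

The returned logarithms can be averaged as in eq 23 to give the LS estimate of the entropy.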

With the QH method introduced by Karplus and Kushick,³⁹ the Boltzmann probability density of structures defining a microstate is approximated by a multivariate Gaussian. Thus,

S_{loop}^{QH} (m) = (k_{B} / 2) {N + ln [{(2 π)}^{N} Det (σ)]}

(36)

where the covariance matrix, σ, is obtained from a local MD (MC) sample and N is (usually) the number of internal coordinates. Clearly, S^QH constitutes an upper bound for S since correlations higher than quadratic are neglected; also, anharmonic contributions are ignored, and QH is not suitable for diffusive systems such as water. While QH has been used extensively over the years, a systematic study of its performance has been carried out only recently by Gilson’s group,⁶⁴ who have found that the performance of QH deteriorates significantly in Cartesian coordinates and when applied to more than one microstate.³⁵
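Eq 36 can be evaluated directly from a sample covariance matrix. The following sketch uses only the Python standard library and assumes that the covariance matrix is symmetric positive definite, so that ln Det(σ) can be obtained from a Cholesky factorization; the function names are introduced here for illustration. The result is returned in units of k_B and depends on the units of the coordinates through an additive constant.

```python
import math


def covariance(sample):
    """Unbiased sample covariance matrix of a list of coordinate vectors."""
    n, dim = len(sample), len(sample[0])
    mean = [sum(x[i] for x in sample) / n for i in range(dim)]
    return [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in sample) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def log_det_spd(matrix):
    """ln Det of a symmetric positive-definite matrix via Cholesky factorization."""
    dim = len(matrix)
    chol = [[0.0] * dim for _ in range(dim)]
    for i in range(dim):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(dim))


def qh_entropy_over_kb(cov):
    """S^QH / k_B from eq 36, with N equal to the dimension of the covariance matrix."""
    n = len(cov)
    return 0.5 * (n + n * math.log(2.0 * math.pi) + log_det_spd(cov))
```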

III. Results and discussion

III.1. Simulation details

We carried out two MD simulations at T=300 K starting from the free and bound PDB structures (which were capped by 70 TIP3P waters); by considering a structure every 0.5 ps these simulations led to stable samples of n_s=600 conformations. These simulations and the reconstruction simulations (for generating the future samples) were carried out with the velocity-Verlet algorithm⁶⁵ based on a time step of 2 fs, where bonds involving hydrogens (including those of water) were frozen to their ideal values by the RATTLE algorithm;⁶⁵ the Berendsen⁶⁵ heat bath controlled the temperature. Cut-offs on long-range interactions were not imposed, and in the reconstruction process a structure was added to the sample every g=10 fs, where the n_init=250 initial structures (2.5 ps) were discarded for equilibration. The future samples were generated for four bin sizes, δ= Δα_k/45, Δα_k/30, Δα_k/15, and Δα_k/10, centered at α_k (i.e., α_k ±δ/2) (eqs 19 and 20). If the counts of the smallest bin are smaller than 50 the bin size is increased to the next size, and if necessary to the next one, etc. In the case of zero counts, n_visit is taken to be 1; however, zero counts is a very rare event.
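The bin-counting estimate of eq 19, together with the bin-enlargement rule described above, can be sketched for a single angle as follows. This is an illustrative reading of the text, not the authors' code: it assumes the search starts from the smallest bin (Δα_k/45) and moves to the next larger one while the count is below 50, and that a zero count in the largest bin is replaced by 1. The function name and argument names are hypothetical.

```python
def hs_transition_density(future_values, alpha_i, delta_alpha,
                          divisors=(45, 30, 15, 10), min_count=50):
    """Estimate rho^HS (eq 19) for one angle from the n_f future values.

    future_values: values of alpha_k visited by the future sample.
    alpha_i:       value of alpha_k in the reconstructed conformation i.
    delta_alpha:   variability range of alpha_k (eq 18).
    Returns (density, bin_width).
    """
    n_f = len(future_values)
    for d in divisors:                      # smallest bin first
        width = delta_alpha / d
        n_visit = sum(1 for a in future_values if abs(a - alpha_i) <= width / 2)
        if n_visit >= min_count or d == divisors[-1]:
            break
    n_visit = max(n_visit, 1)               # zero counts are taken to be 1
    return n_visit / (n_f * width), width
```

For the pair version (eq 20) the same counting would be applied to both angles at once, with the product δα_k δα_k₊₁ in the denominator.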

To obtain the loop entropy, ${\bar{S}}_{loop}^{A} (m)$ (eq 23) the calculations are based on unit n′_f=250 (2.5 ps) and future sample sizes n_f =250 (j=1), 500 (j=2), 750 (j=3), and 1250 (j=5). To examine the convergence of the results for $Δ S_{loop}^{A}$ (eq 24) we have extracted from the main samples (of the free and bound microstates) two sets (pairs) of partial samples. One set, which consists of n_s=100, was obtained by selecting every 6^th conformation of the main samples; the corresponding conformations were reconstructed (as above) with n_f = 250, 500, 750, and 1250. The second partial set (which was selected in a similar way) consists of n_s=40 but its conformations were reconstructed more extensively again with unit n′_f=250 (2.5 ps) but n_f =1250 (j=5), 2500 (j=10), 5000 (j=20), and 10⁴ (j=40).

III.2. Results for the loop entropy

In Table 1 we present the values of Δα_k (eq 18) for the free and bound microstates obtained from the corresponding MD samples (of size 600). These values suggest that the two samples indeed are concentrated in conformational space, and the corresponding values for the χ angles of the two microstates are comparable. For comparison we also provide the corresponding values obtained in paper I (Table 6) for the same loop in the GB/SA implicit solvent. The present results, in most cases, are larger than those obtained in paper I for the loop in vacuum (not shown) but are smaller (and in some cases considerably smaller) than results obtained with implicit solvent as Table 1 reveals. Larger Δα_k values are expected to correlate with higher entropy and indeed the results for $T {\bar{S}}_{loop}^{A} (m)$ in Table 2 are smaller than those of paper I obtained with implicit solvent, as discussed below.

TABLE 2.

HSMD results (in kcal/mol) for the entropy, $T S_{loop}^{A}$ (eqs 22 and 23) at T=300 K for the free and bound microstates^a

		Free loop	Bound loop
Bin size	n_f (j)	T S_{loop}^{A}	T S_{loop}^{A}
Δα_k/15	250 (1)	67.18 (4)	68.72 (4)
Δα_k/15	500 (2)	66.48 (7)	67.86 (8)
Δα_k/15	750 (3)	66.17 (4)	67.58 (8)
Δα_k/15	1250 (5)	65.74 (4)	67.19 (8)
Δα_k/15	1250 (2)	69.9 (1)	70.3 (2)
Δα_k/30	250 (1)	67.04 (9)	68.61 (7)
Δα_k/30	500 (2)	66.22 (7)	67.61 (7)
Δα_k/30	750 (3)	65.77 (4)	67.15 (8)
Δα_k/30	1250 (5)	65.19 (4)	66.49 (3)
Δα_k/30	1250 (2)	69.4 (1)	69.8 (2)
Δα_k/45	250 (1)	67.03 (4)	68.60 (5)
Δα_k/45	500 (2)	66.17 (7)	67.56 (7)
Δα_k/45	750 (3)	65.69 (4)	67.08 (8)
Δα_k/45	1250 (5)	65.06 (4)	66.36 (8)
Δα_k/45	1250 (2)	69.4 (2)	69.7 (2)
TS^QH		78.6 (1)	87 (6)
TS^LS		87.4 (1)	90 (7)


The bin sizes are δ = Δα_k/l. n_f denotes the sample size of the future chains used in the reconstruction process, n_f = unit×j, where j is the number of simulations of unit size applied at each reconstruction step. Generation of the samples (of n_s=600 conformations) and their reconstruction is based on the AMBER force field¹⁷ and 70 TIP3P water molecules.¹⁹ However, the underscored results for n_f=1250 (2) (unit=650) were obtained in paper I from samples of 200 conformations using the AMBER force field and the GB/SA implicit solvation. The statistical error in the last significant digit is given in parentheses, e.g., 65.06 (4) = 65.06 ± 0.04. S^QH (eq 36) is the quasi-harmonic entropy and S^LS is $S_{loop}^{A}$ obtained by the local states method using b=2 and the discretization parameter, l=10 (see II.6); these results were obtained from larger samples (see text for details). The entropy is defined up to an additive constant that is the same for both microstates.

Table 2 contains results for the entropy, $T {\bar{S}}_{loop}^{A} (m)$ (eq 23) for the free and bound microstates. The statistical errors were obtained from the fluctuations and results obtained for partial samples. For comparison we have added in Table 2 results for n_f=1250 (j=2; these appear with an underscore) obtained in paper I for a loop in implicit solvent. These results are larger (by ~4.2 and ~3.2 kcal/mol for the free and bound microstates) than the results for n_f=1250 obtained with explicit water, which is in accord with Δα_k(implicit) being larger in most cases than Δα_k(explicit) in Table 1.

Being an upper bound, one would expect ${\bar{S}}_{loop}^{A} (m)$ to decrease with decreasing bin size and increasing n_f, an expectation which is fully satisfied. Also, results for the same n_f for Δα_k/30 and Δα_k/45 are almost converged; thus, $T {\bar{S}}_{loop}^{A} (Δ α_{k} / 30, n_{f} = 1250) - T {\bar{S}}_{loop}^{A} (Δ α_{k} / 45, n_{f} = 1250) = 0.13 kcal / mol$ for both microstates, which is close to the statistical errors. On the other hand, results of the same bin (obtained for different n_f values) are not converged. The computer time required to reconstruct a loop structure capped with 70 water molecules using n_f=500 is ~1.6 h CPU on a 2.1 GHz Athlon processor, which is smaller than the 3.6 h CPU required for reconstructing a structure in implicit solvent in paper I.

The HSMD results for the entropy are compared in the table to those obtained with the LS and QH methods from larger MD samples of 25,000 loop-water configurations. These samples were obtained from 2.5 ns trajectories where a configuration is retained every 0.1 ps (where the 20 ps initial trajectory is discarded for equilibration). The central values of $T S_{loop}^{QH}$ (eq 36) exceed the HSMD results (for n_s =600) by ~14 and ~20 kcal/mol for the free and bound microstates, respectively, while the corresponding LS results (eq 23, using b=1, l=10, see II.6) are even higher, exceeding the HSMD results by ~22 and ~24 kcal/mol. These elevated results are in accord with both $S_{loop}^{QH}$ and $S_{loop}^{LS}$ being upper bounds; however, they might also be affected by the longer MD trajectories generated for QH and LS than for HSMD, as discussed in section I.4. S^LS > S^QH was also found in previous studies.¹³^–¹⁵

III.3. Differences in loop entropy

In paper I and ref 16 converging results for TΔS^A were obtained already for unit 2.5 ps and n_f ≥ 500 (and even for smaller n_f values using optimized bins). Because the Δα_k values for a loop in explicit water are smaller than those obtained with implicit solvent (Table 1), we have applied unit = 2.5 ps and n_f = 250–1250 also in the present study. To examine the convergence of the results for $Δ S_{loop}^{A}$ for smaller samples, we present in Table 3 also results for n_s=200, 100, and 40, as discussed in III.1. The results in the table are given for the two smallest bins of Δα_k/30 and Δα_k/45.

TABLE 3.

Entropy differences, $T Δ S_{loop}^{A}$ (in kcal/mol) at T=300 K between the free and bound microstates obtained by HSMD for different samples in explicit water^a

		T Δ S_{loop}^{A}				T Δ S_{loop}^{A}
Bin size	n_f	n_s = 600	n_s = 200	n_s = 100	n_f	n_s = 40
Δα_k/30	250	−1.6	−1.5 (1)	−1.1 (2)	1250	−0.9 (3)
Δα_k/30	500	−1.4	−1.3 (1)	−1.0 (2)	2500	−0.9 (4)
Δα_k/30	750	−1.4	−1.4 (1)	−1.1 (2)	5000	−0.9 (2)
Δα_k/30	1250	−1.3	−1.3 (1)	−0.9 (2)	10⁴	−1.0 (2)
Δα_k/45	250	−1.6	−1.5 (1)	−1.1 (2)	1250	−0.9 (3)
Δα_k/45	500	−1.4	−1.3 (1)	−1.0 (2)	2500	−0.9 (3)
Δα_k/45	750	−1.4	−1.4 (1)	−1.1 (2)	5000	−0.9 (2)
Δα_k/45	1250	−1.3	−1.3 (1)	−0.9 (2)	10⁴	−1.0 (2)


$T Δ S_{loop}^{A}$ is defined in eq 24 and its results are given only for the two smallest bins, δ = Δα_k/30 and δ = Δα_k/45, using unit=2.5 ps. The table consists of two parts. The results in the left-hand part are based on 250 ≤ n_f ≤ 1250 and are presented for the entire (two) samples (n_s=600), for the samples’ first 200 conformations, and for n_s=100 by considering every 6th conformation of the entire sample. The results in the right-hand part are based on 1250 ≤ n_f ≤ 10⁴ using samples of n_s=40 by considering every 15th conformation of the entire sample. The statistical error is defined in Table 2; for n_s=600 it is smaller than ±0.1. Results for TΔS^QH and TΔS^LS are not given due to their low accuracy (see Table 2). All calculations were carried out with the AMBER96 force field and 70 TIP3P water molecules.

The statistical errors of the results for $T Δ S_{loop}^{A} (n_{s} = 600)$ are not larger than ±0.06 kcal/mol and to simplify the comparison are not presented in the table. All the results for n_f ≥ 500 converge to −1.3 kcal/mol within ±0.1 kcal/mol, and even those for n_f =250 (−1.6 kcal/mol) deviate only by 0.3 – 1.3 kcal/mol; furthermore, the results for Δα_k/15, which are not provided in Table 3, are very good, −1.4 and −1.5 kcal/mol. This extent of convergence is comparable to that obtained in previous HSMD studies.¹⁵^,¹⁶ Notice that the results for the first 200 conformations of the samples are equal, within the statistical errors, to those based on n_s=600 (while providing a factor of 3 reduction in computer time). On the other hand, the results obtained for n_s =100 and 40 are higher than the n_s=600 values by 0.3–0.4 kcal/mol. Still the merit of such calculations (that might provide further reduction in computer time) depends on the errors caused by the other components, i.e., the total energy and the water contribution to the entropy. It should be pointed out that for a loop in implicit solvent (paper I) $T Δ S_{loop}^{A} = + 0.3 \pm 0.1$ , while in explicit water the loop entropy of the free microstate is lower than that of the bound microstate.

This convergence of entropy differences stems from the cancellation (in $T Δ S_{loop}^{A}$ ) of approximately equal systematic errors in $S_{loop}^{A} (free)$ and $S_{loop}^{A} (bound)$ as discussed in detail in section II.10 of paper I. Thus, Table 2 shows that the worst approximations that still lead to a good $T Δ S_{loop}^{A}$ value differ from the best ones by $T S_{loop}^{A} (m) (Δ α_{k} / 15, n_{f} = 250) - T S_{loop}^{A} (m) (Δ α_{k} / 45, n_{f} = 1250) = 2.1 and 2.4 kcal / mol$ for the free and bound microstates, respectively; these differences constitute lower bounds because the correct TS_loop values might be significantly smaller than $T S_{loop}^{A} (m) (Δ α_{k} / 45, n_{f} = 1250)$ . The large errors in the results for LS and QH do not allow calculating meaningful differences.
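The cancellation of systematic errors can be illustrated by recomputing the free−bound differences from the explicit-water entries of Table 2. The short sketch below only transcribes the tabulated central values (in kcal/mol); the dictionary layout and function names are chosen here for illustration.

```python
# T*S^A_loop (kcal/mol) from Table 2, explicit water: bin -> n_f -> (free, bound)
TABLE2 = {
    "1/15": {250: (67.18, 68.72), 500: (66.48, 67.86),
             750: (66.17, 67.58), 1250: (65.74, 67.19)},
    "1/30": {250: (67.04, 68.61), 500: (66.22, 67.61),
             750: (65.77, 67.15), 1250: (65.19, 66.49)},
    "1/45": {250: (67.03, 68.60), 500: (66.17, 67.56),
             750: (65.69, 67.08), 1250: (65.06, 66.36)},
}


def free_minus_bound(table):
    """T*Delta S^A_loop (eq 24) for every bin size and n_f, rounded to 0.1 kcal/mol."""
    return {b: {n_f: round(free - bound, 1) for n_f, (free, bound) in rows.items()}
            for b, rows in table.items()}


def systematic_shift(table):
    """Worst (1/15, n_f=250) minus best (1/45, n_f=1250) absolute values."""
    worst, best = table["1/15"][250], table["1/45"][1250]
    return round(worst[0] - best[0], 1), round(worst[1] - best[1], 1)
```

While the absolute values shift by more than 2 kcal/mol between the coarsest and finest approximations, the differences for the two smallest bins stay between −1.6 and −1.3 kcal/mol, in line with the n_s = 600 column of Table 3.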

III.4. Thermodynamic integration of water

As described earlier, in the TI process the interaction energy [electrostatic and Lennard Jones (LJ)] between a fixed loop structure and the (moving) water molecules is decreased gradually to zero (rather than increased from zero) at constant T and V, where the water-water and water-template potential energy is unchanged. For the (LJ) potential we have used the shifted scaling potential, introduced by Zacharias et al.,⁶⁶

ϕ (r_{i j}, λ) = 4 λ ε [\frac{σ^{12}}{{(r_{i j}^{2} + δ (1 - λ))}^{6}} - \frac{σ^{6}}{{(r_{i j}^{2} + δ (1 - λ))}^{3}}],

(37)

where the shift parameter, δ=2 Å², prevents the divergence of the potential (and its derivative) at small pair separations; a similar scaling function is used for the electrostatic interactions. The free energy derivative with respect to λ, ∂F/∂λ, is

\frac{\partial F}{\partial λ} = {〈 \frac{\partial E (x^{N}, λ)}{\partial λ} 〉}_{λ},

(38)

where the derivative of the energy is calculated analytically. The integration with respect to λ is carried out by dividing the range [1,0] into 16 equal integration bins Δ λ_i. The (λ=1 → λ=0) integration of the electrostatic interactions is carried out first (in the presence of intact LJ interactions) followed by a λ =1→0 integration of the LJ interactions. Thus, the entire two-stage process is based on 32 ∂F/∂λ_i integration steps.
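The shifted scaling potential of eq 37, its analytical λ derivative, and the summation over integration bins can be sketched as follows. The quadrature rule is not specified in the text; the sketch assumes that one average 〈∂E/∂λ〉 is available per bin and that each is weighted by an equal Δλ, which is an assumption made here for illustration, as are the function names and the numerical parameter values used in any call.

```python
def soft_core_lj(r, lam, eps, sigma, delta=2.0):
    """Shifted scaling Lennard-Jones potential of eq 37 (r in Angstrom, delta in Angstrom^2)."""
    s = r * r + delta * (1.0 - lam)
    return lam * 4.0 * eps * (sigma ** 12 / s ** 6 - sigma ** 6 / s ** 3)


def d_soft_core_lj_dlam(r, lam, eps, sigma, delta=2.0):
    """Analytical derivative of eq 37 with respect to lambda."""
    s = r * r + delta * (1.0 - lam)
    bracket = sigma ** 12 / s ** 6 - sigma ** 6 / s ** 3
    # ds/dlam = -delta, hence d(bracket)/dlam = delta*(6 sigma^12/s^7 - 3 sigma^6/s^4)
    d_bracket = delta * (6.0 * sigma ** 12 / s ** 7 - 3.0 * sigma ** 6 / s ** 4)
    return 4.0 * eps * (bracket + lam * d_bracket)


def integrate_ti(mean_derivatives, lam_start=1.0, lam_end=0.0):
    """Free energy change as the sum over bins of <dE/dlam>_i * dlam_i (eq 38), equal bins."""
    dlam = (lam_end - lam_start) / len(mean_derivatives)
    return sum(mean_derivatives) * dlam
```

At λ = 1 the function reduces to the standard Lennard-Jones form, and for λ < 1 the shift δ(1 − λ) keeps both the potential and its derivative finite as r → 0.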

The MD simulation consists of a 2 fs integration step, where every 20 fs the current water configuration is added to the sample. For each (Δλ_i) step the initial simulation (5 ps) is used for equilibration and is thus discarded; the following 20 ps (1000 configurations) are used for evaluating <∂F/∂λ_i>. Notice that in spite of the advanced scaling function (eq 37), in the last steps of the LJ integration (i.e., λ close to zero) the results always increased strongly; therefore, we have adopted the results integrated up to λ=0.25. For a single loop structure this free energy integration requires ~5 h CPU on a 2.1 GHz Athlon processor. As shown in Table 4, the free energy integrations over the two samples have given ${\bar{F}}_{water}^{TI} (m) \sim 31.2$ and 35.5 kcal/mol (eq 28) for the free and bound microstates, respectively, thus leading to ΔF_water = −4.4 ± 0.7 kcal/mol (eq 29), i.e., the water provides higher stability to the free microstate. To check whether computer time can further be decreased, we also calculated ${\bar{F}}_{water}^{TI} (m)$ by considering the contribution of only 100 configurations of each sample (i.e., every 6^th configuration was considered) obtaining ΔF_water = −5.0 kcal/mol, which is within the error bars of the result based on the entire samples, while computer time is reduced by a factor of 6.

TABLE 4.

Energy and free energy averages for the loop and water obtained from the samples (n_s=600) of the free and bound microstates at T=300 K^a

	Ē_loop(m)	Ē_water(m)	Ē_total(m)	{\bar{F}}_{water}^{TI} (m)
Free	−108.3 ± 2	−1021.9 ± 2	−1130.8 ± 0.5	31.2 ± 0.5
Bound	−106.6 ± 0.7	−1016.5 ± 1.5	−1123.0 ± 1.5	35.5 ± 0.4
	ΔE_loop	ΔE_water	ΔE_total	ΔF_water
Free-Bound	−1.7 ± 0.9	−5.4 ± 2.5	−7.2 ± 1.2	−4.4 ± 0.7


Ē_loop(m) (eq 25) is the loop-loop and loop-template average energy for microstate m, Ē_water(m) (eq 30) is the water-water, water-loop, and water-template average energy for m, and Ē_total(m) is their sum. ${\bar{F}}_{water}^{TI} (m)$ (eq 28) is the average free energy of water for fixed loop structures. The corresponding differences are ΔE_loop (eq 26), ΔE_water (eq 31), ΔE_total (eq 32), and ΔF_water (eq 29). All results are in kcal/mol, where ${\bar{F}}_{water}^{TI} (m)$ is defined up to an additive constant which is the same for both microstates.

It should be pointed out that unlike an NVT system of pure water under periodic boundary conditions, the water system here is not homogeneous. Computer graphics has shown that the water molecules cover the loop and parts of the template while the outer region of the spherical volume is predominantly vacant. Indeed, a crude calculation shows that the empty volume is ~5000 Å³, which would require ~160 water molecules to obtain the experimental density of water. Thus, crevices in the template might remain empty or become occupied by waters for long simulation time, which leads to increased fluctuations of thermodynamic parameters. Therefore, a systematic improvement in the integration parameters (i.e., using up to 64 Δλ_i steps and longer simulations for each TI step) has not decreased these fluctuations. We expect this picture to improve as the number of waters, N, increases and as the effect of various minimalist models is studied. We have already carried out preliminary simulations with N=120, but could not reach definite conclusions due to the dependence of the results on the parameters r_shift and R_cap. This problem will be studied systematically in the future by considering a larger template, where using r_shift might not be needed, because the percolation of water will be avoided.
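The order of magnitude of the crude estimate quoted above can be reproduced from the bulk density of water. The sketch assumes a density of 0.997 g/cm³ and a molar mass of 18.015 g/mol, values supplied here and not taken from the paper; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def waters_for_volume(volume_A3, density_g_cm3=0.997, molar_mass=18.015):
    """Number of water molecules that fill volume_A3 (cubic Angstrom) at bulk density."""
    molecules_per_cm3 = density_g_cm3 / molar_mass * AVOGADRO
    return volume_A3 * molecules_per_cm3 * 1e-24  # 1 cubic Angstrom = 1e-24 cm^3
```

With these assumed constants an empty volume of 5000 Å³ corresponds to somewhat more than 160 molecules, the same order as the ~160 quoted in the text.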

In Table 4 we provide results for the free and bound microstates obtained for Ē_loop(m) (eq 25), Ē_water(m) (eq 30), their sum, Ē_total(m), and ${\bar{F}}_{water}^{TI} (m)$ (eq 28); we also provide the corresponding differences (free-bound), where the errors were estimated from results obtained for partial samples. It is of interest to point out that the results for Ē_loop(m) ~ −108 and −106 for the free and bound microstates are in the same range of ~ −137 and −99 kcal/mol, respectively, obtained in paper I for a loop in vacuum; however, the difference here ΔE_loop = −1.7 is much smaller than in paper I (~ −38 kcal/mol) due to the effect of water. This low ΔE_loop (eq 26) together with a small entropy contribution, TΔS_loop = −1.3, leads to ΔF_loop = −0.4 ± 1 kcal/mol (eq 27), i.e., the loop contributes very little to the higher stability of the free microstate (see Table 5 where all the various differences are summarized).

TABLE 5.

Summary of energy, entropy, and free energy differences (in kcal/mol) for the free and bound microstates at T=300 K^a

ΔF_water	ΔE_water	TΔS_water	ΔE_loop	TΔS_loop	ΔF_loop
−4.4 ± 0.7	−5.4 ± 2.5	−1.0 ± 2	−1.7 ± 0.9	−1.3 ± 0.2	−0.4 ± 1
	ΔF_total	ΔE_total	TΔS_total
	−4.8 ± 1	−7.2 ± 1.2	−2.3 ± 2

^a Differences (free-bound) in the following quantities: free energy of water, ΔF_water (eq 29), energy of water, ΔE_water (eq 31), energy of loop, ΔE_loop (eq 26), entropy of water, TΔS_water (eq 34), entropy of loop, TΔS_loop (eq 24), total free energy, ΔF_total (eq 33), total energy, ΔE_total (eq 32), and total entropy, TΔS_total (eq 35).

Thus, the relatively high difference, ΔE_total = −7.2 kcal/mol, is contributed mainly by water, ΔE_water = −5.4 kcal/mol (eq 31, and Tables 4 and 5). This energy difference for water is slightly counterbalanced by TΔS_water = −1.0 kcal/mol (Table 5), leading to ΔF_water = −4.4 kcal/mol. A similar effect is seen (Table 5) for the total values (due to the small contributions of the loop discussed above), i.e., ΔE_total = −7.2 and TΔS_total = −2.3 kcal/mol give ΔF_total = −4.8 kcal/mol.
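The entries of Table 5 can be cross-checked with the standard relation ΔF = ΔE − TΔS applied to each row. The following is a minimal sketch; the dictionary layout and names are illustrative, and only the central values from Table 5 are used (the quoted error bars are ignored).

```python
# Central values (kcal/mol, free minus bound) copied from Table 5:
# (delta E, T * delta S, reported delta F)
table5 = {
    "water": (-5.4, -1.0, -4.4),
    "loop": (-1.7, -1.3, -0.4),
    "total": (-7.2, -2.3, -4.8),
}


def free_energy_difference(delta_e, t_delta_s):
    """Free energy difference from energy and entropy differences."""
    return delta_e - t_delta_s


residuals = {}
for name, (delta_e, t_delta_s, reported) in table5.items():
    computed = free_energy_difference(delta_e, t_delta_s)
    # residual = recomputed value minus the value reported in the table
    residuals[name] = computed - reported
    print(f"{name:5s}  dE - TdS = {computed:+.1f}   reported dF = {reported:+.1f}")
```

The water and loop rows reproduce the reported values, while the total row differs by 0.1 kcal/mol, which is within the rounding of the tabulated numbers and far below the quoted statistical errors.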

Our results also show that ΔE_total = −7.2 kcal/mol correctly predicts the higher stability of the free microstate, and that this value is not significantly different from ΔF_total = −4.8 kcal/mol, the correct measure of stability, owing to the relatively small entropy effect, TΔS_total = −2.3 kcal/mol. Notice that the free microstate was found to be the more stable microstate also in implicit water (paper I); however, the free energy difference there, ΔF_implicit = −25.5 kcal/mol, is significantly larger than here. The higher stability of the free microstate is expected because the bound loop structure was superimposed on the template of the free protein. However, one should bear in mind that this result is mainly due to water and thus depends strongly on the model of water used, which is presently based on r_shift = 5 Å. Also, it is not clear how the number of water molecules and their density would affect the relative stability of the two microstates.

IV. Summary and conclusions

In paper I HSMD has been extended to a protein loop by treating the 7-residue mobile loop (304–310) of porcine pancreatic α-amylase modeled in vacuum and in the implicit solvent GB/SA. In the present paper we have made an important step further by extending HSMD to the same loop solvated by explicit water; treating the same loop in the free and bound microstates enables one to compare the effects of the different models. Computation of the entropy and free energy is divided into two stages: the loop’s entropy is calculated first by reconstructing its structures in the presence of moving waters, and this is followed by the calculation of the free energy of the surrounding water in the presence of a fixed loop structure. As in previous studies, we have found that already small reconstruction samples of 500 structures lead to the correct entropy difference, TΔS_loop. Furthermore, the same difference was obtained from a partial sample of 200 configurations (rather than 600), which decreases computer time by a factor of 3; other means to improve efficiency (discussed in the Summary of paper I but not applied here) are expected to decrease computer time considerably further. The fast convergence of the results for TΔS_loop supports (like previous calculations) the theoretical arguments discussed in paper I that relatively large systematic errors in $S_{loop}^{A} (m)$ are cancelled to a large extent in the differences, $\Delta S_{loop}^{A}$ (eq 24). The relatively small statistical errors stem from the small system (loop and water) moved by MD. Notice that the calculations of the transition probabilities of different steps are completely independent, and they are also independent of the integration of water. Therefore, the reconstruction steps and the TI of water can be fully parallelized.
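The independence of the reconstruction steps can be illustrated with a toy calculation. The sketch below assumes that the entropy estimate is the sample average of −k_B ln ρ, where ρ for each conformation is the product of its transition probabilities; the transition probabilities are made-up dimensionless numbers rather than results of this work, and in a real application each of them would itself come from a separate MD run. The function names, the toy sample, and the use of a thread pool are illustrative assumptions, not the authors' implementation.

```python
import math
from concurrent.futures import ThreadPoolExecutor

K_B = 0.0019872041  # gas constant in kcal/(mol K), used as k_B per mole


def log_probability(transition_probs):
    """ln of the product of transition probabilities of one conformation."""
    return sum(math.log(p) for p in transition_probs)


def entropy_estimate(sample, workers=4):
    """Sample average of -k_B ln(rho); conformations are treated independently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # each conformation is an independent task
        log_rho = list(pool.map(log_probability, sample))
    return -K_B * sum(log_rho) / len(log_rho)


# Hypothetical transition probabilities for three conformations, three steps each
toy_sample = [
    [0.50, 0.25, 0.40],
    [0.45, 0.30, 0.35],
    [0.55, 0.20, 0.50],
]

temperature = 300.0  # K
ts_toy = temperature * entropy_estimate(toy_sample)
print(f"T*S for the toy sample: {ts_toy:.3f} kcal/mol")
```

Because each conformation (and, in practice, each reconstruction step) is a separate task, the same pattern maps directly onto independent jobs on a cluster; the thread pool here only stands in for that distribution.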

Calculation of the free energy of water (second stage) was carried out successfully (and with relatively small error bars) by TI rather than HSMD, where again the difference in free energy, ΔF_water, obtained from a partial sample of 100 configurations is equal within the statistical error to that obtained from the entire sample. Thus, the application of our entire methodology to a loop capped with water has been successful. However, the performance of the water model used with respect to other models has not been investigated; in particular, the effect of the number of waters capped, their volume, density, and the size of the shifting parameter on ΔF_total and other thermodynamic parameters should be studied. Such a study is being carried out now with respect to the 4-residue mobile loop, 287–290, of the protein acetylcholinesterase (AChE) from Torpedo californica. Treating this loop has the advantage that ΔF_total has been estimated from experimental data and was calculated by several techniques.⁶⁷

In accordance with paper I, the quasi-harmonic approximation and the local states method overestimate the entropy significantly, which might reflect strong long-range correlations and anharmonic effects within the loop due to the loop-template, loop-loop, and loop-water interactions.

Acknowledgments

This work was supported by NIH grant 2-R01 GM066090-4 A2.

References

1. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. J Chem Phys. 1953;21:1087.
2. Alder BJ, Wainwright TE. J Chem Phys. 1959;31:459.
3. McCammon JA, Gelin BR, Karplus M. Nature. 1977;267:585. doi: 10.1038/267585a0.
4. Meirovitch H. Chem Phys Lett. 1977;45:389.
5. Meirovitch H, Vásquez M, Scheraga HA. Biopolymers. 1987;26:651. doi: 10.1002/bip.360260508.
6. Meirovitch H, Koerber SC, Rivier J, Hagler AT. Biopolymers. 1994;34:815. doi: 10.1002/bip.360340703.
7. Meirovitch H. Phys Rev A. 1985;32:3709. doi: 10.1103/physreva.32.3709.
8. Meirovitch H, Scheraga HA. J Chem Phys. 1986;84:6369.
9. Meirovitch H. J Chem Phys. 2001;114:3859.
10. White RP, Meirovitch H. J Chem Phys. 2004;121:10889. doi: 10.1063/1.1814355.
11. White RP, Meirovitch H. J Chem Phys. 2006;124:204108. doi: 10.1063/1.2199529.
12. White RP, Meirovitch H. J Chem Phys. 2005;123:214908. doi: 10.1063/1.2132285.
13. Cheluvaraja S, Meirovitch H. J Chem Phys. 2005;122:054903. doi: 10.1063/1.1835911.
14. Cheluvaraja S, Meirovitch H. J Phys Chem B. 2005;109:21963. doi: 10.1021/jp052969l.
15. Cheluvaraja S, Meirovitch H. J Chem Phys. 2006;125:024905. doi: 10.1063/1.2208608.
16. Cheluvaraja S, Meirovitch H. J Chem Theory Comput. 2008;4:192. doi: 10.1021/ct700116n.
17. Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM Jr, Ferguson DM, Spellmeyer DC, Fox T, Caldwell JW, Kollman PA. J Am Chem Soc. 1995;117:5179.
18. Qiu D, Shenkin PS, Hollinger FP, Still WC. J Phys Chem. 1997;101:3005.
19. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. J Chem Phys. 1983;79:926.
20. Elber R, Karplus M. Science. 1987;235:318. doi: 10.1126/science.3798113.
21. Stillinger FH, Weber TA. Science. 1984;225:983. doi: 10.1126/science.225.4666.983.
22. Getzoff ED, Geysen HM, Rodda SJ, Alexander H, Tainer JA, Lerner RA. Science. 1987;235:1191. doi: 10.1126/science.3823879.
23. Rini JM, Schulze-Gahmen U, Wilson IA. Science. 1992;255:959. doi: 10.1126/science.1546293.
24. Constantine KL, Friedrichs MS, Wittekind M, Jamil H, Chu CH, Parker RA, Goldfarb V, Mueller L, Farmer BT. Biochemistry. 1998;37:7965. doi: 10.1021/bi980203o.
25. Kessler H, Matter H, Gemmecker G, Kottenhahn M, Bates JW. J Am Chem Soc. 1992;114:4805.
26. Baysal C, Meirovitch H. Biopolymers. 1999;50:329. doi: 10.1002/(SICI)1097-0282(199909)50:3<329::AID-BIP8>3.0.CO;2-4.
27. Korzhnev DM, Salvatella X, Vendruscolo M, Di Nardo AA, Davidson AR, Dobson CM, Kay LE. Nature. 2004;430:586.
28. Eisenmesser EZ, Millet O, Labeikovski W, Korzhnev DM, Wolf-Watz M, Bosco DA, Skalicky JJ, Kay LE, Kern D. Nature. 2005;438:117. doi: 10.1038/nature04105.
29. Beveridge DL, DiCapua FM. Annu Rev Biophys Biophys Chem. 1989;18:431. doi: 10.1146/annurev.bb.18.060189.002243.
30. Kollman PA. Chem Rev. 1993;93:2395.
31. Jorgensen WL. Acc Chem Res. 1989;22:184.
32. Meirovitch H. In: Reviews in Computational Chemistry. Lipkowitz KB, Boyd DB, editors. Vol. 12. Wiley-VCH; New York: 1998. p. 1.
33. Gilson MK, Given JA, Bush BL, McCammon JA. Biophys J. 1997;72:1047. doi: 10.1016/S0006-3495(97)78756-3.
34. Boresch S, Tettinger F, Leitgeb M, Karplus M. J Phys Chem B. 2003;107:9535.
35. Meirovitch H. Curr Opin Struct Biol. 2007;17:181. doi: 10.1016/j.sbi.2007.03.016.
36. Gô N, Scheraga HA. J Chem Phys. 1969;51:4751.
37. Gô N, Scheraga HA. Macromolecules. 1976;9:535.
38. Hagler AT, Stern PS, Sharon R, Becker JM, Naider F. J Am Chem Soc. 1979;101:6842.
39. Karplus M, Kushick JN. Macromolecules. 1981;14:325.
40. White RP, Meirovitch H. J Chem Phys. 2003;119:12096.
41. Meirovitch H, Meirovitch E. J Phys Chem. 1996;100:5123.
42. Meirovitch H, Hendrickson TF. Proteins. 1997;29:127.
43. Baysal C, Meirovitch H. Biopolymers. 2000;53:423. doi: 10.1002/(SICI)1097-0282(20000415)53:5<423::AID-BIP6>3.0.CO;2-C.
44. Qian M, Haser R, Payan F. J Mol Biol. 1993;231:785. doi: 10.1006/jmbi.1993.1326.
45. Qian M, Haser R, Buisson G, Duee E, Payan F. Biochemistry. 1994;33:6284. doi: 10.1021/bi00186a031.
46. Qian M, Haser R, Payan F. Protein Sci. 1995;4:747. doi: 10.1002/pro.5560040414.
47. Machius M, Vertesy L, Huber R, Wiegand G. J Mol Biol. 1996;260:409. doi: 10.1006/jmbi.1996.0410.
48. Brayer GD, Sidhu G, Maurus R, Rydberg EH, Braun C, Wang Y, et al. Biochemistry. 2000;39:4778. doi: 10.1021/bi9921182.
49. Rydberg EH, Li C, Maurus R, Overall CM, Brayer GD, Withers SG. Biochemistry. 2002;41:4492. doi: 10.1021/bi011821z.
50. Numao S, Maurus R, Sidhu G, Wang Y, Overall CM, Brayer GD, Withers SG. Biochemistry. 2002;41:215. doi: 10.1021/bi0115636.
51. Steer ML, Levitzki A. FEBS Letters. 1973;31:89. doi: 10.1016/0014-5793(73)80079-1.
52. Levitzki A, Steer ML. Eur J Biochem. 1974;41:171. doi: 10.1111/j.1432-1033.1974.tb03257.x.
53. Aghajari N, Feller G, Gerday C, Haser R. Protein Sci. 2002;11:1435. doi: 10.1110/ps.0202602.
54. Brayer GD, Luo Y, Withers SG. Protein Sci. 1995;4:1730. doi: 10.1002/pro.5560040908.
55. Ramasubbu N, Paloth V, Luo Y, Brayer GD, Levine MJ. Acta Crystallogr Sect D. 1996;52:435. doi: 10.1107/S0907444995014119.
56. White RP, Meirovitch H. J Chem Theory Comput. 2006;2:1135. doi: 10.1021/ct600317d.
57. Steinbach PJ, Brooks BR. Proc Natl Acad Sci USA. 1993;90:9135. doi: 10.1073/pnas.90.19.9135.
58. Ponder JW. TINKER - software tools for molecular design, version 3.9. Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine; St. Louis, Mo: 2001.
59. Meirovitch H, Alexandrowicz Z. J Stat Phys. 1976;15:123.
60. Meirovitch H. J Chem Phys. 1999;111:7215.
61. Szarecka A, White RP, Meirovitch H. J Chem Phys. 2003;119:12084.
62. Meirovitch H, Vásquez M, Scheraga HA. Biopolymers. 1988;27:1189. doi: 10.1002/bip.360270802.
63. Brady J, Karplus M. J Am Chem Soc. 1985;107:6103.
64. Chang CE, Chen W, Gilson MK. J Chem Theory Comput. 2005;1:1017. doi: 10.1021/ct0500904.
65. Allen MP, Tildesley DJ. Computer Simulation of Liquids. Clarendon Press; Oxford: 1987.
66. Zacharias M, Straatsma TP, McCammon JA. J Chem Phys. 1994;100:9025.
67. Olson MA. Proteins. 2004;57:645. doi: 10.1002/prot.20294.
