Abstract
The conformational heterogeneity of the N-terminal domain of the ribosomal protein L9 (NTL91-39) in its folded state is investigated using isotope-edited two-dimensional infrared spectroscopy. Backbone carbonyls are isotope-labeled (13C=18O) at five selected positions (V3, V9, V9G13, G16, and G24) to provide a set of localized spectroscopic probes of the structure and solvent exposure at these positions. Structural interpretation of the amide I line shapes is enabled by spectral simulations carried out on structures extracted from a recent Markov state model. The V3 label spectrum indicates that the β-sheet contacts between strands I and II are well folded with minimal disorder. The V9 and V9G13 label spectra, which directly probe the hydrogen-bond contacts across the β-turn, show significant disorder, indicating that molecular dynamics simulations tend to overstabilize ideally folded β-turn structures in NTL91-39. In addition, G24-label spectra provide evidence for a partially disordered α-helix backbone that participates in hydrogen bonding with the surrounding water.
Introduction
Crystallographic protein structures often leave the impression that proteins exist as a unique three-dimensional arrangement of atoms; however, even well-folded proteins are soft, dynamic molecules that sample a range of conformations (1–5). Fluctuations, structural disorder, and conformational changes on timescales from picoseconds to seconds are key elements of protein function, including signaling (6), allostery (7), enzyme catalysis (8), recognition and binding (9,10), and protein-protein interactions (11). Even so, the role of conformational diversity remains underappreciated as rapidly-interconverting structures are still invisible to most experimental techniques. NMR, in addition to its accepted role in protein structure determination, has also provided key insights into protein dynamics (8,12). For example, NMR has revealed the importance of entropic contributions from side-chain fluctuations in defining the relationships among protein folding, stability, and function (9,13,14). Single-molecule methods have provided access to the kinetics of large-amplitude conformational changes on millisecond-to-second timescales (15,16). Recently, ultrafast two-dimensional infrared (2D IR) spectroscopy has emerged as a powerful technique to observe fast conformational dynamics (17–22).
In an effort to experimentally characterize conformational variation in a folded protein, we have made use of amide I 2D IR spectroscopy to investigate the native state ensemble of the 39-residue N-terminal domain of the L9 protein (NTL91-39), shown in Fig. 1 (23,24). The 2D IR line shapes capture a snapshot of the three-dimensional structure of the backbone ensemble with picosecond resolution, free of motional averaging (17,25). Site-specific structural information is obtained by isotope-labeling of specific backbone carbonyls. A single 13C=18O replacement shifts the frequency of a residue, isolating it from the remaining backbone absorption band. The measured vibrational frequencies of the label, determined by hydrogen bonding to the oxygen, are thereby sensitive to the type and strength of a hydrogen bond as well as fluctuations and solvent exposure. Pairwise labeling is sensitive to the vibrational couplings between two residues (20,26,27). Strong electrostatic interactions between hydrogen-bonded pairs produce large couplings that red-shift (decrease) the vibrational frequency. For NTL91-39, we incorporate 13C=18O isotope labels on the peptide unit following the side chain at five sites in different regions of the protein: V3 (β-sheet), V9 (β-turn), V9G13 (β-turn dual label), G16 (β-strand, solvent-exposed), and G24 (α-helix).
Figure 1.

(a) Cartoon representation of the crystal structure of NTL91-39 K12M (Protein Data Bank 2HBA). (Red, yellow, and green colors) The α-helix, the β-sheet, and the β-turn regions, respectively. The amide units with 13C=18O isotope labels are indicated on the structure. (b) Rotated view showing the G16 label as well as the V9 and G13 labels in the β-turn region.
A detailed molecular interpretation of these measurements emerges from structure-based spectroscopic modeling. An atomistic molecular structure is translated into a 2D IR spectrum using a semiclassical model for the coupled amide I dipoles of the backbone that maps the molecular electric field acting on the C=O bond to vibrational frequencies. Dipole-dipole interactions between peptide groups determine the strength of the couplings. Structural variation can be tested by averaging over an ensemble, which can influence linewidth and intensities. As a basis for characterizing structural heterogeneity, we make use of a Markov state model (MSM) that groups conformers from a molecular dynamics (MD) simulation based on exchange kinetics, forming a convenient basis for comparison between MD simulations and experiments (28–30). The combination of isotope-edited 2D IR spectroscopy, MSMs, and spectral simulations that is used here has recently been used to characterize conformational variation and folding in β-hairpins and β-turns (26,31).
Materials and Methods
Isotope labeling and peptide synthesis
Isotope replacement reactions were carried out following a previous protocol (32). Briefly, 13COOH-labeled FMOC-glycine and valine (Cambridge Isotope Laboratories, Andover, MA) were dissolved in a 6:1 mixture of dioxane: H218O, acidified with acetyl chloride and refluxed under a nitrogen atmosphere for ∼24 h. Once the reaction had reached completion, the solvent was removed by lyophilization and the product recrystallized from a mixture of dichloromethane and n-hexanes. Enrichment ratios of >95% and >99% for Gly and Val, respectively, were confirmed by electrospray ionization mass spectrometry. Peptide samples (sequence: NH2-MKVIFLKDVKGKGKKGEIKNVADGYANNFLFKQGLAIEA-CONH2) were synthesized at the Swanson Biotechnology Center (Koch Institute for Integrative Cancer Research at MIT, Massachusetts Institute of Technology, Cambridge, MA). After HPLC purification, samples were triply lyophilized from D2O (∼1 mg/mL) in order to remove residual trifluoroacetic acid and exchange labile protons. Circular dichroism was carried out on the unlabeled sample to confirm proper folding.
Sample preparation and 2D IR spectroscopy
Peptide samples were reconstituted in 10 mM MOPS buffer in D2O with 100 mM NaCl. The final concentration was kept <5 mg/mL to minimize aggregation. The pH (raw reading, uncorrected for isotope effect) was adjusted with DCl to 2.2 ± 0.1 to protonate all side-chain carboxylates and prevent spectral overlap with the isotope labels by blue-shifting the C=O stretches. Samples were held between two 1-mm CaF2 windows separated by a 50-μm Teflon spacer in a temperature-controlled cell at 24 ± 0.1°C. Based on the thermodynamic parameters of Horng et al. (24) we estimate a 6–10% unfolded population under these experimental conditions. Because NTL91-39 is prone to aggregation, samples were discarded when aggregation was detected as a sharp IR band at 1612 cm−1, where the detection threshold was ∼3% of the molecular concentration. Sample solutions were either prepared fresh or were flash-frozen in liquid nitrogen and stored at −20°C.
The custom-built 2D IR spectrometer was described previously in Khalil et al. (33). In short, spectra were collected in the parallel (ZZZZ) polarization condition. The population time (t2) was set to 150 fs, and the coherence time (t1) was scanned to 2.4 and 1.8 ps in 4 fs steps for rephasing and nonrephasing, respectively. Because no off-diagonal features are observed in the isotope-label region, only diagonal slices of the 2D IR spectra are analyzed. Diagonal cuts are offset by +12 cm−1 along the detection axis (ω3), corresponding approximately to the ridges of maximum intensity in the two-dimensional peaks. Spectra were phased by minimizing differences between the ω3 projections of the 2D IR spectra and pump-probe spectra in the main amide I band region. The ω3 offset of the labeled peaks is likely due to a small phase mismatch between the main band and the low-frequency isotope region or slight miscalibration of the detection frequency. Unfortunately, the low signal of the isotope labels in the pump-probe spectrum was not sufficient to reliably phase the spectra using the isotope peaks. To compare the intensities of the different labels, 2D IR spectra are corrected for the spectrum of the pulses and normalized to the maximum intensity of the main amide-I band.
MSM and spectral simulations
The MSM and spectral simulation methods are described in Baiz et al. (34). In summary, the MSM was built from ∼2 ms of MD trajectories carried out on the K12M mutant of NTL91-39 by Lindorff-Larsen et al. (35). The full MSM can be generally partitioned into two ensembles: states with low root-mean-square deviation (RMSD) (0–5 Å) to the crystal structure (Protein Data Bank 2HBA) that we call native states, and a set of high-RMSD disordered states (RMSD ∼ 5–20 Å) that are separated by a ∼2 kcal/mol free-energy barrier (see Fig. 8 in Baiz et al. (34)). Approximately 140 Markov states fall into the native state ensemble, and because the focus of the article is to characterize the heterogeneity of this ensemble, spectral simulations are carried out using the 140 lowest-RMSD states of the full 727-state MSM.
Because the backbone in the native state is heterogeneous, we use the term “folded state” to refer to residues in crystal-structure-like configurations primarily defined by their φ/ψ angles and hydrogen-bonding patterns. “Disordered state” refers to local backbone configurations in which residues that form stable protein-protein hydrogen bonds in the crystal structure are instead solvent-exposed. Specifically, 75% of the native population in the MSM has RMSD within 1 Å of the crystal structure. However, it is important to consider that because the ensemble retains most of the native secondary structure, both folded and disordered configurations belong to the native-state ensemble.
Amide I spectra were calculated using a mixed quantum-classical model that maps parameters from an MD simulation to a local mode Hamiltonian that describes the amide I spectroscopy (36). Position-restrained trajectories were launched from five randomly selected configurations of each Markov state (1 ns, 2 fs steps, snapshots saved every 1 ps, SPC/E water, OPLS/AA, using the software GROMACS, Ver. 4.5.1, www.gromacs.org). For each starting structure, 1000 protein and solvent configurations were converted into local amide I Hamiltonians. Diagonal elements (or site energies) were obtained by mapping the electric field and potential at each residue’s N, H, C, O positions into a frequency and transition dipole moment using a map parameterized by Jansen et al. (37,38). Uniform −60 cm−1 shifts were applied to site energies corresponding to 13C=18O labeled units in the Hamiltonian matrices. Off-diagonal terms (couplings) were computed from φ/ψ maps for adjacent residues and a transition charge-coupling model for the remaining interactions. Static 2D IR spectra were computed by diagonalization of the Hamiltonian matrices followed by numerical sums over Liouville pathways (39). Spectra were computed as a population-weighted sum over the 140 Markov states. Equilibrium populations were extracted from the first eigenvector of the full MSM as described previously in Lindorff-Larsen et al. (35).
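The last steps of this procedure can be sketched in a few lines. The example below is a minimal illustration, not the simulation code used here: it assumes a toy three-site Hamiltonian with made-up site energies, couplings, and transition dipoles, applies the −60 cm−1 isotope shift to one diagonal element, diagonalizes, and forms a population-weighted sum. It computes only a linear (one-exciton) stick spectrum with an arbitrary Gaussian broadening, not the 2D IR response obtained from sums over Liouville pathways. All function names are hypothetical.

```python
import numpy as np

ISOTOPE_SHIFT = -60.0  # cm^-1, uniform 13C=18O shift quoted in the text


def label_site(hamiltonian, site):
    """Return a copy of the Hamiltonian with the isotope shift on one site energy."""
    shifted = np.array(hamiltonian, dtype=float, copy=True)
    shifted[site, site] += ISOTOPE_SHIFT
    return shifted


def stick_spectrum(hamiltonian, dipoles):
    """Diagonalize one snapshot; return exciton frequencies and intensities."""
    freqs, vecs = np.linalg.eigh(hamiltonian)
    # Exciton transition dipoles are eigenvector-weighted sums of site dipoles.
    exciton_dipoles = vecs.T @ dipoles
    intensities = np.sum(exciton_dipoles**2, axis=1)
    return freqs, intensities


def weighted_spectrum(state_snapshots, populations, axis, width=5.0):
    """Population-weighted sum of Gaussian-broadened stick spectra over states."""
    total = np.zeros_like(axis)
    for (hamiltonian, dipoles), pop in zip(state_snapshots, populations):
        freqs, intens = stick_spectrum(hamiltonian, dipoles)
        for f, a in zip(freqs, intens):
            total += pop * a * np.exp(-((axis - f) ** 2) / (2.0 * width**2))
    return total


# Hypothetical 3-site example (all values are illustrative only).
h0 = np.array([[1650.0, 5.0, 0.0],
               [5.0, 1655.0, -3.0],
               [0.0, -3.0, 1645.0]])
mu = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
h_labeled = label_site(h0, site=1)
freqs, intens = stick_spectrum(h_labeled, mu)
axis = np.linspace(1560.0, 1700.0, 281)
spectrum = weighted_spectrum([(h_labeled, mu), (h0, mu)], [0.7, 0.3], axis)
```

In this toy case the labeled mode lands near 1595 cm−1, well separated from the remaining sites, which is the isolation effect that the isotope label is meant to provide.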
Results
Energy landscape of the native-state ensemble
Fig. 2 shows a network diagram of the 140-state MSM for the native ensemble. States are represented by nodes, which are color-coded by the all-atom RMSD to the crystal structure as well as by the energy of the states extracted from the populations of the full MSM. Node sizes are proportional to the number of connections to other nodes, and the thickness of the connecting lines represents their relative interconversion probabilities. Although some states do not appear to be connected to other states in the network, in the full MSM they are connected to disordered states that are not included in this plot (34). A few low-RMSD states display multiple connections whereas most states have a small number of connections. This is referred to as a “kinetic hub topology”, where low-RMSD states act as hubs for channeling flow through the network (40,41). Each node is, on average, connected to 5.2 other nodes. The top three nodes, which are also the lowest-RMSD states, have degrees of 118, 108, and 71, while high-RMSD nodes are only connected to a few others. From an energy landscape perspective, these kinetic hub states represent a set of lowest energy minima, thus interconversion between states in the ensemble is funneled through these few lowest-RMSD states. The second map, color-coded by pseudo free energy, shows that most of the population resides in low-RMSD states, with a sparse population in the high-RMSD regions. More specifically, 66% of the total population resides within the four most populated states, with the remaining 34% distributed over 136 states. These structurally similar states represent the mean folded structure of the protein.
Figure 2.

(Left) Network plot of the Markov state model for the native ensemble of NTL91-39. The states are color-coded by RMSD to the crystal structure. The node size encodes the degree, or number of connections to other states, of each state. (Dashed circle) Registry-shifted states (34). (Center) Structure overlay of the four highest populated states. The main differences are observed in the β-turn region. (Right) Nodes color-coded by the pseudo free energy of native states F = −ln(P), where P denotes the respective populations in the full MSM.
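The quantities plotted in Fig. 2 (equilibrium populations, pseudo free energies, and node degrees) can be illustrated with a small example. This is a sketch under stated assumptions: the three-state transition matrix is made up, the function names are hypothetical, and the degree is counted here as the number of other states with a nonzero transition probability in either direction, which may differ from the criterion used to draw the network.

```python
import numpy as np


def stationary_populations(transition_matrix):
    """Equilibrium populations from the leading left eigenvector (eigenvalue 1)."""
    evals, evecs = np.linalg.eig(transition_matrix.T)
    lead = np.argmax(evals.real)
    pops = np.abs(evecs[:, lead].real)
    return pops / pops.sum()


def pseudo_free_energy(populations):
    """F = -ln(P), in units of kT."""
    return -np.log(populations)


def node_degrees(transition_matrix, threshold=0.0):
    """Number of other states each state connects to (self-transitions ignored)."""
    connected = (transition_matrix > threshold) | (transition_matrix.T > threshold)
    np.fill_diagonal(connected, False)
    return connected.sum(axis=1)


# Hypothetical 3-state row-stochastic matrix; state 0 acts as a hub.
T = np.array([[0.90, 0.05, 0.05],
              [0.20, 0.80, 0.00],
              [0.10, 0.00, 0.90]])
P = stationary_populations(T)
F = pseudo_free_energy(P)
deg = node_degrees(T)
```

In this toy network the hub state is both the most populated and the most connected, mirroring the kinetic hub picture described above.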
These four lowest-RMSD states, labeled A–D, are similar in structure but represent distinct minima in the energy landscape. The four states are within 1.5 kT of each other at the simulation temperature. Structural differences between these states are reflected primarily in the side-chain configurations. For example, in state A, V9 (β-turn) forms hydrophobic contacts with L30 (α-helix) and F5 (β-sheet), whereas in state B, V9 is partially solvent-exposed. From the perspective of the backbone, these four configurations display very similar protein-protein hydrogen-bonding patterns, and only small differences in φ/ψ angles. Spectroscopically, these low-RMSD states appear virtually identical; thus, we refer to them as folded structures. Nonetheless, differences in these structures suggest that the hydrophobic core of NTL91-39 is markedly heterogeneous and dynamic.
Amide I isotope spectra: experiment
Fig. 3 shows a comparison of experimental and simulated 2D IR spectra of the unlabeled and five different isotope-labeled samples in the 1560–1620 cm−1 region in which the isotope-labeled peaks appear. Above 1600 cm−1 the onset of the main amide I band is observed. The antidiagonal widths are similar for all labels and no cross-peaks are observed between the label and the main band. Therefore, the essential information is captured by the diagonal slices of the 2D IR spectra along the positive peak, as illustrated in Fig. 4. Fourier transform infrared spectroscopy (FTIR) spectra for these samples were uninformative because, except for V3, the low-amplitude label peaks are difficult to distinguish from the strong sloping background in this region.
Figure 3.

Two-dimensional infrared spectra in the label region of the five different isotope labels of NTL91-39. (Dashed lines in the experimental spectrum) Positions of the diagonal slices.
Figure 4.

Diagonal slices of 2D IR spectra along the peak maxima in the isotope-label region (see Fig. 3). (Top) Experimental spectra. (Circles) Interpolated 2D IR data; (solid curves) dual Gaussian fits to the data as shown in Table 1. Gaussian center frequencies are indicated below each peak. All spectra are normalized to the maximum amplitude of the main amide I band. (Bottom) Simulated diagonal slices of the 2D IR spectra for the same labels. The G13 label (dashed curve) does not appear in the experimental spectrum, but it is included for comparison with V9 and V9G13 experimental curves.
The V3 spectrum shows the 7–15 cm−1 linewidth and Gaussian line shape expected for a single conformer (26). Remarkably, the other spectra exhibit asymmetric line shapes with widths exceeding 20 cm−1. For comparison, the average frequency downshift induced by a single hydrogen bond to a backbone carbonyl is ∼16 cm−1 (42). Shoulders appear on the red or blue side of the main peaks, indicating the presence of two or more partially overlapping peaks that arise from spectroscopically distinct structural ensembles. Experimental 2D IR diagonals are fit to a sum of Gaussians (Fig. 4, solid curves),
$$S(\omega) = \sum_i A_i \exp\left[-\frac{(\omega - \omega_{c,i})^2}{2\sigma_i^2}\right],$$
where Ai, ωc,i, and σi represent the amplitude, center frequency, and peak width, respectively (see Table 1). Label spectra are well fit to a sum of two Gaussians with a third component representing the band above 1600 cm−1. Coincidentally, the 12–15 cm−1 separation observed between the band centers approximately corresponds to the shift arising from the formation of a hydrogen bond. Also, the linewidth of the lower frequency band broadens relative to the high-frequency band, as expected for a solvent-exposed carbonyl.
Table 1.
Gaussian fit parameters corresponding to the experimental label spectra (Fig. 4)
| Parameter | V3 | V9 | V9G13 | G16 | G24 |
|---|---|---|---|---|---|
| A1 | 3.3E-02 | 2.2E-01 | 2.5E-01 | 8.5E-02 | 4.0E-02 |
| ωc,1 (cm−1) | 1572 | 1577 | 1573 | 1580 | 1572 |
| σ1 (cm−1) | 8.2 | 11.1 | 10.8 | 13.9 | 9.0 |
| A2 | 4.4E-01 | 4.8E-02 | 6.6E-02 | 4.6E-02 | 4.9E-02 |
| ωc,2 (cm−1) | 1593 | 1592 | 1587 | 1590 | 1587 |
| σ2 (cm−1) | 7.8 | 6.7 | 7.5 | 7.6 | 9.9 |
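As a worked example of how the fit parameters are used, the sketch below evaluates a two-Gaussian curve from the Table 1 values and computes the share of the summed amplitude carried by each component. It assumes that the width parameter is a standard deviation and omits the third component that represents the band above 1600 cm−1. The function names are hypothetical.

```python
import numpy as np

# (A1, wc1, s1, A2, wc2, s2) copied from Table 1; frequencies and widths in cm^-1.
TABLE_1 = {
    "V3":    (3.3e-02, 1572.0, 8.2, 4.4e-01, 1593.0, 7.8),
    "V9":    (2.2e-01, 1577.0, 11.1, 4.8e-02, 1592.0, 6.7),
    "V9G13": (2.5e-01, 1573.0, 10.8, 6.6e-02, 1587.0, 7.5),
    "G16":   (8.5e-02, 1580.0, 13.9, 4.6e-02, 1590.0, 7.6),
    "G24":   (4.0e-02, 1572.0, 9.0, 4.9e-02, 1587.0, 9.9),
}


def two_gaussians(omega, a1, wc1, s1, a2, wc2, s2):
    """Sum of two Gaussians; treating s as a standard deviation is an assumption."""
    g1 = a1 * np.exp(-((omega - wc1) ** 2) / (2.0 * s1**2))
    g2 = a2 * np.exp(-((omega - wc2) ** 2) / (2.0 * s2**2))
    return g1 + g2


def amplitude_fractions(a1, a2):
    """Fraction of the summed amplitude carried by each component."""
    total = a1 + a2
    return a1 / total, a2 / total


omega = np.linspace(1560.0, 1600.0, 401)
v9 = TABLE_1["V9"]
v9_curve = two_gaussians(omega, *v9)
v9_low, v9_high = amplitude_fractions(v9[0], v9[3])
print(f"V9 low/high amplitude fractions: {v9_low:.2f} / {v9_high:.2f}")
```

For V9 the two fractions come out at 0.82 and 0.18, matching the amplitude ratios quoted for the 1577 and 1592 cm−1 peaks.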
The features observed in experimental spectra are as follows.
UL
The unlabeled NTL91-39 spectrum shows no peaks in the label region except for a small feature near 1565 cm−1, most likely arising from residual deprotonated carboxylate groups (pKa ∼4–4.5). The spectrum is included to demonstrate the flat baseline and excellent solvent background suppression of 2D IR. The onset of the main amide I band from the unlabeled vibrations can be observed above 1600 cm−1. Note that the clean baseline observed in the UL spectrum eliminates the need for background subtraction.
V3
The V3 spectrum shows an intense, narrow peak centered around 1593 cm−1 along with a low-amplitude peak near 1572 cm−1. The narrow peak indicates that V3(C=O) is held rigidly in place by a single strong hydrogen bond across the β-strands. V3 is located in a largely hydrophobic patch of the protein with minimal solvent exposure.
V9 and V9G13
The V9 spectrum shows two peaks centered around 1577 and 1592 cm−1 with 82 and 18% amplitude ratios, respectively. The V9G13 spectrum contains similar features compared to V9 that are red-shifted to 1573 cm−1 and 1587 cm−1, but with lower intensity on the blue side and higher intensity on the red side of the band. The increase in intensity is consistent with vibrational coupling between the two residues that results from a pair of hydrogen bonds in the folded β-turn conformation.
G16
The G16 spectrum exhibits broad bands with significantly lower intensity compared to the V3, V9, and V9G13 labels. G16 is located in a flexible, highly solvent-exposed region of the backbone in which the G16 carbonyl remains fully solvent-exposed even in the crystal structure (Fig. 1), partially accounting for the observed broader, low-intensity bands.
G24
The G24 spectra display two low-intensity broad peaks, suggesting some heterogeneity in this region. The low intensity of the peak in the simulations (Fig. 4) also points to a disordered and solvent-exposed region.
Amide I isotope spectra: simulations
In comparing experimental and simulated spectra, there are a number of qualitative similarities and some notable differences. For example, the lower intensity of the G16 and G24 labels compared to V3, V9, and V9G13 is successfully captured by the model. Relative intensities of the peaks and the red shift of V9G13 relative to V3 are qualitatively reproduced, but linewidths and peak frequencies are not consistent with experiment. It is important to note that the electrostatic map reproduces the label frequencies and line shapes only qualitatively. Maps provide a means to interpret relative intensities and frequency shifts for different structures including protein-protein contacts and protein-water hydrogen bonding, but at present cannot provide absolute frequencies or intensities. Aside from this specific shortcoming of the spectral models, discrepancies also arise from structural differences between MSM and the experimental ensembles and thus form the basis for critically evaluating populations in the MSM and experimental ensembles.
UL
Similar to experiment, the unlabeled spectrum shows a smooth baseline with only a small dip in the 1600–1610 cm−1 region.
V3
The V3 label shows a strong, narrow peak, qualitatively similar to experiment. This reflects the single hydrogen bond in the β-sheet, the high backbone rigidity, and stability at this position, and a single spectroscopically distinct state of the backbone. The negative component on the red side of the main band is due to the interference between the positive and negative peaks in the 2D IR spectrum (Fig. 3).
V9, G13, and V9G13
These labels report on the flexibility of the disordered loop region. G13 is included in the simulations to help interpret differences between the V9 and V9G13 experimental spectra. Similar to experiment, V9G13 is red-shifted relative to V9. However, the V9 frequency appears close to that of V3. Interestingly, V9 and G13 display virtually identical peak frequencies and intensities, but V9G13, the dual label, is red-shifted with respect to the individual labels. The shift is attributed to vibrational coupling through a dipole-dipole interaction. Coupling splits the local modes into a red-shifted mode, in which the carbonyls oscillate out-of-phase and which carries most of the band intensity, and a weak, blue-shifted, in-phase mode. Finally, because the simulated bands are broader compared to experiment, the two peaks are not distinguishable in the simulations. In the next section, we present a more rigorous structural interpretation of the β-turn spectra.
G16 and G24
These two labels show broad, low-amplitude peaks that can be attributed to a high degree of solvent exposure at these residue positions. The separation between high- and low-frequency peaks seen in experimental spectra is not present in the simulations, likely due to a combination of spectral overlap between the broad peaks and, as discussed below, an MSM that may not fully capture the structural ensembles present in the experiment. To characterize the structures that give rise to the distinct peaks in the G24 spectra, we decompose the MSM into different hydrogen-bonding environments near the first turn of the α-helix and compare their spectral signatures.
Backbone heterogeneity
β-turn configuration
The two peaks observed in the experimental V9 and V9G13 spectra alone provide strong evidence for two distinct β-turn ensembles, but spectral modeling based on the MSM delivers more detailed structural insight. MSM configurations reveal two different ensembles: a primarily folded β-turn, accounting for 91% of the MSM population; and a disordered turn, with the remaining 9% population as shown in Fig. 5 a. Folded turn configurations show a protein-protein hydrogen bond between the V9(O) and M12(H). In contrast, disordered turn configurations have a solvent-exposed V9(O) acting as a water hydrogen-bond acceptor. The disordered loop states correspond approximately to the near-native state recently described by Schwantes and Pande (43) based on a different MSM derived from the same MD trajectories. According to that MSM, the population of the near-native state is ∼3.5%, somewhat lower than our estimated value of 9%. To investigate how these structures are reflected in the spectroscopy, the 140 Markov states were separated into two subensembles based on their hydrogen-bond distance,
$$r_{VM} = \left|\mathbf{r}_{\mathrm{V9(O)}} - \mathbf{r}_{\mathrm{M12(H)}}\right|,$$
with folded and disordered configurations defined as those with a distance of rVM < 3.5 Å and > 3.5 Å, respectively. Note that states A–D in Fig. 2 correspond to folded turn configurations. The choice of order parameter is primarily motivated by the electrostatic contributions from the different atoms to the V9 C=O frequency. Because the M12(H) is the closest nonbonded atom to the C=O unit, rVM is a natural choice for the purpose of spectrally separating the structural ensembles. The 3.5 Å cutoff is selected primarily because it separates the two ensembles based on the V9-G13 coupling strength (Fig. 5 c).
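The separation into subensembles amounts to a threshold on the V9(O) to M12(H) distance followed by a population-weighted sum. A minimal sketch is given below, assuming one mean distance and one population per Markov state. The distances and populations are made-up values chosen only so that the toy example reproduces a 91%/9% split, and the function names are hypothetical. A state lying exactly at the cutoff is assigned to the disordered set here, a case the definition above leaves open.

```python
import numpy as np

R_CUTOFF = 3.5  # angstrom, cutoff quoted in the text


def hbond_distance(pos_o, pos_h):
    """Distance between the V9 carbonyl oxygen and the M12 amide hydrogen."""
    return float(np.linalg.norm(np.asarray(pos_o) - np.asarray(pos_h)))


def split_by_turn(r_vm, populations, cutoff=R_CUTOFF):
    """Return (folded_mask, folded_population, disordered_population)."""
    r_vm = np.asarray(r_vm, dtype=float)
    pops = np.asarray(populations, dtype=float)
    pops = pops / pops.sum()
    folded = r_vm < cutoff
    return folded, pops[folded].sum(), pops[~folded].sum()


# Hypothetical per-state mean distances (angstrom) and populations.
r_vm = [2.0, 2.1, 2.3, 4.8, 6.5]
pops = [0.40, 0.30, 0.21, 0.06, 0.03]
folded_mask, p_folded, p_disordered = split_by_turn(r_vm, pops)
```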
Figure 5.

(a) Representative structures of the folded β-turn and disordered turn conformations represented in the MSM of NTL91-39 overlaid onto a cartoon structure of the Markov state with lowest RMSD to the crystal structure. (Spheres) Carbonyls corresponding to the V9 and G13 residues. (Yellow dashes) The two folded hydrogen bonds. (Orange double-headed arrow) The V9-M12 hydrogen bond used as the order parameter to distinguish between folded and disordered turn structures. (b) V9-, G13-, and V9G13-label spectra calculated for folded (solid) and disordered (dashed) structures, respectively. (c) Scatter plot of the V9-G13 coupling constant in wavenumbers as a function of rVM distance for the 140 states in the MSM. (Circles are color-coded by the overall RMSD of each state to the crystal structure.)
Scatter plots in Fig. 6 a show the average number of V9(O) protein-protein hydrogen bonds (nPP), including backbone and side-chain contacts, and protein-water hydrogen bonds (nPW) as a function of rVM. Here the geometric hydrogen bond definition (rO···O < 3.5 Å and θOOH < 35°) was used for all analyses. On average, the folded turn conformations contain 1.7 protein-protein bonds, and only 0.2 protein-water hydrogen bonds. The folded turn is not a typical β-turn with a single (n + 2 → n) hydrogen bond, but allows V9(O) to accept two hydrogen bonds from the M12 and G13 amide N-H. Increasing rVM beyond 3.5 Å leads to a decrease of nPP along with an essentially linear increase of nPW. In the disordered conformation, nPP (red) decreases to nearly zero, but nPW (blue) only increases to ∼1.3–1.5. Therefore, the total number of hydrogen bonds nTOT = nPP+nPW decreases in the disordered configurations. Because the number of hydrogen bonds accepted by the carbonyl is correlated with the frequency of the residue, increasing disorder is expected to blue-shift the labeled band. This data is expressed as V9(O) solvent exposure (nPW/ nTOT) in Fig. 6 b. Backbone solvent exposure in the β-turn is well correlated with rVM, indicating that only folded turns remain protected from the solvent. Finally, in Fig. 6 c, the MSM network diagram is color-coded by solvent exposure to show that highly populated states are low-exposure folded turn configurations, similar to the results shown in Fig. 2.
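The hydrogen-bond criterion and the exposure ratio can be written compactly. The sketch below is illustrative only: the function names are hypothetical, the folded-turn counts (1.7 and 0.2) are the averages quoted above, and the disordered-turn counts (0 and 1.4) are round numbers chosen within the ranges quoted above.

```python
import numpy as np


def is_hbond(distance, angle_deg, r_max=3.5, angle_max=35.0):
    """Geometric criterion: distance below r_max (angstrom) and angle below angle_max (degrees)."""
    return (distance < r_max) and (angle_deg < angle_max)


def solvent_exposure(n_pp, n_pw):
    """Exposure n_PW / (n_PP + n_PW); NaN where no hydrogen bonds are accepted."""
    n_pp = np.asarray(n_pp, dtype=float)
    n_pw = np.asarray(n_pw, dtype=float)
    n_tot = n_pp + n_pw
    exposure = np.full_like(n_tot, np.nan)
    np.divide(n_pw, n_tot, out=exposure, where=n_tot > 0)
    return exposure


# Folded turn (values from the text) and a disordered turn (illustrative values).
n_pp = [1.7, 0.0]
n_pw = [0.2, 1.4]
exposure = solvent_exposure(n_pp, n_pw)
n_tot = np.asarray(n_pp) + np.asarray(n_pw)
```

With these inputs the folded turn has an exposure of about 0.1 and a total of 1.9 accepted hydrogen bonds, whereas the disordered turn is fully exposed with a smaller total, consistent with the decrease in nTOT described above.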
Figure 6.

Hydrogen-bond analysis of the folded and disordered transitions. (a) Total number of hydrogen bonds accepted by V9(O) separated by protein-protein H-bonds nPP (red) and protein-water H-bonds nPW (blue). (b) Plot of backbone exposure calculated as the ratio of protein-water to total hydrogen bonds nPW/(nPP + nPW). (Dashed lines) Linear fits to guide the eye. (c) MSM showing states color-coded by rVM; (yellow) folded states; (red) disordered states.
Fig. 5 b shows calculated V9, G13, and V9G13 spectra based on folded (solid curves) and disordered (dashed curves) subensembles. While the widths of the peaks are similar, the folded turns show a ∼2× intensity enhancement that results from smaller frequency fluctuations of the residues in protein-protein H-bonds. Analysis of the V9 frequency distributions reveals a standard deviation of 7.0 and 8.5 cm−1 for the folded and disordered turns, respectively. In other words, disordered turn V9 exhibits ∼20% larger frequency fluctuations due to the high degree of backbone exposure compared to the folded turn. Oscillator strengths for the folded turn V9 are, on average, only 3% higher than the disordered turn V9, revealing that the difference in amplitude results almost exclusively from narrower frequency distributions instead of increases in oscillator strength. Similarly, the folded G13 peaks (gray curves) are narrower and more intense compared to disordered turn spectra. The interpretation is the same: folded-turn H-bonds lock the G13 carbonyl into place, lowering solvent exposure. G13 site energy analysis reveals a 12.2 and 15.0 cm−1 standard deviation for the folded and disordered configurations, respectively.
V9G13 dual label spectra (green curves) are most distinct between the folded and disordered configurations. Without vibrational coupling between the labeled oscillators, V9G13 spectra calculated for disordered configurations are similar to the sum of the individual V9 and G13 spectra, but in the folded turn, vibrational coupling results in a ∼10–12 cm−1 red shift and a large intensity increase. The same effect is observed in the experiment where the V9G13 peak appears more intense on the red side and less intense on the blue side compared to V9. The differences between the two labels are, however, smaller in the experiment than simulations. Fig. 5 c shows a plot of the V9-to-G13 coupling, namely the off-diagonal term in the Hamiltonian matrix that represents the degree of dipole-dipole coupling between these two sites. The plot indicates that for folded turns, the coupling is ∼8 cm−1, whereas, for a disordered turn conformation, the coupling is practically zero. This analysis illustrates the short-range nature of amide I vibrational couplings, serving as essentially binary probes of protein contacts.
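The effect of the coupling can be illustrated with a two-site model. The sketch below assumes two degenerate local modes at 1598 cm−1 coupled by 8 cm−1, with unit transition dipoles whose relative orientation is hypothetical and chosen only so that the lower-frequency mode carries most of the intensity. It is not expected to reproduce the ∼10–12 cm−1 shift of the full simulated band, which also reflects the ensemble of site energies and the remaining backbone.

```python
import numpy as np


def coupled_pair(w1, w2, beta, mu1, mu2):
    """Eigenfrequencies and intensities of two coupled local modes."""
    h = np.array([[w1, beta], [beta, w2]], dtype=float)
    freqs, vecs = np.linalg.eigh(h)
    dipoles = vecs.T @ np.array([mu1, mu2], dtype=float)
    return freqs, np.sum(dipoles**2, axis=1)


# Hypothetical unit dipoles, close to antiparallel.
mu_v9 = [1.0, 0.0, 0.0]
mu_g13 = [-0.8, 0.6, 0.0]

# Folded turn: ~8 cm^-1 coupling; disordered turn: essentially uncoupled.
freqs_folded, intens_folded = coupled_pair(1598.0, 1598.0, 8.0, mu_v9, mu_g13)
freqs_dis, intens_dis = coupled_pair(1598.0, 1598.0, 0.0, mu_v9, mu_g13)
```

In the coupled case the modes split to 1590 and 1606 cm−1 and the lower mode takes 90% of the total intensity; with zero coupling both modes stay at 1598 cm−1, so the dual-label spectrum reduces to the sum of the two single-label contributions.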
The calculated V9 band of the folded turn is centered around 1598 cm−1, whereas the disordered turn is centered near 1600 cm−1 (Fig. 5 b). The small shift can be explained in terms of total hydrogen bonds. To a first approximation, in the native state nTOT = 2 and in the disordered state nTOT = 1.5, so nTOT decreases by 0.5, resulting in a ∼7 cm−1 frequency blue shift. The simulations tend to overestimate the spectral shift induced by turn folding (i.e., difference between V9 and V9G13 in simulated spectra), which could be due to the fact that point dipole interactions, which successfully capture long-range interactions, may overestimate the degree of vibrational coupling between two hydrogen-bonded residues.
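The size of this shift can be checked against the ∼16 cm−1 downshift per hydrogen bond quoted earlier (42), assuming as a rough simplification that the frequency scales linearly with the number of accepted hydrogen bonds:

```latex
\Delta\omega \approx \left(n_{\mathrm{TOT}}^{\mathrm{folded}} - n_{\mathrm{TOT}}^{\mathrm{disordered}}\right) \times 16~\mathrm{cm^{-1}}
             = (2 - 1.5) \times 16~\mathrm{cm^{-1}}
             = 8~\mathrm{cm^{-1}}
```

This linear estimate is close to the ∼7 cm−1 blue shift stated above.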
Based on these simulations, we believe that the experimental peaks (Fig. 4) centered at 1577 and 1592 cm−1 correspond to folded and disordered configurations, respectively. The 15 cm−1 experimental separation between these two bands is larger than the simulated one, most likely because the simulations underestimate the spectral shift between the two configurations. Using the intensities from the simulations, we estimate populations of the two states in experiment. Simulated spectra, which are normalized to the main amide I band intensity, show an amplitude ratio of 3.1 between the native and disordered peaks, whereas experiments show a ratio of 4.5. Although we cannot accurately quantify populations with this model, the similarity between these two values indicates that a significant fraction (>40%) of the experimental population is disordered turns, compared to the 9% observed in the MD simulation, indicating that simulations tend to overstabilize folded turn configurations.
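The arithmetic behind the >40% estimate is not spelled out. One reading that reproduces it, offered here only as an assumption, treats the simulated ratio (3.1) as the intensity ratio the folded and disordered peaks would have at equal populations, so that the experimental amplitude ratio (4.5) equals 3.1 multiplied by the folded-to-disordered population ratio. The function name is hypothetical.

```python
def disordered_fraction(experimental_ratio, per_population_ratio):
    """Disordered population fraction under the equal-population-intensity assumption."""
    # Folded : disordered population ratio implied by the two amplitude ratios.
    population_ratio = experimental_ratio / per_population_ratio
    return 1.0 / (1.0 + population_ratio)


frac = disordered_fraction(4.5, 3.1)
print(f"Estimated disordered fraction: {frac:.1%}")
```

Under this assumption the disordered fraction comes out just above 40%, far larger than the 9% found in the MSM.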
Solvent exposure at the G16- and G24-label positions
Next, we examine the solvent exposure at the G16 and G24 positions by binning structures based on the average number of protein-protein and protein-water hydrogen bonds accepted by these carbonyls. For the purpose of analysis, we consider structures that fall into two principal H-bond bins: structures with nPP ≥ 1 (folded, 89% of total population) and structures with nPW ≥ 1 (solvent-exposed, 1.4% of total population). Interestingly, the degree of solvent exposure appears to be uncorrelated with the global RMSD of the protein (correlation coefficient ρ = 0.06, not population-weighted) or with rVM (ρ = 0.23), indicating that disorder at the G24 position is purely local in nature. Fig. 7 a shows configurations randomly sampled from the two bins, and Fig. 7 c shows the MSM network diagram color-coded by G24(O) solvent exposure. The high nPP structures correspond to a folded helix, where the G24(O) is hydrogen bonded to the F29 amide, whereas the high nPW states feature a fully exposed G24 carbonyl. Fig. 7 b shows the simulated G24-label spectra for these two ensembles. The folded spectrum shows a peak centered around 1600 cm−1, whereas the solvent-exposed spectrum only shows the low-frequency edge of the main band. Analysis of the G24 site energies reveals that the average residue frequencies (standard deviations) are 1608 (8.5) and 1610 (7.2) cm−1 for solvent-exposed and folded bins, respectively, indicating that the two peaks in the experimental spectrum, 1587 and 1572 cm−1, correspond to a folded helix structure and a solvent-exposed G24 site, respectively. In the spectral models, backbone flexibility contributes to frequency fluctuations through nearest-neighbor frequency shifts, which are calculated via φ/ψ angle maps (37,38). To partially separate the individual contributions of solvent exposure and backbone flexibility to the residue frequencies, we repeated the frequency calculations without nearest-neighbor frequency shifts. The resulting frequencies (standard deviations) are 1614 (8.7) and 1599 (7.6) cm−1 for solvent-exposed and folded bins, respectively, indicating that the backbone orientation compensates for the blue shift in solvent-exposed configurations. Fluctuations are similar in both sets of simulations, suggesting that the broadness of the line shapes in the G24 spectra originates primarily from solvent exposure.
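The binning analysis described above can be illustrated with a short sketch. It is not the code used in this work: the function names and the five-state toy data set are hypothetical, and only the bin thresholds (nPP ≥ 1 and nPW ≥ 1) and the unweighted correlation coefficient follow the text.

```python
import numpy as np

def weighted_mean_std(values, weights):
    """Population-weighted mean and standard deviation of a set of values."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean = np.average(values, weights=weights)
    variance = np.average((values - mean) ** 2, weights=weights)
    return mean, np.sqrt(variance)

def bin_by_hbonds(n_pp, n_pw):
    """Boolean masks for the two bins used in the text.

    n_pp: average number of protein-protein H-bonds accepted by the carbonyl
    n_pw: average number of protein-water H-bonds accepted by the carbonyl
    The two bins are defined independently, so a state may fall in neither.
    """
    n_pp = np.asarray(n_pp, dtype=float)
    n_pw = np.asarray(n_pw, dtype=float)
    return n_pp >= 1.0, n_pw >= 1.0

# Hypothetical toy data for five states (NOT values from the MSM).
n_pp = np.array([1.0, 1.2, 0.9, 0.1, 1.0])
n_pw = np.array([0.1, 0.0, 0.4, 1.3, 0.2])
freq = np.array([1610.0, 1612.0, 1606.0, 1608.0, 1609.0])  # site frequency, cm^-1
pop = np.array([0.40, 0.30, 0.10, 0.05, 0.15])             # state populations
rmsd = np.array([1.0, 1.5, 3.0, 2.0, 1.2])                 # global RMSD, angstrom

folded, exposed = bin_by_hbonds(n_pp, n_pw)
mean_f, std_f = weighted_mean_std(freq[folded], pop[folded])
mean_e, std_e = weighted_mean_std(freq[exposed], pop[exposed])

# Unweighted Pearson correlation between solvent exposure and global RMSD.
rho = np.corrcoef(n_pw, rmsd)[0, 1]

print(f"folded bin:  {mean_f:.1f} ({std_f:.1f}) cm^-1")
print(f"exposed bin: {mean_e:.1f} ({std_e:.1f}) cm^-1")
print(f"rho(n_PW, RMSD) = {rho:.2f}")
```

Repeating the same statistics on frequencies computed with and without nearest-neighbor shifts would give the two pairs of averages compared in the text.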
Figure 7.

(a) Structures of the folded and solvent-exposed G24 configurations of the α-helix aligned to the lowest RMSD structure (black outline). (Red) The G24 carbonyls. (b) Simulated diagonal 2D IR spectra of the two conformations. The folded peak is more intense than that of the solvent-exposed conformations. (c) MSM color-coded by G24(O) solvent exposure: nPW/nTOT.
In the case of G16, which resides in a partially disordered region immediately preceding the third β-strand, we find an average of 1.1 protein-water H-bonds and negligible protein-protein H-bonds, indicating that G16 remains fully solvent-exposed in all structures. These results are not surprising because in the crystal structure, the residue does not display protein-protein hydrogen bonds. Compared with G24, the G16 spectrum has additional intensity on the blue side (Fig. 4), pointing toward a more solvent-exposed backbone at this site.
Finally, it is important to compare the frequency shifts observed for V9 versus G24. For G24, solvent-exposed configurations are red-shifted compared to the native helix conformation, whereas the opposite trend is observed in the case of V9. These trends can be interpreted by considering the total number of hydrogen bonds: in the case of G24, the C=O accepts a single hydrogen bond from its partner in the helix, whereas in partially solvent-exposed configurations water molecules participate in additional hydrogen bonds, thus red-shifting the residue frequency. The opposite holds for V9, where folded configurations display two stable protein-protein hydrogen bonds across the β-turn, but solvent-exposed structures accept only ∼1.5 protein-water hydrogen bonds, explaining the blue shift observed in the spectrum.
Discussion
Structure and stability of folded NTL91-39
The experimental spectra are consistent with a structure that has the overall topology in Fig. 1, and point toward a well-folded β-sheet, a significant population of disordered backbone configurations in the β-turn, and partially disordered residues in the α-helix region. In the experiments, the sharp V3 peak indicates that the residues in β-strands I and II are in rigid hydrogen-bonding configurations with little heterogeneity. In contrast, the breadth and red shift of V9 and V9G13 spectra indicate that a significant fraction (>40%) of the β-turn population resides in disordered and solvent-exposed configurations. The two-peak structure observed for G24 indicates that the first turn of the helix adopts different hydrogen-bonding conformations, perhaps with solvent-exposed carbonyl configurations similar to previous observations for exterior-facing residues of α3D (44,45). Spectra for G16 show broad line shapes as a result of disorder and solvent exposure.
Simulations of these spectra using Markov states and other subensembles allow us to build an atomistic picture of the conformations in the folded state that are consistent with experiment. Overall, simulations show a well-folded structure with 66% of the population in four low-RMSD states that are distinguishable only by low-amplitude hinging motions of the β-turn. The remaining 34% of the population has significant configurational freedom, with RMSD values of up to 5 Å. Folded β-sheets characterized by the hydrogen bond at the V3 position are apparent in >99% of the population, and these populations lead to excellent agreement with experiment. Spectra calculated for other sites indicate that MD simulations underestimate their disordered populations. The MD simulations show a tightly folded β-turn with only ∼9% of the population in disordered configurations, whereas our experimental assignment shows a >40% population of disordered turns. Similarly, solvent-exposed helix turns at G24 constitute 1.4% of the total MSM population, but experiments suggest considerably higher variation in the helix structure. We were unable to assign specific conformations to the G16 line shapes, but the breadth indicates significant disorder and solvent exposure for this carbonyl. Beyond the few lowest-RMSD states, measures of protein disorder in the α-helix and β-loop regions and for the entire protein appear to be uncorrelated, indicating that the disorder arises from local fluctuations with a short correlation length.
Taken together, these results suggest that the force field overstabilizes structures in the native-state ensemble. However, a one-to-one comparison between simulation and experiment must take into account the following differences in conditions. Simulations were carried out on the K12M mutant, a mutation near the center of the loop region that destabilizes the unfolded state with respect to wild-type, raising Tm by 18°C. This mutation could partially account for the overstabilization of the native β-turn configurations in the simulations. Additionally, simulations were run near Tm (for K12M), with multiple folding and unfolding events, yet a coarse separation of the free-energy surface into folded and disordered ensembles reveals a folded-to-unfolded equilibrium constant of ∼4 (see Baiz et al. (34)), whereas a value of ∼1 is expected at Tm, implying that simulations overstabilize the native-state ensemble over denatured states. In contrast, experiments were run on wild-type NTL91-39, and at 15°C below Tm (for WT at pH = 2.2) with an equilibrium constant of >10. Experimental conditions should therefore favor low-RMSD configurations, although the connection between stability (Tm) and backbone flexibility is unclear. Finally, it is important to consider the difference in pH between simulation and experiment. In addition to changes in protein stability, differences in side-chain protonation could alter the electrostatics at the labeled residues and thus shift their frequencies. These pH changes should be taken into consideration in order to get quantitative agreement between simulation and experiment.
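To make the comparison of equilibrium constants concrete, the following sketch converts a folded-to-unfolded equilibrium constant into a folded population and a stability in units of RT. It assumes a simple two-state picture, which is a simplification of the MSM, and the function names are illustrative only.

```python
import math

def folded_fraction(k_eq):
    """Folded population for a two-state equilibrium, K = [folded]/[unfolded]."""
    return k_eq / (1.0 + k_eq)

def stability_in_rt(k_eq):
    """Unfolding free energy in units of RT: dG_unfold / RT = ln K."""
    return math.log(k_eq)

# K ~ 4 (simulation) and K > 10 (experiment, used here as a lower bound)
# are the values quoted in the text; K = 1 is the definition of Tm.
for label, k_eq in (("at Tm", 1.0), ("simulation", 4.0), ("experiment", 10.0)):
    print(f"{label:>10s}: K = {k_eq:4.1f}, "
          f"folded fraction = {folded_fraction(k_eq):.2f}, "
          f"dG/RT = {stability_in_rt(k_eq):.2f}")
```

Under this two-state assumption, K ≈ 4 corresponds to a folded population of 80%, compared with 50% expected at Tm and more than 90% under the experimental conditions.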
These results also add further insight into our previous study of NTL91-39 folding, which was carried out with the full MSM and T-jump protein folding experiments (34). The earlier study, which was rooted in spectral simulations of the fully coupled amide-I vibrations rather than site-specific labels, concluded that the overall changes in calculated 2D IR spectra associated with the global unfolding were consistent with changes observed in experimental spectra. While the earlier work provided evidence for a well-folded secondary structure, these new studies present fresh evidence of backbone solvent exposure that depends on position. Our present interpretation of a stable β-sheet in the MSM appears to be consistent with previous conclusions. Comparing results for the helix suggests that the overall secondary structure is retained in the folded state, but with a significant degree of backbone solvation in the G24 region. This finding points to the importance and power of a full self-consistent analysis of infrared spectroscopy with and without isotope editing strategies.
Further examination of the MSM provides additional predictions regarding the folding mechanism of NTL91-39. A more recent MSM built by Schwantes and Pande (43) from the same MD trajectories showed that 17% of the flux is channeled through near-native states that have a secondary structure similar to the folded state, but which display nonnative side-chain contacts. These states closely resemble the disordered β-turn states described here, suggesting that the turn likely plays a significant role in folding. In addition, the K12M mutation in the β-turn, which significantly destabilizes nonnative states, is observed to participate in the hydrophobic core of the protein. T-jump unfolding experiments with isotope labels in the β-turn would elucidate the relationship between β-turn disorder and the global folding coordinate.
Isotope-edited 2D IR spectroscopy as a structural tool
Our results show that isotope-edited 2D IR spectroscopy, combined with structural modeling based on MD simulations, is a promising technique for investigating protein structure in solution. Similar to methods used in magnetic resonance, a self-consistent analysis of multiple structural restraints from site-specific experiments drawing on trial protein structures can be used to determine the most probable structure. However, as a method that captures picosecond snapshots, 2D IR spectroscopy can potentially be extended to characterizing structural heterogeneity, for instance in the study of intrinsically disordered protein structure (31). Our current analysis of experiments is qualitative, focusing on general correlations between structure and amide I frequency shifts and intensities. Further steps, most of which are of a theoretical or computational nature, will be needed in order to perform quantitative analyses of structural distributions or to build structural ensembles consistent with experiment.
Of primary importance is the need for quantitatively accurate spectral models. Similar to electronic structure calculations, as of this writing, spectral simulations are useful not as a one-to-one comparison with experiment, but to aid in interpreting the molecular origin of the observed spectra. Spectral models that accurately capture frequency variations in different environments would be essential for quantitative interpretation of experimental spectra. Differences between experimental and simulated spectra of NTL91-39 highlight the challenges associated with applying contemporary electrostatic models to small proteins. Notably, peaks in the simulated spectra appear somewhat broader and blue-shifted compared to experiment. The peak-width differences could be attributed, in part, to motional narrowing, which is not captured by the static models used here. Although the electrostatic map approach has shown excellent agreement with experiment for small peptides (31,46,47), Ganim and Tokmakoff (48) pointed out some general shortcomings that appear to be systematic. Peaks are often blue-shifted with respect to experiment because maps underestimate the solvent-induced frequency shift relative to the gas phase, and the dynamical effects that narrow the line shapes (motional narrowing) are often ignored. Furthermore, systematic studies of vibrational interactions between amide I groups are needed to improve our understanding of the through-bond and through-space coupling mechanisms.
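The effect of motional narrowing on the peak widths can be illustrated with Kubo's stochastic line-shape model, a standard textbook model that is not the spectral model used in this work. The sketch below assumes an exponentially decaying frequency-fluctuation correlation function with amplitude `delta` and correlation time `tau`, in dimensionless units; all names and parameter values are chosen for illustration only.

```python
import numpy as np

def kubo_lineshape(omega, delta, tau, t_max=100.0, n_t=4001):
    """Absorption line shape for Kubo's stochastic model.

    omega: detuning from the mean frequency (same units as delta)
    delta: standard deviation of the frequency fluctuations
    tau:   correlation time of the fluctuations (units of 1/delta)
    """
    t = np.linspace(0.0, t_max, n_t)
    # Line-shape function g(t) for C(t) = delta^2 * exp(-t / tau).
    g = (delta * tau) ** 2 * (np.exp(-t / tau) + t / tau - 1.0)
    response = np.exp(-g)
    # Half-sided cosine transform, evaluated with the trapezoid rule.
    integrand = np.cos(np.outer(omega, t)) * response
    dt = t[1] - t[0]
    return dt * (integrand.sum(axis=1)
                 - 0.5 * (integrand[:, 0] + integrand[:, -1]))

def fwhm(omega, spectrum):
    """Full width at half-maximum of a single-peaked spectrum on a grid."""
    above = omega[spectrum >= 0.5 * spectrum.max()]
    return above.max() - above.min()

omega = np.linspace(-4.0, 4.0, 801)
delta = 1.0

# Slow fluctuations (delta * tau >> 1): static, inhomogeneous limit.
width_static = fwhm(omega, kubo_lineshape(omega, delta, tau=100.0))
# Fast fluctuations (delta * tau << 1): motionally narrowed limit.
width_fast = fwhm(omega, kubo_lineshape(omega, delta, tau=0.1))

print(f"FWHM, slow fluctuations: {width_static:.2f}")
print(f"FWHM, fast fluctuations: {width_fast:.2f}")
```

For the same fluctuation amplitude, the fast-fluctuation line is much narrower than the static one, which is why a static treatment of the frequency distribution tends to overestimate peak widths.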
With an accurate spectral model, one can turn toward methods for comparing to trial structures. This will require strategies for identifying statistically meaningful trial structures, which makes the MSM approach particularly appealing. As of this writing, clustering of MSMs relies on structures from MD snapshots, but the fact that IR spectroscopy probes site energies and couplings suggests MSMs could be reencoded into more spectroscopically compatible variables, to more intuitively map structure-frequency relationships in proteins. Further, methods will be needed to score the information content of an IR spectrum. The frequency and line shape of labeled vibrations or the main amide I band may form strong or limited restraints on a structure. To what extent does a sequence of amide I frequencies restrain a structure, or distinguish between conformers? A statistical approach to this analysis will be required to refine structural ensembles. Strategic selection of specific sites for labeling can be used to design maximally informative experiments.
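As one very simple example of comparing a spectrum against trial ensembles, the sketch below estimates the population of a disordered subensemble by a least-squares fit of a measured spectrum to a mixture of two basis spectra. This is an illustrative assumption rather than the analysis performed here: it presumes that the spectrum is a population-weighted sum of subensemble spectra that do not interconvert during the measurement, and the Gaussian basis spectra and their parameters are hypothetical.

```python
import numpy as np

def gaussian(x, center, fwhm):
    """Unit-height Gaussian band with the given center and FWHM."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def disordered_fraction(measured, folded_basis, disordered_basis):
    """Least-squares population p in: measured = (1 - p) * folded + p * disordered.

    The closed-form solution is projected onto the physical range [0, 1].
    """
    diff = disordered_basis - folded_basis
    p = np.dot(measured - folded_basis, diff) / np.dot(diff, diff)
    return float(np.clip(p, 0.0, 1.0))

# Hypothetical basis spectra on a frequency axis in cm^-1.
freq = np.linspace(1540.0, 1640.0, 501)
folded_basis = gaussian(freq, center=1600.0, fwhm=12.0)
disordered_basis = gaussian(freq, center=1585.0, fwhm=25.0)

# Synthetic, noise-free "measurement" containing 40% disordered population.
p_true = 0.4
measured = (1.0 - p_true) * folded_basis + p_true * disordered_basis

p_fit = disordered_fraction(measured, folded_basis, disordered_basis)
print(f"fitted disordered fraction: {p_fit:.2f}")
```

With real data, noise and errors in the basis spectra limit how well the populations are determined, which is the information-content question raised above.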
In principle, the information content that we extracted from our two-dimensional spectra in this study is also available from traditional FTIR spectroscopy. In practice, however, the structural interpretation of FTIR spectra is significantly more challenging. The primary advantage of 2D IR is that it is less affected by solvent absorption than FTIR, and therefore complex difference spectroscopy is not required. Difference spectroscopy is particularly challenging when conducting site-specific isotope-labeling studies on large proteins or at low concentrations. The same nonlinear effect that suppresses transitions with low transition dipole moments in 2D IR spectra also serves to sharpen the resonances of the isotope labels. However, 2D IR has its own unique challenges: peak heights are not straightforwardly related to populations, data acquisition times are significantly longer than for FTIR, and spectral calculations are considerably more time-consuming.
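The suppression of weak absorbers can be illustrated with the usual scaling of signal strength with transition dipole moment: linear absorption scales as c|μ|², whereas the 2D IR signal scales as c|μ|⁴, where c is the concentration. The concentrations and dipole strengths in the sketch below are hypothetical values in arbitrary units, chosen only to show the trend.

```python
def linear_signal(concentration, mu):
    """Linear (FTIR) absorption strength, proportional to c * |mu|^2."""
    return concentration * abs(mu) ** 2

def two_d_signal(concentration, mu):
    """Third-order (2D IR) signal strength, proportional to c * |mu|^4."""
    return concentration * abs(mu) ** 4

# Hypothetical values: an abundant, weakly absorbing background mode
# versus a dilute, strongly absorbing labeled amide I mode.
c_background, mu_background = 100.0, 0.2
c_label, mu_label = 1.0, 1.0

ratio_ftir = (linear_signal(c_label, mu_label)
              / linear_signal(c_background, mu_background))
ratio_2d = (two_d_signal(c_label, mu_label)
            / two_d_signal(c_background, mu_background))

print(f"label/background, FTIR:  {ratio_ftir:.2f}")
print(f"label/background, 2D IR: {ratio_2d:.2f}")
```

The same scaling is one reason why peak heights in 2D IR spectra are not straightforwardly related to populations: a change in transition dipole strength between conformations changes the peak intensity more strongly than in FTIR.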
Conclusions
In summary, isotopic substitutions were used to characterize the hydrogen-bonding environments at different positions along the backbone of NTL91-39 and demonstrate how significant backbone flexibility can be observed for a well-folded protein. The β-loop region in particular, V9 and V9G13, is characterized by a large population of partially disordered states with a fully solvent-exposed backbone. Surprisingly, we observe that the α-helix configurations (G24) still display considerable disorder.
The information content in 2D IR on protein structural heterogeneity and disorder has potential to be used in a variety of contexts, for instance in studies of conformational selection (49,50) or intrinsically disordered proteins (4,51–53). Lacking a well-defined secondary structure, some proteins exist as an ensemble of rapidly interconverting structures that play an active role in biology. Experimentally characterizing protein dynamics has been especially challenging for biophysical methods because crystallography and NMR often reinforce a static, structure-centric view of proteins. A more complete physical description of a protein would not only include the dominant equilibrium conformations but also the free-energy surface around and connecting these minima. Spectra presented here serve to visualize the principal structures that compose the native ensemble, but leave open questions related to kinetics. We expect that, assisted by simulations, T-jump 2D IR experiments on isotope-labeled samples could provide a detailed experimental characterization of the complex free-energy landscape of small proteins (54).
Until recently, points of comparison between simulation and experiment have been at the level of equilibrium structures and folding kinetics. As a result, force fields are built to stabilize secondary structures, and thus may fail to properly sample the configurational space of the protein. Our study illustrates how experimental data can be used to test and improve force fields or aid in developing new experimentally compatible methods for interpreting structural variation seen in MD simulations. In conclusion, we believe that the synergy among state-of-the-art MD simulations, spectral modeling, and ultrafast IR spectroscopy will deliver an intuitive view of proteins beyond average equilibrium structures.
Author Contributions
C.R.B. and A.T. designed the research; C.R.B. performed the research; C.R.B. and A.T. contributed analytic tools; C.R.B. analyzed the data; and C.R.B. and A.T. wrote the article.
Acknowledgments
We thank Christian Schwantes and Professor Vijay Pande (Stanford University, Stanford, CA) as well as Professor Yu-Shan Lin (Tufts University, Medford, MA) for providing the Markov state model and for insightful discussions.
This project was funded by the National Science Foundation (under grants No. CHE-1212557 and No. CHE-1414486), the Massachusetts Institute of Technology Laser Biomedical Research Center (under grant No. P41-EB015871), and a startup grant from the University of Chicago. C.R.B. gratefully acknowledges the National Institutes of Health for a Ruth L. Kirschstein National Research Service Award (under grant No. F32GM105104).
References
- 1. Henzler-Wildman K., Kern D. Dynamic personalities of proteins. Nature. 2007;450:964–972. doi: 10.1038/nature06522.
- 2. Dill K.A., MacCallum J.L. The protein-folding problem, 50 years on. Science. 2012;338:1042–1046. doi: 10.1126/science.1219021.
- 3. Bryngelson J.D., Onuchic J.N., Wolynes P.G. Funnels, pathways, and the energy landscape of protein folding: a synthesis. Proteins. 1995;21:167–195. doi: 10.1002/prot.340210302.
- 4. Uversky V.N., Dunker A.K. Understanding protein non-folding. Biochim. Biophys. Acta. 2010;1804:1231–1264. doi: 10.1016/j.bbapap.2010.01.017.
- 5. Mobley D.L., Dill K.A. Binding of small-molecule ligands to proteins: “what you see” is not always “what you get”. Structure. 2009;17:489–498. doi: 10.1016/j.str.2009.02.010.
- 6. Smock R.G., Gierasch L.M. Sending signals dynamically. Science. 2009;324:198–203. doi: 10.1126/science.1169377.
- 7. Tzeng S.R., Kalodimos C.G. Protein dynamics and allostery: an NMR view. Curr. Opin. Struct. Biol. 2011;21:62–67. doi: 10.1016/j.sbi.2010.10.007.
- 8. Henzler-Wildman K.A., Lei M., Kern D. A hierarchy of timescales in protein dynamics is linked to enzyme catalysis. Nature. 2007;450:913–916. doi: 10.1038/nature06407.
- 9. Boehr D.D., Nussinov R., Wright P.E. The role of dynamic conformational ensembles in biomolecular recognition. Nat. Chem. Biol. 2009;5:789–796. doi: 10.1038/nchembio.232.
- 10. Popovych N., Tzeng S.R., Kalodimos C.G. Structural basis for cAMP-mediated allosteric control of the catabolite activator protein. Proc. Natl. Acad. Sci. USA. 2009;106:6927–6932. doi: 10.1073/pnas.0900595106.
- 11. Schreiber G., Haran G., Zhou H.X. Fundamental aspects of protein-protein association kinetics. Chem. Rev. 2009;109:839–860. doi: 10.1021/cr800373w.
- 12. Palmer A.G., 3rd. NMR characterization of the dynamics of biomacromolecules. Chem. Rev. 2004;104:3623–3640. doi: 10.1021/cr030413t.
- 13. Wand A.J. Dynamic activation of protein function: a view emerging from NMR spectroscopy. Nat. Struct. Biol. 2001;8:926–931. doi: 10.1038/nsb1101-926.
- 14. Baldwin A.J., Kay L.E. NMR spectroscopy brings invisible protein states into focus. Nat. Chem. Biol. 2009;5:808–814. doi: 10.1038/nchembio.238.
- 15. Lu H.P. Sizing up single-molecule enzymatic conformational dynamics. Chem. Soc. Rev. 2014;43:1118–1143. doi: 10.1039/c3cs60191a.
- 16. Schuler B. Single-molecule fluorescence spectroscopy of protein folding. ChemPhysChem. 2005;6:1206–1220. doi: 10.1002/cphc.200400609.
- 17. Ganim Z., Chung H.S., Tokmakoff A. Amide I two-dimensional infrared spectroscopy of proteins. Acc. Chem. Res. 2008;41:432–441. doi: 10.1021/ar700188n.
- 18. Hamm P., Zanni M.T. Concepts and Methods of 2D Infrared Spectroscopy. Cambridge University Press; New York: 2011.
- 19. Serrano A.L., Waegele M.M., Gai F. Spectroscopic studies of protein folding: linear and nonlinear methods. Protein Sci. 2012;21:157–170. doi: 10.1002/pro.2006.
- 20. Remorino A., Hochstrasser R.M. Three-dimensional structures by two-dimensional vibrational spectroscopy. Acc. Chem. Res. 2012;45:1896–1905. doi: 10.1021/ar3000025.
- 21. Kim Y.S., Hochstrasser R.M. Applications of 2D IR spectroscopy to peptides, proteins, and hydrogen-bond dynamics. J. Phys. Chem. B. 2009;113:8231–8251. doi: 10.1021/jp8113978.
- 22. Fayer M.D. Ultrafast Infrared Vibrational Spectroscopy. Taylor & Francis; Boca Raton, FL: 2013.
- 23. Horng J.C., Moroz V., Raleigh D.P. Characterization of large peptide fragments derived from the N-terminal domain of the ribosomal protein L9: definition of the minimum folding motif and characterization of local electrostatic interactions. Biochemistry. 2002;41:13360–13369. doi: 10.1021/bi026410c.
- 24. Horng J.-C., Moroz V., Raleigh D.P. Rapid cooperative two-state folding of a miniature α-β protein and design of a thermostable variant. J. Mol. Biol. 2003;326:1261–1270. doi: 10.1016/s0022-2836(03)00028-7.
- 25. Baiz C.R., Peng C.S., Tokmakoff A. Coherent two-dimensional infrared spectroscopy: quantitative analysis of protein secondary structure in solution. Analyst (Lond.). 2012;137:1793–1799. doi: 10.1039/c2an16031e.
- 26. Smith A.W., Lessing J., Knoester J. Melting of a β-hairpin peptide using isotope-edited 2D IR spectroscopy and simulations. J. Phys. Chem. B. 2010;114:10913–10924. doi: 10.1021/jp104017h.
- 27. Strasfeld D.B., Ling Y.L., Zanni M.T. Strategies for extracting structural information from 2D IR spectroscopy of amyloid: application to islet amyloid polypeptide. J. Phys. Chem. B. 2009;113:15679–15691. doi: 10.1021/jp9072203.
- 28. Pande V.S., Beauchamp K., Bowman G.R. Everything you wanted to know about Markov state models but were afraid to ask. Methods. 2010;52:99–105. doi: 10.1016/j.ymeth.2010.06.002.
- 29. Noé F., Fischer S. Transition networks for modeling the kinetics of conformational change in macromolecules. Curr. Opin. Struct. Biol. 2008;18:154–162. doi: 10.1016/j.sbi.2008.01.008.
- 30. Voelz V.A., Bowman G.R., Pande V.S. Molecular simulation of ab initio protein folding for a millisecond folder NTL9(1–39). J. Am. Chem. Soc. 2010;132:1526–1528. doi: 10.1021/ja9090353.
- 31. Lessing J., Roy S., Tokmakoff A. Identifying residual structure in intrinsically disordered systems: a 2D IR spectroscopic study of the GVGXPGVG peptide. J. Am. Chem. Soc. 2012;134:5032–5035. doi: 10.1021/ja2114135.
- 32. Marecek J., Song B., Raleigh D.P. A simple and economical method for the production of 13C,18O-labeled Fmoc-amino acids with high levels of enrichment: applications to isotope-edited IR studies of proteins. Org. Lett. 2007;9:4935–4937. doi: 10.1021/ol701913p.
- 33. Khalil M., Demirdoven N., Tokmakoff A. Coherent 2D IR spectroscopy: molecular structure and dynamics in solution. J. Phys. Chem. A. 2003;107:5258–5279.
- 34. Baiz C.R., Lin Y.S., Tokmakoff A. A molecular interpretation of 2D IR protein folding experiments with Markov state models. Biophys. J. 2014;106:1359–1370. doi: 10.1016/j.bpj.2014.02.008.
- 35. Lindorff-Larsen K., Piana S., Shaw D.E. How fast-folding proteins fold. Science. 2011;334:517–520. doi: 10.1126/science.1208351.
- 36. la Cour Jansen T., Dijkstra A.G., Knoester J. Modeling the amide I bands of small peptides. J. Chem. Phys. 2006;125:44312. doi: 10.1063/1.2218516.
- 37. la Cour Jansen T., Knoester J. A transferable electrostatic map for solvation effects on amide I vibrations and its application to linear and two-dimensional spectroscopy. J. Chem. Phys. 2006;124:044502. doi: 10.1063/1.2148409.
- 38. la Cour Jansen T., Knoester J. Nonadiabatic effects in the two-dimensional infrared spectra of peptides: application to alanine dipeptide. J. Phys. Chem. B. 2006;110:22910–22916. doi: 10.1021/jp064795t.
- 39. Cheatum C.M., Tokmakoff A., Knoester J. Signatures of β-sheet secondary structures in linear and two-dimensional infrared spectroscopy. J. Chem. Phys. 2004;120:8201–8215. doi: 10.1063/1.1689637.
- 40. Bowman G.R., Pande V.S. Protein folded states are kinetic hubs. Proc. Natl. Acad. Sci. USA. 2010;107:10890–10895. doi: 10.1073/pnas.1003962107.
- 41. Dickson A., Brooks C.L., 3rd. Native states of fast-folding proteins are kinetic traps. J. Am. Chem. Soc. 2013;135:4729–4734. doi: 10.1021/ja311077u.
- 42. Gnanakaran S., Hochstrasser R.M. Conformational preferences and vibrational frequency distributions of short peptides in relation to multidimensional infrared spectroscopy. J. Am. Chem. Soc. 2001;123:12886–12898. doi: 10.1021/ja011088z.
- 43. Schwantes C.R., Pande V.S. Improvements in Markov state model construction reveal many non-native interactions in the folding of NTL9. J. Chem. Theory Comput. 2013;9:2000–2009. doi: 10.1021/ct300878a.
- 44. Manas E.S., Getahun Z., Vanderkooi J.M. Infrared spectra of amide groups in α-helical proteins: evidence for hydrogen bonding between helices and water. J. Am. Chem. Soc. 2000;122:9883–9890.
- 45. Walsh S.T., Cheng R.P., DeGrado W.F. The hydration of amides in helices; a comprehensive picture from molecular dynamics, IR, and NMR. Protein Sci. 2003;12:520–531. doi: 10.1110/ps.0223003.
- 46. Woys A.M., Almeida A.M., Zanni M.T. Parallel β-sheet vibrational couplings revealed by 2D IR spectroscopy of an isotopically labeled macrocycle: quantitative benchmark for the interpretation of amyloid and protein infrared spectra. J. Am. Chem. Soc. 2012;134:19118–19128. doi: 10.1021/ja3074962.
- 47. Reppert M., Tokmakoff A. Electrostatic frequency shifts in amide I vibrational spectra: direct parameterization against experiment. J. Chem. Phys. 2013;138:134116. doi: 10.1063/1.4798938.
- 48. Ganim Z., Tokmakoff A. Spectral signatures of heterogeneous protein ensembles revealed by MD Simulations of 2DIR spectra. Biophys. J. 2006;91:2636–2646. doi: 10.1529/biophysj.106.088070.
- 49. Kiefhaber T., Bachmann A., Jensen K.S. Dynamics and mechanisms of coupled protein folding and binding reactions. Curr. Opin. Struct. Biol. 2012;22:21–29. doi: 10.1016/j.sbi.2011.09.010.
- 50. Daniels K.G., Tonthat N.K., Oas T.G. Ligand concentration regulates the pathways of coupled protein folding and binding. J. Am. Chem. Soc. 2014;136:822–825. doi: 10.1021/ja4086726.
- 51. Wright P.E., Dyson H.J. Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J. Mol. Biol. 1999;293:321–331. doi: 10.1006/jmbi.1999.3110.
- 52. Xie H., Vucetic S., Obradovic Z. Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions. J. Proteome Res. 2007;6:1882–1898. doi: 10.1021/pr060392u.
- 53. Habchi J., Tompa P., Uversky V.N. Introducing protein intrinsic disorder. Chem. Rev. 2014;114:6561–6588. doi: 10.1021/cr400514h.
- 54. Jones K.C., Peng C.S., Tokmakoff A. Folding of a heterogeneous β-hairpin peptide from temperature-jump 2D IR spectroscopy. Proc. Natl. Acad. Sci. USA. 2013;110:2828–2833. doi: 10.1073/pnas.1211968110.
