Author manuscript; available in PMC: 2011 Dec 21.
Published in final edited form as: J Am Chem Soc. 2011 Nov 28;133(50):20335–20340. doi: 10.1021/ja2071362

Differential ordering of the protein backbone and side chains during protein folding revealed by site-specific recombinant infrared probes

Sureshbabu Nagarajan, Humeyra Taskent-Sezgin, Dzmitry Parul, Isaac Carrico, Daniel P Raleigh, R Brian Dyer
PMCID: PMC3241911  NIHMSID: NIHMS337528  PMID: 22039909

Abstract

The timescale for ordering of the polypeptide backbone relative to the side chains is a critical issue in protein folding. The interplay between ordering of the backbone and side chains is particularly important for the formation of β-sheet structures, as the polypeptide chain searches for the native stabilizing cross-strand interactions. We have studied these issues in the N-terminal domain of protein L9 (NTL9), a model protein with mixed α/β structure. We have developed a general approach for introducing site-specific IR probes for the side chains (azide) and backbone (13C=18O) using recombinant protein expression. T-jump, time-resolved IR spectroscopy combined with site-specific labeling enables independent measurement of the respective backbone and side chain dynamics with single residue resolution. We find that side chain ordering in a key region of the β-sheet structure occurs on a slower time scale than ordering of the backbone during the folding of NTL9, likely due to the transient formation of nonnative side chain interactions.

Keywords: Protein folding, dynamics, infrared, isotope label, azidohomoalanine

INTRODUCTION

Protein folding involves several structural transitions, including ordering of the polypeptide backbone to form specific secondary structures and the global topology, and the packing of side chains to form the stabilizing tertiary interactions of the hydrophobic core. The dynamics of these transitions and their relative order, whether they happen concurrently or sequentially or with some more complex time dependence, are still only poorly understood.1 Furthermore, non-native backbone conformations or tertiary interactions may act as kinetic traps that slow productive folding. Recent studies suggest that transiently formed non-native side chain interactions produce additional entropic and enthalpic barriers, and unfolding of such misfolded states may become the rate-limiting step to folding.2–5 β-sheet proteins may be particularly prone to forming non-native contacts during folding as the system searches for the correct cross-strand interactions. We have investigated these issues in a widely studied model system for protein folding dynamics, NTL9, the first 56 residues of the ribosomal protein L9. We have measured the dynamics of forming the native cross-strand interactions in the β-sheet of NTL9, compared to the rate of ordering of the backbone, with single amino acid specificity.

An ideal folding experiment should yield structural specificity on all time scales for describing the progress of the side chain and backbone ordering at residue specific resolution. Such an experiment would require site-specific, non-perturbing spectroscopic labels to probe the folding of side chains and backbone separately, preferably introduced using simple recombinant methods that can be applied to nearly any protein. Most folding studies use fluorescence monitored stopped flow or T-jump methods to measure folding rates and to deduce the effects of mutations. Fluorescence spectroscopy can be a local probe, sensitive to structural changes around the chromophore or specific quenching interactions, but it can be difficult to interpret in terms of specific structure and is almost always interpreted to follow the global folding reaction. We have observed differences between kinetics measured by fluorescence (local Trp environment) versus infrared (global backbone conformation).6 In addition to this ambiguity, the introduction of fluorescence probes into different parts of the structure risks perturbing both the structures and dynamics being studied. In order to resolve these problems, we have developed non-perturbing, site-specific infrared (IR) labeling methods using azide (side chain) and 13C=18O (amide backbone) labels that can be easily incorporated into proteins through recombinant protein expression, and that allow the direct independent monitoring of specific side chain and specific backbone sites.

IR spectroscopy is a well-established approach to investigate the secondary structural content and dynamics of protein folding.7,8 The IR spectral signatures corresponding to specific protein secondary structures, however, are often obscured within a broad featureless amide I band.9 Likewise, side chain vibrations often occur in congested spectral regions, and hence are difficult to assign to specific structures. A powerful method to follow folding dynamics with single residue specificity is site-specific isotope editing of the IR spectrum by incorporating 13C=18O groups into specific sites in the peptide backbone through peptide synthesis.10 In the same way, azide labels can be introduced at specific positions to provide an IR label that is sensitive to local side chain structural rearrangements. Azide labels have yet to be exploited in kinetic refolding studies, although they have been used in equilibrium studies.11

A major limitation of these approaches, however, has been the difficulty in introducing the labels into larger proteins that are not amenable to peptide synthesis. Therefore we have developed a simple and general method for site-specific incorporation of these labels in high yield into recombinantly produced proteins in an E. coli expression system. The method employs a strain of E. coli (B834) that is auxotrophic for methionine to selectively label individual Met sites introduced by site-specific mutation.12 The label is introduced by expressing the protein in a growth medium containing either 13C=18O labeled Met or its azide-containing analog, azidohomoalanine (Aha). The limitation of this method is that it requires introducing a single Met into the position to be labeled. Since Met is one of the most rarely occurring amino acids, however, it is generally straightforward to design single-Met mutants in order to introduce a site-specific label in a target protein. The methionyl aminopeptidase that cleaves N-terminal Met residues in E. coli also cleaves Aha in most cases, so addition of an extra labeled site at the N-terminus is not an issue.13 If necessary, the protein of interest can also be expressed with a factor Xa or other cleavable site at the N-terminus to allow removal of any undesired N-terminal labels. Thus we emphasize that this labeling approach is completely general and can be applied to any expressed protein by introducing Met into sites to be labeled and removing it everywhere else.

By combining these site-specific labeling methods with T-jump IR spectroscopy, we have studied the ordering of the backbone and side chains of NTL9. It is an ideal model system for studying β-sheet formation because it is small, folds relatively rapidly and is very stable over a wide range of pH and temperature.14,15 It has been characterized as a two-state folder both kinetically and thermodynamically.16,17 Wild type NTL9 has a single Met that we have targeted for labeling because of its strategic location at the N-terminus (M1), as shown in Figure 1. M1 is required for folding and its side chain is integrated into the β-sheet of the native structure, with both its backbone carbonyl and thioether side chain participating in cross-strand interactions. We find that the dynamics of ordering of the M1 side chain are significantly slower than the ordering of the peptide backbone, which has significant implications for the folding mechanism of NTL9.

Figure 1.

Ribbon diagram of NTL9 structure (2HBB) showing the location of Met-1 (in CPK format)

EXPERIMENTAL SECTION

Protein Expression and Purification

An overnight culture of B834-pET3a-NTL9 was grown in LB rich media with ampicillin. This starter culture was added to 1 liter of M9 minimal media supplemented with 18 amino acids (including methionine, but no tyrosine and cysteine). Cells were grown to an OD600 of 0.8–1.0, harvested and re-suspended in M9 media salt solution. After agitation for several minutes at room temperature, cells were harvested again and re-suspended in M9 minimal media supplemented with 17 amino acids (no methionine, tyrosine and cysteine) plus 40 mg azidohomoalanine or 13C=18O labeled methionine. Protein expression was induced with 1 mM IPTG at 37 °C for 4 hours. Protein was purified from the supernatant of the cell lysate by cation exchange chromatography followed by reverse-phase HPLC on a Vydac C4 semi-preparative column. An A-B gradient system was used, with buffer A composed of 0.1% (v/v) solution of TFA and buffer B composed of 90% (v/v) acetonitrile, 9.9% (v/v) water and 0.1% (v/v) TFA. The gradient was 0–90% B in 90 min. The expression yield for NTL9-M1Aha was 15 mg, which was around 30% of the wild type expression yield obtained in LB media. The yield for NTL9-M1-13C=18O was 40 mg, which is around 80% of the wild type yield. Proteins labeled with azidohomoalanine and 13C=18O IR probes (NTL9-M1Aha and NTL9-M1-13C=18O) were characterized by mass spectrometry using an LTQ-Orbitrap XL Mass Spectrometer.

FTIR Spectroscopy

Equilibrium FTIR temperature-dependent spectra were recorded on a Varian 3100 FTIR spectrometer equipped with liquid nitrogen cooled mercury cadmium telluride (MCT) detector. The spectra were the result of 256 scans recorded at a resolution of 2 cm−1. The proteins were dissolved in a buffer containing 20 mM sodium phosphate and 10 mM sodium chloride at a pH* of 5.4, in D2O. pH* refers to the uncorrected (for D2O) pH-meter reading at 25 °C. The NTL9 protein concentration for IR experiments is ~ 2 mM. A split IR cell composed of CaF2 windows was utilized with a path length of 100 µm to record the spectrum of both the reference (buffer in D2O) and the sample (protein in the D2O buffer) side of the IR transmission cell under identical conditions at each temperature. The temperature of the IR cell was controlled by a water bath, and the sample temperature was measured by a thermocouple attached to the cell. The absorbance spectra of the protein were determined from the negative logarithm of the ratio of the single beam spectra of the sample to the reference side of the IR split cell at each temperature.
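The absorbance calculation described above, and the difference spectra formed from it (Figure 2A), can be sketched in a few lines. This is only an illustration: the function names and the intensity values are hypothetical, and real spectra would be arrays of several thousand points rather than three-element lists.

```python
import math

def absorbance(sample_intensity, reference_intensity):
    """A = -log10(I_sample / I_reference) at each spectral point."""
    return [-math.log10(s / r)
            for s, r in zip(sample_intensity, reference_intensity)]

def difference_spectrum(absorbance_high_t, absorbance_low_t):
    """Subtract the lowest-temperature absorbance spectrum from a
    higher-temperature one, point by point."""
    return [hi - lo for hi, lo in zip(absorbance_high_t, absorbance_low_t)]

# Hypothetical single-beam intensities (arbitrary units) at three wavenumbers.
reference = [100.0, 100.0, 100.0]   # buffer side of the split cell
sample = [10.0, 50.0, 100.0]        # protein side of the split cell
spectrum = absorbance(sample, reference)
```

Because the sample and reference sides are measured under identical conditions at each temperature, the ratio cancels the D2O background before any difference spectrum is taken.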

Time-Resolved Temperature-Jump (T-Jump) IR Kinetic Measurements

The time-resolved T-jump apparatus used to measure the protein relaxation kinetics in this study has been described previously.18 This method is a pump-probe experiment where 1.91 µm radiation is the pump beam that initiates a rapid T-jump in the sample, thereby perturbing the folding equilibrium. A quantum cascade laser (Daylight Solutions Inc., Poway, CA) tunable either in the 1535–1695 cm−1 or the 2035–2145 cm−1 region is used to probe structural changes in the sample as the system relaxes to a new equilibrium at the final temperature in response to the T-jump. The changes in transmission of the IR probe beam are detected by a fast (200 MHz) photovoltaic (PV) MCT detector (Kolmar Technologies, Newburyport, MA). The 1.91 µm (10 ns fwhm Gaussian pulse width, ~30 mJ/pulse) pump radiation is obtained from a H2 (g) filled Raman shifter (1 Stokes shift) pumped by a 10 Hz repetition rate Q-switched DCR-4 Nd:YAG laser (Spectra Physics, Mountain View, CA) and is absorbed by weak combination bands in the D2O solution. This pump radiation was chosen due to its transmission properties (87% pump radiation transmitted through 100 µm path length sample cell) that allow for nearly uniform heating in the pump-probe overlap region and because most peptides and proteins do not absorb at this wavelength. The same split cell used for the equilibrium FTIR experiments was used for the kinetic measurements with the reference D2O buffer compartment serving as an internal thermometer to determine the magnitude of the T-jump. The protein relaxation kinetic traces were extracted by subtracting the change in absorbance of the reference (D2O buffer) from the sample (protein in D2O buffer) in response to the T-jump.
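To make the "nearly uniform heating" argument concrete, the sketch below converts the stated 87% pump transmission into an absorbance and estimates the pump intensity partway through the cell. It assumes simple Beer-Lambert attenuation, which the text does not state explicitly. The final helper illustrates the reference subtraction used to extract the protein trace. All function names are hypothetical.

```python
import math

def pump_absorbance(transmitted_fraction):
    """Absorbance of the pump beam across the full path length."""
    return -math.log10(transmitted_fraction)

def relative_intensity(depth_fraction, transmitted_fraction):
    """Pump intensity at a fractional depth (0 = front window, 1 = back
    window), assuming exponential Beer-Lambert attenuation."""
    return transmitted_fraction ** depth_fraction

def protein_trace(sample_trace, reference_trace):
    """Subtract the buffer-only response from the protein-plus-buffer
    response, point by point."""
    return [s - r for s, r in zip(sample_trace, reference_trace)]

pump_a = pump_absorbance(0.87)            # about 0.06 absorbance units
back_face = relative_intensity(1.0, 0.87)
midpoint = relative_intensity(0.5, 0.87)  # about 93% of the incident intensity
```

Under this assumption the pump intensity falls by only 13% from the front to the back of the 100 µm cell, so the heating varies little along the probe path.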

The kinetic traces were recorded from the nanosecond to tens of milliseconds time regime with the thermal energy diffusing from the pump-probe interaction volume with a lifetime of about 5 ms and were fit to a double exponential function to account for both the protein kinetics and the cooling, as described in the supplemental material. The data analysis was performed in IGOR Pro (Wavemetrics, Inc.).
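The exact fitting function is given only in the supplemental material. One common form of a double exponential that separates a protein relaxation phase from a cooling phase is sketched below. The symbols are assumptions introduced here for illustration: A_p and τ_p for the protein relaxation, A_c and τ_c for the cooling (with τ_c on the order of the 5 ms thermal diffusion lifetime quoted above), and A_∞ for a constant offset.

```latex
\Delta A(t) = A_{\mathrm{p}}\, e^{-t/\tau_{\mathrm{p}}}
            + A_{\mathrm{c}}\, e^{-t/\tau_{\mathrm{c}}}
            + A_{\infty}
```

When τ_p is much shorter than τ_c the two phases are well separated and τ_p is determined precisely. As τ_p approaches τ_c the two terms become correlated and the uncertainty in τ_p grows.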

RESULTS AND DISCUSSION

Site specific labeling of NTL9

The expression of NTL9 in E. coli B834 (auxotrophic for Met) is performed in a minimal growth medium lacking normal Met, but with added 13C=18O labeled Met or the azide-containing Met analog, azidohomoalanine (Aha). In this way the expressed protein is selectively labeled in the backbone or side chain, respectively, wherever Met occurs in its sequence. The 13C=18O label is not scrambled, and is incorporated with better than 95% efficiency as determined by mass spectrometry (SI Appendix, Fig. S1). Similarly, the Aha label is incorporated with high yield (>95%), again as determined by mass spectrometry (SI Appendix, Fig. S2). The FTIR spectra provide additional evidence for the incorporation of the labels (Figs. 2 and 3).

Figure 2.

(A) Difference FTIR spectra of 13C=18O-M1-NTL9 as a function of temperature in the range 10–90 °C in 10 °C increments (arrow shows direction of increasing T) in the amide I spectral region. Difference spectra were formed by subtracting the lowest temperature absorbance spectrum from the spectra at the higher temperatures. (B) The melt curves (ΔA vs T) at the unlabeled (1629 cm−1, filled circles) and labeled (1594 cm−1, open circles) positions.

Figure 3.

(A) FTIR spectra of Aha-NTL9 in the azide spectral region as a function of temperature in the range 10–90 °C in 10 °C increments. (B) The melt curves (ΔA vs T) at 2094 cm−1 (open circles) and 1629 cm−1 (filled circles).

Temperature-Dependent Equilibrium FTIR Spectroscopy

The temperature-dependent difference FTIR spectra of 13C=18O-M1-NTL9 from 10 to 90 °C shown in Figure 2A reveal the spectral changes due to unfolding, whereas all other spectral features (such as the D2O background) are removed, thus highlighting the labeled and unlabeled amide I features. The negative band at 1629 cm−1 is due to the unlabeled folded structure (anti-parallel β-sheet and solvated helix) whereas the corresponding positive feature centered at 1665 cm−1 results from the formation of disordered structure as the protein unfolds. The single 13C=18O labeled Met is shifted lower in frequency (to 1594 cm−1) relative to the unlabeled 12C=16O band and is well resolved. The observed isotopic shift of about 35 cm−1 is significantly less than the expected shift of 75 cm−1 for a local C=O oscillator. This result is consistent with the β-sheet conformation of the M1 residue, which leads to extensive dipolar coupling among the aligned cross-strand 12C=16O oscillators; in contrast, the labeled site is decoupled due to the frequency mismatch.19 The other minor negative peak in the difference spectra at 1555 cm−1 is due to a slight decrease in the population of a deprotonated carboxylate side chain with increasing temperature, likely due to a small increase in its pKa with temperature, or as the protein unfolds, or both. The corresponding melt curves for this protein monitored at the natural abundance (1629 cm−1) and labeled (1594 cm−1) amide I frequencies are shown in Figure 2B. The error bars represent the uncertainty of the absorbance measurement, determined from the baseline noise in this region. The melt curves are identical within the error of the measurement, meaning that the local melting sensed by M1 is the same as the global melt sensed by the unlabeled amide I band. The midpoint of the transition (Tm) is determined from a sigmoid fit, assuming a flat post-transition baseline. We have found this approach to produce a robust fit provided the data extend past the inflection point of the melt, because the unfolded state IR spectrum and hence the post-transition baseline exhibit little temperature dependence.10 The Tm for the cooperative transition is 77 °C regardless of the probe frequency, which agrees well with other techniques.11,15,16
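The expected shift for a local C=O oscillator can be roughly reproduced with a diatomic harmonic-oscillator estimate, in which the frequency scales as the inverse square root of the reduced mass. The sketch below is a simplification made here for illustration: it uses integer isotope masses and treats the amide I mode as a pure C=O stretch, which it is not. The function names are hypothetical.

```python
import math

def reduced_mass(m1, m2):
    """Reduced mass of a diatomic oscillator (atomic mass units)."""
    return m1 * m2 / (m1 + m2)

def isotope_shifted_frequency(freq_cm1, light=(12.0, 16.0), heavy=(13.0, 18.0)):
    """Scale a frequency by sqrt(mu_light / mu_heavy), the harmonic
    diatomic approximation for isotopic substitution."""
    ratio = math.sqrt(reduced_mass(*light) / reduced_mass(*heavy))
    return freq_cm1 * ratio

unlabeled = 1629.0   # cm^-1, unlabeled amide I band quoted in the text
labeled = 1594.0     # cm^-1, 13C=18O band quoted in the text
expected_shift = unlabeled - isotope_shifted_frequency(unlabeled)
observed_shift = unlabeled - labeled
```

Under these assumptions the estimate comes out at roughly 76 cm−1, close to the 75 cm−1 quoted for a local oscillator and about twice the separation between the 1629 and 1594 cm−1 bands.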

While the M1 isotope label does not perturb the NTL9 structure or stability, the Aha (azide) label may not be so innocent. We previously determined that the Aha-M1 protein has nearly identical m-values (the slope of the ΔG° versus [urea] curve) but a slightly lower (0.81 kcal/mol) stability compared to wild-type, and spectral characteristics (CD, NMR) that are identical to wild-type.11 As a further test of the effect of the label, we compared the IR spectra of the unlabeled and Aha labeled proteins, which have identical amide bands, indicating that the secondary structure and overall fold of the protein do not change in the labeled protein. The melt curve of the mutant followed by the IR amide I absorbance at 1629 cm−1 has a Tm of 71 °C, consistent with its slightly lower stability (SI Appendix, Fig. S3). The azide ν3 asymmetric stretching vibration of the labeled protein, which occurs in an uncongested region of the IR spectrum as shown in Figure 3A, offers another probe of the protein structure and stability. The azide absorbance profile is complicated by a shoulder at 2117 cm−1. The shoulder could reflect multiple conformations or environments or it could be due to a Fermi resonance. Fermi resonances have been observed for azides directly attached to aromatic rings but have been proposed to be weak or nonexistent for alkyl azides.21,22 We previously reported that the intensity of the shoulder decreases as the pH is raised, which we attributed to deprotonation of the N-terminus.11 Structural and thermodynamic evidence support the formation of a salt bridge between the carboxylate of D23 and the protonated N-terminus of M1.17,23 This effect is distinct from the temperature dependent changes observed in Figure 3A. The temperature induced unfolding of the protein leads to a decrease in the azide peak at 2094 cm−1 (folded) with a concomitant increase of a broad azide peak at 2115 cm−1 (unfolded). The melt curves (Figure 3B) each show a sigmoidal, cooperative transition with a Tm of 71 °C. Despite the very different structures probed by the azide band, which reports on the local environment of the M1 side chain, and the amide I band, which reports on the global backbone conformation, we observe the same Tm within the error of the measurement. NTL9 appears to fold in a strictly two-state process, without populating any folding intermediates in an equilibrium sense.17
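A sigmoidal melt curve with flat baselines, of the kind used to extract Tm, can be modeled as follows. This is a minimal sketch on synthetic data, not the authors' fitting procedure: the width parameter, the unit amplitude and the interpolation-based midpoint estimate (used here in place of a least-squares fit) are all assumptions made for illustration.

```python
import math

def two_state_sigmoid(temp_c, tm_c, width_c, amplitude):
    """Sigmoid with flat pre- and post-transition baselines:
    0 at low temperature, `amplitude` at high temperature."""
    return amplitude / (1.0 + math.exp((tm_c - temp_c) / width_c))

def midpoint_by_interpolation(temps, signal, amplitude):
    """Estimate Tm as the temperature where the signal crosses half of
    the total amplitude, by linear interpolation between grid points."""
    half = amplitude / 2.0
    points = list(zip(temps, signal))
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        if (s0 - half) * (s1 - half) <= 0 and s0 != s1:
            return t0 + (half - s0) * (t1 - t0) / (s1 - s0)
    return None  # no crossing: the data do not span the transition

# Synthetic melt on the 10-90 C grid in 10 C steps used for the spectra,
# generated with a hypothetical Tm of 71 C and width of 5 C.
temps = [float(t) for t in range(10, 100, 10)]
signal = [two_state_sigmoid(t, 71.0, 5.0, 1.0) for t in temps]
tm_estimate = midpoint_by_interpolation(temps, signal, 1.0)
```

With only 10 °C spacing, linear interpolation recovers the midpoint to within about a degree on this synthetic curve. A proper sigmoid fit uses the whole curve and is the better choice for real data.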

Temperature-Jump Time-Resolved IR Spectroscopy

While NTL9 is a two-state folder in a thermodynamic sense, it may still access folding intermediates transiently that are not detectable in equilibrium experiments. To test this possibility, we investigated the relaxation kinetics of NTL9 using laser induced T-jump coupled with time-resolved IR spectroscopy. Figure 4 compares the relaxation kinetics for Aha-M1-NTL9 monitored at 1629 cm−1 (global backbone ordering) and 2094 cm−1 (Aha-M1 local side chain ordering), following a T-jump of 11 °C to a final temperature of 71 °C, the midpoint of the transition for Aha-M1-NTL9. The initial bleach (µs time scale) is due to the protein relaxation, whereas the slower recovery of the bleach is due to the cooling of the solution (ms time scale). The observed relaxation kinetics for the side chain ordering are clearly slower than those of the backbone ordering. The data were fit to double exponential kinetics to account for both the protein folding dynamics and the longer time scale cooling. Full details of the fitting procedure are provided in the supplemental material. The observed relaxation lifetimes derived from the fits to the transients in Fig. 4 are 152 ±0.4 µs (1629 cm−1) and 505 ±34 µs (2094 cm−1). Multiple measurements at 2094 cm−1 gave an average of 529 µs and standard deviation of 39 µs. The error of the exponential fits is greater for the 2094 cm−1 transient for two reasons. First, the transient is noisier at 2094 cm−1 because the signal is about an order of magnitude smaller. Second, the shorter lifetime at 1629 cm−1 allows us to fit out to more than 10 lifetimes, producing a very accurate fit, whereas the longer lifetime at 2094 cm−1 limits the fit to about 4 lifetimes due to overlap with the cooling phase. Nevertheless, the difference in the lifetimes for ordering of the backbone and Aha side chain is clearly larger than the error of the measurement. It is also important to note that these measurements are made on the same sample to avoid any variance in kinetics due to sample conditions. Comparing the kinetics monitored by following the amide I band to the kinetics monitored by the azide group for the same Aha mutant compensates for any difference in stability with the WT protein. The measurements for the unlabeled backbone peak at 1629 cm−1 serve as a convenient internal standard for comparison of the kinetics for both the side chain and backbone labels. Thus, we observe that the global ordering of the backbone (sheet and helix formation) is about three times faster than the side chain ordering sensed by the azide. The NTL9 folding kinetics monitored by global probes such as CD or fluorescence are extremely well fit by a cooperative two-state model,17 so the same kinetics are expected regardless of how they are probed. The two most likely interpretations of the different rates are: (a) the N-terminus, which is part of the first β-strand, folds on a slower timescale than the rest of the protein or (b) side chain ordering in the region of M1 is slower than the backbone ordering, possibly due to the formation of non-native interactions.
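The comparison of the two lifetimes can be checked with a short calculation. The sketch below takes the fitted lifetimes and fit errors quoted above and computes their ratio. The propagated uncertainty assumes the two fit errors are independent, which is an assumption made here and not something the text states. The function names are hypothetical.

```python
import math

def lifetime_ratio(tau_slow_us, tau_fast_us):
    """Ratio of the slower to the faster relaxation lifetime."""
    return tau_slow_us / tau_fast_us

def ratio_uncertainty(tau_slow_us, err_slow_us, tau_fast_us, err_fast_us):
    """First-order error propagation for a ratio, assuming the two
    lifetime errors are independent."""
    relative = math.sqrt((err_slow_us / tau_slow_us) ** 2
                         + (err_fast_us / tau_fast_us) ** 2)
    return lifetime_ratio(tau_slow_us, tau_fast_us) * relative

# Values quoted in the text for the Aha-M1-NTL9 sample at 71 C.
tau_azide_us, err_azide_us = 505.0, 34.0   # 2094 cm^-1, side chain
tau_amide_us, err_amide_us = 152.0, 0.4    # 1629 cm^-1, backbone

ratio = lifetime_ratio(tau_azide_us, tau_amide_us)
ratio_err = ratio_uncertainty(tau_azide_us, err_azide_us,
                              tau_amide_us, err_amide_us)
```

The ratio comes out near 3.3 with an uncertainty of about 0.2, dominated by the noisier azide transient, so the difference between the two probes is well outside the fit error.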

Figure 4.

Comparison of the relaxation kinetics of Aha-NTL9 monitored at 2094 cm−1 (azide) and 1629 cm−1 (amide I) (dashed curves) fit to double exponential functions (solid curves). The final temperature (Tf) following an 11 °C T-jump is 71 °C.

We tested whether the part of the β-strand containing the N-terminus folds on a different timescale than the global backbone ordering using the 13C=18O-M1 labeled version of NTL9. This sample allows us to probe the global folding of the backbone using the 12C=16O (unlabeled) IR band and the specific ordering at the M1 position using the 13C=18O (labeled) IR band. In this experiment, the relaxation kinetics of the labeled site (1594 cm−1) are directly compared to the unlabeled amide I band (1629 cm−1) of exactly the same sample, to avoid any effect of differing conditions on the observed rate. Figure 5 compares the relaxation kinetics for a T-jump of 11 °C to a final temperature of 77 °C, the midpoint of the transition for wild-type NTL9. The transients at these two probe frequencies are indistinguishable, and exponential fits yield relaxation lifetimes of 149 ±0.5 µs (1629 cm−1) and 141 ±0.5 µs (1594 cm−1). These lifetimes are essentially the same, indicating that the formation of the β-sheet structure (including strand 1 that contains M1) is concurrent with the global ordering of the NTL9 backbone. Thus we conclude that the slower rate observed for the azide label is due to slower ordering of the side chain compared to the backbone.

Figure 5.

Comparison of the relaxation kinetics of 13C=18O-M1-NTL9 at 1629 cm−1 (unlabeled amide I) and 1594 cm−1 (M1 labeled amide I). Tf is 77 °C following an 11 °C T-jump.

The slower side chain ordering at M1 reveals the existence of a near-native kinetic intermediate with an alternative arrangement of side chains in the β1-loop-β2 motif that contains M1. Several other lines of evidence support this interpretation. Experimental studies have shown that there are electrostatic interactions involving residues D8 and K12 in the unfolded state and in the transition state for folding that do not persist in the native state, suggesting that this region is prone to misfolding.14,17,23 Somewhat analogous folding behavior has been observed for the PDZ domain and the SH3 domain,24 suggesting that NTL9 is not a special case. The PDZ domain folds through a kinetic intermediate that is largely stabilized by native-like interactions but is misfolded in a limited region involving the packing of the N-terminal β-hairpin and the second helix.3 Finally, our interpretation is supported by the results of molecular dynamics (MD) simulations of the folding of a truncated variant of NTL9 by the Pande group.25 Long timescale MD simulations of NTL9 (1–39) folding trajectories using a Markov state model suggest that the formation of the β1-β2 strand pairing is the rate-limiting step for folding. The simulations show that the loop connecting the two strands is highly flexible, probably because it contains seven of the protein’s eight lysine residues and three of its five glycine residues. The Pande group has postulated that this flexibility may produce a large entropic barrier to folding as the system searches to find the correct cross-strand interactions. Furthermore, they observed the protein trapped in a near-native configuration with alternative side-chain arrangements in the β-sheet structure. In addition, they resolved a subtle kinetic intermediate corresponding to an alternative arrangement of the β1-loop-β2 motif.24

CONCLUSION

In summary, we have shown that the IR probes azidohomoalanine and 13C=18O labeled methionine can be inserted readily into proteins that are recombinantly expressed in an E. coli Met auxotroph. Combining this labeling method with T-jump IR spectroscopy, we have determined that the ordering of the backbone and the ordering of the side chains in the β1-loop-β2 motif of NTL9 occur on different timescales, despite the observation that all global probes of its folding are consistent with strictly two-state behavior. Our results support the prediction from MD simulations that the formation of alternative side chain arrangements of the β-hairpin loop and subsequent unfolding/refolding of these interactions play an important role in the rate-limiting step of the folding of NTL9. These results demonstrate that even simple, apparently two-state folders can sample a rich free energy landscape. The results described here reveal the potential of IR in conjunction with independent labeling of the backbone and side chains for elucidating the details of protein folding with unprecedented molecular specificity.

ACKNOWLEDGMENT

This work was supported by NIH grant GM053640 (R.B.D.) and NSF grant MCB-0919860 (D.P.R.).

ASSOCIATED CONTENT

The supporting information contains supporting figures and legends. This material is available free of charge via the Internet at http://pubs.acs.org.
