Author manuscript; available in PMC: 2021 Jul 2.
Published in final edited form as: Chemphyschem. 2020 Jun 4;21(13):1436–1443. doi: 10.1002/cphc.202000249

Accurate Backbone 13C and 15N Chemical Shift Tensors in Galectin-3 by MAS NMR and QM/MM: Details of Structure and Environment Matter

Jodi Kraus 1,2, Rupal Gupta 1,3, Manman Lu 1,2,4, Angela M Gronenborn 2,4,*, Mikael Akke 5,*, Tatyana Polenova 1,2,*
PMCID: PMC8080305  NIHMSID: NIHMS1691982  PMID: 32363727

Abstract

Chemical shift tensors obtained from solid-state NMR spectroscopy are very sensitive reporters of structure and dynamics in proteins. While accurate 13C and 15N chemical shift tensors are accessible by magic angle spinning (MAS) NMR, their quantum mechanical calculations remain challenging, particularly for 15N atoms. Here we compare experimentally determined backbone 13Cα and 15NH chemical shift tensors by MAS NMR with hybrid quantum mechanics/molecular mechanics/molecular dynamics (MD-QM/MM) calculations for the carbohydrate-binding domain of galectin-3. Excellent agreement between experimental and computed 15NH chemical shift anisotropy values was obtained using the Amber ff15ipq force field when solvent dynamics was taken into account in the calculation. Our results establish important benchmark conditions for improving the accuracy of chemical shift calculations in proteins and may aid in the validation of protein structure models derived by MAS NMR.

Keywords: Chemical shift anisotropy, microcrystalline protein, QM/MM, recoupling, solid-state NMR

INTRODUCTION

Magic angle spinning (MAS) solid-state NMR spectroscopy is a powerful tool for the characterization of structure and dynamics of biological assemblies at atomic resolution.[1] The most widely used NMR observable is the chemical shift since it is intricately dependent on the surrounding environment of the nucleus, influenced by conformation, motions, hydrogen bonding and electrostatic interactions.[2–4] In solution, only isotropic chemical shifts can be extracted from the NMR spectrum, while in the solid state the anisotropic components of the chemical shift tensor (CST) can also be obtained. Importantly, chemical shift anisotropy (CSA) parameters can be used during the refinement stages of a protein structure determination to improve the accuracy of solid-state NMR structures.[5]

Assessment of protein chemical shifts with respect to structure relies either on empirical formulae derived from databases of experimental chemical shifts or on quantum mechanical (QM) methods, with the latter, in principle, permitting accurate calculations of CSTs.[6–8] QM calculations for different levels of theory and molecular models have enjoyed varying degrees of success, although they remain particularly challenging for amide protons, carbonyl, and amide nitrogen atoms.[9–12] Commonly, Density Functional Theory (DFT) is employed; however, DFT scales as N^4 with the number of atoms, rendering it computationally too expensive for all but small systems. For chemical shift calculations of proteins, most frequently the protein is partitioned into smaller fragments that capture the local environment around the atoms under consideration. Fragments (clusters) are either selected based on simple distance criteria, or can be judiciously constructed to capture interactions that are important for the system of interest.[13–15] Truncation of the protein into smaller fragments inevitably results in loss of accuracy.[16, 17] Another approach uses hybrid quantum mechanics/molecular mechanics (QM/MM), which permits balancing the computational demands and the attainable accuracy. In essence, this method entails treating the region of interest quantum mechanically and a larger region around it with molecular mechanics. This approach can be implemented in an automated manner (AF-QM/MM),[18] where the individual calculations are parallelizable. Therefore, AF-QM/MM has unique advantages and can be easily implemented into protein structure refinement algorithms.

We previously investigated the accuracy of chemical shifts calculated by various quantum mechanical methods, exploring the dependence on the level of theory (functional and basis set), the nature and the size of the molecular fragment, and the inclusion of crystallographic water molecules.[10–12] We found that prediction of accurate backbone 15NH chemical shifts is still challenging, given their complex relationship with a variety of different factors, most notably hydrogen bonding. Here, we expanded our earlier studies and systematically investigated the accuracy of 15NH and 13Cα chemical shift predictions for the carbohydrate recognition domain of galectin-3 (termed “galectin-3C” in the remainder of this text), a β-sheet 15.7 kDa protein, containing 10 anti-parallel β-strands, a small, 5-residue helix and several loops.[19] Galectin-3C is ideally suited as a benchmark protein since it has been extensively characterized by solution NMR,[20, 21] and ~40 high-resolution crystal structures are available,[19, 22–27] as well as a neutron diffraction structure at 1.7 Å resolution that permitted placing hydrogen atoms.[27] Furthermore and importantly, we acquired extensive solid-state NMR experimental data for the crystal form that was used for both the X-ray and neutron diffraction studies. We measured 13Cα and 15NH CSTs for microcrystalline galectin-3C in MAS NMR experiments and compared the measured values with CSTs calculated based on the high-resolution X-ray and neutron diffraction structures. We systematically examined geometry optimization protocols in both the QM and MM regions of the AF-QM/MM calculations and tested the influence of solvent dynamics in order to arrive at the best agreement between the computed and experimentally measured chemical shift tensor values.
A high level of agreement between computed and experimentally determined 13Cα CSA values was found, similar to our previous results for isotropic 13Cα shifts in the same system.[11] For 15NH CSAs, the agreement is not as satisfying, but can be significantly improved by i) using the most up-to-date force fields (for example, Amber ff15ipq) and ii) taking solvent dynamics into account. As a result, we achieved the best agreement to date between experimental and calculated 15NH CSTs. Therefore, we are confident that implementation of quantum mechanical CST calculations into protein structure refinement algorithms will yield improved MAS-derived protein structures in the future.

RESULTS AND DISCUSSION

Experimental 13Cα and 15NH CSA Tensors

We obtained very high resolution 2D and 3D MAS NMR spectra for galectin-3C, which permitted complete resonance assignments for 136 out of 138 residues,[11] and the determination of 73 13Cα and 77 15NH site-specific chemical shift tensors was achieved using 3D RNCSA experiments.[28] Representative CSA lineshapes are shown in Figure 1. The values for both 13Cα and 15NH CSTs are consistent with those in other rigid microcrystalline proteins such as the B1 domain of protein G (GB1),[29] E. coli thioredoxin,[30] dynein light chain 8 (LC8),[30] CAP-Gly domain of the mammalian dynactin,[31] and Oscillatoria agardhii agglutinin (OAA).[12] Since galectin-3C is a primarily β-sheet protein, the 13Cα CSTs are all associated with residues in β-strands or loop/turn regions, and the values are indicative of this local geometry. In contrast to 13Cα, 15NH CSTs are significantly influenced by hydrogen bonding, long-range electrostatics and the solvent environment, as reported recently for microcrystalline E. coli thioredoxin reassemblies, LC8, and CAP-Gly. Similar findings emerged for galectin-3C, for which an average 15NH isotropic chemical shift for non-hydrogen bonded residues of 118.6 ppm is observed, while for hydrogen-bonded residues this value is 121.6 ppm. The same trend is seen for the principal components of the tensor (Table S1 of Supporting Information). This systematic effect of hydrogen bonding on experimental 15NH CSTs must be considered when assessing the tensors calculated by QM/MM, and such distinct structural features have to be represented correctly before calculated CSTs can be used reliably in structure refinement protocols.

Figure 1.

a) Structure of galectin-3C (PDB ID: 3ZSJ), indicating one representative 3.5 Å sphere region. The central residue (ball and stick representation) and a boundary region (wireframe representation) are depicted. The remainder of the protein is represented by point charges, illustrated by the mesh around the ribbon structure. b) The NCA 2D spectrum (left) and representative experimental (black) and simulated (magenta) 13Cα and 15NH CSA lineshapes (right) for residues I240 and S244. Experimental CSA lineshapes were extracted from the corresponding peaks in the 2D NCA spectrum.

Backbone 13Cα Chemical Shift Tensors from QM/MM

Comparison of the experimental 13Cα CSA values with those calculated by QM/MM for the 0.86 Å resolution X-ray crystal structure (PDB ID: 3ZSJ) of lactose-bound galectin-3C[19] shows excellent agreement (Figure 2). The average relative error (i.e. the absolute error compared to the experimental 13Cα CSA magnitude) is only 0.1, suggesting that the region treated at the quantum level in the AF-NMR fragments faithfully reproduces the 13Cα CSA. Since the 13Cα CST is predominantly influenced by local geometry, the 3.5 Å cutoff distance is sufficient to represent the core of each fragment. Figure 2a shows the differences between calculated and measured 13Cα reduced anisotropy, Δδσ (where δσ = δ11 − δiso), plotted versus residue number. The accuracy of the calculated 13Cα CSTs appears similar throughout the entire chain. The significant outliers are residues with Δδσ > 2 ppm, which represents the approximate maximum error in the experimental measurement. Such error is typically no greater than 10% of the δσ value for CSTs measured using these RNCSA-based pulse sequences. Among the fifteen residues with Δδσ > 2 ppm, the largest differences are found for glycine residues. This may be related to dynamics.[32, 33] If residues undergo motions on nano- to microsecond timescales, the tensors are dynamically averaged, and accurate calculations require an integrated MD-QM/MM approach. This was previously noted for HIV-1 CA capsid protein assemblies.[32] However, galectin-3C does not exhibit dynamics on these timescales, apart from terminal residues, since the dipolar order parameters for both 1H-13C and 1H-15N never fall below 0.9.[11] The differences between calculated and observed principal components for δ22 and δ33 are very similar, while those for δ11 are smaller. Specifically, the root-mean-square errors (RMSEs) are 3.9, 4.7, and 4.8 ppm for δ11, δ22, and δ33, respectively, and are summarized in Table S2 of the Supporting Information.
Naturally, the exact tensor orientations in the molecular frame vary depending on residue type and local backbone conformation. In addition, the position of the hydrogen atoms will also affect the tensor orientation. Here, the Hα atoms were positioned using Amber ff99sb force field libraries and this will affect the QM/MM calculations, i.e., may contribute to the noted differences between the experimental and QM/MM 13Cα chemical shift tensor orientations.
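The quantities compared in this section can be written out as a minimal sketch, assuming the definition quoted in the text (δσ = δ11 − δiso) and the 2 ppm outlier threshold for 13Cα. The function names and all numerical inputs below are hypothetical and serve only as illustration; they are not the scripts used in this work.

```python
import math


def csa_parameters(d11, d22, d33):
    """Return (delta_iso, delta_sigma) in ppm from three principal components.

    Follows the definition quoted in the text: delta_sigma = d11 - delta_iso.
    """
    delta_iso = (d11 + d22 + d33) / 3.0
    delta_sigma = d11 - delta_iso
    return delta_iso, delta_sigma


def rmse(calculated, experimental):
    """Root-mean-square error between two equal-length sequences (ppm)."""
    if len(calculated) != len(experimental) or not calculated:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def flag_outliers(calc_sigma, exp_sigma, threshold_ppm=2.0):
    """Return residue labels whose |calc - exp| reduced anisotropy exceeds the threshold."""
    return [
        residue
        for residue in exp_sigma
        if residue in calc_sigma
        and abs(calc_sigma[residue] - exp_sigma[residue]) > threshold_ppm
    ]


# Hypothetical principal components (ppm), for illustration only.
example_iso, example_sigma = csa_parameters(80.0, 55.0, 45.0)
```

With per-residue dictionaries of calculated and experimental δσ values, `flag_outliers` returns the residues that would be highlighted in a difference plot such as Figure 2a.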

Figure 2.

13Cα chemical shift tensors for galectin-3C from MAS NMR experiments and QM/MM calculations using the X-ray structure (PDB ID: 3ZSJ) as input coordinates. a) Absolute difference between QM/MM calculated and MAS NMR 13Cα reduced anisotropy parameters, plotted versus residue number. β-sheet secondary structure is indicated with grey bars, and loop regions are shown in white. Residues exhibiting differences larger than 6 ppm between experimental and calculated 15NH δσ are labeled in magenta. NH atoms that are involved in H-bonds within the protein are labeled with a magenta asterisk and those that hydrogen bond to a crystallographic water molecule are labeled with a black asterisk. b) Linear correlation between the experimental and QM/MM calculated 15NH CST principal components. For comparison, a trend line representing perfect agreement between theory and experiment is shown as the black solid line. c) Residues (in magenta stick representation) with differences >6 ppm between experimental and calculated 15NH CST principal component values mapped on the galectin-3C structure (grey ribbon representation).

Backbone 15NH Chemical Shift Tensors from QM/MM

The role of input coordinates

The comparison between the experimental backbone 15NH CSTs and those calculated by QM/MM using the X-ray structure (PDB ID: 3ZSJ) as input coordinates is shown in Figure 3. Here, large differences are noted and 17 out of the 21 outliers exhibit errors Δδσ > 6 ppm. The corresponding RMSEs for all tensor components are listed in Table S2 of the Supporting Information. Interestingly, all of these outliers are associated with β-strands. 15 out of the 21 are involved in hydrogen bonding, donating a H-bond to a backbone CO in the anti-parallel β-sheet. Two other 15NH CSTs which exhibit large differences are N119 and Y247. Both of these amide protons hydrogen-bond to a crystallographic water molecule. These data suggest that the effect of hydrogen-bonding is not correctly taken into account in the QM/MM calculations based on modeled hydrogen positions. In addition, it may be the case that the overall effect of the hydrogen bond network in β-sheets, in which each hydrogen bonding unit affects its neighbors, cannot be accurately accounted for by including only those individual hydrogen-bond partners that are contained in the 3.5 Å sphere fragment. Thus, it seems insufficient to treat a single hydrogen bond in isolation in CSA calculations for deriving accurate 15NH CSTs. Indeed, a previous computational study by Viswanathan et al. showed that hydrogen bonding enthalpy is more favorable when extended β-strands self-associate to form β-sheets,[34] suggesting that the fragments for AF-QM/MM calculations do not correctly capture the effects imparted by extended β-sheets.

Figure 3.

15NH chemical shift tensors for galectin-3C from MAS NMR experiments and QM/MM calculations using the X-ray structure (PDB ID: 3ZSJ) as input coordinates. a) Absolute difference between QM/MM calculated and MAS NMR 15NH reduced anisotropy parameter, plotted versus residue number. β-sheet secondary structure is indicated with grey bars, and loop regions are shown in white. Residues exhibiting differences larger than 6 ppm between experimental and calculated 15NH δσ are labeled in magenta. NH atoms that are involved in H-bonds within the protein are labeled with a magenta asterisk and those that hydrogen bond to a crystallographic water molecule are labeled with a black asterisk. b) Linear correlation between the experimental and QM/MM calculated 15NH CST principal components. For comparison, a trend line representing perfect agreement between theory and experiment is shown by the black solid line. c) Residues (in magenta stick representation) with differences >6 ppm between experimental and calculated 15NH CST principal component values mapped on the galectin-3C structure (grey ribbon representation).

To test whether the accuracy of the calculated 15NH CSTs can be improved by including the experimentally determined hydrogen atom positions, we used the coordinates of the 1.7 Å resolution neutron structure of galectin-3C (PDB ID: 6EYM). Following the generation of fragments for each residue, geometry optimization by DFT was carried out to optimize the 1HN position. The optimization protocols were systematically tested for a set of 20 residues (representing residues with both the best and worst agreement), and the results were compared to the equivalent calculations using the Amber-minimized X-ray crystal structure. As can be noted, optimization of hydrogen atom positions in the neutron structure fragments by DFT only resulted in a marginal improvement of the calculated 15NH CSTs (Figure 4f and Table 1). The average difference between the N-H bond length in the Amber-minimized X-ray crystal structure and the DFT optimized neutron structure is 0.01–0.02 Å. The results of these calculations are in accord with our previous work,[12] which found that such a difference in the N-H bond length will contribute <2 ppm towards the magnitude of the reduced 15N anisotropy.

Figure 4.

Summary of calculation strategies for 15NH CSTs in galectin-3C. (a-d) Calculations for T133 as an example. QM/MM calculations were performed using two different force fields, Amber ff99sb (a) and ff15ipq (c). (b) Effect of DFT optimization of the hydrogen atom positions based on the neutron structure. (d) Solvent dynamics using solvated snapshots from a 40 ps MD simulation. (e-h) Linear correlation between experimental and calculated 15NH δσ from QM/MM with ff99sb (e) and ff15ipq (g), DFT optimized 1HN positions from the neutron structure (f), and MD-QM/MM using ff15ipq and solvated snapshots (h). (i-l) Linear correlation between experimental and calculated 15NH CST principal component values from QM/MM with ff99sb (i) and ff15ipq (k), DFT optimized 1HN positions from the neutron structure (j), and MD-QM/MM using ff15ipq and solvated snapshots (l). For comparison, a trend line representing perfect agreement between theory and experiment is shown as black solid lines.

TABLE 1.

Summary of 13C and 15N CSA tensor calculations for a selected set of 20 residues in galectin-3C (representing the best and worst agreement with experiment).

             QM/MM (ff99sb)   QM/MM (ff15ipq)   DFT optimized    MD-QM/MM (with ff15ipq)
             m      R2        m      R2         m      R2        m      R2
Cα   δiso    0.99   0.93      0.95   0.88       1.02   0.97      0.99   0.96
Cα   δσ      1.07   0.46      1.08   0.27       1.03   0.61      1.09   0.43
NH   δiso    1.10   0.82      0.97   0.74       1.12   0.56      0.55   0.26
NH   δσ      0.51   0.22      0.67   0.54       0.73   0.31      0.92   0.57

However, for some residues, the calculated 15N δσ values differ by >2 ppm between the two optimization methods, suggesting the presence of additional contributions. In these cases, inspection of individual fragments is instructive. For L131, the number and nature of atoms in the 3.5 Å sphere are identical for both structures, although we observe a reduction in Δδσ of 10.2 ppm when using DFT to optimize the geometry of the neutron structure fragment (Figure S1). Upon close inspection we find that the 15NH and 1HN positions in the central residue are slightly different between the two fragments, which could affect the 15N CSA, although probably not to such a large degree. In addition, the i-1 residue in the fragment, M130, exhibits a different sidechain conformation in the two structures. In the X-ray structure, two sidechain conformers (A and B) with different occupancies are modeled for M130. Conformer A has an occupancy of 0.7 and this conformer is present in the DFT optimized neutron structure fragment. In the Amber minimized X-ray fragment, the side chain of M130 is found in the low occupancy (0.3) conformer B. Since conformer A, the one in the DFT optimized neutron structure fragment, yields better agreement between experiment and calculation, it is likely that this side chain conformation is the predominant one in the microcrystalline sample used for MAS NMR. Similar detailed differences between the DFT optimized neutron and the Amber minimized structures apply to other fragments. For all fragments, the central core atom positions are very similar, while other atoms in the surrounding region can occupy different positions. Such differences may be due to the Amber minimization protocol, while the differences in hydrogen atom positions may result from the DFT optimization. As a result, we suggest that reliable 15NH CSA calculations require accurate geometries not only for the core residue in the fragments, but also for the entire surrounding region.

The role of MM force field

In order to assess how the choice of the force field in the initial structure equilibration influences the input coordinates and the accuracy of the calculated 15NH CSA, we applied the Amber force field ff15ipq[35] to minimize the X-ray structure. Unlike other force fields, ff15ipq is re-parameterized to better capture the conformational propensities and preferences for each amino acid type. We hypothesized that the resulting energy-minimized structure would be a better starting structure for the AF-QM/MM calculations. Results obtained with the ff99sb and ff15ipq energy-minimized structures for the identical set of 20 residues are shown in Figure 4g and Table 1. The agreement between experimental and calculated 15NH CSA is improved, with an increase in R2 value from 0.22 (ff99sb) to 0.54 (ff15ipq). The improvement obtained with the newer force field is larger than that obtained by using DFT to optimize 1HN positions in the neutron crystal structure. As pointed out above, manual inspection of the fragment geometries revealed the occasional difference in side chain orientation. In general, however, the details of the force field used to generate the input structure clearly influence the calculated CSA values.

For the calculations performed using ff99sb (Figure 4i), there is a systematic offset of the δ11 component, which is approximately aligned with the N-H bond vector in the molecular frame. The systematic offset for the δ11 component is eliminated in the calculations performed using ff15ipq (Figure 4k). Experimental and calculated δ22 and δ33 components show no significant differences for ff99sb versus ff15ipq minimized structures. As a result, inaccuracies in the δ11 component will affect the 15NH CSA. The heavy atom RMSD between the minimized structures and the X-ray crystal structure is 0.06 and 0.03 Å for the calculations performed with ff99sb and ff15ipq, respectively. Therefore, it is likely that ff15ipq is more reliable in capturing hydrogen bonding interactions (which directly affect δ11), due to its implicitly polarized charge model. Taken together, it is clear that the 15NH CST is influenced by multiple factors that must be carefully and accurately accounted for in order to achieve high levels of agreement between theory and experiment.

Inclusion of Solvent Dynamics from MD-QM/MM Increases Accuracy of 15NH Chemical Shift Tensors

Finally, we also evaluated the effect of including solvent dynamics in the QM/MM calculations. Within a protein crystal, water molecules are dynamic on the picosecond timescale[36] and it is not sufficient to include static crystallographic water molecules. We tested the role of solvent dynamics by performing a 40 picosecond molecular dynamics simulation in explicit solvent using the ff15ipq force field. Coordinate files were generated every 2 picoseconds, and an AF-QM/MM calculation was performed for each snapshot. After 40 picoseconds, the 15NH CSA values were averaged to generate an “ensemble-averaged” value. The results of these calculations are summarized in Figure 4h and Table 1. As can be appreciated, this methodology yielded the most accurate results to date, with a regression slope close to 1 (0.92) and an R2 value equal to 0.57. Since galectin-3C does not exhibit large amplitude motions on the nano- to microsecond time scale, this improvement is due to the combination of using the most up-to-date force field as well as including solvent dynamics and short time scale conformational dynamics. The heavy atom RMSD between the structure snapshots throughout the MD trajectory varies from 0.5 Å to 0.8 Å, which is higher than for the static structures minimized with ff99sb and ff15ipq. However, throughout the trajectory, there is no correlation between RMSD and accuracy of the computed CSTs. Similar to the calculations based on the static ff15ipq refined structure, the δ11 component does not exhibit a systematic offset. Additionally, the correlation between the experimental and calculated δ33 component is also improved. Since the δ33 component is also dependent on hydrogen bonding, it is important to represent these interactions correctly. For example, for residues located in loop regions, the simulation with explicit solvent indicates that the carbonyl group of the preceding residue forms a hydrogen bond with a water molecule.
This interaction persists throughout the duration of the MD simulation, influencing the local environment in the peptide plane and modulating the δ33 component of the CST. Thus, the modest improvement observed in the simulation with explicit solvent is readily accounted for. However, we also note that, while the agreements of Δδσ and the underlying principal components are all improved, the agreement of the isotropic chemical shift (δiso) becomes worse. The good agreement for δiso obtained with the ff99sb-minimized X-ray structure may therefore be due to cancellation of errors in the principal components.

Taken together, our findings demonstrate that significant improvements in the accuracy of 15NH CSA calculations are obtained by using improved force fields together with solvent dynamics. Importantly, our calculations were inexpensive with regard to computing power: replacing the ff99sb force field with ff15ipq added no computational time, and 40-picosecond molecular dynamics simulations are routinely accessible. Therefore, this approach is inexpensive and easy to implement in NMR structure refinement protocols.

While considerable improvements were seen using the above approach, we did not evaluate the effects of long-range electrostatics, currently modeled using point charges in the MM region. Since point charges directly affect the QM sub-system, their incorrect handling may contribute to errors in 15NH CSA calculations. Using polarizable force fields allows for the MM and QM regions to be coupled to one another electrostatically, and the QM and MM regions are mutually polarized. Thus, instead of point charges, point multipoles are used to model electrostatic interactions. The use of polarizable force fields like AMOEBA[37] may offer attractive alternatives to conventional additive QM/MM schemes and may further improve the accuracy of the calculated 15NH CSA values.

CONCLUSIONS

Here we evaluated the agreement between MAS NMR-derived experimental and QM/MM calculated backbone CSA tensors for galectin-3C. 13Cα CSAs calculated from QM/MM generally agree well with the experimental values, indicating that fragmentation of the overall structure into 3.5 Å spheres treated by DFT represents an effective approach. For 15NH CSAs, considerable variability in accuracy is observed, caused by details in a variety of factors, such as hydrogen bonding, electrostatics, and solvation. Significant improvement in the accuracy of calculated 15NH CSAs was noted when using the Amber ff15ipq force field, combined with a 40-picosecond molecular dynamics simulation. Overall, our results illustrate the importance of accurate input coordinates for calculating CSA tensors, with further improvements via the inclusion of ensemble averages calculated over short dynamics trajectories. Correctly taking into account the multiple factors that contribute to the 15NH CSA will allow rich structural information to be extracted in the future. Ongoing and further improvements in computational strategies will open the door for implementing CSA tensors into protein structure refinement.

MATERIALS AND METHODS

Protein expression, purification and crystallization of galectin-3C

Protein expression, purification and crystallization of galectin-3C were performed as described previously.[19, 38, 39] The final protein purity was >95% as determined by SDS-PAGE and >98% by solution NMR. Microcrystals of galectin-3C were grown and 30 mg were packed into a 3.2 mm Bruker thin-walled rotor for MAS NMR experiments.

MAS NMR experiments

MAS NMR experiments were performed on a 14.1 T narrow bore Bruker AVIII spectrometer using a 3.2 mm HCN EFree MAS probe. 1H, 13C, and 15N Larmor frequencies were 599.8 MHz, 150.8 MHz, and 60.8 MHz, respectively. The MAS frequency was 14 kHz for all experiments and was maintained to within ± 5 Hz by a Bruker MAS III controller. The temperature of the sample in the MAS rotor was maintained at 4 ± 0.1 °C using the Bruker BCU temperature controller. 90° pulse lengths were 2.9 μs (1H), 3.7 μs (13C), and 4.8 μs (15N), with cross polarization contact times of 2.0 ms (1H-15N) and 1.0 ms (1H-13C). 1H-15N and 1H-13C CP used a 95–105% linear amplitude ramp on the 1H channel. Band-selective 15N-13Cα transfer was achieved through the use of a 5.0 ms SPECIFIC-CP[40] with a tangent amplitude ramp on the 15N channel and a constant rf field on the 13C channel. SPINAL-64 decoupling[41] was used during direct and indirect acquisition periods. For 15NH and 13Cα RNCSA 3D experiments,[28] R14_2^5 and R10_1^3 recoupling sequences were used to selectively recouple the CSA interaction during t1 evolution. The basic R element was a π pulse. The rf field strength ((N/2n)ωr) and phase (ν/N × 180°) used for each symmetry sequence depended on the symmetry numbers N, n, and ν of that sequence. Heteronuclear interactions were decoupled during the RNCSA recoupling period by applying a π pulse on either the 13C or 15N channel at the center of each R cycle. A recycle delay of 2.0 seconds was used for all experiments.
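As a worked example of the rf field and phase expressions given above, the following sketch evaluates them at the 14 kHz MAS frequency used here. It assumes that the two symmetry labels read as N = 14, n = 2, ν = 5 and N = 10, n = 1, ν = 3, and that the basic element is a single π pulse as stated; the function name is hypothetical.

```python
def r_sequence_parameters(N, n, nu, mas_rate_khz):
    """Return (rf_field_khz, phase_deg) for an RN_n^nu sequence.

    Assumes a pi-pulse basic element, so the rf field is (N / 2n) times the
    MAS frequency and the phase alternates between +/- (nu / N) * 180 degrees.
    """
    rf_field_khz = N / (2 * n) * mas_rate_khz
    phase_deg = 180.0 * nu / N
    return rf_field_khz, phase_deg


MAS_RATE_KHZ = 14.0  # MAS frequency stated in the text

# Symmetry numbers as read from the text (an assumption about the flattened labels).
rf_a, phase_a = r_sequence_parameters(14, 2, 5, MAS_RATE_KHZ)
rf_b, phase_b = r_sequence_parameters(10, 1, 3, MAS_RATE_KHZ)
```

Under these assumptions the two sequences require rf fields of 49 kHz and 70 kHz, with phases of about 64.3° and 54°, respectively.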

NMR data processing

NMR data processing was performed using NMRPipe[42] and all spectra were analyzed using Sparky. For the 3D datasets, 45° shifted sine bell apodization was used, followed by Lorentz-to-Gaussian transformation in the 13C and 15N dimensions. No apodization functions were applied to the 13Cα or 15NH resonances during data processing. CSA lineshapes were extracted in a semi-automated process using home-written scripts.

Simulation of CSA lineshapes

Simulation of CSA lineshapes was performed using Simpson version 1.1.2.[43] To produce a powder average, 320 pairs of (α,β) angles were generated using the REPULSION[44] algorithm and 16 γ angles (5120 angle triplets) were used for all simulations.

QM/MM

QM/MM calculations of protein backbone 13C and 15N CSA tensors were performed using Gaussian 09[45] at the OLYP[46]/tzvp[47] level of theory in the quantum mechanical region. Each input file was generated by AF-NMR. Initial coordinates for the calculations were derived from the X-ray crystal structure (PDB ID: 3ZSJ). The input coordinates of the X-ray structure were prepared for Amber minimization by removing any ligands, crystallographic water molecules and adding hydrogen atoms. This structure was then minimized using either the Amber FF99SB or the Amber FF15IPQ molecular mechanics force field. Calculated CSTs were referenced to ubiquitin (PDB ID: 1D3Z) calculated at the same level of theory (1H=32.0 ppm, 13C=182.5 ppm, and 15N=237.8 ppm). Linear regression analysis was performed for each set of calculations to assess the agreement between experimental CSTs from MAS NMR and calculated CSTs from QM/MM. Perfect agreement between theory and experiment would be a slope and an R2 value of unity, with no offset (intercept of 0).
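The referencing and regression steps described above can be sketched as follows. This is a minimal illustration, assuming the common convention δ = σref − σ for converting a computed shielding into a chemical shift and assuming that experimental values are placed on the x axis of the fit; neither choice is stated explicitly in the text, and the function names are hypothetical.

```python
# Reference values (ppm) quoted in the text for ubiquitin at the same level of theory.
REFERENCE_SHIELDING_PPM = {"1H": 32.0, "13C": 182.5, "15N": 237.8}


def shielding_to_shift(sigma_ppm, nucleus):
    """Convert a computed shielding to a shift, assuming delta = sigma_ref - sigma."""
    return REFERENCE_SHIELDING_PPM[nucleus] - sigma_ppm


def linear_fit(experimental, calculated):
    """Least-squares fit calculated = m * experimental + b.

    Returns (slope m, intercept b, R2). Perfect agreement corresponds to
    m = 1, b = 0 and R2 = 1.
    """
    n = len(experimental)
    if n != len(calculated) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(experimental) / n
    mean_y = sum(calculated) / n
    sxx = sum((x - mean_x) ** 2 for x in experimental)
    syy = sum((y - mean_y) ** 2 for y in calculated)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(experimental, calculated))
    if sxx == 0 or syy == 0:
        raise ValueError("values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared
```

Reporting the slope together with R2, as in Table 1, separates systematic scaling errors (slope different from 1) from scatter (low R2).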

DFT calculations

DFT calculations of protein backbone 13C and 15N CSA tensors were performed using Gaussian 09 at the OLYP/tzvp level of theory. Input files for 20 residues representing the best and worst agreement with experiment were generated by AF-NMR with no minimization or embedded point charges. Initial coordinates were derived from the neutron crystal structure (PDB ID: 6EYM). The input coordinates of the neutron crystal structure were prepared by removing any ligands and crystallographic water molecules, and by directly replacing deuterium atoms with hydrogen atoms. For each fragment, heavy atoms were fixed and the hydrogen (deuterium) atom geometries were optimized at the B3LYP/6-31G level of theory before calculating NMR CSA tensors.

Molecular dynamics simulations

Molecular dynamics simulations of galectin-3C (PDB ID: 3ZSJ) were performed using OpenMM[48] with Amber ff15ipq and the SPC/E water model. To prepare the simulation, a water box equal to the size of the crystal unit cell was constructed, and Cl− counterions were added to neutralize the charge. The structure was equilibrated and energy minimized for 1000 steps of 0.002 picoseconds. Following energy minimization, the molecular dynamics simulation was performed in explicit solvent with Langevin dynamics at 300 K for 20000 time steps, equivalent to 40 picoseconds. A snapshot was generated every 2 picoseconds, and each snapshot was used directly as input into AF-NMR to calculate 13C and 15N CSA tensors.
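The bookkeeping behind this protocol can be checked with a short sketch, using only the numbers stated above (20000 steps, a 0.002 ps time step, one snapshot every 2 ps). The function name is hypothetical and the code does not run a simulation.

```python
def snapshot_schedule(n_steps, timestep_ps, interval_ps):
    """Return (total_time_ps, steps_per_snapshot, n_snapshots) for a trajectory."""
    total_time_ps = n_steps * timestep_ps
    # Round to the nearest integer to absorb floating-point error in the ratio.
    steps_per_snapshot = round(interval_ps / timestep_ps)
    n_snapshots = n_steps // steps_per_snapshot
    return total_time_ps, steps_per_snapshot, n_snapshots


# Values stated in the text.
total_ps, stride, count = snapshot_schedule(20000, 0.002, 2.0)
```

With these inputs the trajectory spans 40 ps, a snapshot is written every 1000 steps, and 20 snapshots result, which corresponds to 20 QM/MM calculations per residue.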

Ensemble averaged CSA tensor calculations

Ensemble averaged CSA tensor calculations were prepared using snapshots from the MD simulations with QM/MM calculations. 20 solvated snapshots were extracted from the MD simulation. Each snapshot was used as input into AF-NMR without further manipulation, so the input files contained explicit water molecules. For the 20 residues of interest (which represented the best and worst agreement with experiment), a QM/MM calculation was performed for each snapshot, totaling 20 Gaussian 09 calculations per residue. The CSA tensors were then averaged for each residue, and referenced as described previously.
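A minimal sketch of the per-residue averaging step is shown below, assuming that one scalar value (for example the reduced anisotropy δσ in ppm) has already been obtained for each residue from each snapshot. The data layout, the function name, and the numbers are hypothetical; averaging full tensors in a common frame would require additional steps that are not shown here.

```python
from statistics import mean


def ensemble_average(values_by_residue):
    """Map each residue label to the mean of its per-snapshot values (ppm).

    Residues without any values are skipped rather than averaged.
    """
    return {
        residue: mean(values)
        for residue, values in values_by_residue.items()
        if values
    }


# Hypothetical per-snapshot values for one residue, for illustration only.
averaged = ensemble_average({"T133": [100.0, 102.0, 104.0], "empty": []})
```

The averaged values would then be referenced and compared with experiment in the same way as the single-structure results.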

Supplementary Material

Supporting information

ACKNOWLEDGEMENTS

We thank David Case for useful feedback and discussions on how to modify AF-NMR for different force fields and the use of solvated snapshots. This work was supported by the National Institutes of Health (NIH Grants P50AI1504817 and P50GM082251, Technology Development Project 2) and is a contribution from the Pittsburgh Center for HIV Protein Interactions. JK is supported by the National Science Foundation Graduate Research Fellowship Program (#1247394). We acknowledge the support of the NIGMS P30GM110758-01 grant for the support of core instrumentation infrastructure at the University of Delaware. Protein production was carried out by the Lund Protein Production Platform (LP3) at Lund University. MA was supported by the Swedish Research Council (2018-4995) and the Knut and Alice Wallenberg Foundation (2013.0022).

REFERENCES
