Abstract
Differential epitope mapping saturation transfer difference (DEEP‐STD) NMR spectroscopy is a recently developed powerful approach for elucidating the structure and pharmacophore of weak protein–ligand interactions, as it reports key information on the orientation of the ligand and the architecture of the binding pocket.1 The method relies on selective saturation of protein residues in the binding site and the generation of a differential epitope map by observing the ligand, which depicts the nature of the protein residues making contact with the ligand in the bound state. Selective saturation requires knowledge of the chemical‐shift assignment of the protein residues, which can be obtained either experimentally by NMR spectroscopy or predicted from 3D structures. Herein, we propose a simple experimental procedure to expand the DEEP‐STD NMR methodology to protein–ligand cases in which the spectral assignment of the protein is not available. This is achieved by experimentally identifying the chemical shifts of the residues present in binding hot‐spots on the surface of the receptor protein by using 2D NMR experiments combined with a paramagnetic probe.
Keywords: DEEP-STD, mixed molecular dynamics, NMR spectroscopy, TEMPOL
The 3D structure of a small bioactive molecule in complex with its receptor gives atomic information that is essential for understanding the biological effects triggered by biomolecular recognition processes, as well as for the discovery and design of new drugs. Several techniques are used to achieve this aim, with X‐ray crystallography, NMR spectroscopy and, more recently, cryo‐electron microscopy being the most relevant approaches.
Recently, we developed the DiffErential EPitope mapping saturation transfer difference (DEEP‐STD) NMR methodology for weak protein–ligand interactions,1 as an extension of the general STD NMR method.2 The DEEP‐STD NMR technique allows the orientation of the ligand to be derived through differential selective saturation of different sets of key protein residues in the binding site. Namely, two STD NMR experiments are carried out, each one saturating a different set of protein residues, and the difference between the resulting spectra is quantified and mapped onto the ligand structure (differential epitope map). To perform the DEEP‐STD NMR experiment accurately, the chemical shifts of the residues present in the binding site must be known beforehand, so that the irradiation frequencies can be chosen to ensure that the selective saturation is applied to residues in the binding site. These chemical shifts can be obtained experimentally by NMR spectroscopy or predicted from a 3D structure obtained by X‐ray diffraction or by homology modelling.3 The experimental DEEP‐STD factors can be further combined with molecular docking and STD intensity predictions by CORCEMA‐ST4 in order to select the docking model that best fits the experimental data.
For cases in which no chemical‐shift assignment of the receptor protein is available, we here propose a general approach to experimentally identify the chemical shifts of the binding‐pocket resonances. It relies on identifying ligand‐binding hot‐spots on the surface of the protein by using 2D NMR spectroscopy (a ligand‐binding hot‐spot is a site on the surface of the protein that has a high probability of interacting with a ligand5). This approach is compatible with the STD NMR technique, inexpensive and relatively fast, all of which should allow broad applicability.
The slow molecular tumbling of large proteins in solution is characterized by an overall correlation time expected to be in the range of 10⁻⁸ seconds. However, the internal correlation time of the surface residues of globular proteins might be significantly shorter. As a result, residues in the core of the protein follow a slow‐motion regime due to their low flexibility; this causes the signals to become broadened beyond detection. Conversely, the greater flexibility of surface residues causes them to follow a fast‐motion regime that leads to crosspeaks in 2D NMR experiments that will be narrower and, hence, detectable. Therefore, these spectra are more likely to display signals from residues exposed on the surface or in very flexible regions of the protein.6
By using 2D 1H,1H TOCSY experiments, hot‐spots can be readily mapped by adding paramagnetic probes such as 4‐hydroxy‐2,2,6,6‐tetramethylpiperidin‐1‐oxyl (TEMPOL) to a protein sample.7 A decrease in the intensity of specific protein TOCSY crosspeaks, compared with the spectrum recorded without the paramagnetic probe, allows those residues interacting with the probe to be easily identified because they are affected by the paramagnetic relaxation enhancement (PRE) effect.8, 9 Previous PRE studies with TEMPOL have demonstrated that this probe has greater access to the specific binding sites of proteins than to other surface regions.7, 10, 11 The hot‐spot resonances identified in these experiments can then be used as input frequencies for the DEEP‐STD NMR experiments.
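The comparison of crosspeak intensities with and without the probe can be sketched in a few lines of Python. This is only an illustration: the intensities are invented, the function name `attenuated_peaks` is hypothetical, and the cut-off of 0.5 on the intensity ratio is an assumption rather than a value used in this work.

```python
def attenuated_peaks(reference, with_probe, threshold=0.5):
    """Return (shift, ratio) pairs whose intensity ratio I_probe / I_ref is below threshold.

    reference, with_probe: dicts mapping a 1H chemical shift (ppm) to a crosspeak intensity.
    threshold: assumed cut-off on the intensity ratio (not taken from the article).
    """
    flagged = []
    for shift, i_ref in sorted(reference.items()):
        if i_ref <= 0:
            continue  # skip peaks without a usable reference intensity
        # A peak missing from the probe spectrum is treated as fully attenuated.
        ratio = with_probe.get(shift, 0.0) / i_ref
        if ratio < threshold:
            flagged.append((shift, ratio))
    return flagged


# Invented intensities (arbitrary units), keyed by chemical shift in ppm.
reference = {0.60: 100.0, 1.06: 80.0, 2.10: 90.0, 7.04: 60.0}
with_probe = {0.60: 30.0, 1.06: 20.0, 2.10: 85.0, 7.04: 15.0}

hot_spots = attenuated_peaks(reference, with_probe)
print([shift for shift, _ in hot_spots])
```

In practice the ratio would be computed per crosspeak from integrated 2D volumes, and the cut-off would have to be chosen by inspecting how the attenuation scales with probe concentration.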
Binding hot‐spots on the surface of proteins have previously been identified by NMR spectroscopy using paramagnetic probes along with classical molecular dynamics (MD) as well as by mixed MD using 5–50 % probe/water mixtures.7, 12, 13 For the development of our protocol, we combined this approach with DEEP‐STD NMR using the structurally characterized catalytic domain (belonging to glycoside hydrolase family 33, GH33) of the intramolecular trans‐sialidase (IT‐sialidase) from the human gut symbiont Ruminococcus gnavus, RgNanH‐GH33, in complex with 2,7‐anhydro‐Neu5Ac (PDB ID: 4X4A) as a benchmark.14 RgNanH‐GH33 is a 489‐residue domain that can be considered out of the typical range for swift assignment and structure determination by NMR spectroscopy and was previously used to develop the DEEP‐STD NMR approach.1
We first performed 2D homonuclear 1H,1H TOCSY experiments on RgNanH‐GH33. The protein was exchanged into 10 mM [D11]Tris D2O buffer (pH 7.8) with 100 mM NaCl and used at a concentration of 1.2 mM. First, a 2D 1H,1H TOCSY reference spectrum of the protein was acquired, followed by two spectra in the presence of 2 and 12 mM TEMPOL. The spectra obtained in the absence and in the presence of increasing concentrations of TEMPOL showed that the probe interacts selectively with some residues of the protein, as only some resonances in the spectra were significantly reduced in intensity (Figure 1). The chemical shifts most affected by the presence of TEMPOL were at 0.6, 0.74, 1.06, 1.15, 1.26, 6.6, 6.74, 7.04, 7.57 and 8.56 ppm (Figure S1 in the Supporting Information). These resonances, although lacking a specific assignment, are typical of aliphatic and aromatic amino acids, and we can exclude the presence of NH resonances in the spectra as the protein was dissolved in a D2O buffer. The resonances identified from the TEMPOL‐attenuated TOCSY spectra of RgNanH‐GH33 were indeed in very good agreement with the predicted chemical shifts of key aliphatic and aromatic residues in the binding pocket of the enzyme (Ile258, Ile338, Val502, Thr557, Tyr525, Tyr677 and Trp698).1
To further validate our approach and exclude false positives (i.e., binding hot‐spots outside the binding site), we carried out MD simulations, an approach successfully used in the past to identify ligand binding pockets for the development of small‐molecule inhibitors.15 Here, MD simulations were used to confirm the accessibility of TEMPOL to the specific binding pocket of RgNanH‐GH33. To efficiently explore the configurational space of the RgNanH‐GH33‐TEMPOL system, three different MD approaches were considered: 1) a long MD simulation (1.0 μs) with a low concentration of TEMPOL (10 mM) starting from a random configuration of the system, 2) 16 independent short MD replicas (0.8 μs in total, 10 mM TEMPOL), and 3) 16 independent short replica MD simulations with a high concentration (50 % w/w) of TEMPOL in water, known as the MixMD approach.12
In each case, we first analysed the backbone RMSD of RgNanH‐GH33 for each trajectory, and showed that the presence of TEMPOL did not affect the structure of the protein, even for MixMD, as the average backbone RMSD was only approximately 1 Å (Figure S5). In the case of the long MD and the 16 short replicas, for which there were relatively few molecules of TEMPOL in the simulation, the interaction between TEMPOL and RgNanH‐GH33 was analysed by computing the contacts between TEMPOL and each residue in RgNanH‐GH33 over the course of each trajectory, in order to construct a fractional occupancy map for each residue. In the case of MixMD, in which there were 381 molecules of TEMPOL in the bounding box, the occupancy was measured by using a 0.5 Å grid to create bins for each TEMPOL molecule in each frame of the trajectory. The resulting 3D histograms were then visualized by means of the isomesh feature in PyMOL16 by using a structure averaged over the whole trajectory for RgNanH‐GH33 (Figure 2).
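The grid‐based occupancy analysis described for MixMD can be illustrated with a minimal Python sketch. It assumes that each probe molecule is represented by a single position per frame (for example its centre of mass, which is an assumption here) and bins these positions on a 0.5 Å grid; the function name `occupancy_grid` and the toy coordinates are hypothetical and are not taken from the actual analysis scripts.

```python
import math
from collections import Counter


def occupancy_grid(frames, origin=(0.0, 0.0, 0.0), spacing=0.5):
    """Bin probe positions on a cubic grid and return the occupancy per voxel.

    frames:  list of frames, each a list of (x, y, z) probe positions in angstroms.
    origin:  corner of the grid in angstroms.
    spacing: voxel edge length in angstroms (0.5 as quoted in the text).
    Returns a dict mapping voxel index (i, j, k) to the average number of
    probe positions found in that voxel per frame.
    """
    counts = Counter()
    for frame in frames:
        for position in frame:
            voxel = tuple(
                math.floor((coord - start) / spacing)
                for coord, start in zip(position, origin)
            )
            counts[voxel] += 1
    n_frames = len(frames)
    return {voxel: n / n_frames for voxel, n in counts.items()}


# Toy trajectory: two frames, two probe molecules each (coordinates in angstroms).
frames = [
    [(0.2, 0.2, 0.2), (1.1, 0.3, 0.3)],
    [(0.2, 0.2, 0.2), (1.7, 0.3, 0.3)],
]
grid = occupancy_grid(frames)
print(grid)
```

A voxel that is occupied in every frame has an occupancy of 1.0, so contouring such a grid at a chosen level highlights the regions in which the probe resides most persistently.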
Firstly, the long MD simulation containing a low concentration of TEMPOL did map several binding hot‐spots, including the area of the binding site, but the outcome was dependent on the starting coordinates of the system. To overcome this issue, the same experiment was repeated with 16 different, independent short replicas of 50 ns each, according to a previous protocol.7 In this case, although the mapping of the binding hot‐spots was clear, the sampling of the surface was not complete (Figure S5). In the MixMD approach, high concentrations of the probe enabled most of the protein surface to be mapped in a short time. In order to avoid biasing the system by the starting coordinates, 12 independent trajectories were run starting from different initial random configurations of the system. Figure 2 displays the average structure of the protein together with the occupancy grid, showing that the area of the known binding site of the 2,7‐anhydro‐Neu5Ac ligand is the major site for the interaction with the paramagnetic probe.
This result clearly excludes the presence of false positives and, more importantly, confirms that TEMPOL is selective for the binding site. Together with the TEMPOL/RgNanH‐GH33 interaction TOCSY experiment, these MD data build a solid argument for the use of TEMPOL‐based TOCSY experiments to identify specific chemical shifts from residues in the binding pocket in order to perform the DEEP‐STD NMR experiments.
We then carried out the DEEP‐STD NMR study with the frequencies identified by the TEMPOL approach. RgNanH‐GH33 (50 μM) in the presence of 2,7‐anhydro‐Neu5Ac (1 mM) in 10 mM [D11]Tris D2O buffer (pH 7.8) with 100 mM NaCl at 298 K was saturated with a train of Gaussian pulses of 50 ms each for 0.75 s, centred on the chemical shifts of the binding hot‐spots (0.6, 0.74, 1.06, 1.15, 1.26, 6.6, 6.74, 7.04, 7.57 and 8.56 ppm).
As the absence of a chemical‐shift assignment for the protein means that the irradiation frequencies corresponding to specific protons in the binding pocket cannot be known, we propose here a novel approach: instead of using a single pair of frequencies to determine the differential epitope map of the ligand (DEEP‐STD map),1 an averaging approach should be followed. First, the DEEP‐STD factors are calculated for all possible pairs of the aliphatic and aromatic frequencies identified experimentally (Figure S3). In our case, this resulted in 25 differential epitope maps. Secondly, all the DEEP‐STD factors obtained are averaged to give a single DEEP‐STD map. This approach produces a more accurate depiction of the orientation of the ligand and of the nature of the amino acids surrounding it in the binding pocket, particularly when no chemical shifts from the protein are available.
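The averaging procedure can be summarized in a short Python sketch. Two points are assumptions rather than statements from this article: the DEEP‐STD factor of ligand proton i is taken as the ratio STD(exp 1)/STD(exp 2) minus the mean of that ratio over all ligand protons, following our reading of ref. 1, and the ten hot‐spot frequencies are split by chemical shift into five aliphatic (0.6–1.26 ppm) and five aromatic (6.6–8.56 ppm) values, which reproduces the 25 pairs mentioned above. The STD intensities and function names are invented for illustration.

```python
from itertools import product


def deep_std_factors(std_exp1, std_exp2):
    """DEEP-STD factor per ligand proton for one pair of experiments.

    Assumed definition: ratio_i = STD_exp1,i / STD_exp2,i, and the factor is
    ratio_i minus the mean ratio over all ligand protons.
    """
    ratios = {proton: std_exp1[proton] / std_exp2[proton] for proton in std_exp1}
    mean_ratio = sum(ratios.values()) / len(ratios)
    return {proton: ratio - mean_ratio for proton, ratio in ratios.items()}


def average_deep_std_map(std_by_frequency, aliphatic, aromatic):
    """Average the DEEP-STD factors over all aliphatic/aromatic frequency pairs."""
    pairs = list(product(aliphatic, aromatic))
    totals = {}
    for f_aliphatic, f_aromatic in pairs:
        factors = deep_std_factors(
            std_by_frequency[f_aliphatic], std_by_frequency[f_aromatic]
        )
        for proton, value in factors.items():
            totals[proton] = totals.get(proton, 0.0) + value
    averaged = {proton: total / len(pairs) for proton, total in totals.items()}
    return averaged, len(pairs)


# Irradiation frequencies (ppm) from the text; the aliphatic/aromatic split is assumed.
aliphatic = [0.6, 0.74, 1.06, 1.15, 1.26]
aromatic = [6.6, 6.74, 7.04, 7.57, 8.56]

# Invented STD intensities for two ligand protons at each irradiation frequency.
std_by_frequency = {f: {"CH3": 2.0, "H9": 1.0} for f in aliphatic}
std_by_frequency.update({f: {"CH3": 1.0, "H9": 1.0} for f in aromatic})

avg_map, n_pairs = average_deep_std_map(std_by_frequency, aliphatic, aromatic)
print(n_pairs, avg_map)
```

With the aliphatic experiment in the numerator, positive averaged factors indicate protons that receive relatively more saturation from aliphatic residues and negative factors indicate protons closer to aromatic residues.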
Figure 3 A shows the experimental average DEEP‐STD map of 2,7‐anhydro‐Neu5Ac binding to RgNanH‐GH33.
The map highlights that CH3, H3ax, H3eq and H5 are oriented toward aliphatic residue protons, whereas H6, H7 and H4 present little to no preferred orientation, and the H9, H9′ and H8 protons are oriented toward aromatic residues. This result is in excellent agreement with the crystal structure of the complex between 2,7‐anhydro‐Neu5Ac and RgNanH‐GH33.1 To confirm that our average DEEP‐STD map is a reliable representation of the architecture of the binding pocket, we compared it with a theoretical prediction of the average DEEP‐STD map obtained by using the CORCEMA‐ST approach (see the Supporting Information).4 The average DEEP‐STD factors calculated by using CORCEMA‐ST (Figure 3 B) are in excellent agreement with the experimentally obtained ones. This result further validates our approach, demonstrating that the TEMPOL‐based TOCSY experiment is a reliable and powerful means of identifying a suitable set of saturation frequencies to carry out DEEP‐STD NMR studies in the absence of protein chemical‐shift assignment.
Although here we have applied this approach to an enzyme with a polar binding pocket that favours H‐bond interactions with TEMPOL, it has been previously shown that the interaction between proteins and TEMPOL can involve weak van der Waals forces, hydrogen bonding and hydrophobic interactions. Several authors have described interactions of TEMPOL with proteins such as ubiquitin, lysozyme, tendamistat, Sso7d, cyclophilin, and BPTI,10, 17, 18, 19, 20, 21 which present different hydrophobicity/hydrophilicity profiles in their binding sites; this makes us confident of the general applicability of the new protocol to different types of protein target. Nonetheless, some limitations are to be expected: TEMPOL must bind the protein with low affinity so as to allow an easy interpretation of the spectra in the absence and in the presence of the paramagnetic agent, and it should not induce changes in the conformation of the protein upon binding, which would lead to misidentification of the resonances to be used for DEEP‐STD NMR or to conformational instability of the protein. The competition of TEMPOL with water tightly bound to the protein is also worth noting; in unfavourable cases this might prevent the probe from approaching the protein surface.18
In summary, we have developed a simple experimental procedure to expand the field of application of the DEEP‐STD NMR methodology for deriving ligand orientation to protein–ligand cases in which the spectral assignment of the protein is not available, that is, when 1) a full NMR assignment is not possible, 2) the predicted chemical shifts from the structure are not in line with the experimental data (e.g., due to the dynamics of the protein, not accounted for in calculations on a static X‐ray structure) or 3) chemical‐shift assignments are lacking. Combining 2D TOCSY experiments in the absence and presence of a paramagnetic probe with the determination of an average DEEP‐STD map by saturation at all the experimentally determined frequencies has been demonstrated to be a powerful approach to determine the type of protein residues most likely to interact with the ligand. The information obtained on the orientation of the ligand in the binding pocket of the protein opens several interesting applications of the DEEP‐STD NMR methodology, for example, in the hit‐to‐lead stage of drug discovery, such as in 3D‐QSAR studies. Further, if combined with the dissociation constant (KD) of the complex, the experimentally obtained averaged DEEP‐STD factors could be used as descriptors to evaluate the success or failure of hit modifications during the hit‐to‐lead stage.
Conflict of interest
The authors declare no conflict of interest.
Biographical Information
Jesús Angulo obtained his doctorate at the Instituto de Investigaciones Químicas (CSIC/University of Seville) in 2002 under the supervision of Dr. Pedro Nieto. He moved to the group of Prof. Thomas Peters (University of Lübeck, Germany) to specialize in ligand‐based NMR spectroscopy. He rejoined the IIQ in 2008 before joining the School of Pharmacy at the University of East Anglia in 2013, where he is a Senior Lecturer in NMR Spectroscopy. His research group focuses on structural and dynamics characterization of biologically active molecules and their complexes, with a particular focus on glycans, by using NMR spectroscopy and computational techniques.
Acknowledgements
This work was supported by the Biotechnology and Biological Sciences Research Council (BBSRC) through a New Investigator grant (BB/P010660/1) awarded to J.A. N.J. and L.E.T. acknowledge funding by the BBSRC Institute Strategic Programmes for Gut Health and Food Safety (BB/J004529/1) and Gut Microbes and Health (BB/R012490/1). We also acknowledge access to UEA's Faculty of Science Research Facilities. Data supporting this article are available as electronic supplementary files at https://bit.ly/2BTwSER, or upon request to the corresponding author.
R. Nepravishta, S. Walpole, L. Tailford, N. Juge, J. Angulo, ChemBioChem 2019, 20, 340.
References
- 1. Monaco S., Tailford L. E., Juge N., Angulo J., Angew. Chem. Int. Ed. 2017, 56, 15289–15293; Angew. Chem. 2017, 129, 15491–15495.
- 2a. Mayer M., Meyer B., Angew. Chem. Int. Ed. 1999, 38, 1784–1788; Angew. Chem. 1999, 111, 1902–1906.
- 2b. Mayer M., Meyer B., J. Am. Chem. Soc. 2001, 123, 6108–6117.
- 2c. Viegas A., Manso J., Nobrega F. L., Cabrita E. J., J. Chem. Educ. 2011, 88, 990–994.
- 2d. Angulo J., Nieto P. M., Eur. Biophys. J. 2011, 40, 1357–1369.
- 2e. Bhunia A., Bhattacharjya S., Chatterjee S., Drug Discovery Today 2012, 17, 505–513.
- 3. Han B., Liu Y., Ginzinger S., Wishart D., J. Biomol. NMR 2011, 50, 43–57.
- 4. Jayalakshmi V., Krishna N. R., J. Magn. Reson. 2004, 168, 36–45.
- 5. Zerbe B. S., Hall D. R., Vajda S., Whitty A., Kozakov D., J. Chem. Inf. Model. 2012, 52, 2236–2244.
- 6. Keniry M. A., Carver J. A. in Annual Reports on NMR Spectroscopy, Vol. 48 (Ed.: G. A. Webb), Academic Press, London, 2002, pp. 32–63.
- 7. Niccolai N., Morandi E., Gardini S., Costabile V., Spadaccini R., Crescenzi O., Picone D., Spiga O., Bernini A., Biochim. Biophys. Acta Proteins Proteomics 2017, 1865, 201–207.
- 8. Solomon I., Phys. Rev. 1955, 99, 559–566.
- 9. Clore G. M., Iwahara J., Chem. Rev. 2009, 109, 4108–4139.
- 10. Fesik S. W., Gemmecker G., Olejniczak E. T., Petros A. M., J. Am. Chem. Soc. 1991, 113, 7080–7082.
- 11. Petros A. M., Mueller L., Kopple K. D., Biochemistry 1990, 29, 10041–10048.
- 12. Ung P. M., Ghanakota P., Graham S. E., Lexa K. W., Carlson H. A., Biopolymers 2016, 105, 21–34.
- 13. Lexa K. W., Carlson H. A., J. Chem. Inf. Model. 2013, 53, 391–402.
- 14. Tailford L. E., Owen C. D., Walshaw J., Crost E. H., Hardy-Goddard J., Le Gall G., de Vos W. M., Taylor G. L., Juge N., Nat. Commun. 2015, 6, 7624.
- 15. Seco J., Luque F. J., Barril X., J. Med. Chem. 2009, 52, 2363–2371.
- 16. The PyMOL Molecular Graphics System, Version 1.7, Schrödinger, LLC.
- 17. Niccolai N., Spiga O., Bernini A., Scarselli M., Ciutti A., Fiaschi I., Chiellini S., Molinari H., Temussi P. A., J. Mol. Biol. 2003, 332, 437–447.
- 18. Scarselli M., Bernini A., Segoni C., Molinari H., Esposito G., Lesk A. M., Laschi F., Temussi P., Niccolai N., J. Biomol. NMR 1999, 15, 125–133.
- 19. Niccolai N., Spadaccini R., Scarselli M., Bernini A., Crescenzi O., Spiga O., Ciutti A., Di Maro D., Bracci L., Dalvit C., Temussi P. A., Protein Sci. 2001, 10, 1498–1507.
- 20. Bernini A., Venditti V., Spiga O., Ciutti A., Prischi F., Consonni R., Zetta L., Arosio I., Fusi P., Guagliardi A., Niccolai N., Biophys. Chem. 2008, 137, 71–75.
- 21. Pintacuda G., Otting G., J. Am. Chem. Soc. 2002, 124, 372–373.