Author manuscript; available in PMC: 2020 Feb 1.
Published in final edited form as: Chembiochem. 2018 Nov 12;20(3):312–318. doi: 10.1002/cbic.201800481

Chemical and biochemical strategies to explore the substrate recognition of O-GlcNAc cycling enzymes

Chia-Wei Hu [a], Matthew Worth [b], Hao Li [a], Jiaoyang Jiang [a],*
PMCID: PMC6433133  NIHMSID: NIHMS1018170  PMID: 30199580

Abstract

The O-linked N-acetylglucosamine (O-GlcNAc) modification is an essential component in cell regulation. A single pair of human enzymes conducts this modification dynamically on a broad variety of proteins: O-GlcNAc transferase (OGT) adds the GlcNAc residue and O-GlcNAcase (OGA) hydrolyzes it. This modification is dysregulated in many diseases, but its exact role on particular substrates remains unclear. In addition, no apparent sequence motif has been found in the modified proteins, and the factors controlling the substrate specificity of OGT and OGA are largely unknown. In this Concept article, we discuss recent developments of chemical and biochemical methods toward addressing the challenge of OGT and OGA substrate recognition. We hope the new concept and knowledge from these studies will promote research in this area to advance understanding of O-GlcNAc regulation in health and disease.

Keywords: glycosylation, O-GlcNAc, substrate recognition, microarrays, chemical probes

Introduction

O-GlcNAcylation is an essential and reversible post-translational modification in metazoans in which N-acetylglucosamine (GlcNAc) is appended to serine and threonine residues of over 1,000 proteins (Figure 1).[15] Dynamic O-GlcNAc cycling regulates a wide variety of cellular functions, including transcription, translation, and signal transduction.[1,2] Importantly, many proteins are aberrantly O-GlcNAcylated in diseases such as cancer,[3,4] diabetes,[5,6] and Alzheimer’s disease.[7,8] Only one enzyme, O-GlcNAc transferase (OGT), is known to add this modification to nucleocytoplasmic and mitochondrial proteins,[9,10] while one other, O-GlcNAcase (OGA), removes this modification (Figure 1).[11,12] Recent advances in quantitative mass spectrometry studies have revealed that the O-GlcNAc level of certain proteins changes in response to environmental stimuli and in different disease states.[13–23] Therefore, understanding what mechanisms regulate the enzyme activity and substrate specificity of OGT and OGA will help elucidate the biological functions of this modification in health and disease.

Figure 1.

Overview of O-GlcNAcylation and its cycling enzymes. Top panel: schematic of reversible O-GlcNAcylation. O-GlcNAc cycling enzymes, O-GlcNAc transferase (OGT) and O-GlcNAcase (OGA), are represented by the full-length ncOGT model (generated from PDB 3PE4 and 1W3B) and a truncated human OGA structure (OGAcryst from PDB: 5TKE). Bottom panel: the schematic of domain architecture of ncOGT and OGA. OGT is comprised of a tetratricopeptide repeat domain (TPR, orange), catalytic domain (N-Cat and C-Cat, blue), and intervening domain (Int-D, light gray). OGA is comprised of a catalytic domain (Catalytic, cyan), stalk domain (Stalk, pink), and pseudo-histone acetyltransferase domain (Pseudo HAT, light gray).

OGT is a member of the GT-B glycosyltransferase family.[24] It is composed of an N-terminal tetratricopeptide repeat (TPR) region and a C-terminal catalytic region that is split by an intervening domain (Figure 1).[10,25,26] The TPR region folds into a super alpha helix and has been suggested to play an important role in protein interactions.[27,28] Furthermore, mutations in the TPR region have been implicated in intellectual disability.[29–31] Three isoforms of human OGT have been reported,[32,33] which differ only in the number of TPR repeats: 13.5 for ncOGT (the most abundant isoform in cytoplasm and nucleus), 9 for mOGT (mitochondria), and 2.5 for a short isoform, sOGT. However, the physiological importance of the two shorter isoforms remains mostly unknown.[34–36] Several studies reported the requirement of the extended TPR region for binding and glycosylating protein but not peptide substrates.[37–40] While the full-length ncOGT is not yet amenable to crystallization, structural studies of OGT4.5 (an N-terminal truncated OGT construct containing 4.5 TPR repeats) in complexes with sugar donor and acceptor peptide substrates have revealed key features in substrate recognition: a) the active site-bound sugar donor forms extensive interactions with the acceptor peptide near the catalytic pocket, and b) a series of asparagine residues lining the inside of the TPR play an important role in binding to the backbones of peptide substrates.[26,41–43] To better understand the substrate specificity of OGT, future research will be needed to illustrate the binding modes of this essential enzyme with protein substrates as well as other interacting partners.

OGA is a multi-domain protein, with an N-terminal catalytic domain sharing significant sequence homology to the glycoside hydrolase 84 (GH84) family of enzymes, a stalk domain, and a C-terminal domain showing sequence homology to histone acetyltransferase (pseudo-HAT domain) (Figure 1).[12,44] However, the pseudo-HAT domain seems to lack residues necessary for binding acetyl coenzyme A, and its functions, including any in substrate binding, are not yet determined.[45,46] OGA substrate recognition is also a topic of intense interest; the extent to which OGA recognizes protein substrates beyond the GlcNAc residue has been a long-standing question.[47,48] As a major breakthrough, structures of truncated human OGA containing the catalytic and stalk domains have recently been solved.[49–51] Intriguingly, a potential substrate-binding cleft created by the dimerized OGA protein was discovered. In support of this, the structures of OGA in complex with each of five distinct glycopeptides demonstrate that all the peptide substrates are bound in the substrate-binding cleft.[49,52] The interactions of GlcNAc in the OGA catalytic pocket are highly conserved, while the peptides are bound in a bidirectional yet generally similar V-shaped conformation. Notably, some peptides are engaged in side-chain specific interactions with OGA residues on the substrate-binding cleft, and mutagenesis analysis confirmed the importance of some of these interactions. It is expected that future research to elucidate how OGA interacts with protein substrates beyond the catalytic domain will provide a more complete understanding of the molecular basis of OGA-substrate recognition.

Besides the progress made with structural studies, traditional approaches such as immunoprecipitation and yeast two-hybrid screening have also been applied to investigate the interactions between OGT and its binding partners.[53–57] However, these methods may not be optimal for studying O-GlcNAcylation due to its dynamic nature, transient interactions, and large number of substrates. Other complementary methods have therefore been introduced and employed to start addressing these challenges, which will be critical for discovering how these enzymes’ activities and specificities are altered in diseases. Here, we will highlight recently reported chemical and biochemical approaches that have not been widely reviewed. Some of these approaches have only started to be developed, while others are already being applied to investigate OGT- and OGA-substrate recognition. Methods to simply detect O-GlcNAcylation or measure O-GlcNAc stoichiometry will not be emphasized here, as they have been extensively reviewed elsewhere.[15,58–62] We hope this concept paper will spur efforts to make new discoveries in this exciting field.

Microarrays to elucidate substrate preferences of OGT and its variants

One of the popular approaches for studying OGT substrate preferences has been microarrays, which can be performed with peptides or purified proteins (Figure 2). The first peptide microarray applied to OGT examined the preference for different types of residues surrounding the glycosylation site.[63] It utilized a panel of biotinylated peptides derived from α-crystallin, a known OGT substrate, where each peptide had one amino acid changed relative to the parent peptide (AIPVSREEK). Following incubation with OGT and UDP-GlcNAz, an azido analogue of UDP-GlcNAc, the glycosylated peptides were labeled with phosphine-FLAG for detection in an azido-ELISA assay. The authors found that OGT had variable tolerance for amino acid substitutions at different sites. For instance, mutations at the +3 position (Glu) relative to the serine glycosylation site produced generally moderate reductions in O-GlcNAcylation, while changing the +2 subsite (Glu) increased glycosylation. In contrast, mutations at the −2 (Pro), −1 (Val), and +1 (Arg) positions led to dramatic reductions in modification. These results suggested that OGT possesses some preferences for residues flanking the glycosylation site, but the extent to which this applies to other peptide substrates remained to be examined. Another recent study utilized radiolabeled UDP-[3H]GlcNAc for peptide O-GlcNAcylation. Following quantification on a scintillation counter, 70 out of 720 peptides were significantly glycosylated, implying that OGT has a substantial level of substrate specificity dictated by the peptide-binding site alone.[43] Peptide microarrays were later used to examine the substrate specificity of different OGT isoforms.
Using RL2 (an anti-O-GlcNAc antibody) for O-GlcNAc detection, one study found that a number of peptides were glycosylated at different levels by each of the three OGT isoforms, indicating differential substrate specificities.[64] Besides the applications to study OGT, a microarray with synthetic glycopeptides was also developed for OGA,[65] which may be used in the future to investigate the substrate preferences of this hydrolase.
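
The single-substitution design described above can be illustrated with a short Python sketch. It takes the parent peptide AIPVSREEK from the text, numbers each subsite relative to the glycosylated serine, and enumerates single-residue variants. Two points are assumptions made only for illustration and are not stated in the source: that every non-serine position is replaced by each of the other 19 standard amino acids, and that the serine itself is left unchanged. The function and variable names are hypothetical.

```python
# Sketch of a single-substitution peptide panel around an O-GlcNAc site.
# Assumptions (not from the source): full 19-residue scan at each position,
# glycosylated serine kept fixed.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard one-letter codes

PARENT = "AIPVSREEK"        # parent peptide quoted in the text
SITE = PARENT.index("S")    # index of the glycosylated serine


def single_substitution_library(parent, site):
    """Return (subsite, original, replacement, peptide) tuples.

    subsite is the position relative to the glycosylation site,
    e.g. -1 for the residue immediately N-terminal to the serine.
    """
    variants = []
    for i, original in enumerate(parent):
        if i == site:
            continue  # keep the acceptor serine intact
        for aa in AMINO_ACIDS:
            if aa != original:
                peptide = parent[:i] + aa + parent[i + 1:]
                variants.append((i - site, original, aa, peptide))
    return variants


# Map each subsite to the parent residue found there.
subsites = {i - SITE: aa for i, aa in enumerate(PARENT) if i != SITE}
library = single_substitution_library(PARENT, SITE)

print(subsites)       # subsite -> parent residue
print(len(library))   # 8 positions x 19 replacements = 152
```

The subsite map reproduces the numbering used in the text (−2 Pro, −1 Val, +1 Arg, +2 Glu, +3 Glu), which makes it easy to group measured signals by position when comparing tolerance to substitution.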

Figure 2.

Figure 2.

Microarray strategy to study OGT substrate preferences. A large number of immobilized peptides or proteins on the chip are incubated with OGT and UDP-GlcNAc (or a sugar donor analogue). Several methods can be used for detection of O-GlcNAcylation, including click chemistry of azido sugar with alkyne-linked biotin and detection by fluorescent streptavidin, anti-O-GlcNAc antibodies, and scintillation counting.

In addition to peptide microarrays, a protein microarray method for OGT has also been established. One commercially available microarray contains about 8,000 purified human proteins and the O-GlcNAcylation of these proteins was detected by CTD110.6, another anti-O-GlcNAc antibody.[66] This protein microarray method identified 230 possible OGT protein substrates in humans, most of which were not previously known. To further improve detection sensitivity and specificity, the same group applied UDP-GlcNAz and detected O-GlcNAzylated proteins with cyclooctyne-linked biotin and fluorescent streptavidin.[67] Using this assay to evaluate the activity of WT OGT and its mutant, they provided direct evidence that an asparagine ladder within the TPR of OGT is important for glycosylation of a large majority of proteins on the microarray.

There are a number of notable features of microarrays. They can test up to thousands of substrates at a time quantitatively and present simplified substrate identification. Peptide microarrays allow quick follow-up experiments with both mutated and modified peptide sequences. Protein microarrays are advantageous in their ability to test all the purified proteins at similar concentrations, which may otherwise be present in cells at highly variable levels or only expressed in certain cell types or conditions. This technique also offers a more accurate assessment of which proteins OGT prefers to glycosylate. However, microarrays have some limitations. These include potential interference with OGT binding due to attachment of peptides or proteins to a solid surface, high cost, low reproducibility, and applicability only in vitro. Endogenous glycosylation or other modifications may already be present on proteins in microarrays as well. Pre-treatment of arrays with bacterial OGA and lambda phosphatase could be an option to reduce the interference from endogenous O-GlcNAcylation and phosphorylation.[68] Nevertheless, microarrays are an efficient approach to evaluate O-GlcNAc cycling enzymes and their variants against a large number of substrates in high throughput.

Chemical probes to study OGT substrate recognition

One of the challenges in studying OGT substrate recognition is that the binding of protein substrate is dependent on prior UDP-GlcNAc binding,[26] and there is a lack of strategies to efficiently distinguish the two binding events. A UDP-GlcNAc analogue called GlcNAc Electrophilic Probe 1 (GEP1) was recently reported to address this limitation. In combination with structure-guided mutagenesis, this new probe can rapidly assess the role of specific OGT residues in binding of sugar donor compared to acceptor substrate (Figure 3).[69] The GEP1 probe contains an allyl chloride electrophile extended from the N-acetyl group of UDP-GlcNAc. In the presence of protein substrates, this probe can be used as a regular sugar donor to conduct glycosylation by OGT. In the absence of acceptor substrates, this probe can efficiently label an OGT active site residue C917 to generate a covalent modification on OGT. The binding ability of OGT toward sugar and protein substrate can be evaluated from the relative levels of probe-modified protein substrate and probe-modified OGT. This can be reported using click-chemistry conjugation of an alkyne fluorophore with the 6’ azido group on a modified GEP1 probe (named GEP1A) in a fluorescence assay (Figure 3). The assay was applied to identify a couple of important asparagine residues in the OGT TPR region for protein substrate binding. Orthogonal radiolabeled kinetic experiments further confirmed the data of this assay. In agreement with the results from other methods (e.g., protein microarrays), these findings suggest that the asparagine residues on the inner surface of the TPR domain are a generic binding region contributing to OGT interactions with different acceptor substrates. The results of the GEP1A fluorescence assay also suggest that the importance of particular TPR residues could be dependent on the protein substrates, thus conferring modest substrate specificity to OGT.
Taken together, this study demonstrated that the GEP1A fluorescence assay can be applied to rapidly screen the binding ability of OGT variants with protein substrates, accelerating the discovery of crucial residues for substrate recognition. Moreover, these experiments highlighted a notable feature of the GEP1A assay, complementary to the conventional activity-based assays: even when no glycosylation is detected, the GEP1A fluorescence assay can still potentially discriminate the impact of the mutated OGT residues on sugar binding compared to protein substrate binding.
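
One hypothetical way to summarize such a readout numerically is sketched below in Python. It is not the quantification used in the cited study; it only illustrates the idea that comparing probe-labeled substrate with probe-labeled OGT, and then normalizing a mutant to wild type, separates a protein-binding defect from a general loss of sugar binding. The function names and all intensity values are invented for illustration.

```python
# Hypothetical summary of a GEP1A-style fluorescence readout.
# All names and numbers are illustrative assumptions, not data from the study.


def substrate_to_ogt_ratio(substrate_signal, ogt_signal):
    """Ratio of probe-labeled substrate to probe-labeled OGT fluorescence."""
    if ogt_signal <= 0:
        raise ValueError("OGT signal must be positive to form a ratio")
    return substrate_signal / ogt_signal


def normalized_to_wild_type(mutant_ratio, wild_type_ratio):
    """Express a mutant's ratio relative to wild type (1.0 means unchanged)."""
    if wild_type_ratio <= 0:
        raise ValueError("wild-type ratio must be positive")
    return mutant_ratio / wild_type_ratio


# Made-up band intensities in arbitrary units: (substrate, OGT).
wild_type = (800.0, 400.0)
tpr_mutant = (200.0, 400.0)   # OGT labeling retained, substrate labeling reduced

wt_ratio = substrate_to_ogt_ratio(*wild_type)
mutant_ratio = substrate_to_ogt_ratio(*tpr_mutant)
relative = normalized_to_wild_type(mutant_ratio, wt_ratio)

print(wt_ratio, mutant_ratio, relative)
```

In this toy case the mutant still labels itself as well as wild type does, so sugar binding appears intact, while the normalized ratio well below 1 would point to weakened protein substrate binding.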

Figure 3.

Figure 3.

Application of GlcNAc Electrophilic Probes (GEPs) to characterize OGT-protein substrate recognition. Top: In the presence of protein substrate, GEP1A can label both OGT and its protein substrate, which can be readily detected by click chemistry-based fluorescence assay. The relative level of modified proteins between OGT-WT and its mutants (derived from structure-guided mutagenesis) reflects the altered OGT ability of protein binding compared to sugar binding. Bottom: Preincubation of GEP1 enables OGT crosslinking with various protein substrates in situ.

The GEP1 probe can also potentially stabilize OGT-protein substrate complexes to facilitate investigation of their interactions. Following a brief pre-incubation with OGT, GEP1 was able to crosslink OGT in situ with over 100 proteins in cell lysates (Figure 3).[69] Most of these crosslinked proteins were reported to be O-GlcNAcylated, and dozens of possible new OGT substrate proteins were also identified. The crosslinking assay of GEP1 can facilitate identification of novel OGT substrates and may be optimized for characterizing the binding mode of particular substrates with OGT. It should be noted that this probe is not currently applicable in cells, and the electrophiles and assay conditions may require optimization for different substrates. Another chemical probe with potential to be adapted for this type of research is Ac4GlcNDAz.[70,71] This GlcNAc analogue is modified with the photoactivatable crosslinking group diazirine and is resistant to hydrolysis by WT OGA. It has been used to identify O-GlcNAcylated protein binding partners, and because OGT and OGA can be O-GlcNAcylated, this probe may be used to study their interactions as well. Overall, small molecule probes are versatile since they can be modified with different functional groups to tune their reactivity or to be applied in entirely new strategies. Consequently, chemical probes have great potential to overcome challenges in the field, including those that have not yet been applied to O-GlcNAc cycling enzymes (such as unnatural amino acids).[72] By improving the specificity, sensitivity, and efficiency of the labeling and detection methods, chemical probes are promising tools for cellular or in vivo studies.

Engineered OGA for enriching glycoprotein substrates and binding partners

Due to the lack of a conserved motif in OGA substrates, approaches to engineer OGA for identifying interacting proteins have recently been developed. One method utilized a catalytically impaired bacterial OGA mutant (CpOGAD298N) that binds to O-GlcNAcylated proteins without quickly hydrolyzing the sugar, and thus can be used for enrichment of O-GlcNAcylated proteins (Figure 4A).[73] Despite the unclear function of bacterial OGA orthologs, affinity purification of this mutant followed by mass spectrometry analysis led to identification of over 2,000 enriched proteins from Drosophila embryo lysates, and 43 of them were mapped with O-GlcNAc sites.[74] Genetic interaction experiments were then applied to validate two novel O-GlcNAc substrates along with their physiological roles. This study revealed the potential of using an OGA mutant not only to investigate OGA interaction with particular substrates in vitro or in cells, but also to examine the OGA-regulated O-GlcNAc repertoire under different conditions.

Figure 4.

Figure 4.

Engineered OGA to enrich substrates and binding partners. A) Catalytically impaired bacterial OGA (CpOGAD298N) binds to substrate without quickly hydrolyzing the sugar. The interacting substrates can then be enriched by affinity purification of CpOGAD298N and identified by LC-MS/MS. B) The expression of an OGA-BirA fusion protein in cells results in proximity-dependent biotinylation of OGA-bound proteins, including OGA substrates. The enriched biotinylated proteins can then be detected by LC-MS/MS.

The second approach applied the proximity-dependent biotinylation assay (BioID), wherein the protein of interest is expressed as a fusion with biotin ligase (BirA), which biotinylates proteins within about 10 nm (Figure 4B).[75] Biotin capture can then be applied to enrich the interacting proteins. One study combined this with stable isotope labeling by amino acids in cell culture (SILAC) to quantitatively analyze the proteins enriched by OGA.[76] The authors identified 90 proteins that were potential interacting partners of OGA and validated that one of them, fatty acid synthase, reduces OGA activity upon binding. Some considerations should be taken into account for OGA BioID methods. For example, exogenous expression of OGA may cause unintended signaling that alters O-GlcNAc homeostasis, and fused proteins may also interfere with binding (the BioID study expressed two different fusion proteins, one each with BirA on either terminus of OGA). Biotin ligation may also not be efficient for transient OGA-substrate interactors, as overnight incubation was required. Of note, CpOGAD298N and OGA-BirA methods could also identify protein interactors that are not direct substrates of human OGA; thus, control experiments are needed to discriminate non-substrate binders. Despite these caveats, the approaches can potentially be applied in different cellular conditions and illustrate the potential of protein engineering in studying OGA protein interactions.

Imaging techniques to investigate OGT substrate- and localization-specific activity in cells

Since the O-GlcNAcylation in cells is dynamic, it is essential to learn how OGT and OGA interact with different substrates in a temporal and spatial manner. Methods that can be applied to answer this question, particularly in live cells, will greatly extend our knowledge about the protein substrate recognition of O-GlcNAc cycling enzymes. A promising strategy to accomplish this is fluorescence imaging, which offers notable advantages over other approaches, including that it can account for protein expression level and cellular localization. Several efforts have utilized genetically encoded fluorescent proteins to detect O-GlcNAcylation on specific substrates. One such approach expressed a construct containing two fluorescent proteins linked by an OGT peptide substrate and an O-GlcNAc binding lectin.[77,78] Glycosylation of the substrate peptide brought the two fluorescent proteins into proximity, which could be detected by changes in their Förster resonance energy transfer (FRET). This method was utilized to monitor the spatiotemporal changes of OGT activity in cells upon serum stimulation. A consideration is that the excess acceptor fluorophore in regular FRET experiments usually produces high background signal and reduces detection sensitivity, so the method may not be suitable for certain substrates with low expression or O-GlcNAcylation level.

An alternative detection method, called fluorescence lifetime imaging microscopy (FLIM-FRET), was applied in an effort to increase detection sensitivity. One such study incorporated Ac4GalNAz (a cell-permeable acetylated GalNAc analogue), which can be metabolically converted to UDP-GlcNAz and used as a sugar donor by OGT in cells (Figure 5).[79,80] The OGT protein substrate of interest was fused with a fluorescent protein (FRET donor). After the substrate is O-GlcNAzylated, it can be conjugated with a fluorophore-linked alkyne FRET acceptor. Stronger FRET between the protein and small molecule fluorophore makes the donor fluorescence decay faster, that is, it shortens the donor fluorescence lifetime, which can sensitively report the O-GlcNAcylation on the particular substrate. This technique overcomes the issue of high background signal produced by excess acceptor fluorophore. The researchers applied this strategy to study O-GlcNAcylation on Tau and β-catenin proteins, which are implicated in neurodegenerative diseases and cancer, respectively.[81,82] In a similar study, O-GlcNAcylation on several proteins was imaged using cyclopropene-tetrazine click chemistry, which avoids the use of potentially toxic copper catalyst and is more specific than cyclooctyne-azide click chemistry.[83] FLIM-FRET imaging of O-GlcNAcylation allows real-time analysis of different OGT variants in specific conditions and compartments of live cells, an advantage not available with many other methodologies. The O-GlcNAcylation observed in the FRET approaches could, however, differ from that in unaltered cells due to the need for exogenous expression and fusion to a fluorophore of the substrate protein. Moreover, differential O-GlcNAcylation could result from an equilibrium of OGT and OGA activities.
A recent study also reported nonspecific reactions of cysteine residues with acetylated GalNAc/GlcNAc analogues that have been widely used for metabolic labeling of O-GlcNAc in cells.[23] Finally, imaging may be difficult with proteins that are not highly glycosylated, and distance limitation between the fluorophores may restrict the application of FRET on certain large substrate proteins. Other strategies such as SNAP-tag, which exploits an alkyltransferase (O6-alkylguanine-DNA-alkyltransferase) to label fusion proteins with a fluorophore, may also be adapted to investigate OGT substrate preferences in cells.[84]
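
The link between donor lifetime and FRET, and the distance limitation mentioned above, can be made concrete with the standard textbook relations E = 1 − τ_DA/τ_D and E = 1/(1 + (r/R0)^6). The Python sketch below applies them; it is a generic illustration rather than the analysis pipeline of the cited studies, and the lifetimes and Förster radius are placeholder values chosen only to show the arithmetic.

```python
# Generic FLIM-FRET arithmetic (textbook relations, illustrative numbers).


def fret_efficiency(tau_donor_alone, tau_donor_with_acceptor):
    """FRET efficiency from donor lifetimes: E = 1 - tau_DA / tau_D."""
    if tau_donor_alone <= 0 or tau_donor_with_acceptor < 0:
        raise ValueError("lifetimes must be positive")
    return 1.0 - tau_donor_with_acceptor / tau_donor_alone


def donor_acceptor_distance(efficiency, forster_radius_nm):
    """Distance from E = 1 / (1 + (r / R0)**6), solved for r."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# Placeholder values: donor lifetime halves when the acceptor is attached,
# and the donor-acceptor pair has an assumed Forster radius of 5 nm.
tau_d = 2.0    # ns, donor alone (hypothetical)
tau_da = 1.0   # ns, donor on the labeled glycoprotein (hypothetical)
r0 = 5.0       # nm (hypothetical)

e = fret_efficiency(tau_d, tau_da)
r = donor_acceptor_distance(e, r0)
print(e, r)
```

Because efficiency falls with the sixth power of distance, a fluorophore attached to a glycosylation site far from the fused fluorescent protein produces little lifetime change, which is why large substrate proteins can be difficult to image this way.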

Figure 5.

Figure 5.

Fluorescence lifetime imaging microscopy-Förster resonance energy transfer (FLIM-FRET) to monitor substrate-specific OGT activity in live cells. An OGT substrate fused with fluorescent protein can be metabolically labeled by azido sugar in cells. Click-chemistry conjugation of glycosylated protein to an acceptor fluorophore leads to proximity-induced FRET between donor and acceptor on the glycosylated substrate. The reduced fluorescence lifetime of the donor can report the O-GlcNAcylation on the protein substrate.

Conclusions

In the past few years, different types of strategies have been introduced and applied to investigate the substrate recognition of O-GlcNAc cycling enzymes. While current knowledge on this topic was mainly derived from structural studies, the new complementary chemical and biochemical strategies mentioned here will be valuable for characterizing OGT and OGA protein substrate interactions under different regulatory conditions in vitro and in cells. These studies, in addition to published structural data, allow us to propose general concepts for substrate recognition of OGT. Although there is no strict sequence motif specifically targeted by OGT, amino acid preferences close to the glycosylated residue and extended substrate binding conformation appear to be important. A series of asparagine residues lining the concave surface of the TPR domain are also likely to provide additional binding affinity and substrate selectivity through interactions with other regions of the protein substrates. Compared to OGT, there is even less known about OGA substrate recognition. As with OGT, most discoveries to date are derived from X-ray crystallography, which revealed that OGA can bind substrates beyond the GlcNAc moiety. However, little is known about the extent of these interactions, and few assays like the ones described above have been applied to OGA. Thus, many questions remain for the substrate recognition of O-GlcNAc cycling enzymes, including:

  • What is the molecular basis of TPR region contribution to the substrate selectivity of OGT?

  • Is the pseudo-HAT domain important for substrate recognition or protein binding to OGA?

  • How much does protein sequence or conformational preference direct OGT or OGA to different substrates?

  • Do recruiter proteins guide OGT or OGA to substrates and, if so, under what conditions?

  • What differences in substrate recognition exist among the isoforms of O-GlcNAc cycling enzymes?

New approaches will still be needed to address the questions above. Future studies of OGT and OGA substrate recognition will likely expand our understanding of cell regulation and promote development of novel molecular tools to modulate O-GlcNAc cycling on particular substrates for biomedical applications.

Acknowledgements

This work was supported by NIH grants R01 GM121718 and R01 GM126300.

Footnotes

Conflict of Interest

The authors declare no conflict of interest.

References
