Abstract
O-linked β-N-acetylglucosamine (O-GlcNAc), a critical post-translational modification predominantly found in the nucleus, plays a substantial role in regulating gene expression by modulating transcription factor (TF) activity. However, a quantitative, proteome-scale analysis of how O-GlcNAcylation influences protein-DNA interactions is still lacking. Herein, a pulldown screening approach using a concatenated tandem array of consensus TF response elements (catTFRE) was employed to determine how changing levels of O-GlcNAcylation affect the DNA binding efficiency of endogenous TFs/co-factors. Using quantitative proteomics, we found that elevated O-GlcNAcylation substantially enhanced the binding of 241 nuclear proteins (NPs) to DNA sequences, whereas DNA binding decreased for only 2 NPs. Notably, elevated O-GlcNAcylation significantly enhanced the binding of 146 TFs/co-factors to specific DNA sequences. We further established that O-GlcNAcylation of several Forkhead family TFs, including FOXA1 and FOXC1, notably enhances their binding to specific DNA sequences in living cells. Our research presents an effective approach for assessing the impact of O-GlcNAcylation on protein-DNA interactions and improves our understanding of the role that O-GlcNAcylation plays in the regulation of transcription.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-025-07074-7.
Subject terms: Biochemistry, Cell biology, Chemical biology, Genetics, Molecular biology
Introduction
O-linked β-N-acetylglucosamine (O-GlcNAc) monosaccharide is a critical post-translational modification (PTM) of nuclear, cytoplasmic, and mitochondrial proteins associated with a wide range of cellular processes, such as gene transcription, chromatin assembly1,2, signal transduction3,4, and cell metabolism5. Unlike traditional glycosylation, O-GlcNAcylation is a reversible protein modification governed by two enzymes: O-linked N-acetylglucosamine transferase (OGT), which attaches O-GlcNAc to substrate proteins, and N-acetyl-β-glucosaminidase (OGA), which removes it6,7. Cells adjust O-GlcNAcylation levels on crucial proteins, thereby modifying their activity and biological functions8. This allows them to respond adaptively to variations in environmental conditions and signaling cues9,10. Our research, along with the work of others, has shown that O-GlcNAcylation is present on a variety of transcription factors (TFs) and other nuclear proteins (NPs) and plays an instrumental role in the regulation of transcription1,11,12.
TFs and co-factors play a crucial role in the rapid activation of adaptive signaling cascades by binding to the promoter regions of various target genes2,13, ensuring the conversion of cell signals into transcriptional reprogramming. Dynamic changes in O-GlcNAcylation levels regulate protein stability and protein–protein interactions, and thereby influence intracellular biological processes and control chromatin behavior14,15. In recent years, the impact of O-GlcNAcylation on protein-DNA interactions has gradually been unveiled16. O-GlcNAcylation can modulate the binding of NPs to DNA, thereby steering gene expression at a genomic level17. This process reconfigures cellular transcriptional patterns and ultimately shapes cellular phenotypes. We previously used quantitative proteomics methods to reveal distinct variations in the chromatin binding of O-GlcNAc modified proteins under different cellular O-GlcNAcylation states. For instance, O-GlcNAcylation has been demonstrated to enhance the binding of the TF FOXA1 to DNA, subsequently altering the chromatin localization of FOXA1 throughout the genome18. This reshapes the expression of downstream genes, thereby amplifying the metastatic and invasive potential of breast cancer cells. It was also reported that O-GlcNAcylation of the co-factor HMGB1 modifies its DNA-binding capabilities and its activities in processing DNA damage19. These findings show that O-GlcNAcylation exerts a potent regulatory effect on the DNA binding capacities of NPs, most notably TFs/co-factors. However, a comprehensive proteome-level analysis of the specific effects of O-GlcNAcylation on the DNA binding of various TFs/co-factors is still lacking.
The Ding group developed a method that allows simultaneous evaluation of the DNA-binding activity of various endogenous TFs, using a synthetic DNA sequence composed of tandem arrays of TF response elements (catTFREs) representing a range of TF families, thus enabling identification of numerous TFs on a proteomic scale20–22. In this study, we employ this pulldown screening approach to dissect the influence of O-GlcNAcylation on the interaction between NPs and specific DNA sequences. Cells were separately treated with an OGT inhibitor (OSMI-1) or an OGA inhibitor (Thiamet G), resulting in two groups of NPs with low and high levels of O-GlcNAcylation, respectively. Using catTFREs to capture TFs and other NPs, we compared the DNA-binding capacities of proteins across groups with different O-GlcNAcylation states. Through label-free quantitative proteomics, we observed a significant increase in the DNA-binding capacity of a large number of TFs/co-factors in response to elevated O-GlcNAcylation levels. More specifically, we found that O-GlcNAcylation of selected Forkhead family TFs significantly enhanced their binding to designated DNA sequences in cells. Our research elucidates the impact of O-GlcNAcylation on protein-DNA interactions, offering insights into its role in transcription regulation.
Materials and methods
Cell culture and reagents
MCF7 and 22RV1 cells were purchased from the Type Culture Collection Cell Bank of the Chinese Academy of Sciences and cultured at 37 °C with 5% (v/v) CO2 within 6 months from resuscitation. All cells were cultured in 90% RPMI-1640 (Gibco) supplemented with 10% fetal bovine serum (FBS; Gibco) and 1% penicillin/streptomycin antibiotics (Gibco). MCF7 cells at 50% confluence were cultured in medium containing 50 μM OSMI-1 (MedChemExpress) or 50 μM Thiamet G (MedChemExpress) for 36 h to regulate the O-GlcNAcylation of TFs. Since both OSMI-1 and Thiamet G are dissolved in dimethyl sulfoxide (DMSO; MedChemExpress), MCF7 cells treated with DMSO served as a negative control23,24. Untreated wild-type MCF7 cells served as the blank control.
Extraction of nuclear proteins (NPs)
MCF7 cells in 10 cm dishes were washed twice with ice-cold phosphate-buffered saline to remove culture medium, then treated with 1 mL of ice-cold Hypotonic Buffer from a nucleoprotein extraction kit (SanGon Biotech, #C500009). NPs were extracted in accordance with the manufacturer’s instructions. The protein concentration of NPs was determined with a bicinchoninic acid protein assay kit (Solarbio). Approximately 2 mg of NPs was extracted from 4 × 10⁷ MCF7 cells. NPs extracted from OSMI-1-treated cells were expected to exhibit low O-GlcNAcylation status, while those extracted from Thiamet G-treated cells were expected to exhibit high O-GlcNAcylation status.
catTFRE and succinylated wheat germ agglutinin (sWGA) pulldown assay
Biotinylated primers (Supplementary Materials Dataset S8), catTFRE-1, catTFRE-2 and catTFRE-80 were synthesized by General Biol Co., Ltd. Dynabeads (BeaverBeads™ Streptavidin) were purchased from Beaver Co., Ltd. Biotinylated DNA was pre-immobilized on Dynabeads and then mixed with NPs from MCF7 cells. The mixtures were supplemented with an equal volume of Lysis Buffer (SanGon Biotech, #C500009) and then incubated for 6 h at 4 °C. The supernatant was discarded, and the Dynabeads were washed twice with Lysis Buffer and then once with phosphate-buffered saline. The catTFRE pulldown beads were re-suspended in 40 μL of 1 × SDS loading buffer and boiled for 15 min.
For the catTFRE-80 competitive binding assay, 5 mg of NPs in the low or high O-GlcNAcylation state were mixed with 5 pM catTFRE-80 pre-bound to beads for 12 h at 4 °C, and the beads were washed twice with Lysis Buffer and then once with phosphate-buffered saline. The catTFRE-80 beads were then re-suspended in 200 μL of elution buffer (0.1 M Glycine, 0.1–0.5% detergent, pH 2.5–3.1) for 1 h at 65 °C. The supernatant was collected, neutralized with 200 μL of neutralization buffer (1 M Tris, pH 8.0), and then mixed with sWGA beads for 12 h at 4 °C. The flow-through was collected. The sWGA beads were washed twice with Lysis Buffer and once with phosphate-buffered saline, then re-suspended in 400 μL of 1 × SDS loading buffer and boiled for 15 min to give the sWGA pulldown sample. Likewise, 1 mg each of non-glycosylated and glycosylated recombinant TFs (GST-FOXA1 or GST-FOXC1) was mixed with 5 pM of catTFRE-80 pre-bound to beads, and sWGA pulldown and flow-through samples were prepared in the same manner.
For sWGA pulldown assay, 50 μL of sWGA-agarose (Vector Laboratories, #AL-1023S) were mixed with 5 mg of NPs in either low or high O-GlcNAcylation states for 12 h at 4 °C. The beads were washed twice with Lysis Buffer and once with phosphate-buffered saline, then resuspended in 50 µL of 1 × SDS loading buffer and boiled for 15 min. The proteins captured by sWGA-agarose were prepared as sWGA pulldown sample.
LC–MS/MS analysis and label-free quantification (LFQ)
Five pM of catTFRE-80 and RS were used to interact with 5 mg of NPs extracted from wild-type MCF7 cells, as described, resulting in a catTFRE-enriched NP sample and an RS negative control sample. Five mg of NPs extracted from wild-type MCF7 cells were also prepared as an NP input sample. Similarly, 5 pM of catTFRE were used to interact with 5 mg of NPs in either low or high O-GlcNAcylation states, producing catTFRE-captured NP samples. Five mg of NPs in both low and high O-GlcNAcylation states were prepared as inputs. Fifty μL of sWGA-agarose (Vector Laboratories, #AL-1023S) were used to interact with 5 mg of NPs in either low or high O-GlcNAcylation states, as described, resulting in sWGA pulldown sample. These samples were loaded on 3% SDS–polyacrylamide gel electrophoresis gels. The bands were excised and then subjected to in-gel trypsin digestion, as previously described25. The digested peptides were passed onto a timsTOF Pro2 mass spectrometer (MS) coupled with a nanoElute HPLC system (Bruker Daltonics), in which a 300 μm i.d. × 5 mm C18 TRAP column (μ-Precolumn, Thermo Scientific) and a 75 μm i.d. × 25 cm in-house packed analytical column containing 1.8 μm C18 RP particles were used. A 60 min gradient at 300 nL/min was adopted to separate the peptides with buffer A (0.1% formic acid) and buffer B (0.1% formic acid in ACN), from 2% B to 22% B in 45 min, to 35% B in 10 min, to 80% B in 5 min, at 80% B for 5 min. Each sample was injected three or six times for more confident identification.
Data acquisition was performed in DIA mode with the following MS parameters. The dia-PASEF method involved one MS1 scan followed by twelve dia-PASEF scans, utilizing 28 mass width windows (25 Da each) across an m/z range of 452–1152, with two mobility windows per mass window, totaling 56 windows. Ion mobility ranged from 0.75 to 1.4 V·s/cm². The ion source voltage was set to 1700 V. Ion fragmentation was achieved using CID, with collision energy varying linearly from 20 eV at 1/K0 of 0.6 V·s/cm² to 59 eV at 1/K0 of 1.6 V·s/cm². DIA files were processed with Spectronaut (BGS Factory Settings) using the direct DIA workflow with default parameters: Trypsin/P as the enzyme, a peptide length range of 7–52, and up to two missed cleavages. Oxidation of methionine and acetylation of protein N-termini were set as variable modifications, and carbamidomethylation of cysteine was specified as a fixed modification. An FDR of 1% was set for both protein and PSM identifications. For protein identification and label-free quantitative analysis, representative proteins were required to be identified (more than two peptides identified in MS) in at least four of six (or two of three) replicates of at least one sample. Missing values were imputed from a normal distribution (downshift of 1.8 standard deviations and a width of 0.3 standard deviations). Proteins that met the fold change ≥ 2 (or 3) threshold for differential levels and p ≤ 0.05 (two-sided unpaired Student’s t test) were considered for the analyses. The annotation analysis was performed using Human TFDB26 and Metascape software (v3.0)27.
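The filtering and imputation steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function names are hypothetical, intensities are assumed to be log2-transformed, and the normal distribution is assumed to be parameterised from the observed values of one sample column, as is common in label-free workflows. The t test is omitted here and would be run with a statistics package.

```python
import random
import statistics


def impute_missing(log2_column, downshift=1.8, width=0.3, seed=0):
    """Fill None entries of one sample column with draws from a down-shifted normal.

    The imputation distribution is centred at (mean - downshift * sd) of the
    observed values and has a standard deviation of (width * sd).
    The seed is fixed only to make this sketch reproducible.
    """
    observed = [v for v in log2_column if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    rng = random.Random(seed)
    return [
        v if v is not None else rng.gauss(mean - downshift * sd, width * sd)
        for v in log2_column
    ]


def passes_valid_value_filter(groups, min_valid):
    """True if at least one group has min_valid or more quantified replicates."""
    return any(sum(v is not None for v in group) >= min_valid for group in groups)


def log2_fold_change(group_a, group_b):
    """Difference of mean log2 intensities (group_a minus group_b)."""
    return statistics.mean(group_a) - statistics.mean(group_b)


# Hypothetical values for illustration only.
column = impute_missing([20.0, 22.0, None, 24.0, 26.0])
keep = passes_valid_value_filter([[21.0, None, None], [22.0, 23.0, None]], min_valid=2)
lfc = log2_fold_change([3.0, 3.0], [1.0, 1.0])  # |log2 FC| >= 1 means fold change >= 2
```

On the log2 scale, the fold change ≥ 2 and ≥ 3 thresholds correspond to absolute log2 differences of at least 1 and about 1.58, respectively.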
Protein purification
The sequences encoding human full-length wild-type FOXA1 and FOXC1 were subcloned into the pET-28a(+) vector containing a C-terminal GST tag, and pMAL-C2X-OGT was purchased from Addgene. The plasmids encoding OGT were a gift from Professor David J. Vocadlo (Simon Fraser University, Canada). Each recombinant protein (with or without OGT co-expression) was expressed as a GST fusion protein in E. coli BL21(DE3) cells at 19 °C for 12 h of induction with 0.5 mM IPTG (Solarbio). Cells were collected by centrifugation at 8000 rpm for 5 min, then resuspended in lysis buffer (20 mM Tris, 500 mM NaCl) and sonicated on ice. The fusion proteins were initially purified from cell lysates using GST 4FF (Pre-Packed Gravity Column, Sangon Biotech).
Immunoprecipitation
Full-length wild-type human FOXA1 (FOXA1^WT) and the O-GlcNAcylation site mutant (Thr432/Ser441/Ser443 → Ala; FOXA1^3A)18 were subcloned into the pCMV-Puro vector containing an N-terminal HA tag. 22RV1 cells transfected with empty vector or FOXA1-expressing vector were used for functional validation of O-GlcNAcylation. Proteins were extracted from cells using Western/immunoprecipitation lysis buffer (Beyotime, #P0013) supplemented with protease inhibitor and phosphatase inhibitor cocktail (Roche) at 4 °C. The protein concentration of each cell lysate was determined by bicinchoninic acid protein assay. For immunoprecipitation (IP) and co-IP, anti-HA-magnetic beads (Bimake, #B26202) were added to each cell lysate, and the mixture was rotated at 4 °C overnight. Immunoprecipitates were then washed with cold Western/IP lysis buffer and prepared as protein samples.
Western blotting (WB)/lectin blotting
Protein samples were separated by SDS–polyacrylamide gel electrophoresis, followed by Coomassie blue staining or transfer to polyvinylidene difluoride membrane (Millipore). The membranes were blocked with 5% nonfat milk solution and incubated with primary antibody at 4 °C overnight. After washing, horseradish peroxidase (HRP)-conjugated secondary antibodies were used for visualization. The primary antibodies used were anti-FOXA1 (Abcam, #ab170933, 1:1000), anti-FOXC1 (Abcam, #ab226219, 1:1000), anti-CREB1 (Abcam, #ab178322, 1:1000), anti-GATA3 (Abcam, #ab199428, 1:1000), anti-O-GlcNAc MultiMab mix (CST, #82332, 1:1000), anti-HA tag (CST, #3724, 1:1000), anti-GAPDH (Proteintech, 60004-1-Ig, 1:4000), and anti-Histone 3 (CST, #4499, 1:1000). Lectin sWGA (Vector Laboratories, #B-1025S, 1:8000) was used for lectin blotting. The secondary reagents used were anti-mouse IgG-HRP (CST, #7076, 1:20,000), anti-rabbit IgG-HRP (CST, #7074, 1:20,000), and Streptavidin-HRP (CST, #3999, 1:50,000).
Quantitative real-time PCR analysis (qPCR)
All primer sequences used for qPCR are provided in Supplementary Material Dataset S8. Total RNA was isolated using Trizol (Invitrogen). A total of 4 μg of RNA was reverse transcribed and amplified using the One Step SYBR Prime-Script PLUS RT-PCR Kit (TaKaRa) and the Thermal Cycler Dice instrument (TaKaRa) according to the manufacturer’s instructions.
FOXA1 chromatin immunoprecipitation qPCR (ChIP-qPCR) assays were performed as previously described18. Briefly, cross-linked chromatin complexes were captured from 1 × 10⁷ cells and sonicated with a Sonics Vcx130pb. After preclearing with vehicle-magnetic beads, chromatin complexes were immunoprecipitated with anti-HA-magnetic beads (Bimake, #B26202) at 4 °C for 12 h. The beads were washed four times at 4 °C with low-salt buffer and twice with high-salt buffer. The beads were then de-cross-linked using Proteinase K (Sigma-Aldrich), and the purified DNA was used for qPCR. For ChIP-qPCR, amplification was performed using primers specific for the indicated FOXA1-binding DNA. The fold enrichment of FOXA1 binding relative to the input was calculated, and IgG was used as the negative control.
For the catTFREs pulldown qPCR, both 1 pM of catTFRE-80 and RS were added to interact with 5 mg of NPs. Subsequently, 2 μg of anti-FOXA1 (Abcam, #ab170933, 1:1000) and anti-FOXC1 (Abcam, #ab226219, 1:1000) antibodies, pre-bound to Protein G beads (Selleck, #B23202), were employed separately to capture the corresponding DNA–protein complexes for 12 h at 4 °C. The DNA–protein complexes obtained through immunoprecipitation were washed twice with phosphate-buffered saline. The complexes were then treated with DC buffer from the MiniBEST DNA Fragment Purification Kit (TaKaRa, #9761) to release the DNA. DNA purification was carried out according to the manufacturer’s instructions. In the qPCR analysis, amplification was conducted using primers specifically designed for catTFRE-80 or RS sequences. The fold enrichment of catTFRE-80 or RS compared to the input DNA was calculated, with IgG serving as the negative control. Similarly, the interaction between 1 pM of catTFREs and 5 mg of NPs from MCF-7 cells, treated with the OGA inhibitor Thiamet G or DMSO, was analyzed.
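The text does not state the formula used to compute fold enrichment relative to input. One widely used convention is the percent-input method, sketched below purely as an assumption. The function names, the input fraction, and the Ct values are hypothetical and are not taken from this study.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Percent of input recovered, assuming ideal doubling per qPCR cycle.

    input_fraction is the fraction of the starting material that was kept
    as the input sample (for example, 0.01 for a 1% input).
    """
    # Shift the input Ct to what it would be for 100% of the starting material.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def fold_over_igg(ct_ip, ct_igg, ct_input, input_fraction):
    """Enrichment of the specific pulldown relative to the IgG negative control."""
    specific = percent_input(ct_ip, ct_input, input_fraction)
    background = percent_input(ct_igg, ct_input, input_fraction)
    return specific / background


# Hypothetical Ct values for illustration only.
recovered = percent_input(ct_ip=24.0, ct_input=24.0, input_fraction=0.25)
enrichment = fold_over_igg(ct_ip=24.0, ct_igg=27.0, ct_input=24.0, input_fraction=0.25)
```

The calculation assumes a primer efficiency close to 100%; with lower efficiency the base of the exponent would need to be adjusted.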
Data analysis and availability
Single comparisons were performed with a two-sided unpaired Student’s t-test using GraphPad Prism 8.0 software. No samples were excluded from the analysis. The correlations between protein or mRNA levels were analyzed by Pearson’s correlation coefficients (r). p values ≤ 0.05 were taken as statistically significant. The data are expressed as means ± SEM. To ensure reproducibility, blots and qPCR were repeated at least twice as indicated in the specific methods and legends. All results were successfully repeated. Detailed n values for each panel in the figures are stated in the corresponding legends. The raw mass spectral data in our study are available via iProX28,29 with identifier PXD057128.
Results
Design and functional identification of catTFREs for TF pulldown
To design DNA sequences for enriching NPs, including TFs, we referred to the JASPAR TF binding database to identify human TF response elements20. We curated response elements from 42 human TF families, encompassing nearly all pivotal human TF families, and engineered a construct by arranging two consecutive copies of each sequence with a three-nucleotide spacer in between. The resulting synthesized DNA sequence spans a total length of 2.6 kb (catTFRE-80, Fig. 1A and Supplementary Materials Dataset S1). To validate the functionality of the designed DNA sequences in enriching TFs, we further developed specialized constructs targeting TFs reported to undergo O-GlcNAcylation. CatTFRE-1 was designed to capture the TFs FOXA1 (Forkhead box protein A1) and FOXC1 (Forkhead box protein C1), while catTFRE-2 was designed to capture the TFs CREB1 (Cyclic AMP-responsive element-binding protein 1) and GATA3 (GATA Binding Protein 3). We synthesized and cloned these three catTFRE sequences into the pUC57 plasmid, and selected a 3.3 kb random sequence (RS) from the R2TV4 plasmid as a non-regulatory DNA control (Fig. 1A and Supplementary Materials Dataset S1).
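The construct layout described above (two consecutive copies of each response element, separated by a three-nucleotide spacer) can be illustrated with a short sketch. The function name, the spacer sequence, and the two motifs are placeholders chosen for illustration; they are not the sequences listed in Dataset S1, and placing the spacer between every adjacent unit is an assumption about the layout.

```python
def build_cattfre(response_elements, copies=2, spacer="AAA"):
    """Concatenate tandem copies of each response element, joined by a spacer.

    Every element is repeated `copies` times, and the spacer is placed
    between all adjacent units (an assumed layout).
    """
    units = []
    for element in response_elements:
        units.extend([element.upper()] * copies)
    return spacer.join(units)


# Placeholder motifs for illustration only.
construct = build_cattfre(["TGTTTAC", "TGACGTCA"])
construct_length = len(construct)
```

With many more elements and longer flanking context per element, the same scheme yields constructs in the kilobase range.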
Fig. 1.
Design and identification of catTFREs for TF pulldown analysis. (A) Outline of the catTFREs pulldown assay used to analyze the impact of O-GlcNAcylation on the interactions between TFs/co-factors and DNA. catTFREs and RS are amplified by PCR with biotinylated primers. The biotinylated DNA is then incubated with NPs, resulting in DNA–protein complexes. DNA–protein complexes are captured by streptavidin magnetic beads, and the captured proteins are analyzed by WB or LC–MS/MS. Additionally, DNA is purified from DNA–protein complexes captured through immunoprecipitation, and the purified DNA is then subjected to qPCR. (B) Biotinylated DNA (catTFRE-1, catTFRE-2, catTFRE-80 and RS) was incubated with NPs (1 mg, 2 mg, and 5 mg). The protein levels of TFs (FOXC1, FOXA1, CREB1 and GATA3) captured by DNA were monitored by WB. (C) Effect of the O-GlcNAcylation levels of TFs (FOXA1, FOXC1) on the protein levels of these TFs in DNA–protein complexes. TFs (FOXA1, FOXC1) were extracted from Thiamet G and DMSO treated MCF7 cells. The O-GlcNAcylation levels of the TFs and the protein levels of the TFs included in DNA–protein complexes were monitored by WB. (D) The levels of catTFRE-80 and RS captured by TFs (FOXA1, FOXC1) were compared by qPCR. The data are presented as means ± SEM, n = 4 biologically independent experiments (two-sided unpaired Student’s t-test). (E) Effect of the O-GlcNAcylation levels of TFs (FOXA1, FOXC1) on the DNA levels of catTFRE-1 in DNA–protein complexes. The O-GlcNAcylation levels of TFs (FOXA1, FOXC1) were analyzed by WB. The levels of catTFRE-1 captured by TFs (FOXA1 and FOXC1) with varying O-GlcNAcylation levels were then compared using qPCR. The data are presented as means ± SEM, n = 3 biologically independent experiments (two-sided unpaired Student’s t-test).
Using biotinylated primers in PCR, we labeled both catTFREs and RS with biotin. Subsequently, the biotinylated catTFREs and RS were subjected to pulldown assays with NPs. catTFRE-80 and catTFRE-1 efficiently captured FOXA1 and FOXC1, while catTFRE-2 and RS exhibited limited affinity towards Forkhead family TFs. Likewise, catTFRE-2 and catTFRE-80 demonstrated a high affinity for CREB1 and GATA3, whereas catTFRE-1 and RS displayed a markedly lower capability to bind these TFs (Fig. 1B). Moreover, we conducted pulldown experiments on NPs from MCF7 cells treated with DMSO or the OGA inhibitor Thiamet G to represent low and high O-GlcNAcylation states, respectively (Fig. 1A). Our results revealed that increased O-GlcNAcylation upon OGA inhibition increased TF binding to catTFREs (Fig. 1C). To confirm the specificity of the interaction between catTFREs and TFs, we used antibodies targeting FOXA1 and FOXC1 to selectively bind their corresponding catTFRE-80-protein and RS-protein complexes. Subsequently, we extracted the DNA from these immunoprecipitated DNA–protein complexes and subjected it to qPCR analysis. CatTFRE-80 exhibited significantly higher affinity for the TFs (FOXA1, FOXC1) than the RS (Fig. 1D). The inhibition of O-GlcNAcylation through the use of an OGT inhibitor resulted in a significant reduction in the DNA binding capacity of FOXA1 and FOXC1, as demonstrated in Fig. 1E. These data demonstrate that the enrichment efficiency of TFs by catTFREs is predominantly determined by the TF-specific DNA sequences, highlighting catTFRE’s utility in assessing the impact of O-GlcNAcylation on TF-DNA binding capabilities.
Quantitative proteomics analysis unveils the impact of O-GlcNAcylation on protein-DNA interactions
To thoroughly elucidate the impact of O-GlcNAcylation on the interaction between TFs and catTFRE-80 at a proteomic level, we utilized both catTFRE-80 and RS to pulldown proteins from a comparable amount of NPs derived from MCF7 cells. Subsequently, the isolated proteins were subjected to LFQ proteomics analysis (Fig. 2A). Proteomics analysis of three biological replicates of NPs input group and catTFRE-80 captured proteins group identified a total of 5618 proteins. After valid value filtering (a protein must be identified in two out of three replicates of at least one sample and must be subcellularly located in the nucleus), we pursued the analysis of 2100 high-confidence NPs, which included 1043 TFs/co-factors. Further analysis showed that catTFRE-80 could capture > 90% of NPs (1721 proteins, including 876 TFs/co-factors) from input NPs (1905 proteins, including 932 TFs/co-factors), underscoring the effectiveness of catTFRE-80 for capturing NPs including TFs/co-factors (Fig. 2B,C, Supplementary Materials Dataset S2). Proteomics analysis of both the catTFRE-80 group and the RS negative control group (each with three biological replicates) identified 2017 NPs with high confidence, including 1026 TFs/co-factors. Subsequent quantitative analyses revealed that, compared to RS, 521 NPs (including 305 TFs/co-factors) were enriched by catTFRE-80 (fold change ≥ 2, p value ≤ 0.05, see Fig. 2D,E, Supplementary Materials Dataset S3). These findings validate the utilization of catTFRE-80 for the enrichment of NPs, inclusive of TFs/co-factors, that partake in DNA interaction at the proteomic level. Furthermore, it can be employed to scrutinize the quantitative alterations in the nucleoprotein-DNA interaction.
Fig. 2.
Quantitative proteomics study reveals the influence of O-GlcNAcylation on the interactions between proteins and DNA. (A) Outline of quantitative proteomics in this study. NPs were extracted from MCF7 cells. Proteins captured by the indicated DNA and input samples were subjected to LC–MS/MS analysis. (B) Venn diagram illustrating the overlap of NPs identified in the input samples and those enriched by catTFRE-80. (C) Venn diagram showing the overlap of TFs/co-factors between the input and catTFRE-enriched samples. (D) Volcano plot of label-free relative quantitative proteomics data of 2017 NPs (n = 3 biologically independent experiments, two-sided unpaired Student’s t-test). 521 proteins were significantly enriched by catTFRE-80 (p values ≤ 0.05, fold change ≥ 2). Selected NPs (FOXA1, FOXC1 and CREB1) with critical functions in the nucleus are labeled. (E) Volcano plot of label-free relative quantitative proteomics data of 1026 TFs/co-factors (n = 3 biologically independent experiments, two-sided unpaired Student’s t-test). 305 TFs/co-factors were enriched by catTFRE-80 (p values ≤ 0.05, fold change ≥ 2). Selected TFs/co-factors with critical functions in transcription are labeled. (F) NPs were extracted from MCF7 cells treated with Thiamet G, DMSO, no treatment or OSMI-1, and their O-GlcNAcylation states were analyzed by WB. (G) Volcano plot of label-free relative quantitative proteomics data of 1374 NPs captured by catTFRE (n = 6 biologically independent experiments, two-sided unpaired Student’s t-test). With the change in O-GlcNAcylation levels, 237 NPs exhibited significantly changed DNA–protein interaction (p values ≤ 0.05, fold change ≥ 2). Selected NPs are labeled.
Next, we analyzed the impact of O-GlcNAcylation on the interaction between NPs and DNA. The regulation of the O-GlcNAcylation level of NPs was achieved by exposing MCF7 cells to OSMI-1 or Thiamet G. This procedure was followed to acquire NPs with significantly different O-GlcNAcylation states (Fig. 2F). We utilized catTFRE-80 to pulldown proteins from OSMI-1 and Thiamet G groups of NPs, and then subjected the enriched proteins to MS analysis. In total, 1374 high-confidence NPs were detected across the two groups (for a protein to be identified, it must appear in at least four out of the six replicates from at least one sample). Following statistical analysis, it was found that due to changes in O-GlcNAcylation, 237 NPs exhibited ≥ 2-fold differences in interaction with DNA (p value ≤ 0.05, Fig. 2G, Supplementary Materials Dataset S4). These findings tentatively indicate that O-GlcNAcylation can modulate the interaction capabilities between DNA and a multitude of NPs, inclusive of TFs/co-factors, on a proteomics scale.
O-GlcNAcylation augments the protein-DNA interaction at the proteomic scale
O-GlcNAc is a major glycosylation modification in the cell nucleus. To analyze the influence of changed O-GlcNAcylation on protein-DNA interaction at a proteomic scale, we used sWGA (a lectin that specifically binds GlcNAc moieties) to enrich O-GlcNAc modified proteins from NPs under both low and high O-GlcNAcylation states, and then related these measurements to the quantitative proteomic evaluation of binding to catTFRE-80. In both low and high O-GlcNAcylation states, sWGA-agarose beads captured 4818 high-confidence NPs (identified in four out of six replicates in at least one sample). After comparison with the known O-GlcNAc modified proteins collected in The O-GlcNAc Database (v2.0)30, 1228 proteins with definitive O-GlcNAcylation sites were identified (Supplementary Materials Dataset S5). By comparison, there were 1557 O-GlcNAc modified proteins present in the input, of which 1187 could be enriched by sWGA-agarose (approximately 76% of the input group), suggesting that sWGA is highly efficient in enriching O-GlcNAc modified proteins from the input samples (Fig. 3A). Since O-GlcNAcylation can potentially influence protein biological activity, it may result in varying protein levels and altered nuclear localization of NPs, depending on the levels of O-GlcNAcylation. Additionally, some proteins may bind nonspecifically to sWGA. To mitigate these effects and quantify the degree of O-GlcNAcylation in NPs from both low and high O-GlcNAcylation samples, we divided the MS quantification value of each NP in the sWGA group by its quantification value in the corresponding input group, and took this ratio as the O-GlcNAcylation degree of that NP (Fig. 3B).
Quantitative analysis revealed that, among the 1346 high-confidence NPs enriched by sWGA, OSMI-1 treatment significantly decreased the O-GlcNAcylation degree of 97 NPs and Thiamet G treatment significantly increased the O-GlcNAcylation degree of 309 NPs (fold change ≥ 3, p value ≤ 0.05, Fig. 3C, Supplementary Materials Dataset S6). These findings suggest that the O-GlcNAcylation levels of these NPs vary significantly between low and high O-GlcNAcylation states, which can be used for further analysis of alterations in DNA binding capabilities.
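The input normalization described above can be written out explicitly. The sketch below is illustrative only: the function names and intensity values are hypothetical, and the same ratio could equally be applied to any pulldown quantified against its matching input.

```python
import math


def enrichment_ratio(pulldown_intensity, input_intensity):
    """Pulldown quantity normalised to the matching input quantity."""
    if input_intensity <= 0:
        raise ValueError("input intensity must be positive")
    return pulldown_intensity / input_intensity


def log2_change(high_state_ratio, low_state_ratio):
    """log2 of the ratio between the high and low O-GlcNAcylation states."""
    return math.log2(high_state_ratio / low_state_ratio)


# Hypothetical intensities for one NP, for illustration only.
degree_high = enrichment_ratio(8.0e6, 2.0e6)    # sWGA / input, Thiamet G group
degree_low = enrichment_ratio(1.0e6, 2.0e6)     # sWGA / input, OSMI-1 group
change = log2_change(degree_high, degree_low)   # log2 of an 8-fold difference
```

Dividing by the input removes differences in protein abundance between the two treatment groups, so that the ratio reflects modification degree rather than expression level.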
Fig. 3.
O-GlcNAcylation enhances the DNA–protein interaction of most NPs. (A) Venn diagram showing the overlap of O-GlcNAc-modified proteins between the input and sWGA pulldown groups. Proteins identified by MS were compared with the known O-GlcNAc-modified proteins collected in The O-GlcNAc Database (v2.0), resulting in the identification of O-GlcNAc-modified NPs. (B) The quantification value of an NP in the sWGA group is divided by its quantification value in the corresponding input group to represent the O-GlcNAcylation degree of each NP. NPs in the low and high O-GlcNAcylation states were used as input groups. sWGA was used to enrich O-GlcNAc-modified proteins, resulting in sWGA pulldown groups. (C) Volcano plot of quantitative proteomics data of NPs’ O-GlcNAcylation degree (n = 6 biologically independent experiments, two-sided unpaired Student’s t-test). 406 NPs showed a significantly changed O-GlcNAcylation degree (fold change ≥ 3, p value ≤ 0.05). (D) The MS quantification value of an NP in the catTFRE-80 group is divided by its quantification value in the corresponding input group to represent the DNA binding capability of each NP. NPs in the low and high O-GlcNAcylation states were used as input groups. catTFRE-80 was used to capture DNA binding proteins, resulting in catTFRE-80 pulldown groups. (E) Volcano plot of quantitative proteomics data of NPs’ DNA binding capability (n = 6 biologically independent experiments, two-sided unpaired Student’s t-test). With the alterations in the O-GlcNAcylation state, the DNA binding capabilities of 360 NPs changed significantly (fold change ≥ 3, p value ≤ 0.05). (F) Scatter plot illustrating the relationship between the alteration in the O-GlcNAcylation degree of NPs and the alteration in their DNA-binding capability. The Pearson correlation coefficient (r) was calculated by linear correlation analysis to assess the strength and direction of this relationship. (G) Sankey diagram showing the relationship between the significantly changed O-GlcNAcylation degree of 354 NPs and their changed DNA binding capability.
Next, we employed catTFRE-80 to capture NPs from equal quantities of the low and high O-GlcNAcylation state groups, and subjected the corresponding input to MS analysis. Similarly, to quantitatively evaluate the variation in the binding capability of a given NP to catTFRE-80 under different O-GlcNAcylation degrees, we divided each NP’s quantification in the catTFRE-80 group by its quantification in the corresponding input group. This yields the DNA binding capability of each NP under low and high O-GlcNAcylation degrees (Fig. 3D). The results indicated that among 1275 O-GlcNAc-modified NPs, alterations in the O-GlcNAcylation state significantly influenced the DNA binding capabilities of 360 NPs (253 with up-regulated and 107 with down-regulated DNA binding capability, fold change ≥ 3, p value ≤ 0.05, Fig. 3E, Supplementary Materials Dataset S6). Moreover, by associating the degree of O-GlcNAcylation with the DNA binding capacity of specific NPs, we discerned a significant positive correlation (Pearson r = 0.77) between the alteration in the O-GlcNAcylation degree of these NPs and the alteration in their DNA binding capability (Fig. 3F), suggesting that higher levels of O-GlcNAcylation augment the DNA affinity of these proteins.
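The correlation analysis can be sketched as follows. This is a minimal illustration that assumes (the text does not state it) that each alteration is expressed as a log2 fold change between the high and low O-GlcNAcylation states; the helper names and the toy values are hypothetical and do not come from Dataset S6.

```python
from math import log2, sqrt


def log2_fold_change(high, low):
    """log2 of the ratio between the high-state and low-state values."""
    return log2(high / low)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical (high state, low state) values for five NPs.
glcnac_degree = [(4.5, 1.0), (3.2, 1.0), (1.1, 1.0), (0.3, 1.0), (6.0, 1.5)]
dna_binding = [(3.9, 1.0), (3.5, 1.1), (0.9, 1.0), (0.5, 1.0), (4.4, 1.0)]

x = [log2_fold_change(h, l) for h, l in glcnac_degree]
y = [log2_fold_change(h, l) for h, l in dna_binding]
r = pearson_r(x, y)
print(f"Pearson r = {r:.2f}")
```

A positive r in such an analysis indicates that NPs with a larger increase in O-GlcNAcylation degree tend to show a larger increase in DNA binding capability; it does not by itself establish a causal link for any individual protein.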
After taking the intersection of glycosylated NPs with a significant change in O-GlcNAcylation degree, we further analyzed the effect of the O-GlcNAcylation state on the DNA binding capabilities of these 354 NPs. The DNA binding capability of 241 NPs was amplified by O-GlcNAcylation. Notably, among 92 NPs exhibiting elevated O-GlcNAcylation degree in the OSMI-1 treated group relative to Thiamet G treated counterparts, 55 demonstrated concomitant increases in DNA-binding capability (Fig. 3G). Conversely, among the 262 NPs displaying heightened O-GlcNAcylation degree in the Thiamet G treated group compared to the OSMI-1 treated group, 186 showed enhanced DNA interaction potential. These coordinated directional shifts in post-translational modification status and functional activity collectively establish a robust positive association between O-GlcNAcylation degree and DNA-binding capability (Fig. 3G). Approximately 30% of NPs (111) did not exhibit significant alterations in their DNA binding capabilities despite variations in the O-GlcNAcylation state. O-GlcNAcylation diminished the DNA binding ability of only 2 NPs (Supplementary Materials Dataset S7). These findings indicate that O-GlcNAcylation can potentiate the interaction of numerous NPs with DNA at the proteome scale, suggesting a universal principle of O-GlcNAcylation acting as a critical intranuclear modification in the functional modulation of NPs.
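The counts quoted in this paragraph can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers stated in the text; the variable names are hypothetical.

```python
# Counts quoted in the text for Fig. 3G: (NPs in group, NPs with concordant DNA-binding change).
groups = {
    "higher O-GlcNAcylation degree in OSMI-1 group": (92, 55),
    "higher O-GlcNAcylation degree in Thiamet G group": (262, 186),
}
unchanged = 111  # NPs without a significant change in DNA binding
reduced = 2      # NPs whose DNA binding is diminished by O-GlcNAcylation

total_changed = sum(total for total, _ in groups.values())
enhanced = sum(concordant for _, concordant in groups.values())

print(total_changed)                       # 354
print(enhanced)                            # 241
print(enhanced + unchanged + reduced)      # 354
print(f"{unchanged / total_changed:.1%}")  # 31.4%
print(f"{enhanced / total_changed:.1%}")   # 68.1%
```

The three categories (241 enhanced, 111 unchanged, 2 reduced) add up to the 354 NPs analyzed, and the unchanged fraction of about 31% matches the "approximately 30%" quoted above.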
The impact of O-GlcNAcylation on the DNA binding capability of TFs/co-factors was further analyzed. Within the sWGA-enriched cohort of 634 TFs/co-factors, treatment with OSMI-1 or Thiamet G changed the O-GlcNAcylation degree of 233 TFs/co-factors (40 decreased and 193 increased, fold change ≥ 3, p value ≤ 0.05, Fig. 4A). Between the two O-GlcNAcylation states, the DNA binding capacity of 47 TFs/co-factors significantly diminished, while it was amplified for 159 others (fold change ≥ 3, p value ≤ 0.05, Fig. 4B). A strong positive correlation (Pearson r = 0.82, Fig. 4C) was observed between the alteration in the O-GlcNAcylation degree of these TFs/co-factors and the alteration in their DNA binding capability, indicating that heightened levels of O-GlcNAcylation boost the DNA affinity of these factors. Statistical analysis revealed that, in a state of reduced O-GlcNAcylation, 23 TFs/co-factors displayed decreased DNA binding proficiency. In contrast, 123 TFs/co-factors demonstrated enhanced DNA binding capacity in a condition of elevated O-GlcNAcylation (Fig. 4D). This indicates that O-GlcNAcylation generally enhances the interaction between DNA and TFs/co-factors. These TFs/co-factors regulated by O-GlcNAcylation comprise 73 TFs, distributed across diverse TF families (Supplementary Materials Dataset S9). The most predominant is the C2H2 family, representing 33% of the total, whereas the Forkhead family, constituting 8%, piqued our interest (Fig. 4E). We found that among the 9 Forkhead family TFs enriched by sWGA, 6 (FOXA1, FOXC1, FOXF2, FOXK1, FOXP1 and FOXP4) demonstrated significant variations in their binding capacity to catTFRE-80 concurrent with notable changes in their O-GlcNAcylation levels (Fig. 4F,G). The elevated DNA binding displayed by these 6 Forkhead TFs in a state of heightened O-GlcNAcylation suggests a significant role for O-GlcNAcylation in regulating the activity of this family of TFs (Fig. 4H).
Fig. 4.
O-GlcNAcylation augments the protein-DNA interaction of most TFs/co-factors. (A) Volcano plot of quantitative proteomics data of TFs/co-factors’ O-GlcNAcylation degree (n = 6 biologically independent experiments, two-sided unpaired Student’s t-test). 233 TFs/co-factors showed a changed O-GlcNAcylation degree (fold change ≥ 3, p value ≤ 0.05). (B) Volcano plot of label-free relative quantitative proteomics data of TFs/co-factors’ DNA binding capability (n = 6 biologically independent experiments, two-sided unpaired Student’s t-test). 206 TFs/co-factors showed a changed DNA binding capability (fold change ≥ 3, p value ≤ 0.05). (C) Scatter plot illustrating the relationship between the alteration in the O-GlcNAcylation degree of TFs/co-factors and the alteration in their DNA-binding capability. The Pearson correlation coefficient (r) was calculated by linear correlation analysis to assess the strength and direction of this relationship. (D) Sankey diagram showing the relationship between the significantly changed O-GlcNAcylation degree of 210 TFs/co-factors and their changed DNA binding capability. (E) The TFs/co-factors regulated by O-GlcNAcylation comprise 73 TFs belonging to 25 TF families. (F) Volcano plot of quantitative proteomics data of Forkhead family TFs’ O-GlcNAcylation degree (n = 6 biologically independent experiments, two-sided unpaired Student’s t-test). 6 TFs showed a heightened O-GlcNAcylation degree (fold change ≥ 3, p value ≤ 0.05). (G) Volcano plot of label-free relative quantitative proteomics data of Forkhead family TFs’ DNA binding capability (n = 6 biologically independent experiments, two-sided unpaired Student’s t-test). 6 TFs showed enhanced DNA binding capability (fold change ≥ 3, p value ≤ 0.05). (H) Sankey diagram showing the relationship between the significantly changed O-GlcNAcylation degree of 9 Forkhead family TFs and their changed DNA binding capability.
O-GlcNAcylation enhances the interaction between FOXA1 and FOXC1 with catTFRE-80 in a cell-free milieu
To corroborate the proteomic findings, we examined the influence of glycosylation on the affinity of the documented O-GlcNAcylated TFs FOXA1 and FOXC1 for catTFRE-80 in a cell-free milieu. In line with the sample preparation procedures for mass spectrometry analysis, we modulated the O-GlcNAcylation levels of TFs in MCF7 cells by treatment with OSMI-1 or Thiamet G, respectively, and incubated nuclear extracts with catTFRE-80. The results showed that, compared with FOXA1 and FOXC1 with low O-GlcNAcylation levels, hyper-glycosylation markedly amplified their interaction with catTFRE-80 (Fig. 5A,B). Subsequently, we constructed prokaryotic expression plasmids bearing GST-tagged FOXA1 or FOXC1 and, after transformation of E. coli BL21 (DE3), obtained recombinant human GST-FOXA1 and GST-FOXC1 fusion proteins. To obtain glycosylated variants of GST-FOXA1 and GST-FOXC1, we co-expressed human OGT in E. coli BL21 (DE3). Following purification via the GST tag, we incubated equal quantities of non-glycosylated and O-GlcNAcylated FOXA1 and FOXC1 with catTFRE-80 in vitro. The results revealed a significantly stronger interaction of O-GlcNAcylated FOXA1 and FOXC1 with DNA (Fig. 5C,D). Additionally, we incubated nuclear extracts from MCF7 cells with catTFRE-80, allowing the endogenous glycosylated and non-glycosylated forms of FOXA1 and FOXC1 in MCF7 cells to compete for binding to catTFRE-80 in vitro. After eluting FOXA1 and FOXC1 bound to catTFRE-80 using a low pH elution buffer, we allowed the O-GlcNAcylated TFs present to bind to an excess of sWGA-agarose and collected the flow-through. Subsequent detection of FOXA1 and FOXC1 in the sWGA-agarose and flow-through fractions using specific antibodies revealed that FOXA1 and FOXC1 bound to catTFRE-80 were almost entirely captured by sWGA-agarose, with no detection of these TFs in the flow-through group (Fig. 5E). 
This confirms that O-GlcNAcylation substantially amplified the interaction between FOXA1, FOXC1, and catTFRE-80. Consistent results were obtained with equal quantities of glycosylated and non-glycosylated GST-FOXA1 and GST-FOXC1 purified via the prokaryotic expression system (Fig. 5F). Together, these results demonstrated that O-GlcNAcylation augments the DNA–protein interplay involving FOXA1 and FOXC1, aligning with insights derived from quantitative proteomics analysis.
Fig. 5.
O-GlcNAcylation enhances the interaction between FOXA1 and FOXC1 with catTFRE-80 in vitro. (A,B) Impact of O-GlcNAcylation levels of Forkhead TFs (FOXA1, FOXC1) on their protein levels within DNA–protein complexes. NPs were extracted from Thiamet G and OSMI-1 treated MCF7 cells. Following in vitro protein-DNA (catTFRE-80) interaction, the levels of O-GlcNAcylation and the protein levels of TFs (FOXA1, FOXC1) within the pulldown/IP DNA–protein complexes were analyzed using WB and lectin blotting. (C,D) Recombinant GST-FOXA1 and GST-FOXC1 were expressed with or without OGT in E. coli BL21 (DE3). Following purification and in vitro protein-DNA (catTFRE-80) interaction, the levels of O-GlcNAcylation and the protein amounts of these TFs within the pulldown DNA–protein complexes were assessed using WB. (E,F) Nuclear extracts from MCF7 cells or TFs (FOXA1 and FOXC1) expressed in E. coli were incubated with catTFRE-80, allowing the glycosylated and non-glycosylated forms of FOXA1 and FOXC1 to compete for binding to catTFRE-80. After eluting the bound FOXA1 and FOXC1 with a low pH elution buffer, O-GlcNAc-modified TFs were captured using an excess of sWGA-agarose (sWGA groups), and the flow-through was collected (Flow groups). The presence of FOXA1 and FOXC1 in the sWGA-agarose and flow-through fractions was then detected using specific antibodies.
O-GlcNAcylation augments the chromatin binding capacity of FOXA1 in living cells
Having demonstrated in vitro that O-GlcNAcylation enhances the ability of TFs to bind DNA, we hypothesized that this post-translational modification exerts a comparable influence within living cells. We examined publicly accessible datasets, specifically an assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq, GSE99378; ref. 31) and FOXA1 chromatin immunoprecipitation sequencing (ChIP-seq, GSE96652; ref. 32) derived from 22RV1 human prostate cancer cells, to identify sites of accessible chromatin where FOXA1 binds. Our investigation revealed five FOXA1 binding sites within the regulatory and coding sequences of the homeobox A (HOXA) gene cluster, specifically within the promoter regions (sites 1, 2 and 5) and the coding regions (sites 3 and 4) of the pivotal HOXA family genes HOXA9 (refs. 33,34), HOXA10 (ref. 35), and HOXA13 (ref. 36), which are implicated in prostate cancer pathogenesis, as depicted in Fig. 6A. The sequences at these sites exhibit a high degree of similarity to the consensus DNA-binding motif recognized by FOXA1. We then examined whether O-GlcNAcylation promotes the interaction between FOXA1 and these specific regions of chromatin. In our previous reports, we identified three O-GlcNAcylation sites on FOXA1 (Thr432, Ser441, and Ser443)18. We overexpressed HA-tagged recombinant wild-type FOXA1 (HA-FOXA1WT) and a glycosylation-deficient variant of FOXA1 with alanine substitutions at these sites (HA-FOXA13A) in 22RV1 cells. The expression levels of HA-FOXA1WT and HA-FOXA13A were comparable and exceeded that of endogenous FOXA1. Relative to HA-FOXA1WT, the triple mutant HA-FOXA13A exhibited a significant reduction in O-GlcNAcylation, as shown in Fig. 6B. The binding capacity of HA-FOXA1WT and HA-FOXA13A to the five FOXA1 binding sites was examined using ChIP-qPCR (Fig. 6C). 
The wild-type FOXA1 presented a considerably higher enrichment at all these binding locations relative to the glycosylation-deficient FOXA1. Moreover, we observed an up-regulation in the mRNA levels of HOXA9, HOXA10, and HOXA13 in 22RV1 cells transfected with HA-FOXA1WT compared to those overexpressing HA-FOXA13A or the control 22RV1 cells (Fig. 6D). These data provided evidence that O-GlcNAcylation enhances the accessible chromatin-binding potential of FOXA1, further stimulating the transcription of the HOXA9, HOXA10, and HOXA13 genes in living cells.
Fig. 6.
O-GlcNAcylation augments the chromatin binding capacity of FOXA1 in 22RV1 cells. (A) Integrative Genomics Viewer (IGV) tracks showing FOXA1 ChIP-seq and ATAC-seq signals at the promoter and coding regions of the HOXA family genes cluster (HOXA9, HOXA10, and HOXA13). (B) The O-GlcNAcylation and protein levels of FOXA1 (HA-FOXA1WT and HA-FOXA13A) expressed in 22RV1 cells were monitored by WB. 22RV1 cells transfected with an empty vector plasmid were used as a control. (C) The binding capacity of HA-FOXA1WT and HA-FOXA13A to the five FOXA1 binding sites in 22RV1 cells was examined using ChIP-qPCR. The data are presented as means ± SEM, n = 3 biologically independent experiments (two-sided unpaired Student’s t-test). (D) The mRNA levels of HOXA family genes (HOXA9, HOXA10, and HOXA13) were analyzed in 22RV1 cells expressing HA-FOXA1WT and HA-FOXA13A using qPCR. The data are presented as means ± SEM, n = 3 biologically independent experiments (two-sided unpaired Student’s t-test).
Discussion
Fluctuations in O-GlcNAcylation levels affect the stability of NPs and their interactions with other proteins, which in turn impacts various intracellular biological processes and regulates cell transcription37,38. TFs/co-factors represent one of the most extensive and varied groups of DNA-binding proteins, playing a crucial role in regulating gene expression39. Over 700 TFs/co-factors have been identified with O-GlcNAcylation sites30. O-GlcNAcylation could modify the binding of TFs/co-factors to DNA, thereby influencing gene expression genome-wide, as demonstrated by our research group and others12,40. However, the precise role of O-GlcNAcylation in regulating the chromatin binding and gene regulatory functions of TFs/co-factors remains unexplored, especially with respect to DNA binding capability. In this study, we employed an efficient methodology to explore how O-GlcNAcylation affects the direct interactions between NPs, especially TFs/co-factors, and specific DNA sequences. Using catTFREs to capture NPs, we compared DNA-binding capacities between low and high O-GlcNAcylation groups, revealing a marked increase in the DNA-binding ability of numerous NPs at higher O-GlcNAcylation levels. Notably, O-GlcNAcylation enhanced the binding of selected Forkhead family TFs, such as FOXA1 and FOXC1, to specific DNA sequences.
To analyze the impact of O-GlcNAcylation on protein-DNA binding capabilities at the proteomic scale, we referenced the JASPAR TF binding database in designing DNA sequences used for enriching NPs41. This approach enabled us to incorporate characteristic DNA binding sequences from nearly all significant human TF families, resulting in the creation of catTFRE-80. These catTFREs were then employed to enrich NPs that bind to these DNA sequences. To mitigate the adverse effects of non-specific binding, we used a control group with RS under identical conditions. During the quantitative proteomic analysis, we established stringent criteria: only high-confidence NPs (those identified in two-thirds of the replicate experiments) with quantitative values exceeding two times that of the RS control group were considered to be specifically enriched by catTFREs. Through this method, we identified over 500 NPs that specifically bind to catTFRE-80, including more than 300 TFs and co-factors. This demonstrates that the approach is effective for enriching NPs with DNA sequence specificity at the proteomic level, and it can be further employed for the quantitative analysis of the DNA-binding capabilities of TFs. Similarly, we employed sWGA-based affinity capture to selectively enrich O-GlcNAcylated proteins from the two groups of NPs and then evaluated the O-GlcNAcylation degree of each NP. Given the limited efficacy of sWGA in enriching NPs bearing a single O-GlcNAc, the O-GlcNAcylation degree quantification for these NPs may be lost. Comparing a quantitative characterization study of O-GlcNAc glycoproteins with our research, we found that sWGA could nevertheless enrich single O-GlcNAc-modified NPs, including ALB, BCOR, FAS, and VTA1 (ref. 42). Our comparative analysis revealed that the O-GlcNAcylation degree values of 309 NPs were significantly elevated in the Thiamet G-treated group relative to the OSMI-1-treated group (Fig. 3B), demonstrating the quantitative capacity of the sWGA-based strategy. 
Although mono-O-GlcNAcylated proteins may have been insufficiently enriched, the O-GlcNAcylation degree values derived from sWGA-based enrichment appeared minimally affected. Our ongoing investigations will employ the integration of orthogonal enrichment methodologies to generate a more panoramic landscape of O-GlcNAcylated proteoforms.
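The two specificity criteria stated above (identification in at least two-thirds of the replicate experiments, and a quantitative value more than twice that of the RS control) can be sketched as a simple filter. This is an illustrative Python sketch, not the study's analysis code: it assumes that missing identifications are recorded as `None`, that group means are compared, and that an NP never detected in the RS control passes the ratio criterion. The function name and example values are hypothetical.

```python
from statistics import mean


def is_specifically_enriched(cat_values, rs_values, min_ratio=2.0):
    """Apply the two criteria: >= 2/3 replicate identification and > min_ratio x RS control."""
    detected = [v for v in cat_values if v is not None]
    # High confidence: identified in at least two-thirds of replicates
    # (integer arithmetic avoids floating-point comparison of fractions).
    if not detected or 3 * len(detected) < 2 * len(cat_values):
        return False
    rs_detected = [v for v in rs_values if v is not None]
    if not rs_detected:
        return True  # assumption: no signal in the RS control counts as specific
    return mean(detected) > min_ratio * mean(rs_detected)


# Hypothetical quantification values for three NPs across six replicates.
print(is_specifically_enriched([10, 12, None, 11, 9, None], [4, 5, 4, None, 5, 4]))
print(is_specifically_enriched([10, None, None, None, 9, None], [4, 5, 4, 4, 5, 4]))
print(is_specifically_enriched([10] * 6, [6] * 6))
```

In this toy example only the first NP passes: the second is identified in too few replicates, and the third does not exceed twice the RS control signal.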
Further analysis of the enriched TFs/co-factors found that catTFRE-80 could enrich TFs/co-factors across numerous TF families, covering most human TF families, including some not used in the catTFRE-80 sequence design. This is likely due to shared DNA binding sequence features across different TF families and to variation in the DNA sequences that the same TF recognizes at different chromatin locations. These TFs/co-factors play a critical role in various cellular processes, such as chromatin remodeling, gene transcription and the DNA damage response. By quantifying the DNA-binding capacity of these TFs/co-factors under different O-GlcNAcylation degrees, we discovered a significant positive correlation between the alteration in O-GlcNAcylation level and the alteration in DNA binding capability. This finding suggests that O-GlcNAcylation generally enhances the DNA binding of most TFs/co-factors, a potential universal mechanism whereby O-GlcNAcylation regulates the functions and activities of TFs/co-factors. Disruptions in O-GlcNAcylation levels, as seen in diseases like cancer, may regulate the expression levels of downstream target genes at the transcriptomic level through this mechanism, thereby playing a crucial role in disease development.
In previous research, we reported that the breast cancer-associated TF FOXA1 experiences abnormal O-GlcNAcylation at high levels in tumor cells with high metastatic potential18. This elevates the binding of FOXA1 to chromatin, but there is insufficient evidence to confirm that the direct binding of FOXA1 to DNA is regulated by O-GlcNAcylation. In this study, we discovered that not only FOXA1, but multiple TFs in the Forkhead family can enhance their binding to DNA under the influence of O-GlcNAcylation. We verified this finding in quantitative proteomics of the Forkhead family TFs and catTFRE-80, as well as in subsequent FOXA1 ChIP-qPCR experiments. By mutating the glycosylation sites of FOXA1, we further confirmed the important role of FOXA1’s O-GlcNAcylation in its transcriptional regulatory function in prostate cancer cells. These results reveal that O-GlcNAcylation has a significant impact on the transcriptional regulatory functions of many TFs/co-factors, including the Forkhead family. This is achieved by enhancing the binding of these critical TFs/co-factors to DNA, thereby affecting the transcription of numerous target genes. O-GlcNAc modulates the stability of TFs, as well as TF-protein and TF-DNA interactions, thereby influencing transcriptional regulation. Considering that this study does not involve absolute quantification of the glycosylation levels of proteins, our future investigations will integrate mass tag methodologies to obtain a more accurate quantification of proteins’ O-GlcNAcylation levels. While our study offers an effective approach for assessing the impact of O-GlcNAc on protein-DNA interactions, the regulation of TF stability and protein–protein interactions by O-GlcNAc remains largely unexplored. This underscores the need for the development of new high-throughput methodologies to facilitate more comprehensive investigations. 
Additionally, many proteins undergo multiple PTMs, which can influence each other in a phenomenon known as ‘PTM crosstalk’. A particularly intriguing example of PTM crosstalk occurs between O-GlcNAcylation and phosphorylation, as both modifications can target either identical Ser/Thr residues or spatially adjacent sites within functional domains. The crosstalk between these two abundant post-translational modifications is extensive, and may arise both from steric competition for occupancy at the same or proximal sites and from each modification regulating the structure of functional domains43. Therefore, phosphorylation and O-GlcNAcylation may cooperatively regulate the interaction between TFs/co-factors and DNA, although further experimental validation is required. Our previous research found that the hydrogen-bond network between the O-GlcNAcylated serine region and the major groove of FOXA1-binding motif DNA sequences enhances the interaction of the FOXA1 CTD with DNA18. In this study, many Forkhead family members showed enhanced DNA-binding capability with increased O-GlcNAcylation degree, indicating that O-GlcNAcylation may be a major regulator of the DNA-binding capability of TFs. We also identified some proteins associated with transcriptional complexes, such as TBP and TAF family members. Statistical analysis revealed a robust positive correlation (r = 0.95; Supplementary Dataset S10) between the alteration in the O-GlcNAcylation degree of these proteins and the alteration in their DNA-binding capability, demonstrating the critical role of O-GlcNAcylation in the regulation of transcriptional complexes.
As an in vitro method with naked DNA as the template, the catTFRE approach does have certain limitations. Because naked DNA is used to assess the potential DNA-binding activities of TFs/co-factors regulated by O-GlcNAcylation, the approach does not capture that the corresponding sequences in cells may be shielded within nucleosomes or by histone/DNA methylation that is itself regulated by O-GlcNAcylation. The DNA binding ability of a TF/co-factor measured by catTFRE may therefore not fully reflect its actual activity at all loci on the chromosome. Alterations to the reporter in diverse contexts, variations in its intracellular expression levels, and increased expression of proteins that may bind to it and modify its accessibility represent potential confounders. Such factors could introduce systematic errors in quantifying the extent of protein binding to the construct. Future efforts will require the development of multi-omics approaches to comprehensively characterize the modification status of the genome, the modification states of NPs, and the dynamic interplay between them. The O-GlcNAcylation level of NPs was regulated by exposing MCF7 cells to OSMI-1 or Thiamet G, so the protein quantity of the same TF/co-factor would not be exactly equal between the two groups of NPs. Although the input group was used as a control to evaluate the O-GlcNAcylation degree and DNA binding capability of TFs/co-factors, the use of calculated O-GlcNAcylation degree and DNA binding capability might unexpectedly amplify the differences between the two groups. Reducing the O-GlcNAcylation degree of NPs with OGA, then comparing the O-GlcNAcylation degree and DNA binding capability of untreated and OGA-treated NPs, might further optimize the method. 
Considering that the folding of recombinantly expressed proteins might be influenced by O-GlcNAcylation, which in turn regulates their biological functions, determining the crystal structures of glycosylated and non-glycosylated proteins to analyze conformational changes induced by glycosylation may be warranted.
In summary, our study proposes an efficient method for discerning the control of O-GlcNAcylation on DNA–protein interaction, through an in-depth quantitative analysis of O-GlcNAcylation levels and the DNA binding aptitude of proteins. Employing this strategy, an exhaustive evaluation of the influence of O-GlcNAcylation on the DNA binding ability of numerous TFs/co-factors was conducted. We found that O-GlcNAcylation generally intensifies the connection between TFs/co-factors and DNA, thereby broadening the established understanding of transcription-related O-GlcNAcylation on a proteomic scale. These findings offer fresh perspectives and supporting data on how O-GlcNAcylation exercises its regulatory roles within cellular transcription.
Acknowledgements
This study is supported by the National Natural Science Foundation of China (32171282, 32471331), the Fundamental Research Funds for the Central Universities (DUT23YG114) and Liaoning Province’s “Xingliao Talent Plan” Youth Top Talents (XLYC2203069).
Author contributions
Y.L. performed experiments, analyzed data, and contributed to figure and manuscript preparation. G.L., F.M. and X.Z. performed experiments, contributed to study design, analyzed data, and contributed to figure preparation. K.Y., N.Z., K.Z. and H.H. performed experiments, analyzed data, and contributed to figure preparation. W.L., J.Z., W.W. and Y.R. designed the study, analyzed data, prepared the figures, and wrote the manuscript.
Data availability
The raw mass spectral data in our study is available via iProX with identifier PXD057128 (https://www.proteomexchange.org/).
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Guofang Li, Fanxu Meng and Xiaomin Zhong contributed equally to this work.
Contributor Information
Wei Wang, Email: wangwei_9111@hotmail.com.
Yan Ren, Email: reny@bgi.com.
Yubo Liu, Email: liuyubo@dlut.edu.cn.
References
- 1.Fehl, C. & Hanover, J. A. Tools, tactics and objectives to interrogate cellular roles of O-GlcNAc in disease. Nat. Chem. Biol.18, 8–17. 10.1038/s41589-021-00903-6 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Saha, A., Bello, D. & Fernandez-Tejada, A. Advances in chemical probing of protein O-GlcNAc glycosylation: Structural role and molecular mechanisms. Chem. Soc. Rev.50, 10451–10485. 10.1039/d0cs01275k (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Slawson, C., Copeland, R. J. & Hart, G. W. O-GlcNAc signaling: A metabolic link between diabetes and cancer?. Trends Biochem. Sci.35, 547–555. 10.1016/j.tibs.2010.04.005 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Ruan, H. B. et al. Calcium-dependent O-GlcNAc signaling drives liver autophagy in adaptation to starvation. Genes Dev.31, 1655–1665. 10.1101/gad.305441.117 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Pedowitz, N. J., Batt, A. R., Darabedian, N. & Pratt, M. R. MYPT1 O-GlcNAc modification regulates sphingosine-1-phosphate mediated contraction. Nat. Chem. Biol.17, 169–177. 10.1038/s41589-020-0640-8 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Lazarus, M. B., Nam, Y., Jiang, J., Sliz, P. & Walker, S. Structure of human O-GlcNAc transferase and its complex with a peptide substrate. Nature469, 564–567. 10.1038/nature09638 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Li, B., Li, H., Lu, L. & Jiang, J. Structures of human O-GlcNAcase and its complexes reveal a new substrate recognition mode. Nat. Struct. Mol. Biol.24, 362–369. 10.1038/nsmb.3390 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Li, J. et al. O-GlcNAcylation of YTHDF2 antagonizes ERK-dependent phosphorylation and inhibits lung carcinoma. Fundam. Res. (2024).
- 9.Wang, H. F. et al. Protein O-GlcNAcylation in cardiovascular diseases. Acta Pharmacol. Sin.44, 8–18. 10.1038/s41401-022-00934-2 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Li, X. et al. OGT controls mammalian cell viability by regulating the proteasome/mTOR/ mitochondrial axis. Proc. Natl. Acad. Sci. USA120, e2218332120. 10.1073/pnas.2218332120 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Chen, Y. et al. O-GlcNAcylation determines the translational regulation and phase separation of YTHDF proteins. Nat. Cell Biol.25, 1676–1690. 10.1038/s41556-023-01258-x (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Wang, L. et al. Chromatin-associated OGT promotes the malignant progression of hepatocellular carcinoma by activating ZNF263. Oncogene42, 2329–2346. 10.1038/s41388-023-02751-1 (2023). [DOI] [PubMed] [Google Scholar]
- 13.Lambert, S. A. et al. The human transcription factors. Cell172, 650–665. 10.1016/j.cell.2018.01.029 (2018). [DOI] [PubMed] [Google Scholar]
- 14.Liu, Y. et al. O-GlcNAcylation promotes topoisomerase IIalpha catalytic activity in breast cancer chemoresistance. EMBO Rep.24, e56458. 10.15252/embr.202256458 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Qian, K. et al. Transcriptional regulation of O-GlcNAc homeostasis is disrupted in pancreatic cancer. J. Biol. Chem.293, 13989–14000. 10.1074/jbc.RA118.004709 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Ma, J., Hou, C. & Wu, C. Demystifying the O-GlcNAc code: A systems view. Chem. Rev.10.1021/acs.chemrev.1c01006 (2022). [DOI] [PubMed] [Google Scholar]
- 17.Liu, Y. et al. Proteomic profiling and genome-wide mapping of O-GlcNAc chromatin-associated proteins reveal an O-GlcNAc-regulated genotoxic stress response. Nat. Commun.11, 5898. 10.1038/s41467-020-19579-y (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Liu, Y. et al. FOXA1 O-GlcNAcylation–mediated transcriptional switch governs metastasis capacity in breast cancer. Sci. Adv.9, eadg7112. 10.1126/sciadv.adg7112 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Balana, A. T. et al. O-GlcNAcylation of high mobility group box 1 (HMGB1) alters its DNA binding and DNA damage processing activities. J. Am. Chem. Soc.143, 16030–16040 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Ding, C. et al. Proteome-wide profiling of activated transcription factors with a concatenated tandem array of transcription factor response elements. Proc. Natl. Acad. Sci. USA110, 6771–6776. 10.1073/pnas.1217657110 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Zhou, Q. et al. A mouse tissue transcription factor atlas. Nat. Commun.8, 15089. 10.1038/ncomms15089 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Bai, L. et al. Proteome-wide profiling of readers for DNA modification. Adv. Sci. (Weinh)8, e2101426. 10.1002/advs.202101426 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Yuzwa, S. A. et al. A potent mechanism-inspired O-GlcNAcase inhibitor that blocks phosphorylation of tau in vivo. Nat. Chem. Biol.4, 483–490. 10.1038/nchembio.96 (2008). [DOI] [PubMed] [Google Scholar]
- 24.Ortiz-Meoz, R. F. et al. A small molecule that inhibits OGT activity in cells. ACS Chem. Biol.10, 1392–1397. 10.1021/acschembio.5b00004 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25. Zhang, S. et al. Identification of the O-GalNAcylation site(s) on FOXA1 catalyzed by ppGalNAc-T2 enzyme in vitro. Biochem. Biophys. Res. Commun. 514, 157–165. 10.1016/j.bbrc.2019.04.146 (2019).
- 26. Hu, H. et al. AnimalTFDB 3.0: A comprehensive resource for annotation and prediction of animal transcription factors. Nucleic Acids Res. 47, D33–D38. 10.1093/nar/gky822 (2019).
- 27. Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 10, 1523. 10.1038/s41467-019-09234-6 (2019).
- 28. Chen, T. et al. iProX in 2021: Connecting proteomics data sharing with big data. Nucleic Acids Res. 50, D1522–D1527. 10.1093/nar/gkab1081 (2022).
- 29. Ma, J. et al. iProX: an integrated proteome resource. Nucleic Acids Res. 47, D1211–D1217. 10.1093/nar/gky869 (2019).
- 30. Malard, F., Wulff-Fuentes, E., Berendt, R. R., Didier, G. & Olivier-Van Stichelen, S. Automatization and self-maintenance of the O-GlcNAcome catalog: a smart scientific database. Database (Oxford) 10.1093/database/baab039 (2021).
- 31. Chen, Z. et al. Diverse AR-V7 cistromes in castration-resistant prostate cancer are governed by HoxB13. Proc. Natl. Acad. Sci. USA 115, 6810–6815. 10.1073/pnas.1718811115 (2018).
- 32. Kron, K. J. et al. TMPRSS2-ERG fusion co-opts master transcription factors and activates NOTCH signaling in primary prostate cancer. Nat. Genet. 49, 1336–1345. 10.1038/ng.3930 (2017).
- 33. Malek, R. et al. TWIST1-WDR5-Hottip regulates Hoxa9 chromatin to facilitate prostate cancer metastasis. Cancer Res. 77, 3181–3193. 10.1158/0008-5472.CAN-16-2797 (2017).
- 34. Song, Y. P. et al. Comprehensive landscape of HOXA2, HOXA9, and HOXA10 as potential biomarkers for predicting progression and prognosis in prostate cancer. J. Immunol. Res. 2022, 5740971. 10.1155/2022/5740971 (2022).
- 35. Li, B. et al. HoxA10 induces proliferation in human prostate carcinoma PC-3 cell line. Cell Biochem. Biophys. 70, 1363–1368. 10.1007/s12013-014-0065-7 (2014).
- 36. Luo, Z., Rhie, S. K., Lay, F. D. & Farnham, P. J. A prostate cancer risk element functions as a repressive loop that regulates HOXA13. Cell Rep. 21, 1411–1417. 10.1016/j.celrep.2017.10.048 (2017).
- 37. Jiang, M. et al. O-GlcNAcylation promotes colorectal cancer metastasis via the miR-101-O-GlcNAc/EZH2 regulatory feedback circuit. Oncogene 38, 301–316. 10.1038/s41388-018-0435-5 (2019).
- 38. Gambetta, M. C. & Muller, J. O-GlcNAcylation prevents aggregation of the Polycomb group repressor polyhomeotic. Dev. Cell 31, 629–639. 10.1016/j.devcel.2014.10.020 (2014).
- 39. Bushweller, J. H. Targeting transcription factors in cancer—From undruggable to reality. Nat. Rev. Cancer 19, 611–624. 10.1038/s41568-019-0196-7 (2019).
- 40. Parker, M. P., Peterson, K. R. & Slawson, C. O-GlcNAcylation and O-GlcNAc cycling regulate gene transcription: Emerging roles in cancer. Cancers (Basel) 10.3390/cancers13071666 (2021).
- 41. Rauluseviciute, I. et al. JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 52, D174–D182. 10.1093/nar/gkad1059 (2024).
- 42. Woo, C. M. et al. Mapping and quantification of over 2000 O-linked glycopeptides in activated human T cells with isotope-targeted glycoproteomics (isotag). Mol. Cell Proteomics 17, 764–775. 10.1074/mcp.RA117.000261 (2018).
- 43. Xu, S., Suttapitugsakul, S., Tong, M. & Wu, R. Systematic analysis of the impact of phosphorylation and O-GlcNAcylation on protein subcellular localization. Cell Rep. 42, 112796. 10.1016/j.celrep.2023.112796 (2023).
Data Availability Statement
The raw mass spectral data from this study are available via iProX with the identifier PXD057128 (https://www.proteomexchange.org/).