Journal of Industrial Microbiology & Biotechnology. 2025 Sep 8;52:kuaf028. doi: 10.1093/jimb/kuaf028

Characterization of S-glycosylated glycocins containing three disulfides

Rachel M Martini 1,#, Chandrashekhar Padhi 2,#, Wilfred A van der Donk 3,4,5
PMCID: PMC12457901  PMID: 40920454

Abstract

Glycocins are a growing family of ribosomally synthesized and posttranslationally modified peptides (RiPPs) that are O- and/or S-glycosylated. Using a sequence similarity network of putative glycosyltransferases, the thg biosynthetic gene cluster (BGC) was identified in the genome of Thermoanaerobacterium thermosaccharolyticum. Heterologous expression in Escherichia coli showed that the glycosyltransferase (ThgS) encoded in the BGC adds N-acetyl-glucosamine (GlcNAc) to Ser and Cys residues of ThgA. The peptide derived from ThgA, which we name thermoglycocin, was structurally characterized and shown to resemble glycocin F. In addition to two nested disulfide bonds also present in glycocin F, thermoglycocin contains a third disulfide bond creating a C-terminal loop. Unexpectedly, ThgA lacks the common double glycine motif for leader peptide removal by a C39-peptidase. Based on AlphaFold3 modeling, we postulated that cleavage between the leader and core peptide would occur instead at a GK motif, which was experimentally confirmed for an orthologous BGC from Ornithinibacillus bavariensis. Its structurally similar product termed orniglycocin was also produced in E. coli and carries two GlcNAc moieties on two Cys residues. The C39 peptidase domain of the peptidase-containing ATP-binding cassette transporter (PCAT) from this BGC removed the leader peptide after a Gly-Lys motif and the orniglycocin so produced demonstrated antimicrobial activity. This study adds to the small number of characterized glycocins, employs AlphaFold3 to predict the leader peptide cleavage site, and suggests a common naming convention similar to that established for lanthipeptides.

One-Sentence Summary: Thermoglycocin from Thermoanaerobacterium thermosaccharolyticum and orniglycocin from Ornithinibacillus bavariensis were produced heterologously in E. coli, shown to contain three disulfide bonds and two GlcNAcylations, and were released by a unique C39 protease that cleaves at a Gly-Lys sequence.

Keywords: RiPPs, leader peptidase, double Gly motif

Introduction

Recent investigations of ribosomally synthesized and posttranslationally modified peptides (RiPPs) have yielded many novel posttranslational modifications (Nguyen et al., 2024). RiPPs are synthesized as a precursor peptide that contains two regions: a leader peptide and a core peptide (Arnison et al., 2013, Oman & van der Donk, 2010). The core peptide is acted on by modifying enzymes encoded in the biosynthetic gene cluster (BGC), after which the leader peptide is typically cleaved off by proteolysis to create the mature peptide product (Eslami & van der Donk, 2023, Oman & van der Donk, 2010). Glycocins are a growing family of RiPPs, characterized by glycosylations on Ser, Thr, or Cys residues (Hata et al., 2010, Izquierdo et al., 2009, Kaunietis et al., 2019, Main et al., 2020, Maky et al., 2021, Maky et al., 2015, Nagar & Rao, 2017, Norris & Patchett, 2016, Oman et al., 2011, Ren et al., 2018, Stepper et al., 2011). Compared to O-linked glycosylations, S-linked glycosylations are rare in biology but they have greater stability and resistance to chemical and enzymatic cleavage (De Leon et al., 2017, Maynard et al., 2016). Previous studies have shown that glycosyltransferases from glycocin biosynthetic pathways can be used to glycosylate non-native substrates (Fujinami et al., 2021, Oman et al., 2011). Thus, the prevalence of S-linked glycosylations in glycocins opens up opportunities for the application of their glycosyltransferases to other fields (Sharma et al., 2021, Wang et al., 2014). Given the ubiquitous nature of glycosylations with N-acetyl-glucosamine (GlcNAc) in eukaryotic organisms (Hart et al., 2011), enzymes that generate S-linked GlcNAc modifications are particularly attractive for engineering purposes (Maynard et al., 2016). We therefore set out in this study to identify and characterize additional S-glycosyltransferases that would conjugate GlcNAc to Cys residues in peptides and to identify their products.

We used heterologous expression in Escherichia coli to produce and characterize two O/S-glycosylated peptides. A BGC (termed the thg locus) from the thermophilic bacterium Thermoanaerobacterium thermosaccharolyticum was chosen for investigation because enzymes from thermophiles generally display high stability that is desirable for engineering and use in biocatalytic processes (Chatterjee et al., 2023, Chettri et al., 2021, Gomes et al., 2016, Zhu et al., 2020). Indeed, other enzymes from this organism have been investigated for their potential use in biotechnology because of their thermostable and robust properties (Pei et al., 2012). In addition, a similar BGC from Ornithinibacillus bavariensis J43TS3 was selected for investigation (org locus).

In this work, we co-expressed the precursor peptides ThgA and OrgA with their glycosyltransferases ThgS and OrgS, respectively, in E. coli resulting in predominantly bisglycosylation with GlcNAc. The sites of glycosylation and the disulfide pattern of the products were determined through tandem mass spectrometry as well as proteolytic digest, confirming a disulfide pattern postulated previously (Norris & Patchett, 2016), which demonstrated that the products are members of the glycocin F family. Further, by reconstituting the activity of the C39-peptidase domain of the peptidase-containing ATP-binding cassette transporter (PCAT) in the org BGC, we showed that peptide maturation occurs by proteolytic cleavage at the non-canonical Gly-Lys motif instead of the well-studied double Gly motif, thus expanding the recognition site specificity of C39-peptidases and facilitating correct prediction of the final products. We show that the glycocin from O. bavariensis has antibiotic activity.

Materials and Methods

General Methods

Chemicals and media for cultures were purchased from Thermo Fisher Scientific or Sigma Aldrich unless otherwise stated. Polymerase chain reactions were carried out using a C1000 Bio-Rad thermocycler and were catalyzed using Q5 polymerase (NEB). Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry analyses were carried out using a Bruker UltrafleXtreme instrument (Bruker Daltonics) through the UIUC Mass Spectrometry facility, using 50 mg/mL Super-DHB (Sigma, catalog number 50862-1G-F) dissolved in 80% acetonitrile and 0.1% trifluoroacetic acid (TFA) as matrix. For MALDI-TOF analysis, peptides were desalted using a C18 ZipTip, eluted with 8 μL of 80% aqueous acetonitrile containing 0.1% TFA, and then mixed 1:1 v/v with matrix before spotting on a MALDI plate.

Identification of the thg BGC

A BLAST search (Boratyn et al., 2013) using the UniProt database (Consortium, 2024) with GccA as a query sequence, an E-value of 5 for the sequence retrieval, a maximum BLAST sequence number of 1,000, and an E-value of 5 for the sequence similarity edge calculation was used to generate a list of proteins related to this GlcNAc glycosyltransferase. The sequence similarity network (SSN) (Atkinson et al., 2009) webtool from the Enzyme Function Initiative-Enzyme Tools (EFI-EST) (Gerlt et al., 2015) was used to generate an SSN from the output of the BLAST search. The SSN was visualized using Cytoscape (Shannon et al., 2003) (Supplementary Fig. S1) with a filter value of 50%. The network was manually inspected for enzymes related to known glycosyltransferases involved in glycocin biosynthesis, especially those that resembled the GlcNAc S-glycosyltransferases AsmA (Main et al., 2020) and GccA (Stepper et al., 2011; Venugopal et al., 2011), to identify potential orthologs. To further narrow potential BGCs to likely glycocin BGCs, the EFI-Genome Neighborhood tool was used to identify glycosyltransferases with a PCAT (PFAM: PF03412) nearby using a neighborhood size of 10 and minimal co-occurrence percentage lower limit set to 20.

Plasmid Assembly

Genes encoding ThgA, ThgS, OrgA, and OrgS were codon-optimized for E. coli and ordered as double-stranded DNA from Twist Biosciences (see sequences in Table S1). In addition, genes encoding residues 33–157 of ThgT (truncated to aid in solubility), and the N-terminal 156 residues of OrgT containing the C39 peptidase domains from the PCATs of the respective BGCs were ordered codon-optimized. The synthetic genes were inserted into pRSFDuet-1 (thgA and thgS) or pACYCDuet-1 (thgT33-157 and thgS) with HiFi Gibson assembly master mix (NEB). Similarly, orgA and orgS genes were cloned into the MCS1 and MCS2 of pRSFDuet-1 vector, respectively. The gene orgT156 was cloned into a pET28a backbone. See Table S1 for the precise insertion sites into the plasmids. The assembled plasmids were used to transform chemically competent E. coli NEB Turbo cells that were then plated on LB agar with the appropriate antibiotic. Plasmid sequences were verified via Sanger DNA sequencing at the UIUC Core Sequencing Facility or using whole plasmid sequencing (Plasmidsaurus).

Heterologous Expression of Peptides and Proteins and in Vitro Modification

Plasmids were used to transform E. coli Express SHuffle cells (NEB; for ThgA/ThgS/ThgT33-157) or E. coli BL21 (DE3) TUNER (for co-expression of ThgA/ThgS, or OrgA/OrgS), or E. coli BL21(DE3) T1R competent cells (for expression of ThgS and MBP-ThgT33-157). Escherichia coli BL21 (DE3) TUNER cells were also used for the expression of recombinant OrgT156. A single colony was used to inoculate small (5 mL) cultures of LB containing 50 mg/L kanamycin (for pRSFDuet-1 and pET28), chloramphenicol (for pACYCDuet-1), and 100 mg/L spectinomycin (for SHuffle cells). Cultures were grown overnight at 37 °C with shaking. For expression in E. coli BL21 (DE3) TUNER cells, a modified TB medium (Padhi et al., 2024) was used containing 24 g/L yeast extract, 20 g/L tryptone, and 2% glycerol supplied with 17 mM KH2PO4, and 72 mM K2HPO4 post-sterilization. The small cultures were diluted in 1 L of terrific broth (TB) and grown at 37 °C for TUNER cells (30 °C for SHuffle cells) to OD600 = 0.6. Cultures were incubated on ice for 20 min. Isopropyl β-D-1-thiogalactopyranoside (IPTG) (GoldBio) was added to 0.5 mM. Cultures were grown overnight at 18 °C (16 °C for SHuffle cells). Cells were harvested via centrifugation at 8,000 × g for 10 min.
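As a worked illustration of the phosphate supplement in the modified TB medium, the sketch below converts the stated millimolar concentrations into grams per liter. The molar masses are standard values for the anhydrous salts and are an assumption here; the text does not state which salt forms were weighed.

```python
# Molar masses (g/mol) of the anhydrous salts; textbook values, not taken from the paper.
MOLAR_MASS = {"KH2PO4": 136.09, "K2HPO4": 174.18}

# Final concentrations (mM) stated for the modified TB medium.
PHOSPHATE_MM = {"KH2PO4": 17.0, "K2HPO4": 72.0}


def grams_per_liter(conc_mm: float, molar_mass: float) -> float:
    """Convert a millimolar concentration to grams of salt per liter."""
    return conc_mm / 1000.0 * molar_mass


recipe = {
    salt: round(grams_per_liter(conc, MOLAR_MASS[salt]), 2)
    for salt, conc in PHOSPHATE_MM.items()
}
print(recipe)
```

Under these assumptions the supplement corresponds to roughly 2.3 g/L KH2PO4 and 12.5 g/L K2HPO4.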

For proteins and peptides expressed in SHuffle cells, the harvested pellet was resuspended in lysis buffer (20 mM NaH2PO4 pH 7.5, 500 mM NaCl, 0.5 mM imidazole). Cells were lysed by sonication, and cell debris was removed via centrifugation at 45,000 × g for 1 hr. Supernatant was loaded onto a Ni-NTA immobilized metal affinity chromatography column (Takara Bio). Beads were washed with 20 mM NaH2PO4 pH 7.5, 500 mM NaCl, and 30 mM imidazole. Peptide or protein was eluted with 20 mM NaH2PO4 pH 7.5, 100 mM NaCl, 1 M imidazole. In the case of ThgS, the buffer was exchanged to protein storage buffer (50 mM HEPES pH 8, 300 mM NaCl, 10% glycerol) using an ultracentrifugal filter (Amicon) with a 30-kDa molecular weight cut off and frozen at −80 °C in small aliquots for later use. The buffer containing MBP-tagged ThgT33-157 purified by Ni-affinity chromatography was similarly exchanged to 50 mM Tris pH 8.

In the case of peptides and proteins expressed in TUNER cells, the harvested pellet was resuspended in 50 mM Tris-HCl buffer containing 300 mM NaCl, 10% glycerol, and 20 mM imidazole at pH 8.0 (NPI20; 20 representing the imidazole concentration in mM) followed by cell lysis by sonication at an amplitude of 40% for 20 cycles of 5 s on/off with a 6-mm probe. The cell lysate was centrifuged at 12,000 × g to obtain the soluble fraction, which was then incubated with Ni-NTA agarose (MCLAB) for 1 hr at 4 °C. The beads were subsequently washed with 3 × column volume (CV) of NPI40, followed by elution with 3 × CV of NPI750. Eluted peptides or proteins were exchanged into 50 mM Tris-HCl buffer containing 100 mM NaCl and 10% glycerol.

Modified or unmodified ThgA was further purified by preparative reversed-phase HPLC (Macherey-Nagel NUCLEODUR C18, 5 μm, 250×10 mm column) on an Agilent 1260 Infinity II HPLC system. Solvent A contained 0.1% TFA in H2O. Solvent B contained 0.1% TFA in acetonitrile. The gradient increased linearly from 2 to 60% B over 35 min. The yield of modified ThgA co-expressed with ThgS was about 150 μg per liter of culture.
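The linear gradient described above can be written as a simple function of time. This is a minimal sketch: the function name `percent_b` and the clamping outside the ramp are illustrative assumptions, since the text does not describe what happens before or after the 35 min ramp.

```python
def percent_b(t_min: float, start: float = 2.0, end: float = 60.0,
              duration: float = 35.0) -> float:
    """Percent solvent B at time t_min for a linear ramp from start to end.

    Values are clamped to the start or end composition outside the ramp
    (an assumption made for illustration).
    """
    if t_min <= 0:
        return start
    if t_min >= duration:
        return end
    return start + (end - start) * t_min / duration


slope = (60.0 - 2.0) / 35.0  # percent B per minute, about 1.66
print(round(slope, 2), percent_b(17.5))
```

Halfway through the ramp (17.5 min) the mobile phase would contain 31% B.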

In Vitro Glycosylation of ThgA By ThgS

In vitro glycosylation of ThgA by ThgS was performed as described previously for other glycosyltransferases (Wang & van der Donk, 2011) using uridine-5’-diphosphate-α-D-glucose (UDP-Glc) or uridine-5’-diphosphate-α-D-N-acetylglucosamine (UDP-GlcNAc; Sigma) under reducing conditions (1 mM tris(2-carboxyethyl)phosphine, TCEP). The reaction was incubated overnight at 25 °C. Peptide was desalted using a C18 SPE column and lyophilized. To facilitate the formation of disulfide bonds in ThgA after glycosylation, the glycosylated peptide was incubated overnight with a mixture of oxidized and reduced glutathione as described previously (Wu et al., 2019).

Fully modified peptide (mThgA) was purified by reversed-phase HPLC using an Agilent 1260 Infinity analytical HPLC with a Vydac SelectaPore monomeric C18 column (90 Å, 5 µm, 2.1 mm ×150 mm). All mThgA-containing fractions were collected and lyophilized for storage at −20 °C.

Identification and Characterization of Post-Translational Modifications

The identity of the sugar modification was determined by acid-catalyzed hydrolysis and gas chromatography-mass spectrometry (GC-MS) of the cleaved sugar. Sugar samples were converted into volatile derivatives as described previously (Oman et al., 2011), with the exception of a change in standards to N-acetylgalactosamine, N-acetylglucosamine, and N-acetylmannosamine. Chromatograms were acquired using a GC-MS system (Agilent Inc, CA, USA) consisting of an Agilent 7890 gas chromatograph, an Agilent 5975 MSD and a HP 7683B autosampler. Gas chromatography was performed on a ZB-5MS (60 m × 0.32 mm I.D. and 0.25-μm film thickness) capillary column (Phenomenex, CA, USA). The inlet and MS interface temperatures were 250 °C, and the ion source temperature was adjusted to 230 °C. An aliquot of 1 μL was injected with a split ratio of 10:1. The helium carrier gas was kept at a constant flow rate of 2 mL/min. The temperature program was: 5 min isothermal heating at 70 °C, followed by an oven temperature increase of 5 °C min−1 to 310 °C and a final 10 min at 310 °C. The mass spectrometer was operated in positive electron impact mode (EI) at 69.9 eV ionization energy at m/z 50–800 scan range combined with single ion monitoring (SIM) mode. For the SIM mode, a m/z 319 fragment for derivatized acetyl-hexosamines was analyzed (Mairinger et al., 2020) with 274 as a secondary qualifier fragment to minimize background from an interfering compound. Acquired peaks were evaluated by the Mass Hunter Quantitative Analysis B.08.00 (Agilent Inc., CA, USA) software. The instrument variability was within the standard acceptance limit (5%). GC-MS analysis was performed in the Carver Metabolomics Core (University of Illinois Urbana-Champaign Roy J. Carver Biotechnology Center).
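The oven program above fixes the total run time per injection. The short sketch below works it out from the stated hold times, temperatures, and ramp rate; the function names are illustrative, not part of any instrument software.

```python
def oven_program_minutes(hold_start: float, t_start: float, ramp_per_min: float,
                         t_end: float, hold_end: float) -> float:
    """Total run time: initial hold + linear ramp + final hold (all in minutes)."""
    return hold_start + (t_end - t_start) / ramp_per_min + hold_end


def oven_temperature(t_min: float) -> float:
    """Oven temperature (deg C) at time t_min for the program in the text."""
    if t_min <= 5:
        return 70.0
    return min(310.0, 70.0 + 5.0 * (t_min - 5))


total = oven_program_minutes(5, 70, 5, 310, 10)
print(total)  # 5 min hold + 48 min ramp + 10 min hold
```

With the values given in the text the program lasts 63 min, and the oven reaches 310 °C at 53 min.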

LC-MS analysis was performed on an Agilent 1290 QToF for ESI-HR-MS and MS/MS analysis. For full length precursors, LC separation was conducted at 50 °C on a 5%-60% gradient of acetonitrile-water (+0.1% formic acid) over 8 min at 1 mL/min flow rate on a Phenomenex Aeris 3.6 μm WIDEPORE XB-C18 LC column (part nr. 00D-4482-E0). Mass spectra were collected in positive mode with 170 V fragmentor voltage at 10 spectra/s and 100 ms/spectrum. For digested peptide fragments, LC separation was conducted at 50 °C on a 10%–80% gradient of acetonitrile-water (+0.1% formic acid) over 11 min at 0.6 mL/min flow rate on a Phenomenex Aeris 2.6-μm PEPTIDE XB-C18 LC column (part nr. 00F-4505-E0). Mass spectra were collected in positive mode with 170-V fragmentor voltage at 10 spectra/s and 100 ms/spectrum. Tandem-MS fragmentation was achieved at mass-dependent normalized collision energies using a slope function [formula: (slope)×(m/z)/100 + offset value], where slope = 1 and offset = 20 and 30 for charge 3 and 2, respectively. HR-MS/MS analysis was performed using the Interactive Peptide Spectral Annotator (IPSA) tool (Brademan et al., 2019) and verified manually.
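The slope function for the collision energy can be sketched as follows. The precursor m/z values in the example are hypothetical and chosen only to show the arithmetic; offsets for charge states other than 2 and 3 are not given in the text, so the function rejects them.

```python
# Offsets stated in the text for precursor charge states 2 and 3.
OFFSETS = {2: 30.0, 3: 20.0}


def collision_energy(mz: float, charge: int, slope: float = 1.0) -> float:
    """Collision energy from the slope function: slope * (m/z) / 100 + offset."""
    if charge not in OFFSETS:
        raise ValueError(f"no offset given in the text for charge {charge}")
    return slope * mz / 100.0 + OFFSETS[charge]


# Hypothetical precursors, for illustration only.
print(collision_energy(800.0, 2))  # 38.0
print(collision_energy(900.0, 3))  # 29.0
```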

mThgA and mOrgA Structural Characterization

To determine the location of GlcNAc modification, mThgA was labeled with NEM as described below and purified by C18 SPE column. The modified peptide was resuspended in buffer (100 mM Tris pH 8, 2 mM CaCl2) and digested with chymotrypsin (Worthington Biochemical Corporation) for 1 hr at 37 °C. Cleaved peptide was purified by C18 TopTip (Glycen) and then analyzed via liquid chromatography tandem mass spectrometry (LC-MS/MS) on an Agilent 1290 LC-MS QToF using a C18 column (Kinetex 2.6 μm) with a gradient of 2–60% B (for make-up, see above) over 8 min, a flow rate of 0.4 mL/min, and collision-induced dissociation (CID) energy of 30 eV.

The number of disulfide bonds in mThgA was determined by labeling with N-ethylmaleimide (NEM) under reducing conditions. The peptide was incubated at 70 °C for 15 min in buffer containing 100 mM sodium citrate (pH 6), 6 M guanidine HCl, 10 mM EDTA, and 10 mM TCEP. The reaction mixture was cooled to room temperature and NEM was added to 10 mM final concentration and the reaction was left in the dark for 30 min at 37 °C. The reaction mixture was desalted with a C18 ZipTip (Agilent) and analyzed by MALDI-TOF MS.
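Because NEM adds to a free thiol as an intact molecule, the number of labeled Cys residues can be read from the mass shift after alkylation. The sketch below computes the expected monoisotopic shift from the elemental composition of NEM (C6H7NO2); the atomic masses are standard values, and the helper names are illustrative assumptions.

```python
# Monoisotopic atomic masses (Da); standard values, not taken from the paper.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# N-ethylmaleimide, C6H7NO2, adds as a whole molecule to each free thiol.
NEM_FORMULA = {"C": 6, "H": 7, "N": 1, "O": 2}


def mono_mass(formula: dict) -> float:
    """Monoisotopic mass of an elemental composition."""
    return sum(MONO[element] * count for element, count in formula.items())


NEM_MASS = mono_mass(NEM_FORMULA)  # about 125.0477 Da


def nem_shift(n_cys: int) -> float:
    """Expected mass increase when n_cys free cysteines are alkylated."""
    return n_cys * NEM_MASS


def labeled_cys(observed_shift: float) -> int:
    """Nearest whole number of NEM adducts for an observed mass shift."""
    return round(observed_shift / NEM_MASS)


print(round(NEM_MASS, 4), round(nem_shift(6), 3))
```

Six alkylated Cys residues would correspond to a shift of about 750.29 Da relative to the unlabeled reduced peptide.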

To determine the location of the disulfide bonds, monoglycosylated ThgA was resuspended in buffer (50 mM Tris pH 8, 0.5 mM CaCl2) and digested by thermolysin (Promega) with a 1:100 peptide to enzyme ratio for 1 hr at 37 °C. Digested fragments were desalted using C18 TopTip (Glycen) and lyophilized. LC-MS/MS was carried out using a C4 column (Jupiter, 5 μm, 300 Å pore size, 50×2 mm) with a gradient of 5–95% B (for make-up, see above) over 4 min, a flow rate of 0.4 mL/min, and CID energy of 30 eV. LysC digests of mThgA were carried out with an enzyme-to-substrate ratio of 1:20–1:100. The reaction was carried out in 20 mM Tris at pH 8 and incubated for 16 hr at room temperature.

For alkylation experiments of OrgS-modified OrgA (mOrgA), the samples were directly incubated with 5 mM TCEP at 50 °C for 30 min followed by incubation with 15 mM of freshly prepared NEM for 30 min in the dark at RT. For tandem-MS analysis, a desalted-alkylated mOrgA sample was dried under vacuum followed by resuspension in 50 mM Tris-HCl (pH 8) containing 100 mM NaCl. Chymotrypsin (Promega) or LysC (NEB) endoproteinase digestions were performed at 1:100 enzyme: substrate ratio for 2 hr in the presence of their respective buffers as per the manufacturer's recommendations, followed by incubation with 1 mM TCEP at 37 °C for 30 min. Samples were then analyzed by LC-MS/MS as described before for peptide fragment analysis.

Characterization of the Protease Maturation At the Non-Canonical GK-motif

N-terminally 6xHis-tagged OrgT156 (OrgT156 encompassing the first 156 residues) was expressed in E. coli BL21 (DE3) TUNER cells and purified as described in the sections above. For activity assays, 50 µM of the substrate (mThgA or mOrgA produced in E. coli) was reacted with 5 µM of the enzyme (OrgT156) at room temperature for 16 hr followed by LC-MS analysis in the presence or absence of 1 mM TCEP. LC separation was conducted at 50 °C on a 5%–60% gradient of acetonitrile-water (+0.1% formic acid) over 8 min at 1 mL/min flow rate on a Phenomenex Aeris 3.6 μm WIDEPORE XB-C18 LC column (part nr. 00D-4482-E0). Mass spectra were collected in positive mode with 170 V fragmentor voltage at 10 spectra/s and 100 ms/spectrum.

Bioactivity Testing of Thermoglycocin

Bisglycosylated ThgA was cleaved with LysC to generate the core peptide, which was purified by HPLC. Fractions containing thermoglycocin as determined by MS were lyophilized and resuspended in sterile water to a final concentration of 10–100 μM. The peptide was screened for bioactivity against Bacillus cereus TZ417 (37 °C; ATCC medium 3: 3 g beef extract, 5 g peptone, 15 g agar per L), E. faecalis 29212 (37 °C; ATCC medium 3), G. thermodenitrificans NM16-2 (50 °C; ATCC medium 3), S. epidermidis (37 °C; BHI media), Bacillus subtilis ATCC 6633 (30 °C; LB), B. subtilis Δspβ (30 °C; LB), Lactococcus lactis (37 °C; TSB media), B. megaterium B-14308 (30 °C; ATCC medium 3), and B. cereus ATCC 14579 (30 °C; ATCC medium 3). Strains were grown overnight at the indicated temperature and in the medium shown. The cultures were diluted to a final OD of ∼0.05 in 20 mL molten agar media and poured over a plate prepared with set agar containing wells formed by placing a sterile 96-well plate over the agar while it set. Thermoglycocin (10 μL of 10–100 μM) was added to the wells, and plates were incubated overnight or until growth was visible in the top agar layer.

Bioactivity Testing of Orniglycocin

The core peptide fragment of OrgT156-digested mOrgA (termed orniglycocin) was purified by HPLC, which resulted in 0.16 mg per liter of culture. The purified product was lyophilized and then dissolved in sterile water to a final concentration of 1 mM. The bioactivity of orniglycocin was tested for antibiotic potential against 10 strains, B. subtilis strain 168, B. cereus Z4222, L. lactis sp. cremoris NZ9000, Bacillus licheniformis NRRL NRS-1264, Staphylococcus aureus C5, Staphylococcus simulans 22, Staphylococcus carnosus TM300, Micrococcus luteus DSM 1790, and Staphylococcus epidermidis ATCC 12228. Overnight cultures were grown at 37 °C in LB media followed by dilution to OD600 = 1. Base plates containing LB agar (1.8% agar w/v) were prepared and then overlaid with LB soft agar (0.5% agar w/v) containing 50 µL of the diluted cultures and allowed to dry. Wells were created using a 1 mL sterile tip and 2 μL of the resuspended orniglycocin was spotted in the wells. The plates were then incubated overnight at 37 °C with subsequent imaging for zones of inhibition.
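To compare the two bioassays, it can help to express the loaded peptide as an absolute amount per well. The sketch below applies amount = concentration × volume to the values stated in the two bioactivity sections; the function name is an illustrative assumption.

```python
def amount_nmol(conc_um: float, volume_ul: float) -> float:
    """Amount of peptide in nmol from a concentration in uM and a volume in uL."""
    # uM * uL = pmol; divide by 1000 to obtain nmol.
    return conc_um * volume_ul / 1000.0


# Orniglycocin: 2 uL of a 1 mM (1000 uM) solution per well.
orniglycocin_nmol = amount_nmol(1000.0, 2.0)

# Thermoglycocin: 10 uL of a 10-100 uM solution per well.
thermoglycocin_range_nmol = (amount_nmol(10.0, 10.0), amount_nmol(100.0, 10.0))

print(orniglycocin_nmol, thermoglycocin_range_nmol)
```

On this basis each orniglycocin well received 2 nmol of peptide, whereas the thermoglycocin wells received 0.1 to 1 nmol.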

Results and Discussion

Identification of the Thg BGC

To identify potential new GlcNAc S-glycosyltransferases, we generated an SSN (Atkinson et al., 2009) from the output of a BLAST search with GccA as a query, using the EFI-EST version 2024_04/101 (Oberg et al., 2023). GccA is a glycosyltransferase encoded in the glycocin F gene cluster from Lactobacillus plantarum KW30 (Stepper et al., 2011; Venugopal et al., 2011). Since the leader peptides of all known glycocins are removed by a C39-like protease that is part of a bifunctional PCAT (Håvarstein et al., 1995), the Enzyme Function Initiative genome neighborhood tool (Oberg et al., 2023) was used to identify glycosyltransferases encoded near PCATs. Furthermore, we used comparisons to the sequences of the core peptides of known glycocin precursor peptides to identify systems that were predicted to make glycocins with different structures than previously reported family members. Based on these combined considerations and with the aim of characterizing a thermostable GlcNAc transferase, we chose a BGC from the thermophilic bacterium T. thermosaccharolyticum for further study and named it the thg BGC (for thermoglycocin as the final RiPP product after proteolytic processing).

Using the nomenclature of the first glycocin BGC to be characterized (Dorenbos et al., 2002; Paik et al., 1998) responsible for producing sublancin (Oman et al., 2011), the proteins encoded in the thg cluster were given the names ThgA (precursor peptide), ThgS (glycosyltransferase), and ThgT (bifunctional protease transporter) (Fig. 1a). A pair of thioredoxin-like thiol-disulfide isomerases was annotated ThgC and ThgD. The precursor peptide with seven Cys residues is similar to a homologous peptide previously identified bioinformatically (Norris & Patchett, 2016) in the genomes of both Bacillus lehensis G1 (now Shouchella lehensis G1) (Noor et al., 2014) and Alkalicoccobacillus plakortidis DSM 19153 (Wang et al., 2016b) (Fig. 1b and c). An orthologous BGC was also identified in the genome of O. bavariensis J43TS3 (org cluster, Fig. 1b), which encodes a precursor peptide containing eight Cys residues in the predicted core region. An identical precursor peptide was previously reported in cheese metagenomic data (Norris & Patchett, 2016). The thg gene cluster also appears in the compilation of putative Type I glycocin BGCs reported by Singh & Rao (2021).

Fig. 1.

(a) Gene architecture of the thg BGC and sequence of the precursor peptide. The end of the leader peptide (residue − 1; leader peptide shown in grey) is putative and based on data in this study. Cys residues are numbered above the sequence. (b) Gene architecture of the org biosynthetic cluster in the genome of O. bavariensis J43TS3. The end of the leader peptide (shown in grey) is putative and based on data in this study. Cys residues are numbered above the sequence. (c) Alignment of the core peptides of ThgA and OrgA with related precursor peptides identified using NCBI BLAST and known glycopeptides, including two (LehA and OrgA previously named LehAvar) that were identified bioinformatically in a previous study (Norris & Patchett, 2016). LehA is encoded in the genome of B. lehensis G1 (AIC94358.1) and in the genome of A. plakortidis (KQL58965.1). LehAvar was found in cheese metagenome (accession ERX644124) (Norris & Patchett, 2016); OrgA is identical to LehAvar. The amino acids that align with glycosylated Ser/Cys residues in glycocin F, thermoglycocin and orniglycocin are highlighted in light blue, as is the Cys that is glycosylated in sublancin.

The ThgA precursor sequence is unique compared to previously experimentally characterized glycocins because of the presence of seven Cys residues in the predicted core peptide (Fig. 1a), whereas all previously experimentally characterized glycocins have five Cys residues (Hata et al., 2010; Izquierdo et al., 2009; Kaunietis et al., 2019; Main et al., 2020; Maky et al., 2021; Maky et al., 2015; Nagar & Rao, 2017; Norris & Patchett, 2016; Oman et al., 2011; Ren et al., 2018). The precursor ThgA is similar to OrgA, which contains eight Cys residues (Fig. 1b and c). The org BGC also encodes a glycosyltransferase (OrgS), a thioredoxin-like protein (OrgC), and a PCAT (OrgT).

Characterization of the Glycocin Produced By the Thg BGC

Co-expression of N-terminally His6-tagged ThgA and ThgS in E. coli SHuffle T7 Express cells or E. coli BL21 (DE3) TUNER cells yielded two peptide products showing mass increases of 203.07 Da and 406.14 Da compared to the predicted mass of the unmodified precursor peptide (Fig. 2a) as determined by liquid chromatography-coupled electrospray ionization (ESI) high-resolution mass spectrometry (LC-ESI-HR-MS). These mass differences verified the addition of two N-acetylhexosamines in modified ThgA (mThgA) (Fig. 2a, for tandem-MS see below). Whereas co-expression in SHuffle cells produced mostly monoglycosylated products, expression in TUNER cells produced mostly bisglycosylated products. Acid-catalyzed hydrolysis was used to cleave the sugars from the peptide followed by their derivatization as described previously (Oman et al., 2011). Gas chromatography monitored by mass spectrometry (GC-MS) and comparison with authentic standards derivatized in the same manner identified the sugar as GlcNAc (Supplementary Fig. S3).

Fig. 2.

LC-ESI-HR-MS analysis of mThgA and mOrgA in the presence of TCEP. (a) Extracted ion chromatogram (EIC) of unmodified, monoGlcNAcylated, and bisGlcNAcylated ThgA after expression without (top) or with (bottom) ThgS (ThgA-S) and treatment with TCEP. Calculated m/z values are provided; see Table S1 for sequences. Deconvoluted [M + H]+ = 8507.6970 Da (unmodified), 8710.7764 Da (monoglycosylated) and 8913.8558 Da (bisglycosylated). (b) HR-MS spectra of ThgA (top) and mThgA (bottom) showing the observed masses and the mass shifts. (c) EIC of unmodified, monoGlcNAcylated, and bisGlcNAcylated full length OrgA after expression without (top) or with (bottom) OrgS (OrgA-S) and treatment with TCEP. Calculated m/z values are provided. Deconvoluted [M + H]+ = 8435.7297 Da (unmodified), 8638.8091 Da (monoglycosylated), and 8841.8885 Da (bisglycosylated). (d) HR-MS spectra of OrgA (top) and mOrgA (bottom) showing the observed masses and the mass shifts. Expressions were conducted in E. coli BL21 (DE3) TUNER cells. Both peptides showed near-complete conversion to bisglycosylated peptides.
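The stepwise mass increases in the deconvoluted values above can be checked against the monoisotopic mass of an N-acetylhexosamine residue (C8H13NO5, the sugar minus water). The sketch below is only a consistency check on the caption values; the atomic masses are standard values and the helper names are illustrative assumptions.

```python
# Monoisotopic atomic masses (Da); standard values, not taken from the paper.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# HexNAc residue added on glycosylation: C8H15NO6 minus H2O.
HEXNAC_RESIDUE = {"C": 8, "H": 13, "N": 1, "O": 5}


def mono_mass(formula: dict) -> float:
    """Monoisotopic mass of an elemental composition."""
    return sum(MONO[element] * count for element, count in formula.items())


def steps(masses: list) -> list:
    """Differences between consecutive masses, rounded to 4 decimals."""
    return [round(b - a, 4) for a, b in zip(masses, masses[1:])]


hexnac = mono_mass(HEXNAC_RESIDUE)  # about 203.0794 Da

# Deconvoluted [M + H]+ values from the Fig. 2 caption:
# unmodified, monoglycosylated, bisglycosylated.
thga = [8507.6970, 8710.7764, 8913.8558]
orga = [8435.7297, 8638.8091, 8841.8885]

print(round(hexnac, 4), steps(thga), steps(orga))
```

Each step in both series matches one HexNAc residue to within 0.001 Da.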

In Vitro Glycosylation By ThgS and Determination of Disulfide Pattern

We next turned to in vitro characterization of ThgS. His6-ThgA and His6-ThgS were expressed separately in E. coli SHuffle T7 Express cells and purified by nickel-affinity chromatography. In vitro reaction of His6-ThgA with His6-ThgS in the presence of UDP-GlcNAc as a sugar donor and TCEP to reduce any disulfides, which was previously shown to be required for glycosylation of sublancin (Oman et al., 2011), resulted in two glycosylations (Fig. 3a). This observation provided further support that fully glycosylated ThgA (mThgA) contains two GlcNAc modifications.

Fig. 3.

(a) MALDI-TOF mass spectrum of bisGlcNAcylated ThgA (observed mass: [M + H]+ = 8917.4, calculated average mass: [M + H]+ = 8919.4) after in vitro incubation with ThgS. Note that the three disulfides are reduced in the product (M+6H) because of the presence of TCEP in the enzymatic reaction, but some reoxidation prior to analysis may have occurred. (b) LC-MS/MS spectrum of residues 33–47 of bisGlcNAcylated ThgA, alkylated with NEM, and digested by chymotrypsin (observed mass: [M + 2H]2+ = 861.3147, calculated monoisotopic mass: [M + 2H]2+ = 861.3137, calculated molecular ion = 1720.6118). Fragmentation data are consistent with the + 203 modification being on Cys44. Cys41 and Cys46 were alkylated with NEM. The diagram shows the position of the proteolytic fragment (red) in thermoglycocin. (c) LC-MS/MS of ThgA residues 2–11 and 26–47 (observed mass: [M + 3H]3+ = 1159.4978, calculated monoisotopic mass: [M + 3H]3+ = 1159.4925, calculated molecular ion = 3475.4540) connected by a disulfide bond. The diagram shows the position of the proteolytic fragment (red) in thermoglycocin. (d) Multiple sequence alignment of ThgA with the leader peptide region of precursor peptides of related putative and known glycocins. The predicted cleavage sites are indicated in brown as well as the residues at positions −9 and −14 from the predicted cleavage site. The alternative double Gly motif in OrgA and the residues in positions −7 and −12 with respect to this alternative double Gly site are shown in light blue. Panels b and c were prepared using the Interactive Peptide Spectral Annotator (IPSA) web tool (Brademan et al., 2019). Panel d was prepared using the MPI Bioinformatics Toolkit (Gabler et al., 2020).
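Agreement between observed and calculated m/z values is commonly expressed as a mass error in parts per million. The sketch below applies that standard formula to the precursor values quoted for panels b and c; the function name is an illustrative assumption.

```python
def ppm_error(observed: float, calculated: float) -> float:
    """Mass error in parts per million: (observed - calculated) / calculated * 1e6."""
    return (observed - calculated) / calculated * 1e6


# Values quoted in the Fig. 3 caption.
panel_b = ppm_error(861.3147, 861.3137)    # [M + 2H]2+, residues 33-47
panel_c = ppm_error(1159.4978, 1159.4925)  # [M + 3H]3+, disulfide-linked fragment

print(round(panel_b, 2), round(panel_c, 2))
```

Both precursors fall within about 5 ppm of the calculated values (roughly 1.2 ppm and 4.6 ppm, respectively).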

To localize the position of the modified residues, the free Cys residues were alkylated with NEM. The derivatized peptide was digested with chymotrypsin, and the fragments were analyzed by LC-MS/MS. Fragmentation by CID showed GlcNAc modification at Cys44 (Fig. 3b) near the C-terminus with Cys41 and Cys46 alkylated by NEM.

The second glycosylation site was more difficult to determine as the second sugar dissociated in the tandem MS experiments, suggesting that the second GlcNAc was O-linked, which is a more labile linkage (Drummond et al., 2021; Kaunietis et al., 2019; Main et al., 2020; Stepper et al., 2011). The digestion with chymotrypsin allowed us to narrow down the second glycosylation to a peptide spanning Asp19 through Tyr25 (Supplementary Fig. S4). Because of the lability of the GlcNAc during tandem MS analysis, we were not able to conclusively determine whether the site of O-glycosylation was Ser20 or Thr22, but based on sequence homology with the glycocins ASM1 and glycocin F (Fig. 1c) (Main et al., 2020; Stepper et al., 2011), the O-glycosylation is highly likely to occur on Ser20.

Further support for this hypothesis comes from experiments with OrgA (Fig. 1b). This peptide has high homology with ThgA including the six Cys residues that form disulfides and the Cys that is S-glycosylated, but the equivalent position to Ser20 in ThgA is occupied by a Cys residue in OrgA (Fig. 1c). Overexpression of OrgA with its glycosyltransferase OrgS in E. coli BL21 (DE3) TUNER cells resulted in near-complete conversion to a product carrying two GlcNAc molecules (Fig. 2c and 2d). The position of the sugars was determined to be at Cys24 and Cys48 by MS-MS analysis after digestion with endoproteinase LysC and chymotrypsin (Figs S5 and S6). A similar analysis on the NEM-alkylated mOrgA further confirmed that indeed the two Cys that are not involved in disulfides are glycosylated (Fig. S7).

These data present strong but indirect support that ThgA is also bisglycosylated and that Ser20 carries one of the glycosylations. In turn, these findings suggest that ThgS is part of a growing class of glycosyltransferases able to catalyze both O- and S-linked glycosylations (Ahn et al., 2018; Main et al., 2020; Venugopal et al., 2011; Wang et al., 2014). All experimentally characterized glycocins have disulfide bonds that contribute to their remarkable stability and that play an important role in glycocin bioactivity (Bisset et al., 2018; Dorenbos et al., 2002). Reaction of mThgA isolated from E. coli with NEM in the presence of reductant resulted in addition of six NEM molecules (Supplementary Fig. S8) suggesting all six Cys residues that are not glycosylated are involved in disulfide bonding (see also discussion regarding Fig. 4 below that illustrates that mThgA contains three disulfides).
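The modification bookkeeping used in this section (GlcNAc addition, NEM alkylation, disulfide formation) amounts to adding fixed mass increments. The following is a minimal sketch under stated assumptions: the increments are standard monoisotopic values (HexNAc residue C8H13NO5, NEM adduct C6H7NO2, loss of two hydrogens per disulfide), the function names are hypothetical, and the example mass shift passed to `adduct_count` is illustrative rather than a value read from Supplementary Fig. S8.

```python
# Standard monoisotopic mass increments in Da.
GLCNAC = 203.0794   # HexNAc residue added per glycosylation
NEM = 125.0477      # N-ethylmaleimide added per free Cys thiol
H_ATOM = 1.007825   # each disulfide removes two hydrogen atoms


def mass_shift(n_glcnac=0, n_nem=0, n_disulfides=0):
    """Net mass change relative to the unmodified, fully reduced peptide."""
    return n_glcnac * GLCNAC + n_nem * NEM - n_disulfides * 2 * H_ATOM


def adduct_count(observed_shift, unit_mass):
    """Estimate how many adducts of mass `unit_mass` explain a mass shift."""
    return round(observed_shift / unit_mass)


# Six NEM groups on a reduced peptide (the outcome reported for mThgA).
six_nem = mass_shift(n_nem=6)

# Two GlcNAc groups plus three disulfides (the proposed mature core).
mature = mass_shift(n_glcnac=2, n_disulfides=3)

# Hypothetical observed shift of about 750.29 Da, interpreted as NEM adducts.
n_nem_estimated = adduct_count(750.29, NEM)

print(f"6 x NEM: +{six_nem:.4f} Da")
print(f"2 x GlcNAc, 3 disulfides: +{mature:.4f} Da")
print(f"Estimated NEM adducts: {n_nem_estimated}")
```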

Fig. 4.

LC-ESI-HR-MS analysis of OrgT156-digested mOrgA and mThgA. (a) Extracted ion chromatogram (EIC) of mOrgA no-enzyme control (top) and OrgT156-digested mOrgA leader and core peptide fragments (bottom). (b) HR-MS spectra of the leader peptide fragment (top); deconvoluted [M + H]+ = 3392.6917 Da (theoretical), 3392.7001 Da (observed) for leader peptide and the core peptide fragment (bottom) of OrgT156-digested mOrgA in the presence of 1 mM TCEP. Deconvoluted [M + H]+ = 5468.1853 Da (theoretical), 5468.1871 Da (observed) for core peptide with bisglycosylation. (c) The mOrgA core peptide fragment in the absence of TCEP showing a 6 Da lower mass demonstrating the presence of three disulfide bridges formed in the final product. Deconvoluted [M + H]+ = 5462.166 Da (theoretical), 5462.0695 Da (observed) for core peptide with bisglycosylation. (d) EIC of mThgA no-enzyme control (top) and OrgT156-digested mThgA leader and core peptide fragments (bottom). (e) HR-MS spectra of the leader peptide fragment (top); deconvoluted [M + H]+ = 3761.7361 Da (theoretical), 3761.7343 Da (observed) for leader peptide and the core peptide fragment (bottom) of OrgT156-digested mThgA in the presence of 1 mM TCEP. Deconvoluted [M + H]+ = 5171.1376 Da (theoretical), 5171.1421 Da (observed) for core peptide with bisglycosylation. (f) The core peptide fragment generated in the absence of TCEP showing a 6 Da lower mass demonstrating that three disulfide bridges had formed in the final product. Deconvoluted [M + H]+ = 5165.1076 Da (theoretical), 5164.9894 Da (observed) for core peptide with bisglycosylation. The sequence of the full-length substrates used for the assays is shown above panels a and d, highlighting the leader peptide in the first line and the core peptide in the second line.

In E. coli (Fig. 2a) and in vitro (Fig. 3a), a minor product is a monoglycosylated peptide suggesting that the glycosylation is ordered. To interrogate which position is not glycosylated in the monoglycosylated product, the peptide was cleaved with thermolysin in the absence of reductant. Fragments were analyzed using LC-MS/MS. A triply charged ion of 1159.4978 Da was observed that corresponds to the mass of residues 2 to 11 of ThgA connected via a disulfide bond between Cys7 and Cys30 to a GlcNAcylated peptide consisting of residues 26 to 47 (Fig. 3c). Clean fragmentation was observed, and the fragment ions could be traced back to both peptides that were linked by a disulfide. Curiously, the fragment ions as well as the molecular parent ion suggest that the disulfide between Cys41 and Cys46 was not formed in this thermolysin digest peptide. While we do not have a good explanation for why this disulfide was not present, the data clearly show a disulfide between Cys7 and Cys30 and glycosylation of Cys44. Moreover, the thermolysin digest also resulted in an additional observed singly charged ion of 1467.5767 Da corresponding to residues 12–25 with a disulfide bond between Cys14 and Cys23 (Supplementary Fig. S9). These data suggest that Cys44 is glycosylated first in ThgA, with Ser20 being glycosylated second. These findings are also consistent with previous studies of the O/S-glycosyltransferase ThuS (Fujinami et al., 2021) that showed a preference for glycosylating Cys over Ser residues (Wang et al., 2014). This interpretation also explains why only bisglycosylated product was observed for OrgA (Fig. 2c). The full length bisglycosylated mThgA peptide was also analyzed by LC-MS/MS, showing minimal fragmentation between Lys at the −8 position and Tyr32 and no fragmentation between Cys41 and Cys46 (Fig. S10). As few or no fragment ions are expected within a ring structure, these observations together with the disulfides that exist between Cys41 and Cys46 (Fig. 3b) and between Cys14 and Cys23 (Supplementary Fig. S9) strongly suggest that thermoglycocin has the same nested disulfide bonding pattern as sublancin and glycocin F with an additional disulfide at the C-terminus (Fig. 3c). The protease digest and fragmentation patterns of mOrgA further support this conclusion (Supplementary Fig. S5). This pattern of disulfides was previously predicted for some of the peptides in Fig. 1c (Norris & Patchett, 2016).

Prediction of the Leader Peptide Cleavage Site

Almost all RiPPs are made from a precursor peptide that contains an N-terminal leader peptide that is removed in a late biosynthetic step by a protease (Eslami & van der Donk, 2023; Montalbán-López et al., 2021). The responsible protease is often encoded within the BGC, and indeed the thg and org BGCs encode an ATP-dependent transporter with a C39-type N-terminal protease (ThgT and OrgT, Fig. 1). These bifunctional PCATs are frequently found in BGCs of a wide variety of RiPPs (Eslami & van der Donk, 2023; Håvarstein et al., 1995; Padhi et al., 2024) and their leader peptides are the most common type of leader peptide in RiPP biosynthesis (Montalbán-López et al., 2021). All currently characterized PCATs have been shown to cleave after GG, GA, or GS motifs, typically called the double glycine motif (Bobeica et al., 2019; Dirix et al., 2004a; Dirix et al., 2004b; Ishii et al., 2010). Given the lack of such a motif in ThgA in the predicted leader peptide (Fig. 1a), it was not clear where the leader peptide removal site is in ThgA. We made a multiple sequence alignment of the N-terminal part of ThgA with related peptides retrieved from the NCBI database using BLAST (Boratyn et al., 2013) (Fig. 3d). The structure of glycocin F (Stepper et al., 2011; Venugopal et al., 2011) suggests that its precursor peptide GccF is cleaved at a double glycine motif by the protease/transporter GccB. The double Gly motif of GccF (GG-K) corresponds to the sequence GK-G in ThgA, which was also hypothesized to be the leader peptide removal site in the glycocin Hyp1 (Fig. 3d) (Kaunietis et al., 2019). However, this predicted proteolytic site is unusual for a number of reasons. First, previously studied C39 peptidases were shown to be unable to accept charged residues in the −1 position (Furgerson Ihnken et al., 2008). Second, most PCAT substrates have hydrophobic residues (Leu, Val, Ile, Met) in positions −7 and −12 from the cleavage site, which occupy hydrophobic pockets in the PCAT as shown by X-ray crystallography (Bobeica et al., 2019). An example is the precursor peptide sequence to the glycocin sublancin (Fig. 3d). But for all the precursor peptides other than sublancin shown in Fig. 3d, the residues at position −7 from the putative G(−2)K(−1)-G(+1) cleavage site are hydrophilic (Asp/Glu/Ser/Thr/Asn/Lys) and at position −12 mostly negatively charged (Glu/Asp). We noted that the residues at positions −9 and −14 in these peptides are invariably hydrophobic (Leu, Ile, Val, Fig. 3d), possibly suggesting that the register has shifted in this set of peptides.

To investigate this hypothesis, we used AlphaFold3 (Abramson et al., 2024) to model the ThgA leader peptide together with the protease domain of ThgT encompassing the N-terminal 157 residues (ThgT1-157, Supplementary Fig. S11). The model showed that residues Ile −9 and Leu −14 of ThgA fit into a hydrophobic groove on ThgT1-157 and that Lys −1 aligns with the catalytic Cys in the active site of the peptidase (Bobeica et al., 2019; Chen et al., 2001; Ishii et al., 2010). Thus, it seems that a group of PCATs that putatively cleave substrates after a GK sequence are able to do so by setting a different register, in which hydrophobic residues at positions −9 and −14 now occupy the same pockets that normally are occupied by residues at positions −7 and −12. The known cleavage site to release glycocin F provides indirect support for such a change in register. The precursor peptide for this glycocin has the canonical double Gly motif, but it also has the hydrophobic residues in positions −9 and −14 and not −7 and −12 (Fig. 3d).
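The register argument can be expressed as a simple sequence scan: for each candidate cleavage motif, look up the residues at positions −7/−12 (canonical register) and −9/−14 (shifted register) and ask which pair is hydrophobic. The sketch below is an illustration of that logic only. The function names are hypothetical, and `TOY_PRECURSOR` is an invented sequence built to mimic the pattern described in the text; it is not the ThgA or OrgA sequence.

```python
HYDROPHOBIC = set("LIVM")  # Leu, Ile, Val, Met, as listed in the text


def residue_at(seq, cut, pos):
    """Residue at leader position `pos` (negative integer).

    `cut` is the index of the first core residue (+1), so position -1
    is seq[cut - 1] and position -n is seq[cut - n].
    """
    idx = cut + pos
    return seq[idx] if 0 <= idx < cut else None


def score_sites(seq, motifs=("GK", "GG", "GA", "GS")):
    """List candidate cleavage sites and the residues in both registers."""
    results = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            cut = start + len(motif)  # cleavage occurs after the motif
            canonical = [residue_at(seq, cut, p) for p in (-7, -12)]
            shifted = [residue_at(seq, cut, p) for p in (-9, -14)]
            results.append({
                "motif": motif,
                "cut": cut,
                "canonical": canonical,
                "shifted": shifted,
                "canonical_ok": all(r in HYDROPHOBIC for r in canonical),
                "shifted_ok": all(r in HYDROPHOBIC for r in shifted),
            })
            start = seq.find(motif, start + 1)
    return results


# Invented example: 16-residue leader ending in GK, followed by a short core.
TOY_PRECURSOR = "MELEENDIEDTNSQGK" + "GCYTWDAC"

sites = score_sites(TOY_PRECURSOR)
for site in sites:
    print(site)
```

For the invented sequence, the only candidate site is the GK motif, and only the shifted register places hydrophobic residues in both pockets. A scan of this kind cannot replace structural modeling or biochemical validation, but it may help flag precursors whose register deserves a closer look.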

In an attempt to experimentally verify these predictions, the N-terminal C39 protease domain of ThgT was expressed in E. coli as has been done previously for the orthologous enzymes CvaB (Wu & Tai, 2004), ComA (Ishii et al., 2006), LahT (Bobeica et al., 2019), LctT (Furgerson Ihnken et al., 2008), BovT (Wang et al., 2016a), ColT (Wang et al., 2025), McyT (Wang et al., 2023a), and MfuT (Wang et al., 2023b). Unfortunately, a His6-tagged version was obtained in insoluble form under a variety of expression conditions. A maltose binding protein (MBP) fusion was successfully expressed and purified but did not show any proteolytic activity.

Reconstitution of the C39 Protease Domain of OrgT

Unable to experimentally confirm the predictions regarding the leader peptide cleavage site for thermoglycocin, we turned to the org BGC. OrgA contains both a GG and a GK motif in its putative leader peptide region (Fig. 3d). Our prediction was that OrgT would cleave at the latter since it would position two hydrophobic residues at positions −9 (Ile) and −14 (Leu). Conversely, if cleavage were to take place at the GG sequence, the residues at positions −9 and −14 would be Gly and Thr, respectively, or if the recognition follows the canonical C39 cleavage pattern the residues at positions −7 and −12 from the GG sequence would (coincidentally) again be Gly and Thr, respectively (Fig. 3d). Neither conforms with current knowledge about these enzymes. Based on a predicted structure by AlphaFold3 (Supplementary Fig. S12), the N-terminal 156 residues of OrgT were expressed in E. coli TUNER cells with an N-terminal His6-tag. OrgT156 was purified and reacted with mOrgA that had been modified with two GlcNAc groups by OrgS. Two products were observed by ESI-MS that unambiguously confirm that cleavage takes place after the GK motif (Fig. 4a and b) and that the product obtained in E. coli contains three disulfides (Fig. 4c). OrgT156 was also reacted with mThgA, which also resulted in cleavage at the GK site, albeit with lower efficiency (Fig. 4d–f). These data thus identify the first group of PCATs that do not cleave at a canonical double Gly site and for which the recognition relies on hydrophobic residues at positions −9 and −14. These data also show that the predicted glycocin from the org BGC has a much longer N-terminus than what would have been predicted based on cleavage at a double Gly site. The discovery of different cleavage sites for PCATs will be important for predicting the junction between leader and core peptides in genome mining studies.
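The disulfide count inferred from Fig. 4 follows from the mass difference between the core peptide measured with and without TCEP, since each disulfide removes two hydrogen atoms (about 2.016 Da). A minimal sketch of that arithmetic, using the deconvoluted [M + H]+ values quoted in the Fig. 4 caption, is given below; the function name is hypothetical and rounding to the nearest integer is a simplifying assumption.

```python
H_ATOM = 1.007825  # monoisotopic mass of a hydrogen atom in Da


def disulfide_count(reduced_mass, oxidized_mass):
    """Estimate the number of disulfides from the reduced/oxidized mass gap."""
    return round((reduced_mass - oxidized_mass) / (2 * H_ATOM))


# (reduced with TCEP, oxidized without TCEP), values from the Fig. 4 caption.
core_masses = {
    "mOrgA theoretical": (5468.1853, 5462.166),
    "mOrgA observed": (5468.1871, 5462.0695),
    "mThgA theoretical": (5171.1376, 5165.1076),
    "mThgA observed": (5171.1421, 5164.9894),
}

counts = {name: disulfide_count(*pair) for name, pair in core_masses.items()}
for name, n in counts.items():
    print(f"{name}: {n} disulfides")
```

All four mass pairs round to three disulfides, in line with the 6 Da difference noted in the caption.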

Bioactivity Testing of Thermoglycocin and Orniglycocin

The data obtained in this study suggest a similar topology for the glycocins from T. thermosaccharolyticum and O. bavariensis (Fig. 5a) as the previously characterized compounds sublancin and glycocin F. Orniglycocin was tested for antimicrobial activity using agar diffusion assays against B. subtilis 168, B. cereus Z4222, L. lactis sp. cremoris NZ9000, B. licheniformis NRRL NRS-1264, S. aureus C5, S. simulans 22, S. carnosus TM300, M. luteus DSM 1790, and S. epidermidis ATCC 12228. Orniglycocin showed antibiotic activity against B. cereus Z4222 and B. subtilis 168 (Fig. 5b). Thermoglycocin was also tested but no antimicrobial activity was observed, possibly because the compound selectively targets organisms that cohabit the environment of T. thermosaccharolyticum, an anaerobic thermophile, just as the glycocin pallidocin from the thermophile Aeribacillus pallidus targets other thermophilic bacteria (Kaunietis et al., 2019). An alternative explanation is that the bisGlcNAcylated compound produced in E. coli and cleaved by OrgT156 differs in structure from the native compound. We consider this explanation less likely given the similarity to orniglycocin as well as other glycocins.

Fig. 5.

Structures of (a) thermoglycocin and (b) orniglycocin. (c) Bioactivity assay of orniglycocin. Zones of growth inhibition by orniglycocin that was spotted on an overlay of B. cereus Z4222 and B. subtilis strain 168. Sterile water was used as solvent for orniglycocin as well as for the control spot.

Suggested Nomenclature for Glycocin Biosynthetic Enzymes

Rapid growth in the field of glycocins suggests that a common naming convention for glycocin biosynthetic machinery may be desirable. A standardized nomenclature using the prefix Lan was proposed more than 30 years ago for the largest class of RiPPs, the lanthipeptides (de Vos et al., 1991), which has proven to be very useful in discussing lanthipeptide families, genes, and enzymes (Repka et al., 2017). For glycocins, because some BGCs were annotated before the protein function was known and for other BGCs genes were annotated in alphabetical order of their appearance in the BGC, the function of a certain protein cannot be immediately gleaned from its name. For instance, for three representative glycocins, sublancin, glycocin F, and enterocin F4-9, the protein naming is diverse: substrates are SunA, GccF, and Enf49A, glycosyltransferases are SunS, GccA, and EnfC, thiol-disulfide oxidoreductases are BdbAB, GccCD, and EnfB, and transporters/proteases are SunT, GccB, and EnfT. We propose a uniform nomenclature for the biosynthetic enzymes of glycocins identified in future studies that will immediately associate function with the name. The proposed nomenclature is a hybrid of the original naming schemes for sublancin (Oman et al., 2011; Paik et al., 1998) and glycocin F (Ahn et al., 2018; Stepper et al., 2011), the first two characterized glycocins. In the proposed nomenclature, the biosynthetic genes are designated with a generic locus symbol gyc, with a more specific genotypic designation for each glycocin member (e.g. sun for sublancin and gcc for glycocin F). The glycocin precursor peptides would be referred to using the generic term GycA, glycosyltransferases as GycS, peptidases as GycT, and disulfide isomerases as GycC and GycD. We used this scheme here for the naming of the thg and org BGCs.
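For annotation scripts, the proposed scheme reduces to a lookup from protein role to suffix letter, combined with the locus symbol of the individual glycocin. The sketch below illustrates this under stated assumptions: the role labels, dictionary names, and the `protein_names` helper are hypothetical conveniences and are not part of the proposal itself, while the suffix letters and legacy names are those given in the text.

```python
# Suffix letters of the proposed Gyc nomenclature, keyed by protein role.
GYC_SUFFIXES = {
    "precursor peptide": ("A",),
    "glycosyltransferase": ("S",),
    "peptidase": ("T",),
    "disulfide isomerase": ("C", "D"),
}

# Legacy names mentioned in the text, shown for comparison.
LEGACY_NAMES = {
    "sublancin": {"precursor peptide": "SunA", "glycosyltransferase": "SunS",
                  "peptidase": "SunT", "disulfide isomerase": "BdbAB"},
    "glycocin F": {"precursor peptide": "GccF", "glycosyltransferase": "GccA",
                   "peptidase": "GccB", "disulfide isomerase": "GccCD"},
    "enterocin F4-9": {"precursor peptide": "Enf49A", "glycosyltransferase": "EnfC",
                       "peptidase": "EnfT", "disulfide isomerase": "EnfB"},
}


def protein_names(locus, role):
    """Return the protein name(s) for a locus symbol (e.g. 'thg') and a role."""
    prefix = locus.capitalize()
    return [prefix + suffix for suffix in GYC_SUFFIXES[role]]


print(protein_names("gyc", "precursor peptide"))   # generic term
print(protein_names("thg", "glycosyltransferase"))
print(protein_names("org", "peptidase"))
print(protein_names("gyc", "disulfide isomerase"))
```

Comparing the output with `LEGACY_NAMES` shows why the legacy names are hard to interpret: only for sublancin do the suffixes of the precursor, glycosyltransferase, and peptidase already match the proposed letters.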

Conclusions

Thermoglycocin is the first characterized glycocin derived from an anaerobic thermophile. Our genome mining studies also uncovered orniglycocin as another example of an S-linked glycopeptide with antibacterial activity. Both orniglycocin and thermoglycocin contain three disulfide linkages and two GlcNAcylations. They differ in the glycosylation of a loop between two α-helices. The sequence of this eight-amino-acid loop is nearly identical, differing only in having a Ser in thermoglycocin and a Cys in orniglycocin. Interestingly, these two differing residues are GlcNAcylated during their biosynthesis. In the final maturation step, novel C39-peptidases process both modified OrgA and ThgA to furnish orniglycocin and thermoglycocin by removing a leader peptide at a non-canonical Gly-Lys motif. As others have predicted based on genome sequences (Palaniappan et al., 2020; Singh & Rao, 2021), this study shows that the diversity of glycocin structures will continue to grow as more are characterized.

Supplementary Material

kuaf028_Supplemental_File

Acknowledgments

The authors thank the metabolomics facility of the Roy J. Carver Biotechnology Center (CBC) at the University of Illinois at Urbana-Champaign for GC-MS and high resolution mass spectrometry services, and Enleyona Weir (UIUC) for help with the antimicrobial activity assays. HHMI lab heads have previously granted a nonexclusive CC BY 4.0 license to the public and a sublicensable license to HHMI in their research articles. Pursuant to those licenses, the author-accepted manuscript of this article can be made freely available under a CC BY 4.0 license immediately upon publication.

Contributor Information

Rachel M Martini, Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.

Chandrashekhar Padhi, Department of Chemistry and Howard Hughes Medical Institute, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.

Wilfred A van der Donk, Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Department of Chemistry and Howard Hughes Medical Institute, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.

Author Contributions

R.M.M., C.P. and W.A.v.d.D. designed the study. R.M.M. and C.P. performed all experiments. R.M.M., C.P., and W.A.v.d.D. analyzed the data and wrote the manuscript.

Funding

This work was supported by the National Institutes of Health (Grant R01 AI144967 to W.A.v.d.D.). R.M.M. is a recipient of a Chemistry-Biology Interface Training Grant (5T32-GM070421) from the National Institute of General Medical Sciences. W.A.v.d.D. is an Investigator of the Howard Hughes Medical Institute. A Bruker UltrafleXtreme mass spectrometer used was purchased with support from the National Institutes of Health (S10 RR027109).

Conflicts of interest

The authors declare no conflicts of interest.

Data Availability

All data are incorporated into the article and its online Supplementary Material. Primary data are deposited at: Martini, Rachel; Padhi, Chandrashekhar; van der Donk, Wilfred (2025), ‘Characterization of antimicrobial S-glycosylated glycocins containing three disulfides’, Mendeley Data, V1, doi: 10.17632/m7zm6kk295.1

References

  1. Abramson  J., Adler  J., Dunger  J., Evans  R., Green  T., Pritzel  A., Ronneberger  O., Willmore  L., Ballard  A. J., Bambrick  J., Bodenstein  S. W., Evans  D. A., Hung  C.-C., O'Neill  M., Reiman  D., Tunyasuvunakool  K., Wu  Z., Žemgulytė  A., Arvaniti  E., Jumper  J. M. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630(8016), 493. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Ahn  S., Stepper  J., Loo  T. S., Bisset  S. W., Patchett  M. L., Norris  G. E. (2018). Expression of Lactobacillus plantarum KW30 gcc genes correlates with the production of glycocin F in late log phase. FEMS Microbiology Letters, 365(23), fny261. [DOI] [PubMed] [Google Scholar]
  3. Arnison  P. G., Bibb  M. J., Bierbaum  G., Bowers  A. A., Bugni  T. S., Bulaj  G., Camarero  J. A., Campopiano  D. J., Challis  G. L., Clardy  J., Cotter  P. D., Craik  D. J., Dawson  M., Dittmann  E., Donadio  S., Dorrestein  P. C., Entian  K. D., Fischbach  M. A., Garavelli  J. S., van der Donk  W. A. (2013). Ribosomally synthesized and post-translationally modified peptide natural products: Overview and recommendations for a universal nomenclature. Natural Product Reports, 30(1), 108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Atkinson  H. J., Morris  J. H., Ferrin  T. E., Babbitt  P. C. (2009). Using sequence similarity networks for visualization of relationships across diverse protein superfamilies. PLoS ONE, 4(2), e4345. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Bisset  S. W., Yang  S. H., Amso  Z., Harris  P. W. R., Patchett  M. L., Brimble  M. A., Norris  G. E. (2018). Using chemical synthesis to probe structure-activity relationships of the glycoactive bacteriocin glycocin F. ACS Chemical Biology, 13(5), 1270. [DOI] [PubMed] [Google Scholar]
  6. Bobeica  S. C., Dong  S. H., Huo  L., Mazo  N., McLaughlin  M. I., Jimenéz-Osés  G., Nair  S. K., van der Donk  W. A. (2019). Insights into AMS/PCAT transporters from biochemical and structural characterization of a double glycine motif protease. Elife, 8, e42305. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Boratyn  G. M., Camacho  C., Cooper  P. S., Coulouris  G., Fong  A., Ma  N., Madden  T. L., Matten  W. T., McGinnis  S. D., Merezhuk  Y., Raytselis  Y., Sayers  E. W., Tao  T., Ye  J., Zaretskaya  I. (2013). BLAST: A more efficient report with usability improvements. Nucleic Acids Research, 41(W1), W29. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Brademan  D. R., Riley  N. M., Kwiecien  N. W., Coon  J. J. (2019). Interactive peptide spectral annotator: A versatile web-based tool for proteomic applications. Molecular & Cellular Proteomics, 18(8), S193. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Chatterjee  A., Puri  S., Sharma  P. K., Deepa  P. R., Chowdhury  S. (2023). Nature-inspired enzyme engineering and sustainable catalysis: Biochemical clues from the world of plants and extremophiles. Frontiers in Bioengineering and Biotechnology, 11, 1229300. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Chen  P., Qi  F. X., Novak  J., Krull  R. E., Caufield  P. W. (2001). Effect of amino acid substitutions in conserved residues in the leader peptide on biosynthesis of the lantibiotic mutacin II. FEMS Microbiology Letters, 195(2), 139. [DOI] [PubMed] [Google Scholar]
  11. Chettri  D., Verma  A. K., Sarkar  L., Verma  A. K. (2021). Role of extremophiles and their extremozymes in biorefinery process of lignocellulose degradation. Extremophiles, 25(3), 203. [DOI] [PubMed] [Google Scholar]
  12. Consortium, T. U. (2024). UniProt: The universal protein knowledgebase in 2025. Nucleic Acids Res., 53, D609. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. De Leon  C. A., Levine  P. M., Craven  T. W., Pratt  M. R. (2017). The sulfur-linked analogue of O-GlcNAc (S-GlcNAc) is an enzymatically stable and reasonable structural surrogate for O-GlcNAc at the peptide and protein levels. Biochemistry, 56(27), 3507. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. de Vos  W. M., Jung  G., Sahl  H.-G. (1991). Appendix: Definitions and nomenclature of lantibiotics. In: Jung  G., Sahl  H.-G. (Eds.), Nisin and novel lantibiotics. ESCOM, Leiden, (p. 457) [Google Scholar]
  15. Dirix  G., Monsieurs  P., Dombrecht  B., Daniels  R., Marchal  K., Vanderleyden  J., Michiels  J. (2004a). Peptide signal molecules and bacteriocins in gram-negative bacteria: A genome-wide in silico screening for peptides containing a double-glycine leader sequence and their cognate transporters. Peptides, 25(9), 1425. [DOI] [PubMed] [Google Scholar]
  16. Dirix  G., Monsieurs  P., Marchal  K., Vanderleyden  J., Michiels  J. (2004b). Screening genomes of gram-positive bacteria for double-glycine-motif-containing peptides. Microbiology (Reading, England), 150(5), 1121. [DOI] [PubMed] [Google Scholar]
  17. Dorenbos  R., Stein  T., Kabel  J., Bruand  C., Bolhuis  A., Bron  S., Quax  W. J., van Dijl  J. M. (2002). Thiol-disulfide oxidoreductases are essential for the production of the lantibiotic sublancin 168. Journal of Biological Chemistry, 277(19), 16682. [DOI] [PubMed] [Google Scholar]
  18. Drummond  B. J., Loo  T. S., Patchett  M. L., Norris  G. E. (2021). Optimised genetic tools allow the biosynthesis of glycocin F and analogues designed to test the roles of gcc cluster genes in bacteriocin production. Journal of Bacteriology, 203(7), e00529. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Eslami  S. M., van der Donk  W. A. (2023). Proteases involved in leader peptide removal during RiPP biosynthesis. ACS Bio & Med Chem Au, 4(1), 20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Fujinami  D., Garcia de Gonzalo  C. V., Biswas  S., Hao  Y., Wang  H., Garg  N., Lukk  T., Nair  S. K., van der Donk  W. A. (2021). Structural and mechanistic investigations of protein S-glycosyltransferases. Cell Chemical Biology, 28(12), 1740. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Furgerson Ihnken  L. A., Chatterjee  C., van der Donk  W. A. (2008). In vitro reconstitution and substrate specificity of a lantibiotic protease. Biochemistry, 47(28), 7352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Gabler  F., Nam  S. Z., Till  S., Mirdita  M., Steinegger  M., Soding  J., Lupas  A. N., Alva  V. (2020). Protein sequence analysis using the MPI bioinformatics toolkit. Current Protocols in Bioinformatics, 72(1), e108. [DOI] [PubMed] [Google Scholar]
  23. Gerlt  J. A., Bouvier  J. T., Davidson  D. B., Imker  H. J., Sadkhin  B., Slater  D. R., Whalen  K. L. (2015). Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST): A web tool for generating protein sequence similarity networks. Biochimica et Biophysica Acta (BBA)—Proteins and Proteomics, 1854(8), 1019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Gomes  E., de Souza  A. R., Orjuela  G. L., Da Silva  R., de Oliveira  T. B., Rodrigues  A. (2016). Applications and benefits of thermophilic microorganisms and their enzymes for industrial biotechnology. In: Schmoll  M., Dattenböck  C. (eds) Gene Expression Systems in Fungi: Advancements and Applications. Springer International Publishing, Cham, pp 459. [Google Scholar]
  25. Hart  G. W., Slawson  C., Ramirez-Correa  G., Lagerlof  O. (2011). Cross talk between O-GlcNAcylation and phosphorylation: Roles in signaling, transcription, and chronic disease. Annual Review of Biochemistry, 80(1), 825. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Hata  T., Tanaka  R., Ohmomo  S. (2010). Isolation and characterization of plantaricin ASM1: A new bacteriocin produced by Lactobacillus plantarum A-1. International Journal of Food Microbiology, 137(1), 94. [DOI] [PubMed] [Google Scholar]
  27. Håvarstein  L. S., Diep  D. B., Nes  I. F. (1995). A family of bacteriocin ABC transporters carry out proteolytic processing of their substrates concomitant with export. Molecular Microbiology, 16(2), 229. [DOI] [PubMed] [Google Scholar]
  28. Ishii  S., Yano  T., Ebihara  A., Okamoto  A., Manzoku  M., Hayashi  H. (2010). Crystal structure of the peptidase domain of Streptococcus ComA, a bifunctional ATP-binding cassette transporter involved in the quorum-sensing pathway. Journal of Biological Chemistry, 285(14), 10777. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Ishii  S., Yano  T., Hayashi  H. (2006). Expression and characterization of the peptidase domain of Streptococcus pneumoniae ComA, a bifunctional ATP-binding cassette transporter involved in quorum sensing pathway. Journal of Biological Chemistry, 281(8), 4726. [DOI] [PubMed] [Google Scholar]
  30. Izquierdo  E., Wagner  C., Marchioni  E., Aoude-Werner  D., Ennahar  S. (2009). Enterocin 96, a novel class II bacteriocin produced by Enterococcus faecalis WHE 96, isolated from Munster cheese. Applied and Environmental Microbiology, 75(13), 4273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Kaunietis  A., Buivydas  A., Citavicius  D. J., Kuipers  O. P. (2019). Heterologous biosynthesis and characterization of a glycocin from a thermophilic bacterium. Nature Communications, 10(1), 1115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Main  P., Hata  T., Loo  T. S., Man  P., Novak  P., Havlicek  V., Norris  G. E., Patchett  M. L. (2020). Bacteriocin ASM1 is an O/S-diglycosylated, plasmid-encoded homologue of glycocin F. FEBS Letters, 594(7), 1196. [DOI] [PubMed] [Google Scholar]
  33. Mairinger  T., Weiner  M., Hann  S., Troyer  C. (2020). Selective and accurate quantification of N-acetylglucosamine in biotechnological cell samples via GC–MS/MS and GC–TOFMS. Analytical Chemistry, 92(7), 4875. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Maky  M. A., Ishibashi  N., Nakayama  J., Zendo  T. (2021). Characterization of the biosynthetic gene cluster of enterocin F4-9, a glycosylated bacteriocin. Microorganisms, 9(11), 2276. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Maky  M. A., Ishibashi  N., Zendo  T., Perez  R. H., Doud  J. R., Karmi  M., Sonomoto  K. (2015). Enterocin F4-9, a novel O-linked glycosylated bacteriocin. Applied and Environmental Microbiology, 81(14), 4819. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Maynard  J. C., Burlingame  A. L., Medzihradszky  K. F. (2016). Cysteine S-linked N-acetylglucosamine (S-GlcNAcylation), A new post-translational modification in mammals. Molecular & Cellular Proteomics, 15(11), 3405. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Montalbán-López  M., Scott  T. A., Ramesh  S., Rahman  I. R., van Heel  A. J., Viel  J. H., Bandarian  V., Dittmann  E., Genilloud  O., Goto  Y., Grande Burgos  M. J., Hill  C., Kim  S., Koehnke  J., Latham  J. A., Link  A. J., Martínez  B., Nair  S. K., Nicolet  Y., van der Donk  W. A. (2021). New developments in RiPP discovery, enzymology and engineering. Natural Product Reports, 38(1), 130. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Nagar R., Rao A. (2017). An iterative glycosyltransferase EntS catalyzes transfer and extension of O- and S-linked monosaccharide in enterocin 96. Glycobiology, 27(8), 766.
  39. Nguyen D. T., Mitchell D. A., van der Donk W. A. (2024). Genome mining for new enzyme chemistry. ACS Catalysis, 14(7), 4536.
  40. Noor Y. M., Samsulrizal N. H., Jema'on N. A., Low K. O., Ramli A. N., Alias N. I., Damis S. I., Fuzi S. F., Isa M. N., Murad A. M., Raih M. F., Bakar F. D., Najimudin N., Mahadi N. M., Illias R. M. (2014). A comparative genomic analysis of the alkalitolerant soil bacterium Bacillus lehensis G1. Gene, 545(2), 253.
  41. Norris G. E., Patchett M. L. (2016). The glycocins: In a class of their own. Current Opinion in Structural Biology, 40, 112.
  42. Oberg N., Zallot R., Gerlt J. A. (2023). EFI-EST, EFI-GNT, and EFI-CGFP: Enzyme function initiative (EFI) web resource for genomic enzymology tools. Journal of Molecular Biology, 435(14), 168018.
  43. Oman T. J., Boettcher J. M., Wang H., Okalibe X. N., van der Donk W. A. (2011). Sublancin is not a lantibiotic but an S-linked glycopeptide. Nature Chemical Biology, 7(2), 78.
  44. Oman T. J., van der Donk W. A. (2010). Follow the leader: The use of leader peptides to guide natural product biosynthesis. Nature Chemical Biology, 6(1), 9.
  45. Padhi C., Field C. M., Forneris C. C., Olszewski D., Fraley A. E., Sandu I., Scott T. A., Farnung J., Ruscheweyh H.-J., Narayan Panda A., Oxenius A., Greber U. F., Bode J. W., Sunagawa S., Raina V., Suar M., Piel J. (2024). Metagenomic study of lake microbial mats reveals protease-inhibiting antiviral peptides from a core microbiome member. Proceedings of the National Academy of Sciences, 121(49), e2409026121.
  46. Paik S. H., Chakicherla A., Hansen J. N. (1998). Identification and characterization of the structural and transporter genes for, and the chemical and biological properties of, sublancin 168, a novel lantibiotic produced by Bacillus subtilis 168. Journal of Biological Chemistry, 273(36), 23134.
  47. Palaniappan K., Chen I. A., Chu K., Ratner A., Seshadri R., Kyrpides N. C., Ivanova N. N., Mouncey N. J. (2020). IMG-ABC v.5.0: An update to the IMG/Atlas of Biosynthetic Gene Clusters Knowledgebase. Nucleic Acids Research, 48, D422.
  48. Pei J., Pang Q., Zhao L., Fan S., Shi H. (2012). Thermoanaerobacterium thermosaccharolyticum β-glucosidase: A glucose-tolerant enzyme with high specific activity for cellobiose. Biotechnology for Biofuels, 5(1), 31.
  49. Ren H., Biswas S., Ho S., van der Donk W. A., Zhao H. (2018). Rapid discovery of glycocins through pathway refactoring in Escherichia coli. ACS Chemical Biology, 13(10), 2966.
  50. Repka L. M., Chekan J. R., Nair S. K., van der Donk W. A. (2017). Mechanistic understanding of lanthipeptide biosynthetic enzymes. Chemical Reviews, 117(8), 5457.
  51. Shannon P., Markiel A., Ozier O., Baliga N. S., Wang J. T., Ramage D., Amin N., Schwikowski B., Ideker T. (2003). Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Research, 13(11), 2498.
  52. Sharma Y., Ahlawat S., Rao A. (2021). Biochemical characterization of an inverting S/O-HexNAc-transferase and evidence of S-linked glycosylation in actinobacteria. Glycobiology, 32(2), 148.
  53. Singh V., Rao A. (2021). Distribution and diversity of glycocin biosynthesis gene clusters beyond firmicutes. Glycobiology, 31(2), 89.
  54. Stepper J., Shastri S., Loo T. S., Preston J. C., Novak P., Man P., Moore C. H., Havlicek V., Patchett M. L., Norris G. E. (2011). Cysteine S-glycosylation, a new post-translational modification found in glycopeptide bacteriocins. FEBS Letters, 585(4), 645.
  55. Venugopal H., Edwards P. J., Schwalbe M., Claridge J. K., Libich D. S., Stepper J., Loo T., Patchett M. L., Norris G. E., Pascal S. M. (2011). Structural, dynamic, and chemical characterization of a novel S-glycosylated bacteriocin. Biochemistry, 50(14), 2748.
  56. Wang H., Han Y., Wang X., Jia Y., Zhang Y., Müller R., Huo L. (2023a). Genome mining of myxopeptins reveals a class of lanthipeptide-derived linear dehydroamino acid-containing peptides from Myxococcus sp. MCy9171. ACS Chemical Biology, 18(10), 2163.
  57. Wang H., Oman T. J., Zhang R., Garcia De Gonzalo C. V., Zhang Q., van der Donk W. A. (2014). The glycosyltransferase involved in thurandacin biosynthesis catalyzes both O- and S-glycosylation. Journal of the American Chemical Society, 136(1), 84.
  58. Wang H., van der Donk W. A. (2011). Substrate selectivity of the sublancin S-glycosyltransferase. Journal of the American Chemical Society, 133(41), 16394.
  59. Wang H., Zhao X., Li D., Meng L., Liu S., Zhang Y., Huo L. (2025). Marine metagenome mining reveals lanthipeptides colwesin A–C, exhibiting novel ring topology and anti-inflammatory activity. ACS Synthetic Biology, 14(4), 1014–1020.
  60. Wang J., Ge X., Zhang L., Teng K., Zhong J. (2016a). One-pot synthesis of class II lanthipeptide bovicin HJ50 via an engineered lanthipeptide synthetase. Scientific Reports, 6(1), 38630.
  61. Wang J. P., Liu B., Liu G. H., Ge C. B., Xiao R. F., Zheng X. F., Shi H. (2016b). Draft genome sequence of Bacillus plakortidis P203T (DSM 19153), an alkali- and salt-tolerant marine bacterium. Genome Announcements, 4, e01690–15.
  62. Wang X., Chen X., Wang Z. J., Zhuang M., Zhong L., Fu C., Garcia R., Müller R., Zhang Y., Yan J., Wu D., Huo L. (2023b). Discovery and characterization of a myxobacterial lanthipeptide with unique biosynthetic features and anti-inflammatory activity. Journal of the American Chemical Society, 145(30), 16924.
  63. Wu C., Biswas S., Garcia De Gonzalo C. V., van der Donk W. A. (2019). Investigations into the mechanism of action of sublancin. ACS Infectious Diseases, 5(3), 454.
  64. Wu K. H., Tai P. C. (2004). Cys32 and His105 are the critical residues for the calcium-dependent cysteine proteolytic activity of CvaB, an ATP-binding cassette transporter. Journal of Biological Chemistry, 279(2), 901.
  65. Zhu D., Adebisi W. A., Ahmad F., Sethupathy S., Danso B., Sun J. (2020). Recent development of extremophilic bacteria and their application in biorefinery. Frontiers in Bioengineering and Biotechnology, 8, 483.

Associated Data

Supplementary Materials

kuaf028_Supplemental_File

Data Availability Statement

All data are incorporated into the article and its online Supplementary Material. Primary data are deposited at Mendeley Data: Martini, Rachel; Padhi, Chandrashekhar; van der Donk, Wilfred (2025), ‘Characterization of antimicrobial S-glycosylated glycocins containing three disulfides’, Mendeley Data, V1, doi: 10.17632/m7zm6kk295.1.


Articles from Journal of Industrial Microbiology & Biotechnology are provided here courtesy of Oxford University Press