Highlights
• A novel database for microbial glycosylation in model microbial organisms has been developed.
• This MicroGlycoDB database provides the foundation for a system where microbial glycan-related information can be stored in a standardized and comprehensive manner.
• The database allows automatic visualization of glycosylation pathways, including microbial structures.
• The database is developed based on Semantic Web techniques.
• The data is described in a machine-readable, standard format for integration with a wide range of data.
Keywords: Microbes, Glycosylation, Database, Data integration, Semantic data
Abstract
Glycoconjugates are present on microbial surfaces and play critical roles in modulating interactions with the environment and the host. Extensive research on microbial glycans, including elucidating the structural diversity of the glycan moieties of glycoconjugates and polysaccharides, has been carried out to investigate the function of glycans in modulating the interactions between the host and microbes, and to explore their potential applications in the therapeutic targeting of pathogenic species and as probiotics in gut microbiomes. However, glycan-related information is dispersed across numerous databases and a vast amount of literature, which makes it laborious and time-consuming to identify and gather the relevant information about microbial glycosylation. This challenge can be addressed by a comprehensive database, which could offer insight into the fundamental processes underlying glycosylation. We have developed the MicroGlycoDB database to provide integrated glycan information on important model microorganisms. The data is described using Semantic Web technologies, which allow microbial glycan data to be represented in a structured format accessible by machines, thus facilitating data sharing and integration with other resources that catalog features such as pathways, diseases, or interactions. This ontology-based semantic data will contribute to the discovery of new knowledge in the field of microbiology, along with the expansion of information on the glycosylation of other microorganisms.
Introduction
Glycans on the microbial cell surface and glycoenzymes play critical roles in the integrity of the cell [1], permeability of chemical compounds related to drug resistance [2,3], and movement of the cell [4]. Glycosylation in microbes responsible for disease, including Streptococcus spp., Salmonella typhi, Pseudomonas aeruginosa, Acinetobacter baumannii, Neisseria meningitidis [5–9], heavily glycosylated viruses [10], and the encapsulated yeast Cryptococcus neoformans [11], plays significant roles in their pathogenesis by aiding in immune evasion. Also, the normal commensal microbiota maintain intestinal health through glycoenzymes, allowing the metabolism of complex glycans [12]. For example, the lipoglycan of Mycobacterium tuberculosis (M. tuberculosis), characterized by latent infection, shows the role of glycans in inhibiting the function of immune cells. This occurs via the interaction of lipoarabinomannan (ManLAM) with pattern-recognition receptors (PRRs) on dendritic cells (DCs), ultimately resulting in immune evasion [13]. Some microbial glycans mimic host glycans to avoid host immune surveillance, while bacterial glycoenzymes can modify the host glycans to enable bacteria to adhere to the host cell or use them as nutrient sources [14]. N-glycosylation of Campylobacter jejuni (C. jejuni) proteins protects them from cleavage by host proteases [15] and its mimicry of host glycans enhances host-pathogen interactions [16]. This can lead to the autoimmune disease Guillain-Barré syndrome [17], which shows the crucial roles of microbial glycans for initial attachment and colonization leading to disease manifestation.
Recent research has also revealed the role of C. jejuni N-glycans in multidrug-resistant efflux pump proteins (CmeABC) that actively remove drugs from bacterial cells. N-glycans directly mediate the activity of the efflux pump by affecting the protein conformation [18], which suggests they could be an ideal target to overcome antibacterial drug resistance.
Bacterial organelles for motility, such as flagella and pili [19,20], are frequently decorated with distinctive monosaccharides. These present a greater diversity of structural compositions than their eukaryotic counterparts, including, for example, pseudaminic acid (Pse) [21], legionaminic acid (Leg) [22], rhamnose (Rha) [23], 3-deoxy-d-manno-oct-2-ulosonic acid (Kdo) [24], and N-acetylfucosamine (FucNAc) [25]. This diversity may reflect microbial adaptation to diverse host organisms, enabled by short generation times and horizontal genetic exchange. Among these monosaccharides, Pse and Leg are exclusively expressed in pathogenic microbes, including C. jejuni and Helicobacter pylori.
Research into the metabolic pathway of these intriguing glycan structures of C. jejuni has revealed the role of novel glycoenzymes that act in forming glycosidic linkages and modifications of biomolecules. The accumulated information on glycosyltransferases, glycosidases, kinases, monosaccharides, and glycoconjugates has prompted the development of metabolic pathway-based recombinant glycans and fermentable products for research and industrial purposes [26,27]. Genetic studies into the gene clusters that encode glycoenzymes have also advanced understanding of the functional and structural diversity of carbohydrates that exists within a species or genus [28].
However, despite the elucidation of the structures and functions of glycans in multiple microorganisms [29,30], crucial information is dispersed across numerous publications and databases. Thus, the details of even a single glycoenzyme are not readily accessible in a single location, either online (in a database) or in the literature. Efforts to acquire specific glycan-related information therefore require substantial amounts of time and effort. For example, to obtain information about the capsular polysaccharides (CPS) of C. jejuni, an extensive search across multiple databases would be needed: UniProtKB (UniProt Knowledgebase), a global protein resource encompassing almost all species [31], has glycoprotein information; CAZy (Carbohydrate-Active enZymes) provides amino acid sequences and 3D structures of the carbohydrate-related enzymes [32] organized into families of similar enzymes; Pfam provides protein families classified by shared structural and functional features [33]; CSDB (the Carbohydrate Structure Database) provides detailed information on the structural characteristics of carbohydrates in bacteria and fungi [34]; PDB (Protein Data Bank) is a repository of three-dimensional structures of proteins, nucleic acids, and their complexes [35]; GlyTouCan is the international glycan structure repository [36]; ChEBI (Chemical Entities of Biological Interest) organizes compounds of biological interest based on ontologies [37]; and Rhea (https://www.rhea-db.org) provides comprehensive and detailed information about biochemical reactions and their associated enzymes [38].
Sharing or integrating data between experimental databases from different biological disciplines of interest, including glycomics, transcriptomics, proteomics, and genomics, will provide users with a comprehensive view of a research question, which will help researchers gain valuable insights and eventually suggest hypotheses for further study. The most significant challenge in data integration is unifying the data format and the terms (controlled vocabulary) used to describe the same molecules across disparate databases, which diverge because different research groups use different naming conventions. For instance, trehalose 6,6′-dimycolate, a pathogenic glycolipid in M. tuberculosis, has the recommended symbol name of TDM, but also has synonyms such as “cord factor” or “trehalose dimycolate”. To address the problem of inconsistent naming, ontologies have been proposed as a potentially effective way to annotate resources in a formal format. They allow a computer to comprehend the semantics of data represented in natural language, whether the data derive from biomolecules, biological processes, phenotypes, or other complex concepts. Numerous ontologies have been developed to handle the semantic annotation of complex molecules and concepts within fields of interest, and thereby guarantee consistent representation of knowledge. Alongside the semantic model for knowledge representation, another prerequisite for data integration is standardization of file format, which makes data exchange or reuse between different databases easier. Prominent databases in the life sciences such as UniProt, Ensembl, PubChem, and the Microbial Genome Database (MBGD) have shown the successful use of Semantic Web technologies. These technologies are standards that allow data to be described in a structured, standardized format and make it easier to integrate with other resources.
Generally, the investigation of bacterial glycans has focused on model organisms that hold medical significance as either pathogens or commensal bacteria, such as C. jejuni, M. tuberculosis, and Bifidobacterium bifidum. Model organisms are essential for discovering the genetic and structural features of glycans in metabolism or pathogenesis because they facilitate the investigation of fastidious organisms in a detailed and rapid way. Thus, we have applied these Semantic Web standards to six model organisms to create our novel MicroGlycoDB database. We first obtained glycan-related data encompassing a wide range of information (e.g., cellular compartments, enzymes, genes, glycans, metabolic pathways, and glycan synthesis pathways) for representative microbes from experts. Then, through a process called RDFization, we converted this data into a standardized format that represents resources in a structured and machine-readable manner (such that computers can read and process the information), thereby facilitating information sharing and reuse among various life science communities. Since the RDFization process entails semantic annotation of resources by applying ontologies that provide a standardized methodology for sharing and reuse of scientific data, the data can be semantically enriched by connecting them to other RDF data in publicly available databases. Ontologies are pre-defined vocabulary terms that are hierarchically organized to represent various concepts in the life sciences; when the same ontologies are referenced by databases and data resources across the Web, data can be linked to indicate that they refer to the same concept.
Therefore, as we have done in this work, by RDFizing microbial glycan data, we expect that the semantically interconnected glycan-related data in the MicroGlycoDB will contribute to our understanding of the roles of microbial glycans at a systems level.
Methods
Data collection for glycan-related information
We obtained glycan-related data for six microbial species: B. bifidum, Bifidobacterium longum, C. jejuni, C. neoformans, Mycobacterium abscessus, and M. tuberculosis. We conducted a thorough review of the literature using PubMed identifiers to gather details on glycan information related to each glycogene. To assign GlyTouCan IDs to the glycans described in textual formats, such as IUPAC or Linear Code, the glycans were transformed into WURCS (Web3 Unique Representation of Carbohydrate Structures) [39] format using the glycan format converter API (Application Programming Interface) (https://doc.glycosmos.org/api/glycanformatconverter) developed by the GlyCosmos project. In the case of the glycan format used in the CSDB database, we used the CSDB/SNFG structure editor (http://csdb.glycoscience.ru/snfgedit/snfgedit.html) to obtain the image and CSDB text data of glycan structures, and then converted the glycan linear format to WURCS format. We registered the glycans in WURCS format into the GlyTouCan repository to obtain glycan IDs. When a gene name was present in the UniProt database, the UniProt ID and Rhea ID were obtained. Using the acquired Rhea ID, the ChEBI ID and PubChem ID of the substrate and product glycans participating in the enzyme reaction were obtained.
RDFization and validation
The MicroGlycoDB database is developed based on Semantic Web technologies, which consist of the Resource Description Framework (RDF) for the data format [40], the Web Ontology Language (OWL) for the definition of ontologies [41], and SPARQL (SPARQL Protocol and RDF Query Language) for queries [42]. The RDF data model that we designed for the description of microbial glycosylation-related information was reorganized in a spreadsheet format. RDF statements were represented in Turtle format, which is the simplest and most easily understandable format among the major RDF formats. The data was converted into RDF Turtle format using RDFLib, a Python library for handling RDF data, following the method for generating RDF triples described in a previous study [43]. We were able to save the RDF data in a compact textual form, in which a long and repeated IRI can be shortened as a prefixed name, which is an arbitrary name created by the researcher. The transformed RDF triples were uploaded into our Virtuoso database server [44], which is a graph database designed to store the RDF data, and the data was utilized to evaluate the accuracy of the semantic data via the SPARQL endpoint (https://ts.glycosmos.org/sparql).
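The spreadsheet-to-Turtle step and the prefixed-name shortening described above can be sketched as follows. This is a minimal standard-library illustration rather than the RDFLib-based pipeline itself: the `ex` namespace, the column names `gene` and `note`, and the helper names are assumptions made for the example, while the `glycan` namespace is the one quoted in this paper.

```python
import csv
import io
from dataclasses import dataclass

# Prefix table. "ex" is a made-up namespace for locally minted resources.
PREFIXES = {
    "glycan": "http://purl.jp/bio/12/glyco/glycan#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ex": "http://example.org/microglyco/",
}


@dataclass(frozen=True)
class Literal:
    """Marks a plain string value, as opposed to an IRI."""
    value: str


def compact(iri: str) -> str:
    """Shorten an IRI to a prefixed name if a known namespace matches."""
    for prefix, namespace in PREFIXES.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return f"<{iri}>"


def term(t) -> str:
    """Render one triple component in Turtle syntax."""
    if isinstance(t, Literal):
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return compact(t)


def row_to_triples(row: dict) -> list:
    """Turn one spreadsheet row into (subject, predicate, object) triples."""
    subject = PREFIXES["ex"] + row["gene"]
    return [
        (subject, PREFIXES["rdfs"] + "label", Literal(row["gene"])),
        (subject, PREFIXES["rdfs"] + "comment", Literal(row["note"])),
    ]


def to_turtle(triples) -> str:
    """Serialize triples as Turtle with @prefix declarations."""
    lines = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    lines.append("")
    lines += [f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples]
    return "\n".join(lines)


# A one-row stand-in for the curated spreadsheet (placeholder content).
SHEET = "gene,note\ncj1641,placeholder note\n"
rows = list(csv.DictReader(io.StringIO(SHEET)))
triples = [t for row in rows for t in row_to_triples(row)]
turtle = to_turtle(triples)
print(turtle)
```

In practice a library such as RDFLib also takes care of datatype handling and of local names that are not valid in prefixed form, which this sketch does not attempt.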
To verify the RDF data, SPARQL queries were generated and organized using SPARQList. The Shape Expressions (ShEx) schemas were coded and evaluated using PyShEx (version 0.8.1), driven by simple Python code. The RDF data was loaded into the ShEx engine and evaluated against the shape definitions to report the object node, object value, and type (https://github.com/sunmyoung/MicroGlycoDB).
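The kind of check that shape validation performs can be illustrated with a simplified stand-in. The sketch below is not ShEx or PyShEx; it only counts how many values each typed subject has for a property and compares the count with an expected range. The shape contents and the `ex:` identifiers are assumptions chosen for the example and do not reproduce the MicroGlycoDB schema.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Simplified "shape": class -> {property: (min count, max count or None)}.
# The cardinalities are illustrative assumptions.
SHAPES = {
    "glycan:Saccharide": {
        "rdfs:label": (1, 1),
        "glycan:has_glycosequence": (1, None),
    }
}


def validate(triples, shapes):
    """Return a list of violation messages; an empty list means conformant."""
    values = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        values[s][p].append(o)

    errors = []
    for subject, props in values.items():
        for cls in props.get(RDF_TYPE, []):
            for prop, (lo, hi) in shapes.get(cls, {}).items():
                n = len(props.get(prop, []))
                if n < lo or (hi is not None and n > hi):
                    bound = f"{lo}..{'*' if hi is None else hi}"
                    errors.append(
                        f"{subject}: {prop} has {n} value(s), expected {bound}"
                    )
    return errors


# Placeholder data: the second glycan is missing its label.
DATA = [
    ("ex:G1", RDF_TYPE, "glycan:Saccharide"),
    ("ex:G1", "rdfs:label", "glycan one"),
    ("ex:G1", "glycan:has_glycosequence", "ex:G1_seq"),
    ("ex:G2", RDF_TYPE, "glycan:Saccharide"),
    ("ex:G2", "glycan:has_glycosequence", "ex:G2_seq"),
]

errors = validate(DATA, SHAPES)
for problem in errors:
    print(problem)
```

A real ShEx schema additionally constrains node kinds, datatypes, and value sets, so this sketch covers only the cardinality aspect.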
User interface
The development of MicroGlycoDB involved the implementation of an efficient retrieval system of the data from the RDF database; this operates as the backend using Python 3.2. The client-side view for the web interface was implemented using HTML, CSS, JavaScript, jQuery, and Ajax. We used Flask, a Python-based web development framework, to create unique web pages for each microbe. These pages make it easier to visualize microbial images, update web pages to reflect new glycan information, and display extracted data from the RDF storage.
Ontologies for the RDF graph
To transform the collected data into RDF format, ontologies preexisting within the biological domain were inspected using Protégé (https://protege.stanford.edu/), an editor for creating, sharing, and visualizing ontologies, and the OLS (Ontology Lookup Service) online service (https://www.ebi.ac.uk/ols4). After standardized vocabularies for resources were determined, the terms of the ontologies were employed according to their defined specifications. The glycans, glycoconjugates, and enzyme reactions were described using GlycoRDF [45] or the GlycoConjugate Ontology (GlycoCoO) [46], which were developed for the semantic description of glycan structures, their modifications, and their core proteins.
Results
Data collection
We collected datasets for six microorganisms. The datasets include the names or identification numbers of genes that encode glycoenzymes, which reveal the structures or roles of glycosylation based on the accumulated results: for B. bifidum and B. longum, information on enzymes that break down glycosidic bonds; for M. tuberculosis and M. abscessus, enzymes that form glycosidic bonds to elongate or branch glycan structures; for C. jejuni, motility-related proteins and enzymes, including hydrolases, glycosyltransferases, epimerases, kinases, and ligases; and for C. neoformans, glycosyltransferases, phosphorylases, and chitin synthases that generate protein glycans, GPI structures, and components of the capsule and cell wall.
Bifidobacterium, one of the most prevalent bacterial genera in the intestinal tract, establishes beneficial relationships with the host by contributing to immune homeostasis in the intestinal epithelium, where their metabolites and ability to utilize host carbohydrates play important roles [47–49]. The information on enzymes that are related to the degradation of HMOs (Human Milk Oligosaccharides), such as sialidase (SiaBb2), β-galactosidases (BbgIII), and α-fucosidases (AfcA, AfcB), was provided with gene names. The essential information for the glycan structures produced by the corresponding enzymes was extracted from the ChEBI, GlyTouCan, and CSDB databases. When the information that was verified by these databases was limited, it was supplemented with information obtained through an extensive search across pertinent literature and databases. For example, for the cj1641 gene of C. jejuni, the relevant enzyme product information was retrieved from the UniProtKB, CAZy, PDB, and Enzyme Commission (EC) databases. The enzyme reaction components, such as the donor and acceptor substrates, were extracted from the Rhea database with their Rhea ID, which was used to retrieve the information of the participant biomolecules taking part in the enzymatic reaction, such as glycans or chemicals; the latter could be represented by a unique identifier provided by the corresponding database such as PubChem or ChEBI. The provided gene names of string datatype were represented by a unique URI (Uniform Resource Identifier), which is essential to standardizing semantic data and allowing a computer to read the meaning and relationship of data resources. The glycan participants were also assigned a GlyTouCan ID, which is the international repository developed to facilitate the referencing of complex glycan structures and thereby avoid the confusion caused by the multiple nomenclatures of glycans, such as IUPAC, Linear Code, or research group-specific naming formats.
The pathogenic bacteria M. tuberculosis (MTb) and M. abscessus, which feature a unique membrane structure, have been studied as important pulmonary pathogens. The MTb cell envelope, which evolved as a formidable defensive barrier, contains multiple distinct glycoconjugate structures such as the mAGP (mycolyl-arabinogalactan-peptidoglycan) complex, phosphatidylinositol mannosides (PIMs), lipomannans (LMs), and lipoarabinomannans (LAMs). These allow it to survive intracellularly in the phagosome, despite a hostile environment of low pH and reactive oxidative and nitrosative stressors [50,51]. M. abscessus has further been observed to transition from a smooth type, characterized by the presence of cell surface-associated glycopeptidolipids (GPL), to a rough type lacking GPL [52]. To represent these glycolipids, which consist of characteristic fatty acids and glycan residues such as l-arabinose and d-mannose, we performed a search on the CSDB database. The unique identifier number, figure format, and CSDB linear text format for the lipoglycan structure were retrieved so that we could describe the glycolipids in MTb and M. abscessus with a focus on glycan structures.
C. neoformans is an opportunistic pathogen that causes fungal meningoencephalitis. The polysaccharide structures of its cell wall, composed of an inner layer of glucans and chitin and an outer layer of glucans and mannoproteins, and its unique capsule play important roles for its integrity and virulence. We obtained information regarding the enzymes involved in the synthesis of these glycans, such as NCBI protein ID, CAZy family, and conserved protein domain family (CDD), from FungiDB [53], a database that provides tools for functional analysis and data mining against a wide range of integrated datasets.
Standardization of microbial glycosylation data
• Graph model for RDFization
As a first step to describe microbial glycosylation data in a standard format, we designed an RDF schema, which provides a way to capture the architecture of data description in a simple manner and to describe complex data concisely (Fig. 1). The RDF graph model was created according to the distinct data for each model microorganism. It is represented by resource nodes (data points) and edges that show the relationships between nodes and their values. Each node represents the relevant resource entities, which include concepts like a biochemical reaction or pathway for the biosynthesis of a glycoconjugate, as well as biomolecules like genes, proteins, species, references, enzymes, and glycans. To serialize the graph model as RDF documents in a text file, we adopted preexisting controlled vocabularies and hierarchical ontologies, which are recommended to improve data interoperability between different knowledge bases and data integration among multi-domain data. We also extensively searched common vocabularies and ontology terms using the OLS and Ontobee (https://ontobee.org/), which is a web service to help identify appropriate ontology terms or vocabularies used for annotation of resources in various domains. Additionally, to identify the definition or usage of the identified ontology, the OWL files of these ontologies were examined using Protégé software [54], which is a tool that enables the inspection of ontologies comprised of complex logic and constraints. For example, for glycoenzyme activity, it is necessary to express the concept of an enzyme exerting its activity on the substrate, resulting in the generation of a product. To describe the glycoenzyme activity, we inspected the GlycoRDF ontology using Protégé, and then introduced adequate vocabularies. To simplify the long URI comprised of a namespace and an identifier, the detailed URI can be abbreviated by specifying it with a pre-defined prefix such as glycan for ‘http://purl.jp/bio/12/glyco/glycan#’. 
Using this ontology, the substrate and product glycans that take part in an enzyme reaction could be represented using a Reaction class node and a ‘glycan:catalyzed_by’ property, which means that the glycans are the participants of the reaction. Also, to represent their roles in the reaction, such as reactant or product, the properties ‘glycan:has_substrate’ and ‘glycan:has_product’ were used, respectively. For metadata annotations that improve readability, descriptive properties such as ‘rdfs:label’ or ‘rdfs:comment’ were used. When the enzyme activity was provided with a descriptive explanation that was inferred from in silico analysis or binding assays without information about the reaction participants, the Gene Ontology (GO), a structured representation developed through the process of biocuration for the annotation of gene expression and function, was employed to assign and specify semantics to enzyme activity.
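To make the reaction model concrete, the following sketch stores one reaction as a handful of triples and groups its participants by role. The property names are those given in the text; the class name `glycan:Reaction`, the direction of `glycan:catalyzed_by` (reaction to enzyme), and all `ex:` identifiers are assumptions made for illustration.

```python
# Triples written as prefixed names. The ex: identifiers are placeholders.
TRIPLES = [
    ("ex:reaction1", "rdf:type", "glycan:Reaction"),        # class name assumed
    ("ex:reaction1", "glycan:catalyzed_by", "ex:enzymeA"),  # direction assumed
    ("ex:reaction1", "glycan:has_substrate", "ex:glycanX"),
    ("ex:reaction1", "glycan:has_product", "ex:glycanY"),
    ("ex:reaction1", "rdfs:label", "placeholder reaction"),
]

# Maps each role-bearing property to the role it expresses.
ROLE_PROPERTIES = {
    "glycan:has_substrate": "substrate",
    "glycan:has_product": "product",
    "glycan:catalyzed_by": "enzyme",
}


def participants(triples, reaction):
    """Group the resources attached to one reaction node by their role."""
    roles = {"substrate": [], "product": [], "enzyme": []}
    for s, p, o in triples:
        if s == reaction and p in ROLE_PROPERTIES:
            roles[ROLE_PROPERTIES[p]].append(o)
    return roles


print(participants(TRIPLES, "ex:reaction1"))
```

Keeping the role in the property rather than in the glycan node lets the same glycan appear as a product of one reaction and a substrate of the next, which is what a pathway view needs.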
Fig. 1.
The RDF schema describes glycan-related data for model organisms.
The GlycoRDF ontology is utilized to depict glycoenzymes that are responsible for catalytic reactions. The protein node is used to connect the enzyme entity to gene resources, and the GO ontology is used to characterize enzyme activity, as long as the appropriate term is available.
The gene information dataset for C. jejuni consisted of proteins involved in O-linked glycosylation of flagellin, N-glycosylation in the periplasm and membranes, and biosynthesis of the lipooligosaccharide (LOS) and capsular polysaccharide (CPS). We used GO identifiers to describe the corresponding enzyme activities that lack information about a specific chemical reaction. For instance, "Kdo transferase activity" is denoted by GO identifier GO:0043842; "flagellin subunit A" corresponds to GO:0005198; "flippase activity" is encoded by GO:0140327; and so forth. The GO ontology is useful for describing the predicted activity of a corresponding protein, and the results of advanced biochemical research will provide information about the reaction participants. In addition to enzyme activity containing glycan information, the gene, protein sequence, localization of cellular anatomy, and taxon information were also semantically connected. This allowed us to expand our scope and retrieve specific glycans or common glycans existing between microorganisms.
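The GO-based fallback described above amounts to a lookup from a free-text activity to a GO identifier, which is then expanded to a resolvable IRI. The sketch below uses the three identifiers quoted in the text and the standard OBO PURL pattern; the function names are hypothetical.

```python
# Activity descriptions and GO identifiers as quoted in the text above.
GO_TERMS = {
    "Kdo transferase activity": "GO:0043842",
    "flagellin subunit A": "GO:0005198",
    "flippase activity": "GO:0140327",
}

OBO_BASE = "http://purl.obolibrary.org/obo/"


def go_iri(curie: str) -> str:
    """Expand a GO identifier such as 'GO:0043842' to its OBO PURL."""
    prefix, _, local = curie.partition(":")
    if prefix != "GO" or len(local) != 7 or not local.isdigit():
        raise ValueError(f"not a GO identifier: {curie!r}")
    return f"{OBO_BASE}GO_{local}"


def annotate(activity: str):
    """Return the GO IRI for a known activity, or None if no term is listed."""
    curie = GO_TERMS.get(activity)
    return go_iri(curie) if curie else None


print(annotate("Kdo transferase activity"))
```

Returning `None` for unlisted activities mirrors the caveat in the Fig. 1 legend that GO is used only when an appropriate term is available.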
• Validation of RDF data
The graph model we developed was serialized into RDF statements, a process that produces triples in the form of subject-predicate-object in accordance with the defined specification of the controlled vocabulary and ontology terms. We validated the triples using ShEx (Shape Expressions), which enables us to resolve interpretation ambiguities raised by the RDF schema diagrams and identify data that does not match the schema [55]. The generated instances of all classes were evaluated through ShEx validation, which included checking the consistency of the resource type and the cardinality of the objects that a subject instance can possess under a particular property. The RDF documents were modified and the validation process was iterated until all errors or discrepancies in the object values reported by ShEx were corrected. The verified RDF files were uploaded to our Virtuoso RDF database, and then, for validation, we tested whether the RDF data describing microbial glycosylation adhered to the RDF schema in the endpoint (https://ts.glycosmos.org/sparql).
User interface
MicroGlycoDB was designed to provide detailed and comprehensive information on microbial glycosylation. Each microbial species has its own dedicated webpage with visual representations, which provides users with glycan structures and a wide range of glycan-related information, including the biosynthetic pathway of the glycoconjugates and their localization in the microbe. The individual page for each microorganism is accessible from the sidebar or main body of the home page (https://microglycodb.glycosmos.org/), where users can easily navigate or download the curated information about microbes, access useful tools for glycan drawing such as GlycoNAVI (https://glyconavi.org/Draw/index.php) or Drawglycan-SNFG (http://www.virtualglycome.org/DrawGlycan/), and follow links to related resources such as GlyTouCan, GlyCosmos, Rhea, and UniProt. The glycan and gene tabs on the right side of the microbe's summary page show a list of glycan structures and glycogenes encoding enzymes and proteins that are responsible for glycosylation, such as glycan synthesis, modification, or regulation. Each resource in the list has a link to a web page showing the details about the selected glycan or glycogene. The subcategory menu on the Genes and Glycans tab depends on the type of glycoconjugates found, thereby enabling the user to retrieve specific information contained within each category. This information may also be accessed by selecting a particular segment of the membrane structures illustrated in the main figure of the selected microbe. Glycan and glycolipid structures are represented in the SNFG (Symbol Nomenclature for Glycans) format [56], which is supplied by the GlyTouCan repository, and CSDB format, respectively.
As an example of a model organism, information regarding the cellular localization of glycoconjugates in C. jejuni or Mycobacterium is provided with respect to the overall cell structure, so as to reflect published reports that particular glycostructures exist in specific cell components and are associated with their respective pathological roles. Components of the Gram-negative bacterial cell architecture, such as CPS that remain attached to the cell surface together with LOS, N-linked glycoproteins in the periplasm and membranes, peptidoglycans in the periplasm, and O-linked glycans of flagellin, are displayed on the graphic image, allowing the user to access specific information by clicking on the substructures. When the user clicks a component on the bacterial image, the relevant glycosylation process is displayed as a graphic representation along with a table containing the genes, enzyme activities, and a link that will take the user to the detailed page of the glycans (Fig. 2A). For instance, when a user clicks on CPS, the CPS biosynthesis pathway is displayed using SNFG symbols (Fig. 2B). The details of the pathway components, such as glycoenzymes or carbohydrates, are simultaneously displayed on the right side, with information associated with pathogenicity, highlighting a potential target for therapeutic development.
Fig. 2.
Graphic depiction of cellular architecture and glycosylation processes in C. jejuni.
(A) represents the genes and glycan constituents involved in biosynthesis of the lipooligosaccharides (LOS) that are anchored to the outer membrane of C. jejuni. The LOS consist of a diverse array of short oligosaccharides typically modified with sialic acid, which vary both among and within strains and contribute to immune evasion through mimicry of a wide range of human gangliosides. The user can see other structural components by clicking on membrane components such as flagellin or capsular polysaccharides, and peptidoglycan in the periplasm. (B) The SNFG format and names of the monosaccharides that take part in LOS synthesis or are found exclusively in bacteria.
As a second example, lipoproteins, including PIMs, LMs, and LAMs, are anchored to the plasma membrane of M. tuberculosis. These lipoproteins play an important role in maintaining cell membrane integrity and regulating interactions between hosts and pathogens. Data related to the location of glycoconjugates at the interface between MTb and the host cell will provide users with insights that may advance vaccine development in the field of glycoengineering. When a user clicks on a lipoprotein, a pathway diagram representing the enzymatic process is displayed, and clicking on a glycogene or glycan in the pathway diagram brings users to a page where they can inspect specific information on the glycan structure (Fig. 3A). If information about the glycogene involved in the biosynthesis of the glycan structure is known, including the results of in silico analysis, the glycoenzyme information is mapped to the corresponding UniProt and Rhea IDs. Thus, the user can identify the relevant enzymatic reaction as well as integrated information about the sequence and function of the protein. The user can also access enzyme information, including the EC enzyme number, CAZy number, and corresponding enzyme reaction, by using the Rhea ID of the table that is displayed for each glycan detail. In addition, the ChEBI ID from the chemical ontology ChEBI, which provides standardized names for chemicals to reduce nomenclature confusion, appears as an external link to the glycans (Fig. 3B).
Fig. 3.
The information pages contain details about microbial glycans and glycogenes.
(A) is a page providing comprehensive information on glycan structures, external resources, and references, allowing users to recognize various glycan formats that help in referring to or registering glycans. (B) is a gene-related details page showing the gene name, UniProt ID, description of the enzyme function, and enzyme reaction.
The homepage of MicroGlycoDB offers links to the entry pages of six model microorganisms, as well as a link to download the dump file, tools for searching glycans, and access to external resources. The search page has two options: selecting a species name and a keyword search pertaining to glycan structure. The latter option supports the glycan structure in four formats: CSDB linear, IUPAC condensed format, LINUCS, and WURCS. To retrieve the results using a glycan name in the keyword search, a SPARQL query may be used (Table 1). Through the glycan keyword search results, users can compare the distribution of glycans across the microbes and infer the meaning of shared glycan structures. Clicking the common glycan in the results page transfers the user to the detail page of the glycan (Fig. 4).
Table 1.
Retrieving the species containing a glycan of the keyword search.
SPARQL query
PREFIX glycan: <http://purl.jp/bio/12/glyco/glycan#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?g ?glycan_url ?species_name ?glycan_name
WHERE {
  VALUES ?g { <http://rdf.glycosmos.org/microglycodb> }
  GRAPH ?g {
    ?glycan_url rdf:type glycan:Saccharide ;
                rdfs:label ?species_name ;
                glycan:has_glycosequence ?glycosequence .
    ?glycosequence glycan:in_carbohydrate_format glycan:carbohydrate_format_{{format}} ;
                   glycan:has_sequence "{{query}}"^^xsd:string .
    OPTIONAL { ?glycan_url skos:prefLabel ?glycan_name . }
  }
}
Fig. 4.
Keyword search interface based on SPARQL queries.
The upper box is for the search of relevant glycan information; this allows users to search using a glycan structure as a keyword with options for glycan format, including CSDB Linear, IUPAC, and WURCS. The results are presented in the selection box showing the species harboring the glycan, which is realized by executing SPARQL queries on the database server.
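To make the keyword search concrete, the following sketch fills the two placeholders of the Table 1 query and encodes the result for a GET request. It is an illustration under assumptions rather than the MicroGlycoDB code: the function names, the example endpoint URL, and the example WURCS string are hypothetical, and the `VALUES` clause used to restrict the query to the MicroGlycoDB graph is one plausible reading of the graph restriction in Table 1.

```python
from urllib.parse import urlencode

# Table 1 query as a Python template. {format} and {sequence} stand in for
# the {{format}} and {{query}} placeholders; literal braces are doubled.
# Assumption: the named graph is pinned with a VALUES clause.
QUERY_TEMPLATE = """\
PREFIX glycan: <http://purl.jp/bio/12/glyco/glycan#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?g ?glycan_url ?species_name ?glycan_name
WHERE {{
  VALUES ?g {{ <http://rdf.glycosmos.org/microglycodb> }}
  GRAPH ?g {{
    ?glycan_url rdf:type glycan:Saccharide ;
                rdfs:label ?species_name ;
                glycan:has_glycosequence ?glycosequence .
    ?glycosequence glycan:in_carbohydrate_format glycan:carbohydrate_format_{format} ;
                   glycan:has_sequence "{sequence}"^^xsd:string .
    OPTIONAL {{ ?glycan_url skos:prefLabel ?glycan_name . }}
  }}
}}
"""


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_query(fmt, sequence):
    """Fill the template with a format suffix and a glycan sequence."""
    if not fmt.isidentifier():  # keep the format suffix a plain token
        raise ValueError("unexpected format name: " + fmt)
    return QUERY_TEMPLATE.format(format=fmt, sequence=escape_literal(sequence))


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical endpoint and an illustrative WURCS string.
query = build_query("wurcs", "WURCS=2.0/1,1,0/[a2122h-1b_1-5]/1/")
url = build_request_url("https://example.org/sparql", query)
print(url)
```

Escaping the user-supplied sequence and restricting the format suffix to a plain token keeps free-text input from altering the structure of the query.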
Discussion
The important roles of microbial glycans in their interactions with hosts and the environment have been elucidated through various hypotheses and an extensive range of experimental evidence, including genomics, proteomics, metabolomics, and glycomics. However, the resulting data are fragmented and dispersed in publications and across multiple databases with diverse formats. Thus, the acquisition of relevant data on microbial glycans requires a significant amount of time and effort, which in turn slows down the sharing and expansion of knowledge among researchers. MicroGlycoDB has been developed to provide users with comprehensive information on glycans, specializing in the six important model microorganisms provided by a collaborative research group. Our database allows users to explore essential resources, such as enzymes and glycan structures for their glycosylation pathways, along with graphical representations of microbial structures, which helps users to investigate the functions of microbial glycans and glycan-related information.
Importantly, the glycan-related data currently available in MicroGlycoDB remain limited with respect to pathogenic bacteria and the microbial communities of the human mucosa, including the gastrointestinal tract, oral cavity, skin, nasal passages, and other tissues. For example, the pathogenic N. meningitidis and Neisseria gonorrhoeae have extensive glycosylation on their capsules, LOS, and pili, which facilitate their masking of surface adhesins, promote adhesion and invasion, and enhance bacterial adherence, respectively [57]. P. aeruginosa, a principal pathogen responsible for nosocomial pneumonia, modifies glycosylation on its pili, resulting in enhanced virulence [58]. Also, H. pylori requires pseudaminic acid (Pse) glycosylation on flagella for proper filament assembly and motility, and it relies on Lewis antigen mimicry within the O-antigen region of its lipopolysaccharides for colonization; the type of Lewis antigen varies among strains, akin to human ABO blood group antigens [[59], [60], [61]]. Pse5Ac7Am in flagellin is also important for the attachment and entry of C. jejuni into the host intestinal epithelium [62].
Glycosylation in non-bacterial organisms, including fungi and viruses, has frequently been shown to impact pathogenicity. For a eukaryotic example, the biofilm matrix of Candida albicans, one of the most frequent fungal pathogens, mainly consists of α−1,2 branched mannans and α−1,6 mannans. This provides a physical barrier to protect from immune attack such as phagocytosis by neutrophils [63] and contributes to antifungal drug resistance [64]. For a viral example, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019 (COVID-19), shows significant N- and O-glycosylation on its proteins, including the spike protein, in the envelope which are crucial for host recognition and pathogenesis [65]. The essential role of these glycans indicates that they could be potential targets for novel therapeutics, and a database providing the pathway information for glycan synthesis would therefore be extremely valuable.
In the context of disease, the significant roles of glycans in host interactions have been elucidated in infectious diseases caused by bacteria, viruses, fungi, and protozoa. Glycan-binding proteins on the immune cells of the host, including galectins, C-type lectins, and siglecs, constantly monitor surface glycans of microorganisms to discriminate between self and nonself. This is crucial in determining the fate of health or disease, such as chronic inflammation, autoimmune diseases, and cancer [66]. For instance, C. jejuni is also enveloped in lipooligosaccharides analogous to the human gangliosides found in nerve tissue. Consequently, antiganglioside antibodies (AGAs) that respond to C. jejuni-associated glycans can give rise to an autoimmune disease known as Guillain-Barré syndrome [67]. Mtb, which is responsible for tuberculosis, produces various types of O-mannosylated proteins. The interaction of ManLAM with C-type lectin receptors (CLRs), including dendritic cell-specific intercellular adhesion molecule-3-grabbing nonintegrin (DC-SIGN), mannose receptor (MR), and dectins, has been linked to enhanced host immunity against Mtb infection [68,69]. The human immunodeficiency virus (HIV) is extensively glycosylated with oligomannose-GlcNAc to protect against neutralizing antibodies, similar to many other viruses, and viral Man5-GlcNAc complexes facilitate HIV attachment to host cells [71]. SARS-CoV-2 has further demonstrated the significance of glycans as critical components in the interface of virus-host interactions [70]. The emerging fungus, Candida auris, which exhibits severe multidrug resistance, expresses a cell wall β-glucan that influences the immune response [72]. Interactions between glycan-binding proteins and glycans are increasingly being cataloged in several databases, and we plan to incorporate microbial interaction data into these databases.
Overall, this highlights the significance of a comprehensive database that provides information on genes, enzymes, glycans, pathways, associated diseases, and other relevant factors pertaining to the role of glycans in host-microorganism interactions in both health and disease in a standard format.
We describe the resources of MicroGlycoDB using ontology and controlled vocabulary in RDF format, which is appropriate for data description clarity and data integration. Ontologies not only clarify the uncertainty arising from the diverse nomenclature of biological molecules, as seen in the multitude of synonyms, but they also assist the discovery of new knowledge by inferring and representing the semantic relationships between different types of data. We are currently developing a new ontology to clearly define the complex structures of glycoconjugates that consist of many chemical modifications, glycosyl moieties, and backbone molecules in LPS, flagella, pili, and capsules. Thus, MicroGlycoDB will help researchers to discover not only the glycan structures of interest, providing details such as genes, enzymes, and related pathways within the microbial anatomy, but also the connections between glycan structures and their roles, gain insight into complex glycosylation processes, and understand the mechanisms that underlie interactions between microorganisms and hosts.
Some resources or external references for glycan structures or glycogenes in our database are limited. For instance, C. neoformans shows only glycogenes without the information on the glycans. This is due to the lack of mapping data for glycoenzymes expected at the gene level and the glycan structures that mostly remain in publications. As one way to resolve this issue, we are developing an automated pipeline to inspect and verify the information that is uploaded via a Web tool called MicroGlycoCurator (https://microglycorepo.alpha.glycosmos.org/). This online platform transforms tabular data into semantic data and subsequently conveys it to MicroGlycoDB, which is particularly useful for handling large data sets. This system will reduce the effort and errors associated with manual processes while facilitating the integration of individual data within a single microbe into a knowledgebase where all data sets are linked as semantic data by using ontologies. This Web tool will be released as an alpha version in 2025, with public release via GlyCosmos the following year. Just as we have been holding various user workshops and booth exhibits at conferences for GlyCosmos, we will also advertise MicroGlycoCurator and MicroGlycoDB to the community for feedback, which will be incrementally implemented in these respective resources on a periodic basis. The major challenge will be to develop a template format that is compatible with data formats that are currently used by the community. This will require close communication with researchers to develop an extendable and useful template so that users can easily upload their information into MicroGlycoCurator and edit it online for submission. Another concern is the annotations that users may want to attach to their data and whether it will be straightforward to map them to standardized ontologies. Again, these will be discussed closely with the relevant members of the community.
We expect that MicroGlycoDB will encompass additional microbial species and offer more detailed information through our curator system.
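The conversion of tabular data into semantic data described for MicroGlycoCurator can be pictured with the minimal sketch below, which reads a small CSV table and emits N-Triples using the GlycoRDF predicates that appear in the Table 1 query. Everything specific here is an assumption made for illustration: the column names, the `https://example.org/glycan/` IRI base, the sample row, and the choice of N-Triples as output are hypothetical and do not describe the actual MicroGlycoCurator template or pipeline.

```python
import csv
import io

GLYCAN = "http://purl.jp/bio/12/glyco/glycan#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "https://example.org/glycan/"  # hypothetical IRI base
REQUIRED = ("id", "species", "format", "sequence")  # hypothetical columns


def literal(text):
    """Quote a string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def row_to_ntriples(row):
    """Convert one table row into N-Triples lines, rejecting incomplete rows."""
    missing = [name for name in REQUIRED if not row.get(name)]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    glycan = f"<{BASE}{row['id']}>"
    sequence = f"<{BASE}{row['id']}/{row['format']}>"
    fmt = f"<{GLYCAN}carbohydrate_format_{row['format']}>"
    return [
        f"{glycan} <{RDF_TYPE}> <{GLYCAN}Saccharide> .",
        f"{glycan} <{RDFS_LABEL}> {literal(row['species'])} .",
        f"{glycan} <{GLYCAN}has_glycosequence> {sequence} .",
        f"{sequence} <{GLYCAN}in_carbohydrate_format> {fmt} .",
        f"{sequence} <{GLYCAN}has_sequence> {literal(row['sequence'])} .",
    ]


def table_to_ntriples(text):
    """Convert a whole CSV table (header row required) into N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        triples.extend(row_to_ntriples(row))
    return triples


# Illustrative input; the identifier and species label are placeholders.
SAMPLE = '''id,species,format,sequence
G0001,Example species,wurcs,"WURCS=2.0/1,1,0/[a2122h-1b_1-5]/1/"
'''

triples = table_to_ntriples(SAMPLE)
print("\n".join(triples))
```

Rejecting rows with missing fields before any triples are written is one simple form of the inspection and verification step mentioned above; a production template would likely need richer checks, such as validating identifiers and sequence syntax.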
Conclusion
We have developed MicroGlycoDB to provide a comprehensive data resource on microbial glycosylation. Although some resources have missing data that needs to be filled in through additional studies, our database emphasizes the important role of integrated knowledge, which helps users inspect consolidated information from diverse domains. This database is structured to facilitate and promote data integration, so that we can provide not only specific information about individual glycans but also pertinent information such as glycan structures in various formats and their standardized identifier numbers for easy reference, glycogenes, glycoenzymes, glycosylation pathways, and their graphical representation in cellular architecture. This reduces the effort required for users to search across multiple websites and provides an opportunity to implement an integrated approach to glycosylation systems in microbes. MicroGlycoDB thus lays the foundation for a platform that will provide insights into microbial glycosylation by integrating independent information from publications and databases.
Funding sources
This work was supported by the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST) Grant Number JPMJND2204 and the Soka University International Collaborative Research Grant. Doering lab studies of cryptococcal glycans are supported by National Institutes of Health grant AI135012.
CRediT authorship contribution statement
Sunmyoung Lee: Writing – review & editing, Writing – original draft, Resources. Louis-David Leclercq: Validation, Resources. Yann Guerardel: Writing – review & editing, Resources. Christine M. Szymanski: Writing – review & editing, Resources. Thomas Hurtaux: Validation, Resources. Tamara L. Doering: Writing – review & editing, Funding acquisition, Conceptualization. Takane Katayama: Writing – review & editing, Investigation. Kiyotaka Fujita: Writing – review & editing, Investigation. Kazuhiro Aoki: Writing – review & editing, Investigation. Kiyoko F. Aoki-Kinoshita: Writing – review & editing, Resources, Funding acquisition, Conceptualization.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
We would like to thank Miyuki Kikuchi and Atsuto Uchino for their initial contributions to the development of MicroGlycoDB. We thank Liza Loza for assistance in depicting the cryptococcal cell wall and Cory Wenzel, Michel Gilbert, Ian Schoenhofen and Frank Raushel for assistance with C. jejuni glycan pathway annotations. We would also like to acknowledge Yotsuba Hattori for updating the user interface to its current version.
Data availability
Data will be made available on request.
References
- 1.Huang K.C., Mukhopadhyay R., Wen B., Gitai Z., Wingreen N.S. Cell shape and cell-wall organization in Gram-negative bacteria. Proc. Natl. Acad. Sci. U.S.A. 2008;105:19282–19287. doi: 10.1073/pnas.0805309105.
- 2.Whitfield C., Williams D.M., Kelly S.D. Lipopolysaccharide O-antigens—bacterial glycans made to measure. J. Biolog. Chem. 2020;295:10593–10609. doi: 10.1074/jbc.REV120.009402.
- 3.Yakovlieva L., Fülleborn J.A., Walvoort M.T.C. Opportunities and challenges of bacterial glycosylation for the development of novel antibacterial strategies. Front. Microbiol. 2021;12 doi: 10.3389/fmicb.2021.745702.
- 4.Sanchez H., Hopkins D., Demirdjian S., Gutierrez C., O'Toole G.A., Neelamegham S., Berwin B. Identification of cell-surface glycans that mediate motility-dependent binding and internalization of Pseudomonas aeruginosa by phagocytes. Mol. Immunol. 2021;131:68–77. doi: 10.1016/j.molimm.2020.12.012.
- 5.Hiyoshi H., Wangdi T., Lock G., Saechao C., Raffatellu M., Cobb B.A., Bäumler A.J. Mechanisms to evade the phagocyte respiratory burst arose by convergent evolution in typhoidal salmonella serovars. Cell Rep. 2018;22:1787–1797. doi: 10.1016/j.celrep.2018.01.016.
- 6.Wangdi T., Lee C.-Y., Spees A.M., Yu C., Kingsbury D.D., Winter S.E., Hastey C.J., Wilson R.P., Heinrich V., Bäumler A.J. The Vi capsular polysaccharide enables salmonella enterica serovar typhi to evade microbe-guided neutrophil chemotaxis. PLoS Pathog. 2014;10 doi: 10.1371/journal.ppat.1004306.
- 7.Huszczynski S.M., Coumoundouros C., Pham P., Lam J.S., Khursigara C.M. Unique regions of the polysaccharide copolymerase Wzz 2 from pseudomonas aeruginosa are essential for O-specific antigen chain length control. J. Bacteriol. 2019:201. doi: 10.1128/JB.00165-19.
- 8.Fiebig T., Cramer J.T., Bethe A., Baruch P., Curth U., Führing J.I., Buettner F.F.R., Vogel U., Schubert M., Fedorov R., Mühlenhoff M. Structural and mechanistic basis of capsule O-acetylation in Neisseria meningitidis serogroup A. Nat. Commun. 2020;11:4723. doi: 10.1038/s41467-020-18464-y.
- 9.Geisinger E., Huo W., Hernandez-Bird J., Isberg R.R. Acinetobacter baumannii : envelope determinants that control drug resistance, virulence, and surface variability. Annu. Rev. Microbiol. 2019;73:481–506. doi: 10.1146/annurev-micro-020518-115714.
- 10.Raman R., Tharakaraman K., Sasisekharan V., Sasisekharan R. Glycan–protein interactions in viral pathogenesis. Curr. Opin. Struct. Biol. 2016;40:153–162. doi: 10.1016/j.sbi.2016.10.003.
- 11.Barreto-Bergter E., Figueiredo R.T. Fungal glycans and the innate immune recognition. Front. Cell. Infect. Microbiol. 2014;4 doi: 10.3389/fcimb.2014.00145.
- 12.Luis A.S., Hansson G.C. Intestinal mucus and their glycans: a habitat for thriving microbiota. Cell Host Microb. 2023;31:1087–1100. doi: 10.1016/j.chom.2023.05.026.
- 13.Gringhuis S.I., Den Dunnen J., Litjens M., Van Der Vlist M., Geijtenbeek T.B.H. Carbohydrate-specific signaling through the DC-SIGN signalosome tailors immunity to Mycobacterium tuberculosis, HIV-1 and Helicobacter pylori. Nat. Immunol. 2009;10:1081–1088. doi: 10.1038/ni.1778.
- 14.Poole J., Day C.J., Von Itzstein M., Paton J.C., Jennings M.P. Glycointeractions in bacterial pathogenesis. Nat. Rev. Microbiol. 2018;16:440–452. doi: 10.1038/s41579-018-0007-2.
- 15.Alemka A., Nothaft H., Zheng J., Szymanski C.M. N-glycosylation of campylobacter jejuni surface proteins promotes bacterial fitness. Infect. Immun. 2013;81:1674–1682. doi: 10.1128/IAI.01370-12.
- 16.Houliston R.S., Vinogradov E., Dzieciatkowska M., Li J., St. Michael F., Karwaski M.-F., Brochu D., Jarrell H.C., Parker C.T., Yuki N., Mandrell R.E., Gilbert M. Lipooligosaccharide of Campylobacter jejuni. J. Biolog. Chem. 2011;286:12361–12370. doi: 10.1074/jbc.M110.181750.
- 17.Day C.J., Semchenko E.A., Korolik V. Glycoconjugates play a key role in campylobacter jejuni infection: interactions between host and pathogen. Front. Cell. Inf. Microbio. 2012;2 doi: 10.3389/fcimb.2012.00009.
- 18.Abouelhadid S., Raynes J., Bui T., Cuccui J., Wren B.W. Characterization of posttranslationally modified multidrug efflux pumps reveals an unexpected link between glycosylation and antimicrobial resistance. mBio. 2020;11:e02604–e02620. doi: 10.1128/mBio.02604-20.
- 19.Schirm M., Schoenhofen I.C., Logan S.M., Waldron K.C., Thibault P. Identification of unusual bacterial glycosylation by tandem mass spectrometry analyses of intact proteins. Anal. Chem. 2005;77:7774–7782. doi: 10.1021/ac051316y.
- 20.Castric P., Cassels F.J., Carlson R.W. Structural characterization of the pseudomonas aeruginosa 1244 pilin glycan. J. Biolog. Chem. 2001;276:26479–26485. doi: 10.1074/jbc.M102685200.
- 21.Schirm M., Soo E.C., Aubry A.J., Austin J., Thibault P., Logan S.M. Structural, genetic and functional characterization of the flagellin glycosylation process in Helicobacter pylori. Mol. Microbiol. 2003;48:1579–1592. doi: 10.1046/j.1365-2958.2003.03527.x.
- 22.Knirel Y.A., Rietschel E.Th., Marre R., Zähringer U. The structure of the O-specific chain of Legionella pneumophila serogroup 1 lipopolysaccharide. Eur. J. Biochem. 1994;221:239–245. doi: 10.1111/j.1432-1033.1994.tb18734.x.
- 23.Mistou M.-Y., Sutcliffe I.C., van Sorge N.M. Bacterial glycobiology: rhamnose-containing cell wall polysaccharides in Gram-positive bacteria. FEMS Microbiol. Rev. 2016;40:464–479. doi: 10.1093/femsre/fuw006.
- 24.Lodowska J., Wolny D., Węglarz L. The sugar 3-deoxy-d-manno-oct-2-ulosonic acid (Kdo) as a characteristic component of bacterial endotoxin – a review of its biosynthesis, function, and placement in the lipopolysaccharide core. Can. J. Microbiol. 2013;59:645–655. doi: 10.1139/cjm-2013-0490.
- 25.Horzempa J., Held T.K., Cross A.S., Furst D., Qutyan M., Neely A.N., Castric P. Immunization with a Pseudomonas aeruginosa 1244 pilin provides O-antigen-specific protection. Clin. Vaccine Immunol. 2008;15:590–597. doi: 10.1128/CVI.00476-07.
- 26.Singh A., Bajar S., Devi A., Pant D. An overview on the recent developments in fungal cellulase production and their industrial applications. Bioresour. Technol. Rep. 2021;14
- 27.Amaro Bittencourt G., Porto De Souza Vandenberghe L., Valladares-Diestra K., Wedderhoff Herrmann L., Fátima Murawski De Mello A., Sarmiento Vásquez Z., Grace Karp S., Ricardo Soccol C. Soybean hulls as carbohydrate feedstock for medium to high-value biomolecule production in biorefineries: a review. Bioresour. Technol. 2021;339 doi: 10.1016/j.biortech.2021.125594.
- 28.Lam J.S., Taylor V.L., Islam S.T., Hao Y., Kocíncová D. Genetic and functional diversity of pseudomonas aeruginosa lipopolysaccharide. Front. Microbio. 2011;2 doi: 10.3389/fmicb.2011.00118.
- 29.Li J., Martin A., Cox A.D., Moxon E.R., Richards J.C., Thibault P. Methods in Enzymology. Elsevier; 2005. Mapping bacterial glycolipid complexity using capillary electrophoresis and electrospray mass spectrometry; pp. 369–397.
- 30.Mnich M.E., Van Dalen R., Van Sorge N.M. C-type lectin receptors in host defense against bacterial pathogens. Front. Cell. Infect. Microbiol. 2020;10:309. doi: 10.3389/fcimb.2020.00309.
- 31.Bateman A. UniProt: a worldwide hub of protein knowledge. Nucl. Acid. Res. 2019;47:D506–D515. doi: 10.1093/nar/gky1049.
- 32.Lombard V., Golaconda Ramulu H., Drula E., Coutinho P.M., Henrissat B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucl. Acid. Res. 2014;42:D490–D495. doi: 10.1093/nar/gkt1178.
- 33.Mistry J., Chuguransky S., Williams L., Qureshi M., Salazar G.A., Sonnhammer E.L.L., Tosatto S.C.E., Paladin L., Raj S., Richardson L.J., Finn R.D., Bateman A. Pfam: the protein families database in 2021. Nucl. Acid. Res. 2021;49:D412–D419. doi: 10.1093/nar/gkaa913.
- 34.Toukach P.V., Egorova K.S. Carbohydrate structure database merged from bacterial, archaeal, plant and fungal parts. Nucl. Acid. Res. 2016;44:D1229–D1236. doi: 10.1093/nar/gkv840.
- 35.Burley S.K., Berman H.M., Kleywegt G.J., Markley J.L., Nakamura H., Velankar S. Protein Data Bank (PDB): the single global macromolecular structure archive. Method. Mol. Biol. 2017;1607:627–641. doi: 10.1007/978-1-4939-7000-1_26.
- 36.Fujita A., Aoki N.P., Shinmachi D., Matsubara M., Tsuchiya S., Shiota M., Ono T., Yamada I., Aoki-Kinoshita K.F. The international glycan repository GlyTouCan version 30. Nucl. Acid. Res. 2021;49:D1529–D1533. doi: 10.1093/nar/gkaa947.
- 37.de Matos P., Alcántara R., Dekker A., Ennis M., Hastings J., Haug K., Spiteri I., Turner S., Steinbeck C. Chemical Entities of Biological Interest: an update. Nucl. Acid. Res. 2010;38:D249–D254. doi: 10.1093/nar/gkp886.
- 38.Bansal P., Morgat A., Axelsen K.B., Muthukrishnan V., Coudert E., Aimo L., Hyka-Nouspikel N., Gasteiger E., Kerhornou A., Neto T.B., Pozzato M., Blatter M.-C., Ignatchenko A., Redaschi N., Bridge A. Rhea, the reaction knowledgebase in 2022. Nucl. Acid. Res. 2022;50:D693–D700. doi: 10.1093/nar/gkab1016.
- 39.Tanaka K., Aoki-Kinoshita K.F., Kotera M., Sawaki H., Tsuchiya S., Fujita N., Shikanai T., Kato M., Kawano S., Yamada I., Narimatsu H. WURCS: the Web3 unique representation of carbohydrate structures. J. Chem. Inf. Model. 2014;54:1558–1566. doi: 10.1021/ci400571e.
- 40.Decker S., Mitra P., Melnik S. Framework for the semantic Web: an RDF tutorial. IEEE Internet Comput. 2000;4:68–73.
- 41.McGuinness D.L., Van Harmelen F. OWL web ontology language overview. W3C Recommend. 2004;10:2004.
- 42.Seaborne A., Prud'hommeaux E. SPARQL query language for RDF W3C Recommendation. World Wide Web Consort. 2008;15 January.
- 43.Lee S., Ono T., Aoki-Kinoshita K. RDFizing the biosynthetic pathway of E.coli O-antigen to enable semantic sharing of microbiology data. BMC Microbiol. 2021;21:325. doi: 10.1186/s12866-021-02384-y.
- 44.OpenLink Software, Virtuoso Universal Server, (2018).
- 45.Ranzinger R., Aoki-Kinoshita K.F., Campbell M.P., Kawano S., Lütteke T., Okuda S., Shinmachi D., Shikanai T., Sawaki H., Toukach P., Matsubara M., Yamada I., Narimatsu H. GlycoRDF: an ontology to standardize glycomics data in RDF. Bioinformatics. 2015;31:919–925. doi: 10.1093/bioinformatics/btu732.
- 46.Yamada I., Campbell M.P., Edwards N., Castro L.J., Lisacek F., Mariethoz J., Ono T., Ranzinger R., Shinmachi D., Aoki-Kinoshita K.F. The glycoconjugate ontology (GlycoCoO) for standardizing the annotation of glycoconjugate data and its application. Glycobiology. 2021;31:741–750. doi: 10.1093/glycob/cwab013.
- 47.Yao S., Zhao Z., Wang W., Liu X. Bifidobacterium Longum: protection against Inflammatory Bowel Disease. J. Immunol. Res. 2021:1–11. doi: 10.1155/2021/8030297. 2021.
- 48.Katoh T., Yamada C., Wallace M.D., Yoshida A., Gotoh A., Arai M., Maeshibu T., Kashima T., Hagenbeek A., Ojima M.N., Takada H., Sakanaka M., Shimizu H., Nishiyama K., Ashida H., Hirose J., Suarez-Diez M., Nishiyama M., Kimura I., Stubbs K.A., et al. A bacterial sulfoglycosidase highlights mucin O-glycan breakdown in the gut ecosystem. Nat. Chem. Biol. 2023;19:778–789. doi: 10.1038/s41589-023-01272-y.
- 49.Sakanaka M., Hansen M.E., Gotoh A., Katoh T., Yoshida K., Odamaki T., Yachi H., Sugiyama Y., Kurihara S., Hirose J., Urashima T., Xiao J., Kitaoka M., Fukiya S., Yokota A., Lo Leggio L., Abou Hachem M., Katayama T. Evolutionary adaptation in fucosyllactose uptake systems supports bifidobacteria-infant symbiosis. Sci. Adv. 2019;5:eaaw7696. doi: 10.1126/sciadv.aaw7696.
- 50.Maitra A., Munshi T., Healy J., Martin L.T., Vollmer W., Keep N.H., Bhakta S. Cell wall peptidoglycan in Mycobacterium tuberculosis : an Achilles’ heel for the TB-causing pathogen. FEMS Microbiol. Rev. 2019;43:548–575. doi: 10.1093/femsre/fuz016.
- 51.Catalão M.J., Filipe S.R., Pimentel M. Revisiting anti-tuberculosis therapeutic strategies that target the peptidoglycan structure and synthesis. Front. Microbiol. 2019;10:190. doi: 10.3389/fmicb.2019.00190.
- 52.Kam J.Y., Hortle E., Krogman E., Warner S.E., Wright K., Luo K., Cheng T., Manuneedhi Cholan P., Kikuchi K., Triccas J.A., Britton W.J., Johansen M.D., Kremer L., Oehlers S.H. Rough and smooth variants of Mycobacterium abscessus are differentially controlled by host immunity during chronic infection of adult zebrafish. Nat. Commun. 2022;13:952. doi: 10.1038/s41467-022-28638-5.
- 53.Basenko E., Pulman J., Shanmugasundram A., Harb O., Crouch K., Starns D., Warrenfeltz S., Aurrecoechea C., Stoeckert C., Kissinger J., Roos D., Hertz-Fowler C. FungiDB: an integrated bioinformatic resource for fungi and oomycetes. JoF. 2018;4:39. doi: 10.3390/jof4010039.
- 54.Tudorache T., Nyulas C., Noy N.F., Musen M.A. WebProtégé: a collaborative ontology editor and knowledge acquisition tool for the Web. Semant. Web. 2013;4:89–99. doi: 10.3233/SW-2012-0057.
- 55.Thornton K., Solbrig H., Stupp G.S., Labra Gayo J.E., Mietchen D., Prud'hommeaux E., Waagmeester A., et al. In: The Semantic Web. Hitzler P., Fernández M., Janowicz K., Zaveri A., Gray A.J.G., Lopez V., et al., editors. Springer International Publishing; Cham: 2019. Using shape expressions (ShEx) to share RDF data models and to guide curation with rigorous validation; pp. 606–620.
- 56.Varki A., Cummings R.D., Aebi M., Packer N.H., Seeberger P.H., Esko J.D., Stanley P., Hart G., Darvill A., Kinoshita T., Prestegard J.J., Schnaar R.L., Freeze H.H., Marth J.D., Bertozzi C.R., Etzler M.E., Frank M., Vliegenthart J.F., Lütteke T., Perez S., et al. Symbol nomenclature for graphical representations of glycans. Glycobiology. 2015;25:1323–1324. doi: 10.1093/glycob/cwv091.
- 57.Mubaiwa T.D., Semchenko E.A., Hartley-Tassell L.E., Day C.J., Jennings M.P., Seib K.L. The sweet side of the pathogenic Neisseria: the role of glycan interactions in colonisation and disease. Pathog. Dis. 2017:75. doi: 10.1093/femspd/ftx063.
- 58.Smedley J.G., Jewell E., Roguskie J., Horzempa J., Syboldt A., Stolz D.B., Castric P. Influence of pilin glycosylation on Pseudomonas aeruginosa 1244 Pilus Function. Infect. Immun. 2005;73:7922–7931. doi: 10.1128/IAI.73.12.7922-7931.2005.
- 59.Heneghan M.A., McCarthy C.F., Moran A.P. Relationship of blood group determinants on Helicobacter pylori lipopolysaccharide with host lewis phenotype and inflammatory response. Infect. Immun. 2000;68:937–941. doi: 10.1128/iai.68.2.937-941.2000.
- 60.Chmiela M. Structural modifications of Helicobacter pylori lipopolysaccharide: an idea for how to live in peace. WJG. 2014;20:9882. doi: 10.3748/wjg.v20.i29.9882.
- 61.Salah Ud-Din A.I.M., Roujeinikova A. Flagellin glycosylation with pseudaminic acid in Campylobacter and Helicobacter: prospects for development of novel therapeutics. Cell. Mol. Life Sci. 2018;75:1163–1178. doi: 10.1007/s00018-017-2696-5.
- 62.Guerry P., Szymanski C.M. Campylobacter sugars sticking out. Trend. Microbiol. 2008;16:428–435. doi: 10.1016/j.tim.2008.07.002.
- 63.Sandai D., Tabana Y.M., Ouweini A.E., Ayodeji I.O. Resistance of Candida albicans biofilms to drugs and the host immune system. Jundishap. J. Microbiol. 2016;9 doi: 10.5812/jjm.37385.
- 64.Kaur J., Nobile C.J. Antifungal drug-resistance mechanisms in Candida biofilms. Curr. Opin. Microbiol. 2023;71 doi: 10.1016/j.mib.2022.102237.
- 65.Gong Y., Qin S., Dai L., Tian Z. The glycosylation in SARS-CoV-2 and its receptor ACE2. Sig. Transduct. Target Ther. 2021;6:396. doi: 10.1038/s41392-021-00809-8.
- 66.Crouch L.I., Rodrigues C.S., Bakshani C.R., Tavares-Gomes L., Gaifem J., Pinho S.S. The role of glycans in health and disease: regulators of the interaction between gut microbiota and host immune system. Semin. Immunol. 2024;73 doi: 10.1016/j.smim.2024.101891.
- 67.Yu R.K., Usuki S., Ariga T. Ganglioside molecular mimicry and its pathological roles in Guillain-Barré syndrome and related diseases. Infect. Immun. 2006;74:6517–6527. doi: 10.1128/IAI.00967-06.
- 68.Jia L., Sha S., Yang S., Taj A., Ma Y. Effect of protein O-mannosyltransferase (MSMEG_5447) on M smegmatis and its survival in macrophages. Front. Microbiol. 2021;12 doi: 10.3389/fmicb.2021.657726.
- 69.Alves I., Fernandes Â., Santos-Pereira B., Azevedo C.M., Pinho S.S. Glycans as a key factor in self and nonself discrimination: impact on the breach of immune tolerance. FEBS Lett. 2022;596:1485–1502. doi: 10.1002/1873-3468.14347.
- 70.Breiman A., Ruvoën-Clouet N., Deleers M., Beauvais T., Jouand N., Rocher J., Bovin N., Labarrière N., El Kenz H., Le Pendu J. Low levels of natural anti-α-N-acetylgalactosamine (Tn) antibodies are associated with COVID-19. Front. Microbiol. 2021;12 doi: 10.3389/fmicb.2021.641460.
- 71.Spillings B.L., Day C.J., Garcia-Minambres A., Aggarwal A., Condon N.D., Haselhorst T., Purcell D.F.J., Turville S.G., Stow J.L., Jennings M.P., Mak J. Host glycocalyx captures HIV proximal to the cell surface via oligomannose-GlcNAc glycan-glycan interactions to support viral entry. Cell Rep. 2022;38 doi: 10.1016/j.celrep.2022.110296.
- 72.Selisana S.M.G., Chen X., Mahfudhoh E., Bowolaksono A., Rozaliyani A., Orihara K., Kajiwara S. Alteration of β-glucan in the emerging fungal pathogen Candida auris leads to immune evasion and increased virulence. Med. Microbiol. Immunol. 2024;213:13. doi: 10.1007/s00430-024-00795-y.