ABSTRACT
Natural microbial communities, with their vast diversity and complexity, are among the richest sources of untapped novel enzymes. Identifying novel enzymes can be challenging because microbiomes often lack clear, measurable phenotypes, unlike laboratory cultures where enzymatic activity can be linked to genetic elements. These constraints have left much of the functional diversity within microbiomes inaccessible to enzyme discovery efforts. Here, we present a genotype–phenotype association framework that operates directly on microbial communities for enzyme discovery. To this end, we developed a ‘bait‐and‐switch’ treatment strategy that generates measurable dual phenotypes directly within intact microbiomes. Using soil microbiomes as a test system, we applied chitin‐rich compost as ‘bait’ to enrich chitin‐degrading organisms, followed by glucose addition to functionally ‘switch’ the community. This treatment produced a distinct phenotypic signature: the prevalence of known chitin degradation genes increased during the bait phase, and their transcripts were rapidly downregulated during the switch phase. By performing hypothesis‐free association analysis of protein domains with this dual phenotype, we identified glycoside hydrolase family 18 as the most significantly associated protein domain. Experimental validation confirmed chitinase activity in 63% of tested enzymes, including candidates from unculturable bacteria and those with previously uncharacterized domain architectures. This species‐independent, reference‐free approach to discover novel enzymes has broad applications in microbiome engineering, biopolymer processing and systems biology, offering a generalizable strategy for functional gene discovery in complex microbial systems.
Keywords: enzyme discovery, metagenomics, metatranscriptomics, microbial communities, microbiome, phenotypic association
A proposed workflow for identifying polymer degrading enzymes from intact microbiomes. This strategy uses a ‘bait and switch’ microcosm pulse experiment, genotype–phenotype association analyses and experimental validation of candidates to identify novel chitin degrading enzymes from soil microbial communities.

1. Introduction
Complex polymers, both naturally occurring and artificially synthesized, comprise important substrates in natural systems, providing surfaces for biofilm formation as well as a source of critically important nutrients (Mendes et al. 2013; Amaral‐Zettler et al. 2020; Keller‐Costa et al. 2022; Rillig et al. 2024). Cellulose, starch, lignin and chitin derivatives are a few examples of natural polymers with broad environmental distribution. Chitin is the second most abundant polymer found in nature and is an important component in both carbon and nitrogen biogeochemical cycling. Despite its abundance, this polysaccharide does not accumulate naturally, suggesting efficient turnover and degradation (Gooday 1990; Beier and Bertilsson 2013). Microbial communities that have evolved alongside such polymers in their natural environments produce enzymes that degrade these substrates into oligomers and/or monomers that can be assimilated by the cell and metabolized.
Polymer degrading enzymes have significant application in material sciences, biotechnology and remediation, depending on the polymer of interest (Li et al. 2018; Delacuvellerie et al. 2021). However, accessing enzymatic diversity remains challenging, since conventional culture‐based methods are estimated to recover ca. 1%–2% of total environmental microorganisms under standard laboratory conditions (Hofer 2018; Hahn et al. 2019). Advances in molecular biology, metagenomics, metatranscriptomics and computational approaches have greatly expanded our capacity to identify and characterize enzymes directly from environmental samples, bypassing the need for cultivation (Brumfield et al. 2020). Nevertheless, these strategies remain largely dependent on existing enzyme databases, resulting in discoveries constrained to those enzymes with similarity to the already known, creating a ‘circular discovery loop’.
In contrast, microcosm pulse experiments offer a powerful approach to enzyme discovery by studying microbial communities under controlled, yet ecologically relevant conditions. These systems essentially preserve community structure while allowing targeted perturbation, such as nutrient amendment, to drive compositional and functional changes (Cao et al. 2021). For example, crude oil addition in freshwater sediment microcosm experiments has been shown to stimulate microbial succession, selecting microbes containing hydrocarbon‐degrading enzymes (Howland et al. 2024). Such experiments highlight how dynamic microbial response can potentially be used for novel protein discovery without the need for isolation and cultivation. However, due to the complexity of microbial community composition, determining the direct link between a community‐level shift and the production of specific enzymes remains challenging and relies heavily on similarity to known proteins to prioritize which targets to validate experimentally.
In this study, a workflow was developed to identify novel polymer‐degrading enzymes produced by soil microcosms, by employing a bait‐and‐switch pulse strategy coupled with phenotypic association analysis. As proof of concept, we focused on chitin‐degrading enzymes in salt marsh soils. The experimental design involved ‘baiting’ a microcosm with shell compost, a rich source of chitin, followed by a rapid ‘switch’ to an alternative carbon source. Metagenomic Genome–Phenome Association (MetaGPA) (Yang et al. 2021) was used to link protein domains with phenotypic traits, measured through community succession dynamics and transcriptional responses to bait‐and‐switch perturbation. With this approach, we successfully identified 190 known and putative chitinase genes, of which 27 were expressed for activity screening and 17 (ca. 63%) experimentally validated. Importantly, this framework also enables tracking of global microbial community responses to carbon amendment (e.g., chitin, glucose). These results demonstrate the feasibility of combining a bait‐and‐switch experimental design with phenotype association analyses to accelerate enzyme discovery directly from environmental microbial communities.
2. Experimental Procedure
2.1. Sample Site
Soils were collected from the Great Marsh, Ipswich MA, USA (42.6672° N, 70.8111° W), which spans much of the North Shore of Massachusetts. The sample area consists of an open marsh drained by the Labor‐in‐Vain Creek and experiences flooding during high tide and drainage during low tide. The site is largely dominated by high marsh (greater than 1.3 m) and experiences semidiurnal tides with an average mean range of 2.5–2.9 m (Vallino et al. 2005; Forbrich et al. 2018). The marsh is dominated by the salt meadow cordgrass, Sporobolus pumilus (Forbrich et al. 2018; Farron et al. 2020), and sediments are carbon‐rich compared to other salt marshes in the eastern United States (Forbrich et al. 2018).
Sampling was completed on 2 October 2023 at 7:15 am, during low tide. At the time of sampling, the pH of the soil was about 6, as approximated with VWR pH‐Test strips (BDH35309.606). Approximately 2 kg of soil was collected at 0–10 cm depth, adjacent to S. pumilus growth, which provides a suitable habitat for chitin‐rich organisms such as crustaceans and molluscs (Lonard et al. 2010). Samples were stored in sterile high‐density polyethylene (HDPE) bottles during transport.
2.2. Microcosm Experiments
Collected soil was homogenized and sieved through a 2.36 mm sieve. Stones, sticks, root nodules and other litter were removed. Two microcosms were set up, each containing 600 g of homogenized soil in a sterile glass container. The first microcosm served as a control. The second microcosm was amended with 10% w/w sterile, ground, commercially available crab and lobster shell compost, which served as the chitin amendment (treatment). Shell compost was selected rather than purified or synthetically derived chitin to simulate chitin substrates found naturally in the environment. Purified and chemically derived chitin has been shown to be less preferred by microbes for metabolism and degradation (Jacquiod et al. 2013; McClure et al. 2022). Both microcosms were mixed thoroughly with sterile spatulas to ensure homogenization after addition. The experimental design and sampling scheme are illustrated in Figure 1.
FIGURE 1.

Overview of Experimental Design, Sampling Scheme, and Analysis Workflows. (A) Microcosm pulse experiments were designed using the bait and switch model. At each designated timepoint in the sampling scheme, two grams of soil were collected in duplicate and processed for both RNA and DNA sequencing. (B) Three association analyses were completed using the MetaGPA pipeline: DNA reads, RNA reads, and Normalized Transcriptional Changes (RNA:DNA ratios, see Experimental Procedure). This figure was generated with Biorender.
For sample collection, microcosms were mixed with sterile spatulas and destructively sampled at specified time points (0, 1, 4, 8, 24, 48, 49 and 52 h). At each time point, two‐gram aliquots were collected in duplicate, flash frozen in dry ice‐ethanol slurry, and stored at −80°C until further analysis. At 48 h, another round of sampling was completed prior to glucose or water addition. The remaining soil in both the control and chitin amended microcosms was split into two new, sterile containers each containing half of the original soil (~250 g, accounting for soil loss from previous sampling points), thereby providing two microcosms with the control soils and two microcosms with the chitin amended soils. For one of each microcosm group (treatment or control), a sterile 50% w/v glucose solution was added to obtain 10% w/w glucose. For the remaining microcosm groups (one treatment, one control), the same volume of sterile, deionized water was added to control for soil rehydration. All four microcosms were thoroughly mixed with sterile spatulas, and the sampling scheme continued until the completion of the experiment.
2.3. Library Preparation
RNA for metatranscriptomic sequencing and analysis was extracted and prepared as described below. RNeasy PowerSoil Total RNA kits (12866, Qiagen, Hilden, Germany) were used to extract total RNA from 2 g of soil for each replicate according to the manufacturer's instructions, with some modifications. The lysis/bead‐beating step was increased to 20 min. During the isopropanol precipitation, samples were incubated at −20°C for 45 min. To remove any residual polyphenolic or humic acids that would interfere with downstream library preparation, the Zymo OneStep PCR Inhibitor Removal Kit (D6030, Zymo Research Corp, Irvine, CA, USA) was used to treat and concentrate purified RNA according to the manufacturer's instructions. RNA concentration and purity ratios were determined using a NanoDrop One (Thermo Scientific, Waltham, MA, USA). Samples were stored at −80°C until library preparation was completed. For each sample, around 800 ng of total RNA was used for library preparation. Ribosomal depletion was completed using the NEBNext rRNA Depletion Kit for Bacteria (E7850, New England Biolabs, Ipswich, MA, USA) according to the manufacturer's instructions. Libraries with 200 bp inserts were prepared with the NEBNext Ultra II Directional RNA Library Prep Kit (E7765, New England Biolabs, Ipswich, MA, USA). NEBNext Multiplex Oligos for Illumina (E7500, New England Biolabs, Ipswich, MA, USA) were used for barcoding and PCR enrichment. Library concentration, size and quality were assessed using a TapeStation 4200 with a D1000 High Sensitivity Screen Tape (5067‐5584, Agilent, Lexington, MA, USA).
DNA for paired metagenomic analyses was also extracted and prepared for sequencing. DNeasy PowerSoil Pro Kits (47016, Qiagen, Hilden, Germany) were used to extract total DNA from 0.25 g of soil per replicate according to the manufacturer's instructions. As with RNA preparation, OneStep PCR Inhibitor Removal Kits (D6030, Zymo Research Corp, Irvine, CA, USA) were used to treat and concentrate DNA samples according to the manufacturer's instructions. DNA concentration and purity ratios were determined using a NanoDrop One (Thermo Scientific, Waltham, MA, USA). Samples were stored at −20°C until library preparation was completed. For each sample preparation, 50 ng of purified DNA was sheared to generate 200 bp inserts using a Covaris SR Focused Ultrasonicator (Covaris, Woburn, MA, USA). Libraries were generated using the NEBNext Ultra II DNA Library Prep Kit for Illumina (E7645, New England Biolabs, Ipswich, MA, USA), according to the manufacturer's instructions. NEBNext Multiplex Oligos for Illumina (E7500, New England Biolabs, Ipswich, MA, USA) were used for barcoding and PCR enrichment. Library concentration, size, and quality were assessed using a TapeStation 4200 with a D1000 High Sensitivity Screen Tape (5067‐5584, Agilent, Lexington, MA, USA). For both DNA and RNA, libraries were sequenced at the New England Biolabs Sequencing Core, PE 2X100 on the NextSeq or NovaSeq S2/SP targeting 50 and 100 million reads, respectively (Illumina, San Diego, CA, USA).
2.4. Bioinformatic Analysis
Relative changes in community composition in response to treatment were determined via metagenomic k‐mer profiling using Kraken2 under default settings, using the Standard Collection database built for 100‐mers (accessed October 2024) (Lu et al. 2022). Alpha diversity, measured using Shannon's Diversity Index, and beta diversity, measured by Bray–Curtis Dissimilarity, were also calculated for each metagenomic sample using KrakenTools (Lu et al. 2022). Calculations were completed at the genus level using k‐mer profiles to assess shifts in community composition. To assess the frequency of known chitinase or auxiliary genes involved in chitin degradation in response to treatment, the following analyses were completed. Reads from replicates were combined and downsampled to 30 million reads using seqtk sample (v1.4‐r122). Assembly was performed with metaSPAdes (v3.15.3) (Nurk et al. 2017). Auxiliary genes involved in chitin degradation included those encoding lytic polysaccharide monooxygenase (LPMO), chitin disaccharide deacetylase (CDD), and metalloprotease. Representative chitinases and auxiliary proteins with experimentally validated activity were retrieved from previous studies (Kim et al. 2007; Vaaje‐Kolstad et al. 2012; Valenzuela et al. 2017; Grifoll‐Romero et al. 2018; Oyeleye and Normi 2018; Zhang et al. 2023; Yao et al. 2023) and corresponding amino acid sequences were obtained from the Protein Data Bank (PDB) database (accession numbers: 1E15, 1WVU, 2CJL, 2Z37, 3CQL, 3HBD, 3G6L, 3IWR, 3N11, 3ALF, 3AQU, 3OA5, 3SIM, 2Y8V, 4AXN). Homologues of these chitinase and auxiliary genes were identified in the microcosm metagenomic and metatranscriptomic datasets from both chitin‐treated and control samples at each time point using tblastn (BLAST+ v2.16.0, default parameters). Hits with e‐values < 0.001 were combined and merged using the bedtools merge function (v2.27.1). For each dataset, the total number of hits and the corresponding read coverage were recorded.
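The two diversity metrics named above can be illustrated with a minimal Python sketch. It is not the KrakenTools implementation: the function names and the genus-level counts are hypothetical, and the natural logarithm is an assumption about the base used for Shannon's index.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts.

    The natural logarithm is an assumption; other bases only rescale H'.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors.

    Both vectors must list the same taxa in the same order.
    """
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0


# Hypothetical genus-level read counts (illustrative only, not study data).
control = [40, 30, 20, 10]   # fairly even community
treated = [85, 5, 5, 5]      # community dominated by one genus

print(shannon_index(control), shannon_index(treated))
print(bray_curtis(control, treated))
```

In this toy example the dominated community has the lower Shannon index, which is the kind of pattern a selective enrichment would be expected to produce.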
Phenotype association analyses were completed using the core MetaGPA pipeline as previously described (Yang et al. 2021) on metagenomes (DNA), metatranscriptomes (RNA), and normalized transcriptional changes (RNA:DNA ratios). Analyses of each dataset were completed to determine whether association at the gene level (representing changes in community composition), transcript level (representing changes in gene expression), or normalized transcriptional changes (representing changes in both community composition and gene expression) would provide the strongest signal for domain associations (Figure 1B). The MetaGPA pipeline can be accessed at https://github.com/ge‐a/metaGPA2.
Phenotype association analyses using MetaGPA were completed with the following modifications. For both metagenomic and metatranscriptomic samples, technical duplicates were used to generate assemblies for treatment and control samples. For each time point, reads were assembled using metaSPAdes (metagenomes) or directional rnaSPAdes (metatranscriptomes) under default settings (v.3.15.3) (Nurk et al. 2017; Bushmanova et al. 2019). Assemblies from the treatment and control were combined and dereplicated at 99% identity and 90% alignment length using cd‐hit (v4.8.1) (Fu et al. 2012) to generate a non‐redundant list of contiguous sequences (contigs) or transcripts observed in either the control or treatment, or both. Replicate read files from the control and treatment were individually mapped to the non‐redundant, combined assembly using bwa‐aln (v0.7.15) (Li and Durbin 2009). Read counts for each replicate per experimental group were averaged and used to calculate ratios between the treatment and control to determine if contigs were enriched or depleted in treatment samples. If the ratio of reads mapped between the treatment and control was greater than 3.0, the contig was considered enriched. A cutoff of 3.0 was chosen in accordance with the original metaGPA pipeline. This selection designates roughly 2% of contigs as enriched, consistent with 2%–3.5% of the contigs found in the original metaGPA study (Yang et al. 2021). Each contig was annotated for protein domains using the Protein Family Databases (PFAM, v35.0) (Mistry et al. 2021) with Hidden Markov Modelling (HMM) via hmmer (v3.3.2). To ensure accurate annotations, PFAM assignments were kept for further analysis if the annotation e‐value was ≤ 1.00e‐05. Fisher's exact test was used to determine if the number of contigs observed in the treatment versus control was statistically significant. If contigs or transcripts were significantly abundant or depleted in the treatment, they were considered enriched or depleted in the chitin amended soils, respectively.
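The enrichment call and the domain-level test described above can be sketched as follows. This is a simplified illustration rather than the MetaGPA code: the function names are hypothetical, the pseudocount is an assumption added to avoid division by zero, and the one-sided form of Fisher's exact test is an assumption because the text does not state the alternative hypothesis.

```python
from math import comb


def classify_contig(treat_counts, ctrl_counts, cutoff=3.0, pseudocount=1.0):
    """Average replicate read counts and label a contig by its treatment/control ratio.

    The pseudocount is an assumption (not stated in the text).
    """
    treat_mean = sum(treat_counts) / len(treat_counts)
    ctrl_mean = sum(ctrl_counts) / len(ctrl_counts)
    ratio = (treat_mean + pseudocount) / (ctrl_mean + pseudocount)
    label = "enriched" if ratio > cutoff else "not_enriched"
    return label, ratio


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact test, P(X >= a), for the 2x2 table [[a, b], [c, d]].

    a: enriched contigs carrying the domain      b: enriched contigs without it
    c: non-enriched contigs carrying the domain  d: non-enriched contigs without it
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denominator = comb(n, row1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    return sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    ) / denominator


# Hypothetical replicate read counts for one contig (illustrative only).
label, ratio = classify_contig([30, 34], [8, 6])
print(label, ratio)
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for real contig counts a statistics library would normally be used instead.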
To assess whether significant associations observed in the classic MetaGPA workflow were a result of pure change in transcript abundance, community succession selecting for chitin degraders or both, additional analyses were completed. First, both metagenomic and metatranscriptomic reads were mapped to assembled transcripts respective of sampling time. Each sample was normalized for sequencing depth, and a ratio of normalized RNA to DNA was generated for each contig. This ratio represents relative expression effort, correcting for differences in community composition across samples and variation in sequencing depth.
Ratios of relative expression effort between treatment and control samples were calculated to determine which genes were upregulated or downregulated in treatment samples compared to control. These normalized transcriptional changes were used to recalculate enrichment scores and determine which domains were differentially associated across different samples.
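A minimal sketch of the normalized transcriptional change follows, assuming counts-per-million as the depth normalization and a small pseudocount to keep ratios finite. Both are assumptions, as are the function names; the text only states that samples were normalized for sequencing depth.

```python
def cpm(count, library_size):
    """Counts per million mapped reads (assumed depth normalization)."""
    return count * 1_000_000 / library_size


def expression_effort(rna_count, rna_depth, dna_count, dna_depth, pseudo=0.5):
    """Depth-normalized RNA:DNA ratio for one contig in one sample.

    The pseudocount is an assumption to avoid division by zero.
    """
    return (cpm(rna_count, rna_depth) + pseudo) / (cpm(dna_count, dna_depth) + pseudo)


def normalized_transcriptional_change(treat, ctrl, pseudo=0.5):
    """Treatment/control ratio of expression effort.

    Each argument is a tuple (rna_count, rna_depth, dna_count, dna_depth).
    """
    return expression_effort(*treat, pseudo=pseudo) / expression_effort(*ctrl, pseudo=pseudo)


# Hypothetical counts and library sizes (illustrative only).
treatment = (400, 2_000_000, 100, 1_000_000)
control = (100, 1_000_000, 100, 1_000_000)
print(normalized_transcriptional_change(treatment, control))
```

Dividing RNA by DNA before comparing treatment with control means a contig that is simply more abundant in the community, but not more highly expressed per gene copy, does not score as upregulated.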
2.5. Identifying Associated Candidate Domains for Screening
To integrate information from both the bait and switch to identify candidate chitinolytic domains, a scored phenotype association method was used. The scored phenotypes were applied to domain association results using metatranscriptomes from 48 h after chitin addition (bait, Table S1) and 1 h after glucose addition (switch, Table S2). Combined Phenotype Scores (CPS) were assigned based on the following criteria. A contig was added to the list of relevant contigs only if its ratio was 3 or above during the bait and 0.33 or below during the switch. We used the binomial distribution to identify significant enrichment of PFAM domains in the relevant contig list relative to the total contig list, resulting in a list of domains associated with the desired phenotypes in both the bait and the switch (Table 1; Table S4). A conceptual overview of the Combined Phenotype Score and selection of candidates for screening is displayed in Figure 2.
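The selection rule and the binomial enrichment test can be sketched in Python as below. The parameterization of the binomial test is an assumption: here the number of trials is the total number of genes carrying a domain, successes are those genes on relevant contigs, and the success probability is the overall fraction of relevant contigs. All names are hypothetical.

```python
from math import exp, lgamma, log


def relevant_contigs(ratios, bait_min=3.0, switch_max=0.33):
    """Select contigs up in the bait and down in the switch.

    ratios: dict mapping contig id -> (bait_ratio, switch_ratio).
    """
    return {
        contig
        for contig, (bait, switch) in ratios.items()
        if bait >= bait_min and switch <= switch_max
    }


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are computed in log space so that large n does not overflow.
    """
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p)
        )
        total += exp(log_pmf)
    return min(1.0, total)


def domain_pvalue(domain_relevant, domain_total, n_relevant, n_contigs):
    """Enrichment p-value for one domain under the assumed parameterization."""
    background = n_relevant / n_contigs
    return binomial_upper_tail(domain_relevant, domain_total, background)


# Hypothetical ratios (illustrative only): only c1 meets both criteria.
example = {"c1": (3.0, 0.33), "c2": (5.0, 0.5), "c3": (2.9, 0.1)}
print(relevant_contigs(example))
```

With tens of thousands of contigs, very small p-values underflow to zero in double precision, which would be one possible explanation for p-values being reported as 0.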
TABLE 1.
Top 20 associated PFAM domains from the combined phenotype association.
| PFAM ID | PFAM annotation | PFAM description | Predicted number of genes with significant combined phenotype score | Total number of predicted genes | p |
|---|---|---|---|---|---|
| PF00704.31 | Glyco_hydro_18 | beta‐N‐acetylglucosaminidase, beta‐hexosaminase, lacto‐N‐biosidase | 35 | 344 | 0 |
| PF09547.13 | Stage IV sporulation protein | Stage IV sporulation protein A, ATPase Domains | 59 | 316 | 0 |
| PF03401.17 | TctC | Tripartite tricarboxylate transporter family receptor | 123 | 4626 | 0 |
| PF02728.19 | Cu_amine_oxidN3 | Copper amine oxidase N3 domain | 13 | 38 | 0 |
| PF00378.23 | ECH_1 | Enoyl‐CoA hydratase/isomerase | 127 | 6824 | 0 |
| PF00171.25 | Aldedh | Aldehyde dehydrogenase family | 273 | 13,096 | 0 |
| PF02274.20 | ADI | Arginine deiminase | 25 | 407 | 0 |
| PF04957.15 | RMF | Ribosome modulation factor | 21 | 141 | 0 |
| PF02653.19 | BPD_transp_2 | Branched chain amino acid transport system/permease | 119 | 6068 | 0 |
| PF00267.24 | Porin_1 | gram negative porin | 58 | 1635 | 0 |
| PF08298.14 | AAA_PrkA | PrkA AAA in serine kinase | 65 | 2821 | 0 |
| PF03446.18 | NAD_binding_2 | NAD binding of 6‐phosphogluconate dehydrogenase | 103 | 7084 | 0 |
| PF00501.31 | AMP‐binding | AMP‐binding enzyme | 261 | 14,263 | 0 |
| PF04442.17 | CtaG_Cox11 | Cytochrome C oxidase assembly protein | 23 | 284 | 0 |
| PF02894.20 | GFO_IDH_MocA_C | Oxidoreductase family, C‐terminal alpha/beta | 64 | 2528 | 0 |
| PF15984.8 | Collagen_mid | bacterial collagen, middle region | 26 | 93 | 0 |
| PF13609.9 | Porin_4 | gram negative porin | 119 | 5361 | 0 |
| PF00497.23 | SBP_bac_3 | Bacterial extracellular solute‐binding protein | 94 | 5651 | 0 |
| PF01094.31 | ANF_receptor | Receptor family ligand binding region | 161 | 4766 | 0 |
| PF19077.3 | Big_13 | Bacterial Ig‐like domain | 63 | 1811 | 0 |
FIGURE 2.

Conceptual overview of the Combined Phenotype Score for candidate selection. (A) Assembled contigs are assessed for transcriptional response to the bait and switch (RNA) as well as their relative abundance (DNA). These metrics are used to determine the Combined Phenotype Score (CPS, see Experimental Procedure) to classify contigs into two groups. (B) All assembled contigs undergo PFAM domain annotation. Genes annotated with PFAM domain(s) found significantly enriched in the targeted phenotype group are considered candidates for chitinase activity (green domains). (C) Candidate genes (containing the green domain) are selected for enzymatic validation. This figure was generated with Biorender.
Associated domains were further filtered based on annotation to identify candidate chitinase sequences. Domains annotated as glycoside hydrolases (GHs) were kept for further analysis. GHs were selected for experimental validation because these domains, which are represented by over 194 classes, likely carry out the core catalytic function of chitinase enzymes (hydrolyzing glycosidic bonds within chitin). Open reading frames from transcripts from the 48‐h assembly that contained domains of interest were additionally used to build sequence similarity networks using the EFI Enzyme Similarity Tool v.2025_03 (Zallot et al. 2019; Oberg et al. 2023). Edges were drawn between nodes with an alignment score threshold of at least 35%, and networks were visualized using Cytoscape v. 3.10.3 (Lopes et al. 2010).
2.6. Expression and Activity of Purified Chitinases
To validate putatively identified chitinases, 27 predicted protein sequences were screened for activity. These sequences were selected based on the following criteria: sequences needed to be complete open reading frames (ORFs) and annotated to contain a glycoside hydrolase 18 (GH18) domain. To determine if these identified sequences exhibited novel domain arrangements or structure, AlphaFold2 v.2.3.0 models were built for each predicted protein, and the Protein Data Bank (PDB) (accessed August 2025) was queried with these models using FoldSeek v.10 under default parameters. Additionally, sequences were aligned to the 444 experimentally characterized protein sequences containing GH18 domains found in the CAZy database (accessed 2 September 2025).
Gene blocks containing candidate chitinase sequences were assembled into the pET28a(+) backbone using the NEBuilder HiFi DNA Assembly Kit (E2621, New England Biolabs, Ipswich, MA, USA). Candidate chitinases were then synthesized in vitro. Approximately 1 μg of plasmid DNA was added to an NEBExpress Cell‐free E. coli Protein Synthesis System reaction (E5360, New England Biolabs, Ipswich, MA, USA), scaled for a total reaction volume of 100 μL. Each reaction was supplemented with PURExpress Disulfide Bond Enhancer (E6820, New England Biolabs, Ipswich, MA, USA) to ensure proper folding of chitinase proteins, which contain disulfide bonds. The in vitro transcription–translation (IVTT) reaction was incubated at 25°C, with vigorous shaking for 24 h. Expressed candidate chitinases were purified via His‐tag affinity using NEBExpress Ni‐NTA Magnetic Beads (S1423, New England Biolabs, Ipswich, MA, USA). Proteins were eluted with elution buffer consisting of 20 mM sodium phosphate, 300 mM NaCl, 300 mM imidazole, pH 7.4. Protein expression was visualized via SDS‐PAGE (Figure S2).
Purified proteins were screened for chitinase activity using the Fluorometric Chitinase Assay Kit (CS1030, Sigma Aldrich, St. Louis, MO, USA) according to the manufacturer's instructions. The assay relies on enzymatic hydrolysis to release 4‐methylumbelliferone (4MU) from labelled substrates to screen for exochitinase (4‐Methylumbelliferyl N,N′‐diacetyl‐β‐D‐chitobioside), endochitinase (4‐Methylumbelliferyl β‐D‐N,N′,N′′‐triacetylchitotriose) and β‐N‐acetylglucosaminidase (4‐Methylumbelliferyl N‐acetyl‐β‐D‐glucosaminide) activity. All three substrates were screened for each protein to fully characterize the chitinolytic activity of the candidate proteins. In reference to the assay positive control chitinase from Trichoderma viride (C6242, Sigma Aldrich, St. Louis, MO, USA), relative levels of chitinolytic activity were determined for each protein sequence and substrate screened. Measurements of 4MU (ng) released below 0 were designated as not detected. The following designations were assigned based on the measured release of 4MU—low: greater than 0 ng and less than or equal to 10 ng, medium: greater than 10 ng and less than or equal to 100 ng, high: greater than 100 ng and less than or equal to 500 ng, and very high: greater than 500 ng. The assay control chitinase exhibited high activity for each substrate tested. Quantification of 4MU release (ng) for each candidate and substrate screened is reported in replicate in Table S3.
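The activity designations above amount to a simple threshold lookup, sketched here in Python. The function name is hypothetical, and treating a reading of exactly 0 ng as not detected is an assumption, since the text assigns "not detected" to values below 0 and "low" to values greater than 0.

```python
def activity_level(ng_4mu):
    """Map 4MU released (ng) to the qualitative activity designation.

    A reading of exactly 0 ng is treated as not detected (assumption).
    """
    if ng_4mu <= 0:
        return "not detected"
    if ng_4mu <= 10:
        return "low"
    if ng_4mu <= 100:
        return "medium"
    if ng_4mu <= 500:
        return "high"
    return "very high"


# Hypothetical readings (illustrative only, not values from Table S3).
for reading in (-2.0, 4.5, 10.0, 250.0, 730.0):
    print(reading, activity_level(reading))
```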
3. Results
3.1. Bait and Switch Strategy
Pulse perturbation, namely substrate addition intended to transiently alter community composition, was applied to salt marsh soil microcosms to facilitate identification of novel chitin‐degrading enzymes by direct genotype association. To this end, we hypothesized that the addition of shell compost (predominantly chitin) would select for microorganisms utilizing chitin as a carbon source. Previous studies have shown that chitin serves as an effective selective substrate for chitinolytic microbes (Brzezinska et al. 2013) and that chitin‐selective enrichment strategies have been successfully employed to favour chitinolytic microorganisms (Meunier et al. 2024).
The initial enrichment step, the ‘bait’, was designed to shift the microbial community composition toward enrichment of chitin‐degrading taxa, as well as organisms benefiting indirectly from chitin as a nutrient source. Under the induced selective pressure, an increase in relative abundance of genes encoding chitin‐degrading enzymes would be expected along with an increased likelihood of detecting a larger diversity of enzymes by high‐throughput DNA sequencing. Additionally, transcriptional activation of these genes should occur, resulting in increased relative transcript levels via high‐throughput RNA sequencing following chitin amendment.
Following the bait, a ‘switch’ strategy is applied to the same microcosms. This involves the introduction of a more readily metabolized carbon source, namely glucose, to stimulate downregulation of genes involved in chitin‐degrading pathways. The term ‘switch’ refers to the anticipated shift in transcriptional activity from chitin utilization in response to an alternate carbon substrate. Because transcriptional responses would be expected to occur more rapidly than community composition change, the switch strategy required sampling within hours following glucose amendment to capture any immediate changes in gene expression.
All experiments were performed in biological duplicates with controls that were not exposed to the bait and/or switch. The switch control comprised addition of the equivalent volume of ultrapure water as a substitute for the glucose solution. The overall experimental design is illustrated in Figure 1A. Samples were collected at 0, 1, 4, 8, 24 and 48 h after the bait, and at 1 and 4 h after the switch (49 and 52 h total elapsed time). DNA and RNA were extracted, and metagenomic and metatranscriptomic libraries were generated for all samples and analysed using Illumina sequencing.
3.2. Chitin Bait Selectively Enriches for Chitin Degrading Taxa
To assess the effect of the bait and switch strategy on microbial community composition, alpha and beta diversity were calculated, employing read k‐mer profiling of all metagenomic samples. Alpha diversity, or species diversity within a sample, was similar at the genus level across all samples within the first 8 h. At 24 h, alpha diversity decreased in chitin‐treated samples (Figure 3A), suggesting a selective enrichment of a limited subset of bacteria that possess the enzymatic machinery required for chitin degradation.
FIGURE 3.

Soil microbiome community changes during the experimental time course. (A) Alpha diversity, as measured by Shannon's Diversity Index at the genus level. (B) Beta diversity, as measured by Bray–Curtis Dissimilarity at the genus level. (C) Relative abundance of different classes. Diversity and community composition were calculated via k‐mer frequency profiling using Kraken2. (D) Number (left) and total coverage (right) of homologues matching known chitinase genes (from PMID 30042170) at different times of the bait experiment using DNA‐seq data. (E) Same as (D) but based on RNA‐seq data.
Three major groups, based on beta diversity, were determined by Bray–Curtis Dissimilarity analysis (Figure 3B). Groups 2 and 3 include control and treated samples collected within the first 8 h. Group 1 contained chitin‐treated samples from time points of 24 h or later, corresponding to decreased alpha diversity (Figure 3B). At 24 h and later, Gammaproteobacteria and Bacilli increased compared to control samples (Figure 3C). Community composition did not change significantly following the glucose addition, as the short sampling interval (1–4 h post‐addition) was insufficient to capture detectable microbial succession (Figure 3B,C).
3.3. Genes Coding for Chitin Degradation Detected After Chitin Addition
A representative set of 15 experimentally validated chitinases (Oyeleye and Normi 2018) was used to identify known chitinase coding genes via homology search. The number of genes homologous to validated chitinases increased 24 and 48 h after chitin addition (Figure 3D). Chitin amended samples also exhibited a 10‐fold increase in the number of genes compared to the original sample and a 17‐fold increase relative to untreated controls. The relative abundance of genes, estimated by read mapping, rose 24 h after chitin addition. By 48 h, mapped reads were approximately 30‐fold higher than in the initial sample and untreated controls (Figure 3D). These results indicate that chitin addition during the bait period led to an increased abundance of genes predicted to encode chitin‐degrading enzymes.
Similar patterns were observed with RNA‐seq data, representing relative changes in transcription. Both the number of transcripts homologous to validated chitinases and total number of reads mapped increased 24 and 48 h after chitin addition. Relative transcript abundance of known chitinases decreased 1–4 h after glucose addition. Addition of water did not alter transcript abundance of chitinase coding genes (Figure 3E). These results support our hypothesis that chitin addition upregulates chitinase genes, and glucose addition stimulates rapid downregulation. Taken together, these results indicate both a selection for chitin degrading organisms and an apparent upregulation of transcripts coding for chitinolytic enzymes during the bait and a sharp downregulation of those transcripts during the switch. We also performed the same analysis using known, validated auxiliary enzymes involved in chitin degradation and found similar profiles for both LPMO and CDD. More specifically, we observed a sharp increase in abundance at 24 and 48 h after chitin addition for both RNA and DNA, as well as a decrease in RNA at 1–4 h after glucose addition (Figure S1). These findings support the use of phenotype–genotype association analyses to identify known and new chitinase genes. The phenotype of interest was a combination of both transcriptional activation in response to chitin addition and downregulation upon glucose addition.
3.4. Phenotypic Association
Combined Phenotype Scores (CPS) were used to rank candidate transcripts associated with chitin degradation and to identify associated domains (Table 1, Figure 2); a single score is needed to capture the results of the bait and the switch simultaneously. Specifically, transcripts showing strong upregulation during the bait phase (48 h) and rapid downregulation during the switch (1 h) had high CPS values (see Experimental Procedure). Transcripts were annotated using the Protein Families Database (PFAM), and association analysis (modified from metaGPA) identified protein families significantly enriched among high‐CPS transcripts. This de novo analysis recovered an unbiased list of 105 associated protein domains (p‐value = 0, Table 1 and Table S4). The top hit was a glycoside hydrolase domain from family 18 (PF00704, Glyco_hydro_18, GH18), which is widely distributed in hydrolytic enzymes with chitinase or endo‐N‐acetyl‐beta‐D‐glucosaminidase (ENGase) activity as well as in chitinase‐like lectins/proteins (chi‐lectins, CLPs) (Beier and Bertilsson 2013; Terrapon et al. 2017). Other highly ranked domains included those involved in solute binding and transport, and metabolite biosynthesis and utilization. A complete list of ranked domains is provided in Table S4. For further phenotypic association analyses, associated domains were filtered by annotation for those identified as glycoside hydrolases (GHs), a broad group of enzymes found in all domains of life that hydrolyze glycosidic bonds, the expected mechanism of chitin degradation (Terrapon et al. 2017).
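The ranking‐then‐enrichment logic described above can be sketched as follows. This is an illustration under stated assumptions, not the CPS formula or the modified metaGPA test used in the study: the score here is simply the bait log2 fold change minus the switch log2 fold change, and enrichment is assessed with a one‐sided hypergeometric test. The transcript identifiers, fold changes and domain labels are hypothetical.

```python
import math

def combined_score(bait_log2fc, switch_log2fc):
    """Assumed stand-in score: high when a transcript goes up in the bait and down in the switch."""
    return bait_log2fc - switch_log2fc

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n transcripts are drawn from N, of which K carry the domain."""
    total = math.comb(N, n)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return hits / total

def domain_enrichment(transcripts, domain, top_n):
    """Rank transcripts by score, then test the domain among the top_n ranked transcripts."""
    ranked = sorted(transcripts,
                    key=lambda t: combined_score(*transcripts[t][:2]),
                    reverse=True)
    top = ranked[:top_n]
    K = sum(domain in transcripts[t][2] for t in transcripts)
    k = sum(domain in transcripts[t][2] for t in top)
    return ranked, hypergeom_upper_tail(k, K, top_n, len(transcripts))

# Hypothetical transcripts: (bait log2FC, switch log2FC, annotated domains).
transcripts = {
    "t1": (6.0, -5.0, {"Glyco_hydro_18"}),
    "t2": (5.5, -4.0, {"Glyco_hydro_18", "LysM"}),
    "t3": (5.0, -0.5, {"ABC_tran"}),
    "t4": (0.5, 0.2, {"ABC_tran"}),
    "t5": (0.1, -0.1, {"Ribosomal_L2"}),
    "t6": (-1.0, 1.0, {"Ribosomal_L2"}),
}

ranked, p_value = domain_enrichment(transcripts, "Glyco_hydro_18", top_n=2)
```

Combining both phases into one score before testing means a domain that responds only to the bait, or only to the switch, ranks lower than one that responds to both, which is the point of the dual phenotype.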
3.5. Association Analysis (MetaGPA/TPA) Reveals Enrichment of Chitinase Gene Families
Next, we analysed the association profile of GH18 and other GH domains over time during the bait and switch and observed significant associations 24–48 h after chitin addition (bait). Significantly enriched domains included GH18, GH19 and GH20 in both RNA and DNA data (Figure 4A,B). This observation suggests both (1) an increase in the abundance of microbes harbouring genes containing these GH domains and (2) transcriptional upregulation of these genes. During the switch phase, 1 h after glucose addition, domains GH18, GH19, GH20 and others were significantly depleted (Figure 4B). Association analysis of GH domains using transcription levels normalized to the corresponding genomic abundances highlighted ‘true’ transcriptional changes as opposed to changes in community composition. GH18 exhibited the highest enrichment score 48 h after chitin addition and was also significantly depleted 1 and 4 h after glucose addition (Figure 4C). These additional analyses strengthen the evidence associating GH18 with chitin degradation (Table 1, Figure 4).
FIGURE 4.

GH Association Scores across different analysis strategies. (A) DNA enrichment scores during the bait phase of the experiment (1–48 h post chitin enrichment). (B) RNA association scores, including enrichment scores during the bait phase (1–48 h post chitin enrichment) as well as the switch phase (1–4 h after glucose or water addition). Depletion scores for the switch phase are also shown. (C) RNA:DNA ratio (normalized transcriptional changes) association scores. Enrichment scores during the bait and switch phases as well as depletion scores during the switch are shown. Domains are considered significantly associated if the association score (−log10(p‐value)) is equal to or greater than 5. (D) Relative abundance of identified transcripts assigned to bacterial Class for GH18, GH19 and GH20. (E) Individual normalized transcriptional changes for each transcript containing a GH18, GH19 or GH20 domain during the switch phase of the experiment. The top panel illustrates normalized transcriptional changes 48 h after chitin addition. The middle and bottom panels illustrate normalized transcriptional changes between 48 h after chitin addition and 1–4 h post spike‐in (glucose or water).
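The two quantities in this caption, the RNA:DNA ratio and the association score with its cutoff of 5, can be sketched in a few lines. The function names, the pseudocount and the floor applied to p‐values of exactly zero are assumptions made for illustration; only the −log10(p‐value) definition and the threshold of 5 come from the caption.

```python
import math

def normalized_transcription(rna_cov, dna_cov, pseudocount=1.0):
    """RNA coverage relative to DNA coverage; the pseudocount (assumed) avoids division by zero."""
    return (rna_cov + pseudocount) / (dna_cov + pseudocount)

def normalized_log2_change(rna_before, dna_before, rna_after, dna_after, pseudocount=1.0):
    """log2 change of the RNA:DNA ratio between two time points."""
    before = normalized_transcription(rna_before, dna_before, pseudocount)
    after = normalized_transcription(rna_after, dna_after, pseudocount)
    return math.log2(after / before)

def association_score(p_value, floor=1e-300):
    """-log10(p-value); p-values of 0 are floored (assumed) so the score stays finite."""
    return -math.log10(max(p_value, floor))

def is_significant(p_value, threshold=5.0):
    """A domain counts as associated when its score is at least the threshold."""
    return association_score(p_value) >= threshold

# Hypothetical coverages: transcript abundance drops after the switch while gene abundance is stable.
change = normalized_log2_change(rna_before=99, dna_before=9, rna_after=24, dna_after=9)
```

Dividing by DNA coverage is what separates a change in expression from a change in how many gene copies are present in the community, so a negative value here reflects downregulation rather than loss of the organism.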
3.6. Regulation of Putative Chitinase Genes
Next, we identified all genes containing GH18, GH19 or GH20 domains. These genes were found across several bacterial taxa, many of which are uncharacterized or uncultured (Figure 4D). The majority of putative chitinase genes were upregulated 48 h after chitin addition and downregulated 1 h after glucose addition (Figure 4E; Figure S1). Most candidate sequences contained a GH18 domain (EC 3.2.1.14, n = 128), shown to have endochitinase, exochitinase, and N‐acetylglucosaminidase activity (Terrapon et al. 2017). The GH19 domain, described to have exochitinase activity (Terrapon et al. 2017), was detected in fewer sequences (EC 3.2.1.14, n = 19). Several sequences contained GH20 domains (EC 3.2.1.52, n = 43), which have N‐acetylglucosaminidase activity, catalysing breakdown of diacetylchitobiose into N‐acetylglucosamine (Terrapon et al. 2017). Most of the sequences were taxonomically assigned to Gammaproteobacteria and Bacilli (Figure 4D), the relative abundance of which increased at 24–48 h in samples treated with chitin (Figure 3C). Consistent with the observed changes in regulation of putative chitinases, the hypothesized changes in regulation of genes involved in glycolysis were also observed (Figure S4, Table S5). Together, these results indicate that the addition of glucose likely played a significant role in modulating the regulation of genes containing certain GH domains.
3.7. Experimental Validation of Putative Chitinases
Complete ORFs containing GH18 domains were expressed and screened for chitinolytic activity (see Experimental Procedure). While these ORFs differ in sequence similarity and domain composition (Figure S2, Figure 5E), they exhibit the expected pattern of up‐ and downregulation in response to the bait and switch, respectively (Figure 4E). Screened sequences were assigned to the bacterial Classes Gammaproteobacteria (11 sequences), Bacilli (6 sequences), Bacteroidia (2 sequences) and Clostridia (2 sequences); a further 6 sequences were Unclassified because taxonomy could not be confidently assigned. Of the 27 sequences screened, 17 exhibited chitinolytic activity. Of the ten candidates that did not exhibit chitinolytic activity, eight could not be visualized at the expected molecular weight (Figure S3), suggesting poor expression in vitro. One clade, consisting of five proteins with LysM domains (PFAM: PF01476; GH18_27, GH18_9, GH18_22, GH18_4, GH18_26), lacked detectable activity despite detectable protein for three of the five constructs (Figure S3). The proteins with the greatest relative activity, especially exochitinase and endochitinase activity, were from Gammaproteobacteria. In addition to GH18 domains, several of the proteins also contained chitinase A and carbohydrate binding module family 5/12 domains (ChiA: PF08329 and CBM: PF02839, Figure 5E). Compared to the chitinase assay positive control, chitinases from Gammaproteobacteria had relatively higher exo‐ and endochitinase activity, but not β‐N‐acetylglucosaminidase activity. Relatively low β‐N‐acetylglucosaminidase activity was observed for GH18_19, from Bacilli, which lacked ChiA and CBM domains. Two proteins from Clostridia (GH18_23, GH18_24) exhibited relatively high and very high exochitinase activity, despite lacking ChiA and CBM domains (Figure 5). GH18_23 and GH18_24 are homologous to Clostridium spp. CotE proteins, which are important in spore formation (Permpoonpattana et al. 2013; Whittingham et al. 2020).
FIGURE 5.

Experimental validation of 27 novel chitinase candidates. (A) A phylogenetic tree for the 27 GH18 domain‐containing proteins is shown with relative chitinase activity. The phylogenetic tree was generated using a MAFFT alignment and neighbour joining tree algorithm (1000 iterations). The yellow and purple nodes contain sequences from bacteria from the classes Bacilli and Gammaproteobacteria, respectively. (B) Relative chitinase activity for β‐N‐acetylglucosaminidase (Glu), exochitinase (Exo), and endochitinase (Endo) is depicted on the right. Relative activity was categorized based on the measured release of 4MU (ng): Not Detected: ≤ 0; Low: > 0, ≤ 10; Medium: > 10, ≤ 100; High: > 100, ≤ 500; Very High: > 500. (C) Enrichment scores are shown for metaGPA analysis completed on DNA or RNA, respectively. DNA enrichment scores are shown for the bait phase of the experiment: 0 h and 48 h post chitin addition. RNA enrichment scores are shown for the switch phase of the experiment: 48 h after chitin addition and 1 h after glucose addition. (D) RNA depletion scores are shown for the switch phase of the experiment: 1 h after glucose or water addition as compared to 48 h post chitin addition. Schematic domain structure for each protein sequence is shown to the right of depletion scores. (E) Domains include glycoside hydrolase 18 (GH18), Chitinase A, N terminal (ChiA_N), Chitinase A, C terminal (ChiA_C), LysM domain (LysM), carbohydrate binding module family 5/12 (CBM 5/12), bacterial Ig‐like domains (Big_7, Big_9), the REG domain (REG), 1‐cys peroxiredoxin C‐terminal (1‐cysPrx), AhpC/TSA family (AhpC/TSA), and other (all other domains). Domains outlined in black boxes indicate domain arrangements that have not previously been seen in experimentally validated chitinases.
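The activity categories in panel (B) amount to a simple binning of measured 4MU release. The sketch below follows the upper bounds given in the caption (10, 100 and 500 ng) and treats values at or below zero as Not Detected, which is our reading of the lowest bin; the function name and example values are hypothetical.

```python
def activity_category(ng_4mu):
    """Map measured 4MU release (ng) to the relative activity categories of Figure 5B."""
    if ng_4mu <= 0:
        return "Not Detected"   # assumed reading: no release above background
    if ng_4mu <= 10:
        return "Low"
    if ng_4mu <= 100:
        return "Medium"
    if ng_4mu <= 500:
        return "High"
    return "Very High"

# Hypothetical measurements for one enzyme on the three substrates (Glu, Exo, Endo).
example = {"Glu": 0.0, "Exo": 640.0, "Endo": 85.5}
categories = {substrate: activity_category(value) for substrate, value in example.items()}
```

Because each upper bound is inclusive, a value of exactly 10, 100 or 500 ng falls in the lower of the two adjacent categories.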
4. Discussion
Bait and switch pulse experiments were combined with domain association analyses to establish a framework for discovering polymer‐degrading enzymes directly from intact microbial communities. The approach comprised three integrated experimental phases based on the expectation that chitinase genes are more abundant and upregulated when chitin is present and quickly downregulated when the carbon source changes. First, shifts in community composition were measured by changes in relative DNA abundance, serving as a proxy for microbial succession in response to the addition of the bait. Next, a gene expression baseline within the enriched community was established using metatranscriptomics. Finally, a switch, or deliberate perturbation of the microbial environment, induced rapid downregulation of associated genes. Together, these three phases (community succession, gene expression in response to the bait, and gene downregulation in response to the switch) provide a robust procedure for associating microbial phenotypes with genes that encode targeted functions. To do this, we associated genotypes with the desired phenotype in a hypothesis‐free manner to identify protein domains that exhibit the expected changes in abundance and gene regulation. This analysis identified a list of putative chitin‐degrading sequences from salt marsh soils, originating from both known and unclassified bacteria (Figures 4 and 5). Notably, we observed an enrichment for Gammaproteobacteria, Bacilli, and other taxa, which are known to occur naturally in these systems and to associate with chitin substrates (Brumfield et al. 2025). Expression of the candidate sequences confirmed their activity and the validity of the approach for identifying target enzymes of interest (Figure 5).
The workflow outlined here allows discovery and characterization of novel enzymes that may have been overlooked by more established approaches. Chitinolytic activity was confirmed for 17 of 27 constructs screened. Of the 17, five sequences shared > 98% sequence identity with previously validated chitinases, and the other 12 contained glycoside hydrolase 18 domains but lacked this level of similarity. The approach also identified associated proteins from unculturable or poorly characterized microbial taxa. For example, six of the 27 GH18‐domain proteins tested for activity lacked formal taxonomic assignment, two of which (GH18_5 and GH18_7) exhibited measurable chitinolytic activity (Figure 5).
Ten of the 27 screened constructs (37%) did not show detectable chitinolytic activity, likely due to suboptimal in vitro conditions for select candidates. Eight of the ten inactive candidates were poorly expressed (Figure S3), which may reflect a requirement for highly anaerobic and hypersaline conditions. Additionally, protein candidates containing LysM domains lacked activity, despite visible protein detection in 3 of the 5 candidates (Figure 5; Figure S3). Previous work has demonstrated expression challenges for proteins containing LysM domains, which are often secreted and attached to cell walls. Owing to the nature of LysM domains and tandem LysM repeats, these proteins may be insoluble or improperly folded in vitro, and may need specific chaperones to ensure proper folding or ligand binding to prevent aggregation (Wong et al. 2015; Hu et al. 2021; Tian et al. 2022). Finally, the approach identified proteins with domain arrangements not found in experimentally validated chitinases. Proteins with additional carbohydrate‐binding domains flanking the GH18 domain, not previously reported in characterized chitinases, retained chitinolytic activity, as observed for GH18_5 and GH18_1 (Figure 5E). These additional domains may enhance activity via higher affinity for substrate binding (Shoseyov et al. 2006). Chitinolytic activity was also observed for GH18_23 and GH18_24, which are homologous to Clostridium spp. CotE proteins. CotE has previously been described as a bifunctional peroxiredoxin‐chitinase (Whittingham et al. 2020), with a possible role in spore binding to mucus during infection (Hong et al. 2017). While recombinant mutants of this protein containing only the GH18 domain have previously been shown to be chitinolytic (Permpoonpattana et al. 2013), to our knowledge, this is the first demonstration of chitinolytic activity of the complete protein sequence containing both the peroxiredoxin and chitinase domains (Figure 5).
5. Conclusions
The results presented show that microbial ecology and phenotype association can be combined to identify functionally important enzymes of interest. Standard metagenomic profiling relies on homology search and some prior knowledge of expected sequence similarity to known enzymes, perpetuating the ‘circular discovery loop’. Here we break this loop by eliciting a phenotype that highlights chitinase activity regardless of homology with known chitinases. The approach relies strictly on data‐driven association analyses and enables discovery of new enzymes with previously uncharacterized domains or domain architectures.
The three phase approach of the bait and switch is even more critical when functional characteristics of domains driving observed phenotypes are not constrained, in contrast to chitin‐degrading enzymes. Additionally, relying on association at the domain level, rather than gene databases, provides greater resolution and targeting for the discovery of novel enzymes and functionalities. The results of this study emphasize the importance of using paired multi‐omic data as community succession events can be assessed through metagenomic analysis, but changes in gene regulation, especially after the switch, are rapid (1–4 h) and must be evaluated via metatranscriptomics (Figure 4). This approach has the potential to rapidly improve enzyme discovery and characterization timelines. It is possible to screen an increased number of enzymes from uncharacterized taxa, which are believed to make up the majority of microbial diversity, thereby releasing bottlenecks introduced by challenges in cultivation. While chitin degradation was used to develop the workflow, the approach can be used for the discovery of enzymes degrading other polymers, for example, plastics (Piekarska et al. 2023). Given the multifaceted layers for discovery and characterization within this singular, hypothesis‐free pipeline, its expansion and application provide the potential for continued enzyme discovery.
Author Contributions
Rita R. Colwell: investigation, funding acquisition, writing – original draft, methodology, writing – review and editing, project administration, supervision. Kyle D. Brumfield: investigation, writing – review and editing, methodology, writing – original draft, resources. Laurence Ettwiller: conceptualization, investigation, writing – original draft, methodology, validation, visualization, writing – review and editing, project administration, supervision. Jackson A. Buss: investigation, writing – review and editing, methodology, resources. Colleen E. Yancey: conceptualization, investigation, writing – original draft, methodology, visualization, writing – review and editing, formal analysis, validation, project administration.
Funding
Funding was provided by New England Biolabs Inc. The funders did not have any role in study design, data collection, interpretation or decision to submit the work for publication. Further support was provided by the NSF (OCE1839171, CCF1918749 and CBET1751854), National Institute of Environmental Health Sciences, NIH (R01ES030317A), and the National Aeronautics and Space Administration (80NSSC20K0814 and 80NSSC22K1044), awarded to Dr. Rita Colwell.
Ethics Statement
The authors have nothing to report.
Consent
The authors have nothing to report.
Conflicts of Interest
C.E.Y., J.B., and L.E. are employees of New England Biolabs Inc., a manufacturer of restriction enzymes and molecular reagents. R.R.C. and K.D.B. have no conflicting interests to declare.
Supporting information
Figure S1: Changes in relative abundance and transcriptional activity of known auxiliary chitin degrading genes. Auxiliary genes shown include (A) LPMO, (B) CDD and (C) metalloprotease (see Experimental Procedure). For each auxiliary, the gene count (left) and number of reads mapped (right) is shown as a function of the bait for DNA (top) and bait and switch for RNA (bottom).
Figure S2: Similarity networks of putatively identified chitinases. (A) Sequence similarity network for amino acid sequences predicted to contain GH18, GH19 and GH20 domains. Each node is coloured by normalized transcriptional changes 1 h after glucose addition (switch). Sequence similarity between proteins is represented by edge thickness.
Figure S3: SDS‐PAGE gels of expressed chitinase candidates containing GH18 domains. The Protein Disulfide Bond Enhancer (PDBE) is also His‐tagged and therefore co‐purifies with target proteins of interest. The DHFR plasmid, provided in the NEBExpress Cell‐free E. coli Protein Synthesis System, was run as a positive control for cell‐free expression. Visible protein bands of target sequences of interest are highlighted with a red circle.
Figure S4: Transcriptional regulation of domains involved in glycolysis during the Bait and Switch. The association strength of domains involved in glycolysis (red) and chitin degradation (blue) is shown for the critical phases of the switch experiment (1 h after glucose or water addition). Domain association scores are shown for analyses using pure transcriptional changes (RNA only).
Table S1: Enriched PFAM domains with an e‐value of 0, 48 h after chitin addition. Domain association was completed using metatranscriptomes.
Table S2: Depleted PFAM domains with an e‐value of 1e‐10 or lower, 1 h after glucose addition. Domain association was completed using metatranscriptomes.
Table S3: Measured 4MU release (ng) from chitinase activity screens. Each enzyme was screened in technical replicate. An assay control (kit control) and a negative control (DHFR plasmid expressed in vitro) were also included (see Experimental Procedure). Three substrates were screened: 4‐Methylumbelliferyl N‐acetyl‐β‐D‐glucosaminide (Glu), 4‐Methylumbelliferyl N,N′‐diacetyl‐β‐D‐chitobioside (Di) and 4‐Methylumbelliferyl β‐D‐N,N′,N′′‐triacetylchitotriose (Tri).
Table S4: Associated PFAM domains according to Combined Phenotype Association (CPA).
Table S5: Enriched PFAM domains with an e‐value of less than 1e‐11, 1 h after glucose addition. Domain association was completed using metatranscriptomes.
Acknowledgements
We are grateful to the following for providing materials, services and support during this project. We thank the NEB Sequencing Core and NEB IT Department for their assistance in sequencing and technical support. We thank Paula Magnelli for critical suggestions and support. We thank Jennifer Ong, Katherine O’Toole, and Rebekah Silva who provided suggestions for protein expression. We are also grateful to Andy Ge for computational and bioinformatic support.
Contributor Information
Colleen E. Yancey, Email: cyancey@neb.com.
Laurence Ettwiller, Email: ettwiller@neb.com.
Data Availability Statement
The datasets generated and analysed (raw reads) during the current study are available in the NCBI Sequencing Read Archive (SRA) repository under BioProject: PRJNA1293438. The data are set to be released February 28, 2026, or upon publication, whichever comes first.
References
- Amaral‐Zettler, L. A., Zettler E. R., and Mincer T. J. 2020. “Ecology of the Plastisphere.” Nature Reviews. Microbiology 18: 139–151.
- Beier, S., and Bertilsson S. 2013. “Bacterial Chitin Degradation—Mechanisms and Ecophysiological Strategies.” Frontiers in Microbiology 4: 149.
- Brumfield, K. D., Huq A., Colwell R. R., Olds J. L., and Leddy M. B. 2020. “Microbial Resolution of Whole Genome Shotgun and 16S Amplicon Metagenomic Sequencing Using Publicly Available NEON Data.” PLoS One 15: e0228899.
- Brumfield, K. D., Usmani M., Long D. M., et al. 2025. “Climate Change and Vibrio: Environmental Determinants for Predictive Risk Assessment.” Proceedings of the National Academy of Sciences 122: e2420423122.
- Brzezinska, M. S., Jankiewicz U., and Walczak M. 2013. “Biodegradation of Chitinous Substances and Chitinase Production by the Soil Actinomycete Streptomyces rimosus.” International Biodeterioration & Biodegradation 84: 104–110.
- Bushmanova, E., Antipov D., Lapidus A., and Prjibelski A. D. 2019. “rnaSPAdes: A de Novo Transcriptome Assembler and Its Application to RNA‐Seq Data.” GigaScience 8: giz100.
- Cao, Z., Li P., and Li Z.‐H. 2021. “A Latest Review on the Application of Microcosm Model in Environmental Research.” Environmental Science and Pollution Research 28: 60438–60447.
- Delacuvellerie, A., Benali S., Cyriaque V., et al. 2021. “Microbial Biofilm Composition and Polymer Degradation of Compostable and Non‐Compostable Plastics Immersed in the Marine Environment.” Journal of Hazardous Materials 419: 126526.
- Farron, S. J., Hughes Z. J., and FitzGerald D. M. 2020. “Assessing the Response of the Great Marsh to Sea‐Level Rise: Migration, Submersion or Survival.” Marine Geology 425: 106195.
- Forbrich, I., Giblin A. E., and Hopkinson C. S. 2018. “Constraining Marsh Carbon Budgets Using Long‐Term C Burial and Contemporary Atmospheric CO2 Fluxes.” Journal of Geophysical Research: Biogeosciences 123: 867–878.
- Fu, L., Niu B., Zhu Z., Wu S., and Li W. 2012. “CD‐HIT: Accelerated for Clustering the Next‐Generation Sequencing Data.” Bioinformatics 28: 3150–3152.
- Gooday, G. W. 1990. “The Ecology of Chitin Degradation.” In Advances in Microbial Ecology, edited by Marshall K. C., 387–430. Springer US.
- Grifoll‐Romero, L., Pascual S., Aragunde H., Biarnés X., and Planas A. 2018. “Chitin Deacetylases: Structures, Specificities, and Biotech Applications.” Polymers (Basel) 10: 352.
- Hahn, M. W., Koll U., and Schmidt J. 2019. “Isolation and Cultivation of Bacteria.” In The Structure and Function of Aquatic Microbial Communities, edited by Hurst C. J., 313–351. Springer International Publishing.
- Hofer, U. 2018. “The Majority Is Uncultured.” Nature Reviews. Microbiology 16: 716–717.
- Hong, H. A., Ferreira W. T., Hosseini S., et al. 2017. “The Spore Coat Protein CotE Facilitates Host Colonization by Clostridium difficile.” Journal of Infectious Diseases 216: 1452–1459.
- Howland, K. E., Mouradian J. J., Uzarski D. R., Henson M. W., Uzarski D. G., and Learman D. R. 2024. “Nutrient Amendments Enrich Microbial Hydrocarbon Degradation Metagenomic Potential in Freshwater Coastal Wetland Microcosm Experiments.” Applied and Environmental Microbiology 91: e01972‐24.
- Hu, S.‐P., Li J.‐J., Dhar N., et al. 2021. “Lysin Motif (LysM) Proteins: Interlinking Manipulation of Plant Immunity and Fungi.” International Journal of Molecular Sciences 22: 3114.
- Jacquiod, S., Franqueville L., Cécillon S., Vogel T. M., and Simonet P. 2013. “Soil Bacterial Community Shifts After Chitin Enrichment: An Integrative Metagenomic Approach.” PLoS One 8: e79699.
- Keller‐Costa, T., Kozma L., Silva S. G., et al. 2022. “Metagenomics‐Resolved Genomics Provides Novel Insights Into Chitin Turnover, Metabolic Specialization, and Niche Partitioning in the Octocoral Microbiome.” Microbiome 10: 151.
- Kim, H.‐S., Golyshin P. N., and Timmis K. N. 2007. “Characterization and Role of a Metalloprotease Induced by Chitin in Serratia sp. KCK.” Journal of Industrial Microbiology and Biotechnology 34: 715–721.
- Li, H., and Durbin R. 2009. “Fast and Accurate Short Read Alignment With Burrows‐Wheeler Transform.” Bioinformatics 25: 1754–1760.
- Li, Y.‐Y., Wang B., Ma M.‐G., and Wang B. 2018. “Review of Recent Development on Preparation, Properties, and Applications of Cellulose‐Based Functional Materials.” International Journal of Polymeric Science 2018: 8973643.
- Lonard, R. I., Judd F. W., and Stalter R. 2010. “The Biological Flora of Coastal Dunes and Wetlands: Spartina patens (W. Aiton) G.H. Muhlenberg.” Coas 2010: 935–946.
- Lopes, C. T., Franz M., Kazi F., Donaldson S. L., Morris Q., and Bader G. D. 2010. “Cytoscape Web: An Interactive Web‐Based Network Browser.” Bioinformatics 26: 2347–2348.
- Lu, J., Rincon N., Wood D. E., et al. 2022. “Metagenome Analysis Using the Kraken Software Suite.” Nature Protocols 17: 2815–2839.
- McClure, R., Farris Y., Danczak R., et al. 2022. “Interaction Networks Are Driven by Community‐Responsive Phenotypes in a Chitin‐Degrading Consortium of Soil Microbes.” mSystems 7: e00372‐22.
- Mendes, R., Garbeva P., and Raaijmakers J. M. 2013. “The Rhizosphere Microbiome: Significance of Plant Beneficial, Plant Pathogenic, and Human Pathogenic Microorganisms.” FEMS Microbiology Reviews 37: 634–663.
- Meunier, L., Costa R., Keller‐Costa T., Cannella D., Dechamps E., and George I. F. 2024. “Selection of Marine Bacterial Consortia Efficient at Degrading Chitin Leads to the Discovery of New Potential Chitin Degraders.” Microbiology Spectrum 12: e00886‐24.
- Mistry, J., Chuguransky S., Williams L., et al. 2021. “Pfam: The Protein Families Database in 2021.” Nucleic Acids Research 49: D412–D419.
- Nurk, S., Meleshko D., Korobeynikov A., and Pevzner P. A. 2017. “MetaSPAdes: A New Versatile Metagenomic Assembler.” Genome Research 27: 824–834.
- Oberg, N., Zallot R., and Gerlt J. A. 2023. “EFI‐EST, EFI‐GNT, and EFI‐CGFP: Enzyme Function Initiative (EFI) Web Resource for Genomic Enzymology Tools.” Journal of Molecular Biology 435: 168018.
- Oyeleye, A., and Normi Y. M. 2018. “Chitinase: Diversity, Limitations, and Trends in Engineering for Suitable Applications.” Bioscience Reports 38: BSR2018032300.
- Permpoonpattana, P., Phetcharaburanin J., Mikelsone A., et al. 2013. “Functional Characterization of Clostridium difficile Spore Coat Proteins.” Journal of Bacteriology 195: 1492–1503.
- Piekarska, K., Sikora M., Owczarek M., Jóźwik‐Pruska J., and Wiśniewska‐Wrona M. 2023. “Chitin and Chitosan as Polymers of the Future—Obtaining, Modification, Life Cycle Assessment and Main Directions of Application.” Polymers 15: 793.
- Rillig, M. C., Kim S. W., and Zhu Y.‐G. 2024. “The Soil Plastisphere.” Nature Reviews. Microbiology 22: 64–74.
- Shoseyov, O., Shani Z., and Levy I. 2006. “Carbohydrate Binding Modules: Biochemical Properties and Novel Applications.” Microbiology and Molecular Biology Reviews 70: 283–295.
- Terrapon, N., Lombard V., Drula E., Coutinho P. M., and Henrissat B. 2017. “The CAZy Database/the Carbohydrate‐Active Enzyme (CAZy) Database: Principles and Usage Guidelines.” In A Practical Guide to Using Glycomics Databases, edited by Aoki‐Kinoshita K. F., 117–131. Springer Japan.
- Tian, H., Fiorin G. L., Kombrink A., Mesters J. R., and Thomma B. P. H. J. 2022. “Fungal Dual‐Domain LysM Effectors Undergo Chitin‐Induced Intermolecular, and Not Intramolecular, Dimerization.” Plant Physiology 190: 2033–2044.
- Vaaje‐Kolstad, G., Bøhle L. A., Gåseidnes S., et al. 2012. “Characterization of the Chitinolytic Machinery of Enterococcus faecalis V583 and High‐Resolution Structure of Its Oxidative CBM33 Enzyme.” Journal of Molecular Biology 416: 239–254.
- Valenzuela, S. V., Ferreres G., Margalef G., and Pastor F. I. J. 2017. “Fast Purification Method of Functional LPMOs From Streptomyces ambofaciens by Affinity Adsorption.” Carbohydrate Research 448: 205–211.
- Vallino, J. J., Hopkinson C. S., and Garritt R. H. 2005. “Estimating Estuarine Gross Production, Community Respiration and Net Ecosystem Production: A Nonlinear Inverse Technique.” Ecological Modelling 187: 281–296.
- Whittingham, J. L., Hanai S., Brannigan J. A., et al. 2020. “Crystal Structures of the GH18 Domain of the Bifunctional Peroxiredoxin‐Chitinase CotE From Clostridium difficile.” Structural Biology and Crystallization Communications 76: 241–249.
- Wong, J. E. M. M., Midtgaard S. R., Gysel K., et al. 2015. “An Intermolecular Binding Mechanism Involving Multiple LysM Domains Mediates Carbohydrate Recognition by an Endopeptidase.” Acta Crystallographica. Section D, Biological Crystallography 71: 592–605.
- Yang, W., Lin Y.‐C., Johnson W., et al. 2021. “A Genome‐Phenome Association Study in Native Microbiomes Identifies a Mechanism for Cytosine Modification in DNA and RNA.” eLife 10: e70021.
- Yao, R. A., Reyre J.‐L., Tamburrini K. C., et al. 2023. “The Ustilago maydis AA10 LPMO Is Active on Fungal Cell Wall Chitin.” Applied and Environmental Microbiology 89: e00573‐23.
- Zallot, R., Oberg N., and Gerlt J. A. 2019. “The EFI Web Resource for Genomic Enzymology Tools: Leveraging Protein, Genome, and Metagenome Databases to Discover Novel Enzymes and Metabolic Pathways.” Biochemistry 58: 4169–4182.
- Zhang, Y., Pan D., Xiao P., et al. 2023. “A Novel Lytic Polysaccharide Monooxygenase From Enrichment Microbiota and Its Application for Shrimp Shell Powder Biodegradation.” Frontiers in Microbiology 14: 1097492.
Supplementary Materials
Figure S1: Changes in relative abundance and transcriptional activity of known auxiliary chitin‐degrading genes. Auxiliary genes shown include (A) LPMO, (B) CDD and (C) metalloprotease (see Experimental Procedure). For each auxiliary gene, the gene count (left) and number of reads mapped (right) are shown for the bait phase for DNA (top) and for the bait and switch phases for RNA (bottom).
Figure S2: Similarity networks of putatively identified chitinases. (A) Sequence similarity network for amino acid sequences predicted to contain GH18, GH19 and GH20 domains. Each node is coloured by normalized transcriptional changes 1 h after glucose addition (switch). Sequence similarity between proteins is represented by edge thickness.
Figure S3: SDS‐PAGE gels of expressed chitinase candidates containing GH18 domains. The Protein Disulfide Bond Enhancer (PDBE) is also His‐tagged and therefore co‐purifies with the target proteins. The DHFR plasmid, provided in the NEBExpress Cell‐free E. coli Protein Synthesis System, was run as a positive control for cell‐free expression. Visible protein bands of the target sequences are highlighted with a red circle.
Figure S4: Transcriptional regulation of domains involved in glycolysis during the Bait and Switch. The association strength of domains involved in glycolysis (red) and chitin degradation (blue) is shown for the critical phases of the switch experiment (1 h after glucose or water addition). Domain association scores are shown for analyses using pure transcriptional changes (RNA only).
Table S1: Enriched PFAM domains with an e‐value of 0, 48 h after chitin addition. Domain association was completed using metatranscriptomes.
Table S2: Depleted PFAM domains with an e‐value of 1e‐10 or lower, 1 h after glucose addition. Domain association was completed using metatranscriptomes.
Table S3: Measured 4MU release (ng) from chitinase activity screens. Each enzyme was screened in technical replicate. An assay control (kit control) and a negative control (DHFR plasmid expressed in vitro) were also included (see Experimental Procedure). Three substrates were screened: 4‐Methylumbelliferyl N‐acetyl‐β‐D‐glucosaminide (Glu), 4‐Methylumbelliferyl N,N′‐diacetyl‐β‐D‐chitobioside (Di) and 4‐Methylumbelliferyl β‐D‐N,N′,N′′‐triacetylchitotriose (Tri).
Table S4: Associated PFAM domains according to Combined Phenotype Association (CPA).
Table S5: Enriched PFAM domains with an e‐value of less than 1e‐11, 1 h after glucose addition. Domain association was completed using metatranscriptomes.
Data Availability Statement
The raw sequencing reads generated and analysed during the current study are available in the NCBI Sequence Read Archive (SRA) under BioProject PRJNA1293438. The data are set to be released on February 28, 2026, or upon publication, whichever comes first.
