Author manuscript; available in PMC: 2024 Dec 21.
Published in final edited form as: Cell Chem Biol. 2023 Oct 27;30(12):1680–1691.e6. doi: 10.1016/j.chembiol.2023.10.001

A metagenomic library cloning strategy that promotes high-level expression of captured genes to enable efficient functional screening

Michelle H Rich 1, Abigail V Sharrock 1,2, Timothy S Mulligan 3, Frazer Matthews 4, Alistair S Brown 1,2, Hannah R Lee-Harwood 1,2, Elsie M Williams 1,5, Janine N Copp 1,6, Rory F Little 1, Jenni JB Francis 1, Claire N Horvat 1,7, Luke J Stevenson 1,2, Jeremy G Owen 1,2, Meera T Saxena 3, Jeff S Mumm 3,4,8,9, David F Ackerley 1,2,10
PMCID: PMC10842177  NIHMSID: NIHMS1939858  PMID: 37898120

Summary

Functional screening of environmental DNA (eDNA) libraries is a potentially powerful approach to discover enzymatic “unknown unknowns”, but is usually heavily biased toward the tiny subset of genes preferentially transcribed and translated by the screening strain. We have overcome this by preparing an eDNA library via partial digest with restriction enzyme FatI (cuts CATG), causing a substantial proportion of ATG start codons to be precisely aligned with strong plasmid-encoded promoter and ribosome-binding sequences. Whereas we were unable to select nitroreductases from standard metagenome libraries, our FatI strategy yielded 21 nitroreductases spanning eight different enzyme families, each conferring resistance to the nitro-antibiotic niclosamide and sensitivity to the nitro-prodrug metronidazole. We showed expression could be improved by co-expressing rare tRNAs and encoded proteins purified directly using an embedded His6-tag. In a transgenic zebrafish model of metronidazole-mediated targeted cell ablation, our lead MhqN-family nitroreductase proved ~5-fold more effective than the canonical nitroreductase NfsB.

Keywords: Bacterial start codons, Environmental DNA screening, FatI restriction cloning, Functional metagenome screening, Metronidazole, Niclosamide, Nitroreductase, Synthetic biology, Targeted cell ablation

eTOC Blurb

Functional screening of environmental DNA libraries is often heavily biased toward genes that express most effectively in the screening strain. Rich et al. address this with an elegant strategy that precision-clones potential ATG start codons immediately downstream of strong transcriptional and translational regulatory sequences.

Introduction

Bacteria-derived enzymes have innumerable applications in research, medicine and industry, with the industrial enzymes market alone projected to exceed US$10 billion by 2024.1 However, an overwhelming majority of bacteria cannot be cultivated efficiently in the laboratory, meaning that traditional microbiological methods are limited in their scope to screen for new enzymes with desirable activities.1–3 To address this, culture-independent strategies have been developed to discover enzymes encoded by metagenomic DNA, extracted from promising environments. Frequently these approaches are sequence-based, employing next-generation sequencing and bioinformatics to identify candidate genes.1,2 Although these approaches can traverse a vast amount of sequence space, the need to subsequently synthesise or amplify, clone and characterise each candidate is a major bottleneck that adds substantial time and expense. Moreover, sequence-based approaches are limited by their inherent need for similarity. Not only can highly divergent homologues of known enzymes be difficult to identify,4 microbial metagenome sequencing is revealing ever-increasing numbers of hypothetical proteins of unknown function that can nevertheless be highly effective at catalysing a desired reaction.5

An alternative is functional screening, whereby metagenomic DNA fragments are cloned in a vector, expressed in a host strain such as Escherichia coli, and the desired activity is screened or selected at a phenotypic level.2,3,6 This avoids sequence-level preconceptions, but can introduce other significant biases, most notably that the host cell is often unable to effectively recognise transcriptional and/or translational signals from evolutionarily distant species, resulting in low-to-no expression of most captured genes.3,7 Functional metagenome screening also requires an effective high-throughput screen, or ideally a selection strategy, to recover clones that gain the desired activity.6,8,9

We have a strong interest in discovery and characterisation of bacterial nitroreductases that can efficiently convert prodrugs to a toxic form, by reducing an electron-withdrawing nitro substituent on an aromatic ring to an electron-donating hydroxylamine or amine via concerted two-electron transfer steps.10 Nitro-reduction is a function-based classification that encompasses diverse enzyme families and, given the paucity of nitroaromatic molecules in nature,11 is generally assumed to be a promiscuous activity.12,13 Known nitroreductases include members of the NfsA, NfsB, PnbA quinone oxidoreductase families that share a conserved “nitroreductase” structural fold,14 but also of the AzoR (azoreductase), MsuE (sulphur assimilation) and NemA (old yellow enzyme) families, which share little sequence or structural homology other than all binding FMN or FAD cofactors.15,16 There are undoubtedly many other families of potential nitroreductases that remain to be discovered, and even within the known enzyme families it is difficult to predict a priori whether a given member is likely to be an efficient nitroreductase. Thus, functional metagenome screening is an attractive strategy to discover new nitroreductases and to enrich for enzyme variants that are highly effective for a desired functionality.

Our primary focus here was discovery of nitroreductases that can efficiently convert the nitro-prodrug antibiotic metronidazole to a cytotoxic form. Metronidazole has been used to precisely ablate target cells in transgenic model organisms that express the E. coli nitroreductase gene nfsB from a cell-type-specific promoter, for the purposes of investigating cellular function and/or regeneration.17 This system has received widespread uptake, in particular in zebrafish, but is confounded by the need for metronidazole concentrations near the toxicity threshold (~10 mM) to achieve effective ablation of many cell types.18,19 More efficient metronidazole-converting nitroreductases would therefore enable improved ablation with fewer off-target effects. To select for metagenome-derived nitroreductases we hoped to use niclosamide, an antibacterial that we previously found is not only detoxified by nitro-reduction, but was also able to select for more effective metronidazole-reducing variants of the E. coli nitroreductase NfsA from targeted mutagenesis libraries.20,21 An important caveat is that those previous mutagenesis studies employed high-level expression of each nitroreductase gene variant from a strong tac promoter on a high copy number plasmid. We were therefore uncertain whether niclosamide would prove effective in recovering nitroreductases from a metagenome library, which were likely to be expressed at far lower levels. Ultimately, we found it necessary to implement an alternative strategy that minimises species bias effects and promotes high-level expression of captured genes. This strategy is likely to be broadly applicable for discovery of any catalytic functionality for which an effective screen or selection can be applied.

Results

Niclosamide can select for metronidazole-active nitroreductases

Although niclosamide is usually far more toxic to gram-positive than gram-negative bacteria,22,23 we have previously shown that deletion of the tolC efflux gene and seven endogenous nitroreductase genes renders E. coli 2000-fold more sensitive to niclosamide.20 The majority of this sensitisation effect was due to loss of tolC, but we have nevertheless shown that over-expressed mutants of the E. coli nitroreductase nfsA provide a selectable level of niclosamide resistance in this multi-gene deletion strain (E. coli 7NT).20,21 To test whether the ability to detoxify niclosamide is widely associated with sensitivity to metronidazole, we measured the growth of 18 7NT strains individually over-expressing members of an oxidoreductase gene library (representing six known nitroreductase families) in lysogeny broth amended with 0.8 μM niclosamide or 800 μM metronidazole. Although there was only a moderate inverse correlation (r² = 0.43; Pearson’s r = −0.65) between levels of growth in each medium, the data were skewed by three AzoR family members that conferred high levels of niclosamide resistance, but little sensitivity to metronidazole (Supplementary Figure S1). Overall, 10 of the 13 strongly niclosamide-resistant strains were overtly growth-inhibited by metronidazole, compared to none of the niclosamide-sensitive strains.

Niclosamide is inefficient in selecting nitroreductases from standard metagenome libraries

We next conducted pilot tests to assess whether niclosamide could efficiently select for nitroreductase genes from a typical metagenome library, i.e. one generated by purification, fragmentation and cloning of environmental DNA (eDNA) in a standard E. coli expression vector. For this, we used a small and well-characterised soil eDNA library containing ca. 1.3 × 10⁵ unique metagenome inserts with an average size ~4 kb, cloned into pRSETB.24 In an earlier study we used this library to functionally screen for 4’-phosphopantetheinyl transferase genes as markers for natural product biosynthetic gene clusters, and recovered seven unique inserts.25 Bacterial genomes typically encode numerous nitroreductases apiece,14,16 so we considered that recovery of >10 unique nitroreductases would indicate an efficient selection.

As all nitroreductase genes from our oxidoreductase library (Supplementary Figure S1) and in our previous niclosamide resistance screens20 had been strongly over-expressed, we sought to boost transcription of insert DNA from the T7 promoter of pRSETB via IPTG induction (as had been used for this eDNA library by Parachin and Gorwa-Grauslund24). For this, we first lysogenised the 7NT strain with λDE3, which carries a T7 RNA polymerase gene. The resulting E. coli strain (7TL) was transformed with the soil eDNA library, and selection with 0.5 μM niclosamide (the lowest concentration that reliably prevented colony formation by empty plasmid control cells) yielded 21 niclosamide-resistant colonies. While this initially appeared promising, Sanger sequencing revealed that these 21 ‘hits’ represented only three unique inserts, each of which contained a tolC-like gene. Resistance was therefore likely due to restoration of efflux rather than detoxification of niclosamide.

TolC-mediated efflux of niclosamide can be prevented by the chemical inhibitor phenylalanine-arginine β-naphthylamide (PAβN),20 so we added PAβN to the selection medium and re-screened the library. However, we did not recover any eDNA clones at concentrations of niclosamide and PAβN that prevented growth of an empty plasmid control strain while permitting colony formation by 7TL cells that expressed E. coli NfsB from a tac promoter. We therefore concluded that niclosamide did not provide an efficient means to select nitroreductase genes from standard eDNA libraries. It seemed likely that this was because nitroreductase-mediated niclosamide resistance requires higher-level expression of captured genes than a standard eDNA library can routinely provide (whereas trace levels of tolC expression appeared sufficient to confer resistance).

A FatI eDNA cloning strategy for selection of niclosamide and metronidazole active nitroreductases

An ideal solution to boost gene expression would be to ligate eDNA fragments into a plasmid in such a way that the start codon of a captured gene was placed an optimal distance downstream of a strong promoter and ribosome binding site. In considering this problem, we realised that the most common bacterial start codon (ATG) constitutes three quarters of the palindrome recognised by the restriction enzyme FatI (CATG). We envisaged that a partial FatI digest of eDNA would yield an array of fragments with 5’ overhangs that often contain start codons, allowing their associated genes to be ligated into a custom expression vector at a precise location (Figure 1A). We therefore designed a plasmid with a unique and compatible NcoI site (CCATGG) located downstream of an IPTG-inducible tac promoter and strong ribosome binding site (an inducible promoter was chosen to avoid prematurely selecting against gene inserts that impose a fitness burden upon the host cell). Reasoning that it might sometimes be useful to purify target proteins directly from selected bacterial clones, we also embedded an optimally-positioned start codon and N-terminal hexahistidine tag directly upstream of, and in frame with, the ATG of the captured gene (Figure 1B). The final plasmid (pUCXMG; Supplementary Figure S2) was assembled from an artificially-synthesised DNA fragment ligated into a pUCX parental plasmid backbone. Pilot tests demonstrated that a nitroreductase gene (azoR) cloned into the NcoI site of pUCXMG conferred a similar level of E. coli host cell protection against niclosamide to azoR expressed from the parental pUCX plasmid (Supplementary Figure S3).
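The logic of the partial digest can be illustrated in silico. The following is a minimal sketch, not the protocol used in this study: it assumes FatI cleaves immediately 5′ of the C in CATG (consistent with the 5′ overhangs described above), and the function names and toy sequence are hypothetical. Because a partial digest leaves some sites uncut, every pair of CATG sites is treated as a possible fragment, and each fragment begins with an overhang whose last three bases are a candidate ATG start codon.

```python
import re
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def fati_sites(seq: str) -> List[int]:
    """Return 0-based positions of the C in every CATG site (palindromic, so one strand suffices)."""
    return [m.start() for m in re.finditer("CATG", seq.upper())]


def partial_digest_fragments(seq: str, min_len: int, max_len: int) -> List[str]:
    """Enumerate top-strand fragments bounded by any two CATG sites within a size window.

    Assumes cleavage immediately 5' of the C, so each fragment starts with CATG.
    """
    seq = seq.upper()
    sites = fati_sites(seq)
    fragments = []
    for i, left in enumerate(sites):
        for right in sites[i + 1:]:
            if min_len <= right - left <= max_len:
                fragments.append(seq[left:right])
    return fragments


def orf_codons_from_overhang(fragment: str) -> int:
    """Count codons read from the overhang ATG up to the first stop codon or the fragment end."""
    n = 0
    for i in range(1, len(fragment) - 2, 3):
        if fragment[i:i + 3] in STOP_CODONS:
            break
        n += 1
    return n


toy = "GGCATGAAATTTCATGCCC"  # hypothetical sequence with two CATG sites
print(fati_sites(toy))
print(partial_digest_fragments(toy, 5, 20))
```

For the size selection described later in the text, the window would be set to roughly `min_len=600` and `max_len=1400`. Whether the ATG in a given overhang is a genuine start codon cannot be determined from the sequence alone, which is why a functional selection follows.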

Figure 1. Strategy for discovery of metronidazole-active nitroreductases from an expression library of FatI-digested metagenomic DNA.

A. Flowchart of the metagenomic library cloning strategy and functional screening platform used to identify novel nitroreductase enzymes from soil-derived DNA. B. Key features of FatI expression vector pUCXMG. Highlighted are the IPTG-inducible tac promoter; the lacO operator (repressor-binding) region; an XbaI site used in vector assembly; a strong ribosome binding sequence (RBS; derived from the T7 phage major capsid protein RBS); the start codon; the hexahistidine (His6) tag; a thrombin cleavage sequence for His6 tag removal; and the NcoI restriction site for insertion of FatI partially-digested eDNA fragments. Figure drawn using Geneious Prime version 2022.2. The full plasmid map and sequence are available as Supplemental Data S2.

The FatI partial digest strategy we envisaged only permits precision cloning of the subset of genes that possess both an ATG start codon and a cytosine in the −1 position; but soil eDNA is such a vast resource that we did not consider this a major limitation. Nevertheless, we felt it important to consider the distribution of genes that possess these characteristics, for example, to assess the extent to which our method might bias for genes from GC-rich bacteria. For this, we collected 21,675 annotated bacterial genomes from the National Centre for Biotechnology Information (NCBI) Assembly Database, and wrote a Python script to analyse the number of genes using each start codon (ATG, GTG, TTG) and the corresponding nucleotide distribution at the −1 position, within each genome. To exclude plasmid sequences from the analyses, we performed analyses on records within each genome without ‘plasmid’ in the record description; and for genomes containing multiple chromosomes, the results from all chromosomes were combined into a single record (see Supplemental Data S1 and S2 for bioinformatics scripts and compiled genome analyses). This analysis confirmed that ATG is substantially the most abundant start codon, typically being used in 80-95% of genes, except in very high GC genomes where the incidence of GTG increases (Figure 2).
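The core of such a start-codon analysis can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the script provided in Supplemental Data S1: it assumes coding sequences have already been extracted from an annotation as 0-based, half-open `(start, end, strand)` tuples, and all names and the toy genome are hypothetical. For each gene it records the start codon and, for ATG starts, the nucleotide at the −1 position, reading minus-strand genes via the reverse complement.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def start_context(genome: str, start: int, end: int, strand: str) -> Optional[Tuple[str, str]]:
    """Return (base at -1, start codon) for a CDS at 0-based [start, end), or None at a sequence edge."""
    if strand == "+":
        if start < 1:
            return None
        return genome[start - 1], genome[start:start + 3]
    # Minus strand: the gene begins at `end` and is read leftwards on the forward sequence.
    if end + 1 > len(genome):
        return None
    window = revcomp(genome[end - 3:end + 1])  # four bases: -1 base then start codon
    return window[0], window[1:]


def tally_start_codons(genome: str, cds_features: Iterable[Tuple[int, int, str]]):
    """Count start codons, and the -1 nucleotide for genes that start with ATG."""
    codons, minus_one_for_atg = Counter(), Counter()
    for start, end, strand in cds_features:
        context = start_context(genome.upper(), start, end, strand)
        if context is None:
            continue
        base, codon = context
        codons[codon] += 1
        if codon == "ATG":
            minus_one_for_atg[base] += 1
    return codons, minus_one_for_atg


# Toy genome (hypothetical): one plus-strand (C)ATG gene and one minus-strand (G)ATG gene.
genome = "ACATGAAATAGTTATTTCATC"
codons, minus_one = tally_start_codons(genome, [(2, 11, "+"), (11, 20, "-")])
print(codons, minus_one)
```

Dividing `minus_one["C"]` by the total number of genes gives the per-genome (C)ATG percentage of the kind plotted against GC content in Figure 2.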

Figure 2: Percentage of genes from sequenced genomes that initiate with ATG, (C)ATG or (G)ATG start codons, relative to genomic GC content.

The percentage of genes predicted to initiate with ATG (orange), (C)ATG (dark blue), or (G)ATG (light blue) start codons were sourced from 21,675 annotated bacterial genomes, derived from the National Centre for Biotechnology Information Assembly Database on June 7, 2021, and plotted relative to the total GC content of that genome. Data were analysed with Python 3.8.1, using Script 1 (Supplemental Data S1).

Importantly, when we plotted the proportion of ATG and (C)ATG start codons (where (C) denotes a cytosine nucleotide in the −1 position) in each genome relative to its GC content, we noticed a disproportionately high incidence of (C)ATG start codons relative to (G)ATG (Figure 2). Indeed, in a substantial proportion of bacteria containing >60% GC content, over 50% of genes initiate with a (C)ATG start codon. This is helpful from the perspective of capturing coding sequences effectively, but does suggest that DNA from these species will be overrepresented in metagenomic libraries prepared via FatI partial digestion.

To implement our cloning strategy (Figure 1A), we purified DNA from 250 g of locally-collected soil. This yielded 38 μg of purified DNA that was primarily of a size range >10 kb (Supplementary Figure S4). Following partial digestion with FatI (Supplementary Figure S4), we gel-extracted DNA fragments in the 0.6-1.4 kb range, seeking to (i) emphasise single-gene inserts that are more amenable to ligation, one-pass Sanger sequencing and deconvolution of phenotypes; and (ii) capture a wide diversity of bacterial nitroreductases while excluding tolC genes (typically >1.5 kb). Upon ligation of these fragments into the NcoI site of pUCXMG, we generated a plasmid library of 1.38 × 10⁷ clones, with an estimated insert rate of 87.5% i.e. 1.2 × 10⁷ unique variants in total (~1.0 × 10⁷ with an insert >500 bp; Supplementary Figure S5).
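The library-size estimate follows directly from the clone count and insert rate. As a short worked check, using only the figures quoted above (the variable names are illustrative):

```python
total_clones = 1.38e7        # plasmid library size reported in the text
insert_rate = 0.875          # estimated fraction of clones carrying an insert
clones_with_insert = total_clones * insert_rate

large_insert_clones = 1.0e7  # approximate number with an insert >500 bp, from the text
fraction_large = large_insert_clones / clones_with_insert

print(f"Clones with insert: {clones_with_insert:.3e}")
print(f"Fraction of those with insert >500 bp: {fraction_large:.0%}")
```

The product is 1.2075 × 10⁷, which the text rounds to 1.2 × 10⁷.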

In two independent experiments, E. coli 7TL cells transformed with this library were plated to an estimated 10-fold coverage on niclosamide-amended media. In total, 910 resistant colonies were selected and then counter-screened for host-cell sensitivity to 1.5 mM metronidazole (Figure 3). This yielded 178 metronidazole-sensitive ‘hits’ that were sent for Sanger sequencing of the plasmid insert, revealing 21 unique inserts. Each of these contained a gene predicted (by BLAST alignment) to encode a flavin-associated enzyme (Table 1). Sequence similarity network analysis indicated that 18 of these were from a superfamily sharing a conserved ‘nitroreductase fold’ as defined by Akiva et al,14 comprising members of the NfsA, NfsB, MhqN, PnbA, TdsD and more distantly related SagB sub-families. The remaining three predicted flavoenzymes were assigned to the structurally unrelated families AzoR and WrbA (Supplementary Figure S6).

Figure 3: Counter-screening of niclosamide-resistant E. coli 7TL eDNA variants to identify metronidazole sensitive strains.

910 niclosamide-resistant colonies were recovered from plating of E. coli 7TL cells transformed with the FatI eDNA library on LB agar amended with 0.5 μM niclosamide. Replicate LB cultures were established from each colony and grown for 4 h in either unamended media as a control or else media amended with 0.5 μM niclosamide or 1.5 mM metronidazole. The percentage growth of niclosamide-challenged cultures (A) or percentage growth inhibition of metronidazole-challenged cultures (B) were calculated relative to the unchallenged control. Panels A and B present data from a single set of representative 96-well plates (each of which contained a media-only blank well as well as one empty pUCX (black bar) and three pUCXMG:azoR_Ec controls (grey bars), the latter of which were expected to be niclosamide-resistant but not metronidazole-sensitive as per Supplementary Figure S1). Data were derived from two biological repeats and error bars represent 1 S.D., while the black dashed lines indicate the cut-off that was imposed to define niclosamide resistance (A) or high-level sensitivity to metronidazole (B). The full screening dataset is available in Supplemental Data S3; overall, 78% of niclosamide-challenged cultures achieved at least 50% culture turbidity (OD600) relative to control and 14% of metronidazole-challenged cultures were at least 80% growth-inhibited relative to control.
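The two-step hit calling described in this legend can be expressed as a small calculation. The sketch below is illustrative only: it assumes the dashed-line cut-offs correspond to the 50% growth and 80% growth-inhibition values quoted in the legend, that the media-only well is used for blank subtraction, and that OD600 readings are the raw input. The function and parameter names are hypothetical.

```python
def percent_growth(od_challenged: float, od_control: float, od_blank: float = 0.0) -> float:
    """Growth of a challenged culture as a percentage of its unchallenged control."""
    return 100.0 * (od_challenged - od_blank) / (od_control - od_blank)


def classify_clone(od_control: float, od_niclosamide: float, od_metronidazole: float,
                   od_blank: float = 0.0,
                   niclosamide_cutoff: float = 50.0,
                   metronidazole_cutoff: float = 80.0) -> dict:
    """Apply both cut-offs; a 'hit' is niclosamide-resistant and metronidazole-sensitive."""
    growth_nic = percent_growth(od_niclosamide, od_control, od_blank)
    inhibition_mtz = 100.0 - percent_growth(od_metronidazole, od_control, od_blank)
    resistant = growth_nic >= niclosamide_cutoff
    sensitive = inhibition_mtz >= metronidazole_cutoff
    return {"niclosamide_resistant": resistant,
            "metronidazole_sensitive": sensitive,
            "hit": resistant and sensitive}


# Hypothetical OD600 readings for two clones.
print(classify_clone(od_control=1.0, od_niclosamide=0.8, od_metronidazole=0.1))
print(classify_clone(od_control=1.0, od_niclosamide=0.9, od_metronidazole=0.7))
```

Under these assumptions a clone resembling the pUCXMG:azoR_Ec control would pass the first cut-off but fail the second, as the legend anticipates.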

Table 1: Evaluation of the 21 pUCXMG:eDNA niclosamide-resistant and metronidazole-sensitive clones.

| NTR variant | NTR family | Metronidazole IC50 (μM) | Metronidazole IC50 (μM) +pRARE | In frame? (a) | GC content (%) | Number of rare codons (b) | Protein length (AA) |
|---|---|---|---|---|---|---|---|
| *MhqN1 | MhqN | 17 ± 7 | 14 ± 5 | Yes | 63.9 | 4 | 205 |
| *MhqN2 | MhqN | 23 ± 8 | 16 ± 5 | Yes | 64.1 | 7 | 205 |
| NfsB1 | NfsB | 92 ± 27 | 43 ± 28 | Yes | 63.3 | 6 | 216 |
| NfsB2 | NfsB | 217 ± 120 | 56 ± 21 | Yes | 62.5 | 7 | 282 |
| *MhqN3 | MhqN | 92 ± 27 | 61 ± 26 | Yes | 67.0 | 4 | 205 |
| **SagB1 | SagB | 189 ± 72 | 62 ± 15 | Yes | 66.5 | 11 | 187 |
| *TdsD1 | TdsD | 130 ± 8 | 102 ± 33 | Yes | 67.8 | 5 | 201 |
| NfsA1 | NfsA | 105 ± 26 | 159 ± 38 | Yes | 67.2 | 14 | 260 |
| NfsB3 | NfsB | 767 ± 305 | 233 ± 91 | Yes | 55.3 | 9 | 216 |
| *TdsD2 | TdsD | 873 ± 308 | 273 ± 46 | No | 44.6 | 10 | 192 |
| PnbA1 | PnbA | 618 ± 68 | 325 | Yes | 57.7 | 5 | 223 |
| *TdsD3 | TdsD | 1200 ± 460 | 365 ± 83 | No | 42.7 | 9 | 192 |
| PnbA2 | PnbA | 1230 ± 260 | 555 ± 42 | No | 43.6 | 10 | 240 |
| PnbA3 | PnbA | 1110 ± 240 | 648 ± 14 | Yes | 39.7 | 13 | 220 |
| **WrbA1 (c) | WrbA | 2460 ± 500 | 1250 ± 430 | Yes | 64.6 | 2 | 194 |
| AzoR1 (c) | AzoR | 2700 ± 980 | 1620 ± 740 | Yes | 63.8 | 5 | 208 |
| *MhqN4 | MhqN | >5000 | 3890 ± 2760 | No | 45.9 | 12 | 233 |
| *MhqN5 | MhqN | 295 ± 77 | ND (d) | Yes | 62.9 | 5 | 221 |
| NfsB4 | NfsB | 1300 ± 1100 | ND (d) | Yes | 70.8 | 5 | 216 |
| *TdsD4 | TdsD | 1870 ± 730 | ND (d) | Yes | 53.7 | 8 | 192 |
| AzoR2 (c) | AzoR | 2500 ± 1290 | ND (d) | Yes | 64.6 | 7 | 209 |
| Controls | | | | | | | |
| pUCX: NfsA_Ec | NfsA | 73 ± 10 | 258 ± 109 | N/A | 52.6 | 2 | 240 |
| pUCX: NfsB_Ec | NfsB | 415 ± 91 | 284 ± 22 | N/A | 51.5 | 1 | 217 |
| Empty pUCXMG | N/A | >5000 | >5000 | N/A | N/A | N/A | N/A |

(a) Whether or not the NcoI site of pUCXMG comprised the likely start codon for the identified nitroreductase (NTR) variant.

(b) Incidence of six codons (AGG, AGA, AUA, CUA, CCC, GGA) underrepresented in E. coli, and for which matching tRNA genes are present on pRARE.

(c) Enzymes that are not part of the structurally-conserved nitroreductase superfamily, as defined by Akiva et al (2017).

(d) ND: insufficient growth in culture to enable IC50 determination.

* indicates enzymes from families that do not contain any previously-characterised bacterial nitroimidazole reductases;

** indicates enzymes from families that do not contain any previously-characterised nitroreductases.

A substantial majority (17/21) of the predicted flavoenzyme genes were ligated in-frame at the NcoI/FatI fusion site of their recombinant pUCXMG vector (Table 1). To eliminate possible chromosomal mutations, each unique plasmid was used to transform fresh E. coli cells, and the resulting strains were then subjected to quantitative IC50 assays. Nine of the 21 recovered nitroreductases were found to sensitise E. coli host cells to lower concentrations of metronidazole than E. coli NfsB, the benchmark enzyme for metronidazole-mediated cell ablation. The two most active of these (MhqN1 and MhqN2) sensitised E. coli to 24-fold and 18-fold lower metronidazole concentrations, respectively, than E. coli NfsB (Table 1).
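The fold-sensitisation values quoted here can be reproduced from the IC50 column of Table 1 (without pRARE). The sketch below simply divides the E. coli NfsB benchmark IC50 by each variant's IC50; the dictionary is transcribed from the table and the names are illustrative.

```python
# Metronidazole IC50 values (μM, without pRARE), transcribed from Table 1.
IC50_UM = {
    "MhqN1": 17, "MhqN2": 23, "NfsB1": 92, "NfsB2": 217, "MhqN3": 92,
    "SagB1": 189, "TdsD1": 130, "NfsA1": 105, "NfsB3": 767, "TdsD2": 873,
    "PnbA1": 618, "TdsD3": 1200, "PnbA2": 1230, "PnbA3": 1110,
    "WrbA1": 2460, "AzoR1": 2700,
    "MhqN4": 5000,  # reported as >5000, so this is a lower bound
    "MhqN5": 295, "NfsB4": 1300, "TdsD4": 1870, "AzoR2": 2500,
}
NFSB_EC_IC50_UM = 415  # benchmark: pUCX:NfsB_Ec control from Table 1


def fold_sensitisation(ic50_um: float, benchmark_um: float = NFSB_EC_IC50_UM) -> float:
    """How many-fold lower a variant's IC50 is than the benchmark's."""
    return benchmark_um / ic50_um


better_than_nfsb = sorted(name for name, value in IC50_UM.items() if value < NFSB_EC_IC50_UM)
print(len(better_than_nfsb), better_than_nfsb)
for name in ("MhqN1", "MhqN2"):
    print(name, round(fold_sensitisation(IC50_UM[name]), 1))
```

This ignores the reported standard deviations, so the ratios are point estimates only.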

Protein production may be further enhanced by co-expression of rare tRNAs

SDS-PAGE analysis of each strain alongside a control expressing E. coli nfsA from plasmid pUCX revealed that the eDNA-derived nitroreductase levels were rather variable, with no over-expressed band being visible in some cases (Figure 4A). Codon analysis of the recovered sequences revealed that nearly all the recovered genes contained a higher number of rare E. coli codons than the native nfsA or nfsB genes (Table 1), which suggested that sub-optimal codon use might be impairing translation and thereby limiting their perceived activity in this host. To alleviate this issue, we co-transformed these strains with pRARE (a plasmid derived from the ROSETTA strain that supplements E. coli cells with rare tRNAs26) and evaluated its effect on levels of enzyme expression (Figure 4B) and metronidazole IC50 (Table 1). Improvements in each parameter were observed for the majority of variants, with the sensitivity to metronidazole of E. coli cells bearing five different nitroreductases (NfsB2, NfsB3, SagB1, TdsD2, and TdsD3) enhanced by over 3-fold (Table 1). However, strains expressing four nitroreductases did not tolerate pRARE co-expression and grew poorly (AzoR2, MhqN5, NfsB4) or not at all (TdsD4), and pRARE also surprisingly impaired the metronidazole IC50 of the control strain bearing pUCX:nfsA_Ec by over 3-fold (Figure 4B, Table 1). Overall, addition of pRARE was generally beneficial to the expression and activity of recovered nitroreductases, suggesting that steps to mitigate codon bias may add value to screening pipelines.
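A rare-codon tally of the kind reported in Table 1 can be sketched as below. This is a minimal illustration, not the analysis pipeline used here: it assumes the six codons listed in the table footnote (written as DNA, so AUA and CUA become ATA and CTA), counts only in-frame codons from the first base of the supplied coding sequence, and uses hypothetical names and toy sequences.

```python
# The six rare codons named in the Table 1 footnote, written in DNA form.
RARE_CODONS = {"AGG", "AGA", "ATA", "CTA", "CCC", "GGA"}


def count_rare_codons(cds: str) -> int:
    """Count in-frame occurrences of the rare codons in a coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return sum(cds[i:i + 3] in RARE_CODONS for i in range(0, usable, 3))


print(count_rare_codons("ATGAGGCCCAAATAA"))  # codons: ATG AGG CCC AAA TAA
print(count_rare_codons("ATGAAGGAA"))        # AGG and GGA occur only out of frame
```

Counting in frame matters because a rare triplet that spans two codons does not require the corresponding rare tRNA.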

Figure 4: SDS-PAGE analysis of E. coli 7TL cells expressing captured nitroreductases.

Enzymes were expressed from pUCXMG, without (top) or with (bottom) co-transformation by pRARE. Protein expression was induced and cultures incubated for 4.5 h, then cell densities normalised and loaded in the same order on each gel (except that there was no TdsD4 sample on the “+pRARE” gel as growth of the corresponding strain could not be achieved in liquid media). Control cultures of cells ± pRARE and expressing NfsA or NfsB from pUCX, or transformed by empty pUCX (V/O; vector only) were treated in identical fashion and analysed on a separate gel (rightmost panels).

An embedded His6-tag allows purification of captured proteins without re-cloning

Based on the metronidazole IC50 data for the pRARE-containing strains (Table 1), we identified seven nitroreductases (MhqN1, MhqN2, NfsB1, NfsB2, MhqN3, SagB1 and TdsD1) that conferred at least a four-fold greater sensitivity to metronidazole than observed for E. coli 7TL cells transformed with pUCX/nfsB. All seven of the corresponding genes were ligated in frame with their start codons positioned within the NcoI/FatI fusion site of pUCXMG, enabling us to test the utility of the embedded His6 tag for protein purification. In all cases, proteins were successfully purified when expressed from the pUCXMG screening plasmid (Figure 5A), avoiding any need to re-clone the corresponding gene inserts into a specialised expression vector prior to protein purification. We noted there was a strong propensity for all proteins other than SagB1 to maintain a dimeric conformation even after boiling in SDS-PAGE loading buffer (Figure 5A). We were also surprised to observe that TdsD1, MhqN1 and SagB1 were exclusively present in the insoluble fraction of lysates derived from the original 7TL screening strain, and it was instead necessary to transfer the corresponding pUCXMG plasmids to the specialised E. coli expression strain BL21 to achieve soluble protein preparations. This additional transfer step could presumably have been avoided by conducting our metagenome screening in a BL21-derived host strain.

Figure 5: In vitro analysis of recombinant nitroreductases.

A. SDS-PAGE analysis. Each nitroreductase was purified as a His6-tagged protein by standard Ni/NTA chromatography post-expression from each respective pUCXMG cloning plasmid. Five micrograms of purified protein were loaded per lane. B, C. Detection of niclosamide and metronidazole reduction products by HPLC. To confirm nitroreductase activity, chromatographic separation was performed on samples containing purified nitroreductase, NADPH and either (B) niclosamide or (C) metronidazole, following incubation for 1 hour at room temperature. Column eluants were monitored at either 320 nm (for niclosamide-nitroreductase reactions) or 262 nm (for metronidazole-nitroreductase reactions). Pictured are the HPLC traces for MhqN2; traces for the remaining nitroreductases are presented in Supplementary Fig. S7. Experimental reactions contained both MhqN2 and either niclosamide (NCS) or metronidazole (MTZ), as indicated by the plus signs. Control reactions lacked either MhqN2 or substrate as indicated by minus signs. Standards of unreacted substrate were also analysed (bottom trace in each panel). D. Detection of metronidazole reduction products by mass spectrometry. LCMS retention times of the inferred products and their predicted chemical formulae are provided alongside the observed and predicted m/z values of species detected in eluants from the +MTZ +MhqN2 experimental reaction.

Protein purification also enabled us to confirm that each enzyme was capable of directly modifying both nitroaromatic substrates. When enzymes were individually incubated with NADPH and either niclosamide or metronidazole, HPLC analyses revealed new species that had different retention times to the unmodified substrate, and that were absent when the enzyme was omitted from the reaction (Figure 5B,C, Supplementary Figure S7). As we lacked standards to identify these species, we further assessed the reduction products of metronidazole by one of the leading nitroreductase candidates, MhqN2, using mass spectrometry (Figure 5D). This showed that MhqN2 was generating species with masses consistent with nitroso, hydroxylamine and amine reduction products of metronidazole, together with a presumed methoxy-hydroxylamine species that likely resulted from a spontaneous reaction with the methanol that was added as a stop solution.

The metagenome-derived nitroreductase MhqN2 outperforms the canonical nitroreductase E. coli NfsB for targeted cell ablation in zebrafish

Having confirmed nitroreductase activity in vitro, we next sought to assess the relative abilities of our top seven enzymes to sensitise transgenically-targeted zebrafish cells, defined by a neuronal promoter, to metronidazole. Nitroreductase-metronidazole mediated cellular ablation employs nitroreductase-expressing transgenes and cell-specific promoter elements to restrict the expression of a metronidazole-active nitroreductase to only the subset of cells active for the chosen promoter. Effective cellular ablation at concentrations of metronidazole that are not toxic to healthy tissues requires that the nitroreductase be expressed efficiently and retain a high-level of prodrug-converting activity in the target cell type.

We have previously observed that certain bacterial nitroreductases express poorly, or not at all, in eukaryotic models, which we attribute to the potential for nitroreductase substrate promiscuity to disrupt primary metabolic pathways.27 To determine whether any of our top seven eDNA-derived nitroreductases could be used for targeted cell ablation in zebrafish, we attempted to create transgenic zebrafish lines co-expressing each enzyme together with a YFP reporter under control of the same neuronal promoter. For this, transgenic UAS reporter/effector lines, Tg(5xUAS:tagYFP-2A-nitroreductase,he:tagBFP2)jh552 fish were generated as previously described.27 Each UAS line was crossed to a previously established Gal4 enhancer trap driver line, Et(2xNRSE-Mmu.fos:KALTA4)gmc617Et,28 to restrict nitroreductase and YFP co-expression to the same set of targeted neurons. The transgenic lines that were recovered for TdsD1, NfsB1 or NfsB2 did not express YFP at detectable levels and were not further investigated. However, we successfully generated distinct transgenic zebrafish lines co-expressing the YFP reporter and MhqN1, MhqN2, MhqN3 or SagB1.

To assay the abilities of these nitroreductase variants to induce cell ablation, larvae from each strain were subjected to a titration of metronidazole concentrations (0, 1, 5 or 10 mM) at 5 days post-fertilisation (5 dpf). After 48 h of exposure, residual levels of YFP expression in 7 dpf larvae were quantified using a TECAN fluorescence microplate reader as previously described.27 Our most-active variant in E. coli, MhqN1, did not appear to be functional in zebrafish. However, partial ablation was apparent for the lines expressing MhqN3 and SagB1, and near complete ablation for the line expressing MhqN2 at all concentrations tested (p<0.0001 relative to control) (Figure 6). The MhqN2 line was then subjected to a further titration of metronidazole concentrations (0.1, 0.2 and 0.5 mM; Figure 6E) that enabled calculation of an absolute EC50 of 430 μM. This was ~5-fold more effective than a previously generated control line co-expressing the benchmark nitroreductase E. coli NfsB and mCherry, Tg(UAS:NTR-mCherry)c26429 in the same neuronal target cells (i.e., crossed to the same Gal4 driver, gmc617Et),28 which yielded an absolute EC50 of 2.3 mM metronidazole (Figure 6D).

Figure 6: Cell ablation efficacy in transgenic zebrafish for neuronal cells expressing lead nitroreductase candidates.

A-E) Transgenic zebrafish larvae co-expressing the indicated nitroreductase and either yellow fluorescent protein (A-C,E) or mCherry (D) in cells of the central nervous system were exposed to a range of metronidazole concentrations to assess relative cell ablation efficacy. In initial tests, the MhqN2 line (E) showed >50% ablation at 1 mM metronidazole and was therefore exposed to lower concentrations to enable measurement of an absolute EC50. Bonferroni-corrected p’-values relative to the control condition (0 mM metronidazole) are indicated by asterisks: *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001 (NS = not significant). F) Micrographs of MhqN2-expressing zebrafish larvae after 48 hours of exposure to control media (above) or media containing 10 mM metronidazole (below).

Discussion

We describe here a broadly applicable strategy to generate small-insert eDNA libraries that are greatly enriched for genes with their start codons placed an optimal distance downstream of a strong E. coli promoter and ribosome binding sequence. This enables efficient selection or screening for weak phenotypes that require high levels of gene expression to manifest, as per the niclosamide and metronidazole converting nitroreductases exemplified here. Although a small proportion of our recovered nitroreductase genes initiated from internal start codons rather than at the NcoI-FatI ligation point, suggesting they might have been recoverable from standard eDNA libraries, over 80% of selected genes were ligated in-frame at the NcoI-FatI fusion point, consistent with the majority having required the boosted expression our cloning strategy provides. We anticipate that this boosted expression will provide substantial benefit to enzyme discovery campaigns that employ ultra-high throughput fluorescence activated cell or droplet sorting technologies, as these impose an extreme requirement for strong signals from very small reaction volumes. However, by minimising the incidence of non-expressing inserts, our approach will also benefit screens that only have low to moderate throughput and hence require a high ‘hit’ frequency31 (e.g., discovery of substrate-converting enzymes using thin-layer chromatography32). We also showed that an N-terminal His6-tag could be embedded in the vector to streamline purification and biochemical evaluation of recovered enzymes. While it is possible that some desirable enzyme variants may not tolerate a purification tag in this position, a pragmatic consideration is that screening with a tag in place will select for enzymes that are more likely to be amenable to biochemical characterisation.

Our strategy was exemplified using eDNA from soil, which can represent many thousands of bacterial species per gram,33,34 but we anticipate it will be readily applicable to other sources, e.g. to interrogate the human gut microbiome to detect drug-modifying enzymes, or to identify enzymes with bioremediation potential from polluted environments. We believe it will also hold great value in discovering individual enzymatic tools for synthetic biology, while similarly-designed libraries that employ larger insert sizes may also prove useful for capturing entire operons, e.g. for discovery of natural product gene clusters by screening for characteristic ‘beacon’ genes.35 However, while our approach may offer substantial advantages in activating the expression of operons that might otherwise be silent,36,37 it will not preferentially clone complete operons over partial ones, so it must be considered that small operons are far more likely to be recovered intact than large ones.

In analysing the occurrence of (C)ATG start codons in genome-sequenced bacteria, we made a surprising observation that cytosine is over-represented at the −1 position relative to ATG start codons. The heightened frequency of cytosines in this position cannot be attributed solely to the GC content of the host organism, as (C)ATG start codons appear nearly twice as frequently as (G)ATG codons. We consider it plausible that the higher incidence of palindromic CATG sequences could reflect secondary structures that may form around the translational start point, with possible regulatory roles (e.g., it was recently shown that reducing mRNA secondary structure around the start codon substantially increased expression of the fluorescent reporter mNeonGreen in both Saccharomyces cerevisiae and E. coli38). Irrespective, while this is a beneficial phenomenon for our FatI cloning strategy in terms of increased likelihood of capturing start codons, it does reflect that our strategy is likely to be biased toward capture of genes from GC-rich bacteria. The lower GC content of E. coli (50.8%39) relative to the majority of recovered nitroreductases likely contributed to the incidence of rare codons and poor expression of some of these enzymes. We showed that expression of some nitroreductases was improved by co-transformation of the host with pRARE; but in some other cases this actually diminished nitroreductase activity. Thus, for groups seeking to maximise gene recovery, there may be value in conducting parallel screens of a host strain transformed by the eDNA library alone, alongside another host that has been co-transformed with pRARE. It is possible that addition of other genes that facilitate heterologous expression, e.g. increase chaperone production, might also improve the recovery of genes from distant phyla.

The great strength of conducting functional metagenomic screens or selections is that one is not limited to only the ‘known unknowns’, i.e. homologues of proteins already known to possess the activity of interest. The power to recover novel biocatalysts was on display here. As far as we are aware, there have been no previous reports of nitroreductase activity for bacterial enzymes from the unrelated SagB (azole biosynthesis) and WrbA (quinone oxidoreductase) families; and while there has been one report apiece of nitroreductase activity from TdsD40 and MhqN41 family members (which share a conserved fold with the better-known NfsA, NfsB and PnbA nitroreductases14), no activity had previously been described with nitroimidazole substrates. Despite this, our two most active metronidazole reductases in an E. coli host (MhqN1, MhqN2) were both from the MhqN enzyme family. The value in recovering a broad range of metronidazole reductases was evident when testing in a transgenic zebrafish model of cellular ablation. This is an environment where we have previously observed only a subset of otherwise-promising nitroreductases to function,27 and likewise, our top-performing variant in E. coli (MhqN1) appeared non-functional in this background. In contrast, MhqN2 appeared ~5-fold more effective than the canonical nitroreductase, E. coli NfsB, which was previously found to be insufficiently active for ablation of certain cell types, e.g., dopaminergic neurons,42 cone photoreceptors,43 and macrophages.27 Importantly, because MhqN2 enables effective ablation at doses of metronidazole that are not toxic to zebrafish (<1mM27), it opens additional opportunities for chronic ablation paradigms, i.e. continuous metronidazole exposure for inducible modelling of long-term degenerative diseases.

Not only does accessing a far greater breadth of diversity increase the chances of uncovering an enzyme that is substantially better than any native enzymes previously known (as was the case here), it also provides a broad range of starting points for directed evolution to further improve the desired activity. This breadth will be beneficial to avoid local maxima that have potential to stall directed evolution campaigns when the initial levels of diversity are low.44,45 Moreover, functional metagenomics and directed evolution both usually require efficient high-throughput screens or selections, and the same basic pipeline can often be applied to further enhance activity by evolving the top enzymes recovered from eDNA library screening. One key difference is that directed evolution usually seeks to discriminate between closely related variants, whereas metagenome screening can uncover entirely unrelated enzymes, and hence is more likely to encounter substantial discrepancies in relative expression levels. The strategies we describe here to boost expression of genes captured in eDNA libraries should mitigate this impact and thereby facilitate combined discovery and evolution campaigns.

Limitations of the study

The most overt limitation of our approach is that captured genes must not only initiate with an ATG start codon (typically around 90% of bacterial start codons) but also possess a cytosine in the −1 position. This requirement likely biases the method toward capture of genes from bacteria with higher GC genomes, as indicated (Figure 2). This may lead to a higher proportion of selected genes being difficult to amplify by PCR, or difficult to express at high levels in an E. coli host cell. Co-expression of rare tRNAs can potentially alleviate the latter issue, but is not a complete solution (e.g., Figure 4). It is likely that screening in alternative hosts would lead to recovery of a different set of genes from the same starting material, but this remains to be tested. The other significant bias is for small DNA inserts over larger ones, as a natural consequence of a restriction enzyme based cloning method. Thus, large biocatalyst ‘unknowns’ are less likely to be discovered than smaller ones.

STAR Methods

RESOURCE AVAILABILITY

Lead contact:

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Professor David Ackerley, david.ackerley@vuw.ac.nz.

Materials availability:

Plasmids pUCX and pUCXMG have been deposited with Addgene (Cat#60681 and Cat#204409, respectively). There are restrictions to the availability of our New Zealand soil eDNA library due to the unresolved Treaty of Waitangi claim Wai262 regarding Māori kaitiakitanga (guardianship) of New Zealand’s taonga resources, but for data preservation, aliquots of this library have been archived offsite at the Ferrier Research Institute in Wellington, New Zealand (https://wgtn.ac.nz/ferrier; deposition ID DFA-1). The Tg(14xUAS-E1B:NTR1.0-mCherry)c264 and Danio rerio wild type AB fish lines have been deposited with ZIRC (https://zebrafish.org/fish/lineAll.php, IDs ZL12341 and ZL1, respectively) and all remaining fish lines generated in this study are available from the Mumm lab upon request (contact jmumm3@jhmi.edu).

Data and code availability:
  • Start codon analysis data is provided as Supplemental Data S2 as detailed further in the Code statement below. Growth inhibition data for all recovered ‘hits’ (cultures derived from colonies picked from niclosamide selection plates) challenged with 0.5 μM niclosamide or 1.5 mM metronidazole are provided as Supplemental Data S3. All recovered nitroreductase gene and protein sequences are provided in Supplemental Data S4.

  • All original code developed for bioinformatic analyses is available at the following GitHub link: https://github.com/michhrich/metagenomic-library-rich-et-al-2023. A version of record has been archived with Zenodo (doi: 10.5281/zenodo.8381319).

  • Any additional information required to reanalyse the data reported in this paper is available from the lead contact upon request.

EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS

All bacterial screening and growth assays were performed using E. coli 7NT or its λDE3 lysogenised derivative 7TL as described in this study. 7NT was derived from the standard laboratory strain W3110 by scarless in-frame deletion of seven candidate nitroreductase genes (nfsA, nfsB, azoR, nemA, yieF, ycaK and mdaB) and the efflux pump gene tolC.46 As detailed below, the FatI library was initially prepared in E. coli DH10B (Invitrogen) and proteins were purified from either E. coli BL21 (Novagen) or 7TL. All zebrafish lines were created in Danio rerio wild type AB fish (ZIRC).

METHOD DETAILS

Media, chemicals and plasmids

All chemicals were sourced from Duchefa Biochemie unless otherwise stated. Bacterial cultures were grown and assessed in Lysogeny Broth (LB) amended with antibiotics as appropriate for plasmid maintenance (100 μg.mL−1 ampicillin for pUCX or pRSETB, 20 μg.mL−1 gentamycin for pUCXMG, and/or 30 μg.mL−1 chloramphenicol for pRARE). Plasmid pUCX (Addgene Cat#60681) was generated in house as previously described.47 Briefly, a 2 kb fragment spanning the lacI gene, tac promoter, lac operator, RBS and rrnB terminator sequence was PCR-amplified from plasmid pMMB67EH (ATCC accession # 37622) and ligated into pUC19 using the restriction enzymes AgeI and SpeI. Plasmid pRSETB (Invitrogen Cat#V35120) bearing a soil eDNA library was kindly provided by Nadia Parachin and Marie Gorwa-Grauslund.24 This library was generated using DNA extracted from garden soil that had been digested with restriction enzymes BamHI and MboI and gel-purified to recover fragments between 2.0 and 6.0 kb in size, prior to ligation into BamHI-treated pRSETB. Plasmid pRARE,26 which expresses tRNAs that recognise the rare E. coli codons AGG, AGA, AUA, CUA, CCC and GGA, was purified from ROSETTA(DE3) competent cells (Novagen Cat#70954). Plasmid pUCXMG (Addgene Cat#204409) was created during this study as detailed below.

Bioinformatics

For this work, 21,675 complete assembled bacterial genomes were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/genome/browse#!/prokaryotes/) in June 2021. A Python script (https://github.com/michhrich/metagenomic-library-rich-et-al-2023/blob/main/supplementary-script-s1) was then used to extract and compile elements from individual chromosomes within each genome (by excluding records with ‘plasmid’ in the record description), including the position −1 to 3 sequences at the start of each predicted open reading frame for each annotated CDS. For genomes containing multiple chromosomes, the results were combined into a single record using a second Python script (https://github.com/michhrich/metagenomic-library-rich-et-al-2023/blob/main/supplementary-script-s2). The scripts are additionally available as Supplemental Data S1, and their combined output was compiled into an Excel worksheet and provided as Supplemental Data S2.
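
The extraction step described above can be illustrated with a minimal Python sketch. This is not the authors' script (which is available at the GitHub link given above); the function names, the tuple-based CDS representation (0-based, end-exclusive coordinates plus strand) and the toy sequence are all assumptions made purely for illustration.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_chromosome(description):
    """Mimic the plasmid filter: keep records lacking 'plasmid' in the description."""
    return "plasmid" not in description.lower()


def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]


def start_context(chromosome, start, end, strand):
    """Return the 4-nt sequence spanning positions -1 to +3 of a CDS.

    start/end are 0-based, end-exclusive forward-strand coordinates (an
    assumed convention); strand is +1 or -1. Returns None when the -1
    position would fall outside the record.
    """
    if strand == 1:
        if start < 1 or start + 3 > len(chromosome):
            return None
        return chromosome[start - 1:start + 3].upper()
    if end - 3 < 0 or end + 1 > len(chromosome):
        return None
    return reverse_complement(chromosome[end - 3:end + 1])


def tally_minus_one(chromosome, cds_list):
    """Count the base at -1 for every CDS that initiates with ATG."""
    counts = Counter()
    for start, end, strand in cds_list:
        context = start_context(chromosome, start, end, strand)
        if context is not None and context[1:] == "ATG":
            counts[context[0]] += 1
    return counts


# Toy record (hypothetical): one forward CDS preceded by C, and one
# reverse-strand CDS whose -1 base is G when read on its own strand.
TOY_CHROMOSOME = "GGCATGAAATAAGGTTATTTCATCGG"
TOY_CDS = [(3, 12, 1), (14, 23, -1)]

if __name__ == "__main__":
    print(tally_minus_one(TOY_CHROMOSOME, TOY_CDS))
```

A (C)ATG start codon of the kind exploited by the FatI strategy corresponds to a context of `CATG` in this representation.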

E. coli growth inhibition assays and IC50 analysis

Day cultures were established by adding 150 μl of overnight culture to 3 ml fresh LB amended with ampicillin and 50 μM IPTG in a 15 ml tube, for each strain to be assessed. Day cultures were incubated to induce protein expression for 2 h at 30 °C with shaking at 200 rpm. For growth inhibition assays, 40 μl aliquots of culture were added to individual wells of a 384 well plate containing 40 μl LB, either unamended as an unchallenged control, or amended with metronidazole or niclosamide at twice the desired final concentration. Culture turbidity (OD600) was read initially (T0), and again following 4 h incubation at 30 °C, 200 rpm (T4). Percentage growth for challenged strains was then calculated by subtracting the T0 value from the T4 value for each well, converting any negative values to zero, then dividing the data for challenged wells by the corresponding data for the unchallenged control. For IC50 assays, a range of growth inhibition data were calculated in an equivalent fashion, from replicate cultures across a two-fold dilution series of 800 μM to 24 nM metronidazole, or 50 μM to 1.5 nM niclosamide. Final IC50 values were calculated from three biological replicates each comprising two technical replicates using GraphPad Prism software.
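
The percentage growth calculation described above can be sketched as follows. The function and argument names are hypothetical, the zero-control guard is an added safety check, and the sixteen-point length of the dilution series is inferred from the stated endpoints rather than given in the text.

```python
def percent_growth(t0_challenged, t4_challenged, t0_control, t4_control):
    """Growth of a challenged well as a percentage of its unchallenged control.

    Each OD600 change is computed as T4 - T0, with negative values set to zero,
    before the challenged value is divided by the control value.
    """
    delta_challenged = max(t4_challenged - t0_challenged, 0.0)
    delta_control = max(t4_control - t0_control, 0.0)
    if delta_control == 0.0:
        raise ValueError("unchallenged control showed no growth")
    return 100.0 * delta_challenged / delta_control


def twofold_series(top_concentration, n_points):
    """Two-fold dilution series starting at top_concentration."""
    return [top_concentration / 2 ** i for i in range(n_points)]


if __name__ == "__main__":
    # Sixteen two-fold steps from 800 uM end near 0.024 uM (about 24 nM).
    metronidazole_um = twofold_series(800.0, 16)
    niclosamide_um = twofold_series(50.0, 16)
    print(metronidazole_um[-1], niclosamide_um[-1])
```

Under this reading, a well whose turbidity falls during the incubation is scored as 0% growth rather than as a negative value.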

Screening of pRSETB soil eDNA library

Initial screening of the pRSETB soil eDNA library created by Parachin and Gorwa-Grauslund24 was performed to >3-fold coverage in E. coli 7TL cells on LB agar amended with ampicillin, 0.5 μM niclosamide and 50 μM IPTG, and yielded three different eDNA inserts containing tolC-like genes. The library was subsequently rescreened on LB agar amended with ampicillin, 50 μM IPTG, niclosamide and the TolC inhibitor phenylalanine-arginine beta naphthylamide (PAβN),48 with or without addition of 1 mM MgSO4 to mitigate the membrane permeabilising effects of PAβN.49 Two paired concentrations of PAβN and niclosamide were used, 100 μM PAβN and 0.1 μM niclosamide, or 50 μM PAβN and 0.2 μM niclosamide (each empirically found to prevent the growth of E. coli 7TL cells expressing tolC-like genes recovered in the initial screen, but permit the growth of 7TL transformed by pUCX bearing E. coli nfsB).

Design and assembly of plasmid pUCXMG

The pUCXMG vector was generated from the pUCX vector backbone with a replacement of the ampicillin resistance marker by a gentamycin resistance cassette and introduction of a modified multiple-cloning site, containing a His6 tag and a downstream NcoI restriction site. For replacement of the antibiotic resistance gene, the pUCX vector was amplified with forward primer CTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAAC and reverse primer ACTCTTCCTTTTTCAATATTATTGAAGC and assembled with a synthetic gentamycin cassette ordered from Twist Bioscience containing 5’ and 3’ 20-bp pUCX homology arms, using NEBuilder® HiFi DNA Assembly (New England Biolabs). A synthetic cloning site comprising an NcoI recognition sequence flanked by XbaI and HindIII restriction sites was ordered from Twist Bioscience and used to replace the XbaI-HindIII region of the pUCX multiple cloning site by restriction cloning. The complete sequence of the final pUCXMG plasmid is available in Supplementary Figure S2.

Generation of a high-expression soil eDNA library using FatI partial digestion

Metagenomic DNA was extracted from soil collected from a private residence in Holloway Road, Wellington, New Zealand as per the protocol of Stevenson et al.50 The eDNA was further purified to remove humic inhibitors by electrophoresis through an agarose gel (1% w/v low-gelling-temperature agarose (Sigma Type VII) in 1× TAE buffer) for 1 h at 150 V, 4 °C. The agarose gel was post-stained with SYBR Safe DNA stain (Thermo Fisher), and the high-molecular-weight DNA was sliced from the gel and digested with β-Agarase I (New England Biolabs) for 1 h at 42 °C. The eDNA was then purified from the digested solution by precipitating with 60% isopropanol + 300 mM sodium acetate pH 5.2 in microcentrifuge tubes. Tubes were centrifuged at 17,000 g, supernatants discarded, and pellets washed with −20 °C 70% EtOH (v/v). After this, supernatants were discarded and pellets air dried for 5 min, then resuspended in 10 mM Tris-HCl pH 8.0 and DNA concentrations determined using a NanoDrop spectrophotometer.

For library generation, eDNA was partially digested by adding 1.1 U FatI/μg eDNA and incubating at 55 °C until test reactions revealed a substantial ‘smear’ in the 0.5 to 5 kb range when visualised on a 1% agarose gel. The digested eDNA was electrophoresed on a low melting temperature agarose gel, with a sacrificial sample in the lane next to the markers being stained for visualisation and the ca. 0.6-1.4 kb range marked. The neighbouring (unstained) lanes were then aligned against the marks and the equivalent regions excised, then DNA fragments extracted and purified as described above. The extracted eDNA fragments were then ligated, in a 2:1 ratio, with pUCXMG vector that had been linearised by NcoI digestion, with incubation overnight at 4 °C. The ligated DNA was co-precipitated with yeast tRNA (1 μl of 1 μg/μl tRNA per 5 μl ligation mixture) using isopropanol/sodium acetate followed by a 70% ethanol wash and resuspension in 10 mM Tris-HCl pH 8.0 as above. The resulting FatI library ligation was used to transform electrocompetent E. coli DH10B cells that were then plated onto LB agar amended with gentamycin: a serial dilution of small aliquots was plated on 90 mm plates to estimate library size, and the remainder on a 150 mm plate. Cells were collected from the latter by adding 2 ml LB broth, scraping, and transferring the liquid to a centrifuge tube. Centrifugation was performed for 1 h at 2,400 g, after which the supernatant was discarded and the pellet resuspended in fresh LB broth to form a thick slurry. Aliquots from the slurry were miniprepped to provide a DNA-level library and the remainder mixed 1:1 with 80% glycerol (v/v) and snap frozen at −80 °C as a renewable stock. Insert rates were estimated by colony PCR using 56 colonies randomly selected from the serial dilution plates used to estimate library size, with the primers pUCX_for (GACATCATAACGGTTCTG) and pUCX_rev (GTTTCACTTCTGAGTTCG) that flank the NcoI cloning site of pUCXMG.
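
To make the logic of the partial digest concrete, the following Python sketch locates FatI (CATG) sites in a sequence and lists the fragments that a partial digest could release within a chosen size window. It is an in silico illustration only: the function names are hypothetical, the default window simply mirrors the ca. 0.6-1.4 kb range excised from the gel, and fragment length is measured from cut site to cut site.

```python
import re


def fati_sites(seq):
    """0-based index of the C in every CATG recognition site."""
    return [match.start() for match in re.finditer("CATG", seq.upper())]


def partial_digest_fragments(seq, min_len=600, max_len=1400):
    """(start, end) coordinates of fragments bounded by any two FatI sites.

    In a partial digest, internal sites may remain uncut, so non-adjacent
    pairs of sites are also considered. Every fragment begins with CATG, so
    after ligation to an NcoI-cut vector its ATG sits at the fusion point.
    """
    sites = fati_sites(seq)
    fragments = []
    for i, left in enumerate(sites):
        for right in sites[i + 1:]:
            if min_len <= right - left <= max_len:
                fragments.append((left, right))
    return fragments


if __name__ == "__main__":
    # Toy sequence (hypothetical) with three FatI sites.
    toy = "AACATG" + "A" * 10 + "CATG" + "T" * 5 + "CATGCC"
    print(fati_sites(toy))
    print(partial_digest_fragments(toy, min_len=5, max_len=30))
```

Only those fragments whose leading CATG coincides with a genuine (C)ATG start codon would be expected to place a captured gene in frame with the vector-encoded expression signals.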

Selection and evaluation of nitroreductases from the FatI eDNA library

E. coli 7TL cells transformed with the FatI eDNA library were plated on LB agar amended with gentamycin, 0.5 μM niclosamide, and either 5 or 50 μM IPTG. Any resulting colonies were individually picked into fresh LB amended with gentamycin in 96 well microplates and the resulting cultures subjected to niclosamide and metronidazole growth inhibition assays as described above. Niclosamide-resistant and metronidazole-sensitive clones were miniprepped and Sanger sequenced by Macrogen (South Korea) in both orientations using primers pUCX_for and pUCX_rev (details above). Sequenced inserts were analysed against the NCBI non-redundant protein sequence database using BLASTx, in each case revealing a predicted protein sequence annotated as a nitroreductase or NAD(P)H-dependent oxidoreductase. These were assigned to a nitroreductase sub-family14 by BLAST search against the Structure-Function Linkage Database (http://sfld.rbvi.ucsf.edu/archive/django/index.html, now archived51) or else annotated as members of the non-homologous AzoR or WrbA enzyme families using the NCBI Conserved Domain tool.52 To generate sequence similarity networks for the 21 nitroreductases recovered from the soil library, the EFI-Enzyme Similarity Tool53 was used. An all-by-all BLAST was performed on each sequence with an E-Value of 5. Edges were drawn between each node if the BLAST pairwise similarity score was at least 8 (Supplementary Figure S6A) or at least 20 (Supplementary Figure S6B). Each network contained 21 nodes, with each node representing a unique nitroreductase. Sequence similarity networks were visualised with Cytoscape54 using the yFiles organic layout algorithm55. All nitroreductase gene and protein sequences are available in Supplemental Data S4 and have been deposited with Genbank (Accession IDs: azoR1 OR525613, azoR2 OR525614, mhqN1 OR525615, mhqN2 OR525616, mhqN3 OR525617, mhqN4 OR525618, mhqN5 OR525619, nfsA1 OR525620, nfsB1 OR525621, nfsB2 OR525622, nfsB3 OR525623, nfsB4 OR525624, pnbA1 OR525625, pnbA2 OR525626, pnbA3 OR525627, sagB1 OR525628, tdsD1 OR525629, tdsD2 OR525630, tdsD3 OR525631, tdsD4 OR525632, wrbA1 OR525633).
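
The effect of the two edge thresholds on a sequence similarity network can be sketched as below. This does not reproduce the EFI tool or Cytoscape; the enzyme labels and pairwise scores are invented placeholders (not data from this study), and the helper names are hypothetical.

```python
def ssn_edges(pairwise_scores, threshold):
    """Keep an edge for each pair whose similarity score meets the threshold."""
    return sorted(pair for pair, score in pairwise_scores.items() if score >= threshold)


def connected_components(nodes, edges):
    """Group nodes into clusters joined by at least one chain of edges (union-find)."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path compression
            node = parent[node]
        return node

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for node in nodes:
        groups.setdefault(find(node), set()).add(node)
    return sorted(groups.values(), key=sorted)


# Placeholder labels and scores, for illustration only.
NODES = ["enzA", "enzB", "enzC", "enzD"]
SCORES = {("enzA", "enzB"): 35, ("enzA", "enzC"): 12, ("enzC", "enzD"): 3}

if __name__ == "__main__":
    for threshold in (8, 20):
        edges = ssn_edges(SCORES, threshold)
        print(threshold, connected_components(NODES, edges))
```

Raising the threshold removes weaker edges, so a permissive network tends to fragment into smaller, more closely related clusters.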

Protein purification and SDS-PAGE analysis

His6-tagged proteins were purified using Ni/NTA columns (Novagen), following expression in the E. coli 7TL screening strain (or E. coli BL21 for TdsD1, MhqN1 or SagB1). Inocula from overnight cultures were incubated in 50 ml of fresh LB containing gentamycin at 37 °C with shaking at 200 rpm until a turbidity (OD600) of 0.5 was achieved. Cultures were then chilled on ice for 15 min, IPTG added to a final concentration of 0.5 mM, and then incubated at 18 °C for 16 h. Following centrifugation, pellets were resuspended in 20 ml HisBind buffer (Novagen) and cells lysed by French pressing, with supernatants from a further centrifugation step being applied to the Ni/NTA columns. Post purification, purity was assessed by SDS-PAGE using 12.5% acrylamide gels with 5 μg protein loaded per lane and bands visualised by staining with Coomassie Brilliant Blue.

For SDS-PAGE analysis of cells expressing nitroreductase genes, cultures of each strain were established as above then incubated for 4.5 h at 30 °C post-addition of 50 μM IPTG, after which cells were spun down, resuspended in 50 μl LB and normalised to an OD600 of 5. A 20 μl volume of each cell resuspension was boiled in SDS-PAGE loading buffer and loaded per lane and bands were visualised by staining with Coomassie Brilliant Blue.

Detection of niclosamide reduction products in vitro

Reactions were established in 100 μl volumes comprising 10 mM Tris-HCl pH 7.5, 50 μM niclosamide and 250 μM NADPH, then initiated by addition of 5 μM purified protein and incubated at room temperature for 1 h. Reactions were terminated by the addition of one volume of ice-cold 100% methanol and stored at −80 °C for at least one hour. Samples were centrifuged for 10 min at 12,000 g at 4 °C and the supernatant was then collected for HPLC analysis. 10 μl of each sample was injected onto a Poroshell-120 EC-C18 2.1 × 100 mm, 2.7 μm (Agilent) and analysed by reverse phase-HPLC employing an Agilent 1260 Infinity II series system. The mobile phase used for HPLC analysis was milli-Q H2O + 0.1% formic acid as aqueous and acetonitrile + 0.1% formic acid as organic. The HPLC run parameters consisted of 1 min at 10% organic phase followed by a linear increase to 100% organic over 17 mins at a flow rate of 0.4 ml.min−1. Eluants were monitored at 320 nm.

Detection of metronidazole reduction products in vitro

Reactions were established in 100 μl volumes comprising 50 mM sodium phosphate buffer pH 7.0, 1 mM metronidazole and 100 μM NADPH together with 5 mM glucose and 0.55 μM Bacillus subtilis glucose dehydrogenase (to regenerate NADPH), then initiated by addition of 5 μM purified protein and incubated at room temperature for 1 h. Reactions were terminated by the addition of one volume of ice-cold 100% methanol and stored at −80 °C for at least one hour. Samples were centrifuged for 10 min at 12,000 g at 4 °C and the supernatant was then collected for HPLC analysis. 10 μl of each sample was injected onto an Ascentis C8 3 μm 150 × 4.6 mm column and analysed by reverse phase-HPLC employing an Agilent 1260 Infinity II series system. The mobile phase used for HPLC analysis was 45 mM formate buffer pH 6.5 as aqueous and 80% acetonitrile as organic. The HPLC run parameters consisted of 4 min at 5% organic followed by a linear increase to 50% organic from 4 to 19 min at a flow rate of 1.5 ml.min−1. Eluants were monitored at 262 nm.

For enzyme MhqN2, eluants were further analysed by LCMS-QTOF analysis to identify reaction products. Mass spectrometry analysis was conducted on an Agilent 6530 Accurate Mass Q-TOF LCMS equipped with a 1260 Infinity binary pump. 1 μl of each sample was injected into an Ascentis C8 3 μm 150 × 4.6mm column. The aqueous phase consisted of 0.05% ammonium formate and the organic phase was acetonitrile + 0.05% formic acid. The HPLC run parameters consisted of 3 min at 5% organic, followed by a linear increase to 100% organic from 3 to 11 min at a flow rate of 1.5 ml.min−1. Eluants were monitored in the positive mode and MassHunter Qualitative Analysis B.08.00 software was used for MS data analysis.
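
The gradient programmes described for the niclosamide and metronidazole analyses can be expressed as a simple piecewise function, sketched below. The names are hypothetical, the niclosamide ramp is assumed to end at 18 min (1 min hold plus a 17 min ramp), and the composition is assumed to hold at its final value after the ramp, which the text does not specify.

```python
def percent_organic(t_min, hold_end, ramp_end, start_pct, end_pct):
    """Organic phase percentage at time t_min for a hold followed by a linear ramp."""
    if t_min <= hold_end:
        return float(start_pct)
    if t_min >= ramp_end:
        return float(end_pct)  # assumed: hold at final composition after the ramp
    fraction = (t_min - hold_end) / (ramp_end - hold_end)
    return start_pct + fraction * (end_pct - start_pct)


# Times in minutes; percentages refer to the organic mobile phase.
GRADIENTS = {
    "niclosamide_hplc": dict(hold_end=1.0, ramp_end=18.0, start_pct=10.0, end_pct=100.0),
    "metronidazole_hplc": dict(hold_end=4.0, ramp_end=19.0, start_pct=5.0, end_pct=50.0),
    "metronidazole_lcms": dict(hold_end=3.0, ramp_end=11.0, start_pct=5.0, end_pct=100.0),
}

if __name__ == "__main__":
    for name, programme in GRADIENTS.items():
        print(name, [percent_organic(t, **programme) for t in (0, 5, 10, 15, 20)])
```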

Evaluation of lead nitroreductases in transgenic zebrafish

A subset of nitroreductases shown to effectively convert metronidazole in bacteria were used to create novel zebrafish transgenic lines. UAS-based reporter/effector transgenes for co-expressing nitroreductase variants and the yellow fluorescent protein tagYFP were assembled and corresponding transgenic lines created as previously described.27 Briefly, transgene constructs comprising fluorescent reporter and nitroreductase genes under control of a UAS cis-acting promoter and separated by a self-cleaving P2A sequence were artificially synthesized (Twist Biosciences) with unique XhoI and XmaI restriction sequences added at either end. Constructs were digested with XhoI and XmaI then ligated into pTOL2 plasmid (Addgene Cat#73483) that had been digested with compatible SalI and XmaI enzymes. The resulting constructs were co-injected with Tol2 transposase into fertilized zebrafish eggs. Injected embryos were raised to sexual maturity and screened for germline transmission of fluorescence. In total, the following Gal4 driver and UAS:nitroreductase lines were used (Tg(14xUAS-E1B:NTR1.0-mCherry)c26456; Et(2xNRSE-Mmu.fos:KALTA4)gmc61728) or were generated anew in this study (Tg(5xUAS:tagYFP-P2A-MhqN1)jh542; Tg(5xUAS:tagYFP-P2A-MhqN2)jh540; Tg(5xUAS:tagYFP-P2A-MhqN3)jh545; Tg(5xUAS:tagYFP-P2A-SagB1)jh543). All UAS lines were crossed to the same previously established Gal4-based driver line, Et(2xNRSE-Mmu.fos:KALTA4)gmc61728, in order to test cell ablation efficacy in the same population of neurons targeted by the gmc617 line. Relative YFP expression levels were quantified following exposure to metronidazole at the indicated concentrations using an established fluorescence plate reader assay57. Briefly, zebrafish larvae were anesthetized by addition of 350 ppm clove oil for 15 min and an Infinite M1000 plate reader (Tecan) with iControl software (version 2.0) was used to quantify fluorescence levels in individual fish. Z-dimension focus settings were defined by averaging the maximal z-dimension scan values of five non-ablated controls. Nine regions per well were scanned to account for random orientation of fish. Evaluations of all nitroreductase strains involved quantifying fluorescence before and after metronidazole exposure to allow normalization per each individual fish (i.e., with relative fluorescence expressed as the post-metronidazole fluorescence reading divided by the pre-metronidazole fluorescence reading). Evaluation of E. coli NfsB efficacy involved a post-metronidazole reading only, as above.
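
The per-fish normalisation described above amounts to a simple ratio, sketched here in Python. The function names and the dictionary layout (dose mapped to a list of pre/post readings) are assumptions for illustration, and the example readings are arbitrary numbers rather than measured values.

```python
from statistics import mean


def relative_fluorescence(pre_reading, post_reading):
    """Post-metronidazole reading divided by the pre-metronidazole reading."""
    if pre_reading <= 0:
        raise ValueError("pre-exposure reading must be positive")
    return post_reading / pre_reading


def mean_relative_fluorescence(readings_by_dose):
    """Average the per-fish ratios within each metronidazole dose group."""
    return {
        dose: mean(relative_fluorescence(pre, post) for pre, post in pairs)
        for dose, pairs in readings_by_dose.items()
    }


if __name__ == "__main__":
    # Arbitrary illustrative readings: {dose in mM: [(pre, post), ...]}
    example = {0.0: [(100.0, 100.0), (200.0, 180.0)], 1.0: [(100.0, 20.0)]}
    print(mean_relative_fluorescence(example))
```

Normalising each fish to its own pre-exposure reading reduces the influence of fish-to-fish variation in baseline reporter expression.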

QUANTIFICATION AND STATISTICAL ANALYSIS

All data was processed and plotted using GraphPad Prism. Absolute EC50 values – i.e., the concentration predicted to elicit 50% cell ablation – were calculated from dose-response data using an online EC50 calculator (https://www.aatbio.com/tools/ec50-calculator) and solving for y = 0.5. Multiple comparison corrected p-values were used for statistical comparisons. Micrographs demonstrating metronidazole-induced cell ablation efficacy in anesthetized zebrafish larvae were collected on a MVX10 Olympus fluorescence stereoscope with an Olympus DP72 camera (MhqN2), or an MV1000 Olympus confocal microscope, as previously described58 (E. coli NfsB). Briefly, confocal z-stacks (Olympus .oib files) of fish expressing YFP or mCherry transgenes were collected and fluorescence signals were calculated for each fluorescence channel across the entire image volume using identical processing parameters (background subtraction, intensity threshold). Cell surfaces were 3D rendered and total fluorescence per channel was calculated using local background-based volumetric quantification.
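
To illustrate what solving a fitted dose-response curve for y = 0.5 involves, the sketch below assumes a four-parameter logistic model; the model used by the online calculator is not stated in the text, so this choice, the parameter names and the example parameter values are all assumptions. The fold difference at the end uses the two absolute EC50 values reported in the Results (2.3 mM for E. coli NfsB and 430 μM for MhqN2).

```python
def four_pl(x, top, bottom, ec50_rel, hill):
    """Four-parameter logistic: response falls from top towards bottom as x rises."""
    return bottom + (top - bottom) / (1.0 + (x / ec50_rel) ** hill)


def absolute_ec50(top, bottom, ec50_rel, hill, target=0.5):
    """Concentration at which the fitted curve crosses the target response.

    Unlike the relative EC50 (the curve midpoint), this is only defined when
    the target lies between the bottom and top plateaus.
    """
    if not bottom < target < top:
        raise ValueError("target response is not reached by this curve")
    ratio = (top - bottom) / (target - bottom) - 1.0
    return ec50_rel * ratio ** (1.0 / hill)


def fold_difference(ec50_reference, ec50_test):
    return ec50_reference / ec50_test


if __name__ == "__main__":
    # Hypothetical fit parameters, chosen only to exercise the functions.
    params = dict(top=1.0, bottom=0.2, ec50_rel=1.0, hill=1.5)
    x_half = absolute_ec50(**params)
    print(x_half, four_pl(x_half, **params))
    # Reported absolute EC50 values, both expressed in mM.
    print(fold_difference(2.3, 0.43))
```

When ablation is incomplete (a bottom plateau above zero), the absolute EC50 is higher than the relative EC50, which is why the two measures should not be compared directly.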

Supplementary Material

Supplemental Data S1: Python Scripts, related to Figure 2

Supplemental Data S2: Start Codon Analysis, related to Figure 2

Supplemental Data S3: Collated Growth Inhibition Data, related to Figure 3

Supplemental Data S4: Nitroreductase Sequences, related to Table 1

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
Bacterial and virus strains
Escherichia coli 7NT | Copp et al.46 | Ackerley lab ID: 7NT
E. coli 7TL | This manuscript | Ackerley lab ID: 7TL
E. coli BL21(DE3) | Novagen | Cat# 69450
Biological samples
NZ FatI soil DNA library in pUCXMG | This manuscript | Archived at the Ferrier Research Institute (https://wgtn.ac.nz/ferrier; deposition ID DFA-1)
Swedish BamHI-MboI soil DNA library in pRSETB | Parachin and Gorwa-Grauslund24 | N/A
Chemicals, peptides, and recombinant proteins
Restriction enzyme FatI | New England Biolabs | Cat# R0650
Restriction enzyme NcoI | New England Biolabs | Cat# R0193
Niclosamide | Sigma-Aldrich | Cat# N3510
Metronidazole | Duchefa Biochemie | Cat# M0131
Critical commercial assays
Deposited data
azoR1 gene sequence | This manuscript | GenBank ID OR525613
azoR2 gene sequence | This manuscript | GenBank ID OR525614
mhqN1 gene sequence | This manuscript | GenBank ID OR525615
mhqN2 gene sequence | This manuscript | GenBank ID OR525616
mhqN3 gene sequence | This manuscript | GenBank ID OR525617
mhqN4 gene sequence | This manuscript | GenBank ID OR525618
mhqN5 gene sequence | This manuscript | GenBank ID OR525619
nfsA1 gene sequence | This manuscript | GenBank ID OR525620
nfsB1 gene sequence | This manuscript | GenBank ID OR525621
nfsB2 gene sequence | This manuscript | GenBank ID OR525622
nfsB3 gene sequence | This manuscript | GenBank ID OR525623
nfsB4 gene sequence | This manuscript | GenBank ID OR525624
pnbA1 gene sequence | This manuscript | GenBank ID OR525625
pnbA2 gene sequence | This manuscript | GenBank ID OR525626
pnbA3 gene sequence | This manuscript | GenBank ID OR525627
sagB1 gene sequence | This manuscript | GenBank ID OR525628
tdsD1 gene sequence | This manuscript | GenBank ID OR525629
tdsD2 gene sequence | This manuscript | GenBank ID OR525630
tdsD3 gene sequence | This manuscript | GenBank ID OR525631
tdsD4 gene sequence | This manuscript | GenBank ID OR525632
wrbA1 gene sequence | This manuscript | GenBank ID OR525633
Experimental models: Cell lines
Tg(14xUAS-E1B:NTR1.0-mCherry)c264 | ZIRC (https://zebrafish.org/fish/lineAll.php) and Pisharath et al.56 | ZIRC ID: ZL12341
Et(2xNRSE-Mmu.fos:KALTA4)gmc617 | Xie et al.28, available from Mumm lab on request (contact jmumm3@jhmi.edu) | Mumm lab ID: gmc617
Tg(5xUAS:tagYFP-P2A-MhqN1)jh542 | This manuscript | Mumm lab ID: jh542
Tg(5xUAS:tagYFP-P2A-MhqN2)jh540 | This manuscript | Mumm lab ID: jh540
Tg(5xUAS:tagYFP-P2A-SagB1)jh543 | This manuscript | Mumm lab ID: jh543
Tg(5xUAS:tagYFP-P2A-MhqN3)jh545 | This manuscript | Mumm lab ID: jh545
Experimental models: Organisms/strains
Danio rerio wild type | ZIRC (https://zebrafish.org/fish/lineAll.php) | ZIRC ID: ZL1
Oligonucleotides
CTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAAC | This manuscript | pUCXassembly_for
ACTCTTCCTTTTTCAATATTATTGAAGC | This manuscript | pUCXassembly_rev
GACATCATAACGGTTCTG | This manuscript | pUCX_for
GTTTCACTTCTGAGTTCG | This manuscript | pUCX_rev
Recombinant DNA
Plasmid pUCXMG | Addgene | Cat# 204409
Plasmid pUCX | Addgene | Cat# 60681
Plasmid pTOL2 | Addgene | Cat# 73483
Plasmid pRARE | Novagen | Cat# 70954
Plasmid pRSETB | Invitrogen | Cat# V35120
Cloned nitroreductases in pUCXMG | This manuscript | GenBank IDs OR525613-OR525633, as per Deposited data
Software and algorithms
Python script to extract and compile −1 to 3 sequence elements from each predicted open reading frame for each annotated CDS from individual chromosomes within each genome downloaded from NCBI | GitHub | https://github.com/michhrich/metagenomic-library-rich-et-al-2023/blob/main/supplementary-script-s1
Python script to combine results from script above into a single record for genomes containing multiple chromosomes | GitHub | https://github.com/michhrich/metagenomic-library-rich-et-al-2023/blob/main/supplementary-script-s2
Archived version of record of Python scripts | Zenodo | doi:10.5281/zenodo.8381319
Other

Significance.

Modern DNA sequencing technologies are probing deeper than ever before into the ‘microbial dark matter’ within complex environments such as soil. However, biochemical characterisation of the diversity of proteins encoded by this sequence is lagging far behind. Functional screening of environmental DNA is an attractive strategy to discover new enzymatic activities because it requires no preconceptions about the types of enzymes likely to catalyse the desired chemistry; such preconceptions are perforce heavily biased toward previously characterised protein families. We describe here an environmental DNA cloning strategy that places potential start codons an optimal distance downstream of a strong host-appropriate promoter and ribosome binding sequence, and show that it greatly enriches for captured genes that express efficiently in the new host cell. This overcomes an important and long-established roadblock to effective functional screening. In particular, it provides access to weak promiscuous activities that require high-level gene expression to confer a detectable phenotype, and yet might hold particular value for biotechnology. We exemplify this by recovering 21 enzymes from eight different families, each of which is active with non-biological molecules, detoxifying the antibiotic niclosamide and activating the prodrug metronidazole. This collection included enzymes from two families that had not previously been implicated in bacterial nitro-reduction. Our best-performing enzyme in a transgenic zebrafish model of targeted cellular ablation was effective at ~5-fold lower concentrations of metronidazole than the previous benchmark enzyme E. coli NfsB, illustrating the power of an unbiased screening approach to recover desirable activities.

Highlights.

  • CATG-targeting restriction enzyme FatI cleaves at potential start codons

  • Environmental genes captured via FatI cloning can be strongly expressed in E. coli

  • Niclosamide-metronidazole used as positive-negative selection for nitroreductases

  • Top enzyme superior to the canonical nitroreductase NfsB for targeted cell ablation

Acknowledgements

This work was supported by the Royal Society of New Zealand Marsden Fund (contract VUW1902; D.F.A., J.G.O.), the Health Research Council of New Zealand (contract 18-532; D.F.A.) and the US National Institutes of Health (R01OD020376 and RF1MH126731 awards to J.S.M. and D.F.A., and a P30 core grant to the Wilmer Eye Institute, P30EY001765). M.H.R. was supported by a PhD Scholarship from the Cancer Research Society of New Zealand, A.V.S. was supported by a Research for Life Postdoctoral Fellowship, A.S.B. was partially supported by a Health Research Council New and Emerging Researcher grant (contract 23-484), and H.R.L.-H. was supported by a Te Herenga Waka – Victoria University of Wellington DVC(Māori) PhD Scholarship.

Inclusion and Diversity

One or more of the authors of this paper self-identifies as an underrepresented ethnic minority in their field of research or within their geographical location. One or more of the authors of this paper self-identifies as a member of the LGBTQIA+ community. One or more of the authors of this paper received support from a program designed to increase minority representation in their field of research.

Footnotes

Declaration of Interests

The authors declare no competing interests.

References

  • 1. Berini F, Casciello C, Marcone GL, Marinelli F. 2017. Metagenomics: novel enzymes from non-culturable microbes. FEMS Microbiol Lett 364: fnx211. DOI: 10.1093/femsle/fnx211.
  • 2. Schmeisser C, Steele H, Streit WR. 2007. Metagenomics, biotechnology with non-culturable microbes. Appl Microbiol Biotechnol 75: 955–62. DOI: 10.1007/s00253-007-0945-5.
  • 3. Uchiyama T, Miyazaki K. 2009. Functional metagenomics for enzyme discovery: challenges to efficient screening. Curr Opin Biotechnol 20: 616–22. DOI: 10.1016/j.copbio.2009.09.010.
  • 4. Bernard G, Pathmanathan JS, Lannes R, Lopez P, Bapteste E. 2018. Microbial Dark Matter Investigations: How Microbial Studies Transform Biological Knowledge and Empirically Sketch a Logic of Scientific Discovery. Genome Biol Evol 10: 707–15. DOI: 10.1093/gbe/evy031.
  • 5. Vanni C, Schechter MS, Acinas SG, Barberán A, Buttigieg PL, Casamayor EO, Delmont TO, Duarte CM, Eren AM, Finn RD, Kottmann R, Mitchell A, Sánchez P, Siren K, Steinegger M, Gloeckner FO, Fernàndez-Guerra A. 2022. Unifying the known and unknown microbial coding sequence space. Elife 11: e67667. DOI: 10.7554/eLife.67667.
  • 6. Ngara TR, Zhang H. 2018. Recent advances in function-based metagenomic screening. Genomics Proteomics Bioinformatics 16: 405–15. DOI: 10.1016/j.gpb.2018.01.002.
  • 7. Han Y, Kinfu BM, Blombach F, Cackett G, Zhang H, Pérez-García P, Krohn I, Salomon J, Besirlioglu V, Mirzaeigarakani T, Schwaneberg U, Chow J, Werner F, Streit WR. 2022. A novel metagenome-derived viral RNA polymerase and its application in a cell-free expression system for metagenome screening. Sci Rep 12: 17882. DOI: 10.1038/s41598-022-22383-x.
  • 8. Bunzel HA, Garrabou X, Pott M, Hilvert D. 2018. Speeding up enzyme discovery and engineering with ultrahigh-throughput methods. Curr Opin Struct Biol 48: 149–56. DOI: 10.1016/j.sbi.2017.12.010.
  • 9. Markel U, Essani KD, Besirlioglu V, Schiffels J, Streit WR, Schwaneberg U. 2020. Advances in ultrahigh-throughput screening for directed enzyme evolution. Chem Soc Rev 49: 233–62. DOI: 10.1039/c8cs00981c.
  • 10. Williams EM, Little RF, Mowday AM, Rich MH, Chan-Hyams JV, Copp JN, Smaill JB, Patterson AV, Ackerley DF. 2015. Nitroreductase gene-directed enzyme prodrug therapy: insights and advances toward clinical utility. Biochem J 471: 131–53. DOI: 10.1042/BJ20150650.
  • 11. Parry R, Nishino S, Spain J. 2011. Naturally-occurring nitro compounds. Nat Prod Rep 28: 152–67. DOI: 10.1039/c0np00024h.
  • 12. Hall KR, Robins KJ, Williams EM, Rich MH, Calcott MJ, Copp JN, Little RF, Schwörer R, Evans GB, Patrick WM, Ackerley DF. 2020. Intracellular complexities of acquiring a new enzymatic function revealed by mass-randomisation of active-site residues. Elife 9: e59081. DOI: 10.7554/eLife.59081.
  • 13. Roldán MD, Pérez-Reinado E, Castillo F, Moreno-Vivián C. 2008. Reduction of polynitroaromatic compounds: the bacterial nitroreductases. FEMS Microbiol Rev 32: 474–500. DOI: 10.1111/j.1574-6976.2008.00107.x.
  • 14. Akiva E, Copp JN, Tokuriki N, Babbitt PC. 2017. Evolutionary and molecular foundations of multiple contemporary functions of the nitroreductase superfamily. Proc Natl Acad Sci U S A 114: E9549–58. DOI: 10.1073/pnas.1706849114.
  • 15. Green LK, Storey MA, Williams EM, Patterson AV, Smaill JB, Copp JN, Ackerley DF. 2013. The Flavin Reductase MsuE Is a Novel Nitroreductase that Can Efficiently Activate Two Promising Next-Generation Prodrugs for Gene-Directed Enzyme Prodrug Therapy. Cancers (Basel) 5: 985–97. DOI: 10.3390/cancers5030985.
  • 16. Prosser GA, Copp JN, Mowday AM, Guise CP, Syddall SP, Williams EM, Horvat CN, Swe PM, Ashoorzadeh A, Denny WA, Smaill JB, Patterson AV, Ackerley DF. 2013. Creation and screening of a multi-family bacterial oxidoreductase library to discover novel nitroreductases that efficiently activate the bioreductive prodrugs CB1954 and PR-104A. Biochem Pharmacol 85: 1091–103. DOI: 10.1016/j.bcp.2013.01.029.
  • 17. Curado S, Anderson RM, Jungblut B, Mumm J, Schroeter E, Stainier DY. 2007. Conditional targeted cell ablation in zebrafish: a new tool for regeneration studies. Dev Dyn 236: 1025–35. DOI: 10.1002/dvdy.21100.
  • 18. Mathias JR, Zhang Z, Saxena MT, Mumm JS. 2014. Enhanced cell-specific ablation in zebrafish using a triple mutant of Escherichia coli nitroreductase. Zebrafish 11: 85–97. DOI: 10.1089/zeb.2013.0937.
  • 19. White DT, Mumm JS. 2013. The nitroreductase system of inducible targeted ablation facilitates cell-specific regenerative studies in zebrafish. Methods 62: 232–40. DOI: 10.1016/j.ymeth.2013.03.017.
  • 20. Copp JN, Pletzer D, Brown AS, Van der Heijden J, Miton CM, Edgar RJ, Rich MH, Little RF, Williams EM, Hancock REW, Tokuriki N, Ackerley DF. 2020. Mechanistic Understanding Enables the Rational Design of Salicylanilide Combination Therapies for Gram-Negative Infections. mBio 11: e02068–20. DOI: 10.1128/mBio.02068-20.
  • 21. Sharrock AV, McManaway SP, Rich MH, Mumm JS, Hermans IF, Tercel M, Pruijn FB, Ackerley DF. 2021. Engineering the Escherichia coli nitroreductase NfsA to create a flexible enzyme-prodrug activation system. Front Pharmacol 12: 701456. DOI: 10.3389/fphar.2021.701456.
  • 22. Peyclit L, Baron SA, Hadjadj L, Rolain JM. 2022. In Vitro Screening of a 1280 FDA-Approved Drugs Library against Multidrug-Resistant and Extensively Drug-Resistant Bacteria. Antibiotics (Basel) 11: 291. DOI: 10.3390/antibiotics11030291.
  • 23. Rajamuthiah R, Fuchs BB, Conery AL, Kim W, Jayamani E, Kwon B, Ausubel FM, Mylonakis E. 2015. Repurposing salicylanilide anthelmintic drugs to combat drug resistant Staphylococcus aureus. PLoS One 10: e0124595. DOI: 10.1371/journal.pone.0124595.
  • 24. Parachin NS, Gorwa-Grauslund MF. 2011. Isolation of xylose isomerases by sequence- and function-based screening from a soil metagenomic library. Biotechnol Biofuels 4: 9. DOI: 10.1186/1754-6834-4-9.
  • 25. Owen JG, Robins KJ, Parachin NS, Ackerley DF. 2012. A functional screen for recovery of 4’-phosphopantetheinyl transferase and associated natural product biosynthesis genes from metagenome libraries. Environ Microbiol 14: 1198–209. DOI: 10.1111/j.1462-2920.2012.02699.x.
  • 26. Kirienko NV, Lepikhov KA, Zheleznaya LA, Matvienko NI. 2004. Significance of codon usage and irregularities of rare codon distribution in genes for expression of BspLU11III methyltransferases. Biochemistry (Mosc) 69: 527–35. DOI: 10.1023/b:biry.0000029851.96180.92.
  • 27. Sharrock AV, Mulligan TS, Hall KR, Williams EM, White DT, Zhang L, Emmerich K, Matthews F, Nimmagadda S, Washington S, Le KD, Meir-Levi D, Cox OL, Saxena MT, Calof AL, Lopez-Burks ME, Lander AD, Ding D, Ji H, Ackerley DF, Mumm JS. 2022. NTR 2.0: a rationally engineered prodrug-converting enzyme with substantially enhanced efficacy for targeted cell ablation. Nat Methods 19: 205–15. DOI: 10.1038/s41592-021-01364-4.
  • 28. Xie X, Mathias JR, Smith MA, Walker SL, Teng Y, Distel M, Köster RW, Sirotkin HI, Saxena MT, Mumm JS. 2012. Silencer-delimited transgenesis: NRSE/RE1 sequences promote neural-specific transgene expression in a NRSF/REST-dependent manner. BMC Biol 10: 93. DOI: 10.1186/1741-7007-10-93.
  • 29. Davison JM, Akitake CM, Goll MG, Rhee JM, Gosse N, Baier H, Halpern ME, Leach SD, Parsons MJ. 2007. Transactivation from Gal4-VP16 transgenic insertions for tissue-specific cell labeling and ablation in zebrafish. Dev Biol 304: 811–24. DOI: 10.1016/j.ydbio.2007.01.033.
  • 30. Sheludko YV, Fessner WD. 2020. Winning the numbers game in enzyme evolution - fast screening methods for improved biotechnology proteins. Curr Opin Struct Biol 63: 123–33. DOI: 10.1016/j.sbi.2020.05.003.
  • 31. Ferrer M, Martínez-Martínez M, Bargiela R, Streit WR, Golyshina OV, Golyshin PN. 2016. Estimating the success of enzyme bioprospecting through metagenomics: current status and future trends. Microb Biotechnol 9: 22–34. DOI: 10.1111/1751-7915.12309.
  • 32. Rabausch U, Juergensen J, Ilmberger N, Böhnke S, Fischer S, Schubach B, Schulte M, Streit WR. 2013. Functional screening of metagenome and genome libraries for detection of novel flavonoid-modifying enzymes. Appl Environ Microbiol 79: 4551–63. DOI: 10.1128/AEM.01077-13.
  • 33. Crits-Christoph A, Diamond S, Butterfield CN, Thomas BC, Banfield JF. 2018. Novel soil bacteria possess diverse genes for secondary metabolite biosynthesis. Nature 558: 440–4. DOI: 10.1038/s41586-018-0207-y.
  • 34. Roesch LF, Fulthorpe RR, Riva A, Casella G, Hadwin AK, Kent AD, Daroub SH, Camargo FA, Farmerie WG, Triplett EW. 2007. Pyrosequencing enumerates and contrasts soil microbial diversity. ISME J 1: 283–90. DOI: 10.1038/ismej.2007.53.
  • 35. Baltz RH. 2017. Molecular beacons to identify gifted microbes for genome mining. J Antibiot (Tokyo) 70: 639–46. DOI: 10.1038/ja.2017.1.
  • 36. Mao D, Okada BK, Wu Y, Xu F, Seyedsayamdost MR. 2018. Recent advances in activating silent biosynthetic gene clusters in bacteria. Curr Opin Microbiol 45: 156–63. DOI: 10.1016/j.mib.2018.05.001.
  • 37. Rutledge PJ, Challis GL. 2015. Discovery of microbial natural products by activation of silent biosynthetic gene clusters. Nat Rev Microbiol 13: 509–23. DOI: 10.1038/nrmicro3496.
  • 38. Hector RE, Mertens JA, Nichols NN. 2021. Increased expression of the fluorescent reporter protein ymNeonGreen in Saccharomyces cerevisiae by reducing RNA secondary structure near the start codon. Biotechnol Rep (Amst) 33: e00697. DOI: 10.1016/j.btre.2021.e00697.
  • 39. Blattner FR, Plunkett G 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, Gregor J, Davis NW, Kirkpatrick HA, Goeden MA, Rose DJ, Mau B, Shao Y. 1997. The complete genome sequence of Escherichia coli K-12. Science 277: 1453–62. DOI: 10.1126/science.277.5331.1453.
  • 40. Takahashi S, Furuya T, Ishii Y, Kino K, Kirimura K. 2009. Characterization of a flavin reductase from a thermophilic dibenzothiophene-desulfurizing bacterium, Bacillus subtilis WU-S2B. J Biosci Bioeng 107: 38–41. DOI: 10.1016/j.jbiosc.2008.09.008.
  • 41. Takeda K, Iizuka M, Watanabe T, Nakagawa J, Kawasaki S, Niimura Y. 2007. Synechocystis DrgA protein functioning as nitroreductase and ferric reductase is capable of catalyzing the Fenton reaction. FEBS J 274: 1318–27. DOI: 10.1111/j.1742-4658.2007.05680.x.
  • 42. Godoy R, Noble S, Yoon K, Anisman H, Ekker M. 2015. Chemogenetic ablation of dopaminergic neurons leads to transient locomotor impairments in zebrafish larvae. J Neurochem 135: 249–60. DOI: 10.1111/jnc.13214.
  • 43. Fraser B, DuVal MG, Wang H, Allison WT. 2013. Regeneration of cone photoreceptors when cell ablation is primarily restricted to a particular cone subtype. PLoS One 8: e55410. DOI: 10.1371/journal.pone.0055410.
  • 44. Gupta RD, Tawfik DS. 2008. Directed enzyme evolution via small and effective neutral drift libraries. Nat Methods 5: 939–42. DOI: 10.1038/nmeth.1262.
  • 45. Packer MS, Liu DR. 2015. Methods for the directed evolution of proteins. Nat Rev Genet 16: 379–94. DOI: 10.1038/nrg3927.
  • 46. Copp JN, Williams EM, Rich MH, Patterson AV, Smaill JB, Ackerley DF. 2014. Toward a high-throughput screening platform for directed evolution of enzymes that activate genotoxic prodrugs. Protein Eng Des Sel 27: 399–403. DOI: 10.1093/protein/gzu025.
  • 47. Prosser GA, Copp JN, Syddall SP, Williams EM, Smaill JB, Wilson WR, Patterson AV, Ackerley DF. 2010. Discovery and evaluation of Escherichia coli nitroreductases that activate the anti-cancer prodrug CB1954. Biochem Pharmacol 79: 678–87. DOI: 10.1016/j.bcp.2009.10.008.
  • 48. Lomovskaya O, Warren MS, Lee A, Galazzo J, Fronko R, Lee M, Blais J, Cho D, Chamberland S, Renau T, Leger R, Hecker S, Watkins W, Hoshino K, Ishida H, Lee VJ. 2001. Identification and characterization of inhibitors of multidrug resistance efflux pumps in Pseudomonas aeruginosa: novel agents for combination therapy. Antimicrob Agents Chemother 45: 105–16. DOI: 10.1128/AAC.45.1.105-116.2001.
  • 49. Lamers RP, Cavallari JF, Burrows LL. 2013. The efflux inhibitor phenylalanine-arginine beta-naphthylamide (PAβN) permeabilizes the outer membrane of gram-negative bacteria. PLoS One 8: e60666. DOI: 10.1371/journal.pone.0060666.
  • 50. Stevenson LJ, Ackerley DF, Owen JG. 2022. Preparation of soil metagenome libraries and screening for gene-specific amplicons. Methods Mol Biol 2397: 3–17. DOI: 10.1007/978-1-0716-1826-4_1.
  • 51. Akiva E, Brown S, Almonacid DE, Barber AE 2nd, Custer AF, Hicks MA, Huang CC, Lauck F, Mashiyama ST, Meng EC, Mischel D, Morris JH, Ojha S, Schnoes AM, Stryke D, Yunes JM, Ferrin TE, Holliday GL, Babbitt PC. 2014. The Structure-Function Linkage Database. Nucleic Acids Res 42(Database issue): D521–30. DOI: 10.1093/nar/gkt1130.
  • 52. Lu S, Wang J, Chitsaz F, Derbyshire MK, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI, Marchler GH, Song JS, Thanki N, Yamashita RA, Yang M, Zhang D, Zheng C, Lanczycki G, Marchler-Bauer A. 2020. CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Res 48(D1): D265–D268. DOI: 10.1093/nar/gkz991.
  • 53. Oberg N, Zallot R, Gerlt JA. 2023. EFI-EST, EFI-GNT, and EFI-CGFP: Enzyme Function Initiative (EFI) Web Resource for Genomic Enzymology Tools. J Mol Biol 168018. DOI: 10.1016/j.jmb.2023.168018.
  • 54. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–504. DOI: 10.1101/gr.1239303.
  • 55. Atkinson HJ, Morris JH, Ferrin TE, Babbitt PC. 2009. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies. PLoS One 4: e4345. DOI: 10.1371/journal.pone.0004345.
  • 56. Pisharath H, Rhee JM, Swanson MA, Leach SD, Parsons MJ. 2007. Targeted ablation of beta cells in the embryonic zebrafish pancreas using E. coli nitroreductase. Mech Dev 124: 218–29. DOI: 10.1016/j.mod.2006.11.005.
  • 57. Walker SL, Ariga J, Mathias JR, Coothankandaswamy V, Xie X, Distel M, Köster RW, Parsons MJ, Bhalla KN, Saxena MT, Mumm JS. 2012. Automated reporter quantification in vivo: high-throughput screening method for reporter-based assays in zebrafish. PLoS One 7: e29916. DOI: 10.1371/journal.pone.0029916.
  • 58. Ariga J, Walker SL, Mumm JS. 2010. Multicolor time-lapse imaging of transgenic zebrafish: visualizing retinal stem cells activated by targeted neuronal cell ablation. J Vis Exp (43): 2093. DOI: 10.3791/2093.

Data Availability Statement

  • Start codon analysis data are provided as Supplemental Data S2, as detailed further in the code statement below. Growth inhibition data for all recovered ‘hits’ (cultures derived from colonies picked from niclosamide selection plates) challenged with 0.5 μM niclosamide or 1.5 mM metronidazole are provided as Supplemental Data S3. All recovered nitroreductase gene and protein sequences are provided in Supplemental Data S4.

  • All original code developed for bioinformatic analyses is available at the following GitHub link: https://github.com/michhrich/metagenomic-library-rich-et-al-2023. A version of record has been archived with Zenodo (doi: 10.5281/zenodo.8381319).

  • Any additional information required to reanalyse the data reported in this paper is available from the lead contact upon request.
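
For orientation, the kind of start-codon analysis referred to above (extracting the −1 to +3 sequence element of each annotated CDS) can be sketched as follows. This is not the authors' deposited script: the function names, the coordinate convention (0-based, half-open intervals on the forward strand) and the toy sequence are assumptions made for illustration only. A CATG window means the start codon sits inside a FatI recognition site.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def start_context(genome, start, end, strand):
    """Return the 4-nt window spanning positions -1 to +3 of a CDS.

    `start` and `end` are 0-based, half-open forward-strand coordinates
    (an assumed convention). Returns None if the CDS abuts the edge of
    the sequence, so that no -1 base is available.
    """
    if strand == "+":
        if start < 1:
            return None
        return genome[start - 1:start + 3].upper()
    # Minus strand: the start codon lies at the right-hand end of the
    # interval, and the -1 base is the forward-strand base just after it.
    if end + 1 > len(genome):
        return None
    return reverse_complement(genome[end - 3:end + 1]).upper()


def tally_contexts(genome, cds_list):
    """Count each -1 to +3 window over (start, end, strand) tuples."""
    counts = Counter()
    for start, end, strand in cds_list:
        context = start_context(genome, start, end, strand)
        if context is not None and len(context) == 4:
            counts[context] += 1
    return counts


# Toy sequence: one plus-strand CDS preceded by C, one minus-strand CDS
# preceded (on its own strand) by G.
toy_genome = "GGCATGAAATAAGG" + "AATTAGGGCATCAA"
toy_cds = [(3, 12, "+"), (16, 25, "-")]
toy_counts = tally_contexts(toy_genome, toy_cds)
fati_fraction = toy_counts["CATG"] / sum(toy_counts.values())
```

In a real analysis the CDS coordinates would come from the genome annotation rather than a hand-written list, and the per-chromosome tallies would be summed for genomes with more than one replicon, as the second deposited script is described as doing.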
