Abstract
A rapid protocol was developed for constructing plasmid libraries from small quantities of genomic/metagenomic DNA. The technique utilizes linker amplification with topoisomerase cloning and allows for inducible transcription in Escherichia coli. As proof of principle, several anti-Bacillus lysins were cloned from bacteriophage genomes and an aerolysin was cloned from a metagenomic sample.
The field of metagenomics offers unique perspectives on unculturable microorganisms and their biosynthetic products (4, 13, 17, 19). Overall, metagenomic research can be subdivided into two categories. In bioinformatic studies, environmental DNA is sequenced to gain insight into a sample's diversity and phylogeny (18, 20). By contrast, in functional metagenomics, environmental genes are expressed within a host and clones are screened for phenotype acquisition (9, 12). A potential limiting factor in either case is the initial amount of DNA, for instance, if the purity/quantity of the sample is low or if the analysis involves a subset of an environmental population. An example of the latter is shown in viral metagenomics, in which phage particles are isolated prior to DNA extraction (5, 6). Random amplification has proven important in these situations, with a prominent method being the linker-amplified shotgun library (LASL) approach (2, 3). Here, DNA (∼1 μg or less) is fragmented, short linkers are attached, and PCR is conducted with primers targeting the linkers.
We report here a modified LASL approach combining linker amplification with topoisomerase cloning (15). The technique is particularly well suited for functional screening, as it rapidly generates expression libraries with gene-sized inserts. These libraries are referred to as expressible LASLs (E-LASLs), and their utility was demonstrated with genomic DNA from four Bacillus bacteriophages isolated from bat guano and metagenomic DNA from the gut contents of an earthworm (Eisenia hortensis). For all samples, 100 ng of DNA was fragmented with Tsp509I (consensus sequence AATT; 0.01 or 0.1 U enzyme; 50-μl reaction mixture volume in NEB buffer 2; 1 min of digestion at 65°C). Following phenol-chloroform extraction and ethanol precipitation, the DNA was ligated to 40 ng of linker carrying a complementary 5′ overhang (AATTCGGCTCGAG, where the overhang is the leading AATT). The ligation mixture was used as the template for Taq-based PCR (1 μl template per 50-μl reaction mixture volume; linker-targeted primer [CCATGACTCGAGCCGAATT]; PCR conditions of 95°C for 1 min; 40 cycles of 95°C for 30 s, 55°C for 30 s, and 72°C for 5 min; and 72°C for 10 min).
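The relationship between the restriction site, the linker, and the primer can be checked in silico. The following is a minimal Python sketch, not part of the published protocol: it cuts a made-up toy sequence immediately 5′ of every AATT and confirms that the 3′ end of the primer is the reverse complement of the linker strand. The toy sequence and the function names are illustrative assumptions, and the sketch models complete digestion, whereas the brief, low-enzyme digestion described above is presumably partial.

```python
# Sequences quoted in the protocol text.
LINKER = "AATTCGGCTCGAG"          # linker strand, 5' AATT overhang first
PRIMER = "CCATGACTCGAGCCGAATT"    # linker-targeted PCR primer
SITE = "AATT"                     # Tsp509I recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def digest(seq: str, site: str = SITE) -> list[str]:
    """Cut a linear top strand immediately 5' of every occurrence of `site`.

    Models complete digestion only; partial digestion would leave
    some sites uncut and give longer fragments.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        if pos > 0:               # a site at position 0 yields no new fragment
            cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# ASSUMPTION: toy sequence invented for illustration, not real phage DNA.
toy = "GCGTAATTCCGGAATTGCA"
fragments = digest(toy)
print(fragments)

# The primer anneals to the linker: its 3' end is the linker's reverse complement.
print(PRIMER.endswith(reverse_complement(LINKER)))
```

Because every internal fragment begins with AATT on the top strand, each cut end carries the same 5′ overhang as the linker, which is what allows a single linker and a single primer to amplify all fragments.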
The resultant E-LASL length distributions ranged from 500 bp to >4 kb (Fig. 1). For each E-LASL, 1 volume of raw PCR product was added to 3 volumes of distilled water, 1 volume of 1× salt solution, and 1 volume of topoisomerase-conjugated pBAD plasmid (pBAD TOPO TA expression kit; Invitrogen). Following a 1-min room temperature incubation, the mixtures were transformed into competent Escherichia coli TOP10 (Invitrogen) and plated onto LB-ampicillin. The mean insert sizes in the resultant clones were determined through PCR of randomly selected colonies: 2.27 ± 0.74 kb (n = 97) for the phage genomic libraries (digested with 0.01 U Tsp509I) and 1.99 ± 0.61 kb (n = 65) for the metagenomic library (digested with 0.1 U Tsp509I). For all E-LASLs, about 34% of colonies were determined by electrophoresis and sequencing to contain a circularized plasmid without any insert. Although they were an unavoidable by-product of the kits, these clones fortuitously did not proliferate when replicated onto arabinose-containing medium. In terms of colony yield, the number of insert-containing clones that could be generated per topoisomerase reaction varied from library to library. For the libraries used here, the average number of clones per 6-μl reaction mixture was 1,187 (range, 596 to 2,713).
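As a rough guide to how many colonies such a library needs, the classic Clarke-Carbon estimate, N = ln(1 - P) / ln(1 - f), gives the number of insert-containing clones N required for a probability P that a given locus is present, where f is the ratio of insert size to genome size. The sketch below applies it to the mean phage insert size and empty-vector fraction reported above. The 40-kb genome size is a hypothetical placeholder (the genome sizes of these phages are not stated here), and the estimate ignores insert orientation, expression failure, and uneven site distribution, so it should be read as a lower bound rather than a prediction.

```python
import math


def clones_needed(insert_bp: float, genome_bp: float, probability: float) -> int:
    """Clarke-Carbon estimate of clones needed to capture a locus with given probability."""
    if not 0 < probability < 1:
        raise ValueError("probability must lie strictly between 0 and 1")
    fraction = insert_bp / genome_bp
    if not 0 < fraction < 1:
        raise ValueError("insert must be shorter than the genome")
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


MEAN_INSERT_BP = 2270    # mean insert size of the phage libraries (from the text)
EMPTY_FRACTION = 0.34    # fraction of colonies lacking an insert (from the text)
GENOME_BP = 40_000       # ASSUMPTION: hypothetical phage genome size

n_inserts = clones_needed(MEAN_INSERT_BP, GENOME_BP, 0.99)
# Scale up to total colonies, since about a third carry no insert.
n_colonies = math.ceil(n_inserts / (1 - EMPTY_FRACTION))
print(n_inserts, n_colonies)
```

Under these assumptions the required colony count is far below the yield of a single topoisomerase reaction, which is consistent with screening a phage genome from one reaction.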
FIG. 1.
(a) E-LASL amplification products. Lanes 2 to 5 depict E-LASLs constructed from four Bacillus phage genomic samples (100 ng DNA per sample; 1-min digestion; 0.01 U Tsp509I; 2-μl reaction product per well). The E-LASL depicted in lane 6 was constructed from metagenomic DNA extracted from earthworm gut contents (100 ng DNA; 1-min digestion; 0.1 U Tsp509I; 2-μl reaction product per well). (b) Digested DNA prior to linker amplification. Lane 2 contains undigested phage genomic DNA. One microgram was digested under the same conditions as used during E-LASL construction: 0.1 U Tsp509I/100 ng DNA/50-μl reaction volume (lane 3) and 0.01 U Tsp509I/100 ng DNA/50-μl reaction volume (lane 4). The undigested DNA, however, is not amplified during PCR and is noncontributory to the final libraries. We should note that the amplified libraries could contain ligated chimeras of two or more digested fragments (the likely origin of the longest E-LASL components shown in panel 1a). Such chimeras are of little concern, however, given the functional nature of the screens and the fact that the majority of unamplified DNA in panel 1b was gene sized or greater in length.
Clones were replicated onto LB-agar with 0.2% arabinose and examined for phenotype acquisition. For the genomic libraries, clones were screened for phage lysins (hydrolases that digest bacterial cell walls during phage infection [reviewed in references 1 and 10]). Chloroform-permeabilized E. coli isolates were overlaid with Bacillus anthracis ΔSterne (20 μl log-phase culture per 7 ml molten soft agar) and monitored for clones around which bacilli failed to proliferate. Lysins were identified for three of the four phage libraries screened (designated BG1, BG2, and BG3). The proportion of positive hits varied among libraries: BG1, 8 hits/640 clones screened; BG2, 1 hit/540 clones screened; and BG3, 1 hit/2,713 clones screened. For the fourth phage genomic library, 1,222 clones were screened without any observed hits.
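To make the spread in hit frequency easier to compare, the counts above can be converted to percentages and to clones screened per hit. This is only a small illustrative Python sketch using the figures quoted in the text; the dictionary layout and function name are assumptions of the example.

```python
# (hits, clones screened) for each phage genomic library, as quoted in the text.
screens = {
    "BG1": (8, 640),
    "BG2": (1, 540),
    "BG3": (1, 2713),
    "fourth library": (0, 1222),
}


def hit_rate_percent(hits: int, clones: int) -> float:
    """Percentage of screened clones that scored positive."""
    return 100.0 * hits / clones


for name, (hits, clones) in screens.items():
    rate = hit_rate_percent(hits, clones)
    # Clones per hit is undefined when no hit was observed.
    per_hit = f"{clones / hits:.0f} clones per hit" if hits else "no hits observed"
    print(f"{name}: {rate:.3f}% ({per_hit})")

total_hits = sum(h for h, _ in screens.values())
total_clones = sum(c for _, c in screens.values())
print(f"overall: {total_hits} hits in {total_clones} clones")
```

The roughly 30-fold difference between BG1 and BG3 illustrates why the number of clones to screen cannot be fixed in advance for an uncharacterized phage.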
Based on BLAST homology, the BG2 and BG3 lysins have N-acetylmuramoyl-L-alanine amidase activity (GenBank accession numbers EU258892 and EU258893), while the BG1 lysin has N-acetylmuramidase activity (accession number EU258891). Overall, alanine amidases are far more common than N-acetylmuramidases among Bacillus phages/prophages. In fact, the only known muramidase homologue of the BG1 lysin (which we termed PlyBeta) among Bacillus phages is the PlyB lysin from the BcpI phage (14). These two enzymes share 78% nucleotide sequence identity and 81% amino acid sequence identity, including all putative catalytic residues (see Fig. S1 in the supplemental material). The activity of PlyBeta was confirmed qualitatively against 14 Bacillus species/strains through soft agar overlay experiments. Like PlyB, it is variably active against different bacilli. Clearing zones were noted for the following seven strains (with corresponding clearing zone radii in parentheses): B. anthracis ΔSterne (4 mm), B. anthracis 222 (4 mm), Bacillus cereus 03BB87 (7 mm), B. cereus E33L ZK (4 mm), B. cereus 4429/73 FRI-16 (7 mm), Bacillus thuringiensis HD73 (4 mm), and Bacillus subtilis SL4 (8 mm). Clearing zones were not observed for B. cereus ATCC 10987, B. cereus ATCC 14579, B. cereus 13100, B. thuringiensis HD866, B. thuringiensis Al-Hakam, Bacillus mycoides 6462, and an undefined strain of Bacillus megaterium.
Prior to functional screening of the worm gut metagenomic E-LASL, a brief sequence-based analysis was conducted to verify the bacterial origin of the DNA (see Table S1 in the supplemental material). These data suggested a gut environment to which Proteobacteria (in particular, members of the Aeromonas genus) made a prominent contribution. The metagenomic library was subjected to soft agar overlay screens in which permeabilized clones were each overlaid with soft agar containing 15% sheep's blood. This screen was employed due to the origins of the metagenomic DNA; bacterial species carry various hemolysins that are readily detectable in this manner. Indeed, among 5,005 clones screened, a zone of β-hemolysis was noted around one colony (Fig. 2a). BLAST analysis showed the hemolytic clone to carry a new member of the aerolysin gene family, a class of pore-forming exotoxins secreted by aeromonads (11). Compared to known aerolysins, the protein (accession number EU258894) demonstrated an average Clustal nucleotide alignment score of 71.1 (range, 68 to 78) and an average amino acid score of 73.2 (range, 68 to 79) (16). Phylogenetic analysis, moreover, showed it to occupy a distinct evolutionary position relative to other aerolysins (Fig. 2b). In order to confirm its activity quantitatively, the new aerolysin was expressed and purified (see Fig. S2 in the supplemental material) and its ability to lyse various erythrocytes was tested with an in vitro hemolysis assay (Fig. 2c) (7). Under the assay conditions (10⁸ red blood cells/ml in phosphate-buffered saline; 1 h of incubation at 37°C), the erythrocyte sensitivity could be ranked as follows: rabbit erythrocytes were the most sensitive (complete hemolysis at 5 nM), sheep erythrocytes were the least sensitive (complete hemolysis at 200 nM), and human and chicken erythrocytes demonstrated intermediate sensitivity.
FIG. 2.
(a) Positive hemolysin clone. During a screen of worm gut contents, a single hemolytic colony was observed from 5,005 clones screened (indicated by the arrow and magnified in the inset). This clone contained a metagenomic DNA insert encoding a novel aerolysin. (b) Phylogenetic analysis. Nucleotide alignment was conducted with the metagenomic aerolysin and all other nonredundant aerolysins in the NCBI database: Aeromonas hydrophila I (accession number M16495), A. hydrophila II (X65044), A. hydrophila III (X65045), A. hydrophila IV (AY611033), A. hydrophila V (AF41110466), A. hydrophila VI (M84709), A. hydrophila VII (DQ40826), A. hydrophila VIII (AY378303), Aeromonas sobria I (X65046), A. sobria II (AY157998), A. sobria III (Y00559), Aeromonas salmonicida I (X65048), Aeromonas trota I (AF064068), and Aeromonas caviae (AAC44637). Multiple sequence alignment was conducted with ClustalX, version 1.81 (16), and phylogenetic analysis was performed with PHYLIP, version 3.67 (8). DNA distance and parsimony methods were employed with 1,000 rounds of bootstrapping and all other parameters at default values. The figure presents the distance-based tree, with bootstrap values listed at the nodes; the topology of the parsimony-based tree was largely congruent. (c) Aerolysin-induced hemolysis of red blood cells. Erythrocytes from several species showed susceptibility to the metagenomic aerolysin in a quantitative hemolysis assay. Rabbit erythrocytes were the most sensitive, sheep erythrocytes were the least sensitive, and human and chicken erythrocytes demonstrated intermediate sensitivity. There were six hemolysis measurements for rabbits, sheep, and chickens, and there were three hemolysis measurements for humans. Several hemolysis measurements extend slightly above the 100% value. This is an artifact of the experimental standards; in these cases, aerolysin-induced hemolysis exceeded hypotonic hemolysis (defined as 100%).
Rabbit and sheep cells were purchased defibrinated, while chicken cells were purchased in Alsever's solution (Cleveland Scientific); human cells were collected from a healthy donor and placed into a heparinized vacuum tube immediately prior to use (following necessary consent and protocol guidelines). Error bars indicate standard deviations.
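The normalization described in the caption, in which hypotonic lysis is defined as 100%, can be sketched as follows. This is a hedged Python illustration rather than the authors' analysis code: the absorbance readings are invented, and subtracting a buffer-only blank is an assumption of the example that the text does not spell out.

```python
def percent_hemolysis(a_sample: float, a_buffer: float, a_hypotonic: float) -> float:
    """Express a sample reading as a percentage of hypotonic (reference) lysis.

    a_sample    -- absorbance of supernatant from toxin-treated cells
    a_buffer    -- absorbance from cells in buffer alone (ASSUMED blank)
    a_hypotonic -- absorbance from hypotonically lysed cells (defined as 100%)
    """
    span = a_hypotonic - a_buffer
    if span <= 0:
        raise ValueError("hypotonic control must exceed the buffer blank")
    return 100.0 * (a_sample - a_buffer) / span


# ASSUMPTION: hypothetical absorbance values chosen only for illustration.
print(percent_hemolysis(a_sample=0.50, a_buffer=0.05, a_hypotonic=0.50))
print(percent_hemolysis(a_sample=0.55, a_buffer=0.05, a_hypotonic=0.50))
```

The second call shows how a value above 100% arises whenever toxin-induced lysis releases slightly more hemoglobin than the hypotonic standard, as noted for several measurements in panel c.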
Although used here to identify phage lysins and a hemolysin, the E-LASL approach can be adapted to fit diverse screening needs. By modifying the soft agar overlay, for instance, one could target other classes of biocatalysts and active proteins. And while we employed Tsp509I to fragment DNA, other 4-bp consensus enzymes could be substituted to avoid biasing against GC-rich samples. Mechanical fragmentation could also be employed, for instance, if restriction enzyme inhibition due to DNA modification is a concern. The E-LASL approach could even be adapted to other expression systems (i.e., different hosts and/or vectors) if commercial topoisomerase cloning was omitted in favor of traditional ligation strategies. Finally, it is important to note the limitations of the technique. As in all functional screens, certain proteins may fail to express or may prove toxic to the host. This limitation could explain, for instance, why the genomic screen of one phage failed to identify a lysin-encoding clone. Moreover, E-LASLs should not be expected to identify secondary metabolites (e.g., antibiotic small molecules) encoded within large biosynthetic gene clusters.
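The GC bias mentioned above can be made concrete with a simple composition model. The Python sketch below estimates the mean spacing between occurrences of a 4-bp site in random DNA of a given GC fraction. It assumes independent bases with equal A/T and equal G/C frequencies, which real genomes only approximate, and the site CCGG is used purely as a hypothetical example of a GC-rich 4-bp consensus, not as a recommendation from the text.

```python
def site_probability(site: str, gc_fraction: float) -> float:
    """Probability that a random position starts `site`, given the GC fraction.

    ASSUMPTION: bases are independent, with P(G) = P(C) = gc/2
    and P(A) = P(T) = (1 - gc)/2.
    """
    if not 0 < gc_fraction < 1:
        raise ValueError("gc_fraction must lie strictly between 0 and 1")
    p_gc = gc_fraction / 2
    p_at = (1 - gc_fraction) / 2
    prob = 1.0
    for base in site.upper():
        if base not in "ACGT":
            raise ValueError(f"unexpected base: {base!r}")
        prob *= p_gc if base in "GC" else p_at
    return prob


def mean_spacing_bp(site: str, gc_fraction: float) -> float:
    """Expected distance between successive sites under the model above."""
    return 1.0 / site_probability(site, gc_fraction)


for gc in (0.3, 0.5, 0.7):
    print(f"GC {gc:.0%}: AATT every {mean_spacing_bp('AATT', gc):.0f} bp, "
          f"CCGG every {mean_spacing_bp('CCGG', gc):.0f} bp")
```

Under this model an AATT site is expected about every 256 bp at 50% GC but much less often in GC-rich DNA, so a complete Tsp509I digest of such a sample would yield fewer and longer fragments, which is the bias that an alternative enzyme or mechanical shearing would avoid.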
In summary, the E-LASL method presented here provides a straightforward means of functional genomic/metagenomic screening. It can be utilized to mine small quantities of DNA for protein compounds with pharmacological, industrial, or pathogenic potential.
Acknowledgments
This research was funded by NIH/NIAID grants AI057472 and AI056510 to V.A.F. J.E.S. acknowledges the Pharmaceutical Research and Manufacturers of America Foundation and the NIH MSTP program (Weill Cornell/Rockefeller/Sloan-Kettering grant GM 07739).
We thank Forest Rohwer of San Diego State University for his helpful input.
Footnotes
Published ahead of print on 14 December 2007.
Supplemental material for this article may be found at http://aem.asm.org/.
REFERENCES
1. Borysowski, J., B. Weber-Dabrowska, and A. Gorski. 2006. Bacteriophage endolysins as a novel class of antibacterial agents. Exp. Biol. Med. 231:366-377.
2. Breitbart, M., I. Hewson, B. Felts, J. M. Mahaffy, J. Nulton, P. Salamon, and F. Rohwer. 2003. Metagenomic analysis of an uncultured viral community from human feces. J. Bacteriol. 185:6220-6223.
3. Breitbart, M., P. Salamon, B. Anderson, J. M. Mahaffy, A. M. Segall, D. Mead, F. Azam, and F. Rohwer. 2002. Genomic analysis of uncultured marine viral communities. Proc. Natl. Acad. Sci. USA 99:14250-14255.
4. Daniel, R. 2005. The metagenomics of soil. Nat. Rev. Microbiol. 3:470-478.
5. Delwart, E. L. 2007. Viral metagenomics. Rev. Med. Virol. 17:115-131.
6. Edwards, R. A., and F. Rohwer. 2005. Viral metagenomics. Nat. Rev. Microbiol. 3:504-510.
7. Eschbach, E., J. P. Scharsack, U. John, and L. K. Medlin. 2001. Improved erythrocyte lysis assay in microtitre plates for sensitive detection and efficient measurement of haemolytic compounds from ichthyotoxic algae. J. Appl. Toxicol. 21:513-519.
8. Felsenstein, J. 1989. PHYLIP - phylogeny inference package (version 3.2). Cladistics 5:164-166.
9. Ferrer, M., F. Martinez-Abarca, and P. N. Golyshin. 2005. Mining genomes and ‘metagenomes’ for novel catalysts. Curr. Opin. Biotechnol. 16:588-593.
10. Fischetti, V. A. 2005. Bacteriophage lytic enzymes: novel anti-infectives. Trends Microbiol. 13:491-496.
11. Fivaz, M., L. Abrami, Y. Tsitrin, and F. G. van der Goot. 2001. Aerolysin from Aeromonas hydrophila and related toxins. Curr. Top. Microbiol. Immunol. 257:35-52.
12. Gillespie, D. E., S. F. Brady, A. D. Bettermann, N. P. Cianciotto, M. R. Liles, M. R. Rondon, J. Clardy, R. M. Goodman, and J. Handelsman. 2002. Isolation of antibiotics turbomycin A and B from a metagenomic library of soil DNA. Appl. Environ. Microbiol. 68:4301-4306.
13. Green, B. D., and M. Keller. 2006. Capturing the uncultivated majority. Curr. Opin. Biotechnol. 17:236-240.
14. Porter, C. J., R. Schuch, A. J. Pelzek, A. M. Buckle, S. McGowan, M. C. Wilce, J. Rossjohn, R. Russell, D. Nelson, V. A. Fischetti, and J. C. Whisstock. 2007. The 1.6 Å crystal structure of the catalytic domain of PlyB, a bacteriophage lysin active against Bacillus anthracis. J. Mol. Biol. 366:540-550.
15. Shuman, S. 1994. Novel approach to molecular cloning and polynucleotide synthesis using vaccinia DNA topoisomerase. J. Biol. Chem. 269:32678-32684.
16. Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876-4882.
17. Tringe, S. G., and E. M. Rubin. 2005. Metagenomics: DNA sequencing of environmental samples. Nat. Rev. Genet. 6:805-814.
18. Venter, J. C., K. Remington, J. F. Heidelberg, A. L. Halpern, D. Rusch, J. A. Eisen, D. Wu, I. Paulsen, K. E. Nelson, W. Nelson, D. E. Fouts, S. Levy, A. H. Knap, M. W. Lomas, K. Nealson, O. White, J. Peterson, J. Hoffman, R. Parsons, H. Baden-Tillson, C. Pfannkoch, Y. H. Rogers, and H. O. Smith. 2004. Environmental genome shotgun sequencing of the Sargasso Sea. Science 304:66-74.
19. Voget, S., H. Steele, and W. R. Streit. 2005. Metagenomes - an unlimited resource for novel genes, biocatalysts, and metabolites. Minerva Biotechnol. 17:47-53.
20. Yooseph, S., G. Sutton, D. B. Rusch, A. L. Halpern, S. J. Williamson, K. Remington, J. A. Eisen, K. B. Heidelberg, G. Manning, W. Li, L. Jaroszewski, P. Cieplak, C. S. Miller, H. Li, S. T. Mashiyama, M. P. Joachimiak, C. van Belle, J. M. Chandonia, D. A. Soergel, Y. Zhai, K. Natarajan, S. Lee, B. J. Raphael, V. Bafna, R. Friedman, S. E. Brenner, A. Godzik, D. Eisenberg, J. E. Dixon, S. S. Taylor, R. L. Strausberg, M. Frazier, and J. C. Venter. 2007. The Sorcerer II global ocean sampling expedition: expanding the universe of protein families. PLoS Biol. 5:e16.