Microbiology Resource Announcements. 2024 Aug 30;13(10):e00463-24. doi: 10.1128/mra.00463-24

Nine Cluster E mycobacteriophages isolated from soil

Joseph M Gaballa 1, Amanda Freise 2, Krisanavane Reddi 2, Jordan Moberg Parker 2,3
Editor: Catherine Putonti 4
PMCID: PMC11465865  PMID: 39212351

ABSTRACT

Mycobacteriophages FireRed, MISSy, MPhalcon, Murica, Sassay, Terminus, Willez, YassJohnny, and Youngblood were isolated from soil using Mycobacterium smegmatis as a host. Genome sequencing and annotation revealed that they belong to Actinobacteriophage Cluster E. Here, we describe the features of their genomes and discuss similarities among these Cluster E phages.

KEYWORDS: mycobacteriophage, bacteriophages

ANNOUNCEMENT

Mycobacteriophages are typically discovered using Mycobacterium smegmatis, a non-pathogenic species that serves as a model host for pathogenic mycobacteria. More than 2,000 mycobacteriophages have been sequenced, and over 150,000 genes have been identified in this collection, a majority of which have unknown function (1, 2). Given this degree of genetic diversity, mycobacteriophages are organized into clusters and subclusters based on genomic similarity (3). In this study, we introduce the genomes of nine Cluster E mycobacteriophages isolated from soil by undergraduates in the Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science (SEA-PHAGES) program (4).

Soil samples were collected from sites around the greater Los Angeles, CA, area, and phages were isolated using direct or enrichment isolation (Table 1) on M. smegmatis mc²155 at 35°C as described in the SEA-PHAGES Discovery Guide (5). DNA was isolated using the Promega Wizard DNA Clean-Up Kit (#A7280). Pooled libraries were prepared with a NEB Ultra II DNA Library Prep kit (NEB #E7103) for Illumina sequencing or a Roche GS FLX Titanium emPCR Lib-A Kit for 454 GS FLX sequencing (Table 1). Sequence reads were assembled into single-phage contigs using Newbler v2.9 (454 Life Sciences) with default settings and assessed for completeness and genomic termini using Consed v29 as previously described (6, 7). The location and coding potential of putative genes were predicted using DNA Master [J. G. Lawrence lab (http://cobamide2.bio.pitt.edu)], which integrates both Glimmer and GeneMark to detect potential open reading frames (8, 9). Location calls were curated using Phamerator and Starterator (10). ARAGORN and tRNAscan-SE were used to detect the presence of tRNA genes (11, 12). Functional calls were predicted using the PhagesDB and NCBI databases, including the Conserved Domain Database, HHpred, and TMHMM (13–17). Gene Content Similarity (GCS) was calculated using PhagesDB (https://phagesdb.org/genecontent/) (18). Unless otherwise stated, no modifications were made to kit instructions, and the current version of each software package at the time of isolation was used with default parameters.

TABLE 1.

Isolation, sequencing, and genomic features of the Cluster E phages

Mycobacteriophage | Isolation method | Collection year | Sample location (Lat, Lon) | Plaque morphology and diameter | Genome length (bp) | No. of genes | GC content (%) | 3′ overhang sequence | Sequencing method | No. of reads | Avg. spot length (bp) (SD) | Sequencing coverage | Sequence Read Archive accession no. | GenBank accession no.
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
FireRed | Enriched | 2013 | 34.08 N, 118.40 W | Turbid (4 mm) | 76,217 | 150 | 63.0 | CGCTTGTCA | Roche 454 GS FLX | 8,421 | 512 (51.2) | 48× | SRX23607957 | MF919506
MISSy | Enriched | 2014 | 34.056 N, 118.442 W | Bullseye (3.5–6.0 mm) | 75,808 | 147 | 63.1 | CGCTTGTCA | Roche 454 GS FLX | 11,565 | 514 (53.5) | 63× | SRX23607958 | MF919524
MPhalcon | Direct | 2017 | 33.9922 N, 118.4705 W | Clear/halo (2–4 mm) | 75,605 | 148 | 63.1 | CGCTTGTCA | Illumina MiSeq 150-base single-end reads | 1,090,304 | 146 (15.0) | 2,029× | SRX23702566 | MH020247
Murica | Enriched | 2013 | 34.07 N, 118.451 W | Clear/bullseye (3–5 mm) | 77,053 | 149 | 63.0 | CGCTTGTCA | Roche 454 GS FLX | 14,503 | 506 (61.1) | 81× | SRX23607959 | MF919525
Sassay | Enriched | 2014 | 34.0703 N, 118.453 W | Turbid with halos (4 mm) | 73,495 | 141 | 63.0 | CGCTTGTCA | Roche 454 GS FLX | 22,872 | 506 (58.8) | 126× | SRX23607960 | MF919529
Terminus | Enriched | 2014 | 34.0675 N, 118.45444 W | Clear (2–4 mm) | 76,169 | 149 | 63.1 | CGCTTGTCA | Illumina HiSeq 150-base paired reads | 17,911,466 | 98 (9.2), 98 (9.3) | 10,582× | SRX23607963 | MF919535
Willez | Enriched | 2011 | 34.021 N, 118.395 W | Turbid (5 mm) | 74,576 | 144 | 62.9 | CGCTTGTCA | Roche 454 GS FLX | 24,148 | 512 (50.1) | 157× | SRX23607961 | MF919540
YassJohnny | Enriched | 2015 | 34.0656 N, 118.4540 W | Turbid/bullseye (2 mm) | 73,697 | 141 | 62.9 | CGCTTGTCA | Illumina MiSeq 150-base single-end reads | 516,974 | 127 (22.0) | 913× | SRX23702568 | MF919541
Youngblood | Enriched | 2014 | 34.075 N, 118.451 W | Turbid/bullseye (2–5 mm) | 75,896 | 150 | 62.9 | CGCTTGTCA | Roche 454 GS FLX | 6,987 | 511 (55.5) | 38× | SRX23607962 | MG099953

Local whole-genome BLASTn against the PhagesDB database (https://phagesdb.org/blast/) indicated 98%–99% identity of all nine phages with previously characterized Cluster E phages. Complete genome lengths ranged from 73,495 to 77,053 bp, and each genome has a 3′ sticky overhang (CGCTTGTCA). GC content ranged from 62.9% to 63.1%, and the average of 63.0% was consistent with the average of published Cluster E phages (63.0%). Total predicted gene counts ranged from 141 to 150 (Table 1), with pairwise GCS ranging from 86.1% to 95.5% (Fig. 1). Most of the genes with no annotated function are located on the right side of the genome, while the left side of the genome is highly conserved and contains most of the structural protein genes. Each phage genome contains two tRNA genes and a lysis cassette comprising lysin A, lysin B, and holin genes. Integrase and immunity repressor genes are found downstream of the lysis cassette in all phages. The presence of a lysis cassette, an integrase, and an immunity repressor suggests that these phages can undergo either lytic or lysogenic life cycles, which is supported by the observation of both clear and turbid or bullseye plaque morphologies.
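The ranges and the average GC content quoted above can be recomputed from Table 1. The following is a minimal sketch, not part of the original analysis: the dictionary simply transcribes three Table 1 columns, the function name summarize is hypothetical, and the GC average is assumed to be the unweighted mean of the nine per-genome percentages.

```python
# Values transcribed from Table 1: (genome length in bp, predicted genes, GC %).
TABLE_1 = {
    "FireRed": (76217, 150, 63.0),
    "MISSy": (75808, 147, 63.1),
    "MPhalcon": (75605, 148, 63.1),
    "Murica": (77053, 149, 63.0),
    "Sassay": (73495, 141, 63.0),
    "Terminus": (76169, 149, 63.1),
    "Willez": (74576, 144, 62.9),
    "YassJohnny": (73697, 141, 62.9),
    "Youngblood": (75896, 150, 62.9),
}


def summarize(table):
    """Return min/max ranges and the unweighted mean GC content (assumption)."""
    lengths = [row[0] for row in table.values()]
    genes = [row[1] for row in table.values()]
    gc = [row[2] for row in table.values()]
    return {
        "length_range": (min(lengths), max(lengths)),
        "gene_range": (min(genes), max(genes)),
        "gc_range": (min(gc), max(gc)),
        # Rounded to one decimal place to match the precision used in the text.
        "gc_mean": round(sum(gc) / len(gc), 1),
    }


summary = summarize(TABLE_1)
print(summary)
```

Under these assumptions, the sketch reproduces the 73,495 to 77,053 bp length range, the 141 to 150 gene range, and the 63.0% average GC content stated in the text.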

Fig 1.

Gene Content Similarity (GCS) of the Cluster E phages. GCS, the average of the proportions of genes that each of two phages shares with the other, ranged from 86.1% to 95.5% (18). The GCS calculated for each phage pair is presented as a heat map, with the highest GCS scores shown in dark green and the lowest in yellow.
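To make the GCS calculation concrete, the following is a minimal sketch that assumes GCS for a pair of phages is the mean of the two per-genome proportions of shared gene families (phams), expressed as a percentage. The function name and the pham identifiers are hypothetical and are not taken from PhagesDB, whose gene content tool was used for the values in Fig. 1.

```python
def gene_content_similarity(phams_a, phams_b):
    """Mean of the shared-pham proportions of two phages, as a percentage.

    Assumption: each phage is represented by the set of gene families
    (phams) found in its genome.
    """
    a, b = set(phams_a), set(phams_b)
    if not a or not b:
        raise ValueError("each phage needs at least one pham")
    shared = len(a & b)
    # Proportion shared from each genome's perspective, then averaged.
    return 100.0 * (shared / len(a) + shared / len(b)) / 2


# Hypothetical pham identifiers for two illustrative phages.
phage_x = {"pham1", "pham2", "pham3", "pham4"}
phage_y = {"pham2", "pham3", "pham4", "pham5", "pham6"}

gcs_xy = gene_content_similarity(phage_x, phage_y)
print(round(gcs_xy, 1))  # 3 shared phams: (3/4 + 3/5) / 2 = 67.5%
```

Averaging the two proportions keeps the score symmetric, so the value for a pair does not depend on which phage is listed first, as expected for a pairwise heat map.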

ACKNOWLEDGMENTS

Mycobacteriophages were isolated by undergraduates in the Research Immersion in Virology course-based undergraduate research experience in the Microbiology, Immunology, and Molecular Genetics Department at UCLA. This project was funded by the Life Sciences Division at UCLA, with additional support for sequencing from the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science (SEA-PHAGES) program. We thank Rebecca A. Garlena and Daniel A. Russell at the Pittsburgh Bacteriophage Institute for phage sequencing and genome assembly; and Debbie Jacobs-Sera, Welkin Pope, Graham Hatfull, and the SEA-PHAGES community for programmatic support.

Contributor Information

Jordan Moberg Parker, Email: Jordan.P.Parker@kp.org.

Catherine Putonti, Loyola University Chicago, Chicago, Illinois, USA.

DATA AVAILABILITY

The whole-genome sequencing reads and complete genome sequences have been deposited in the NCBI Sequence Read Archive (SRA) and GenBank, respectively (Table 1). The versions described in this paper are the first versions.

REFERENCES

  • 1. Hatfull GF. 2018. Mycobacteriophages. Microbiol Spectr 6. doi: 10.1128/microbiolspec.GPP3-0026-2018
  • 2. Hatfull GF. 2022. Mycobacteriophages: from petri dish to patient. PLoS Pathog 18:e1010602. doi: 10.1371/journal.ppat.1010602
  • 3. Hatfull GF, Jacobs-Sera D, Lawrence JG, Pope WH, Russell DA, Ko C-C, Weber RJ, Patel MC, Germane KL, Edgar RH, et al. 2010. Comparative genomic analysis of 60 mycobacteriophage genomes: genome clustering, gene acquisition, and gene size. J Mol Biol 397:119–143. doi: 10.1016/j.jmb.2010.01.011
  • 4. Jordan TC, Burnett SH, Carson S, Caruso SM, Clase K, DeJong RJ, Dennehy JJ, Denver DR, Dunbar D, Elgin SCR, et al. 2014. A broadly implementable research course in phage discovery and genomics for first-year undergraduate students. MBio 5:e01051-13. doi: 10.1128/mBio.01051-13
  • 5. Poxleitner M, Pope WH, Jacobs-Sera D, Sivanathan V, Hatfull G. 2018. Phage discovery guide. Howard Hughes Medical Institute, Chevy Chase, MD.
  • 6. Russell DA. 2018. Sequencing, assembling, and finishing complete bacteriophage genomes, p 109–125. In Clokie MRJ, Kropinski AM, Lavigne R (ed), Bacteriophages: methods and protocols. Vol. 3. New York, NY, Springer New York.
  • 7. Gordon D, Abajian C, Green P. 1998. Consed: a graphical tool for sequence finishing. Genome Res 8:195–202. doi: 10.1101/gr.8.3.195
  • 8. Delcher AL, Harmon D, Kasif S, White O, Salzberg SL. 1999. Improved microbial gene identification with GLIMMER. Nucleic Acids Res 27:4636–4641. doi: 10.1093/nar/27.23.4636
  • 9. Lukashin AV, Borodovsky M. 1998. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res 26:1107–1115. doi: 10.1093/nar/26.4.1107
  • 10. Cresawn SG, Bogel M, Day N, Jacobs-Sera D, Hendrix RW, Hatfull GF. 2011. Phamerator: a bioinformatic tool for comparative bacteriophage genomics. BMC Bioinformatics 12:395. doi: 10.1186/1471-2105-12-395
  • 11. Laslett D, Canback B. 2004. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 32:11–16. doi: 10.1093/nar/gkh152
  • 12. Chan PP, Lowe TM. 2019. tRNAscan-SE: searching for tRNA genes in genomic sequences. Methods Mol Biol 1962:1–14. doi: 10.1007/978-1-4939-9173-0_1
  • 13. Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, Geer RC, He J, Gwadz M, Hurwitz DI, Lanczycki CJ, Lu F, Marchler GH, Song JS, Thanki N, Wang Z, Yamashita RA, Zhang D, Zheng C, Bryant SH. 2015. CDD: NCBI’s conserved domain database. Nucleic Acids Res 43:D222–6. doi: 10.1093/nar/gku1221
  • 14. Söding J, Biegert A, Lupas AN. 2005. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 33:W244–8. doi: 10.1093/nar/gki408
  • 15. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305:567–580. doi: 10.1006/jmbi.2000.4315
  • 16. Russell DA, Hatfull GF. 2017. PhagesDB: the actinobacteriophage database. Bioinformatics 33:784–786. doi: 10.1093/bioinformatics/btw711
  • 17. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403–410. doi: 10.1016/S0022-2836(05)80360-2
  • 18. Mavrich TN, Hatfull GF. 2017. Bacteriophage evolution differs by host, lifestyle and genome. Nat Microbiol 2:17112. doi: 10.1038/nmicrobiol.2017.112

Articles from Microbiology Resource Announcements are provided here courtesy of American Society for Microbiology (ASM)
