Nine Cluster E mycobacteriophages isolated from soil

Joseph M Gaballa; Amanda Freise; Krisanavane Reddi; Jordan Moberg Parker

2024 Aug 30;13(10):e00463-24. doi: 10.1128/mra.00463-24

Editor: Catherine Putonti

PMCID: PMC11465865 PMID: 39212351

ABSTRACT

Mycobacteriophages FireRed, MISSy, MPhalcon, Murica, Sassay, Terminus, Willez, YassJohnny, and Youngblood were isolated from soil using Mycobacterium smegmatis as a host. Genome sequencing and annotation revealed that they belong to Actinobacteriophage Cluster E. Here, we describe the features of their genomes and discuss similarities within these Cluster E phages.

KEYWORDS: mycobacteriophage, bacteriophages

ANNOUNCEMENT

Mycobacteriophages are typically discovered using Mycobacterium smegmatis, a non-pathogenic species that serves as a model host for pathogenic mycobacteria. More than 2,000 mycobacteriophages have been sequenced, and over 150,000 genes have been identified in this collection, most of which have no known function (1, 2). Because of this extensive genetic diversity, mycobacteriophages are organized into clusters and subclusters based on genomic similarity (3). Here, we report the genomes of nine Cluster E mycobacteriophages isolated from soil by undergraduates in the Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science (SEA-PHAGES) program (4).

Soil samples were collected from sites around the greater Los Angeles, CA, area, and phages were isolated by direct or enrichment isolation (Table 1) on M. smegmatis mc²155 at 35°C as described in the SEA-PHAGES Discovery Guide (5). DNA was isolated using the Promega Wizard DNA Clean-Up Kit (#A7280). Pooled libraries were prepared with an NEB Ultra II DNA Library Prep kit (NEB #E7103) for Illumina sequencing or a Roche GS FLX Titanium emPCR Lib-A Kit for 454 GS FLX sequencing (Table 1). Sequence reads were assembled into single-phage contigs using Newbler v2.9 (454 Life Sciences) with default settings and were assessed for completeness and genomic termini using Consed v29 as previously described (6, 7). The location and coding potential of putative genes were predicted using DNA Master [J. G. Lawrence lab (http://cobamide2.bio.pitt.edu)], which integrates both Glimmer and GeneMark to detect potential open reading frames (8, 9). Location calls were curated using Phamerator and Starterator (10). ARAGORN and tRNAscan-SE were used to detect tRNA genes (11, 12). Gene functions were predicted using the PhagesDB and NCBI databases, including the Conserved Domain Database, together with HHpred and TMHMM (13–17). Gene Content Similarity (GCS) was calculated using PhagesDB (https://phagesdb.org/genecontent/) (18). Unless otherwise stated, no modifications were made to kit instructions, and the current version of each software package at the time of isolation was used with default parameters.
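The step in which calls from two gene predictors are reconciled can be illustrated with a small sketch. The Python below is not DNA Master's implementation; it assumes a hypothetical `GeneCall` record and made-up coordinates, and simply groups calls from two predictors by stop position and strand to show where they agree, where only the start differs, and where a call is unique to one predictor.

```python
from typing import NamedTuple


class GeneCall(NamedTuple):
    """Hypothetical record for one predicted gene (not a DNA Master type)."""
    start: int   # coordinate of the predicted start codon
    stop: int    # coordinate of the predicted stop codon
    strand: str  # "+" or "-"


def compare_calls(glimmer, genemark):
    """Classify calls from two predictors, keyed by (stop, strand)."""
    a = {(c.stop, c.strand): c.start for c in glimmer}
    b = {(c.stop, c.strand): c.start for c in genemark}
    report = {"agree": [], "start_differs": [],
              "glimmer_only": [], "genemark_only": []}
    for key in sorted(set(a) | set(b)):
        if key in a and key in b:
            # Same stop and strand: the predictors found the same reading frame.
            bucket = "agree" if a[key] == b[key] else "start_differs"
        elif key in a:
            bucket = "glimmer_only"
        else:
            bucket = "genemark_only"
        report[bucket].append(key)
    return report


# Illustrative coordinates only; these are not taken from any of the nine genomes.
glimmer_calls = [GeneCall(100, 400, "+"), GeneCall(450, 900, "+"),
                 GeneCall(1500, 1000, "-")]
genemark_calls = [GeneCall(100, 400, "+"), GeneCall(480, 900, "+"),
                  GeneCall(2000, 2600, "+")]
report = compare_calls(glimmer_calls, genemark_calls)
for bucket, keys in report.items():
    print(bucket, keys)
```

Calls in the `start_differs` and single-predictor buckets are the ones that would most plausibly need manual curation.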

TABLE 1.

Isolation, sequencing, and genomic features of the Cluster E phages

Mycobacteriophage	Isolation method	Collection year	Sample location (Lat, Lon)	Plaque morphology and diameter	Genome length (bp)	No. of genes	GC content (%)	3′ overhang sequence	Sequencing method	No. of reads	Avg. spot length (bp) (SD)	Sequencing coverage	Sequence read archive accession no.	GenBank accession no.
FireRed	Enriched	2013	34.08 N 118.40 W	Turbid (4 mm)	76,217	150	63.0	CGCTTGTCA	Roche 454 GS FLX	8,421	512 (51.2)	48 x	SRX23607957	MF919506
MISSy	Enriched	2014	34.056 N 118.442 W	Bullseye (3.5–6.0 mm)	75,808	147	63.1	CGCTTGTCA	Roche 454 GS FLX	11,565	514 (53.5)	63 x	SRX23607958	MF919524
MPhalcon	Direct	2017	33.9922 N 118.4705 W	Clear/halo (2–4 mm)	75,605	148	63.1	CGCTTGTCA	Illumina MiSeq 150-base single-end reads	1,090,304	146 (15.0)	2,029 x	SRX23702566	MH020247
Murica	Enriched	2013	34.07 N 118.451 W	Clear/bullseye (3–5 mm)	77,053	149	63.0	CGCTTGTCA	Roche 454 GS FLX	14,503	506 (61.1)	81 x	SRX23607959	MF919525
Sassay	Enriched	2014	34.0703 N 118.453 W	Turbid with halos (4 mm)	73,495	141	63.0	CGCTTGTCA	Roche 454 GS FLX	22,872	506 (58.8)	126 x	SRX23607960	MF919529
Terminus	Enriched	2014	34.0675 N 118.45444 W	Clear (2–4 mm)	76,169	149	63.1	CGCTTGTCA	Illumina HiSeq 150-base paired reads	17,911,466	98 (9.2), 98 (9.3)	10,582 x	SRX23607963	MF919535
Willez	Enriched	2011	34.021 N 118.395 W	Turbid (5 mm)	74,576	144	62.9	CGCTTGTCA	Roche 454 GS FLX	24,148	512 (50.1)	157 x	SRX23607961	MF919540
YassJohnny	Enriched	2015	34.0656 N 118.4540 W	Turbid/bullseye (2 mm)	73,697	141	62.9	CGCTTGTCA	Illumina MiSeq 150-base single-end reads	516,974	127 (22.0)	913 x	SRX23702568	MF919541
Youngblood	Enriched	2014	34.075 N 118.451 W	Turbid/bullseye (2–5 mm)	75,896	150	62.9	CGCTTGTCA	Roche 454 GS FLX	6,987	511 (55.5)	38 x	SRX23607962	MG099953

A local whole-genome BLASTn search against the PhagesDB database (https://phagesdb.org/blast/) indicated 98%–99% identity between each of the nine phages and previously characterized Cluster E phages. Complete genome lengths ranged from 73,495 to 77,053 bp, and each genome has a 9-base 3′ sticky overhang (CGCTTGTCA). GC content ranged from 62.9% to 63.1%, with a mean of 63.0% that matches the average for published Cluster E phages (63.0%). Predicted gene counts ranged from 141 to 150 (Table 1), and pairwise GCS ranged from 86.1% to 95.5% (Fig. 1). Most genes without an annotated function are located on the right side of the genome, whereas the left side is highly conserved and contains most of the structural protein genes. Each genome encodes two tRNAs and a lysis cassette comprising lysin A, lysin B, and holin genes. An integrase gene and an immunity repressor gene are located downstream of the lysis cassette in all nine phages. The presence of a lysis cassette, an integrase, and an immunity repressor suggests that these phages can undergo either lytic or lysogenic life cycles, which is consistent with the mix of clear, turbid, and bullseye plaque morphologies observed (Table 1).
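The ranges and mean quoted above can be rechecked directly from Table 1. The following Python is a minimal sketch that transcribes the genome length, gene count, and GC content columns by hand; the `TABLE_1` dictionary and `summarize` function are illustrative names, not part of any SEA-PHAGES tool.

```python
# Values transcribed from Table 1: (genome length in bp, no. of genes, GC %).
TABLE_1 = {
    "FireRed": (76217, 150, 63.0),
    "MISSy": (75808, 147, 63.1),
    "MPhalcon": (75605, 148, 63.1),
    "Murica": (77053, 149, 63.0),
    "Sassay": (73495, 141, 63.0),
    "Terminus": (76169, 149, 63.1),
    "Willez": (74576, 144, 62.9),
    "YassJohnny": (73697, 141, 62.9),
    "Youngblood": (75896, 150, 62.9),
}


def summarize(table):
    """Return the min/max of each column and the mean GC content."""
    lengths = [row[0] for row in table.values()]
    genes = [row[1] for row in table.values()]
    gc = [row[2] for row in table.values()]
    return {
        "length_range": (min(lengths), max(lengths)),
        "gene_range": (min(genes), max(genes)),
        "gc_range": (min(gc), max(gc)),
        # Rounded to one decimal place, matching the precision used in Table 1.
        "gc_mean": round(sum(gc) / len(gc), 1),
    }


summary = summarize(TABLE_1)
for name, value in summary.items():
    print(name, value)
```

Note that the mean GC content here is an unweighted average of the nine rounded table values, which is an assumption about how the reported average was obtained.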

Fig 1. Gene Content Similarity (GCS) of the Cluster E phages. GCS, the average proportion of genes shared between two phages, ranged from 86.1% to 95.5% (18). The GCS for each phage pair is shown as a heat map, with the highest scores in dark green and the lowest in yellow.
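A pairwise GCS value can be sketched in a few lines of Python. This example assumes that each genome is represented as a set of gene-family ("pham") identifiers and that GCS is the mean of the two shared fractions, one per genome; the function name and the identifiers are hypothetical, and the values reported here were obtained with the PhagesDB tool, not with this code.

```python
def gene_content_similarity(phams_a, phams_b):
    """Return GCS (%) as the mean of shared/total for each of two genomes."""
    a, b = set(phams_a), set(phams_b)
    if not a or not b:
        raise ValueError("each genome needs at least one gene family")
    shared = len(a & b)
    # Average the fraction of A's families found in B and of B's found in A.
    return 100.0 * (shared / len(a) + shared / len(b)) / 2


# Made-up gene families for two toy genomes (not real phage data).
genome_a = {"p1", "p2", "p3", "p4"}
genome_b = {"p3", "p4", "p5", "p6", "p7", "p8", "p9", "p10"}
gcs = gene_content_similarity(genome_a, genome_b)
print(gcs)
```

Averaging the two fractions keeps the measure symmetric even when the genomes carry different numbers of genes, as the nine phages in Table 1 do (141 to 150).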

ACKNOWLEDGMENTS

Mycobacteriophages were isolated by undergraduates in the Research Immersion in Virology course-based undergraduate research experience in the Microbiology, Immunology, and Molecular Genetics Department at UCLA. This project was funded by the Life Sciences Division at UCLA, with additional support for sequencing from the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science (SEA-PHAGES) program. We thank Rebecca A. Garlena and Daniel A. Russell at the Pittsburgh Bacteriophage Institute for phage sequencing and genome assembly, and Debbie Jacobs-Sera, Welkin Pope, Graham Hatfull, and the SEA-PHAGES community for programmatic support.

Contributor Information

Jordan Moberg Parker, Email: Jordan.P.Parker@kp.org.

Catherine Putonti, Loyola University Chicago, Chicago, Illinois, USA.

DATA AVAILABILITY

The whole-genome sequencing reads and complete genome sequences have been deposited in the NCBI Sequence Read Archive (SRA) and GenBank, respectively (Table 1). The versions described in this paper are the first versions.

REFERENCES

1. Hatfull GF. 2018. Mycobacteriophages. Microbiol Spectr 6. doi: 10.1128/microbiolspec.GPP3-0026-2018
2. Hatfull GF. 2022. Mycobacteriophages: from petri dish to patient. PLoS Pathog 18:e1010602. doi: 10.1371/journal.ppat.1010602
3. Hatfull GF, Jacobs-Sera D, Lawrence JG, Pope WH, Russell DA, Ko C-C, Weber RJ, Patel MC, Germane KL, Edgar RH, et al. 2010. Comparative genomic analysis of 60 mycobacteriophage genomes: genome clustering, gene acquisition, and gene size. J Mol Biol 397:119–143. doi: 10.1016/j.jmb.2010.01.011
4. Jordan TC, Burnett SH, Carson S, Caruso SM, Clase K, DeJong RJ, Dennehy JJ, Denver DR, Dunbar D, Elgin SCR, et al. 2014. A broadly implementable research course in phage discovery and genomics for first-year undergraduate students. MBio 5:e01051-13. doi: 10.1128/mBio.01051-13
5. Poxleitner M, Pope WH, Jacobs-Sera D, Sivanathan V, Hatfull G. 2018. Phage discovery guide. Howard Hughes Medical Institute, Chevy Chase, MD.
6. Russell DA. 2018. Sequencing, assembling, and finishing complete bacteriophage genomes, p 109–125. In Clokie MRJ, Kropinski AM, Lavigne R (ed), Bacteriophages: methods and protocols. Vol. 3. New York, NY, Springer New York.
7. Gordon D, Abajian C, Green P. 1998. Consed: a graphical tool for sequence finishing. Genome Res 8:195–202. doi: 10.1101/gr.8.3.195
8. Delcher AL, Harmon D, Kasif S, White O, Salzberg SL. 1999. Improved microbial gene identification with GLIMMER. Nucleic Acids Res 27:4636–4641. doi: 10.1093/nar/27.23.4636
9. Lukashin AV, Borodovsky M. 1998. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res 26:1107–1115. doi: 10.1093/nar/26.4.1107
10. Cresawn SG, Bogel M, Day N, Jacobs-Sera D, Hendrix RW, Hatfull GF. 2011. Phamerator: a bioinformatic tool for comparative bacteriophage genomics. BMC Bioinformatics 12:395. doi: 10.1186/1471-2105-12-395
11. Laslett D, Canback B. 2004. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 32:11–16. doi: 10.1093/nar/gkh152
12. Chan PP, Lowe TM. 2019. tRNAscan-SE: searching for tRNA genes in genomic sequences. Methods Mol Biol 1962:1–14. doi: 10.1007/978-1-4939-9173-0_1
13. Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, Geer RC, He J, Gwadz M, Hurwitz DI, Lanczycki CJ, Lu F, Marchler GH, Song JS, Thanki N, Wang Z, Yamashita RA, Zhang D, Zheng C, Bryant SH. 2015. CDD: NCBI’s conserved domain database. Nucleic Acids Res 43:D222–6. doi: 10.1093/nar/gku1221
14. Söding J, Biegert A, Lupas AN. 2005. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 33:W244–8. doi: 10.1093/nar/gki408
15. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305:567–580. doi: 10.1006/jmbi.2000.4315
16. Russell DA, Hatfull GF. 2017. PhagesDB: the actinobacteriophage database. Bioinformatics 33:784–786. doi: 10.1093/bioinformatics/btw711
17. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403–410. doi: 10.1016/S0022-2836(05)80360-2
18. Mavrich TN, Hatfull GF. 2017. Bacteriophage evolution differs by host, lifestyle and genome. Nat Microbiol 2:17112. doi: 10.1038/nmicrobiol.2017.112
