ABSTRACT
We report genome sequences of six mycobacteriophages. Each virus was isolated from a soil sample and has siphovirus morphology. Genomes are 41,901–60,613 bp in length and contain between 62 and 103 protein-coding genes, with up to 40% of those genes having a predicted function.
KEYWORDS: bacteriophage, Mycobacterium, mycobacteriophage
ANNOUNCEMENT
The bacterial phylum Actinobacteria comprises a diverse group of gram-positive bacteria that are found in a variety of ecosystems (1). Some Actinobacteria cause illness in humans (1), and bacteriophages (viruses that infect bacteria) targeting members of this phylum have now been used in phage therapy for at least 20 patients with Mycobacterium infections (2). We are interested in understanding the genomic diversity of phages infecting Actinobacteria so that phage therapy can be improved.
For more than 15 years, the SEA-PHAGES program has engaged undergraduate students in Actinobacteriophage discovery and bioinformatics with the goal of improving STEM education while increasing knowledge of phage evolution and sequence diversity. Here, we report the genome sequences of six mycobacteriophages isolated from surface-level soil samples collected in Richmond, Williamsburg, and Aldie, Virginia (Table 1). Each bacteriophage was discovered through enrichment with the host bacterium Mycobacterium smegmatis mc²155 (NRRL 24020) (3). Enrichment samples were grown overnight at 37°C in 7H9 medium. Phages were purified through two to six rounds of plaque picking, serial dilution, infection of cultures, and plating with top agar (3). Each phage produced small, round plaques of varying clarity. Negative staining with 1% uranyl acetate for transmission electron microscopy revealed siphovirus-like morphology.
TABLE 1. Phage and genome sequence characteristics
| Phage | Aubs | ShaboiShabazz | BABullseye | TomBrady | CheetoDust | Ageofdapage |
|---|---|---|---|---|---|---|
| Sample location (city, state) | Aldie, VA | Richmond, VA | Williamsburg, VA | Richmond, VA | Richmond, VA | Richmond, VA |
| Sample location GPS coordinates | 38.92169 N, 77.55269 W | 37.5464 N, 77.4536 W | 37.234482 N, 76.67864 W | 37.548472 N, 77.45375 W | 37.545141 N, 77.454741 W | 37.545273 N, 77.454804 W |
| Plaque morphology | Clear | Clear | Halo with clear center | Cloudy | Cloudy | Clear |
| Plaque size | 1 mm | 1 mm | 3.5–4 mm | 1 mm | 1 mm | 3 mm |
| Capsid size (nm) | 76 (n = 2) | 55 (n = 2) | 60 (n = 1) | 57 (n = 2) | 50 (n = 1) | 50 (n = 2) |
| Tail length (nm) | 205 (n = 2) | 175 (n = 2) | 136 (n = 1) | 193 (n = 2) | 205 (n = 1) | 255 (n = 2) |
| SRA accession number | SRR27188678 | SRX22868882 | SRX22868879 | SRX22868883 | SRX22868880 | SRR2718860 |
| No. of reads | 487,877 | 642,024 | 924,556 | 944,512 | 66,355 | 541,705 |
| Approximate read coverage | 1,169 | 2,158 | 2,589 | 3,187 | 77 | 1,284 |
| GenBank accession number | OP297542 | ON637759 | OP434447 | ON970618 | OR521075 | OR521076 |
| Cluster | F1 | G1 | A6 | G1 | K1 | K1 |
| Genome length (bp) | 58,937 | 41,901 | 50,450 | 41,902 | 59,302 | 60,613 |
| # ORFs (# with predicted function) | 103 (40) | 62 (25) | 97 (35) | 63 (25) | 96 (39) | 100 (38) |
| # Orphams | 0 | 0 | 1 | 0 | 0 | 2 |
| # tRNAs | 0 | 0 | 3 | 0 | 1 | 1 |
| % GC | 61.4 | 66.6 | 61.5 | 66.6 | 66.5 | 66.5 |
| Length of 3′-sticky overhang | 10 base | 11 base | 10 base | 11 base | 11 base | 11 base |
DNA was isolated from high-titer phage lysate using the Promega Wizard DNA Cleanup Kit. Genomic DNA was sequenced using an Illumina MiSeq sequencer (v3 reagents) after preparing individual libraries using the NEBNext Ultra II FS Kit. The number and approximate coverage of single-end 150 bp reads are listed in Table 1. Raw reads were assembled into a single contig for each genome using Newbler v2.9 with default parameters. Genomes were visually checked for completeness using Consed v20 (4).
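As a rough cross-check on the coverage row of Table 1, nominal fold coverage can be estimated as (number of reads × read length) / genome length. The sketch below assumes that every read is a full 150 bp and contributes to the assembly, so it gives an upper bound rather than the calculation actually used for Table 1; the function name `nominal_coverage` is hypothetical.

```python
def nominal_coverage(n_reads: int, read_length_bp: int, genome_length_bp: int) -> float:
    """Upper-bound fold coverage, assuming all reads are full length and used."""
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    return n_reads * read_length_bp / genome_length_bp


# (number of reads, genome length in bp) as listed in Table 1
TABLE1_READS = {
    "Aubs": (487_877, 58_937),
    "ShaboiShabazz": (642_024, 41_901),
    "BABullseye": (924_556, 50_450),
    "TomBrady": (944_512, 41_902),
    "CheetoDust": (66_355, 59_302),
    "Ageofdapage": (541_705, 60_613),
}

READ_LENGTH_BP = 150  # assumption: no trimming, every read is full length

for phage, (n_reads, genome_bp) in TABLE1_READS.items():
    estimate = nominal_coverage(n_reads, READ_LENGTH_BP, genome_bp)
    print(f"{phage}: about {estimate:.0f}-fold (upper bound)")
```

Because trimmed, short, or unassembled reads reduce the effective coverage, these nominal estimates are expected to exceed the approximate values reported in Table 1.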
The resulting genomes were 41,901–60,613 bp in length, with a GC content of 61.4%–66.6% (Table 1). Each assembly yielded a single contig whose ends carried a 10- or 11-base 3′ sticky overhang, evident from many reads ending at the same position. Genomes were assigned to Actinobacteriophage clusters A6, F1, G1, and K1 according to established guidelines (5, 6).
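For readers who want to recompute the GC content from a downloaded genome sequence, a minimal sketch follows. It assumes the sequence is already available as a plain string (file parsing is omitted) and counts only unambiguous A, C, G, and T bases; the function name `gc_percent` is hypothetical and this is not necessarily how the values in Table 1 were derived.

```python
def gc_percent(sequence: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) in a nucleotide string."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Toy example: 4 of 6 bases are G or C
print(f"{gc_percent('GGCCAT'):.1f}% GC")
```

Ambiguous bases such as N are skipped rather than counted as non-GC, which keeps them from biasing the percentage downward.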
Phage genes were predicted by GeneMark v2.5 (7) and Glimmer v3.02 (8) using PECAAN (9) and DNA Master v5.23.6 (http://cobamide2.bio.pitt.edu). tRNA genes were predicted using Aragorn v1.2.38 (10) and tRNAscan-SE v2.0 (11). Proteins containing membrane-spanning domains were identified using TMHMM (12) and TOPCONS (13). Potential protein functions were predicted using HHpred [using the PDB_mmCIF70, SCOPe70, Pfam-A, and NCBI_Conserved_Domains databases (14)] and BLASTp [NCBI non-redundant database (15)]. After auto-annotation, students manually curated the predicted position and function of each gene (16).
Genomes contained between 62 and 103 predicted genes, with 36%–40% of those genes having a predicted function (Table 1). tRNA genes were identified in BABullseye (3), CheetoDust (1), and Ageofdapage (1). Three of the 521 predicted protein-coding genes across these genomes were determined to be orphams, that is, novel genes with no sequence homologs within the Actinobacteriophage database (6, 17). All six phages are predicted to be temperate based on genome content and cluster designation.
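The summary figures in this paragraph can be recomputed from the ORF and orpham rows of Table 1. The sketch below is only an illustrative cross-check; the dictionary layout and the function name `summarize` are assumptions, not part of the annotation workflow.

```python
# (ORFs, ORFs with predicted function, orphams) per phage, from Table 1
TABLE1_GENES = {
    "Aubs": (103, 40, 0),
    "ShaboiShabazz": (62, 25, 0),
    "BABullseye": (97, 35, 1),
    "TomBrady": (63, 25, 0),
    "CheetoDust": (96, 39, 0),
    "Ageofdapage": (100, 38, 2),
}


def summarize(table):
    """Return total ORFs, total orphams, and percent of ORFs with a predicted function."""
    total_orfs = sum(orfs for orfs, _, _ in table.values())
    total_orphams = sum(orphams for _, _, orphams in table.values())
    percent_with_function = {
        phage: 100.0 * with_function / orfs
        for phage, (orfs, with_function, _) in table.items()
    }
    return total_orfs, total_orphams, percent_with_function


total_orfs, total_orphams, percent_with_function = summarize(TABLE1_GENES)
print(f"Total ORFs: {total_orfs}; orphams: {total_orphams}")
for phage, pct in percent_with_function.items():
    print(f"{phage}: {pct:.1f}% of ORFs with a predicted function")
```

With the Table 1 values, the totals come to 521 ORFs and 3 orphams, and the per-phage percentages fall between roughly 36% and just over 40%.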
ACKNOWLEDGMENTS
We thank Daniel Russell and Rebecca Garlena for sequencing and assembling the genomes, and the HHMI SEA-PHAGES program and VCU Life Sciences for support.
Contributor Information
Allison A. Johnson, Email: aajohnson@vcu.edu.
Kenneth M. Stedman, Portland State University, Portland, USA
DATA AVAILABILITY
The annotated genome sequences for each phage are available at GenBank: Ageofdapage (OR521076), Aubs (OP297542), BABullseye (OP434447), CheetoDust (OR521075), ShaboiShabazz (ON637759), and TomBrady (ON970618). Sequencing reads are available at the Sequence Read Archive (SRA) through accession numbers SRX22868876, SRX22868878, SRX22868879, SRX22868880, SRX22868882, and SRX22868883.
REFERENCES
- 1. Ul-Hassan A, Wellington EM. 2009. Actinobacteria, p 25–44. In Schaechter M (ed), Encyclopedia of Microbiology, Third Edition. Academic Press.
- 2. Dedrick RM, Smith BE, Cristinziano M, Freeman KG, Jacobs-Sera D, Belessis Y, Whitney Brown A, Cohen KA, Davidson RM, van Duin D, et al. 2023. Phage therapy of Mycobacterium infections: compassionate use of phages in 20 patients with drug-resistant mycobacterial disease. Clin Infect Dis 76:103–112. doi: 10.1093/cid/ciac453
- 3. Poxleitner M, Pope W, Jacobs-Sera D, Sivanathan V, Hatfull G. n.d. HHMI SEA-PHAGES phage discovery guide. Published Online 2018. Available from: https://seaphagesphagediscoveryguide.helpdocsonline.com/home
- 4. Russell DA. 2018. Sequencing, assembling, and finishing complete bacteriophage genomes. Methods Mol Biol 1681:109–125. doi: 10.1007/978-1-4939-7343-9
- 5. Hatfull GF. 2020. Actinobacteriophages: genomics, dynamics, and applications. Annu Rev Virol 7:37–61. doi: 10.1146/annurev-virology-122019-070009
- 6. Russell DA, Hatfull GF, Wren J. 2017. PhagesDB: the actinobacteriophage database. Bioinformatics 33:784–786. doi: 10.1093/bioinformatics/btw711
- 7. Besemer J, Borodovsky M. 2005. GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses. Nucleic Acids Res 33:W451–W454. doi: 10.1093/nar/gki487
- 8. Salzberg SL, Delcher AL, Kasif S, White O. 1998. Microbial gene identification using interpolated Markov models. Nucleic Acids Res 26:544–548. doi: 10.1093/nar/26.2.544
- 9. Rinehart CA, Gaffney B, Wood JD, Smith J. 2016. PECAAN, a phage evidence collection and annotation network. Published online 2016. Available from: https://discover.kbrinsgd.org/login
- 10. Laslett D, Canback B. 2004. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 32:11–16. doi: 10.1093/nar/gkh152
- 11. Lowe TM, Chan PP. 2016. tRNAscan-SE on-line: integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res 44:W54–W57. doi: 10.1093/nar/gkw413
- 12. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305:567–580. doi: 10.1006/jmbi.2000.4315
- 13. Tsirigos KD, Peters C, Shu N, Käll L, Elofsson A. 2015. The TOPCONS web server for consensus prediction of membrane protein topology and signal peptides. Nucleic Acids Res 43:W401–W407. doi: 10.1093/nar/gkv485
- 14. Söding J, Biegert A, Lupas AN. 2005. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 33:W244–W248. doi: 10.1093/nar/gki408
- 15. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402. doi: 10.1093/nar/25.17.3389
- 16. Pope WH, Jacobs-Sera D. 2018. Annotation of bacteriophage genome sequences using DNA Master: an overview. Methods Mol Biol 1681:217–229. doi: 10.1007/978-1-4939-7343-9_16
- 17. Cresawn SG, Bogel M, Day N, Jacobs-Sera D, Hendrix RW, Hatfull GF. 2011. Phamerator: a bioinformatic tool for comparative bacteriophage genomics. BMC Bioinformatics 12:395. doi: 10.1186/1471-2105-12-395
