ABSTRACT
Here, we report the genome sequences of five bacteriophages isolated from hospital wastewater, including two new species and two candidates for therapeutic application. No virulence, temperate marker, or antibiotic resistance genes were found in the genomes of Escherichia phage vB_VIPECOOM03 and Klebsiella phage vB_VIPKPNUMC01, making them suitable candidates for therapy.
KEYWORDS: whole-genome sequencing, bacteriophages, wastewater, Philippines, phage therapy
ANNOUNCEMENT
Antimicrobial resistance is a global health concern. In Southeast Asia, reported rates of infection with multidrug-resistant (MDR) bacteria range from 4% to 68% (1), and bacteriophages have recently been explored as a treatment option. Understanding phage diversity and genomic features is essential for developing an effective phage “cocktail” for therapeutic application. Here, we present the genome sequences of five bacteriophages isolated from hospital wastewater.
Sewage samples (1 L) collected from the wastewater facilities of Ospital ng Maynila Medical Center and De La Salle University Medical Center were filtered using a 0.22-µm syringe filter and co-cultured (20 mL wastewater + 20 mL Luria-Bertani broth) with the respective bacterial host (Table 1) for 24 h with agitation at 160 rpm. Phage isolation, purification, propagation, and titer determination were performed using plaque assays (2), and the phages were stored in SM buffer for further characterization.
TABLE 1.
Genome characteristics and accession numbers of the five bacteriophage genomes a
Phage name | Escherichia phage vB_VIPECOOM03 | Enterobacter phage vB_VIPECLOM01 | Enterobacter phage vB_VIPECLUMC02 | Klebsiella phage vB_VIPKPNUMC01 | Pseudomonas phage vB_VIPPAEUMC01 |
---|---|---|---|---|---|
Propagation host | Escherichia coli ATCC 25922 | Enterobacter cloacae ATCC 13047 | Enterobacter cloacae ATCC 13047 | Klebsiella pneumoniae ATCC 13883 | Pseudomonas aeruginosa ATCC 15442 |
No. of raw reads | 3,132,575 | 2,984,666 | 2,141,041 | 2,004,323 | 8,875,898 |
No. of clean reads | 2,702,757 | 2,798,181 | 1,891,580 | 1,812,001 | 8,031,297 |
Genome size (bp) | 168,519 | 171,903 | 172,129 | 167,797 | 92,158 |
GC content (%) | 35.49 | 39.83 | 39.80 | 39.55 | 49.35 |
Mean coverage | 6,309 | 5,448 | 4,003 | 4,137 | 9,978 |
No. of CDS | 269 | 288 | 287 | 279 | 173 |
No. of genes | 279 | 306 | 305 | 295 | 188 |
No. of genes with predicted function | 149 (53%) | 148 (48%) | 148 (48%) | 147 (50%) | 66 (35%) |
No. of tRNAs | 10 | 18 | 18 | 16 | 14 |
Head-neck-tail organization | Myoviridae of neck type 2 | Myoviridae of neck type 2 | Myoviridae of neck type 2 | Myoviridae of neck type 2 | Myoviridae of neck type 1 (cluster 7) |
Lifestyle prediction | Lytic | Lytic | Lytic | Lytic | Lytic |
Temperate marker genes | – | Integrase (44,848–45,846) | Integrase (44,978–45,976) | – | cro gene (44,740–44,973) |
Presence of antibiotic resistance genes | – | – | – | – | – |
Presence of virulence genes | – | – | – | – | – |
BioProject accession no. | PRJNA943845 | PRJNA943928 | PRJNA943930 | PRJNA943931 | PRJNA943933 |
SRA accession no. | SRR23866065 | SRR23866658 | SRR23866659 | SRR23865899 | SRR23865898 |
GenBank accession no. | OQ721911 | OQ721912 | OQ721913 | OQ721914 | OQ721915 |
“–” indicates the absence of genes in phage genomes.
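The read counts in Table 1 can be summarized with a few lines of arithmetic. The following is a minimal Python sketch, with values copied from Table 1, that computes the mean number of raw reads per sample and the percentage of reads retained after trimming. The variable and function names are illustrative only and are not part of the authors' workflow.

```python
# Values copied from Table 1; all names here are illustrative, not from the authors' pipeline.
PHAGES = ["vB_VIPECOOM03", "vB_VIPECLOM01", "vB_VIPECLUMC02",
          "vB_VIPKPNUMC01", "vB_VIPPAEUMC01"]
RAW_READS = [3_132_575, 2_984_666, 2_141_041, 2_004_323, 8_875_898]
CLEAN_READS = [2_702_757, 2_798_181, 1_891_580, 1_812_001, 8_031_297]


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Mean number of raw reads across the five libraries.
mean_raw_reads = sum(RAW_READS) / len(RAW_READS)

# Percentage of raw reads that survived quality trimming, per phage.
retained = [percent(clean, raw) for clean, raw in zip(CLEAN_READS, RAW_READS)]

print(f"Mean raw reads per sample: {mean_raw_reads:,.1f}")
for name, kept in zip(PHAGES, retained):
    print(f"{name}: {kept:.1f}% of raw reads retained")
```

The mean is dominated by the Pseudomonas phage library, which has roughly three to four times as many reads as the other four.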
Genomic DNA was extracted using a phage DNA isolation kit (Norgen Biotek Corp., Thorold, ON, Canada). DNA libraries were prepared using the Illumina DNA Prep kit, and whole-genome sequencing was performed at the DOST-ITDI virology laboratory on an Illumina MiSeq platform (2 × 250 bp paired-end), generating an average of ~3,827,700 reads per sample. Read quality was assessed using FastQC v0.11.9 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Reads were trimmed with Trimmomatic v0.39 (3) and de novo assembled using SPAdes v3.14 (parameters: --careful --only-assembler -k 21,33,55,77,99,127) (4), and the genomes were then reoriented to match the position of the rIIA/terL genes of the closest relative. To calculate genome coverage, reads were mapped back to the assembly using Bowtie2 v2.5.0 (5), and the alignments were sorted and indexed with SAMtools v1.16.1 (6). Finally, genomes were polished using Pilon v1.24 (7), while quality and completeness were assessed using QUAST v5.0.2 (8) and CheckV v1.0.1 (9). Gene calling and annotation were conducted using Prokka v1.14.6 (10) with the PHROGs database (11). Putative tRNAs and virion structural proteins were predicted using ARAGORN v1.2.41 (12) and STEP3 (13), respectively. Default parameters were used for all software unless otherwise specified.
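The text does not state how mean coverage was derived from the mapped reads. One common approach, shown here only as a hedged sketch, is to average the per-base depth reported by `samtools depth -a`, which writes tab-separated lines of contig name, position, and depth. The function name `mean_coverage` and the toy input are hypothetical and serve illustration only.

```python
import io


def mean_coverage(depth_lines, genome_length=None):
    """Average depth from `samtools depth`-style lines (contig, position, depth).

    If genome_length is given, it is used as the denominator so that positions
    missing from the input count as zero depth; otherwise the number of
    reported positions is used.
    """
    total_depth = 0
    positions = 0
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _contig, _position, depth = line.split("\t")[:3]
        total_depth += int(depth)
        positions += 1
    denominator = genome_length if genome_length is not None else positions
    if denominator == 0:
        raise ValueError("no positions to average over")
    return total_depth / denominator


# Toy input (assumed values, not real data): three positions with depths 10, 20, 30.
toy = "phage\t1\t10\nphage\t2\t20\nphage\t3\t30\n"
print(mean_coverage(io.StringIO(toy)))                   # 20.0
print(mean_coverage(toy.splitlines(), genome_length=4))  # 15.0
```

Passing the assembled genome length explicitly guards against overestimating coverage when zero-depth positions are omitted from the depth file.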
Genome sizes range from 92,158 to 172,129 bp, with GC contents of 35.49%–49.35% and mean coverage of 4,003× to 9,978× (Table 1). CheckV (9) identified the genomes as “complete” with direct terminal repeats and of high quality based on MIUViG criteria (14). In silico analysis (15, 16) predicted a lytic lifestyle and myovirus-like morphology for all isolates; however, integrase and cro (repressor) genes were present in the Enterobacter and Pseudomonas phage genomes, respectively, suggesting a potential for a temperate lifestyle. No virulence, toxin, or antibiotic resistance genes were detected using PhageLeads (17).
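Genome size and GC content of the kind reported in Table 1 can be recomputed directly from an assembled FASTA file. The sketch below is a minimal Python illustration under the assumption of a single-record FASTA; the function names and the toy sequence are hypothetical and are not taken from the authors' workflow.

```python
def read_fasta_sequence(lines):
    """Concatenate the sequence lines of a single-record FASTA, skipping the header."""
    return "".join(
        line.strip() for line in lines
        if line.strip() and not line.startswith(">")
    ).upper()


def genome_stats(sequence):
    """Return (length in bp, GC content in percent) for a nucleotide sequence."""
    length = len(sequence)
    if length == 0:
        raise ValueError("empty sequence")
    gc_count = sequence.count("G") + sequence.count("C")
    return length, 100.0 * gc_count / length


# Toy record (assumed, not real data): 10 bp with 4 G/C bases.
toy_fasta = [">toy_phage", "ATGCGC", "ATAT"]
length, gc_percent = genome_stats(read_fasta_sequence(toy_fasta))
print(length, round(gc_percent, 2))  # 10 40.0
```

For a real assembly, the lines would come from an opened file handle rather than a list.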
For taxonomic classification, closely related genomes were obtained from the NCBI database, and intergenomic similarities were computed using VIRIDIC v1.1 (Fig. 1) (18). Following the International Committee on Taxonomy of Viruses guidelines for demarcation of virus taxonomic ranks (19, 20), we identified the Escherichia and Pseudomonas phages as novel species belonging to the genera Tequatrovirus (94.1%) and Pakpunavirus (93.5%), respectively; they will be classified as “Tequatrovirus vipecoom” and “Pakpunavirus vippaeumc.” The Enterobacter and Klebsiella phages shared >95% genomic similarity with known phage genomes (Fig. 1).
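The demarcation logic applied to the VIRIDIC similarities can be written as a simple threshold rule. The sketch below is illustrative only: the 95% species threshold is the one stated in the Fig. 1 legend, whereas the 70% genus threshold is an assumption taken from the genome-based taxonomy guidelines cited above (19) and is not stated in this announcement. The function name `demarcate` is hypothetical.

```python
SPECIES_THRESHOLD = 95.0  # stated in the Fig. 1 legend
GENUS_THRESHOLD = 70.0    # assumption: from the cited guidelines (19), not stated in this text


def demarcate(similarity_percent):
    """Classify a query relative to its closest reference by intergenomic similarity (%)."""
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity must be between 0 and 100")
    if similarity_percent >= SPECIES_THRESHOLD:
        return "same species"
    if similarity_percent >= GENUS_THRESHOLD:
        return "same genus, new species"
    return "different genus"


# Similarities to the closest relative as reported in the text.
print(demarcate(94.1))  # Escherichia phage: same genus, new species
print(demarcate(93.5))  # Pseudomonas phage: same genus, new species
```

Under this rule, the Enterobacter and Klebsiella phages, which exceed 95% similarity to known genomes, fall into existing species.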
Fig 1.
Comparative genome analysis of the assembled and reference genomes using the Virus Intergenomic Distance Calculator. The five closest relatives were determined using a BLASTn search in NCBI. Darker colors indicate higher intergenomic similarity between genomes, with the percent similarity (%) to the closest relative highlighted in bold. Genomes sharing ≥95% sequence similarity belong to the same species. The numbers in the lower left part are alignment indicators. In closely related phages, a high fraction of the genome is aligned (orange to white) and genome lengths are expected to be similar (black to white). The three coliphages (vB_VIPECOOM01, vB_VIPECOOM02, and vB_VIPECOOM03) isolated from the same location belong to the same species, and only Escherichia phage vB_VIPECOOM03 is reported here. The phage isolates and their accession numbers are in blue font.
Thus far, we have identified two candidate phages for phage therapy, which may be further explored to develop endolysin-derived antimicrobial agents. More research on phage biology and genomic characterization is still needed to develop a broad-host-range therapeutic cocktail against MDR pathogens.
ACKNOWLEDGMENTS
The authors thank Dr. Andrew Millard for guidance in preparing the taxonomic proposal to the International Committee on Taxonomy of Viruses (ICTV), the Computing and Archiving Research Environment (COARE) of the Department of Science and Technology-Advanced Science and Technology Institute (DOST–ASTI) for the computing resources, and the management of the Ospital ng Maynila Medical Center (OMMC) and the De La Salle University Medical Center (DLSUMC) for allowing us to collect sewage samples at their facilities.
This study was funded by the Department of Science and Technology-Philippine Council for Health Research and Development (DOST–PCHRD) Grant number LFP-EBD-2021-02.
Contributor Information
Michael Angelou L. Nada, Email: mikeangelounada@gmail.com.
Ursela G. Bigol, Email: ugbigol@itdi.dost.gov.ph.
Catherine Putonti, Loyola University Chicago, Chicago, Illinois, USA.
DATA AVAILABILITY
The raw sequences (SRA) and genome assemblies were deposited in DDBJ/ENA/GenBank under the accession numbers listed in Table 1.
REFERENCES
- 1. Carascal MB, Dela Cruz-Papa DM, Remenyi R, Cruz MCB, Destura RV. 2022. Phage revolution against multidrug-resistant clinical pathogens in Southeast Asia. Front Microbiol 13:820572. doi: 10.3389/fmicb.2022.820572
- 2. Clokie MR, Kropinski A. 2009. Methods and protocols, volume 1: Isolation, characterization, and interactions, p 69–81. In Methods in molecular biology. Humana Press.
- 3. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170
- 4. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021
- 5. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359. doi: 10.1038/nmeth.1923
- 6. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079. doi: 10.1093/bioinformatics/btp352
- 7. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM, Wang J. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9:e112963. doi: 10.1371/journal.pone.0112963
- 8. Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29:1072–1075. doi: 10.1093/bioinformatics/btt086
- 9. Nayfach S, Camargo AP, Schulz F, Eloe-Fadrosh E, Roux S, Kyrpides NC. 2021. CheckV assesses the quality and completeness of metagenome-assembled viral genomes. Nat Biotechnol 39:578–585. doi: 10.1038/s41587-020-00774-7
- 10. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069. doi: 10.1093/bioinformatics/btu153
- 11. Terzian P, Olo Ndela E, Galiez C, Lossouarn J, Pérez Bucio RE, Mom R, Toussaint A, Petit M-A, Enault F. 2021. PHROG: families of prokaryotic virus proteins clustered using remote homology. NAR Genom Bioinform 3:lqab067. doi: 10.1093/nargab/lqab067
- 12. Laslett D, Canback B. 2004. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 32:11–16. doi: 10.1093/nar/gkh152
- 13. Thung TY, White ME, Dai W, Wilksch JJ, Bamert RS, Rocker A, Stubenrauch CJ, Williams D, Huang C, Schittelhelm R, Barr JJ, Jameson E, McGowan S, Zhang Y, Wang J, Dunstan RA, Lithgow T. 2021. Component parts of bacteriophage virions accurately defined by a machine-learning approach built on evolutionary features. mSystems 6:e0024221. doi: 10.1128/mSystems.00242-21
- 14. Roux S, Adriaenssens EM, Dutilh BE, Koonin EV, Kropinski AM, Krupovic M, Kuhn JH, Lavigne R, Brister JR, Varsani A, Amid C, Aziz RK, Bordenstein SR, Bork P, Breitbart M, Cochrane GR, Daly RA, Desnues C, Duhaime MB, Emerson JB, Enault F, Fuhrman JA, Hingamp P, Hugenholtz P, Hurwitz BL, Ivanova NN, Labonté JM, Lee K-B, Malmstrom RR, Martinez-Garcia M, Mizrachi IK, Ogata H, Páez-Espino D, Petit M-A, Putonti C, Rattei T, Reyes A, Rodriguez-Valera F, Rosario K, Schriml L, Schulz F, Steward GF, Sullivan MB, Sunagawa S, Suttle CA, Temperton B, Tringe SG, Thurber RV, Webster NS, Whiteson KL, Wilhelm SW, Wommack KE, Woyke T, Wrighton KC, Yilmaz P, Yoshida T, Young MJ, Yutin N, Allen LZ, Kyrpides NC, Eloe-Fadrosh EA. 2019. Minimum information about an uncultivated virus genome (MIUViG). Nat Biotechnol 37:29–37. doi: 10.1038/nbt.4306
- 15. McNair K, Bailey BA, Edwards RA. 2012. PHACTS, a computational approach to classifying the lifestyle of phages. Bioinformatics 28:614–618. doi: 10.1093/bioinformatics/bts014
- 16. Lopes A, Tavares P, Petit M-A, Guérois R, Zinn-Justin S. 2014. Automated classification of tailed bacteriophages according to their neck organization. BMC Genomics 15:1027. doi: 10.1186/1471-2164-15-1027
- 17. Yukgehnaish K, Rajandas H, Parimannan S, Manickam R, Marimuthu K, Petersen B, Clokie MRJ, Millard A, Sicheritz-Pontén T. 2022. Phageleads: rapid assessment of phage therapeutic suitability using an ensemble machine learning approach. Viruses 14:342. doi: 10.3390/v14020342
- 18. Moraru C, Varsani A, Kropinski AM. 2020. VIRIDIC-A novel tool to calculate the intergenomic similarities of prokaryote-infecting viruses. Viruses 12:1268. doi: 10.3390/v12111268
- 19. Turner D, Kropinski AM, Adriaenssens EM. 2021. A roadmap for genome-based phage taxonomy. Viruses 13:506. doi: 10.3390/v13030506
- 20. Turner D, Shkoporov AN, Lood C, Millard AD, Dutilh BE, Alfenas-Zerbini P, van Zyl LJ, Aziz RK, Oksanen HM, Poranen MM, Kropinski AM, Barylski J, Brister JR, Chanisvili N, Edwards RA, Enault F, Gillis A, Knezevic P, Krupovic M, Kurtböke I, Kushkina A, Lavigne R, Lehman S, Lobocka M, Moraru C, Moreno Switt A, Morozova V, Nakavuma J, Reyes Muñoz A, Rūmnieks J, Sarkar BL, Sullivan MB, Uchiyama J, Wittmann J, Yigang T, Adriaenssens EM. 2023. Abolishment of morphology-based Taxa and change to binomial species names: 2022 taxonomy update of the ICTV bacterial viruses subcommittee. Arch Virol 168:74. doi: 10.1007/s00705-022-05694-2