Summary
Phages exert profound evolutionary pressure on bacteria by interacting with receptors on the cell surface to initiate infection. While the majority of phages use chromosomally-encoded cell surface structures as receptors, plasmid-dependent phages exploit plasmid-encoded conjugation proteins, making their host range dependent on horizontal transfer of the plasmid. Despite their unique biology and biotechnological significance, only a small number of plasmid-dependent phages have been characterized. Here we systematically search for new plasmid-dependent phages targeting IncP and IncF plasmids using a targeted discovery platform, and find that they are common and abundant in wastewater, and largely unexplored in terms of their genetic diversity. Plasmid-dependent phages are enriched in non-canonical types of phages, and all but one of the 64 phages we isolated were non-tailed, and members of the lipid-containing tectiviruses, ssDNA filamentous phages or ssRNA phages. We show that plasmid-dependent tectiviruses exhibit profound differences in their host range which is associated with variation in the phage holin protein. Despite their relatively high abundance in wastewater, plasmid-dependent tectiviruses are missed by metaviromic analyses, underscoring the continued importance of culture-based phage discovery. Finally, we identify a tailed phage dependent on the IncF plasmid, and find related structural genes in phages that use the orthogonal type 4 pilus as a receptor, highlighting the promiscuous use of these distinct contractile structures by multiple groups of phages. Taken together, these results indicate plasmid-dependent phages play an under-appreciated evolutionary role in constraining horizontal gene transfer via conjugative plasmids.
Viral infections pose a constant threat to the majority of life on Earth1,2. Viruses recognize their hosts by interacting with structures (receptors) on the cell surface3. For viruses that infect bacteria (phages), these receptors are usually encoded on the chromosome, and are part of core cellular processes including transporter proteins or structurally integral lipopolysaccharides4. However, certain mobile genetic elements, such as conjugative plasmids, also contribute to the cell surface landscape by building secretory structures (e.g. type 4 secretion systems) which enable transfer into neighboring bacterial cells5,6. Plasmid-dependent phages (PDPs) have evolved to use these plasmid-encoded structures as receptors, and can only infect plasmid-containing bacteria7. Conjugative plasmids can often transmit between distantly related bacterial cells, creating new phage-susceptible hosts by horizontal transfer of receptors8.
Almost all previously identified PDPs belong to unusual ‘non-tailed’ groups of phages, some of which have more in common with eukaryotic viruses than the ‘tailed’ phages that make up the majority of bacterial virus collections9,10. This includes the dsDNA alphatectiviruses, and members of the ssDNA inoviruses and ssRNA fiersviruses. The handful of known PDPs have had profound impacts on molecular biology, enabling phage display technology11 (F plasmid-dependent phage M13), and in vivo RNA imaging12 (F plasmid-dependent phage MS2). PDPs have also aided in our understanding of the origin of viruses: tectiviruses are thought to represent ancient ancestors of adenoviruses13.
Predation by PDPs exerts strong selection on bacteria to lose conjugative plasmids or to mutate/repress the conjugation machinery including the pilus14–17. As antibiotic resistance genes are frequently carried and spread by conjugative plasmids18–21, selection against plasmid carriage functionally selects against antibiotic resistance in many instances. The extent to which this is a significant evolutionary pressure on antibiotic resistance depends on how frequent these phages are in nature.
Despite the remarkable properties of these phages and their intriguing association with conjugative plasmids, only a handful of PDPs exist in culture. In the 1970s-80s at least 39 different PDPs were reported targeting 17 different plasmid types (classified by “incompatibility” groups)7. However, most of these reports predated the era of genome sequencing, and to our knowledge, most of the reported PDPs have been lost to science. Here, we use a targeted discovery approach to show that PDPs are easily discoverable in the environment, and associated with unappreciated genetic and phenotypic diversity.
Co-culture enables direct discovery of plasmid-dependent phages
Plasmid-dependent phages have historically been identified and quantified by taking phage collections isolated on bacteria containing conjugative plasmids and screening them on isogenic plasmid-free bacteria to look for plasmid-specific phenotypes22. As PDPs use the proteins expressed by conjugative plasmids as receptors, their host range mirrors plasmid host range, and typically crosses bacterial genera7. Exploiting this property, some studies have used multi-species enrichment methods to increase the likelihood of finding plasmid-dependent phages that can infect multiple different plasmid-containing hosts23,24. Alternatively, enrichment of PDPs by the de-enrichment of species-specific phages, so-called “somatic” phages, has been described25. However, these enrichment methods do not allow the direct quantification of PDPs relative to somatic phages in a sample, and suffer from other drawbacks such as increased likelihood of repeated isolation of the same phage, and bias against PDPs that may use both species-specific and plasmid-encoded receptors. In order to isolate and directly assess the abundance and diversity of PDPs in the environment, we set out to develop a targeted isolation approach. The challenge of targeted isolation is discriminating PDPs, in a direct, non-labor-intensive way, from somatic phages, which may be more or less abundant than PDPs depending on the environmental sample, the plasmid in question and the host species26.
To differentiate PDPs, we co-cultured Salmonella enterica and Pseudomonas putida, a pair of taxonomically distinct bacteria with no known shared phages, that grew well in co-culture. We also selected a known PDP, the Alphatectivirus PRD1, which depends on IncP group conjugative plasmids such as RP4 and pKJK5, and can infect S. enterica and P. putida provided they contain an IncP plasmid. We made a modification to the traditional phage plaque assay, by co-culturing these strains with differential fluorescent tags, together in the same soft-agar lawn. After applying dilutions of phages, the plaquing phenotype of the PDP PRD1, which efficiently killed both fluorescently labeled strains in the lawn (resulting in no fluorescent signal) was immediately discernible from species-specific phage 9NA (infecting S. enterica) and SVOΦ44 (infecting P. putida) (Figure 1a). This observation formed the basis of the targeted phage discovery method we termed “Phage discovery by coculture” (Phage DisCo) (Figure 1b).
Figure 1 |. A method for systematic discovery of plasmid-dependent phages by fluorescence assisted co-culture (Phage DisCo).
a, Comparison between monocultured lawns and a co-cultured lawn. All images show merged GFP and mScarlet fluorescence channels (GFP shown in blue for visualization purposes). In monocultured lawns with exclusively S. enterica RP4 (red) or P. putida RP4 (blue), only plasmid-dependent phage PRD1 and the appropriate species-specific phages (S. enterica phage 9NA or P. putida phage SVOϕ44) generate plaques. In the co-culture lawn (magenta, showing the overlap of both bacterial hosts), the species-specific phages form plaques on one of the species while plasmid-dependent phage PRD1 forms plaques on both species. b, Schematic of the Phage DisCo method and screening strategy. Environmental samples were collected from around Boston, USA, and processed into a co-culture lawn with two plasmid-carrying bacterial hosts labeled with different fluorescent markers. After incubation, the plates were imaged in both fluorescence channels. The merged image was then used to distinguish species-specific phages (forming red or blue plaques) from plasmid-dependent phages (forming dark plaques). c, Imaging of co-cultured lawn with white light or fluorescent light channels, with approximately equimolar concentrations of phages shown in (a) to simulate a screening plate from an environmental sample containing plasmid-dependent and species-specific phages. Individual plaques are clearly discernible as 9NA (blue plaques), SVOϕ44 (red plaques), and PRD1 (dark plaques). d, Sankey diagram summarizing the properties of the 64 novel plasmid-dependent phages isolated in this study, broken down by plasmid-dependency, phage genus and genome type. 62 phages were obtained from the broad-host range phage DisCo screen (light gray horizontal blocks), and an additional 2 phages were isolated from a narrow-host range screen (dark gray horizontal blocks).
To directly isolate PDPs dependent on the IncP plasmids using Phage DisCo, environmental samples putatively containing PDPs can be mixed together with fluorescently labeled S. enterica and P. putida strains containing the IncP conjugative plasmid RP4 (Figure 1b). After growth of the bacterial lawn, phages are immediately identifiable by the fluorescence phenotype of their plaques: P. putida phages appear as red plaques where only S. enterica RP4 (red) is able to grow, S. enterica phages present as blue plaques where only P. putida RP4 (blue) is able to grow, and PDPs make colorless plaques where both bacteria in the lawn are killed (Figure 1b). As a proof of principle, we mixed equimolar amounts of the test phages, 9NA, SVOΦ44 and PRD1, to simulate an environmental sample containing both species-specific phages and PDPs (Figure 1c). After incubation and growth of the bacteria in the lawn, the plate was photographed using a custom fluorescence imaging setup (Methods). Once the two fluorescent image channels were digitally merged, all three phages were easy to identify by fluorescence phenotype, and importantly, the PRD1 plaques could be easily discerned from the plaques made by the two species-specific phages.
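The plaque-calling logic described above can be sketched as a small decision rule. This is a minimal illustration rather than the imaging pipeline used in the study: the function name, the per-plaque mean intensities and the `kill_fraction` threshold are all hypothetical, and it assumes (following Figure 1) that S. enterica RP4 is read in the red channel and P. putida RP4 in the blue channel.

```python
def classify_plaque(red, blue, lawn_red, lawn_blue, kill_fraction=0.5):
    """Classify one plaque from its mean fluorescence in each channel.

    red, blue: mean intensities inside the plaque (hypothetical units).
    lawn_red, lawn_blue: mean intensities of the intact co-culture lawn.
    kill_fraction: assumed threshold; a host counts as killed when its
    signal drops below this fraction of the surrounding lawn signal.
    """
    if lawn_red <= 0 or lawn_blue <= 0:
        raise ValueError("lawn intensities must be positive")

    salmonella_killed = red < kill_fraction * lawn_red    # red host lost
    pseudomonas_killed = blue < kill_fraction * lawn_blue  # blue host lost

    if salmonella_killed and pseudomonas_killed:
        # Both hosts cleared: dark plaque, candidate plasmid-dependent phage.
        return "plasmid-dependent"
    if salmonella_killed:
        # Only P. putida (blue) remains: blue plaque.
        return "S. enterica-specific"
    if pseudomonas_killed:
        # Only S. enterica (red) remains: red plaque.
        return "P. putida-specific"
    return "no plaque"
```

In practice a threshold of this kind would need to be calibrated against control plaques such as PRD1, 9NA and SVOΦ44 on the same plate.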
Having established the efficacy of the Phage DisCo method using phages we had in culture, we set out to look for new PDPs in samples collected from compost, farm waste and wastewater in the Greater Boston area (Massachusetts, USA) (Table S1e). We chose to focus on phages depending on conjugative plasmids of the IncP and IncF incompatibility groups. IncP and IncF plasmids are also associated with extensive antibiotic resistance gene cargo and are frequently isolated from environmental27 or clinical28,29 samples, respectively. The archetypal IncF plasmid, the F plasmid originally isolated from E. coli K12, has a narrower host range than IncP plasmids, so we changed the DisCo host strains to E. coli and S. enterica. As S. enterica strains natively encode an IncF plasmid, we used a derivative that had been cured of all plasmids and prophages to mitigate any interference from these elements. Initially we collected 50 novel unique phages dependent on the IncP plasmid, and 12 dependent on the IncF plasmid. In order to identify any narrow-host range phages that may have been missed in our IncP search using S. enterica and P. putida, we capitalized on the low abundance of P. putida phages in wastewater to perform a traditional single-host phage screen (Methods) that captured an additional 2 narrow-host range phages dependent on the IncP plasmid. Therefore, in total, we collected 64 novel plasmid-dependent phages in this study (Figure 1d). All phages were further characterized by genome sequencing and we adopted a naming system wherein each phage was given a unique color identifier with a prefix consistent with previously isolated plasmid-dependent phages (Methods).
IncP plasmid-dependent tectiviruses from a limited geographic area fully encompass the previously known global diversity
Genomic analysis revealed that 51 of the 52 IncP-dependent phages in our collection belonged to the Alphatectivirus genus in the Tectiviridae family, and are related to Enterobacteriophage PRD1. Surprisingly, despite our sampling being limited to a small geographical area and short time frame, the phages we isolated represented significantly more genomic diversity than the six previously known plasmid-dependent tectiviruses that were isolated across multiple continents, suggesting these phages are greatly undersampled. We estimate our collection expands the genus Alphatectivirus from two species (represented by type isolates PRD1 and PR4) to 12, as determined by pairwise nucleotide identity of all Alphatectiviruses, including the six previously known Alphatectiviruses and our 51 new isolates (Figure S1a) (species cut-off <95% nucleotide identity, according to guidelines published by the International Committee on Taxonomy of Viruses (ICTV))30.
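To illustrate how a 95% pairwise identity cut-off turns a matrix of genome comparisons into a species count, the sketch below groups genomes by single-linkage clustering. The choice of single linkage, the function name and the input layout are assumptions made for illustration; the study's actual procedure is described in its Methods and may differ.

```python
def cluster_species(names, identity, cutoff=95.0):
    """Group genomes into putative species by single-linkage clustering.

    names: list of genome names.
    identity: dict mapping (name_a, name_b) to percent nucleotide identity.
    cutoff: genomes at or above this identity are placed in one species.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the cluster representative,
        # compressing the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), pct in identity.items():
        if pct >= cutoff:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())
```

With three hypothetical genomes where only the first pair exceeds the cut-off, the function returns two clusters, so the number of species is simply `len(cluster_species(...))`.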
Additionally, by querying genome databases we identified one published tectivirus genome, Burkholderia phage BCE1, closely related to PRD1 by whole genome phylogeny (Figure 2a). As Burkholderia sp. are known hosts of IncP-type conjugative plasmids31 we expect that the Burkholderia cenocepacia host used to isolate BCE1 carried such a plasmid (highlighting the serendipitous nature by which PDPs are often found) and we include BCE1 in our known plasmid-dependent tectivirus phylogeny. Novel conjugative plasmids have recently been detected in Burkholderia contaminans isolates, after their existence was implicated by the isolation of alphatectiviruses on these strains32, suggesting Burkholderia species may be common hosts for these plasmids and phages.
Figure 2 |. Targeted discovery of plasmid-dependent phages reveals unappreciated diversity of alphatectiviruses.
a, Maximum likelihood tree of all known alphatectiviruses (generated with the whole genome, 14888 sites). Branch tips in red represent the novel phages isolated in this study. All other colors (highlighted in the enlarged section of the tree) represent all previously known representatives of this phage group. b, Map showing the site and isolation year of phages shown in (a). This collection includes and vastly expands the previously known diversity, despite sampling being more geographically and temporally constrained. c, Nucleotide diversity across our collection of alphatectivirus genomes (n=51). The genome map is colored to better display the nucleotide diversity value inside the gene body. Red coloration in the gene arrow symbols indicates high nucleotide diversity and blue indicates low nucleotide diversity; values correspond to the histogram above.
While the new plasmid-dependent tectiviruses we report greatly expand the known diversity of this group of phages, we found that all 51 phages in our collection had perfectly conserved gene synteny (Figure S1b). Just like the 6 previously known alphatectiviruses, they have no accessory genome and contain homologs of all 31 predicted coding genes of the PRD1 reference genome, suggesting strong constraints on genomic expansion in this group of phages. However, the isolates in our collection contain a large number of single nucleotide polymorphisms (SNPs) distributed across the entire ~15 Kb genome (Figure 2c), and isolates ranged from 82.5% to 99% average pairwise nucleotide identity. Certain regions of the genome are highly associated with polymorphism across our collection, such as the center and C-terminus of the DNA polymerase gene, I (P1). Two small genes towards the end of the phage genome, XXXVII (P37) and XIX (P19), are especially associated with nucleotide polymorphisms across our genome collection. Interestingly, XXXVII (also called gp v, P37) is the outer-membrane unit of a two-component spanin system thought to be responsible for fusion of the inner and outer membrane in the final stages of cell lysis33.
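Per-site nucleotide diversity of the kind plotted in Figure 2c can be computed from a multiple sequence alignment as the probability that two randomly chosen genomes differ at a given column. The sketch below uses that standard definition; the function names are hypothetical and the treatment of gaps and ambiguous bases (ignored here) is an assumption, not a description of the study's pipeline.

```python
from collections import Counter


def site_diversity(column):
    """Probability that two sequences drawn without replacement differ here."""
    bases = [b for b in column.upper() if b in "ACGT"]  # skip gaps and Ns
    n = len(bases)
    if n < 2:
        return 0.0
    counts = Counter(bases)
    # Ordered pairs of identical bases, out of n * (n - 1) ordered pairs.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


def nucleotide_diversity(alignment):
    """Return per-site diversity for a list of equal-length aligned sequences."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    return [
        site_diversity("".join(seq[i] for seq in alignment))
        for i in range(length)
    ]
```

Averaging the returned values over a sliding window or over each gene body would give a profile comparable in spirit to the histogram and colored gene map in the figure.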
Newly isolated ssRNA and ssDNA phages targeting IncF and IncP plasmids
In total we isolated eight novel ssRNA phages (Figure 1d), seven targeting the IncF plasmid and one targeting IncP. All eight ssRNA phages were related to phages in the Fiersviridae family of ssRNA phages and had mostly syntenic genome architectures. Analysis of the sequence of the RNA-dependent RNA polymerase or replicase, rep, protein homologs of the eight phages in context of other reference phage sequences showed that three of the IncF-dependent phages belonged to the Emesvirus genus and were closely related to phage MS2. The other four IncF-dependent ssRNA phages were related to Qbeta, in the Qubevirus genus. Finally, the IncP-dependent ssRNA phage, PRRlime, was closely related to phage PRR1, suggesting it is the same species and the second isolate of the Perrunavirus genus. Interestingly, although PRRlime was the only IncP-dependent ssRNA phage we isolated, by amino acid identity of the replicase protein it is more closely related to the MS2 phage group, than the MS2 and Qbeta groups are to each other (Figure 3a). None of the new ssRNA phages in our collection exhibited <80% replicase protein amino acid identity to the closest reference isolate, and therefore do not meet proposed cutoff criteria for new ssRNA phage species34.
Figure 3 |. Phage DisCo uncovers new diversity even in the best-characterized (IncF) plasmid-dependent phage system.
a, b, Phylogenetic trees showing newly isolated (red branch tips) and known (gray branch tips) F-dependent phages from the Fiersviridae family (ssRNA) and the Inovirus genus (ssDNA). Fiersviridae tree is based on the RNA-dependent RNA polymerase (635 amino acid sites), Inovirus tree is based on whole genome alignment (6425 nucleotide sites). c, Genome map of the novel IncF plasmid-dependent phage, FtMidnight, highlighting structural genes. Genes highlighted in green are predicted to be involved in head formation, the head-tail attachment and tail tube region is highlighted in blue, and distal tail region in pink. d, Transmission electron micrograph of FtMidnight, confirming it has siphovirus morphology (long non-contractile tail).
Finally, we isolated five new ssDNA phages targeting the IncF plasmid. All five phages were found to be related to the filamentous phage M13, within the Inovirus genus (Figure 3b). One of the novel inoviruses, FfLavender, was significantly different from others in our analysis and shared 88% identity to phage M13 at the nucleic acid level across the whole genome. In line with current taxonomic guidelines, we propose that FfLavender is the progenitor of the novel species Inovirus lavender. In general, we observed less relative diversity in the IncF-dependent phages than the IncP-dependent phages described above.
A novel tailed phage targeting the F plasmid is related to phages targeting an orthogonal contractile pilus
The final IncF plasmid-dependent phage we isolated, which we named FtMidnight, was found to have a 40,995 bp dsDNA genome containing putative tail genes (Figure 3c). This finding distinguishes FtMidnight from any known IncF plasmid-dependent phage, which all belong to non-tailed ssRNA or ssDNA phage groups. Transmission electron microscopy confirmed that FtMidnight is a tailed phage resembling the morphological class of flexible tailed siphoviruses (Figure 3d). To confirm the interaction of FtMidnight with the conjugation machinery of the F plasmid specifically, we sampled phage resistant micro-colonies from within FtMidnight plaques to obtain two resistant mutants that still encoded the plasmid-borne antibiotic resistance marker. The FtMidnight-resistant mutants were collaterally resistant to F-plasmid dependent phages MS2 and Qbeta (Figure S2c), and sequencing revealed the two mutants had independent SNPs causing a frameshift and a premature truncation of conjugation proteins TraA and TraF respectively (Figure S2b). Both these proteins are essential components of the conjugative pilus, suggesting that ablation of the conjugative pilus renders cells resistant to FtMidnight, and that the phage interacts directly with the pilus.
As there are no phage tail proteins known to interact with plasmid-encoded conjugation machinery, identification of the FtMidnight receptor binding protein might have long term applications in the development of alternative antimicrobial therapies35. To this end, we used structure-guided homology search to infer the probable structure and function of the FtMidnight tail proteins. Based on similarity to the distantly related marine roseophage vB_DshS-R4C36, we identified a cluster of 5 proteins that are predicted to compose the distal, receptor-interacting, end of the FtMidnight tail, gp18–22 (Figure 3c, S3a). By searching for homologs of these genes, we detected a number of siphoviral phage genomes that possess related distal tail regions to FtMidnight (Figure S3b). Intriguingly, many of these FtMidnight-related phages, which were isolated on hosts including Pseudomonas, Xylella and Stenotrophomonas, have been documented to use the type 4 pilus (also called T4P or type IV pili) as their receptors (Figure S3b, Table S2), an orthogonal contractile pilus that is thought to be unrelated to the conjugal pilus (a type 4 secretion system)37. Phylogenetic analysis of the five putative distal tail proteins of FtMidnight implicates gp18 as most likely to be involved in receptor recognition, as it is much more divergent from homologs in T4P-associated phages than the other 4 proteins, and we speculate it is specifically adapted to facilitate adsorption to the F plasmid pilus (Figure S3c).
Remarkably, both the ssRNA Fiersviridae and the ssDNA filamentous Inoviridae families include phages that use either the conjugative pilus or chromosomally-encoded T4P as receptors38,39. Therefore, the FtMidnight-like group of phages is the third example of a phage group that can use either contractile pilus structure fairly indiscriminately. This suggests that from a phage entry perspective, there is a high degree of functional overlap between the conjugative pilus and the T4P, which we hypothesize to be their biophysical, contractile nature.
Plasmid-dependent tectiviruses show substantial phenotypic differences despite perfectly syntenic genomes with no accessory genes
Plasmid-dependent phage host range is dependent on plasmid host range, and therefore phages dependent on broad-host range plasmids are required to replicate in diverse host cells. Indeed, tectiviruses (alphatectiviruses) dependent on the broad-host range IncP plasmid exhibit a remarkably wide host range40, surpassing the host breadth of any other described group of phages. This ability comes in stark contrast with their small genome size, perfect gene synteny, and lack of accessory genome. While the broad host-range phenotype of PRD1-like phages has been long appreciated41, the six previously isolated PRD1-like phages have been assumed to be mostly phenotypically redundant, perhaps due to the high level of conservation between their genomes. However, subtle differences in efficiency of plating of some alphatectiviruses on different host strains were previously reported42. To explore the extent to which this constrained genomic diversity leads to phenotypic variation in our larger collection of PDPs, we constructed a set of 13 hosts of diverse Gammaproteobacteria, carrying the IncP conjugative plasmid pKJK5 (indicated by P). We initially observed that PDPs exhibited substantial differences in plating efficiency across hosts (Figure 4a). For example, while PRD1 is able to plaque efficiently in all but one of the hosts, PRDcerulean can only efficiently form plaques on Pseudomonas hosts, representing a decrease in plaquing efficiency of at least four orders of magnitude in most other hosts. In contrast, PRDchartreuse and PRDjuniper decrease their plaquing efficiency by a similar magnitude in P. putidaP when compared against P. fluorescensP. Notably, these isolates share >95% nucleotide identity to PRD1 and have no variation in gene content (Figure S1).
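Statements such as "a decrease of at least four orders of magnitude" refer to the efficiency of plating (EOP), the ratio of a phage's titer on a test host to its titer on a reference host. A minimal sketch of that arithmetic follows; the function names and the example titers in the note below are hypothetical and are not measurements from this study.

```python
import math


def efficiency_of_plating(titer_test, titer_reference):
    """EOP: titer on the test host divided by titer on the reference host.

    Both titers are in plaque-forming units per mL.
    """
    if titer_reference <= 0:
        raise ValueError("reference titer must be positive")
    if titer_test < 0:
        raise ValueError("test titer cannot be negative")
    return titer_test / titer_reference


def orders_of_magnitude_lost(titer_test, titer_reference):
    """Log10 drop in titer on the test host relative to the reference host."""
    eop = efficiency_of_plating(titer_test, titer_reference)
    if eop == 0:
        # No plaques observed: the drop is only bounded by the detection limit.
        return math.inf
    return -math.log10(eop)
```

For instance, a phage reaching 1e9 PFU/mL on its preferred host but 1e5 PFU/mL on another host has an EOP of 1e-4, a loss of four orders of magnitude.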
Figure 4 |. Plasmid-dependent tectiviruses have profoundly different host range preferences.
a, Plaque assays of 10-fold dilutions of five novel plasmid-dependent tectiviruses on diverse Gammaproteobacterial hosts containing the IncP conjugative plasmid pKJK5 (indicated by P). The five phages have large differences in plaquing efficiency on different host bacteria, despite being closely related by whole genome phylogeny (Figure 2a). b, Top shows examples of growth curve data for phages PRD1 and PRDcerulean on three host bacteria containing the pKJK5 plasmid. Bottom shows the same data, represented as liquid assay score. c, High throughput estimation of host range preferences for all the novel plasmid-dependent tectiviruses in our dataset by liquid growth curve analysis. Maximum likelihood trees at the left and bottom indicate the inferred phylogenetic relationships between phages (by whole genome phylogeny) and host bacteria (by 16S phylogeny). Grayed out rows are displayed for the 6 published alphatectiviruses that we were unable to collect host preference data for. Black diamonds on the base of the heatmap highlight phages with host range preferences that are referenced in the text.
We quantified host preference differences of all 51 phages on all 13 bacterial species using a high throughput liquid growth assay43. For each phage-host pair we calculated a liquid assay score (see Methods), which represents the growth inhibition incurred by a fixed phage concentration, normalized as a percentage relative to the host growth in a phage-free control (Figure 4b). We found that, consistent with earlier plaque assays (Figure 4a), the growth inhibition phenotype was highly variable across phage isolates (Figure 4c). We identified more examples of phages such as PRDmint and PRDcanary that displayed a host-specialist behavior, akin to that of PRDcerulean, while others, like PRDobsidian and PRDamber, appeared to robustly inhibit the growth of a wide range of hosts (host generalism). Surprisingly, when looking at the data broadly, we found that neither the phage nor the host phylogenetic relationships were strong predictors of host-preference. To rule out that these host-preference differences are caused by sequence-specific anti-phage systems, we characterized the CRISPR-Cas and restriction modification (RM) systems encoded in the hosts’ genomes. We found that only 4 of the hosts encoded a complete Cas operon, and that none of the spacers in the CRISPR arrays matched any of the phages in our collection (Figure S4, Table S3). We also found that all but two of the bacterial hosts harbor at least one RM system. Although interactions between plasmid-dependent tectiviruses and RM systems might play a role in host range, they cannot fully explain the differences we observe. Instead, we speculate that these host preference patterns reflect adaptation of the plasmid-dependent tectiviruses to the physiologies of different host cells. The composition of natural polymicrobial communities containing IncP plasmids likely requires PDPs to rapidly adapt to infect particular assortments of taxonomically distant hosts.
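One common way to express growth inhibition as a percentage of a phage-free control is to compare the areas under the two growth curves. The sketch below takes that approach as an assumption: the exact definition of the liquid assay score is given in the study's Methods and may differ, and the function names and clipping to the 0–100 range are illustrative choices.

```python
def area_under_curve(times, values):
    """Trapezoidal area under a growth curve (e.g. optical density vs. time)."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need matching time and value lists with >= 2 points")
    return sum(
        (times[i + 1] - times[i]) * (values[i] + values[i + 1]) / 2.0
        for i in range(len(times) - 1)
    )


def liquid_assay_score(times, od_phage, od_control):
    """Percent growth inhibition relative to the phage-free control.

    Assumed definition: 100 * (1 - AUC_phage / AUC_control), clipped to
    the range 0-100 so that noise cannot produce negative inhibition.
    """
    control_area = area_under_curve(times, od_control)
    if control_area <= 0:
        raise ValueError("control culture shows no growth")
    inhibition = 1.0 - area_under_curve(times, od_phage) / control_area
    return 100.0 * max(0.0, min(1.0, inhibition))
```

Under this definition a score near 100 means the phage almost completely suppressed growth of that host, while a score near 0 means the culture grew like the control.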
Holin protein variation contributes to host range differences in plasmid-dependent tectiviruses
To explore the genetic basis of the host-range preferences we focused on PRDcerulean, which was the only tectivirus isolated in our narrow-host IncP plasmid screen. PRDcerulean displays the most restricted host range of our collection, and only replicates efficiently on P. putidaP. On S. entericaP (and most IncP plasmid containing hosts in our screen) PRDcerulean does not make plaques even at high titers (Figure 4a). We reasoned that differences in host range between these phages are unlikely to be due to adsorption failure, as the receptor is encoded by an identical conjugative plasmid present in each host strain, and if there were pilus elaboration problems in any of the hosts, this should affect all phages in our collection rather than some phage individually. In line with this assumption, there was no difference in adsorption efficiency between PRD1 and PRDcerulean on S. entericaP (Figure S5a), indicating that the replication defect in S. entericaP is receptor independent, and probably occurs once the phage chromosome has reached the interior of the cell. To understand where the replication defect occurs, we conducted high multiplicity of infection (MOI) experiments of PRDcerulean on S. entericaP in order to isolate a spontaneous escape mutant that could form plaques. We identified a single rare mutant of PRDcerulean, cer1, which was able to make plaques on S. entericaP. Sequencing revealed cer1 contained a number of mutations relative to wildtype PRDcerulean, notably in the holin gene, which encodes the protein (P35) responsible for triggering the destruction of the bacterial cell wall during cell lysis44. To confirm this observation, we generated a chimeric phage, cer6, by recombining the P35-P36 region of PRDcerulean with the respective sequence from PRD1. Strikingly, exchange of these two proteins restored the plating efficiency of PRDcerulean on S. entericaP, although we note that the cer6 recombinant phage formed smaller plaques than PRD1 (Figure 5a).
Figure 5 |. Differences in the holin protein explain reduced host range of PRDcerulean.
a, Schematic representation of holin (P35) mutants, and their effect in plaque formation. Gray arrows represent the WT PRD1 holin gene, blue arrows represent the WT PRDcerulean holin gene. Amino acid changes are represented by purple, green and red marks on the holin gene body. To the right, the corresponding plaque assays of 10-fold dilutions on S. entericaP and P. putidaP are shown. Complete replacement of the holin gene, as well as specific mutations in the TMD2&3, restore the plating efficiency of PRDcerulean on S. entericaP. b, Diagram of the predicted membrane topology of the holin protein (P35cer). It is predicted to have a short N-terminal periplasmic segment, 3 transmembrane domains (TMDs), and a longer disordered C-terminal region that extends into the cytoplasm. P = periplasm, IM = inner membrane and C = cytoplasm. Relevant mutations are colored as in panel a.
Analysis of the holin protein indicated that it is predicted to form 3 transmembrane domains (TMDs), with a short N-terminal periplasmic segment, and a longer disordered C-terminal region that extends into the cytoplasm (Figure 5b), a topology characteristic of class I holins. The holin protein of PRDcerulean has a distinct 5 amino acid motif at the C-terminal end, shared only with one other phage in our collection, PRDfuschia (Figure S5d). Despite this similarity at the C-terminus, the TMD1 and TMD2 of PRDcerulean differ significantly from those of PRDfuschia, which is able to replicate efficiently on S. entericaP. As the TMDs have been shown to be especially important for holin function45, we hypothesized that variations in these regions could be associated with PRDcerulean’s reduced host range, and we attempted to individually replace each of the regions corresponding to the TMDs of the PRDcerulean holin with the respective sequences of PRDfuschia, by recombination. Recombination with TMD1 did not yield any chimeric phages that could plaque on S. entericaP, but recombinants of TMD2 yielded phages that plaqued on S. entericaP almost as efficiently as cer6 (Figure 5a, Figure S3b). Additionally, to recapitulate the original variant we had seen in the spontaneous mutant, we replaced the TMD3 from PRDcerulean with that of cer1.
Sequencing of the resulting recombinant phages revealed that rather than recombining the entire region corresponding to the TMDs, the phages had recombined specific SNPs from within the donor TMD sequences, allowing us to pinpoint individual variants in the PRDcerulean holin that expand host range to S. entericaP. Cer9 had two amino acid changes within TMD2 and cer10 had a single amino acid change inside TMD3. Furthermore, the cer10 recombinant, which was associated with poor efficiency of plating and plaque size on S. entericaP, proved to be unstable in culture, and larger plaques spontaneously appeared after 1 round of replication. Re-sequencing of the larger plaque mutant, cer11, revealed it had acquired a frameshift mutation in the C-terminal end of the protein, which reverts the C-terminal motif close to the longer motif found in most variants in our collection, e.g. PRDaquamarine (Figure S5e). This indicates that the mutation in the C-terminal cytoplasmic end of the protein might have an epistatic interaction with the TMD mutations.
Mapping of these mutations onto the holin membrane topology prediction showed that they are clearly located within membrane-embedded portions of the holin protein, and further AlphaFold modeling of the holin secondary structure indicated that the TMD2&3 mutations may be spatially proximal in the native protein structure (Figure S5c). Overall, these results indicate that one of the largest hurdles for plasmid-dependent tectiviruses to achieve infection of diverse bacterial hosts may be adapting the phage lysis components to differing host cell physiology, e.g. inner membrane composition. We speculate that the PRDcerulean lysis proteins may be specifically adapted to work in Pseudomonad hosts, and though the phage appears to have a narrow host range in our limited screen, we may simply not be testing the natural hosts of PRDcerulean. Notably, none of the holin mutations affected replication on P. putidaP (Figure 5a, Figure S5b). The finding that altered host range is accessible within a small number of point mutations suggests there is immense functional flexibility encoded within the proteins of plasmid-dependent tectiviruses, and while there appears to be a strict constraint on genome size, these viruses may acquire accessory function through protein variation rather than gene gain, in contrast to tailed phages.
Metagenomic approaches fail to recover plasmid-dependent tectiviruses
Given the small number of plasmid-dependent tectiviruses known prior to this study (6, excluding BCE1) we were surprised by how readily discoverable these phages were in our samples (though we note that a low number of characterized representative phages does not necessarily reflect low environmental abundance). To quantify their absolute abundance, we used Phage DisCo to estimate the concentration of IncP plasmid-dependent phages in fresh influent from two wastewater sites in Massachusetts, USA, relative to species-specific phages of E. coli, S. enterica, and P. putida (Figure 6a). Phages dependent on the IncP plasmid RP4 were present in wastewater at approximately 1000 phages per mL, the same order of magnitude as species-specific phages of E. coli at ~4000 phages per mL. Species-specific phages of S. enterica and P. putida were less abundant than IncP-plasmid-dependent phages, present at ~100 phages per mL and ~5 phages per mL respectively. While this absolute quantification is limited by the use of a single strain to represent all species-specific phages, plasmid-dependent phage quantification may be similarly limited by use of a single plasmid to represent all IncP plasmids. Nevertheless, wastewater is considered one of the best samples in which to find E. coli and S. enterica phages, and therefore regardless of the accuracy of this relative abundance metric, the data show that these phages are common, at least in built environments (human-made environments). The extent to which this abundance is a characteristic of phages dependent on IncP type plasmids as opposed to PDPs in general remains to be seen, although these estimates are in line with reports of phage abundance for multiple different plasmid types in wastewater from Denmark and Sweden26.
Figure 6. Alphatectiviruses are underrepresented in metagenomic assembled viromes.

a, Abundance of plasmid-dependent phages in wastewater influent. Plasmid-dependent phages targeting the IncP plasmid RP4 are orders of magnitude more abundant than P. putida- and S. enterica- specific phages, in two independent wastewater influent samples. b, Histogram and scatter plot of reads classified as being of alphatectiviral origin, against total number of reads in each metagenomic sample analyzed. Colors indicate different BioProjects from the SRA, full metadata can be found in Supplementary Table S2.
Metagenomic-based viral discovery techniques have been extremely successful in expanding known viral diversity46–48. Although some studies have identified tectiviruses in metagenomic datasets49 and metagenomic-assembled genomes50, alphatectiviruses have yet to be found in metagenomic analyses, at odds with the relatively high abundance of the plasmid-dependent alphatectiviruses in wastewater (Figure 6a). With the increasing availability of metagenomic datasets, we decided to reexamine the presence of this group of phages in assembled metagenome collections. We queried the JGI IMG/VR database of uncultivated viral genomes and retrieved genomes with a match to the Pfam model PF09018, which corresponds to the PRD1 coat protein and is conserved across all known tectiviruses. This search retrieved a set of diverse genomes in which, using refined models built from our alphatectivirus collection, we identified homology to diagnostic tectivirus proteins14, such as DNA polymerase (P1), Packaging ATPase (P9), and delivery genes (P18, P32) in addition to the coat protein (P3) used for the retrieval of these sequences (Figure S6b). However, none of the uncultivated viral genomes appear to belong to any of the pre-existing groups of isolated tectiviruses (Figure S6c), suggesting there is large unexplored diversity in the Tectiviridae family.
We tested if we could recover alphatectivirus sequences through metagenomic sequencing of our own wastewater samples, where we knew these phages were present at high abundance (around 1000 PFU/mL) (Figure 6a). We processed our samples by filtration, and further concentrated the viral fraction by 100-fold (Methods), before performing DNA extraction and bulk sequencing. We classified our metagenomics dataset with Kraken2 (see Methods) and found that a very small proportion of the reads (<0.001%) could be assigned to the alphatectivirus taxonomic group, which would not be sufficient for assembly (Figure 6b). This implied that, despite there being no assembled alphatectiviruses in public databases, they may still be identifiable in raw reads.
We then looked at additional published wastewater metagenomic sequencing datasets, and processed samples from diverse projects, representing different sequencing depths, locations, and sample processing methods, comprising a total of 290 samples and more than 5 billion reads total (Table S2). Over 75% of the samples contained 5 or fewer reads assigned to alphatectiviruses (Figure 6b). However, we found some alphatectivirus reads, primarily from the larger datasets, which directly mapped to the PRD1 reference genome (Figure S6a). The recovered reads appeared to be bona fide alphatectivirus sequences, as shown by the high mapping quality to the reference, a conservative approach that would fail to identify isolates with higher variation. Taken together, no single dataset we analyzed contains enough reads to assemble a complete alphatectivirus genome. We hypothesize that a combination of a low relative abundance, small genome size, and highly polymorphic population might be responsible for the absence of alphatectiviruses in metagenomic assembled genome collections. Overall, this finding points to a discordance between culture-based and metagenomic-based virus surveillance.
Discussion
Our finding that phages exploiting conjugative plasmid-encoded receptors are common and abundant in the urban environment suggests that PDPs act as an important and underappreciated constraint on the spread of conjugative plasmids in nature. Though studies have shown that conjugative plasmids can rapidly evolve resistance to PDPs17,51, these studies also suggest that resistance comes with a tradeoff in conjugation efficiency, such that phage-resistant plasmids cannot easily spread to new hosts. This suggests that with further study and discovery, PDPs could be exploited to manipulate the dynamics of conjugative plasmid mobility, and thus the spread of antibiotic resistance genes in high-risk environments. PDPs may be particularly applicable to controlling epidemics of plasmid acquired resistance, for example the current epidemic of carbapenem-resistant Enterobacteriaceae mobilized by IncX3 conjugative plasmids52–54.
A challenge in the potential translational application of plasmid-dependent phages may be ensuring that phage host range is sufficiently broad to avoid the formation of plasmid reservoirs in bacterial hosts that cannot be infected by phages. Our finding that plasmid-dependent tectiviruses have highly variable host range preferences reinforces the significance of this hurdle. However, our investigation into the basis of host range of these phages showed that this phenotype is, to some extent, genetically encoded in the lysis machinery of the phages. Further study is necessary to better understand the genetic basis of host range in plasmid-dependent tectiviruses and PDPs more broadly, but the expansion of host range of PRDcerulean via gene exchange may be an exciting step towards predicting and engineering the host range of these phages. From a virus evolution perspective, this finding illustrates the great functional flexibility contained within PDP lysis proteins, which we speculate may be necessary for rapid adaptation to new host cells, as their associated conjugative plasmids transmit across communities of diverse bacteria.
Our discovery of FtMidnight, along with the significant expansion of other known conjugative plasmid-dependent phage families, highlights the power of Phage DisCo to uncover new phage diversity. A more comprehensive understanding of the diversity of PDPs may shed light on outstanding questions as to the evolution of plasmid-dependency in phages. Indeed, our discovery that the F-pilus dependent phage FtMidnight is related to type 4 pilus targeting phages suggests that there may be some functional, if not evolutionary, relationship between these purportedly unrelated structures. It remains to be seen how this finding translates to other groups of plasmid-dependent phages. For example, only the Alphatectivirus genus within the broader Tectiviridae family are known to depend on plasmid-encoded receptors, and the receptors of other genera, such as the betatectiviruses that infect Bacillus species, are thought to be components of the cell wall55, although we note there is very little similarity between the spike proteins of PRD1 and the most characterized betatectivirus Bam3556. Likewise, a recent study identified a new pair of short tailed phages that are dependent on conjugative plasmids belonging to the IncN group57 but the receptor dependency of phages with related tail proteins is unknown. Further study of such viruses that are evolutionarily adjacent to plasmid-dependent groups may reveal parallel evolutionary routes to plasmid-dependency. Additionally, further characterization of the diversity of phage receptor-binding proteins that interact with plasmid-encoded pili could eventually facilitate the engineering of plasmid-targeting phenotypes into genetically engineered phages or phage-derived particles, which may offer long-term promise as alternative antimicrobial therapies35.
The relatively high abundance of IncP PDPs in wastewater as measured by culture-based methods contrasts with their absence from metagenomic datasets, indicating a blind spot in bulk-sequencing based approaches to detect certain groups of viruses. The biochemical properties of some viruses have been suggested to play a role in their depletion from metagenomic datasets, such as DNA genomes with covalently bound proteins58. Though we cannot rule out that a similar phenomenon is responsible for the lack of plasmid-dependent tectiviruses in some metagenomic samples, our metagenomic extractions were protease treated yet had comparable abundance of plasmid-dependent tectiviruses relative to public datasets. We speculate that other factors might play a role, including the small genome size of PDPs relative to other viruses, low relative abundance compared to other viruses, and high within-sample sequence diversity interfering with consensus-assembly based methods. Consistent with our observations, high strain heterogeneity has previously been shown to hinder metagenomic assembly of abundant marine viruses59, and benchmarking studies with simulated metagenomic data have found this to be an intrinsic limitation of both viromic and metagenomic sequencing studies60,61. These discrepancies point to the continued need for systematic culture-based viral discovery and method innovation.
Though we chose to focus this initial study on conjugative plasmids that are already known to be targeted by PDPs, we anticipate that the Phage DisCo method will be generally applicable to identifying phages dependent on other conjugative plasmid systems, as well as translatable to further specialized phage discovery screens. The diversity and abundance of the PDPs we detected in the urban environment leads us to hypothesize that the interplay between phages and conjugative plasmids, both selfish genetic elements, may be driving the diversification of the conjugation systems mediating horizontal gene transfer in bacteria. This work represents a major first step in the large-scale exploration of this functional group of phages, and much remains to be discovered about their ecology and biology, including how they interact with the plethora of defense systems present in bacteria62.
Methods
Strains and growth conditions
Details of all bacterial strains, plasmids, phages and primers used and constructed in this study are available in Supplementary Table S1a–d. Unless stated otherwise, bacteria were grown at 37 °C or 30 °C in autoclaved LB Lennox broth (LB: 10 g/L Bacto Tryptone, 5 g/L Bacto Yeast Extract, 5 g/L NaCl) with aeration (shaking 200 rpm) or on LB agar plates, solidified with 2% Bacto Agar at 37 °C or 30 °C. Salt-free LBO media contained 10 g/L Bacto Tryptone, 5 g/L Bacto Yeast Extract. When required, antibiotics were added at the following concentrations: 50 μg/mL kanamycin monosulfate (Km), 100 μg/mL ampicillin sodium (Ap), 20 μg/mL tetracycline hydrochloride (Tc), 30 μg/mL trimethoprim (Tm), 20 μg/mL chloramphenicol (Cm) and 20 μg/mL gentamicin sulfate (Gm).
Phage replication
Replication host strains for all phages used in this study are detailed in Supplementary Table S1c. High titer phage stocks were produced by adding ~10⁵ Plaque Forming Units (PFU) to exponential phase cultures at approximately OD600 0.1, and infected cultures were incubated for at least 3 hours at 37 °C (with aeration). Phage lysates were centrifuged (10,000 × g, 1 min) and supernatants were filter-sterilized with 0.22 μm syringe filters. Phage lysates were serial-diluted (decimal dilutions) with SM buffer and PFU enumeration was performed by double-layer overlay plaque assay63, as follows. Bacterial lawns were prepared with stationary phase cultures of the host strains, diluted 40 times with warm top agar (0.5 % agar in LB, 55 °C). The seeded top agar was poured on LB 2% agar bottom layer: 3 mL for 8.6 cm diameter petri dishes or 5 mL for 8.6 cm × 12 cm rectangular petri dishes. When required, antibiotics were added to the top agar at concentrations specified above.
Plasmid construction
The F plasmid from strain SVO150 was modified via recombineering to encode a gfp locus and kanamycin resistance locus (aph) for selection (FΔfinO::aph-Plac-gfp) to aid in conjugation and rapid identification of plasmid+ colonies. Briefly, SVO150 was electroporated with the pSIM5tet recombineering plasmid (Supplementary Table S1b), and the native IS3-interrupted finO locus was replaced with the aph-Plac-gfp cassette from pKJK5 using primers NQO2_9 and NQO2_12 as described (Koskiniemi et al., 2011). The replaced region was amplified with primers NQO2_5 and NQO2_6 and sent for Sanger sequencing to confirm the correct replacement.
Strain construction
For differential identification of plaques in coculture and transconjugant selection, constitutive sgfp2* or mScarlet-I loci along with a chloramphenicol resistance locus were added to E. coli, S. enterica and P. putida strains (Supplementary Table S1a). Tn7 transposons from pMRE-Tn7-145 and pMRE-Tn7-152 were introduced into the attTn7 site via conjugation from an auxotrophic E. coli donor strain as previously described64.
The RP4 plasmid was introduced into chromosomally tagged S. enterica and P. putida via conjugation using the BL103 donor strain. Overnight liquid cultures of donor and recipient strains were mixed at a 1:10 (donor:recipient) ratio and concentrated into a volume of 20 μl by centrifugation. The cell slurry was transferred to the top of a 12 mm, 0.45 μm nitrocellulose membrane on the surface of an LB agar plate for 4 hours at the temperature optimal for the recipient strain (see Supplementary Table S1a) to permit conjugation. Transconjugants were selected by plating on LB supplemented with chloramphenicol and kanamycin. For FΔfinO::aph-Plac-gfp, a plasmid and prophage-cured S. enterica strain (SNW555, D23580 ΔΦ ΔpSLT-BT ΔpBT1 ΔpBT2 ΔpBT3; ref. 65) was used to mitigate any interference from the IncF Salmonella virulence plasmid (pSLT) and native prophages. The FΔfinO::aph-Plac-gfp plasmid was introduced into SNW555 and NQO62 via conjugation, exactly as described above.
For IncP-PDP host range experiments, the pKJK5 plasmid was transconjugated into Pseudomonas putida KT2440, Pectobacterium atrosepticum SCRI1043, Shewanella oneidensis MR1, Serratia marcescens ATCC 1388, Enterobacter cloacae ATCC 13047, Pseudomonas fluorescens Pf0-1, Klebsiella pneumoniae PCI 602, Citrobacter werkmanii IC19Y, Citrobacter freundii ATCC 8090, Edwardsiella tarda ATCC 15947, Proteus mirabilis BB2000 Δugd and Salmonella enterica serovar Typhimurium LT2 via the cross streak method. The pKJK5 plasmid contains gfp under the control of the Plac promoter, which results in derepressed fluorescence in non-E. coli (lac negative) hosts66. Additionally, the pKJK5 donor strain, NQO38, constitutively expresses mCherry, permitting easy identification of transconjugants without need for dual selection. Briefly, an overnight liquid culture of the donor strain NQO38 was applied vertically in a single streak down the center of an LB agar plate. Subsequently, an overnight liquid culture of a recipient strain was streaked horizontally across the plate, crossing over the donor streak. After incubation at the recipient optimal temperature, transconjugant colonies were purified on the basis of green fluorescence signal.
Optimization of PDP detection by fluorescence-enabled co-culture
To validate the use of fluorescence-enabled co-culture to detect PDPs, a S. enterica-specific phage (9NA), a P. putida-specific phage (SVOΦ44) and an IncP plasmid-dependent phage (PRD1) were mixed at equal concentration (approximately 10³ PFU/mL). 100 μL each of overnight liquid cultures of S. enterica LT2 attTn7::Tn7-mScarlet-I + RP4 (NQO89) and P. putida attTn7::Tn7-SGFP2* + RP4 (NQO80) was added to 3 mL molten LB top agar, along with 10 μL of the phage mixture, and poured onto an LB agar plate. Plates were incubated overnight at 30 °C and then imaged in brightfield, red fluorescence channel, and green fluorescence channel using a custom imaging platform.
The custom imaging setup has a Canon EOS R camera with a Canon 100 mm lens with LEDs paired with excitation and emission filters (Green: 490–515 nm LED with 494 nm EX and 540/50 nm EM filters; Red: 567 nm LED with 562 nm EX and 641/75 nm EM filters). Excitation filters are held in a Starlight express emission filter wheel. The camera, LEDs, and filter wheel are all controlled with custom software. Exposure times were 0.25 s (green) and 0.5 s (red), with the camera set to ISO-200 and f/3.5, as experimentally determined to maximize dynamic range. Imaging parameters were selected such that when green and red fluorescence channel images were merged, all three phages could be easily identified by fluorescent plaque phenotype: 9NA plaques were visible as green plaques (only P. putida attTn7::Tn7-SGFP2* + RP4 grows in these areas), SVOΦ44 plaques were visible as red plaques (only S. enterica LT2 attTn7::Tn7-mScarlet-I + RP4 grows in these areas) and PRD1 plaques had no fluorescent signal (neither species grew in these areas). The red and green channels were separated from their raw images, their exposure linearly rescaled, and remapped to the red and blue channels respectively (to enhance visual color contrast). All image manipulations were done with scikit-image v0.17.267.
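The rescale-and-remap step and the resulting plaque phenotypes can be illustrated with a minimal sketch. It is not the imaging code used in the study (which relied on scikit-image): images are represented here as plain 2D lists of intensities, and the function names and the 0.5 threshold are hypothetical choices made only for illustration.

```python
# Minimal sketch, standard library only. Images are 2D lists of pixel
# intensities (a hypothetical input format, not the study's raw images).

def rescale(channel):
    """Linearly rescale a 2D list of intensities to the range 0.0-1.0."""
    flat = [value for row in channel for value in row]
    low, high = min(flat), max(flat)
    span = high - low
    if span == 0:
        return [[0.0 for _ in row] for row in channel]
    return [[(value - low) / span for value in row] for row in channel]


def merge_channels(red_signal, green_signal):
    """Return an RGB image: red signal in R, green signal remapped to B."""
    red = rescale(red_signal)
    blue = rescale(green_signal)
    return [
        [(r, 0.0, b) for r, b in zip(red_row, blue_row)]
        for red_row, blue_row in zip(red, blue)
    ]


def classify_pixel(pixel, threshold=0.5):
    """Interpret one merged pixel; the threshold is an illustrative assumption."""
    r, _, b = pixel
    salmonella_grows = r >= threshold  # mScarlet-I signal (S. enterica)
    putida_grows = b >= threshold      # SGFP2* signal, remapped to blue (P. putida)
    if salmonella_grows and putida_grows:
        return "lawn"
    if putida_grows:
        return "S. enterica-specific plaque"
    if salmonella_grows:
        return "P. putida-specific plaque"
    return "putative plasmid-dependent plaque"
```

A region where only one species still fluoresces marks a phage that killed the other species, while a region with neither signal marks a candidate plasmid-dependent phage, which is the phenotype the screen selects for.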
Collection and processing of environmental samples
For phage isolation, wastewater primary influent from a total of 4 sites in Massachusetts was collected, along with soil, animal waste, and compost from farms, community gardens and parks close to Boston, USA. Sample collection details can be found in Table S1e (Environmental Samples). All samples were resuspended (if predominantly solid matter) in up to 25 mL of sterile water and incubated at 4 °C for 12 hours with frequent vortexing to encourage suspension and homogenization of viral particles. The resuspended samples were centrifuged at 4,000 × g for 30 minutes to pellet large biomass, and the clarified supernatant was filter sterilized using a 0.22 μm vacuum driven filtration unit to remove bacteria. Filtered samples were stored at 4 °C. For metaviromic sequencing and phage enumeration in wastewater influent, two 100 mL samples were collected in September 2022 from two separate intake sources of wastewater at a treatment plant in Boston, MA. Samples were processed by filtration as described above, except that processing was initiated immediately upon sample collection to avoid any sample degradation.
Isolation of novel environmental PDPs by fluorescence enabled coculture
For high throughput discovery of plasmid-dependent phages targeting the IncP plasmid pilus, co-culture lawns of S. enterica LT2 attTn7::Tn7-mScarlet-I + RP4 (NQO89) and P. putida attTn7::Tn7-SGFP2* + RP4 (NQO80) were prepared as described earlier, except that 100 μl of filtered environmental sample containing putative novel phages was added instead of the reference phages. In cases where phage load in samples was too high, and the subsequent lawn did not grow uniformly due to widespread lysis, the filtered sample was serially diluted 10-fold until single plaques were obtained. Putative PDP plaques (exhibiting no fluorescence) were sampled using sterile filter tips, diluted and re-plated for single plaques at least twice to ensure purity. For the IncF plasmid targeting phages, the procedure was the same, except that strains SVO348 (E. coli MG1655 attTn7::mScarlet-I-gmR + FΔfinO::aph-gfp) and NQO87 (S. enterica D23580 ΔΦ ΔpSLT-BT ΔpBT1 ΔpBT2 ΔpBT3 + FΔfinO::aph-gfp) were used in the lawns. The plasmid and prophage cured strain of S. enterica was used for the IncF-dependent phage screen to mitigate interference from the native Salmonella virulence plasmid (which belongs to incompatibility group F68) and prophages.
Once putative novel PDPs had been purified from environmental samples, 5 μl drops of 10-fold dilutions were plated on lawns of isogenic plasmid-free host strains (BL131, SVO126, SVO50 or SNW555) to confirm plasmid-dependency. We note that false positives (i.e., plasmid-independent phages that infected both species in the coculture) were occasionally obtained during the IncF PDP isolation, due to the phylogenetic proximity between E. coli and S. enterica, suggesting that use of more distinct host strains (if possible for the plasmid of interest) maximizes assay efficiency.
Phage DNA and RNA extraction and sequencing
Pure phage stocks that had undergone at least 2 rounds of purification from single plaques and had titers of at least 10⁹ PFU/mL were used for nucleic acid extraction. The Invitrogen Purelink viral RNA/DNA mini kit was used to extract genetic material from all phages according to manufacturer instructions. High absorbance ratios (260/280) of 2.0–2.2 were considered indicative of RNA phage genomes. To remove host material contamination, putative RNA samples were incubated with DNase I (NEB) for 1 hour at 37 °C and inactivated afterwards with EDTA at a final concentration of 5 mM. RNA was reverse transcribed using SuperScript™ IV VILO™ (Invitrogen™) for first strand synthesis, per the manufacturer’s instructions. Second strand synthesis was performed by incubating the cDNA with DNA Ligase, DNA Polymerase I, and RNase H in NEBNext® Second Strand Synthesis Reaction Buffer (NEB) at 16 °C for three hours. cDNA was then used in downstream library preparation. Additionally, as all known non-RNA IncF plasmid-dependent phages have ssDNA genomes which are incompatible with tagmentation-based library preparation, any putative DNA sample from IncF plasmid-dependent phages was subjected to second strand synthesis as described above. Illumina sequencing libraries of the DNA and cDNA samples were prepared as previously described69. Sequencing was carried out on the Illumina Novaseq or iSeq to produce 150 bp paired end reads. To improve the assembly quality of the RNA phage genomes, we conducted a second round of sequencing of the same RNA samples using the NGS provider SeqCoast Genomics. RNA samples were prepared for whole genome sequencing using an Illumina Stranded Total RNA Prep Ligation with Ribo-Zero Plus Microbiome and unique dual indexes. Sequencing was performed on the Illumina NextSeq2000 platform using a 300 cycle flow cell kit to produce 150 bp paired reads.
The genetic composition (dsDNA vs ssDNA) for phage FtMidnight was inferred via fluorescence signal using the Quant-IT dsDNA kit (Invitrogen).
For metaviromic DNA extraction, 45 mL of freshly filtered influent from each of the two extraction sites was concentrated 100 × into 500 μl using 100 kDa molecular weight cut off centrifugal filter units (Amicon). Nucleic acids were extracted from 200 μl of concentrated filtrate, and sent to SeqCenter for library preparation and Illumina sequencing. Sample libraries were prepared using the Illumina DNA Prep kit and IDT 10 bp UDI indices, and sequenced on an Illumina NextSeq 2000, producing 2×151 bp reads.
Phage genome assembly and annotation
Sequencing reads were adapter trimmed (NexteraPE adapters) and quality filtered with Trimmomatic v.0.3970. For samples with very high read depth, filtered reads were subsampled with rasusa v.0.5.071 to an approximate 200x coverage to facilitate assembly. The reads were then assembled with Unicycler v.0.4.872 or rnaviralSPAdes v.3.15.573. The annotations from curated PRD1, MS2, Qbeta and M13 reference genomes were transferred to the resulting assemblies with RATT v.1.0.374 and manually curated for completion. Phage isolates with redundant genomes were removed from the analysis and all phages included in this study represent unique isolates. Reads are deposited in the NCBI Sequence Read Archive (SRA) (accessions pending). All accession numbers for previously published genomes and those generated in this study are listed in Supplementary Table S1c.
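The arithmetic behind subsampling to a target coverage can be sketched as follows. This is only an illustration of the calculation, not a description of how rasusa works internally; the function name and the example genome size (about 15 kb, roughly the size of an alphatectivirus genome) are assumptions.

```python
# Minimal sketch of the coverage arithmetic behind read subsampling.
# The function name and example numbers are hypothetical.

def subsample_fraction(total_bases, genome_size, target_coverage=200):
    """Fraction of reads to keep so that expected coverage matches the target."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    current_coverage = total_bases / genome_size
    if current_coverage <= target_coverage:
        return 1.0  # already at or below the target: keep every read
    return target_coverage / current_coverage


if __name__ == "__main__":
    # Example: one million 2 x 150 bp read pairs against an assumed 15 kb genome.
    total = 1_000_000 * 2 * 150
    print(subsample_fraction(total, 15_000))
```

With small genomes such as these, even a modest sequencing run yields coverage in the tens of thousands, which is why only a small fraction of reads is retained before assembly.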
Nucleotide diversity
To calculate nucleotide diversity among the alphatectiviruses, all the assembled isolates were aligned to the PRD1 reference genome with minimap2 v2.2475. Resulting alignments were processed with bcftools v1.976 and samtools v1.677 to then calculate nucleotide diversity with vcftools v0.1.1678 with a sliding window of size 100 bp. Results were plotted with seaborn v0.12.279 and matplotlib80. Novel species classifications for alphatectiviruses were proposed where average pairwise nucleotide identity was less than 95%30.
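As a rough illustration of what windowed nucleotide diversity measures, the sketch below computes the mean pairwise proportion of differing sites per window directly from pre-aligned sequences. It is a simplification and not the vcftools calculation used in the study, which works from called variants; the function names and the handling of gaps and Ns are assumptions.

```python
# Minimal sketch: windowed nucleotide diversity (pi) from aligned sequences.
# Simplified relative to variant-based tools; names are hypothetical.
from itertools import combinations


def pairwise_difference(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a, seq_b):
        if base_a in "-N" or base_b in "-N":
            continue
        compared += 1
        if base_a != base_b:
            differences += 1
    return differences / compared if compared else 0.0


def windowed_pi(aligned_seqs, window=100, step=100):
    """Return (window_start, pi) tuples across an alignment of equal-length sequences."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunks = [seq[start:start + window] for seq in aligned_seqs]
        pairs = list(combinations(chunks, 2))
        pi = sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)
        results.append((start, pi))
    return results
```

Windows with high values flag regions where isolates differ most, which is the kind of signal plotted along the PRD1 reference coordinates.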
Phage enumeration in wastewater by plaque assay
Two freshly filtered wastewater influent samples were processed as previously described (see Collection and processing of environmental samples) and the concentration of phages in volumes of 10, 100 and 500 μL was enumerated by single host plaque assay on strains SVO50, BL131, and SVO126 and by fluorescence enabled co-culture plaque assay on NQO89 and NQO80. All phage enumeration was performed with 3 biological replicates. Titers per mL were calculated and plotted for both sites.
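The conversion from plaque counts to titers per mL is simple arithmetic, sketched below under the assumption that each plate receives a known sample volume in microliters, optionally after a known fold dilution. The function names and the example counts are hypothetical and are not data from the study.

```python
# Minimal sketch of titer arithmetic; names and example counts are hypothetical.
from statistics import mean


def titer_per_ml(plaque_count, plated_volume_ul, fold_dilution=1):
    """Plaque-forming units per mL of the original (undiluted) sample."""
    if plated_volume_ul <= 0:
        raise ValueError("plated_volume_ul must be positive")
    return plaque_count * fold_dilution * 1000.0 / plated_volume_ul


def mean_titer(replicate_counts, plated_volume_ul, fold_dilution=1):
    """Average titer across biological replicates plated at the same volume."""
    return mean(
        titer_per_ml(count, plated_volume_ul, fold_dilution)
        for count in replicate_counts
    )


if __name__ == "__main__":
    # Three made-up replicate counts from 100 microliter platings.
    print(mean_titer([95, 100, 105], plated_volume_ul=100))
```

Plating several volumes is useful because abundant phages are countable at 10 μL while rare ones need 500 μL to give more than a handful of plaques.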
Determination of phage host range
Host range of the IncP-PDPs was assessed by traditional efficiency of plating (EoP) assay or by killing in liquid culture by OD660 measurement, based on a previously described method43. All the phages were challenged against the following bacteria containing the pKJK5 plasmid: Pseudomonas putida KT2440, Pectobacterium atrosepticum SCRI1043, Shewanella oneidensis MR1, Serratia marcescens ATCC 1388, Enterobacter cloacae ATCC 13047, Pseudomonas fluorescens Pf0-1, Klebsiella pneumoniae PCI 602, Citrobacter werkmanii IC19Y, Citrobacter freundii ATCC 8090, Edwardsiella tarda ATCC 15947, Proteus mirabilis BB2000 Δugd and Salmonella enterica serovar Typhimurium LT2. These hosts were chosen as they all showed some degree of susceptibility to IncP dependent phages when transconjugated with the pKJK5 plasmid, indicating proper elaboration of the IncP pilus.
For the high throughput determination of host range, phages were normalized to a titer of 10⁷ PFU/mL as measured in strain NQO36, with the exception of PRDchartreuse, PRDcanary, PRDjuniper, and PRDmamacita, which were normalized to the same titer in NQO37, due to their inability to replicate to high titers in NQO36. Growth curve experiments were set up in 96-well plates with each well containing 180 μL of bacterial culture at OD600 of ~0.1 and 20 μL of phage stock when appropriate, for a final concentration of 10⁶ PFU/mL. They were grown in a plate reader (Tecan Sunrise™) for 10 hours with shaking, at the optimal temperature for the strain (see Supplementary Table S1a), measuring the optical density at 660 nm every 5 minutes. Each 96-well plate had a phage-free control, cell-free control, and the strain-phage condition in triplicate. To calculate the liquid assay score of each host-phage pair we followed the method described previously43. Briefly, we calculate the area under the growth curve for each host-phage pair, as well as for its corresponding phage-free control grown in the same plate. The mean area under the curve value is then normalized as a percentage of the mean area under the curve in the phage-free control. Growth curves are plotted with shading representing the standard error. Liquid assay scores are plotted as a heatmap, and are vertically sorted according to the previously computed alphatectivirus tree and horizontally sorted according to a 16S tree of the bacterial hosts (see Supplementary Table S1a).
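The area-under-the-curve normalization described above can be sketched as follows. This is an illustrative reading of the text, assuming trapezoidal integration and replicate curves sampled at shared time points; the function names are hypothetical, and any further transformation applied in the cited method (reference 43) is not reproduced here.

```python
# Minimal sketch: growth-curve AUC as a percentage of the phage-free control.
# Trapezoidal integration and all names are assumptions for illustration.
from statistics import mean


def area_under_curve(times, ods):
    """Trapezoidal area under a single growth curve."""
    return sum(
        (t1 - t0) * (y0 + y1) / 2.0
        for t0, t1, y0, y1 in zip(times, times[1:], ods, ods[1:])
    )


def auc_percent_of_control(times, phage_replicates, control_replicates):
    """Mean AUC with phage, expressed as a percentage of the mean control AUC."""
    phage_auc = mean([area_under_curve(times, r) for r in phage_replicates])
    control_auc = mean([area_under_curve(times, r) for r in control_replicates])
    if control_auc == 0:
        raise ValueError("control AUC is zero")
    return 100.0 * phage_auc / control_auc
```

Under this reading, a value near 100 means the phage had little effect on growth and a low value means strong killing, so both wells must come from the same plate for the comparison to be fair.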
Adsorption assay
50 μl of exponentially growing SVO126 and NQO37 cells at a density of approximately 10⁸ CFU/ml were mixed in a 96-well plate with 50 μl of PRD1 or PRDcerulean at a density of 10⁶ PFU/ml to achieve a multiplicity of infection (M.O.I.) of ~0.01. Adsorption was done in triplicate for each strain-phage combination, and cell-free media controls were used in place of cells to quantify the maximum unadsorbed concentration of phage. After 10 min adsorption time at 37 °C, the 96-well plate containing the cell-phage mixtures was mounted on top of a sterile 96-well MultiScreenHTS GV Filter Plate with sterile 0.22 μm membrane (Millipore) and centrifuged at 4000 × g to remove cells and adsorbed phages. Unadsorbed phage were quantified by serial dilution of the filtrate and plaque assay as described in Phage Replication above. Unadsorbed phage were represented as a percent of the maximum unadsorbed phage derived from cell-free media controls.
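The two quantities in this assay, the multiplicity of infection and the percentage of unadsorbed phage, follow from simple ratios. The sketch below takes the stated densities as 10^8 CFU/mL and 10^6 PFU/mL; the function names and the example titers in the usage lines are hypothetical.

```python
# Minimal sketch of the adsorption assay arithmetic; names are hypothetical.
from statistics import mean


def multiplicity_of_infection(phage_per_ml, phage_ul, cells_per_ml, cells_ul):
    """Ratio of phage particles to cells in the mixed well."""
    phage = phage_per_ml * phage_ul / 1000.0
    cells = cells_per_ml * cells_ul / 1000.0
    return phage / cells


def percent_unadsorbed(filtrate_titers, control_titers):
    """Mean free phage in the filtrate as a percentage of the cell-free control."""
    return 100.0 * mean(filtrate_titers) / mean(control_titers)


if __name__ == "__main__":
    # 50 microliters of each component at the assumed densities.
    print(multiplicity_of_infection(1e6, 50, 1e8, 50))
    # Made-up triplicate titers for illustration only.
    print(percent_unadsorbed([2e4, 3e4, 1e4], [1e5, 1e5, 1e5]))
```

Keeping the ratio of phage to cells low means nearly every phage has an available cell, so the unadsorbed fraction reflects binding efficiency rather than receptor saturation.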
Phage recombination
To replace the holin gene of PRDcerulean, S. enterica-RP4 with recombineering plasmid pSIM5tet (SVO296) was used. Briefly, bacteria were grown to exponential phase in LB at 30 °C, with selective antibiotics for both plasmids, as specified in Table S1b. The culture was then infected with high titer PRDcerulean lysate to a final concentration of ~10⁷ PFU/mL for 15 min. The culture was then induced for recombination for 15 min at 42 °C. Electro-competent cells were then prepared by cooling the cells for 10 minutes followed by three washes with cold sterile Milli-Q water at a 1:1 volume, and concentrated 50 times in cold sterile Milli-Q water. Competent cells were then mixed with ~100 ng of DNA in 1 mm gap cuvettes and electroporated (1.8 kV, 25 μF, 200 Ω). Electroporated bacteria were mixed with 100 μL of fresh overnight liquid culture of S. enterica RP4 (NQO89) and the mixture was plated in a double-layer overlay plaque assay as described above. Only successful recombinant phages formed plaques on the bacterial lawn, and those were isolated and purified for further analysis. The holin gene DNA substrates for recombination were obtained with primers NQO3_13 - NQO3_20.
Holin structure prediction
The topology of the PRDcerulean holin protein (P35cer) was predicted with the CCTOP v1.1.0 web server81 and drawn with Protter v.1.082. The structure was predicted with ColabFold v1.5.383, and rendered with PyMOL v.2.5.684. Model parameters are specified in the code repository. The holin multiple sequence alignment was generated with clustalo v1.2.485 and visualized with UGENE v.38.186.
FtMidnight-resistant mutants
To isolate mutants of the F-plasmid that were spontaneously resistant to FtMidnight, a dilution series of phage was plated on SVO348 as described in Phage Replication above, with kanamycin in the top agar to select against phage resistance via plasmid loss (the F plasmid derivative FΔfinO::aph-Plac-gfp contains a kanamycin resistance marker). Plates were incubated for 24 hours at 37 °C, and then a further 24 hours at room temperature, after which phage-resistant micro-colonies were visible within the zones of phage lysis. Two independent colonies were picked and restreaked onto LB kanamycin agar. Restreaking was repeated once more to ensure purity. To characterize the resistance phenotype, the mutants were screened by plaque assay for susceptibility to FtMidnight, MS2 and Qbeta. To find the causative mutations, genomic DNA was extracted using the Quick-DNA Miniprep Plus Kit (Zymo) according to the manufacturer's instructions, and sequenced to >30x genome coverage with Nanopore technology using v14 library preparation chemistry and an R10.4.1 flow cell by the provider Plasmidsaurus’ bacterial genome sequencing service.
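As a rough illustration of what the >30x coverage figure means, mean sequencing depth can be estimated as total sequenced bases divided by genome length. The sketch below is an assumption-laden example, not the provider's or the authors' pipeline: the function names, read lengths and genome size are all hypothetical.

```python
def mean_coverage(read_lengths, genome_size_bp):
    """Estimate mean depth as total sequenced bases divided by genome length."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return sum(read_lengths) / genome_size_bp


def exceeds_depth(read_lengths, genome_size_bp, threshold=30.0):
    """True if the estimated mean depth is above the threshold (30x by default)."""
    return mean_coverage(read_lengths, genome_size_bp) > threshold


# Hypothetical run: 20,000 long reads of 8 kb each against a 4.6 Mb genome.
example_reads = [8000] * 20000
example_genome_size = 4_600_000
depth = mean_coverage(example_reads, example_genome_size)
deep_enough = exceeds_depth(example_reads, example_genome_size)
```

This estimate ignores unmapped or contaminating reads, so depth computed from an alignment would usually be somewhat lower.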
FtMidnight structural annotation and homology search
The genome of FtMidnight was originally annotated with prokka v1.14.687 using the PHROGs database88. Annotations were manually curated to identify specific structural components by template-based homology search against the PDB_mmCIF70 database with HHpred through the MPI Bioinformatics Toolkit89. The structure model for FtMidnight was based on these structural hits, which can be found in Table S2. A list of phage genomes containing homologs of the gp18-gp22 proteins was collected by searching with the tblastn90 web server against the nucleotide collection. A selection of phages with a conserved distal tail region was visualized with clinker91, and the receptor was annotated if found in the literature. Accessions and references can be found in Table S2.
Annotation of CRISPR-Cas and RM in bacterial genomes
CRISPR-Cas systems and spacers were annotated with CRISPRCasTyper v1.8.092, and RMs were annotated with DefenseFinder v1.0.993. All spacers were then searched with blastn v2.15.094 against the complete alphatectivirus genomes, but no hits were recovered. Accession numbers for all bacterial genomes can be found in Table S1a, and the results of this search are included in Table S3.
Search and comparison of tectiviruses in metagenomic assembled genomes
To collect metagenomic assembled genomes of tectiviruses, a search was performed in the JGI IMG/VR95 for uncultivated viral genomes (UViGs) matching Pfam model PF0901896, which corresponds to the tectivirus capsid protein. The recovered assemblies were annotated with prokka v1.14.697 using the PHROGs database88. To refine these annotations, our large collection of alphatectiviruses was used to build protein alignments for each protein in the PRD1 genome with clustalo v1.2.485, which were manually curated for quality. These alignments were then used to build profile HMMs with HMMER v3.3.198, which were searched against the collected tectivirus MAGs. A representative set of annotated MAGs was selected, visualized with clinker v0.0.2891 and colored to show homology. Shaded connectors represent proteins with >0.3 sequence identity, while annotations with the same color represent significant (p < 0.01) homologs according to the HMMER search.
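To make the >0.3 identity threshold concrete, the following sketch computes fractional identity between two already-aligned protein sequences. It is a simplified illustration and not clinker's implementation: the function name and example sequences are hypothetical, and the choice of denominator (all columns where at least one sequence has a residue) is an assumption, since identity can be defined in several ways.

```python
def fraction_identity(aligned_a, aligned_b, gap="-"):
    """Fraction of identical residues between two pre-aligned sequences.

    Columns where both sequences have a gap are skipped; a residue paired
    with a gap counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


# Hypothetical aligned fragments, invented for illustration.
identity = fraction_identity("MKT-LLA", "MRTALL-")
linked = identity > 0.3  # threshold used for the shaded connectors
```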
Search for alphatectiviruses in metagenomic reads
Kraken2 v2.1.299 was used to search for alphatectivirus reads in metagenomic datasets. A custom database was built by adding our new alphatectivirus assemblies to the default RefSeq viral reference library. With this database, a collection of reads from wastewater sequencing projects was searched. The SRA BioProject accession numbers of this collection can be found in Supplementary Table S2. The individual reads from each sequencing run that Kraken2 classified as belonging to alphatectiviruses were extracted and mapped to the PRD1 reference genome with minimap2 v2.22. The resulting mapped reads were processed with samtools v1.6 and visualized with IGV v2.11.4100.
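The read-extraction step can be illustrated with a short parser for Kraken2's standard per-read output, a tab-separated file whose first three columns are the classification flag (C or U), the read ID and the assigned taxonomy ID. This is a minimal sketch under stated assumptions, not the script used in the study: the function name is hypothetical, the taxonomy IDs 9000001 and 9000002 are placeholders rather than real alphatectivirus IDs, and the target set is assumed to already list every relevant taxon, since reads may be assigned at any rank.

```python
import io


def classified_read_ids(kraken_lines, target_taxids):
    """Collect IDs of reads that Kraken2 assigned to any of the target taxa.

    Assumes the default output format (run without --use-names), in which
    column 3 holds a bare taxonomy ID.
    """
    read_ids = []
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and unclassified reads (flag "U").
        if len(fields) < 3 or fields[0] != "C":
            continue
        if fields[2] in target_taxids:
            read_ids.append(fields[1])
    return read_ids


# Hypothetical Kraken2 output; taxonomy IDs 9000001/9000002 are placeholders.
example_output = io.StringIO(
    "C\tread_1\t9000001\t150\t9000001:116\n"
    "U\tread_2\t0\t150\t0:116\n"
    "C\tread_3\t562\t150\t562:116\n"
    "C\tread_4\t9000002\t150\t9000002:100 0:16\n"
)
TARGET_TAXIDS = {"9000001", "9000002"}
selected = classified_read_ids(example_output, TARGET_TAXIDS)
```

The resulting list of read IDs could then be used to subset the original FASTQ files before mapping to the PRD1 reference.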
Phylogenetic trees
For the Alphatectivirus tree, all previously published genomes and those collected in this study were aligned with clustalo v1.2.4. The resulting multiple sequence alignment was manually curated to ensure quality of the alignment. The tree was then built with iqtree v2.2.0.3101, and visualized with iTOL v6.7102.
For the Fiersviridae and Inovirus trees, the protein sequence of the RNA-dependent RNA polymerase (replicase) or the whole nucleotide content, respectively, were aligned. For the FtMidnight distal-tail protein trees, one alignment per protein was generated. All alignments were performed with clustalo v1.2.4, the trees were then generated with phyml v3.2.0103 and visualized with FigTree v.1.4.4104.
For the tectivirus ATPase tree, the amino acid sequences for protein P9 (ATPase) from all known tectiviruses were aligned with clustalo v1.2.4. This alignment was used to create a profile HMM with HMMER, which was then used to search the amino acid sequences extracted from the annotated MAGs (see Search and comparison of tectiviruses in metagenomic assembled genomes). Significant hits were extracted and aligned to the model with HMMER. We also included in this alignment the previously reported metagenome-assembled tectiviruses listed in Yutin et al.50 and a selection of characterized representatives of the 5 tectivirus genera. A tree of the resulting ATPase alignment was built with phyml v3.2.0, and visualized with iTOL v6.7.
Specific alignment and tree building parameters can be found in the code repository. All accession numbers of sequences used to build these trees are listed in Supplementary Table S2.
Electron microscopy
Carbon grids were glow discharged using an EMS100x Glow Discharge Unit for 30 seconds at 25 mA. High titer phage stocks were diluted 1:10 in water and 5 μL was adsorbed to the glow discharged carbon grid for 1 minute. Excess sample was blotted with filter paper and the grids were washed once with water before staining with 1% uranyl acetate for 20 seconds. Excess stain was blotted with filter paper and the grids were air dried prior to examination with a Tecnai G2 Spirit BioTWIN Transmission Electron Microscope at the Harvard Medical School Electron Microscopy Facility.
Supplementary Material
Figure S1. Pairwise nucleotide identity and gene synteny of alphatectiviruses
Figure S2. Evidence that FtMidnight uses the F pilus as a receptor
Figure S3. The distal tail of FtMidnight is related to phages that use the orthogonal type 4 pilus as a receptor
Figure S4. Presence of sequence-specific phage defense systems in the genomes of host bacteria used in this study
Figure S5. Mutations in the holin protein confer expanded host range in PRDcerulean
Figure S6. Alphatectiviruses are not recovered from metagenomic sequencing datasets
Table S1. Spreadsheet listing all bacterial strains, plasmids, phages, primers and environmental samples used in this study.
Table S2. Spreadsheet listing all phage genomes used for comparative genomics analyses (tectiviruses, fiersviruses, inoviruses, FtMidnight-relatives), and SRA datasets for metagenomic analyses.
Table S3. Spreadsheet listing CRISPR-Cas and RM hits in bacterial hosts.
Acknowledgements
We are grateful for the gifts of bacterial strains, plasmids, phages or wastewater from the labs of Uli Klümper, Catherine Putonti, George O’Toole, Karine Gibbs, Jay Hinton, Pamela Silver and Ameet Pinto. We thank the other instructors and students of the HMS Phages 2022 summer course: Thomas Bernhardt, Amelia McKitterick, Kate Hummels, Thomas Bartlett, Nawonh Charles, Melanie Justice, Tosin Bademosi and Ahadu Molla, which was partially supported by the HHMI Science Education Alliance. NQO thanks the Marine Biological Laboratory at Woods Hole and all instructors from the 2019 Microbial Diversity course. Electron microscopy imaging and consultation were performed in the HMS Electron Microscopy Facility. Custom instrumentation was built with assistance from the Research Instrumentation core at Harvard Medical School. Computational work used the O2 cluster supported by the Research Computing Group at Harvard Medical School. This work was supported by the NIGMS of the National Institutes of Health (R35GM133700), the David and Lucile Packard Foundation, the Pew Charitable Trusts, Alfred P. Sloan Foundation and NSF grant IOS-2331228. NQO acknowledges support from Consejo Nacional de Ciencia y Tecnología (CONACYT, México). MGM, EAR, RP and JSP acknowledge support from the Systems, Synthetic and Quantitative Biology PhD program training award (T32GM135014). ACF was supported in part by the NSF-Simons Center for Mathematical and Statistical Analysis of Biology at Harvard (award number #1764269), and the Harvard Quantitative Biology Initiative.
Data and code availability
Raw sequencing reads have been deposited in the NCBI BioProject database under accession number PRJNA954020. Accession numbers for novel phage genomes generated in this study can be found in Supplementary Table S1c. All code and raw data used in figures are available in a GitHub repository: https://github.com/baymlab/2023_QuinonesOlvera-Owen
References
- 1. Greene S. E. & Reid A. Viruses Throughout Life & Time: Friends, Foes, Change Agents: A Report on an American Academy of Microbiology Colloquium San Francisco // July 2013. (American Society for Microbiology, 2013).
- 2. Lefkowitz E. J. et al. Virus taxonomy: the database of the International Committee on Taxonomy of Viruses (ICTV). Nucleic Acids Res 46, D708–D717 (2018).
- 3. Dimitrov D. S. Virus entry: molecular mechanisms and biomedical applications. Nat Rev Microbiol 2, 109–122 (2004).
- 4. Bertozzi Silva J., Storms Z. & Sauvageau D. Host receptors for bacteriophage adsorption. FEMS Microbiology Letters 363, fnw002 (2016).
- 5. Waksman G. From conjugation to T4S systems in Gram-negative bacteria: a mechanistic biology perspective. EMBO reports 20, e47012 (2019).
- 6. Goessweiner-Mohr N., Arends K., Keller W. & Grohmann E. Conjugation in Gram-Positive Bacteria. Microbiology Spectrum 2, 2.4.19 (2014).
- 7. Frost L. S. Conjugative Pili and Pilus-Specific Phages. in Bacterial Conjugation (ed. Clewell D. B.) 189–221 (Springer US, 1993). doi: 10.1007/978-1-4757-9357-4_7.
- 8. Bottery M. J. Ecological dynamics of plasmid transfer and persistence in microbial communities. Curr Opin Microbiol 68 (2022).
- 9. Mäntynen S., Sundberg L.-R., Oksanen H. M. & Poranen M. M. Half a Century of Research on Membrane-Containing Bacteriophages: Bringing New Concepts to Modern Virology. Viruses 11, 76 (2019).
- 10. Wolf Y. I. et al. Origins and Evolution of the Global RNA Virome. mBio 9, e02329–18 (2018).
- 11. Barderas R. & Benito-Peña E. The 2018 Nobel Prize in Chemistry: phage display of peptides and antibodies. Anal Bioanal Chem 411, 2475–2479 (2019).
- 12. George L., Indig F. E., Abdelmohsen K. & Gorospe M. Intracellular RNA-tracking methods. Open Biology 8, 180104 (2018).
- 13. Koonin E. V., Krupovic M. & Yutin N. Evolution of double-stranded DNA viruses of eukaryotes: from bacteriophages to transposons to giant viruses. Ann. N.Y. Acad. Sci. 1341, 10–24 (2015).
- 14. Jalasvuori M., Friman V.-P., Nieminen A., Bamford J. K. H. & Buckling A. Bacteriophage selection against a plasmid-encoded sex apparatus leads to the loss of antibiotic-resistance plasmids. Biology Letters 7, 902–905 (2011).
- 15. Colom J. et al. Sex pilus specific bacteriophage to drive bacterial population towards antibiotic sensitivity. Scientific Reports 9, 12616 (2019).
- 16. Penttinen R., Given C. & Jalasvuori M. Indirect Selection against Antibiotic Resistance via Specialized Plasmid-Dependent Bacteriophages. Microorganisms 9, 280 (2021).
- 17. Ojala V., Laitalainen J. & Jalasvuori M. Fight evolution with evolution: plasmid-dependent phages with a wide host range prevent the spread of antibiotic resistance. Evol Appl 6, 925–932 (2013).
- 18. DelaFuente J. et al. Within-patient evolution of plasmid-mediated antimicrobial resistance. 2022.05.31.493991 Preprint at 10.1101/2022.05.31.493991 (2022).
- 19. Anderson R. M. The pandemic of antibiotic resistance. Nature Medicine 5, 147–149 (1999).
- 20. Getino M. & de la Cruz F. Natural and Artificial Strategies To Control the Conjugative Transmission of Plasmids. Microbiology Spectrum 6, 6.1.03 (2018).
- 21. Conlan S. et al. Plasmid Dynamics in KPC-Positive Klebsiella pneumoniae during Long-Term Patient Colonization. mBio 7, e00742–16 (2016).
- 22. Vinjé J., Oudejans S. J. G., Stewart J. R., Sobsey M. D. & Long S. C. Molecular Detection and Genotyping of Male-Specific Coliphages by Reverse Transcription-PCR and Reverse Line Blot Hybridization. Applied and Environmental Microbiology 70, 5996 (2004).
- 23. Meynell G. G. & Lawn A. M. Filamentous phages specific for the I sex factor. Nature 217, 1184–1186 (1968).
- 24. Bradley D. E., Coetzee J. N., Bothma T. & Hedges R. W. Phage X: A Plasmid-dependent, Broad Host Range, Filamentous Bacterial Virus. Microbiology 126, 389–396 (1981).
- 25. Nuttall D., Maker D. & Colleran E. A method for the direct isolation of IncH plasmid-dependent bacteriophages. Letters in Applied Microbiology 5, 37–40 (1987).
- 26. He Z., Parra B., Nesme J., Smets B. F. & Dechesne A. Quantification and fate of plasmid-specific bacteriophages in wastewater: Beyond the F-coliphages. Water Research 227, 119320 (2022).
- 27. Popowska M. & Krawczyk-Balska A. Broad-host-range IncP-1 plasmids and their resistance potential. Frontiers in Microbiology 4, (2013).
- 28. Turton J. F. et al. Wide distribution of Escherichia coli carrying IncF plasmids containing blaNDM-5 and rmtB resistance genes from hospitalized patients in England. Journal of Medical Microbiology 71, 001569 (2022).
- 29. Woodford N. et al. Complete Nucleotide Sequences of Plasmids pEK204, pEK499, and pEK516, Encoding CTX-M Enzymes in Three Major Escherichia coli Lineages from the United Kingdom, All Belonging to the International O25:H4-ST131 Clone. Antimicrobial Agents and Chemotherapy 53, 4472–4482 (2009).
- 30. Adriaenssens E. & Brister J. R. How to Name and Classify Your Phage: An Informal Guide. Viruses 9, 70 (2017).
- 31. Brooks L. E., Kaze M. & Sistrom M. Where the plasmids roam: large-scale sequence analysis reveals plasmids with large host ranges. Microbial Genomics 5, e000244 (2019).
- 32. Stanton C. R., Petrovski S. & Batinovic S. Isolation of a PRD1-like Phage Uncovers the Carriage of Three Putative Conjugative Plasmids in Clinical Burkholderia Contaminans. Preprint at 10.20944/preprints202304.0153.v1 (2023).
- 33. Krupovič M., Cvirkaitė-Krupovič V. & Bamford D. H. Identification and functional analysis of the Rz/Rz1-like accessory lysis genes in the membrane-containing bacteriophage PRD1. Molecular Microbiology 68, 492–503 (2008).
- 34. Callanan J. et al. Expansion of known ssRNA phage genomes: From tens to over a thousand. Science Advances 6, eaay5981 (2020).
- 35. MacNair C. R., Rutherford S. T. & Tan M.-W. Alternative therapeutic strategies to treat antibiotic-resistant pathogens. Nat Rev Microbiol 1–14 (2023) doi: 10.1038/s41579-023-00993-0.
- 36. Huang Y. et al. Structure and proposed DNA delivery mechanism of a marine roseophage. Nat Commun 14, 3609 (2023).
- 37. Hospenthal M. K., Costa T. R. D. & Waksman G. A comprehensive guide to pilus biogenesis in Gram-negative bacteria. Nat Rev Microbiol 15, 365–379 (2017).
- 38. Hay I. D. & Lithgow T. Filamentous phages: masters of a microbial sharing economy. EMBO Reports 20, e47427 (2019).
- 39. Tittes C., Schwarzer S. & Quax T. E. F. Viral Hijack of Filamentous Surface Structures in Archaea and Bacteria. Viruses 13, 164 (2021).
- 40. Olsen R. H., Siak J.-S. & Gray R. H. Characteristics of PRD1, a Plasmid-Dependent Broad Host Range DNA Bacteriophage. Journal of Virology 14, 689–699 (1974).
- 41. Bamford D. H., Caldentey J. & Bamford J. K. H. Bacteriophage Prd1: A Broad Host Range Dsdna Tectivirus With an Internal Membrane. in Advances in Virus Research (eds. Maramorosch K., Murphy F. A. & Shatkin A. J.) vol. 45 281–319 (Academic Press, 1995).
- 42. Bamford D. H., Rouhiainen L., Takkinen K. & Soderlund H. Comparison of the Lipid-containing Bacteriophages PRD1, PR3, PR4, PR5 and L17. Journal of General Virology 57, 365–373 (1981).
- 43. Xie Y., Wahab L. & Gill J. J. Development and Validation of a Microtiter Plate-Based Assay for Determination of Bacteriophage Host Range and Virulence. Viruses 10, 189 (2018).
- 44. Wang I. N., Smith D. L. & Young R. Holins: the protein clocks of bacteriophage infections. Annu Rev Microbiol 54, 799–825 (2000).
- 45. Bläsi U., Fraisl P., Chang C.-Y., Zhang N. & Young R. The C-Terminal Sequence of the λ Holin Constitutes a Cytoplasmic Regulatory Domain. J Bacteriol 181, 2922–2929 (1999).
- 46. Nayfach S. et al. A genomic catalog of Earth’s microbiomes. Nature Biotechnology 1–11 (2020) doi: 10.1038/s41587-020-0718-6.
- 47. Roux S. et al. Cryptic inoviruses revealed as pervasive in bacteria and archaea across Earth’s biomes. Nat Microbiol 4, 1895–1906 (2019).
- 48. Edgar R. C. et al. Petabase-scale sequence alignment catalyses viral discovery. Nature 602, 142–147 (2022).
- 49. Strange J. E. S., Leekitcharoenphon P., Møller F. D. & Aarestrup F. M. Metagenomics analysis of bacteriophages and antimicrobial resistance from global urban sewage. Sci Rep 11, 1600 (2021).
- 50. Yutin N., Bäckström D., Ettema T. J. G., Krupovic M. & Koonin E. V. Vast diversity of prokaryotic virus genomes encoding double jelly-roll major capsid proteins uncovered by genomic and metagenomic sequence analysis. Virol J 15, 67 (2018).
- 51. Grahn A. M., Haase J., Lanka E. & Bamford D. H. Assembly of a functional phage PRD1 receptor depends on 11 genes of the IncP plasmid mating pair formation complex. Journal of bacteriology 179, 4733–4740 (1997).
- 52. Guo X. et al. Global prevalence, characteristics, and future prospects of IncX3 plasmids: A review. Frontiers in Microbiology 13, (2022).
- 53. Mouftah S. F. et al. Epidemic IncX3 plasmids spreading carbapenemase genes in the United Arab Emirates and worldwide. Infect Drug Resist 12, 1729–1742 (2019).
- 54. Liakopoulos A. et al. Genomic and functional characterisation of IncX3 plasmids encoding blaSHV-12 in Escherichia coli from human and animal origin. Sci Rep 8, 7674 (2018).
- 55. Gaidelytė A., Cvirkaitė-Krupovic V., Daugelavicius R., Bamford J. K. H. & Bamford D. H. The Entry Mechanism of Membrane-Containing Phage Bam35 Infecting Bacillus thuringiensis. Journal of Bacteriology 188, 5925 (2006).
- 56. Laurinmäki P. A., Huiskonen J. T., Bamford D. H. & Butcher S. J. Membrane Proteins Modulate the Bilayer Curvature in the Bacterial Virus Bam35. Structure 13, 1819–1828 (2005).
- 57. Parra B. et al. Isolation and characterization of novel plasmid-dependent phages infecting bacteria carrying diverse conjugative plasmids. Microbiology Spectrum 0, e02537–23 (2023).
- 58. Kauffman K. M. et al. A major lineage of non-tailed dsDNA viruses as unrecognized killers of marine bacteria. Nature 554, 118–122 (2018).
- 59. Martinez-Hernandez F. et al. Single-virus genomics reveals hidden cosmopolitan and abundant viruses. Nat Commun 8, 15892 (2017).
- 60. Sczyrba A. et al. Critical Assessment of Metagenome Interpretation – a benchmark of computational metagenomics software. 099127 Preprint at 10.1101/099127 (2017).
- 61. Roux S., Emerson J. B., Eloe-Fadrosh E. A. & Sullivan M. B. Benchmarking viromics: an in silico evaluation of metagenome-enabled estimates of viral community composition and diversity. PeerJ 5, e3817 (2017).
- 62. Bernheim A. & Sorek R. The pan-immune system of bacteria: antiviral defence as a community resource. Nat Rev Microbiol 18, 113–119 (2020).
- 63. Kropinski A. M., Mazzocco A., Waddell T. E., Lingohr E. & Johnson R. P. Enumeration of bacteriophages by double agar overlay plaque assay. Methods in molecular biology (Clifton, N.J.) (2009) doi: 10.1007/978-1-60327-164-6_7.
- 64. Schlechter R. O. et al. Chromatic Bacteria – A Broad Host-Range Plasmid and Chromosomal Insertion Toolbox for Fluorescent Protein Expression in Bacteria. Frontiers in Microbiology 9, (2018).
- 65. Owen S. V. et al. Prophages encode phage-defense systems with cognate self-immunity. Cell Host & Microbe (2021).
- 66. Klümper U. et al. Broad host range plasmids can invade an unexpectedly diverse fraction of a soil bacterial community. The ISME Journal 9, 934–945 (2015).
- 67. van der Walt S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).
- 68. Villa L., García-Fernández A., Fortini D. & Carattoli A. Replicon sequence typing of IncF plasmids carrying virulence and resistance determinants. Journal of Antimicrobial Chemotherapy 65, 2518–2529 (2010).
- 69. Baym M. et al. Inexpensive Multiplexed Library Preparation for Megabase-Sized Genomes. PLOS ONE 10, e0128036 (2015).
- 70. Bolger A. M., Lohse M. & Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
- 71. Hall M. Rasusa: Randomly subsample sequencing reads to a specified coverage. JOSS 7, 3941 (2022).
- 72. Wick R. R., Judd L. M., Gorrie C. L. & Holt K. E. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13, e1005595 (2017).
- 73. Meleshko D., Hajirasouliha I. & Korobeynikov A. coronaSPAdes: from biosynthetic gene clusters to RNA viral assemblies. Bioinformatics 38, 1–8 (2021).
- 74. Otto T. D., Dillon G. P., Degrave W. S. & Berriman M. RATT: Rapid Annotation Transfer Tool. Nucleic Acids Research 39, e57–e57 (2011).
- 75. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
- 76. Danecek P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).
- 77. Li H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
- 78. Danecek P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
- 79. Waskom M. seaborn: statistical data visualization. JOSS 6, 3021 (2021).
- 80. Hunter J. D. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 9, 90–95 (2007).
- 81. CCTOP: a Consensus Constrained TOPology prediction web server. Nucleic Acids Research. https://academic.oup.com/nar/article/43/W1/W408/2467915.
- 82. Protter: interactive protein feature visualization and integration with experimental proteomic data. Bioinformatics. https://academic.oup.com/bioinformatics/article/30/6/884/285666.
- 83. Mirdita M. et al. ColabFold: making protein folding accessible to all. Nat Methods 19, 679–682 (2022).
- 84. Schrödinger L. & DeLano W. The PyMOL Molecular Graphics System.
- 85. Sievers F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7, 539 (2011).
- 86. Okonechnikov K., Golosova O., Fursov M., & the UGENE team. Unipro UGENE: a unified bioinformatics toolkit. Bioinformatics 28, 1166–1167 (2012).
- 87. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
- 88. Terzian P. et al. PHROG: families of prokaryotic virus proteins clustered using remote homology. NAR Genomics and Bioinformatics 3, lqab067 (2021).
- 89. Zimmermann L. et al. A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core. Journal of Molecular Biology 430, 2237–2243 (2018).
- 90. Sayers E. W. et al. Database resources of the national center for biotechnology information. Nucleic Acids Res 50, D20–D26 (2022).
- 91. Gilchrist C. L. M. & Chooi Y.-H. clinker & clustermap.js: automatic generation of gene cluster comparison figures. Bioinformatics 37, 2473–2475 (2021).
- 92. Russel J., Pinilla-Redondo R., Mayo-Muñoz D., Shah S. A. & Sørensen S. J. CRISPRCasTyper: Automated Identification, Annotation, and Classification of CRISPR-Cas Loci. The CRISPR Journal 3, 462–469 (2020).
- 93. Tesson F. et al. Systematic and quantitative view of the antiviral arsenal of prokaryotes. Nat Commun 13, 2561 (2022).
- 94. Altschul S. F., Gish W., Miller W., Myers E. W. & Lipman D. J. Basic local alignment search tool. J Mol Biol 215, 403–410 (1990).
- 95. Camargo A. P. et al. IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata. Nucleic Acids Research 51, D733–D743 (2023).
- 96. El-Gebali S. et al. The Pfam protein families database in 2019. Nucleic Acids Research 47, D427–D432 (2019).
- 97. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
- 98. HMMER. http://hmmer.org/.
- 99. Wood D. E., Lu J. & Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol 20, 257 (2019).
- 100. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics. https://academic.oup.com/bib/article/14/2/178/208453.
- 101. Minh B. Q. et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Molecular Biology and Evolution 37, 1530–1534 (2020).
- 102. Letunic I. & Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Research 49, W293–W296 (2021).
- 103. Guindon S. et al. New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0. Systematic Biology 59, 307–321 (2010).
- 104. FigTree. http://tree.bio.ed.ac.uk/software/figtree/.