Abstract
We identified 9371 tailed phage prophages of 20 known types in reported complete genome sequences of 3298 bacteria in the Salmonella genus. These include 4758 P2 type and 744 P22 type prophages. The latter prophage types were found in the genome sequences of 127 and 24 bacterial host genera, increasing the known host ranges of phages in these groups by 114 and 20 genera, respectively. These prophage nucleotide sequences displayed much more diversity than was previously known from the 48 P2 and 24 P22 type authentic phages whose genomes have been sequenced. More detailed analysis of these prophage sequences indicated that major capsid protein (MCP) gene exchange between tailed phage clusters or types is extremely rare and that P22 prophage-encoded tailspikes correspond perfectly with their hosts’ surface polysaccharide structure; thus, MCP and tailspike sequences accurately predict tailed phage type (and thus lifestyle) and host cell surface polysaccharide structure, respectively.
Keywords: bacteriophage, tailed phage, prophage, phage P2, phage P22, major capsid protein, tailspike, Caudovirales, Salmonella
Introduction
Bacterial viruses are extremely abundant, and they have major impacts on the ecology of planet Earth (Bergh et al., 1989; Wommack and Colwell, 2000; Hendrix, 2002; Wilhelm et al., 2002; Hambly and Suttle, 2005; Suttle, 2007; Brussaard et al., 2008; Hurwitz and Sullivan, 2013). Yet, our understanding of the nature and extent of bacteriophage diversity remains very incomplete. In addition to the isolation and characterization of many individual bacteriophages (see for example reviews by Hatfull et al. (2010) and Grose and Casjens (2014)), considerable effort has gone into viral metagenomic analyses of various ecosystems such as the oceans (e.g., Breitbart et al., 2002; Paul and Sullivan, 2005; Roux et al., 2013; Hurwitz et al., 2014; Brum et al., 2015; Sanchez et al., 2015). The latter studies demonstrate the huge diversity and abundance of tailed phages but remain largely unconnected to the specific phages whose life cycles have been studied in the laboratory. Our approach towards attaining an understanding of viral diversity has been to reduce the size of the problem by examining the tailed phages that infect particular bacterial taxa. We previously followed the strategy of Hatfull et al. (2010) to classify authentic Enterobacteriaceae tailed phages into “clusters” that have nucleic acid similarity across >50% of their genomes (Grose and Casjens, 2014). We use the term “authentic” to denote phages that have been studied in the laboratory as functional viruses and the term “supercluster” to include different clusters whose members do not have enough nucleic acid similarity to merit inclusion in the same cluster but which have overall synteny with homologous (or sometimes functionally analogous) proteins encoded at parallel genome positions. In our analysis, clusters of related phages are operationally defined by their genome sequences without regard to their bacterial hosts.
The previously defined Enterobacteriaceae authentic phage clusters very rarely include phages that are known to infect hosts from other bacterial taxonomic families; we have found only two exceptions. Stenotrophomonas (a member of the Xanthomonadaceae bacterial family) phage IME15 is remarkable in its inclusion within the Enterobacteriaceae T7-like phage cluster (lytic cluster 5 in Grose and Casjens, 2014), and recently several phages that belong to the EMCL-117-like cluster (lytic cluster 27) have been reported to be broad spectrum, infecting both E. coli in the Enterobacteriaceae family and Pseudomonas aeruginosa in the Pseudomonadaceae family (Malki et al., 2015).
The above comparative analysis of 337 tailed phages that infect Enterobacteriaceae hosts and whose genomes have been completely sequenced largely unambiguously parsed these phages into 56 different “clusters”, 24 of which are temperate and 32 of which are lytic (the latter are also known as “virulent phages”) (Grose and Casjens, 2014). The rate of phage genome sequencing has continued to accelerate, and in the one and a half years since that study was completed 257 additional Enterobacteriaceae tailed phage genome sequences have been determined as of March 1, 2016, so the total number of sequenced genomes rose from 337 to 594. All but five of these newly characterized phages fall into one of the previously defined 56 clusters. The five exceptional newly sequenced phages, IME-EC2 (Escherichia; Accession No. KF591601; Hua et al., 2014), Ss1 (Cronobacter; KM058087; Endersen et al., 2014), CVT22 (Citrobacter; KP774835; Tikhe et al., 2015), SEN34 (Salmonella; KT630649), and GF-2 (Edwardsiella; AP014629; Yasuike et al., 2015) are founding phages of three new lytic and two new temperate clusters (our unpublished observations). The small number of new clusters generated by the many recently sequenced genomes supports the notion that, at least for the best-studied genera of this bacterial family, Escherichia and Salmonella, most of the common tailed phage types that infect them have been discovered.
In addition to the sequenced phage genomes, a large number of bacterial genome projects have sequenced many thousands of Enterobacteriaceae genomes. We and others have noted that many prophage sequences are present in these bacterial genome sequences (e.g., Perna et al., 2002; Canchaya et al., 2003; Casjens, 2003). In fact there are many times more prophage nucleotide sequences available for the genomes of most temperate phage clusters than there are bona fide functional phage genome sequences. In this report we examine the prophages present in the available Enterobacteriaceae bacterial genome sequences, with a focus on two very well-studied temperate tailed phage groups, the P2-like and P22-like phages, and on one of the best studied host genera, Salmonella, to determine how they affect our view of the diversity and evolution of tailed phage types.
RESULTS AND DISCUSSION
Prophage discovery and the temperate lifestyle
Analysis of Enterobacteriaceae phages reveals that switches between temperate and lytic lifestyles are rare
In our previous study we found that tailed bacteriophage major capsid protein (MCP, the protein building block for the icosahedral head shell of the virion) amino acid sequence relationships appeared to be good predictors of tailed phage type or cluster membership (Grose and Casjens, 2014). This indicates that MCP encoding genes are not being rapidly exchanged between clusters. Furthermore, since none of the Enterobacteriaceae tailed phage clusters contain both lytic and temperate phages, we tentatively concluded that switches between temperate and lytic lifestyles by Enterobacteriaceae tailed phages are infrequent, if they happen at all (Grose and Casjens, 2014). This correlation between MCP type and cluster membership suggests that MCP sequence relationships should predict phage lifestyle. We examine this idea in more detail here.
We searched the extant NCBI bacterial genome sequence database for similarities to selected MCPs (one from each subcluster) from each of the 61 Enterobacteriaceae tailed phage clusters (including the five recently established clusters, above) as well as the few exceptional MCPs (see Grose and Casjens, 2014). Table 1 shows that at the time of this search (March 1, 2016) 24 of the 26 Enterobacteriaceae temperate tailed phage clusters have matches with ≥97% amino acid sequence identity in Enterobacteriaceae bacterial genomes. The remaining two clusters are typified by the Erwinia temperate phages ∅Et88 and PEp14 (temperate singleton clusters 17 and 24, respectively, of Grose and Casjens, 2014). Phage ∅Et88 MCP has ~75% identity matches in two Enterobacteriaceae genera, and the putative PEp14 MCP (locus_tag PEp14_00010) has no convincing bacterial genome matches. We manually examined several hundred of these MCP matches in more detail and all were clearly present in (usually unannotated) prophages. The best authentic temperate phage MCP matches in bacterial genome sequences almost always have the highest similarity to prophage sequences in the host genus of the authentic phage whose MCP was used as the database probe (see Supplementary Table S1 for hosts of cluster prototype phages). This supports the notion that closely related phages usually infect related bacterial hosts, which in turn suggests that prophages have generally entered their host cells through the normal infection route that includes adsorption to species- or genus-specific receptors.
Table 1.
Phage cluster¹ | Prototype phages | Four most closely related Enterobacteriaceae genera with >50% MCP matches⁴ |
---|---|---|
Lytic 1–7 | T1, T4, Vi01, T5, T7, SP6, KP34 | None |
Lytic 8 | LIMEzero | Enterobacter-53%, Cronobacter-52%, Escherichia-51% |
Lytic 9 – 25 | ∅KT, GAP227, N4, 9NA, Chi, ∅Eco32, Felix-O1, SETP3 & K1-dep(1)², SO-1, ECO1230-10, Gj1, PY100, ∅92, rV5, SPN3US, RaK2, ∅R1–37 | None |
Lytic 26 | E1 | Klebsiella-54%, Enterobacter-54%, Escherichia-53%, Salmonella-52% |
Lytic 27 – 32 | EMCL-117, KF-1, MSW-3, Ea35–70, ∅EaH1, 9g | None |
Lytic 33 | IME-EC2 | Pectobacterium-73%, Escherichia-72%, Salmonella-72%, Leclercia-71% |
Lytic 34 – 35 | Ss1, CVT22 | None |
Temperate 1 | Lambda | Escherichia-100%, Shigella-100%, Salmonella-99%, Enterobacter-90% |
Temperate 2 & 3² | ∅80 & N15 | Escherichia-100%, Enterobacter-96%, Citrobacter-93%, Klebsiella-92% |
Temperate 3 | PY54² | Yersinia-80%, Klebsiella-78%, Enterobacter-77%, Citrobacter-76% |
Temperate 4 | HK97 | Enterobacter-99%, Escherichia-98%, Citrobacter-96%, Pantoea-93% |
Temperate 4 | mEp235² | Enterobacter-99%, Escherichia-98%, Cronobacter-98%, Klebsiella-91% |
Temperate 5 | ES18 | Salmonella-100%, Cronobacter-94%, Escherichia-92%, Enterobacter-92% |
Temperate 6 | Gifsy-2 | Salmonella-100%, Edwardsiella-73%, Hafnia-73%, Escherichia-70% |
Temperate 7 | BP-4795 | Escherichia-100%, Shigella-97%, Salmonella-77%, Enterobacter-78% |
Temperate 8 | SfV² | Escherichia-100%, Salmonella-99%, Enterobacter-94%, Citrobacter-91% |
Temperate 8 | SfI | Shigella-100%, Escherichia-100%, Salmonella-100%, Edwardsiella-97% |
Temperate 9 | P22 | Salmonella-100%, Tatumella-77%, Serratia-76%, Enterobacter-76% |
Temperate 9 | Sf6² | Escherichia-99%, Salmonella-99%, Serratia-92%, Shimwellia-89% |
Temperate 10 | APSE-1³ | Hamiltonella-99%, Arsenophonus-91%, Providencia-85%, Sodalis-82% |
Temperate 11 | 933W | Escherichia-99%, Salmonella-85%, Pectobacterium-75%, Yersinia-75% |
Temperate 12 | HK639 | Enterobacter-97%, Cronobacter-93%, Escherichia-94%, Leclercia-92% |
Temperate 13 | ∅ES15 | Cronobacter-100%, Enterobacter-73%, Morganella-72%, Salmonella-71% |
Temperate 14 | HS23 | Sodalis-100%, Leminorella-85%, Erwinia-84% |
Temperate 15 | ENT47670 | Cronobacter-98%, Escherichia-78%, Enterobacter-78%, Klebsiella-77% |
Temperate 16 | ZF40 | Brenneria-97%, Pectobacterium-96%, Kosakonia-84%, Serratia-84% |
Temperate 17 | ∅Et88 | Enterobacter-76%, Hafnia-75% |
Temperate 18 | ε15 | Salmonella-100%, Klebsiella-99%, Enterobacter-99%, Citrobacter-98% |
Temperate 19 | P1 | Escherichia-100%, Salmonella-100%, Klebsiella-99% |
Temperate 20 | P2 | Escherichia-100%, Salmonella-100%, Shigella-99%, Klebsiella-99% |
Temperate 21 | ESSI-2 | Cronobacter-100%, Yokenella-84%, Escherichia-83%, Salmonella-82% |
Temperate 22 | Mu | Escherichia-100%, Salmonella-99%, Citrobacter-73%, Enterobacter-71% |
Temperate 23 | SSU5 | Salmonella-100%, Enterobacter-100%, Yersinia-99%, Escherichia-99% |
Temperate 24 | PEp14 | None |
Temperate 25 | GF-2 | Edwardsiella-99%, Enterobacter-84%, Dickeya-84%, Escherichia-81% |
Temperate 26 | SEN34 | Salmonella-99%, Escherichia-97%, Citrobacter-97%, Enterobacter-97% |
¹ See Grose and Casjens (2014) for cluster definitions.
² N15 and ∅80 MCPs are similar in spite of their different overall cluster relationships; the MCPs of phages K1-dep(1), PY54, mEp235, Sf6 and SfV are exceptionally different from those of the other phages in their clusters.
³ Virions produced but not yet shown to be infective.
⁴ The few database hits to entries with only “WP_” accession numbers that did not allow access to MCP gene context were ignored. Percent identity values are from BLASTp (Altschul et al., 1997) searches and so cover most of the target proteins but may be slightly higher than direct comparison of whole protein sequences.
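As an illustration of how the right-hand column of Table 1 could be assembled from search output, the following minimal Python sketch keeps the best percent identity seen in each genus for a single MCP probe and reports the four highest genera above the 50% cutoff. The function name `top_genera`, the `(genus, percent_identity)` input format and the example hits are assumptions made for illustration only; they are not the pipeline or the raw data used in this study.

```python
from typing import Iterable, List, Tuple


def top_genera(hits: Iterable[Tuple[str, float]],
               min_identity: float = 50.0,
               n: int = 4) -> List[Tuple[str, float]]:
    """Return up to n genera with the best MCP identity above min_identity.

    hits: (genus, percent_identity) pairs for one MCP probe (hypothetical format).
    """
    best = {}
    for genus, identity in hits:
        # Keep only the single best match per genus.
        if identity > best.get(genus, 0.0):
            best[genus] = identity
    kept = [(g, i) for g, i in best.items() if i > min_identity]
    # Sort by descending identity; ties are broken alphabetically.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept[:n]


# Invented hits for one probe, for illustration only.
example_hits = [
    ("Salmonella", 100.0), ("Salmonella", 98.5), ("Tatumella", 77.0),
    ("Serratia", 76.0), ("Enterobacter", 76.0), ("Escherichia", 73.0),
    ("Vibrio", 41.0),
]
print(top_genera(example_hits))
```

Ties are ordered alphabetically here purely to make the output deterministic; the ordering of tied genera in Table 1 does not follow that rule.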
Table 1 also shows that in striking contrast to the temperate phages, only three of the 35 Enterobacteriaceae lytic phage clusters have MCP matches in Enterobacteriaceae bacterial chromosomes that have >50% identity. The first of these is not yet understood: homologues of moderate similarity (~50% identity) to LIMEzero-like phage (lytic cluster 8; within the T7 supercluster) MCPs are encoded by a few Enterobacter, Cronobacter and Escherichia genomes. These bacterial genome regions also usually encode a putative integrase but few other phage-like proteins. The other two Enterobacteriaceae lytic phages that have bacterial genome matches with better than 50% identity, E1 and IME-EC2 (singleton lytic clusters 26 and 33), appear to be examples of past horizontal transfer of virion assembly genes between temperate and lytic phages. Salmonella lytic phage E1 (Niu et al., 2014) has a number of phage ES18-like (temperate cluster 5) virion assembly genes and an early region that is weakly related to lytic T1-like phages (lytic cluster 1); the proteins encoded by these two regions are mostly in the 30–40% identity range to ES18 and T1, respectively. Lytic Escherichia phage IME-EC2 (Hua et al., 2014) has virion assembly genes that are related to temperate P22-like phages (temperate cluster 9) with protein identities to those phages mostly in the 30–70% range (closest MCP match is about 70% identity to the CUS-3-like subgroup of the P22-like phages) (see Figure S1). However, its early region lacks any recognizable lysogeny genes and has DNA metabolism genes that are in part relatives of those of Xanthomonas phage DIBBI. The latter phage is also a distant relative of the T1-like phages and is likely lytic, judging by its lack of recognizable lysogeny genes and lack of >50% MCP homologues in bacterial genomes. Both phage E1 and IME-EC2 have the same host species as their temperate relatives.
Thus, E1 and IME-EC2 are apparently examples of rare, relatively “recent” hybridization between lytic and temperate phages to form novel lytic phage types (IME-EC2 is discussed further below).
A simple sequence test that distinguishes temperate from lytic lifestyle
Our analysis (above) found that 25 of the 26 Enterobacteriaceae-infecting temperate phage cluster MCPs have >75% identity with proteins encoded by prophages in extant Enterobacteriaceae genome sequences. These authentic phage MCPs match proteins encoded by cognate prophages much better than any proteins encoded by prophages in other bacterial families (Tables 1 and S1). We note that this analysis includes the N15-like phages (temperate cluster 3) whose prophages are linear plasmids (Ravin and Shulga, 1970; Ravin et al., 2000) and the P1- and SSU5-like phages (temperate clusters 19 and 23, respectively) whose prophages are circular plasmids (Yarmolinsky and Sternberg, 1988; Falgenhauer et al., 2014). These temperate groups do not encode an integrase, the gene usually thought of as indicative of a temperate phage lifestyle. As expected, homologues of these latter MCPs are not found as integral parts of bacterial chromosome sequences but are present as plasmids or as phage-sized sequence contigs (or smaller if a draft genome has small contigs). In addition, none of the 35 lytic clusters have Enterobacteriaceae bacterial genome MCP homologues with >75% identity. Therefore, the presence or absence of closely related MCPs encoded by the genomes of the authentic phage’s host or closely related bacteria indicates that the phage is almost certainly temperate or lytic, respectively. This very simple test was at least as accurate at determining lifestyle as was reported for the computer program PHACTS, which uses a training set of phages with known lifestyles (McNair et al., 2012). In addition, it was as good at finding prophages (that have not degraded or deleted their MCP gene) as Prophinder and Phage_Finder, which use phage-gene enriched regions and genomic context to identify prophages (Fouts, 2006; Lima-Mendez et al., 2008). Two qualifications should be kept in mind concerning this simple operational prediction method.
First, a sufficient number of host bacterial genome sequences must be available (this is perhaps the reason for the one failure we find in our data set, the absence of Enterobacteriaceae genome-encoded homologues of temperate phage PEp14 MCP and only a low ~75% match for ∅Et88). Second, we recognize that the 75% identity cutoff will likely not hold up in all cases since rare “hybrid” phages like E1 and IME-EC2 may be found in the future which have had less time to diverge since the hybridization event. If a >90% cutoff is used, accuracy is much more assured.
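The operational test described above reduces to a single threshold comparison, which the following minimal Python sketch makes explicit. The function name `predict_lifestyle` and its interface are hypothetical and only restate the rule given in the text; the identity values in the usage lines are the best Enterobacteriaceae matches listed in Table 1.

```python
from typing import Optional


def predict_lifestyle(best_host_identity: Optional[float],
                      cutoff: float = 75.0) -> str:
    """Predict phage lifestyle from the best MCP match in host-family genomes.

    best_host_identity: highest percent identity between the phage MCP and any
    protein encoded by genomes of the host or closely related bacteria, or
    None when no convincing match exists.
    """
    if best_host_identity is not None and best_host_identity > cutoff:
        return "temperate"
    return "lytic"


# Best Enterobacteriaceae MCP matches taken from Table 1.
for name, identity in [("P22", 100.0), ("IME-EC2", 73.0),
                       ("Et88", 76.0), ("PEp14", None)]:
    print(name,
          predict_lifestyle(identity),               # 75% cutoff
          predict_lifestyle(identity, cutoff=90.0))  # stricter 90% cutoff
```

With these inputs the sketch calls PEp14 lytic although it is temperate, which reproduces the one failure noted above and shows why the test depends on having enough host genome sequences available.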
Curiously, the strong lack of Enterobacteriaceae lytic phage MCP gene homologues in Enterobacteriaceae bacterial genome sequences does not hold up as well when all bacteria are included in such a comparison. Six of the lytic cluster MCP types have bacterial matches with ≥50% identity to non-Enterobacteriaceae prophages (blue text in Table S1). For example, prophages in Acinetobacter and Mesorhizobium genomes encode putative MCP proteins with 64% and 75% identity to Enterobacteriaceae lytic phages SETP3 (lytic cluster 16) and N4 (lytic cluster 11), respectively. These similarities indicate a closer MCP relationship between the above Enterobacteriaceae lytic phages and temperate phages in other bacterial phyla than with temperate phages that infect the Enterobacteriaceae. Nonetheless, it is clear that authentic temperate phage MCPs are much more closely related to the MCPs of related prophages in their host species’ genomes than are lytic phage MCPs, often having homologues of ≥97% identity (as is the case for 24 of our 26 temperate clusters as described above).
Prophage discovery and abundance
The number of prophages and prophage fragments present in bacterial genome sequences in the public database is much larger than the number of authentic, fully functional phages whose genomes have been sequenced. For example, there are currently eleven authentic phages with sequenced genomes that have MCPs that are ≥98% identical to the well-studied Salmonella dsDNA tailed phage P22 (all of which infect Salmonella enterica), but the public database contains (on July 30, 2015) 712 prophages in Salmonella genome sequences that encode MCPs ≥98% identical to P22 MCP. In addition, the best matches outside Salmonella are in the following Enterobacteriaceae genera: Pluralibacter (91%), Kluyvera (83%), Tatumella (77%), Serratia (76%), Enterobacter (76%) and Escherichia (73%) (these include only matches that lie in contigs that show the integration of the prophage into bacterial DNA and so eliminate the possibility of the match being due to contamination of a genome project by Salmonella-generated P22-like phage particle DNA). Thus, temperate phages with MCPs that are very closely related to that of P22 appear to be restricted exclusively to Salmonella hosts, but slightly more distantly related MCPs are found in a number of related genera. Clearly prophages represent a huge untapped reservoir of phage sequence and diversity information. Although a small number of prophage sequences have been analyzed and described in the literature, the vast majority of extant prophage sequences are not annotated as such, and they have not been studied experimentally in any way. Without experimental work, it is of course not known whether any given prophage identified in a nucleotide sequence is actually a functional virus genome, even if it appears to be largely intact by bioinformatic analysis.
However, we have argued that most genes in defective prophages remain functional, and so we believe that the prophage relationships discussed here contribute to a valid assessment of the diversity of fully functional phages, even if some of the individual prophages analyzed might be defective due to natural deletion and mutation processes (Lawrence et al., 2001; Casjens, 2003).
Since authentic phages that infect most bacterial species have not been isolated and characterized, in addition to adding to our knowledge of the diversity that is present within known temperate phage types, the many prophage sequences present in the genome sequences of such species should be very informative regarding the range of hosts that a temperate phage type can infect. The huge number of prophages in bacterial genome sequences makes it impractical to analyze them all in detail, so in this report we focus our attention on the P2-like phage supercluster (which contains the P2-like and ESSI-2-like temperate clusters 20 and 21, respectively) and the P22-like phage group (P22-like temperate cluster 9 and APSE-1-like cluster 10, both of which are within the lambda supercluster) as examples of two quite different types of temperate phages (Grose and Casjens, 2014). These two phage groups were chosen for the following reasons: first, they are very well-characterized, and evolutionary and diversity studies benefit greatly from this accumulated knowledge; second, our preliminary analysis showed that they have contrasting wide and relatively narrow host ranges; third, they form well-defined, self-contained groups; and fourth, they infect the Enterobacteriaceae, one of the bacterial families for which overall phage diversity is best understood (Grose and Casjens, 2014).
As discussed above, to simplify prophage discovery we utilized our previous conclusion that MCP sequences are nearly always indicative of phage cluster membership (retests of this conclusion below confirm its validity). We used BLASTp (Altschul et al., 1997) searches of the public sequence database at the NCBI web site (http://blast.ncbi.nlm.nih.gov/Blast.cgi) with MCP probes from the P2- and P22-like phage clusters to identify related prophages. We note that the previously defined protein families called pfam05125 and pfam11651 (Finn et al., 2016) define sets of proteins with sequence similarities to phage P2 and P22 coat proteins, respectively, but these pfams were not used here and bear no direct relevance to the current study.
Although rare horizontal exchange events between phages in different clusters could potentially obfuscate a small number of specific cases (see phage E1 and IME-EC2 discussions above), such events are so rare that they do not abrogate any of the general conclusions drawn in this report. A few related MCP gene-containing prophages could be missed by this strategy since diagnostic prophage genes can on rare occasion fall between contigs in draft genome sequences or be missed if a genome sequence’s predicted translation annotation is not correct. Since draft bacterial genome sequences were included in this analysis and draft sequence contigs can be smaller than the length of prophages, we made no attempt to determine the “completeness” of prophage sequences. We did, however, examine MCP matches in putative prophages from any surprising source, such as a distantly related bacterial phylum, and they were not used unless the presence of a host-phage sequence junction showed that the contig is not a phage contaminant in the genome project (only a very small number of such contigs were found). These shortcomings have proven to be very minor, and we believe that the analyses discussed below show a realistic picture of the abundance and diversity of the P2- and P22-like prophages in bacterial genome sequences.
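The contamination check described above can be pictured as a simple coordinate test: a contig is accepted only if it extends beyond at least one end of the predicted prophage, so that a host-phage junction is present. In the minimal Python sketch below, the function name `has_host_junction`, the 1-based inclusive coordinate convention and the `min_flank` value are all hypothetical; the text does not state how junctions were located or how much flanking host sequence was required.

```python
def has_host_junction(contig_length: int,
                      prophage_start: int,
                      prophage_end: int,
                      min_flank: int = 1000) -> bool:
    """True if the contig carries host DNA flanking at least one prophage end.

    Coordinates are 1-based and inclusive (an assumed convention).
    min_flank is an arbitrary illustrative threshold in base pairs.
    """
    if not (1 <= prophage_start <= prophage_end <= contig_length):
        raise ValueError("prophage coordinates must lie within the contig")
    left_flank = prophage_start - 1              # host bases before the prophage
    right_flank = contig_length - prophage_end   # host bases after the prophage
    return left_flank >= min_flank or right_flank >= min_flank


# A prophage embedded in a large contig versus a phage-only contig.
print(has_host_junction(250_000, 100_001, 140_000))
print(has_host_junction(40_000, 1, 40_000))
```

A contig that consists only of phage sequence fails the test and would be treated as possible contamination by phage particle DNA.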
Phage P2-like prophage diversity
Defining the P2-like supercluster
E. coli phage P2 is the best-studied member of a group of phages that we term the P2 supercluster. We previously reported that the thirteen Enterobacteriaceae-infecting P2 supercluster phages whose genomes had been completely sequenced at that time naturally parsed unambiguously into two clusters that contained twelve (now eighteen) P2-like phages and one ESSI-2-like phage (Grose and Casjens, 2014). These nineteen authentic phages each infect one of the following species: Escherichia coli, Salmonella enterica, Yersinia pestis, Erwinia amylovora or Cronobacter sakazakii (listed in Table S2A). These two phage clusters are self-contained with well-defined boundaries, and they both contain a divergently transcribed P2-like Q-P-O-N-M-L head gene cluster that is unique to, and diagnostic of, the P2 supercluster phages. These six genes encode the portal protein - large terminase - scaffolding protein - MCP - small terminase - head completion protein, respectively (reviewed by Nilsson and Ljungquist, 2006; Christie and Calendar, 2016). The other virion assembly genes of this phage group are also largely conserved within the supercluster.
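To make the diagnostic gene order concrete, the following minimal Python sketch checks whether an ordered list of functional roles along a prophage contains the six Q-P-O-N-M-L roles as a contiguous run, reading the map in either direction. It is a toy illustration, not a method used in this study: it assumes each gene has already been assigned one of the role labels shown, it ignores transcription direction, and the names `P2_HEAD_ROLES`, `contains_in_order` and `has_p2_head_cluster` are hypothetical.

```python
from typing import List, Sequence

# Roles encoded by P2 genes Q, P, O, N, M and L, in map order (from the text).
P2_HEAD_ROLES = (
    "portal", "large terminase", "scaffold", "MCP",
    "small terminase", "head completion",
)


def contains_in_order(genes: Sequence[str], pattern: Sequence[str]) -> bool:
    """True if pattern occurs as a contiguous run within genes."""
    k = len(pattern)
    return any(tuple(genes[i:i + k]) == tuple(pattern)
               for i in range(len(genes) - k + 1))


def has_p2_head_cluster(roles: List[str]) -> bool:
    """Check for the Q-P-O-N-M-L role order, read in either map direction."""
    return (contains_in_order(roles, P2_HEAD_ROLES)
            or contains_in_order(roles[::-1], P2_HEAD_ROLES))


# Illustrative role lists (not taken from any annotated genome).
p2_like = ["integrase", "portal", "large terminase", "scaffold", "MCP",
           "small terminase", "head completion", "tail sheath"]
other_order = ["small terminase", "large terminase", "portal", "scaffold", "MCP"]
print(has_p2_head_cluster(p2_like), has_p2_head_cluster(other_order))
```

Requiring a strictly contiguous run is a simplification; real prophages can carry inserted or unannotated genes, which is one reason manual inspection remains necessary.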
Unlike most other Enterobacteriaceae phage clusters and superclusters, genome sequences have been reported for 29 additional authentic “P2-like” phages that infect bacterial families other than Enterobacteriaceae (listed in Table S2C). These include the following phages: Pseudomonas aeruginosa phage ∅CTX (Nakayama et al., 1999), Haemophilus influenzae phages HP1 and HP2 (Esposito et al., 1996), Pasteurella multocida phage F108 (Campoy et al., 2006), Mannheimia haemolytica phage ∅MHaA1 (Highlander et al., 2006), Stenotrophomonas maltophilia phage Smp131 (Lee et al., 2014), Aeromonas media phage ∅O18P (Beilstein and Dreiseikelmann, 2008), and Vibrio cholerae phage K139 (Kapfhammer et al., 2002) (hosts in the Gammaproteobacteria Pseudomonadaceae, Pasteurellaceae, Aeromonadaceae, Xanthomonadaceae and Vibrionaceae families), as well as several Burkholderia cepacia and Ralstonia solanacearum Betaproteobacteria phages (Fujiwara et al., 2008; Lynch et al., 2010 and 2012; Kvitko et al., 2012; Niu et al., 2015). Although all well-characterized P2 supercluster phages are temperate, the “P2-like” Burkholderia phages ST79 and ∅E12-2 have no clearly recognizable integrase genes; ST79 has been reported to be lytic (Yordpratum et al., 2011; Kulsuwan et al., 2014), while it has been suggested that ∅E12-2 is temperate (Nakornpakdee et al., 2015). Neither of the latter conclusions has been rigorously proven.
Although similar to P2 at the protein level, all of these authentic phages with hosts outside the Enterobacteriaceae are substantially diverged from, and lie largely outside of, the previously defined P2- and ESSI-2-like clusters (see below). The virion assembly proteins of all these P2-like phages have the typical overall P2 gene order, including the unique and diagnostic Q-P-O-N-M-L head gene arrangement, and in silico proteomic analysis confirms that even the taxonomically diverse members of this group are much more similar to other P2-like phages than they are to phages outside this group (above references to the individual phages and our unpublished analysis). However, although members of this wider group of P2-like phages have the same overall genome organization, there are several previously reported gene order differences. Figure 1, in which several different gene orders are evident, displays genomic maps of fourteen representative phages - ten that infect hosts from five Gammaproteobacteria families (phages P2, Fels-2, P88, ∅CTX, Smp131, ∅MHaA1, HP1, K139, ∅O18P and ESSI-2) and four that infect two Betaproteobacteria families (phages ∅52237, ∅RSA1, ∅E12-2 and ST79). For example, phages P2, HP1 and ∅CTX have three different genome arrangements (Esposito et al., 1996; Nakayama et al., 1999). The HP1 homologues of P2 tail genes FI, FII and T lie in a transcriptionally upstream location relative to P2, and, while the virion assembly genes of ∅CTX are syntenic with P2, its integrase gene is inverted and lies at the opposite end of the early region. In addition, the ∅CTX att integration site location is different from P2 and HP1, while the HP1 cos packaging initiation site location is different from the other two phages. These latter differences cause the permutation of the prophage vs. virion DNA gene order to be different in these three cases.
In most cases genes with similar function encode recognizable protein homologues throughout the P2-like phages. However, in a few cases such as the tail genes T through 42 in the “HP1-like subgroup” (phages HP1, K139, ∅O18P and ESSI-2), they are sufficiently divergent that most amino acid similarities are no longer recognizable by simple BLASTp (Altschul et al., 1997) searches even though some sequence motifs may be retained. In addition to the genome arrangements in Figure 1, we find several other arrangements in the P2-like prophage panels discussed below; these include (1) inversion of the integrase gene or the integrase-attachment site cassette relative to ∅CTX and HP1 (e.g., prophages ChromoP2-A and PseudoalteroP2-A, respectively), (2) several additional differences in tail gene order (e.g., NovoP2_A), and (3) two Zymomonas prophages that appear not to be integrated into the host chromosome but are possibly present as circular plasmids (e.g., prophage ZymoP2-A is plasmid pZZM401, accession No. CP001881) (see Table S2D for information on these prophages). It is not known if these prophages all represent bona fide phage gene arrangements, but the fact that several of these gene arrangements are present in prophages in multiple bacterial isolates suggests that they may be functional phages. In spite of their substantial divergence and genome mosaicism, the overall synteny of these “P2-like” phages is clear, and their encoded proteins, especially the virion assembly proteins, are largely similar. We therefore include all of them within the P2 supercluster.
Bacterial taxa infected by the P2-like supercluster phage group
Phage P2, the namesake for this group of phages, is actually known to successfully infect Shigella, Klebsiella, Serratia and Yersinia strains in addition to E. coli (summarized in Bertani and Six, 1988). This highlights the fact that the entire host range of most of the authentic phages and all of the prophages discussed in this report has not been determined. We do not wish to imply that the hosts mentioned here are the only species that any given phage can infect. Nonetheless, to discover the taxa of bacterial hosts that can be infected by phages in the P2 supercluster we used MCPs from the P2-like supercluster phages to search for related prophages in the extant bacterial sequence database. There is considerable diversity within this phage supercluster (see below), so the divergent MCPs of authentic phages P2, ESSI-2, ∅52237 (the latter is a Burkholderia Betaproteobacteria phage) and several prophages from the Alpha- and Deltaproteobacteria, for which there is no studied authentic phage, were used as probes. The bacterial genera whose genomes encode proteins with >30% amino acid sequence identity to any one of these MCPs are listed in Table 2. The 30% value was chosen as a moderately stringent cutoff that avoids spurious matches outside the P2 supercluster, so this prophage estimate is a minimum value as some truly P2-like prophages could have diverged beyond this point. Nonetheless, additional searches with phage HP1 and ∅CTX MCPs or P2 large terminase and portal protein (the latter two are even more conserved within this supercluster than are the MCPs) did not identify prophages in any additional bacterial genera, so we believe this strategy found essentially all P2 supercluster prophages in the database (analysis completed November 30, 2015).
No automated computer algorithm currently exists that can easily examine and compare the gene order within this multitude of prophages; however, all P2-like MCP genes that we manually examined (over one hundred) were found to lie in prophages that contain a syntenic homologue of the diagnostic P2 Q-P-O-N-M-L gene cluster.
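The multi-probe search with a 30% cutoff described above can be summarized as follows: a genus is retained if any probe finds a match above the cutoff, and the best-matching probe is recorded. The minimal Python sketch below illustrates that bookkeeping only. The function name `genera_matching_any_probe`, the `(probe, genus, percent_identity)` input format and the example values are assumptions made for illustration and do not reproduce the data in Table 2.

```python
from typing import Dict, Iterable, Tuple


def genera_matching_any_probe(
        hits: Iterable[Tuple[str, str, float]],
        cutoff: float = 30.0) -> Dict[str, Tuple[str, float]]:
    """Map each genus to its best (probe, percent identity) above the cutoff.

    hits: (probe, genus, percent_identity) triples (hypothetical format).
    """
    best: Dict[str, Tuple[str, float]] = {}
    for probe, genus, identity in hits:
        if identity <= cutoff:
            continue  # at or below the cutoff: treated as a possible spurious match
        if genus not in best or identity > best[genus][1]:
            best[genus] = (probe, identity)
    return best


# Invented hits for two probes, for illustration only.
example_hits = [
    ("P2", "Escherichia", 100.0), ("ESSI-2", "Escherichia", 83.0),
    ("P2", "Cronobacter", 83.0), ("ESSI-2", "Cronobacter", 100.0),
    ("P2", "Bacillus", 22.0),
]
print(genera_matching_any_probe(example_hits))
```

Because a genus is kept when any single probe exceeds the cutoff, adding more divergent probes can only enlarge the list of genera, which is consistent with treating the resulting host list as a minimum estimate.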
Table 2.
| CLASS | ORDER | FAMILY | GENUS¹ |
|---|---|---|---|
| Alphaproteobacteria² | Caulobacterales | Caulobacteraceae | Asticcacaulis 72(54)% |
| | | | Brevundimonas 48(50)% |
| | Rhodospirillales | Rhodospirillaceae | Novispirillum 42(39)% |
| | | | Pararhodospirillum 39(42)% |
| | Sphingomonadales | Sphingomonadaceae | Blastomonas 48(50)% |
| | | | Erythrobacter 24(50)% |
| | | | Novosphingobium 50(100)% |
| | | | Sphingomonas 50(47)% |
| | | | Sphingopyxis (44)% |
| | | | Zymomonas 42(44)% |
| Betaproteobacteria³ | Burkholderiales | Alcaligenaceae | Achromobacter 60(58)% |
| | | Burkholderiaceae | Burkholderia 55(100)% |
| | | | Chitinimonas 53(64)% |
| | | | Cupriavidus 56(65)% |
| | | | Pandoraea 54(60)% |
| | | Oxalobacteraceae | Collimonas 52(64)% |
| | | | Herbaspirillum 54(62)% |
| | | | Janthinobacterium 54(62)% |
| | | | Oxalobacter 53(63)% |
| | | Comamonadaceae | Acidovorax 51(57)% |
| | | | Alicycliphilus 49(58)% |
| | | | Comamonas 49(56)% |
| | | | Curvibacter 51(60)% |
| | | | Delftia 52(60)% |
| | | | Hylemonella 48(50)% |
| | | | Ottowia 48(56)% |
| | | | Polaromonas 51(54)% |
| | | Methylophilaceae | Methylobacillus 52(56)% |
| | | Ralstoniaceae | Ralstonia 58(66)% |
| | | Rhodocyclaceae | Uliginosibacterium 48(59)% |
| | Neisseriales | Neisseriaceae | Neisseria 41(41)% |
| | | | Chromobacterium 54(62)% |
| | | | Gulbenkiania 57(66)% |
| | | | Kingella 41(38)% |
| | | | Laribacter 54(58)% |
| | | | Pseudogulbenkiania 34(34)% |
| | Nitrosomonadales | Nitrosomonadaceae | Ferriphaselus 52(56)% |
| Gammaproteobacteria⁴ | Aeromonadales | Aeromonadaceae | Aeromonas 55(57)% |
| | | | Oceanimonas 30(56)% |
| | Alteromonadales | Alteromonadaceae | Marinobacter 37(32)% |
| | | | Teredinibacter 53% |
| | | Moritellaceae | Moritella 33% |
| | | Psychromonadaceae | Psychromonas 39(50)% |
| | | Pseudoalteromonadaceae | Algicola 35% |
| | | | Pseudoalteromonas 56(55)% |
| | | Shewanellaceae | Shewanella 55(55)% |
| | Enterobacteriales | Enterobacteriaceae | Arsenophonus 56(74)% |
| | | | Brenneria 39(51)% |
| | | | Buttiauxella 79% |
| | | | Cedecea 59(34)% |
| | | | Citrobacter 84(82)% |
| | | | Cronobacter 83(100)% |
| | | | Dickeya 60(82)% |
| | | | Edwardsiella 78(57)% |
| | | | Enterobacter 98(84)% |
| | | | Erwinia 59(30)% |
| | | | Escherichia 100(83)% |
| | | | Franconibacter 61(89)% |
| | | | Hafnia 62(81)% |
| | | | Kosakonia 64(81)% |
| | | | Klebsiella 99(82)% |
| | | | Kluyvera 63% |
| | | | Leclercia 59% |
| | | | Morganella 57(70)% |
| | | | Pantoea 76(31)% |
| | | | Pectobacterium 72(78)% |
| | | | Photorhabdus 61(74)% |
| | | | Plesiomonas 61(30)% |
| | | | Pluralibacter 49(49)% |
| | | | Proteus 55(73)% |
| | | | Providencia 56(74)% |
| | | | Rahnella 69(75)% |
| | | | Raoultella 77(51)% |
| | | | Rouxiella 72(50)% |
| | | | Salmonella 100(82)% |
| | | | Siccibacter 81% |
| | | | Serratia 77(79)% |
| | | | Shigella 99(47)% |
| | | | Sodalis (71)% |
| | | | Tatumella 58% |
| | | | Trabulsiella 41(34)% |
| | | | Xenorhabdus 58(74)% |
| | | | Yersinia 72(77)% |
| | | | Yokenella 57(84)% |
| | Cardiobacteriales | Cardiobacteriaceae | Cardiobacterium 46% |
| | | | Dichelobacter 46% |
| | Oceanospirillales | Oceanospirillaceae | Oleispira 38(33)% |
| | | | Oceanobacter 52% |
| | | | Marinomonas 33% |
| | | Hahellaceae | Endozoicomonas 34(30)% |
| | | | Hahella 39% |
| | | | Zooshikella 34% |
| | | Halomonadaceae | Chromohalobacter 52(31)% |
| | | | Halomonas 53(30)% |
| | | | Zymobacter 55(32)% |
| | Orbales | Orbaceae | Gilliamella 51% |
| | Pasteurellales | Pasteurellaceae | Actinobacillus (36)% |
| | | | Avibacterium 32(35)% |
| | | | Gallibacterium 50(33)% |
| | | | Haemophilus 52(35)% |
| | | | Histophilus 46(32)% |
| | | | Mannheimia 55% |
| | | | Necropsobacter 46(31)% |
| | | | Pasteurella 50(35)% |
| | Pseudomonadales | Moraxellaceae | Acinetobacter 52(31)% |
| | | | Alkanindiges 49% |
| | | | Enhydrobacter 48% |
| | | | Moraxella 37% |
| | | | Perlucidibaca 49% |
| | | | Psychrobacter 35% |
| | | Pseudomonadaceae | Pseudomonas 55(32)% |
| | | | Azotobacter 55% |
| | | | Dasania 36% |
| | | | Serpens 47% |
| | Thiotrichales | Piscirickettsiaceae | Hydrogenovibrio 34(30)% |
| | | Thiotrichaceae | Thiothrix 42% |
| | Vibrionales | Vibrionaceae | Aliivibrio 36(55)% |
| | | | Grimontia (49)% |
| | | | Photobacterium (59)% |
| | | | Salinivibrio (54)% |
| | | | Vibrio 39(57)% |
| | Xanthomonadales | Xanthomonadaceae | Luteibacter 46% |
| | | | Lysobacter 54(32)% |
| | | | Xanthomonas 57(33)% |
| | | | Stenotrophomonas 53(32)% |
| Deltaproteobacteria⁵ | Desulfovibrionales | Desulfovibrionaceae | Bilophila 37(64)% |
| | | | Desulfovibrio 39(100)% |

¹ The best BLASTp reported matches with ≥30% identity to phage P2 MCP from each bacterial genus are included in the table. Similar best matches to other P2 supercluster phage MCP probes are indicated in parentheses; the “other” phage MCP was different for each bacterial class as indicated in footnotes 2–5.
² Parentheses: % identity to prophage NovoP2-A MCP (Table S2D).
³ Parentheses: % identity to Burkholderia phage ∅52237 MCP.
⁴ Parentheses: % identity to phage ESSI-2 MCP.
⁵ Parentheses: % identity to prophage DesulfP2-A MCP (Table S2D).
Table 2 shows that P2-like MCPs are encoded by prophages present in the genomes of 127 different genera in 32 different Proteobacteria families. These include ten genera in three families of the Alphaproteobacteria, 27 genera in nine families of the Betaproteobacteria, 88 genera in nineteen families of the Gammaproteobacteria and two genera in one family of the Deltaproteobacteria. This is 114 more host genera than were evident from the previously known authentic P2 supercluster phages (Table S2A and D). Clearly this group of phages has been very successful and is widespread among the Proteobacteria. No convincing BLASTp matches to any of the P2 supercluster MCP or portal protein probes used were found outside the Proteobacteria phylum; the very small number of such matches that were found reside on very small contigs that do not show the prophage integration junction with the host. We suspect that these are most likely Proteobacteria phages that contaminated the DNA source of these non-Proteobacteria genome sequencing projects and suggest that they should be treated with skepticism until more convincing evidence is available. The relative numbers of sequenced bacterial genomes from the various taxa color any database observations such as these, but the huge number of bacterial genome sequences available from many branches of the bacterial evolutionary tree strongly suggests that the failure to find matches outside the Proteobacteria is not likely to be due to insufficient sampling of bacterial genomes. We conclude that the P2 phage supercluster is likely restricted to the Proteobacteria phylum but is widely dispersed therein.
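As a quick consistency check, the per-class counts quoted in this paragraph can be summed. The snippet below uses only numbers stated in the text. The figure for previously known host genera is not stated directly; it is derived here by subtraction and should be read as an inference.

```python
# (genera, families) per class, as stated in the text for Table 2
table2_counts = {
    "Alphaproteobacteria": (10, 3),
    "Betaproteobacteria": (27, 9),
    "Gammaproteobacteria": (88, 19),
    "Deltaproteobacteria": (2, 1),
}

total_genera = sum(genera for genera, _ in table2_counts.values())
total_families = sum(families for _, families in table2_counts.values())

# The text reports 114 newly identified host genera, so the number known
# from authentic phages is inferred as the difference.
previously_known_genera = total_genera - 114

print(total_genera, total_families, previously_known_genera)
```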
P2 supercluster phage diversity within the Enterobacteriaceae
Our previous analysis of the tailed phages that infect the Enterobacteriaceae defined two related clusters, the P2- and ESSI-2-like phages, and we originally divided the P2-like cluster into four subclusters A (typified by phage P2), B (phage 186), C (phage Fels-2) and D (phage ENT90) (Grose and Casjens, 2014). Since that study was completed, the newly released sequences of authentic phage P88 as well as the similar pair of phages SEN4 and SEN5 form two additional well-defined subclusters within the P2-like cluster (Table S2A; Figure 2 below). Phage ESSI-2 remains the only known authentic phage in its cluster. To examine the diversity of P2 supercluster phages in this host family in more detail, we chose a panel of 30 prophages from the Enterobacteriaceae that were identified in the above MCP search; these prophages are listed in Table S2B along with locus_tags that can be used to locate each MCP gene and its encoding prophage. These prophages were chosen randomly, but were required to have a largely complete gene content and to span a significant fraction of the extant MCP diversity. This heuristic sampling approach, although not comprehensive, can give an initial view of the P2-like phage diversity within this host family.
Figure 2 shows a dot plot analysis of the genomes of this panel of prophages and a representative sample of authentic Enterobacteriaceae P2 supercluster phages. The P2-, 186-, Fels-2-, ENT90-, P88- and SEN4-like subclusters can be seen in this plot as having weak but nearly full-length diagonals between subclusters. Prophage ProvP2-B appears to be the most distantly related but even it has a weak but long diagonal with a few other members of the prophage panel (e.g., MorgP2-A). The previously defined ESSI-2 cluster remains distinct from the P2-like cluster with only very weak incomplete diagonals with a few P2-like cluster phages. None of the prophages in the panel showed stronger relationships with authentic phages with hosts outside of the Enterobacteriaceae than with authentic phages inside this host family (not shown). Of the 30 prophages in the panel, nineteen reside in the P2-like cluster and eleven are in the ESSI-2-like cluster. The nineteen P2-like prophages form twelve new subclusters, and the eleven ESSI-2-like prophages form four new subclusters in addition to the one typified by ESSI-2. This semi-random collection of 30 prophages increases the number of known subclusters in the P2-like cluster from six to eighteen and increases the number of ESSI-2-like subclusters from one to five. Thus, examination of even a small fraction of extant prophage sequences from this single bacterial host family greatly increases the observed P2 supercluster phage diversity beyond that which was known previously.
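For readers unfamiliar with dot plots, the toy sketch below shows what a "diagonal" represents. It is an illustration only and not the software used to produce Figure 2: the function names, the word size of 4 and the short sequences are all invented for the example. A dot is placed wherever a short word in one sequence exactly matches a word in the other, and runs of dots along a diagonal indicate colinear (syntenic) similarity.

```python
def dot_plot(seq_a, seq_b, word=4):
    """Return the set of (i, j) positions where the `word`-length window of
    seq_a starting at i exactly matches the window of seq_b starting at j."""
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    dots = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], []):
            dots.add((i, j))
    return dots


def longest_diagonal(dots):
    """Length, in consecutive dots, of the longest run along any diagonal."""
    best = 0
    for i, j in dots:
        if (i - 1, j - 1) in dots:
            continue  # this dot is not the start of a run
        length = 1
        while (i + length, j + length) in dots:
            length += 1
        best = max(best, length)
    return best


a = "ATGGCTAAAGGTCTGACC"
b = "ATGGCTAAAGGTCTGACC"  # identical to a
c = "ATGGCTAAATTTCTGACC"  # a with two substituted bases (invented)

identical_run = longest_diagonal(dot_plot(a, b))
interrupted_run = longest_diagonal(dot_plot(a, c))
print(identical_run, interrupted_run)
```

An unbroken run corresponds to the strong full-length diagonals seen within a subcluster, while substitutions break the run into shorter segments, which is the pattern described in the text as weak or segmented diagonals between more distant relatives.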
An MCP dot plot of the phages whose genomes are compared in Figure 2 shows that MCP relationships largely reflect the whole genome relationships, although two prophages, ProdP2-A and YersP2-A, appear to have undergone intra-cluster MCP exchanges that disrupt this relationship (Figure S2). Figure 3 shows a neighbor-joining tree that includes the MCPs of the Enterobacteriaceae phages and panel of prophages, where branches with high bootstrap values strongly support the relationships in the whole genome dot plot shown in Figure 2. The analyses in Figures 2, S2 and 3 also strongly support our previous observation (Grose and Casjens, 2014) that P2-like MCP sequence always reflects cluster and nearly always subcluster membership. These Enterobacteriaceae phages and prophages have both P2 and HP1 genome organizations (Figure 2). Our Enterobacteriaceae panel does not contain any phages with the ∅CTX-like or other organizations, and searches of the Enterobacteriaceae genome sequences with ∅CTX MCP found no close matches, only matches with ≤57% identity in prophages with P2 gene organization (best match was 57% in Pantoea rwandensis strain ND04, accession No. CP009454), indicating that our panel selection likely did not fortuitously miss close relatives of ∅CTX in this host family. Similar searches with other very distantly related P2 supercluster MCPs indicate that it is very likely that we discovered all of the P2-like phages in the extant Enterobacteriaceae genomes.
P2 supercluster phage diversity across the Gammaproteobacteria class
As discussed above, unlike many of the Enterobacteriaceae tailed phage types, the P2 supercluster includes authentic phages that infect bacteria outside of this family. Figure S3 shows a dot plot of authentic P2-like phages that includes those with Gammaproteobacteria hosts outside the Enterobacteriaceae. The phages that infect Aeromonadaceae and Vibrionaceae each form a clear separate cluster, and those that infect Pasteurellaceae form two clear clusters. The dot plot also shows that the Pseudomonadaceae and Xanthomonadaceae phages are weakly related to each other as well as to some Enterobacteriaceae P2-like phages, especially phage ENT90.
To further examine the diversity of P2-like phages that infect Gammaproteobacteria hosts, we collected a panel of 25 largely intact P2-like prophages that includes one or two randomly chosen prophages from each of thirteen non-Enterobacteriaceae Gammaproteobacteria families (listed in Table S2D; the remaining Gammaproteobacteria families from Table 2 were not included because their prophage sequences were present in multiple small contigs). The genome dot plot in Figure 4 includes these 25 prophages; four representative Enterobacteriaceae P2-like phages are shown, and the Enterobacteriaceae prophage panel members (above) are not shown in order to avoid making individual phage genomes even smaller in the presentation. Numerous other comparisons (not shown) did not identify any additional strong similarities between the Enterobacteriaceae P2-like phages or prophages and those from other Gammaproteobacteria families. Again, the MCP relationships largely reflect the overall genome relationships (cf. Figures 4 and S4).
These prophages are quite diverse and can be considered to form as many as a dozen novel phage clusters (all within the P2 supercluster). There is considerable cluster-level phage diversity within host families such as the Hahellaceae, Oceanospirillaceae, Alteromonadaceae and Pseudoalteromonadaceae, where none of the prophages within a family show strong relationships. On the other hand, some phages and prophages from the Halomonadaceae, Aeromonadaceae, Pseudomonadaceae, Moraxellaceae, Xanthomonadaceae, Pasteurellaceae, and Vibrionaceae host families form related intra-family groups whose members have sufficient syntenic similarity to warrant their inclusion in the same cluster. The expanded dot plots of example phages in Figures S5 and S6 show that although phage clusters and subclusters tend to parallel host families, relationships are not always that simple and are not strictly limited to intra-host family clusters. For example, authentic Pseudomonadaceae phage ∅CTX and Xanthomonadaceae phage Smp131 exhibit significant similarity. Multi-cluster intra-host family diversity is clearly not restricted to the Enterobacteriaceae, and yet some clusters can contain phages with hosts in different families.
P2 supercluster phage diversity across the Proteobacteria phylum
The genome dot plot in Figure S3 includes the known authentic P2-like Betaproteobacteria phages; they are most highly related within their host family. Although no Alpha-, Delta- or Zetaproteobacteria P2-like authentic phages have been characterized, our prophage search (above) identified Alpha- and Deltaproteobacteria genera that serve as hosts for P2 supercluster phages (Table 2). Again, to examine the diversity outside the Gammaproteobacteria a panel of fourteen prophages was chosen randomly from Alpha-, Beta-, and Deltaproteobacteria genomes (listed in Table S2D). The lower right portion of the Figure 4 genome dot plot shows these prophages. The two Burkholderiaceae P2-like prophages show strong to moderate dot plot similarity with the authentic Burkholderia phages, but each of the twelve other prophages has weaker relationships with the other P2-like phages and prophages and can be thought of as founding a new cluster or subcluster.
The Ralstoniaceae, Burkholderiaceae, Alcaligenaceae, Oxalobacteraceae and Comamonadaceae Betaproteobacteria prophages exhibit syntenic inter-family similarities with each other, as well as weak similarity with the Alphaproteobacteria families Caulobacteraceae and Sphingomonadaceae and the Gammaproteobacteria Pseudomonadaceae/Xanthomonadaceae/Halomonadaceae group of phage families mentioned above (weak diagonal lines are visible in Figure 4, and examples of these inter-class relationships are shown in more detail in the dot plots in Figures S5 and S6). This may imply that the Alpha- and Betaproteobacteria P2-like phages expanded from a Pseudomonadaceae/Xanthomonadaceae/Halomonadaceae Gammaproteobacteria P2-like phage ancestor.
P2-like prophage abundance
To the extent that the bacterial genome sequences in the current database reflect the natural abundance of various bacterial taxa, it should be possible to estimate the natural frequency of occurrence of a particular prophage type by searching for diagnostic prophage genes. As the number of bacterial sequences grows this goal should be approached, although human pathogens are and will likely continue to be greatly overrepresented. We chose the genus Salmonella as a phylogenetically narrower host group to examine in more detail, since it has a large number of searchable genome sequences and because Salmonella strains are, unlike most other bacterial species, routinely serotyped. The latter enables the species to be separated into smaller related groups such as subspecies (Desai et al., 2013) and groups of isolates with similar phage receptor surface polysaccharides (see below). Salmonella isolates are known to carry numerous prophages. Schmieger and co-workers (Schicklmaier et al., 1998; Schmieger, 1999) found that 173 S. enterica serovar Typhimurium isolates released a minimum of 136 functional phages, and the commonly used laboratory Typhimurium isolate LT2 carries four intact, fully functional prophages (Yamamoto, 1967 & 1969; Yamada et al., 1986; Figueroa-Bossi and Bossi, 1999; McClelland et al., 2001). Many studies have also shown that prophage content accounts for a substantial part of the natural variation among Salmonella isolates (e.g., Reen et al., 2005; Cooke et al., 2007; Drahovska et al., 2007; Rychlik et al., 2008; Fricke et al., 2011; Moreno Switt et al., 2012; Pang et al., 2013; Bobay et al., 2014; Hiley et al., 2014; Switt et al., 2015).
We identified a panel of 3298 unique Salmonella complete and draft genome sequences (NCBI taxid 590; 3295 S. enterica and 3 S. bongori isolates) on July 31, 2015 that encoded a searchable annotated DnaK and/or ProA protein in a BLASTp search. The dnaK gene is unique and universally present as a single copy in Salmonella, and in the few cases where the dnaK gene was not annotated in a genome (presumably because it falls between reported contigs of a draft genome sequence) we confirmed the presence of a searchable Salmonella-like proA gene. These 3298 Salmonella isolates were examined for the presence of P2 supercluster prophages by the presence of genes encoding MCPs >50% identical to those of the authentic Salmonella P2-like phage Fels-2, >50% identical to phage P88, >50% identical to E. coli prophage EcoP2-KTE234-A, or >45% identical to phage ESSI-2; these four include two divergent MCPs from each of the two deepest Enterobacteriaceae branches in the P2-like MCP tree (Figure 3). These search cutoffs ensure that the database matches define non-overlapping sets of MCPs and thus different prophages. Manual examination of numerous matches below the above cutoffs did not identify any novel MCP types or additional Salmonella prophages. In addition, probing the Salmonella genomes with MCPs from more distantly related P2 supercluster phages with hosts outside the Enterobacteriaceae (HP1, Smp131, K139, ∅CTX and ∅MHaA-1) found no close Salmonella matches and did not identify any P2 supercluster prophages in addition to those found with the above MCP probes. We therefore believe this search likely identified the complete set of Salmonella P2 supercluster prophages with intact and annotated MCP genes; nonetheless, our numbers should be considered as minimum values.
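The probe-specific cutoffs described above can be expressed as a small assignment rule. The sketch below is an illustration under stated assumptions and not the authors' procedure: the function `assign_probe` is hypothetical, the candidate identities are invented, and the tie-breaking rule (highest identity wins when more than one probe passes) is an assumption. With the cutoffs quoted in the text, the source states that the matched sets do not overlap, so that rule would not normally be needed.

```python
# Probe-specific percent identity cutoffs quoted in the text
CUTOFFS = {
    "Fels-2": 50.0,
    "P88": 50.0,
    "EcoP2-KTE234-A": 50.0,
    "ESSI-2": 45.0,
}


def assign_probe(identities, cutoffs=CUTOFFS):
    """Assign one candidate MCP to a probe, or None if no cutoff is exceeded.

    `identities` maps probe name -> percent identity of the candidate to it.
    If several probes pass, the highest identity wins (an assumption).
    """
    passing = {
        probe: value
        for probe, value in identities.items()
        if probe in cutoffs and value > cutoffs[probe]
    }
    if not passing:
        return None
    return max(passing, key=passing.get)


# Invented candidates, for illustration only
candidates = {
    "mcp_1": {"Fels-2": 97.0, "P88": 48.0, "ESSI-2": 31.0},
    "mcp_2": {"Fels-2": 33.0, "ESSI-2": 88.0},
    "mcp_3": {"Fels-2": 41.0, "ESSI-2": 44.0},  # below every cutoff
}
assignments = {name: assign_probe(ids) for name, ids in candidates.items()}
print(assignments)
```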
Table S3 shows that the 3298 Salmonella genomes carry 4758 P2-like prophage matches that are scattered across 91 different Salmonella serotypes. Among these prophages 4312 have MCPs that indicate P2-like cluster membership, and 446 are ESSI-2-like (no close relatives of the Enterobacteriaceae EcoP2-KTE234-A prophage MCP branch in Figure 3 were found in Salmonella). Several observations emerge from this analysis. (i) Both P2 cluster and ESSI-2 cluster prophages are present in Salmonella. (ii) Diversity of MCP amino acid sequences implies substantial diversity within Salmonella hosts for both the P2- and ESSI-2-like clusters. (iii) P2-like prophages are not uniformly present within serotypes; for example, among sequenced genomes of serovar Typhimurium only 6.7% carry a prophage with a P2/Fels-2-like MCP and 0.7% carry ESSI-2-like prophages, while 36.9% and 99.0% of sequenced isolates of the closely related Heidelberg serovar carry Fels-2- and ESSI-2-like prophages, respectively. We note that the prophages that encode an ESSI-2-like MCP all have the HP1-like genome organization (see Figure 1). (iv) Salmonella strains, for example serovar Typhi strain STH2370 (Valenzuela et al., 2014), can harbor as many as three different P2-like prophages. (v) Finally, P2 supercluster prophages are very common in Salmonella, with an average of 1.44 such prophages per isolate in sequenced genomes.
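The per-isolate average quoted in point (v) follows directly from the counts given in this paragraph, as the short calculation below shows. All three input numbers are taken from the text; the variable names are chosen for the example.

```python
salmonella_genomes = 3298
p2_like_cluster_prophages = 4312
essi2_like_cluster_prophages = 446

total_p2_supercluster = p2_like_cluster_prophages + essi2_like_cluster_prophages
prophages_per_isolate = total_p2_supercluster / salmonella_genomes

print(total_p2_supercluster, round(prophages_per_isolate, 2))
```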
Phage P22-like prophage diversity
Defining the P22-like phage group
The phage lambda supercluster (sometimes called the “lambdoid” phages) is well known to be extremely diverse (see Hendrix and Casjens, 2006; Grose and Casjens, 2014), and here we focus on a single subgroup, the P22-like phages, within this large group. We chose this subgroup for further analysis as a contrast to the P2-like phages discussed above. The long contractile tailed P2-like phage group infects a very wide range of hosts, but little is known about how they adsorb to their susceptible hosts (Yamashita et al., 2011), while the short tailed P22-like phages, on the other hand, infect a much narrower range of hosts (below), and their adsorption is much better understood. The P22-like phages have a unique, diagnostic and universally syntenic cluster of twelve essential and three nonessential genes that encode all of the proteins necessary for virion assembly (reviewed by Casjens and Thuman-Commike, 2011). We previously separated phages that carry this type of virion assembly gene cluster into two clusters, the P22- and APSE-1-like phages. The latter are Hamiltonella defensa prophage-like entities that produce virions that have not yet been shown to be infectious (van der Wilk et al., 1999; Moran et al., 2005); they have virion assembly genes that are similar to those of the P22-like cluster, but because they have quite different early genes they were placed in a separate APSE-1-like phage cluster (Grose and Casjens, 2014). Figure S7 shows genome maps of these two phages and two other minor variations on the P22 organization that we are aware of. Because of their similar virion assembly genes and since our discussion concerns mainly the MCP and tailspike virion assembly genes, we consider them together here as the “P22 phage group”. There is considerable genetic mosaicism within this group of phages, and we have previously discussed this feature of their virion assembly gene clusters in some detail (Casjens and Thuman-Commike, 2011). 
We will not address all aspects of this mosaicism here, but focus on the diversity of P22-like MCPs, on the relationship of tailspike diversity to the host bacteria surface polysaccharides (tailspikes are the virion’s cell-adsorption proteins, see below), and on the host ranges and abundance of P22-like prophages. The number of very different protein sequence types encoded by each of the fifteen P22-like virion assembly genes ranges from one for the decoration protein (nonessential and the only member of this gene cluster that is present in only a subset of the P22-like phages) to a much larger number of tailspike protein types. There are currently 24 authentic phages in this group whose genomes have been completely sequenced (listed in Table S4A) and well over a thousand P22-like prophage sequences in the current bacterial sequence database (below). These are very large increases over the 12 phages and 45 prophages in this group whose virion assembly genes were compared previously (Casjens and Thuman-Commike, 2011).
Diversity and host range of the P22 phage group
Authentic P22 group phages are known that infect S. enterica, E. coli and Shigella flexneri (and Hamiltonella defensa if the APSE-1 phage-like entities mentioned above are included). Each of the studied members of this phage group adsorbs to a specific cell surface polysaccharide, so the individual phages have narrow host ranges that are restricted by the structure of these host polysaccharides. In order to learn about the breadth of bacterial species that phages in the P22 group can infect, we again used P22-like MCPs as probes to identify prophages in the bacterial genome sequence database (see above). These MCPs have been previously observed to be present as three major sequence types that are typified by phages P22, CUS-3 and Sf6 (Eppler et al., 1991; Casjens et al., 2004; King et al., 2007; Casjens and Thuman-Commike, 2011). P22 and CUS-3 MCPs are 28% identical in amino acid sequence, and these are both <20% identical to Sf6 MCP. In spite of these somewhat weak sequence similarities, the three MCP types appear to have diverged from a common ancestor because they all have the same polypeptide fold and harbor a unique I-domain insert relative to other phage MCPs (Parent et al., 2012 & 2014; Suhanovsky and Teschke, 2013; Rizzo et al., 2014). The database search for relatives of these three MCPs was completed on July 31, 2015.
More than thirteen hundred P22 group MCP matches were found, all of which were in Enterobacteriaceae host genomes. Manual examination of >150 of these prophage MCP genes (spanning all the host genera with MCP matches) showed that they all reside in P22-like prophages; no P22 group MCP genes have been found that do not lie in a P22-like prophage. As with P2 above, P22-like MCP genes are thus an excellent indicator of P22-like prophages. The 24 bacterial genera - 20 more than were known from the characterized authentic phages - that contain these P22-like prophages are listed in Table 3. These MCP matches to P22-like prophages are all ≥61% identical to one of the three probes, and the next best matches are <35% identical to any of the three probes (e.g., the Acinetobacter prophages discussed below). Examination of weak bacterial MCP matches and similar searches with two other proteins that are even more strongly conserved in the P22 group, portal protein and the N-terminal tailspike domain, did not identify any additional prophages with a novel P22-like MCP type. Thus this large set of MCPs forms three well-defined sequence types that are well separated from the very distant homologues encoded by other phage types (with the single exception of the lytic phage IME-EC2, above and below). In notable contrast to the P2 supercluster phages discussed above, P22-like prophages are limited to a single bacterial family, the Enterobacteriaceae, where we found P22-like phages in 24 of the 74 known genera (see the NCBI taxonomy browser for current bacterial taxa; http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi).
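The clear separation between members of the P22 group (best match ≥61% identical to one of the three probes) and the next-best matches (<35% identical to any probe) suggests a simple decision rule, sketched below. The function `classify_mcp` and the example identity values are hypothetical; only the two thresholds come from the text, and the "unresolved" category is added here for completeness because the source reports no matches in the gap.

```python
MEMBER_MIN = 61.0    # lowest identity among P22 group matches (from the text)
OUTSIDER_MAX = 35.0  # next-best matches fall below this value (from the text)


def classify_mcp(identities):
    """Classify a candidate MCP from its % identity to the three probes.

    `identities` maps probe name ("P22", "CUS-3", "Sf6") -> percent identity.
    """
    probe = max(identities, key=identities.get)
    best = identities[probe]
    if best >= MEMBER_MIN:
        return probe + "-type"
    if best < OUTSIDER_MAX:
        return "outside P22 group"
    return "unresolved"  # inside the gap; no such case is reported in the text


# Invented identity values, for illustration only
member = classify_mcp({"P22": 98.0, "CUS-3": 29.0, "Sf6": 18.0})
outsider = classify_mcp({"P22": 24.0, "CUS-3": 33.0, "Sf6": 17.0})
print(member, "|", outsider)
```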
Table 3.
Genus | P22¹ | Sf6¹ | CUS-3¹ |
---|---|---|---|
Arsenophonus | – | 81% | – |
Citrobacter | 70% | – | 87% |
Cronobacter | 67% | 86%² | – |
Enterobacter | 98% | – | 61% |
Erwinia | 75% | – | – |
Escherichia | 73% | 99% | 100%³ |
Hamiltonella | – | 80% | – |
Klebsiella | – | – | 95% |
Kluyvera | 83% | – | – |
Leclercia | – | – | 88%² |
Morganella | 64% | – | – |
Pantoea | 68% | – | – |
Pectobacterium | 69% | – | 82% |
Pluralibacter | 91% | – | – |
Proteus | 67% | – | – |
Providencia | 65%² | 83% | – |
Salmonella | 100%³ | 99% | 100% |
Serratia | 76% | 92% | – |
Shigella | 73% | 100%³ | – |
Shimwellia | – | 89% | – |
Sodalis | – | 80% | – |
Tatumella | 77% | – | – |
Xenorhabdus | 68% | – | – |
Yersinia | 71% | 87% | – |

¹ Best MCP hit in each genus is indicated with BLASTp percent sequence identity.
² “WP_” GenBank accession number entry with no strain number or access to corresponding genome sequence given.
³ Host genus of phage that encodes the search probe MCPs is shown in bold.
The three MCP types are not distributed evenly within the host genera for which there are a significant number of sequences available (Table 3). For example, Salmonella strains have been successfully lysogenized by the P22-like phages that encode each of the three MCP types, but prophages with MCPs in the P22 branch are by far the most abundant, and known Klebsiella P22-like prophages all encode a CUS-3 type MCP (Table 3). Although MCPs from the P22 branch (with the exception of the phage IME-10 MCP subtype; see Figure 5 below) are not present in Escherichia prophages, P22 itself infects E. coli apparently normally in the laboratory if its genome is delivered into an E. coli cell (Botstein and Herskowitz, 1974; Gordon et al., 1994). The latter observation suggests that the non-uniform prophage distributions could be due to random evolutionary genetic shuffling of mosaic sections not yet having by chance arrived at some MCP-tailspike combinations. However, since laboratory infections do not reflect long-term evolution, MCP type could confer a not yet understood genus-specific long-term advantage or disadvantage.
A panel of 30 prophages was chosen from the P22 group MCP database matches for more detailed examination (listed in Table S4B); the panel was chosen randomly except that it includes at least one match from each of the 20 host genera that harbor largely intact P22-like prophages. The neighbor-joining tree in Figure 5 shows this panel’s MCP diversity in relationship to the hosts. All prophage panel members in the P22 MCP branch (yellow box in figure) are ≥64% identical to P22 MCP, all CUS-3 branch members (pink box) are ≥61% identical to CUS-3 MCP, and all Sf6 branch members (blue box) are ≥80% identical to Sf6 MCP. Thus the three MCP branches remain well defined after the addition of the panel of prophages. Figure 6 shows a dot plot comparison of the genomes of 25 prophages from the panel (five of the 30 MCP genes reside on short sequence contigs and were not included in this comparison). Only six members of this prophage panel have strong dot plot diagonal similarity to a characterized authentic P22-like phage. Each of the remaining nineteen shows only weak to moderate diagonal similarity to the known authentic phages. The latter prophages are shown below the thick line that separates them from the P22- and APSE-1-like cluster members above. They have variable relationships to the other members of the prophage panel that range from a few nearly full genome-length, but segmented, moderate similarities (e.g., between the two Yersinia prophages, the two Pectobacterium prophages, the Serratia plymuthica AS9 - Enterobacter aerogenes UCI47 pair, or the Morganella morganii KT - Proteus mirabilis ATCC7002 - Providencia rettgeri DSM1311 trio) to a majority that have incomplete similarity diagonals (see expanded plots in Figure S8 for examples of these relationships). 
We note that members of three sets of prophages in the same genus, three Escherichia prophages and two each from Yersinia and Pectobacterium, are more like each other than they are like the other prophages and phages, while the two Serratia prophages are not similar to one another or to other prophages or phages. In general this supports our previous analysis of the Enterobacteriaceae authentic phages in which we found that most but not all temperate subclusters (groups of highly related phages) contain phages that infect a single host species (Grose and Casjens, 2014).
Figure S9 shows a dot plot of the MCPs of the same phages and prophages whose genomes are compared in Figure 6. The similarity patterns in these two plots are quite different. For example, the P22-like prophages in E. coli strains W, B7A and CVM-N38428PS are all rather closely related in overall DNA sequence (Figure 6), but they carry P22-, Sf6- and CUS-3-like MCPs, respectively. Thus, recombinational gene shuffling appears to be rather frequent within this phage group, so although phages can be accurately placed into the P22-like group by their MCP relationship with very few exceptions, whole genome comparison is necessary to determine their closer (e.g., subcluster) relationships. There appears to be considerably less MCP shuffling within the Enterobacteriaceae phage P2 group (cf. Figures 2 and S2 above) than in the P22 group, suggesting that the rates or extents of intragroup shuffling of mosaic sections are variable and are a unique property of each phage group or cluster. It does not seem useful to attempt to define clusters that have <50% genome-length diagonal homology in the continuum of relationships seen in Figures 6 and S8, but it is nonetheless clear that each host genus has some prophages that are quite different from those in the other genera, and the prophage panel demonstrates wide diversity within the relatively narrowly defined P22-like phage group.
Relationship of the P22-like phage group to other phages
Like most other Enterobacteriaceae-infecting tailed phage clusters (for example, the P2-like supercluster above and see Grose and Casjens, 2014), very few even moderately close P22-like MCP relatives are found outside the confines of the defined P22-like group. However, unlike the P2 supercluster for which we found no MCP matches outside the supercluster, the P22 group MCP matches include a small number of authentic phages that have significant (but not strong) similarity to the P22-like cluster phages. These are (i) E. coli lytic phage IME-EC2 that was discussed above, (ii) several prophages in genomes of the Acinetobacter bacterial genus (Moraxellaceae family, Gammaproteobacteria) and Novosphingobium (Alphaproteobacteria), and (iii) three authentic phage types that infect bacteria outside the Enterobacteriaceae family. The latter are typified by lytic short tailed phages PVA1 (Zhang et al., 2014; Accession No. KJ395778) and BA3 (Efrony et al., 2009; Accession No. EU124666), which infect Vibrio alginolyticus and Thalassomonas loyana (Vibrionaceae and Colwelliaceae Gammaproteobacteria families, respectively) and ∅RSK1 (Accession No. AB863625), which infects Ralstonia solanacearum (Ralstoniaceae family of the Betaproteobacteria). PVA1 and BA3 are apparently lytic phages from analysis of their genomes, making these possible examples of rare ancient hybridization between lytic and temperate phages (our unpublished bioinformatic analysis). Figure 5 shows that in a neighbor-joining tree the IME-EC2 MCP falls robustly within the CUS-3 branch of the P22-like MCPs, while the PVA1, BA3, ∅RSK1 and the Acinetobacter prophage MCPs define separate deep branches (a phylogenetic analysis with MEGA5 (Tamura et al., 2010) gave an essentially identical branching pattern, not shown). 
All of these “outsider” P22-lke MCPs are similar in size to those of the P22-like MCPs, but, except for IME-EC2, they are all sufficiently different in amino acid sequence to make it impossible to determine whether they carry an I-domain insert (above) without further structural information. Nonetheless, these phages are all sufficiently different from one another and from the P22-like phages to unambiguously represent separate phage clusters (not shown). It thus appears that P22 MCP (and associated procapsid assembly genes, not shown) has undergone horizontal exchange with other phage types several times, with the IME-EC2 event happening most recently.
P22 group prophage abundance and diversity in Salmonella
To make the analysis more manageable, we restricted further examination of P22 group prophages to the Salmonella genus. We searched the 3298 Salmonella genomes (above) for P22-, CUS-3- and Sf6-like MCP matches and identified 741 P22 group prophages. A parallel search with the highly conserved P22 N-terminal tailspike domain (see below) did not find any additional P22-like prophages with other MCP types, but did identify three additional tailspike-containing P22-like prophage fragments that lack an MCP gene. These were in serovar Miami strain 1923, Mbandaka strain CVM N51302 and subsp. houtenae strain ATCC BAA-1581. We therefore believe that it is unlikely that a significant number of Salmonella P22-like prophages were missed in this search. These 744 prophages are tabulated in Table S3. Over 100 of the Salmonella MCP matches were examined manually, and their genes all lie in regions with homologous synteny to the P22 virion assembly gene cluster. Some of these prophages are apparently truncated by natural deletion, but because many contigs in draft genomes are smaller than a whole prophage, it is impossible to determine the fraction that are in fact potentially fully functional prophages.
P22-like prophages were found in 22.6% of the 3298 Salmonella genomes and in 45 (49.5%) of the 91 serovars whose genomes have been sequenced (Table S3). This value for the fraction of all Salmonella genomes that harbor P22 group prophages is heavily weighted in favor of the seven serovars that have more than 100 genome sequences. As a preliminary estimate of the fraction of Salmonella cells that naturally harbor a P22-like prophage, it would perhaps be more useful at this point to note that the average of the fraction of genomes containing such prophages in each of the 91 serovars in Table S3 is 39%. P22-like prophages are present in Salmonella cells that carry twenty different O-antigen surface polysaccharides. Are P22-like phages that infect bacteria that have the same O-antigen more closely related to each other than they are to P22-like phages that infect hosts with different O-antigens? To examine this question and to display some of the genome diversity of the S. enterica P22-like phages, we randomly chose a panel of fifteen prophages from fifteen different Salmonella serovars that include eight O-antigen types (listed in Table S4C). A dot plot comparing these genomes is presented in Figure S10, and it does not show any strong clustering of more highly related prophages within the different O-antigen types. Our overall conclusion from this comparison is that diversity is considerable within O-antigen type hosts (diagonal dot plot lines are segmented rather than solid), and in most cases examined it appears to be as great as the diversity between O-antigen types.
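The gap between the pooled figure and the per-serovar average arises because heavily sequenced serovars dominate a pooled fraction. The sketch below illustrates this with only the four serovar counts quoted in this section (Typhi, Enteritidis, Paratyphi A and Heidelberg), so it shows the weighting effect and does not reproduce the 39% value, which is based on all 91 serovars in Table S3. The variable and function names are ours.

```python
# (genomes carrying a P22-like prophage, genomes sequenced) for the four
# serovars whose counts are quoted in the text; the other 87 serovars in
# Table S3 are deliberately left out of this illustration.
serovar_counts = {
    "Typhi": (0, 1755),
    "Enteritidis": (1, 318),
    "Paratyphi A": (130, 130),
    "Heidelberg": (103, 103),
}


def pooled_fraction(counts):
    """All genomes pooled: serovars with many genomes dominate."""
    carrying = sum(c for c, _ in counts.values())
    total = sum(n for _, n in counts.values())
    return carrying / total


def mean_of_fractions(counts):
    """Each serovar weighted equally, regardless of genome count."""
    fractions = [c / n for c, n in counts.values()]
    return sum(fractions) / len(fractions)


pooled = pooled_fraction(serovar_counts)
per_serovar = mean_of_fractions(serovar_counts)

# Pooled value for the full data set, from the totals given in the text.
overall = 744 / 3298

print(f"subset pooled: {pooled:.1%}, subset per-serovar mean: {per_serovar:.1%}")
print(f"all genomes pooled: {overall:.1%}")
```

In this subset the 1755 Typhi genomes, none of which carries a P22-like prophage, pull the pooled fraction far below the per-serovar mean, which is the same direction of bias noted for the full data set.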
We also examined P22-like prophage diversity within individual Salmonella serovars in more detail (serovars can perhaps be thought of as sub-lineages within the O-antigen types that have different flagellar antigens and/or metabolic differences). These prophages are not distributed uniformly across the 91 serovars. For example, among the serovars with many genome sequences, Typhi and Enteritidis are severely under-represented for P22-like prophages; they carry zero and one P22-like prophage out of 1755 and 318 genome sequences, respectively (and we believe that in the one putative Enteritidis case the O-antigen type was incorrectly determined; see below). The near uniformity in prophage content of these two serovars correlates well with the low overall genetic diversity present in these two monophyletic lineages (Pang et al., 2007; Holt et al., 2008; Betancor et al., 2009; Lan et al., 2009; Yim et al., 2010; Allard et al., 2013; Timme et al., 2013). On the other hand, nearly all members of some serovars carry P22-like prophages. For example, they are present in all 130 available serovar Paratyphi A genomes (Table S3). A dot plot in Figure S11 shows that nine randomly chosen Paratyphi A P22-like prophages (listed in Table S4D) are extremely similar to one another, which correlates with the low diversity and highly clonal (monophyletic) nature of this serovar (Zhou et al., 2014). In contrast, although all 103 serovar Heidelberg genome sequences (Table S3) carry a P22-like prophage, the dot plot in Figure S12 shows that P22-like prophages are present in at least three different sequence types in this highly clonal serovar (Hoffmann et al., 2014; these authors also noted that all the Heidelberg isolates they examined carried P22-like prophages). The genomes of some serovars have intermediate frequencies of P22-like prophages, for example Typhimurium (87% carry such prophages), Newport (21%), Bareilly (12%) and Agona (11%) (Table S3).
Among the latter four, we examined prophage panels from Typhimurium and Newport and found them to carry at least 10 and 5 different types of P22-like prophages, respectively (Figures S13 and S14). Newport, Agona and Bareilly are all polyphyletic serovars (Sangal et al., 2010; Achtman et al., 2012; Cao et al., 2013; Timme et al., 2013; Zhou et al., 2013) and Typhimurium is substantially more diverse than Typhi (Lan et al., 2009; Bell et al., 2011; Barco et al., 2015). Thus, there is a noticeable but imperfect correlation between uniformity of P22-like prophage contents and overall serovar genetic uniformity.
Clearly prophages are not absolutely constant within serovars or groups of serovars that have the same O-antigen, but their coming and going is not so rapid that it completely obscures this correlation, especially in the apparently monophyletic serovars. It is not possible to determine rigorously whether prophage similarities among independent isolates are due to common evolutionary descent from an ancestral lysogen or whether particular phages are very successful and have lysogenized virtually all descendants of a nonlysogen ancestor. However, we note that serovars that have the same surface polysaccharide, for example Typhimurium, Agona, Paratyphi B, Saint Paul and Heidelberg that all have the same O:4 O-antigen, can have very different levels of P22-like prophage occupancy (Table S3). It seems unlikely that such similar bacterial lineages that presumably occupy very similar niches would be able to differentially discriminate against establishment of lysogeny by P22-like phages, so it seems likely that prophage frequency of Salmonella serovars is a function of both linear descent and new infections, making the correlation between serovar and prophage type less than perfect.
Tailspike diversity and host ranges of the P22-like phages
During virion adsorption the tailspikes (the product of P22 gene 9) of P22-like virions that have been studied bind to the O-antigen polysaccharide or a capsular polysaccharide on the bacterial host’s surface (Israel et al., 1967; Lindberg et al., 1970). In addition to binding, the tailspikes have an enzymatic activity that either cleaves or de-acetylates the target polysaccharide, and this presumably allows an oriented descent of the virion to the bacterial surface (reviewed in Casjens and Molineux, 2012). These tailspikes have two polypeptide domains, an N-terminal ~120 amino acid domain that attaches the tailspike to the virion (Steinbacher et al., 1997) and a larger 500–600 amino acid C-terminal domain that carries the O-antigen binding and cleavage site (Steinbacher et al., 1996; Muller et al., 2008). X-ray structures are known for the whole tailspike or its C-terminal domain from three phages in the P22 group, P22, HK620 and Sf6 (Steinbacher et al., 1994; Steinbacher et al., 1997; Freiberg et al., 2003; Barbirz et al., 2008). All tailspike N-terminal domains are highly conserved and have ≥50% amino acid sequence identity, and although the C-terminal domain amino acid sequences of these three tailspikes are not recognizably similar, they have similar polypeptide folds. On the other hand, E. coli P22-like phage CUS-3 tailspike has a canonical P22-like N-terminal virion-attachment domain and a C-terminal domain that almost certainly has a different fold from the above tailspikes (see Parent et al., 2014).
The tailspikes of the P22-like phages are much more diverse than their MCPs, but they can be unambiguously identified by their highly conserved N-terminal virion attachment domains (Villafane et al., 2005; Casjens and Thuman-Commike, 2011). Among the extant P22-like phage and prophage sequences, we have anecdotally identified over eighty different C-terminal domains that are not recognizably similar in amino acid sequence to one another (not shown). However, in order to make a more detailed analysis tractable, in this report we examine the tailspikes encoded by the 24 authentic, completely sequenced, Salmonella-infecting P22-like phages listed in Table S4A, several incomplete P22-like phage genome sequences (e.g., phage SETP1; accession No. EF151184) and the 743 tailspike gene-containing P22 group prophages present in the above 3298 Salmonella genomes (Table S3). Tailspike diversity is comparable in the P22-like phages that infect E. coli, but Salmonella was chosen because O-antigen phage receptor type is routinely available for Salmonella isolates. Salmonella O-antigen polysaccharides fall into 46 different “O groups” as determined by their antigenicity, and they correspond to at least 50 different polysaccharide structures (Grimont and Weill, 2007; Liu et al., 2014). The O group constitutes a major part of Salmonella serotype determination, and different serovars within the same O group have different flagellar antigens and/or metabolic characteristics (Grimont and Weill, 2007).
How do the different P22-like phage tailspike sequences correspond to the many Salmonella O-antigen structures? The sequences of all 743 Salmonella P22-like prophage-encoded tailspikes and the 26 known authentic Salmonella P22-like phage-encoded tailspikes were examined. In order to simplify presentation, groups of nearly identical tailspikes were collapsed to one prototype, except that when very similar tailspikes are present in prophages in multiple serovars one representative from each serovar was included (these prophage tailspikes are listed with locus_tags in Table S4H). Figure 7 shows a neighbor-joining tree of these C-terminal tailspike domains. The 769 phages and prophages have eighteen very different tailspike C-terminal domains whose amino acid sequences are not recognizably similar to one another (marked by colored boxes in Figure 7). In essentially all cases one tailspike type dominates within each host O-antigen type. This correlation is as follows (compiled from Table S3 and the sequenced P22-like authentic phage tailspike genes): all 130 tailspikes from O:2 hosts are type I, eight of the eleven tailspikes from O:3,10 hosts are type III, 530 of 532 tailspikes from O:4 hosts are type I, 31 of the 32 tailspikes from O:6,7 hosts are type V, all 22 tailspikes from O:8 hosts are type VII, fifteen of sixteen tailspikes from O:9 hosts are type I, two of two tailspikes from O:13 hosts are type VIII, two of two tailspikes from O:16 hosts are type VI, all 21 tailspikes from O:18 hosts are type IV, and one of two tailspikes from O:30 hosts is type XII. Eight host groups, O:11, O:21, O:39 and O:40 as well as subspecies arizonae, salamae, houtenae and diarizonae, contain singleton prophages or phages with type XIV, X, XIII, IX, XV, XVI, XVII and XVIII tailspikes, respectively.
Experimental evidence from authentic phages with known hosts identifies tailspike types I, II, III, V, VII and XVIII as able to adsorb to specific O-antigens (Figure 7), and in each case the phage tailspike type is the same as the majority of prophages found in that O group. We therefore believe that for all the O groups the majority prophage tailspike type is almost certainly able to bind to that host group’s surface polysaccharide.
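The tallies listed above can be restated as a small table to show how strongly one tailspike type dominates each O group. The sketch below uses only the counts given in the text for the ten O groups that have more than one tailspike; the data structure and function names are ours.

```python
# O group -> (majority tailspike type, tailspikes of that type, all tailspikes)
tally = {
    "O:2": ("I", 130, 130),
    "O:3,10": ("III", 8, 11),
    "O:4": ("I", 530, 532),
    "O:6,7": ("V", 31, 32),
    "O:8": ("VII", 22, 22),
    "O:9": ("I", 15, 16),
    "O:13": ("VIII", 2, 2),
    "O:16": ("VI", 2, 2),
    "O:18": ("IV", 21, 21),
    "O:30": ("XII", 1, 2),
}


def majority_fraction(o_group):
    """Fraction of tailspikes in an O group that are of the majority type."""
    _, majority, total = tally[o_group]
    return majority / total


def minority_count(o_group):
    """Number of tailspikes in an O group that are not of the majority type."""
    _, majority, total = tally[o_group]
    return total - majority


total_minority = sum(minority_count(g) for g in tally)
groups_with_minority = [g for g in tally if minority_count(g) > 0]

for group, (ts_type, majority, total) in tally.items():
    print(f"{group:7s} type {ts_type:5s} {majority}/{total} "
          f"({majority_fraction(group):.1%})")
print("minority tailspikes:", total_minority)
```

Only five of these ten O groups contain any minority tailspikes, and the count of such tailspikes across the ten groups is small; these are the cases taken up in the following paragraphs.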
However, a small minority of prophages have tailspikes that are different from the others in their reported host O group (Figure 7 and Table S3). These few “anomalous” tailspikes reside in cells to which the original infecting phage (that integrated to become the tailspike-encoding prophage) should not have been able to adsorb. Several of these apparent exceptions can be understood when the O-antigen structures are considered in more detail. The O:2, O:4 and O:9 groups have the same mannose-rhamnose-galactose O-antigen backbone repeat and differ only by different single sugar side chains (paratose, abequose, and tyvelose, respectively; reviewed in Liu et al., 2014). Phage P22 with its type I tailspikes successfully infects Salmonella from each of these three O groups (Zinder and Lederberg, 1952; Eriksson et al., 1979; Bergthorsson and Roth, 2005), so its type I tailspike can accommodate the three different side groups (see Andres et al., 2013). The C-terminal domains of type I tailspikes have substantial intra-type diversity (Figure 7); however, we have not found a convincing correlation between type I tailspike subtypes and the different O-antigen side groups. Similarly, intra-tailspike type (subtype) sequence differences in types V and VII (Figure 7) range down to about 40% amino acid sequence identity in their C-terminal domains, and these differences are also not known to correlate with O-antigen differences. Thus it appears that tailspike C-terminal domain amino acid sequences can be as much as 60% different and still bind the same polysaccharide. We also note that serovar Schwarzengrund in the O:4 group carries a P22-like prophage with a unique type XI tailspike that is very different from the other O:4-specific tailspikes. 
This is very likely explained by the fact that although it has the same mannose(abequose)-rhamnose-galactose O:4 sugar repeat as, for example serovars Typhimurium and Heidelberg, the Schwarzengrund O-antigen has a different linkage between the backbone tri-saccharide repeats even though it is antigenically in the O:4 group (Wang et al., 2002). This backbone difference likely translates to a structural difference that is sufficient to require a different tailspike type.
The fact that phages with both type II and III tailspikes appear to infect O:3,10 hosts (Figure 7) can be understood as follows: Phage epsilon 15 prophages express several functions, including a polysaccharide β polymerase, that alter the host’s O:3,10 O-antigen structure so that its normal mannose-rhamnose-acetylgalactose repeat units connected by α1→6 linkages become mannose-rhamnose-galactose repeats connected by β1→6 linkages (summarized in Kropinski et al., 2007). Phage epsilon 34 is a P22-like phage with a type II tailspike that only infects O:3,10 strains that harbor an epsilon 15 prophage (Uetake et al., 1955; Barksdale, 1959). The P22-like prophage that has this type of tailspike resides in serovar Anatum (O:3,10) strain USDA-ARS-USMARC-1735 (Accession No. CP007584), and this strain also carries an epsilon 15-like prophage with an apparently intact polysaccharide β polymerase gene (our unpublished observation). On the other hand, the Anatum strain CDC 06-0532 (Accession No. CP007211) carries a P22-like prophage with a type III tailspike and does not have an epsilon 15 prophage (our unpublished observation). P22-like phage g341 has a type III tailspike that is specific for unmodified O:3,10 O-antigen (Iwashita and Kanegasaki, 1976b). Thus, the specificity differences between type II and type III tailspikes can be understood in terms of this phage epsilon 15-mediated O-antigen structural difference. We also note that the phage g341 tailspike does not cleave the O-antigen like the other P22-like tailspikes that have been studied (those of P22, epsilon 34, Sf6 and HK620; Iwashita and Kanegasaki, 1975, 1976a; Chua et al., 1999; Zayas and Villafane, 2007; Barbirz et al., 2008), but is instead a de-acetylase that removes the acetyl moiety from the acetyl-galactose of the O:3,10 polysaccharide (Iwashita and Kanegasaki, 1976b). Their high sequence similarities strongly suggest that all of the type III tailspikes are such de-acetylases.
But what of the other six cases where the prophage tailspike does not correlate with the O group (blue text names in Figure 7; red text in Table S3)? These prophages that reside in individual isolates of serovars Give (O-antigen group O:3,10; strain CFSAN004343), Muenster (O:3,10; str. 0315), Aqua (O:30; str. NVSL2001), Montevideo (O:6,7; str. CFSAN004346), Enteritidis (O:9; str. 3402), and Typhimurium (O:4; str. FORC_015) are not so simply understood (see Table S4H for tailspike locus_tags). It appears that either (i) the serotypes of these Salmonella strains were incorrectly determined, (ii) the prophages arrived in their current host by some mechanism other than normal virion adsorption such as uptake of free phage DNA, prophage transfer by bacterial mating or transduction by some other type of phage, or (iii) the lineage in which the prophage currently resides horizontally acquired new O-antigen synthesis genes or changed its tailspike gene after the prophage integrated.
An analysis of the O-antigen synthesis gene cluster (rfb gene region) of these strains showed that in all six of these cases, these host genomes contain rfb genes predicted by the phage tailspike rather than by the reported serotype (see Figure S15). Serovar “Give” strain CFSAN004343 contains the O:4 rfb region rather than the reported O:3,10 region, “Muenster” strain CFSAN004344 contains the O:6,7 region rather than the O:3,10 rfb region, “Aqua” strain NVSL2001 contains the O:4 rfb region rather than the O:30 region, “Montevideo” strain CFSAN004346 contains the O:9 region rather than the O:6,7 rfb region, “Enteritidis” strain 3402 contains the O:3,10 region rather than the O:9 rfb region and “Typhimurium” strain FORC_015 contains the O:6,7 region rather than the O:4 rfb region. These six strains were almost certainly incorrectly serotyped, and their P22-like prophage tailspikes are after all not anomalous. Thus, all 743 of the Salmonella P22-like prophage-encoded tailspikes reside in cells that have surface O-antigens to which they should be able to bind, and prophage tailspike type predicts the O-antigen structure (or a set of related structures in the O:2/O:4/O:9 case) of a Salmonella P22-like phage lysogen with 100% accuracy in our large panel of strains.
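The logic of this check can be sketched as a lookup that asks, for each strain, whether the reported O group and the O group indicated by its rfb region would be expected to carry different tailspike types. The mapping below uses the majority tailspike types listed earlier in this section, and the strain records restate the six cases above. Treating the majority type as the "expected" type is a simplification introduced here for illustration (it ignores, for example, the type II and type XI tailspikes), and all names are ours.

```python
# Majority tailspike type per O group, as listed earlier in this section.
EXPECTED_TYPE = {
    "O:2": "I", "O:3,10": "III", "O:4": "I", "O:6,7": "V", "O:8": "VII",
    "O:9": "I", "O:13": "VIII", "O:16": "VI", "O:18": "IV", "O:30": "XII",
}

# (serovar label, reported O group, O group of the rfb region in the genome)
strains = [
    ("Give", "O:3,10", "O:4"),
    ("Muenster", "O:3,10", "O:6,7"),
    ("Aqua", "O:30", "O:4"),
    ("Montevideo", "O:6,7", "O:9"),
    ("Enteritidis", "O:9", "O:3,10"),
    ("Typhimurium", "O:4", "O:6,7"),
]


def tailspike_distinguishes(reported, rfb):
    """True if the two O groups are expected to carry different tailspike
    types, i.e. a tailspike could reveal a serotyping discrepancy."""
    return EXPECTED_TYPE[reported] != EXPECTED_TYPE[rfb]


flagged = [name for name, reported, rfb in strains
           if tailspike_distinguishes(reported, rfb)]
print("strains whose tailspike would conflict with the reported serotype:")
print(", ".join(flagged))

# O:2, O:4 and O:9 share the type I tailspike, so a mix-up among these
# three groups would not be visible from the tailspike type alone.
print(tailspike_distinguishes("O:4", "O:9"))
```

The last line reflects the caveat stated in the text: for the O:2/O:4/O:9 set, tailspike type predicts a set of related structures and not a single O group.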
During this analysis we noticed that several P22-like prophages in Salmonella serovar Newport strains CVM4176, CVM19470, CVM19536, CVN4176, SH111077, Shangdon_3 and Henan_3 have a second tailspike gene immediately downstream of the canonical one whose C-terminal domain is about 50% identical to the upstream one. The upstream genes’ N-terminal virion-attachment domains are 96% identical to the parallel phage P22 tailspike domain; however, the downstream genes do not have a homologous N-terminal domain, suggesting that the two adjacent tailspike genes do not encode alternative proteins that attach to the same virion site. Perhaps the protein encoded by the downstream gene attaches to the virion differently from the canonical upstream tailspike. This arrangement is unique among the known P22-like phages and prophages, and it could indicate that these virions have two different tailspikes analogous to Salmonella T7-like phages SP6 and K1–5 (Dobbins et al., 2004; Scholl et al., 2004; Leiman et al., 2007).
Finally, we also note that the subtype Va, Vb and Vc C-terminal tailspike domains (Figure 7) are nearly identical to one another in their N-terminal regions but abruptly change to only 25–55% identity across their C-terminal 340 amino acids. This defines a new mosaic boundary inside these P22-like tailspike genes that has not been seen previously (Figure S16). Since the structures of the type V proteins are not known, we do not yet know if this mosaic boundary coincides with a protein domain boundary as do other such intra-gene boundaries in the P22-like virion assembly genes (Casjens and Thuman-Commike, 2011).
Total tailed phage prophages in Salmonella genomes
Our analysis of the authentic Enterobacteriaceae tailed phages has delineated 26 temperate tailed phage clusters, 24 of which have MCPs that are only very distantly related to all the others. In addition, we identified several “exceptional” MCP types of phages CUS-3, SfV, mEp235, ∅V10 and StyA that are <50% identical to the others in their cluster. We probed the panel of 3298 Salmonella genomes (above) with each of the 30 MCP types and compiled the number of >50% identity matches to each MCP type. Numerous spot checks of the resulting MCP matches showed that the prophages tested are indeed in the expected phage cluster. Table 4 shows that this search identified 9371 prophages from 20 of the 26 currently known Enterobacteriaceae temperate phage clusters. There are on average 2.84 prophages with MCPs of known type present per Salmonella genome. This total is a minimum value since it does not include prophages of as yet unidentified phage types, more divergent members of known types, or prophages whose MCP genes have been deleted or are not annotated. Again, these prophage types are scattered among the Salmonella serovars with a few obvious exceptions as follows: all thirteen ENT47970-like MCPs are encoded by prophages in serovar Newport isolates, and all but one of the 128 SSU5 prophage plasmids and a large majority of the SEN34-like prophages are present in serovar Typhi isolates (the latter were first identified as prophage Sti4b in Typhi strain CT18; Parkhill et al., 2001; Casjens, 2003).
Table 4.
Supercluster | Phage cluster¹ | Prophages² |
---|---|---|
P2 | Fels-2 (P2)³ | 4197 |
P2 | P88⁶ | 115 |
P2 | ESSI-2 | 446 |
Lambda | Lambda⁴/∅80⁴/N15⁴,⁵ | 607 |
Lambda | PY54⁵,⁶ | 4 |
Lambda | HK97 | 0 |
Lambda | ES-18 | 99 |
Lambda | Gifsy-2 | 559 |
Lambda | BP-4795 | 7 |
Lambda | SfI | 483 |
Lambda | SfV⁶ | 10 |
Lambda | P22 | 729 |
Lambda | CUS-3⁶ | 7 |
Lambda | APSE-1 (Sf6⁶) | 5 |
Lambda | HK639 | 2 |
Lambda | ∅ES15 | 1 |
Lambda | HS2 | 0 |
Lambda | ENT47970 | 13 |
Lambda | ZF40 | 0 |
Lambda | ∅ET88 | 0 |
Lambda | mEp235⁶ | 41 |
Lambda | 933W | 17 |
Lambda | SEN34 | 1806 |
ε15 | ε15 | 4 |
ε15 | ∅V10⁶ | 11 |
P1 | P1²,⁴ | 67 |
P1 | StyA⁵,⁶,⁷ | 6 |
Mu | Mu | 7 |
SSU5 | SSU5⁵ | 128 |
GF-2 | GF-2 | 0 |
PEp14 | PEp14 | 0 |
Total prophages | | 9371 |
¹ Clusters are indicated by the prototype phage for each cluster (Grose and Casjens, 2014 and our unpublished analyses).
² Prophages were identified by the presence of a gene that encodes an MCP >50% identical to that of the listed phages.
³ The Salmonella phage Fels-2 MCP was used from the P2-like cluster.
⁴ These three clusters have similar MCPs and differ in other parts of the genome, especially the early regions.
⁵ These are prophage plasmids that are not integrated into the host chromosome.
⁶ Subtype of MCP that is >50% different from prototypical members of the cluster immediately above it in the list. Note that the P22-like phage Sf6 has an MCP that is similar to that of the APSE-1-like cluster.
⁷ StyA prophage is a plasmid prophage in S. enterica serovar Typhimurium str. 34502; accession No. AUQT01000011.
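The totals quoted in the text can be cross-checked directly against the counts in Table 4. The sketch below simply re-enters the table values and recomputes the supercluster subtotals, the grand total and the mean number of prophages per genome; the dictionary layout and variable names are ours.

```python
GENOMES = 3298  # Salmonella genomes searched

# Supercluster -> {MCP type (prototype phage): prophage count}, from Table 4.
table_4 = {
    "P2": {"Fels-2 (P2)": 4197, "P88": 115, "ESSI-2": 446},
    "Lambda": {
        "Lambda/∅80/N15": 607, "PY54": 4, "HK97": 0, "ES-18": 99,
        "Gifsy-2": 559, "BP-4795": 7, "SfI": 483, "SfV": 10, "P22": 729,
        "CUS-3": 7, "APSE-1 (Sf6)": 5, "HK639": 2, "∅ES15": 1, "HS2": 0,
        "ENT47970": 13, "ZF40": 0, "∅ET88": 0, "mEp235": 41, "933W": 17,
        "SEN34": 1806,
    },
    "ε15": {"ε15": 4, "∅V10": 11},
    "P1": {"P1": 67, "StyA": 6},
    "Mu": {"Mu": 7},
    "SSU5": {"SSU5": 128},
    "GF-2": {"GF-2": 0},
    "PEp14": {"PEp14": 0},
}

subtotals = {name: sum(rows.values()) for name, rows in table_4.items()}
total = sum(subtotals.values())
mean_per_genome = total / GENOMES

# P22 group prophages identified by MCP: P22-, CUS-3- and Sf6-like MCP types.
lam = table_4["Lambda"]
p22_group = lam["P22"] + lam["CUS-3"] + lam["APSE-1 (Sf6)"]

print("total prophages:", total)
print("P2 supercluster:", subtotals["P2"])
print("P22 group (by MCP):", p22_group)
print(f"mean per genome: {mean_per_genome:.2f}")
```

The P22 group value recomputed here is the MCP-based count; the 744 figure used elsewhere also includes the three prophage fragments that carry a tailspike gene but lack an MCP gene.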
Conclusions
We find that tailed phages only extremely rarely exchange MCP genes, so temperate tailed phage MCP sequence type is diagnostic of phage type (above). These observations allow the use of temperate phage MCP probes to quite accurately identify prophages of known type in bacterial genome sequences. The number of bacterial genome sequences available for analysis is growing rapidly, and this database has become large enough, especially for human pathogens, that in some cases it should begin to serve as a valid indicator for the extant breadth of prophage presence and genetic diversity in the natural world.
The number of temperate phage MCP matches in the bacterial genome database is too enormous and diverse for any comprehensive and detailed analysis. We have therefore chosen the approach of studying particular temperate phage types and bacterial hosts as informative examples, rather than performing a comprehensive analysis of all phages and bacteria. We analyzed the current sequence database for prophages that belong to the phage P2 and P22 groups. These are two of the best-studied “model system” temperate phages, and the wealth of information available on them serves to make such analyses much more robust and informative. Since prophages should be found in the genomes of the hosts they naturally infect, the presence of a particular prophage type in a bacterial genome sequence is a good indication that phages of that type naturally infect that bacterial species. We found P22-like prophages are present in 24 bacterial genera, 20 more than was previously known, and that they are limited to the Enterobacteriaceae bacterial family. On the other hand P2 supercluster prophages are present in 127 genera, 114 more than the previously known 13 host genera. These 127 genera reside in 32 families in four of the five Proteobacteria classes, but no P2-like prophages were found outside the Proteobacteria phylum. These two phage types clearly infect very different taxonomic ranges of bacterial hosts, and the P2 group has been more successful at expanding (or has had more time to expand?) its host repertoire than has the P22 group. The notion that temperate phages from the same cluster that infect the same host genus (or small group of closely related genera) usually have closer overall relationships with each other than they do to phages of the same type that infect other genera is strongly supported for temperate phages in this study.
We focused on these prophages in the Salmonella host genus because of the large number of genome sequences and because the wealth of serovar information for this genus allows it to be divided into a number of often, but not always, highly clonal lineages. We identified 744 P22-like prophages and 4758 P2-like prophages in the 3298 Salmonella genomes available at the time of our study. In both cases the prophages are still too many to examine all of them in detail, so we used analysis of semi-randomly chosen panels of prophages to demonstrate that these prophages greatly increase the known diversity within their phage group, but do not increase the diversity to the point that these phage clusters begin to lose their individuality. Prophages of both P22 and P2 phage types are nonrandomly distributed across the various Salmonella serovars whose genomes have been sequenced, but the degree of P22-like prophage diversity within a serovar generally (with some exceptions) correlates with overall diversity or lack of clonality of each serovar. Finally, we found a total of 9371 prophages in these Salmonella genomes that correspond to 20 of the 26 known Enterobacteriaceae temperate phage clusters. We suspect from this analysis of the Enterobacteriaceae phages that few temperate tailed phage types (clusters) will be found that are restricted to one or a small number of very closely related host genera, but at present there is very little information on whether restriction to one host family like the P22 group or expansion to many host families like the P2 group is the more common situation.
Our analysis of the P22 phage group found that the sequence type of their receptor-binding tailspike proteins correlates perfectly with the structure of the host’s receptor surface polysaccharide. Every one of 743 Salmonella P22-like prophages encodes a tailspike that is predicted to bind to the O-antigen polysaccharide that its cell displays on its surface, so prophage-encoded tailspike type is an excellent predictor of surface O-antigen structure. In Salmonella, eighteen different types of P22-like phage tailspike receptor-binding domains that are not recognizably similar to one another in amino acid sequence are present in the current sequence database. These domains are being exchanged within the P22-like phages as well as with other phage types; for example, the phage P22 type I O:2/O:4/O:9-specific tailspike has homologues in P22-like prophages with Sf6 and CUS-3 type MCPs, as well as in the very different Typhimurium-infecting lytic phages with long non-contractile tails (phage 9NA; Andres et al., 2012; Casjens et al., 2014) and long contractile tails (Det7; Walter et al., 2008; Casjens et al., 2015). These are examples of horizontal exchange events of a single protein domain that do not significantly disrupt the overall relationships among the participating phage types. These different tailspike domains all have or are predicted to have substantial beta secondary structure (not shown), but they have at least two very different polypeptide folds. It seems likely that phages or prophages with this kind of tailspike will eventually be found that infect each of the over 50 different Salmonella surface polysaccharides as well as the polysaccharides present on other Enterobacteriaceae cells. These proteins or phages will be of technical use in identifying and detecting cells that display these different polysaccharides (see Handa et al., 2008; Thouand et al., 2008; Singh et al., 2012 & 2013; Miletic et al., 2015).
Taken together, our studies point out the large contribution prophages can make to our knowledge of the enormous abundance, ubiquity and diversity that is present even within individual tailed phage types or clusters. They also support the idea that, in spite of their great diversity and of rare examples of fairly recent horizontal genetic exchange of genes between phages of different types, temperate phage MCP genes move (as genes or whole phages) between host taxa only rarely. If such events do occur, they do little to obscure the fact that tailed phages exist as groups of genetically similar (although mosaically related) phages that infect related hosts rather than as a smooth genetic continuum with hosts of highly related phages scattered among diverse taxa. On one hand, the large number of P22-like tailspike types and the fact that they have undergone a number of horizontal exchange events (more types and more exchanges than the other virion structural genes) suggest that it can be especially advantageous for a phage to acquire a new tailspike that gives it the ability to adsorb to a new, presumably naive host. But on the other hand, we suspect that temperate phages are genetically fine-tuned to succeed in co-existence with a relatively small range of related hosts, so it may be more difficult for them than for a lytic phage to survive after horizontal acquisition of a new tailspike gene that allows adsorption to a host that is not closely related to their current host.
Materials and methods
Prophage sequences
Prophage and MCP sequences were manually extracted from reported bacterial genome sequences. Since precision of extraction of prophage sequences would not greatly affect our analysis or conclusions, and in order to make the task manageable, we did not attempt to make the ends of the extracted prophage sequences correspond exactly to predicted attachment site sequences. We always included the most distal phage-related genes in the prophage sequences; in a majority of cases these were the terminal genes in one of the known P2-like or P22-like phages. Thus, our prophage sequences in general have ends that are located within a few kbp of the true ends of the integrated phage DNA.
Computer methods
Comparisons of phage genomes were made using the matrix dot plot programs Gepard (Krumsiek et al., 2007) and DNA Strider (Douglas, 1994). Neighbor-joining trees were created with ClustalX (Larkin et al., 2007), and protein family analysis and map construction were done with Phamerator (Cresawn et al., 2011); trees and maps thus created were modified in Adobe Illustrator using reported annotations and supplementary BLASTp (Altschul et al., 1997) analysis.
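The >50% MCP identity criterion used to assign prophages to phage types could be applied to tabular BLASTp output along the following lines. This is a minimal sketch and not the procedure actually used in this study: it assumes BLAST tabular output (`-outfmt 6`, in which the first three columns are query ID, subject ID and percent identity) with MCP probes as queries, it assigns each matched bacterial protein to the probe with the highest identity (our assumption), and the sequence identifiers and hit values in the example are hypothetical. No alignment-coverage filter is applied because none is specified in the text.

```python
from collections import Counter


def assign_mcp_types(blast_lines, min_identity=50.0):
    """Map each subject protein to its best-matching MCP probe.

    blast_lines: iterable of tab-separated BLAST outfmt 6 rows.
    Only hits with percent identity strictly above min_identity are kept;
    for each subject, the probe with the highest identity wins.
    Returns {subject_id: (probe_id, percent_identity)}.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        probe, subject, pident = fields[0], fields[1], float(fields[2])
        if pident <= min_identity:
            continue
        if subject not in best or pident > best[subject][1]:
            best[subject] = (probe, pident)
    return best


def count_by_type(assignments):
    """Number of matched subject proteins per MCP probe type."""
    return Counter(probe for probe, _ in assignments.values())


# Hypothetical hits (IDs and values invented for illustration only).
example_hits = [
    "P22_MCP\tgenomeA_0001\t98.1\t430\t8\t0\t1\t430\t1\t430\t0.0\t870",
    "CUS3_MCP\tgenomeA_0001\t41.0\t400\t236\t3\t5\t404\t7\t406\t1e-90\t280",
    "Sf6_MCP\tgenomeB_0007\t87.5\t423\t53\t0\t1\t423\t1\t423\t0.0\t760",
    "P22_MCP\tgenomeB_0007\t52.0\t410\t197\t2\t3\t412\t5\t414\t1e-120\t360",
    "P22_MCP\tgenomeC_0042\t35.2\t390\t253\t4\t10\t399\t12\t401\t1e-60\t200",
]

assignments = assign_mcp_types(example_hits)
counts = count_by_type(assignments)
print(dict(counts))
```

Keeping only the best probe per protein avoids counting one prophage gene under two MCP types when it exceeds the threshold for more than one probe.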
Supplementary Material
Research Highlights.
9371 tailed phage prophages were found in 3298 Salmonella genome sequences.
The 4758 P2-like and 744 P22-like Salmonella prophages are very diverse.
P2- and P22-like prophages were found in 127 and 24 host genera, respectively.
A tailed phage’s major capsid protein can predict temperate or lytic lifestyle.
P22 prophage tailspike sequence predicts the O-antigen type of its host.
Acknowledgments
This work was supported by NIH grant R01-GM114817 (SRC) and the College of Life Sciences and the Department of Microbiology and Molecular Biology of Brigham Young University (JHG).
Footnotes
Supplementary information associated with this article can be found in the online version at this journal’s website.
References
- Achtman M, Wain J, Weill FX, Nair S, Zhou Z, Sangal V, Krauland MG, Hale JL, Harbottle H, Uesbeck A, Dougan G, Harrison LH, Brisse S, Group SEMS. Multilocus sequence typing as a replacement for serotyping in Salmonella enterica. PLoS Pathog. 2012;8:e1002776. doi: 10.1371/journal.ppat.1002776.
- Allard MW, Luo Y, Strain E, Pettengill J, Timme R, Wang C, Li C, Keys CE, Zheng J, Stones R, Wilson MR, Musser SM, Brown EW. On the evolutionary history, population genetics and diversity among isolates of Salmonella Enteritidis PFGE pattern JEGX01.0004. PLoS One. 2013;8:e55254. doi: 10.1371/journal.pone.0055254.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402. doi: 10.1093/nar/25.17.3389.
- Andres D, Roske Y, Doering C, Heinemann U, Seckler R, Barbirz S. Tail morphology controls DNA release in two Salmonella phages with one lipopolysaccharide receptor recognition system. Mol Microbiol. 2012;83:1244–53. doi: 10.1111/j.1365-2958.2012.08006.x.
- Andres D, Gohlke U, Broeker NK, Schulze S, Rabsch W, Heinemann U, Barbirz S, Seckler R. An essential serotype recognition pocket on phage P22 tailspike protein forces Salmonella enterica serovar Paratyphi A O-antigen fragments to bind as nonsolution conformers. Glycobiology. 2013;23:486–94. doi: 10.1093/glycob/cws224.
- Barbirz S, Muller JJ, Uetrecht C, Clark AJ, Heinemann U, Seckler R. Crystal structure of Escherichia coli phage HK620 tailspike: podoviral tailspike endoglycosidase modules are evolutionarily related. Mol Microbiol. 2008;69:303–16. doi: 10.1111/j.1365-2958.2008.06311.x.
- Barco L, Barrucci F, Cortini E, Ramon E, Olsen JE, Luzzi I, Lettini AA, Ricci A. Ascertaining the relationship between Salmonella Typhimurium and Salmonella 4,[5],12:i:- by MLVA and inferring the sources of human salmonellosis due to the two serovars in Italy. Front Microbiol. 2015;6:301. doi: 10.3389/fmicb.2015.00301.
- Barksdale L. I. Lysogenic Conversions in Bacteria. Bacteriol Rev. 1959;23:202–12. doi: 10.1128/br.23.4.202-212.1959.
- Beilstein F, Dreiseikelmann B. Temperate bacteriophage φO18P from an Aeromonas media isolate: characterization and complete genome sequence. Virology. 2008;373:25–9. doi: 10.1016/j.virol.2007.11.016.
- Bell RL, Gonzalez-Escalona N, Stones R, Brown EW. Phylogenetic evaluation of the ‘Typhimurium’ complex of Salmonella strains using a seven-gene multi-locus sequence analysis. Infect Genet Evol. 2011;11:83–91. doi: 10.1016/j.meegid.2010.10.005.
- Bergh O, Borsheim KY, Bratbak G, Heldal M. High abundance of viruses found in aquatic environments. Nature. 1989;340:467–8. doi: 10.1038/340467a0.
- Bergthorsson U, Roth JR. Natural isolates of Salmonella enterica serovar Dublin carry a single nadA missense mutation. J Bacteriol. 2005;187:400–3. doi: 10.1128/JB.187.1.400-403.2005.
- Bertani E, Six E. The P2-like phages and their parasite, P4. In: Calendar R, editor. The Bacteriophages, volume 2. Oxford Press; New York City, N.Y: 1988. pp. 73–143.
- Betancor L, Yim L, Fookes M, Martinez A, Thomson NR, Ivens A, Peters S, Bryant C, Algorta G, Kariuki S, Schelotto F, Maskell D, Dougan G, Chabalgoity JA. Genomic and phenotypic variation in epidemic-spanning Salmonella enterica serovar Enteritidis isolates. BMC Microbiol. 2009;9:237. doi: 10.1186/1471-2180-9-237.
- Bobay LM, Touchon M, Rocha EP. Pervasive domestication of defective prophages by bacteria. Proc Natl Acad Sci U S A. 2014;111:12127–32. doi: 10.1073/pnas.1405336111.
- Botstein D, Herskowitz I. Properties of hybrids between Salmonella phage P22 and coliphage lambda. Nature. 1974;251:584–9. doi: 10.1038/251584a0.
- Breitbart M, Salamon P, Andresen B, Mahaffy JM, Segall AM, Mead D, Azam F, Rohwer F. Genomic analysis of uncultured marine viral communities. Proc Natl Acad Sci U S A. 2002;99:14250–5. doi: 10.1073/pnas.202488399.
- Brum JR, Ignacio-Espinoza JC, Roux S, Doulcier G, Acinas SG, Alberti A, Chaffron S, Cruaud C, de Vargas C, Gasol JM, Gorsky G, Gregory AC, Guidi L, Hingamp P, Iudicone D, Not F, Ogata H, Pesant S, Poulos BT, Schwenck SM, Speich S, Dimier C, Kandels-Lewis S, Picheral M, Searson S, Tara Oceans C, Bork P, Bowler C, Sunagawa S, Wincker P, Karsenti E, Sullivan MB. Ocean plankton. Patterns and ecological drivers of ocean viral communities. Science. 2015;348:1261498. doi: 10.1126/science.1261498.
- Brussaard CP, Wilhelm SW, Thingstad F, Weinbauer MG, Bratbak G, Heldal M, Kimmance SA, Middelboe M, Nagasaki K, Paul JH, Schroeder DC, Suttle CA, Vaque D, Wommack KE. Global-scale processes with a nanoscale drive: the role of marine viruses. ISME J. 2008;2:575–8. doi: 10.1038/ismej.2008.31.
- Campoy S, Aranda J, Alvarez G, Barbe J, Llagostera M. Isolation and sequencing of a temperate transducing phage for Pasteurella multocida. Appl Environ Microbiol. 2006;72:3154–60. doi: 10.1128/AEM.72.5.3154-3160.2006.
- Canchaya C, Proux C, Fournous G, Bruttin A, Brussow H. Prophage genomics. Microbiol Mol Biol Rev. 2003;67:238–76. doi: 10.1128/MMBR.67.2.238-276.2003.
- Cao G, Meng J, Strain E, Stones R, Pettengill J, Zhao S, McDermott P, Brown E, Allard M. Phylogenetics and differentiation of Salmonella Newport lineages by whole genome sequencing. PLoS One. 2013;8:e55687. doi: 10.1371/journal.pone.0055687.
- Casjens S. Prophages and bacterial genomics: what have we learned so far? Mol Microbiol. 2003;49:277–300. doi: 10.1046/j.1365-2958.2003.03580.x.
- Casjens S, Winn-Stapley D, Gilcrease E, Moreno R, Kühlewein C, Chua JE, Manning PA, Inwood W, Clark AJ. The chromosome of Shigella flexneri bacteriophage Sf6: complete nucleotide sequence, genetic mosaicism, and DNA packaging. J Mol Biol. 2004;339:379–394. doi: 10.1016/j.jmb.2004.03.068.
- Casjens SR, Thuman-Commike PA. Evolution of mosaically related tailed bacteriophage genomes seen through the lens of phage P22 virion assembly. Virology. 2011;411:393–415. doi: 10.1016/j.virol.2010.12.046.
- Casjens SR, Molineux IJ. Short noncontractile tail machines: adsorption and DNA delivery by podoviruses. Adv Exp Med Biol. 2012;726:143–79. doi: 10.1007/978-1-4614-0980-9_7.
- Casjens SR, Leavitt JC, Hatfull GF, Hendrix RW. Genome sequence of Salmonella phage 9NA. Genome Announc. 2014;2:e00531–14. doi: 10.1128/genomeA.00531-14.
- Casjens SR, Jacobs-Sera D, Hatfull GF, Hendrix RW. Genome sequence of Salmonella enterica phage Det7. Genome Announc. 2015;3:e00279–15. doi: 10.1128/genomeA.00279-15.
- Christie G, Calendar R. Bacteriophage P2. Bacteriophage. 2016;6:e1145782. doi: 10.1080/21597081.2016.1145782.
- Chua JE, Manning PA, Morona R. The Shigella flexneri bacteriophage Sf6 tailspike protein (TSP)/endorhamnosidase is related to the bacteriophage P22 TSP and has a motif common to exo- and endoglycanases, and C-5 epimerases. Microbiology. 1999;145:1649–59. doi: 10.1099/13500872-145-7-1649.
- Cooke FJ, Wain J, Fookes M, Ivens A, Thomson N, Brown DJ, Threlfall EJ, Gunn G, Foster G, Dougan G. Prophage sequences defining hot spots of genome variation in Salmonella enterica serovar Typhimurium can be used to discriminate between field isolates. J Clin Microbiol. 2007;45:2590–8. doi: 10.1128/JCM.00729-07.
- Cresawn SG, Bogel M, Day N, Jacobs-Sera D, Hendrix RW, Hatfull GF. Phamerator: a bioinformatic tool for comparative bacteriophage genomics. BMC Bioinformatics. 2011;12:395. doi: 10.1186/1471-2105-12-395.
- De Lappe N, Doran G, O’Connor J, O’Hare C, Cormican M. Characterization of bacteriophages used in the Salmonella enterica serovar Enteritidis phage-typing scheme. J Med Microbiol. 2009;58:86–93. doi: 10.1099/jmm.0.000034-0.
- Desai PT, Porwollik S, Long F, Cheng P, Wollam A, Bhonagiri-Palsikar V, Hallsworth-Pepin K, Clifton SW, Weinstock GM, McClelland M. Evolutionary genomics of Salmonella enterica subspecies. MBio. 2013;4:e00579–12. doi: 10.1128/mBio.00579-12.
- Dobbins AT, George M, Jr, Basham DA, Ford ME, Houtz JM, Pedulla ML, Lawrence JG, Hatfull GF, Hendrix RW. Complete genomic sequence of the virulent Salmonella bacteriophage SP6. J Bacteriol. 2004;186:1933–44. doi: 10.1128/JB.186.7.1933-1944.2004.
- Douglas SE. DNA Strider. A Macintosh program for handling protein and nucleic acid sequences. Methods Mol Biol. 1994;25:181–94. doi: 10.1385/0-89603-276-0:181.
- Drahovska H, Mikasova E, Szemes T, Ficek A, Sasik M, Majtan V, Turna J. Variability in occurrence of multiple prophage genes in Salmonella Typhimurium strains isolated in Slovak Republic. FEMS Microbiol Lett. 2007;270:237–44. doi: 10.1111/j.1574-6968.2007.00674.x.
- Efrony R, Atad I, Rosenberg E. Phage therapy of coral white plague disease: properties of phage BA3. Curr Microbiol. 2009;58:139–45. doi: 10.1007/s00284-008-9290-x.
- Endersen L, O’Mahony J, Hill C, Ross RP, McAuliffe O, Coffey A. Phage Therapy in the Food Industry. Annu Rev Food Sci Technol. 2014;5:327–49. doi: 10.1146/annurev-food-030713-092415.
- Eppler K, Wyckoff E, Goates J, Parr R, Casjens S. Nucleotide sequence of the bacteriophage P22 genes required for DNA packaging. Virology. 1991;183:519–38. doi: 10.1016/0042-6822(91)90981-g.
- Eriksson U, Svenson SB, Lonngren J, Lindberg AA. Salmonella phage glycanases: substrate specificity of the phage P22 endo-rhamnosidase. J Gen Virol. 1979;43:503–11. doi: 10.1099/0022-1317-43-3-503.
- Esposito D, Fitzmaurice WP, Benjamin RC, Goodman SD, Waldman AS, Scocca JJ. The complete nucleotide sequence of bacteriophage HP1 DNA. Nucleic Acids Res. 1996;24:2360–8. doi: 10.1093/nar/24.12.2360.
- Falgenhauer L, Yao Y, Fritzenwanker M, Schmiedel J, Imirzalioglu C, Chakraborty T. Complete genome sequence of phage-like plasmid pECOH89, encoding CTX-M-15. Genome Announc. 2014;2:e00356–14. doi: 10.1128/genomeA.00356-14.
- Figueroa-Bossi N, Bossi L. Inducible prophages contribute to Salmonella virulence in mice. Mol Microbiol. 1999;33:167–76. doi: 10.1046/j.1365-2958.1999.01461.x.
- Finn R, Coggill P, Eberhardt R, Eddy S, Mistry J, Mitchell A, Potter S, Punta M, Qureshi M, Sangrador-Vegas A, Salazar G, Tate J, Bateman A. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res Database issue. 2016;44:D279–D285. doi: 10.1093/nar/gkv1344.
- Fouts DE. Phage_Finder: automated identification and classification of prophage regions in complete bacterial genome sequences. Nucleic Acids Res. 2006;34:5839–51. doi: 10.1093/nar/gkl732.
- Freiberg A, Morona R, Van den Bosch L, Jung C, Behlke J, Carlin N, Seckler R, Baxa U. The tailspike protein of Shigella phage Sf6. A structural homolog of Salmonella phage P22 tailspike protein without sequence similarity in the beta-helix domain. J Biol Chem. 2003;278:1542–8. doi: 10.1074/jbc.M205294200.
- Fricke WF, Mammel MK, McDermott PF, Tartera C, White DG, Leclerc JE, Ravel J, Cebula TA. Comparative genomics of 28 Salmonella enterica isolates: evidence for CRISPR-mediated adaptive sublineage evolution. J Bacteriol. 2011;193:3556–68. doi: 10.1128/JB.00297-11.
- Fujiwara A, Kawasaki T, Usami S, Fujie M, Yamada T. Genomic characterization of Ralstonia solanacearum phage φRSA1 and its related prophage (φRSX) in strain GMI1000. J Bacteriol. 2008;190:143–56. doi: 10.1128/JB.01158-07.
- Gordon C, Sather S, Casjens S, King J. Selective in vivo rescue by GroEL/ES of thermolabile folding intermediates to phage P22 structural proteins. J Biol Chem. 1994;269:27941–51.
- Grimont P, Weill F. Antigenic formulae of the Salmonella serovars. 9th edition. World Health Organization Collaborating Centre for Reference and Research on Salmonella, Institut Pasteur; Paris: 2007.
- Grose JH, Casjens S. Understanding the enormous diversity of bacteriophages: the tailed phages that infect the bacterial family Enterobacteriaceae. Virology. 2014;468–470:421–443. doi: 10.1016/j.virol.2014.08.024.
- Hambly E, Suttle CA. The viriosphere, diversity, and genetic exchange within phage communities. Curr Opin Microbiol. 2005;8:444–50. doi: 10.1016/j.mib.2005.06.005.
- Handa H, Gurczynski S, Jackson MP, Auner G, Mao G. Recognition of Salmonella Typhimurium by immobilized phage P22 monolayers. Surf Sci. 2008;602:1392–1400. doi: 10.1016/j.susc.2008.01.036.
- Hatfull GF, Jacobs-Sera D, Lawrence JG, Pope WH, Russell DA, Ko CC, Weber RJ, Patel MC, Germane KL, Edgar RH, Hoyte NN, Bowman CA, Tantoco AT, Paladin EC, Myers MS, Smith AL, Grace MS, Pham TT, O’Brien MB, Vogelsberger AM, Hryckowian AJ, Wynalek JL, Donis-Keller H, Bogel MW, Peebles CL, Cresawn SG, Hendrix RW. Comparative genomic analysis of 60 Mycobacteriophage genomes: genome clustering, gene acquisition and gene size. J Mol Biol. 2010;397:119–43. doi: 10.1016/j.jmb.2010.01.011.
- Hendrix R, Casjens S. Bacteriophage λ and its genetic neighborhood. In: Calendar R, editor. The Bacteriophages, 2nd Edition. Oxford Press; New York City, N.Y: 2006. pp. 409–447.
- Hendrix RW. Bacteriophages: evolution of the majority. Theor Popul Biol. 2002;61:471–80. doi: 10.1006/tpbi.2002.1590.
- Highlander SK, Weissenberger S, Alvarez LE, Weinstock GM, Berget PB. Complete nucleotide sequence of a P2 family lysogenic bacteriophage, φMhaA1-PHL101, from Mannheimia haemolytica serotype A1. Virology. 2006;350:79–89. doi: 10.1016/j.virol.2006.03.024.
- Hiley L, Fang NX, Micalizzi GR, Bates J. Distribution of Gifsy-3 and of variants of ST64B and Gifsy-1 prophages amongst Salmonella enterica serovar Typhimurium isolates: evidence that combinations of prophages promote clonality. PLoS One. 2014;9:e86203. doi: 10.1371/journal.pone.0086203.
- Hoffmann M, Zhao S, Pettengill J, Luo Y, Monday SR, Abbott J, Ayers SL, Cinar HN, Muruvanda T, Li C, Allard MW, Whichard J, Meng J, Brown EW, McDermott PF. Comparative genomic analysis and virulence differences in closely related Salmonella enterica serotype Heidelberg isolates from humans, retail meats, and animals. Genome Biol Evol. 2014;6:1046–68. doi: 10.1093/gbe/evu079.
- Holt KE, Parkhill J, Mazzoni CJ, Roumagnac P, Weill FX, Goodhead I, Rance R, Baker S, Maskell DJ, Wain J, Dolecek C, Achtman M, Dougan G. High-throughput sequencing provides insights into genome variation and evolution in Salmonella Typhi. Nat Genet. 2008;40:987–93. doi: 10.1038/ng.195.
- Hua Y, An X, Pei G, Li S, Wang W, Xu X, Fan H, Huang Y, Zhang Z, Mi Z, Chen J, Li J, Zhang F, Tong Y. Characterization of the morphology and genome of an Escherichia coli podovirus. Arch Virol. 2014;159:3249–56. doi: 10.1007/s00705-014-2189-x.
- Hurwitz BL, Sullivan MB. The Pacific Ocean virome (POV): a marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology. PLoS One. 2013;8:e57355. doi: 10.1371/journal.pone.0057355.
- Hurwitz BL, Westveld AH, Brum JR, Sullivan MB. Modeling ecological drivers in marine viral communities using comparative metagenomics and network analyses. Proc Natl Acad Sci U S A. 2014;111:10714–9. doi: 10.1073/pnas.1319778111.
- Israel JV, Anderson TF, Levine M. In vitro morphogenesis of phage P22 from heads and baseplate parts. Proc Natl Acad Sci U S A. 1967;57:284–291. doi: 10.1073/pnas.57.2.284.
- Iwashita S, Kanegasaki S. Release of O antigen polysaccharide from Salmonella newington by phage epsilon 34. Virology. 1975;68:27–34. doi: 10.1016/0042-6822(75)90144-0.
- Iwashita S, Kanegasaki S. Enzymic and molecular properties of base-plate parts of bacteriophage P22. Eur J Biochem. 1976a;65:87–94. doi: 10.1111/j.1432-1033.1976.tb10392.x.
- Iwashita S, Kanegasaki S. Deacetylation reaction catalyzed by Salmonella phage c341 and its baseplate parts. J Biol Chem. 1976b;251:5361–5.
- Jiang XM, Neal B, Santiago F, Lee SJ, Romana LK, Reeves PR. Structure and sequence of the rfb (O antigen) gene cluster of Salmonella serovar typhimurium (strain LT2). Mol Microbiol. 1991;5:695–713. doi: 10.1111/j.1365-2958.1991.tb00741.x.
- Kapfhammer D, Blass J, Evers S, Reidl J. Vibrio cholerae phage K139: complete genome sequence and comparative genomics of related phages. J Bacteriol. 2002;184:6592–601. doi: 10.1128/JB.184.23.6592-6601.2002.
- King MR, Vimr RP, Steenbergen SM, Spanjaard L, Plunkett G, 3rd, Blattner FR, Vimr ER. Escherichia coli K1-specific bacteriophage CUS-3 distribution and function in phase-variable capsular polysialic acid O acetylation. J Bacteriol. 2007;189:6447–56. doi: 10.1128/JB.00657-07.
- Kropinski AM, Kovalyova IV, Billington SJ, Patrick AN, Butts BD, Guichard JA, Pitcher TJ, Guthrie CC, Sydlaske AD, Barnhill LM, Havens KA, Day KR, Falk DR, McConnell MR. The genome of epsilon15, a serotype-converting, Group E1 Salmonella enterica-specific bacteriophage. Virology. 2007;369:234–44. doi: 10.1016/j.virol.2007.07.027.
- Krumsiek J, Arnold R, Rattei T. Gepard: a rapid and sensitive tool for creating dotplots on genome scale. Bioinformatics. 2007;23:1026–8. doi: 10.1093/bioinformatics/btm039.
- Kulsuwan R, Wongratanacheewin S, Wongratanacheewin RS, Yordpratum U, Tattawasart U. Lytic capability of bacteriophages (family Myoviridae) on Burkholderia pseudomallei. Southeast Asian J Trop Med Public Health. 2014;45:1344–53.
- Kvitko BH, Cox CR, DeShazer D, Johnson SL, Voorhees KJ, Schweizer HP. φX216, a P2-like bacteriophage with broad Burkholderia pseudomallei and B. mallei strain infectivity. BMC Microbiol. 2012;12:289. doi: 10.1186/1471-2180-12-289.
- Lan R, Reeves PR, Octavia S. Population structure, origins and evolution of major Salmonella enterica clones. Infect Genet Evol. 2009;9:996–1005. doi: 10.1016/j.meegid.2009.04.011.
- Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23:2947–8. doi: 10.1093/bioinformatics/btm404.
- Lawrence JG, Hendrix RW, Casjens S. Where are the bacterial pseudogenes? Trends Microbiol. 2001;9:535–540. doi: 10.1016/s0966-842x(01)02198-9.
- Lee CN, Tseng TT, Chang HC, Lin JW, Weng SF. Genomic sequence of temperate phage Smp131 of Stenotrophomonas maltophilia that has similar prophages in xanthomonads. BMC Microbiol. 2014;14:17. doi: 10.1186/1471-2180-14-17.
- Leiman PG, Battisti AJ, Bowman VD, Stummeyer K, Muhlenhoff M, Gerardy-Schahn R, Scholl D, Molineux IJ. The structures of bacteriophages K1E and K1–5 explain processive degradation of polysaccharide capsules and evolution of new host specificities. J Mol Biol. 2007;371:836–49. doi: 10.1016/j.jmb.2007.05.083.
- Lima-Mendez G, Van Helden J, Toussaint A, Leplae R. Prophinder: a computational tool for prophage prediction in prokaryotic genomes. Bioinformatics. 2008;24:863–5. doi: 10.1093/bioinformatics/btn043.
- Lindberg AA, Sarvas M, Makela PH. Bacteriophage attachment to the somatic antigen of Salmonella: Effect of O-specific structures in leaky R mutants and S, T1 hybrids. Infect Immun. 1970;1:88–97. doi: 10.1128/iai.1.1.88-97.1970.
- Liu B, Knirel YA, Feng L, Perepelov AV, Senchenkova SN, Reeves PR, Wang L. Structural diversity in Salmonella O antigens and its genetic basis. FEMS Microbiol Rev. 2014;38:56–89. doi: 10.1111/1574-6976.12034.
- Liu D, Haase AM, Lindqvist L, Lindberg AA, Reeves PR. Glycosyl transferases of O-antigen biosynthesis in Salmonella enterica: identification and characterization of transferase genes of groups B, C2, and E1. J Bacteriol. 1993;175:3408–13. doi: 10.1128/jb.175.11.3408-3413.1993.
- Lynch KH, Stothard P, Dennis JJ. Genomic analysis and relatedness of P2-like phages of the Burkholderia cepacia complex. BMC Genomics. 2010;11:599. doi: 10.1186/1471-2164-11-599.
- Lynch KH, Stothard P, Dennis JJ. Comparative analysis of two phenotypically-similar but genomically-distinct Burkholderia cenocepacia-specific bacteriophages. BMC Genomics. 2012;13:223. doi: 10.1186/1471-2164-13-223.
- Malki K, Kula A, Bruder K, Sible E, Hatzopoulos T, Steidel S, Watkins SC, Putonti C. Bacteriophages isolated from Lake Michigan demonstrate broad host-range across several bacterial phyla. Virol J. 2015;12:164. doi: 10.1186/s12985-015-0395-0.
- McClelland M, Sanderson KE, Spieth J, Clifton SW, Latreille P, Courtney L, Porwollik S, Ali J, Dante M, Du F, Hou S, Layman D, Leonard S, Nguyen C, Scott K, Holmes A, Grewal N, Mulvaney E, Ryan E, Sun H, Florea L, Miller W, Stoneking T, Nhan M, Waterston R, Wilson RK. Complete genome sequence of Salmonella enterica serovar Typhimurium LT2. Nature. 2001;413:852–6. doi: 10.1038/35101614.
- McNair K, Bailey BA, Edwards RA. PHACTS, a computational approach to classifying the lifestyle of phages. Bioinformatics. 2012;28:614–8. doi: 10.1093/bioinformatics/bts014.
- Miletic S, Simpson DJ, Szymanski CM, Deyholos MK, Menassa R. A plant-produced bacteriophage tailspike protein for the control of Salmonella. Front Plant Sci. 2015;6:1221. doi: 10.3389/fpls.2015.01221.
- Moran NA, Degnan PH, Santos SR, Dunbar HE, Ochman H. The players in a mutualistic symbiosis: insects, bacteria, viruses, and virulence genes. Proc Natl Acad Sci U S A. 2005;102:16919–26. doi: 10.1073/pnas.0507029102.
- Moreno Switt AI, den Bakker HC, Cummings CA, Rodriguez-Rivera LD, Govoni G, Raneiri ML, Degoricija L, Brown S, Hoelzer K, Peters JE, Bolchacova E, Furtado MR, Wiedmann M. Identification and characterization of novel Salmonella mobile elements involved in the dissemination of genes linked to virulence and transmission. PLoS One. 2012;7:e41247. doi: 10.1371/journal.pone.0041247.
- Muller JJ, Barbirz S, Heinle K, Freiberg A, Seckler R, Heinemann U. An intersubunit active site between supercoiled parallel beta helices in the trimeric tailspike endorhamnosidase of Shigella flexneri phage Sf6. Structure. 2008;16:766–75. doi: 10.1016/j.str.2008.01.019.
- Nakayama K, Kanaya S, Ohnishi M, Terawaki Y, Hayashi T. The complete nucleotide sequence of φCTX, a cytotoxin-converting phage of Pseudomonas aeruginosa: implications for phage evolution and horizontal gene transfer via bacteriophages. Mol Microbiol. 1999;31:399–419. doi: 10.1046/j.1365-2958.1999.01158.x.
- Nakornpakdee Y, Sermswan RW, Tattawasart U, Yordpratum U, Wongratanacheewin S. A PCR-Based detection of Burkholderia Pseudomallei diversity using Myoviridae prophage typing. Southeast Asian J Trop Med Public Health. 2015;46:30–7.
- Nilsson AS, Ljungquist EH. The P2-like bacteriophages. In: Calendar R, editor. The Bacteriophages, second edition. Oxford University Press; New York, NY: 2006. pp. 365–390.
- Niu YD, McAllister TA, Nash JH, Kropinski AM, Stanford K. Four Escherichia coli O157:H7 phages: a new bacteriophage genus and taxonomic classification of T1-like phages. PLoS One. 2014;9:e100426. doi: 10.1371/journal.pone.0100426.
- Niu YD, Cook SR, Wang J, Klima CL, Hsu YH, Kropinski AM, Turner D, McAllister TA. Comparative analysis of multiple inducible phages from Mannheimia haemolytica. BMC Microbiol. 2015;15:175. doi: 10.1186/s12866-015-0494-5.
- Pang JC, Chiu TH, Helmuth R, Schroeter A, Guerra B, Tsen HY. A pulsed field gel electrophoresis (PFGE) study that suggests a major world-wide clone of Salmonella enterica serovar Enteritidis. Int J Food Microbiol. 2007;116:305–12. doi: 10.1016/j.ijfoodmicro.2006.05.024.
- Pang S, Octavia S, Feng L, Liu B, Reeves PR, Lan R, Wang L. Genomic diversity and adaptation of Salmonella enterica serovar Typhimurium from analysis of six genomes of different phage types. BMC Genomics. 2013;14:718. doi: 10.1186/1471-2164-14-718.
- Parent KN, Gilcrease EB, Casjens SR, Baker TS. Structural evolution of the P22-like phages: comparison of Sf6 and P22 procapsid and virion architectures. Virology. 2012;427:177–88. doi: 10.1016/j.virol.2012.01.040.
- Parent KN, Tang J, Cardone G, Gilcrease EB, Janssen ME, Olson NH, Casjens SR, Baker TS. Three-dimensional reconstructions of the bacteriophage CUS-3 virion reveal a conserved coat protein I-domain but a distinct tailspike receptor-binding domain. Virology. 2014;464–465:55–66. doi: 10.1016/j.virol.2014.06.017.
- Parkhill J, Dougan G, James KD, Thomson NR, Pickard D, Wain J, Churcher C, Mungall KL, Bentley SD, Holden MT, Sebaihia M, Baker S, Basham D, Brooks K, Chillingworth T, Connerton P, Cronin A, Davis P, Davies RM, Dowd L, White N, Farrar J, Feltwell T, Hamlin N, Haque A, Hien TT, Holroyd S, Jagels K, Krogh A, Larsen TS, Leather S, Moule S, O’Gaora P, Parry C, Quail M, Rutherford K, Simmonds M, Skelton J, Stevens K, Whitehead S, Barrell BG. Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature. 2001;413:848–52. doi: 10.1038/35101607.
- Paul JH, Sullivan MB. Marine phage genomics: what have we learned? Curr Opin Biotechnol. 2005;16:299–307. doi: 10.1016/j.copbio.2005.03.007.
- Perna N, Glasner J, Burland V, Plunkett G, 3rd. The Genomes of Escherichia coli K-12 and Pathogenic E. coli. In: Donnenberg M, editor. Escherichia coli: Virulence mechanisms of a versatile pathogen. Academic Press; San Diego, CA, New York: 2002. pp. 3–53.
- Ravin V, Shulga MG. The evidence of extrachromosomal location phage prophage N15. Virology. 1970;40:800–805. doi: 10.1016/0042-6822(70)90125-x. [DOI] [PubMed] [Google Scholar]
- Ravin V, Ravin N, Casjens S, Ford ME, Hatfull GF, Hendrix RW. Genomic sequence and analysis of the atypical temperate bacteriophage N15. J Mol Biol. 2000;299:53–73. doi: 10.1006/jmbi.2000.3731.
- Reen FJ, Boyd EF, Porwollik S, Murphy BP, Gilroy D, Fanning S, McClelland M. Genomic comparisons of Salmonella enterica serovar Dublin, Agona, and Typhimurium strains recently isolated from milk filters and bovine samples from Ireland, using a Salmonella microarray. Appl Environ Microbiol. 2005;71:1616–25. doi: 10.1128/AEM.71.3.1616-1625.2005.
- Reeves P. Evolution of Salmonella O antigen variation by interspecific gene transfer on a large scale. Trends Genet. 1993;9:17–22. doi: 10.1016/0168-9525(93)90067-R.
- Rizzo AA, Suhanovsky MM, Baker ML, Fraser LCR, Jones LM, Rempel DL, Gross ML, Chiu W, Alexandrescu AT, Teschke CM. Multiple functional roles of the accessory I-domain of bacteriophage P22 coat protein revealed by NMR structure and cryoEM modeling. Structure. 2014;22:830–41. doi: 10.1016/j.str.2014.04.003.
- Roux S, Krupovic M, Debroas D, Forterre P, Enault F. Assessment of viral community functional potential from viral metagenomes may be hampered by contamination with cellular sequences. Open Biol. 2013;3:130160. doi: 10.1098/rsob.130160.
- Rychlik I, Hradecka H, Malcova M. Salmonella enterica serovar Typhimurium typing by prophage-specific PCR. Microbiology. 2008;154:1384–9. doi: 10.1099/mic.0.2007/015156-0.
- Sanchez SE, Cuevas DA, Rostron JE, Liang TY, Pivaroff CG, Haynes MR, Nulton J, Felts B, Bailey BA, Salamon P, Edwards RA, Burgin AB, Segall AM, Rohwer F. Phage phenomics: Physiological approaches to characterize novel viral proteins. J Vis Exp. 2015;11:e52854. doi: 10.3791/52854.
- Sangal V, Harbottle H, Mazzoni CJ, Helmuth R, Guerra B, Didelot X, Paglietti B, Rabsch W, Brisse S, Weill FX, Roumagnac P, Achtman M. Evolution and population structure of Salmonella enterica serovar Newport. J Bacteriol. 2010;192:6465–76. doi: 10.1128/JB.00969-10.
- Schicklmaier P, Moser E, Wieland T, Rabsch W, Schmieger H. A comparative study on the frequency of prophages among natural isolates of Salmonella and Escherichia coli with emphasis on generalized transducers. Antonie van Leeuwenhoek. 1998;73:49–54. doi: 10.1023/a:1000748505550.
- Schmieger H. Molecular survey of the Salmonella phage typing system of Anderson. J Bacteriol. 1999;181:1630–5. doi: 10.1128/jb.181.5.1630-1635.1999.
- Scholl D, Kieleczawa J, Kemp P, Rush J, Richardson CC, Merril C, Adhya S, Molineux IJ. Genomic analysis of bacteriophages SP6 and K1–5, an estranged subgroup of the T7 supergroup. J Mol Biol. 2004;335:1151–71. doi: 10.1016/j.jmb.2003.11.035.
- Singh A, Poshtiban S, Evoy S. Recent advances in bacteriophage based biosensors for food-borne pathogen detection. Sensors (Basel) 2013;13:1763–86. doi: 10.3390/s130201763.
- Singh A, Arutyunov D, Szymanski CM, Evoy S. Bacteriophage based probes for pathogen detection. Analyst. 2012;137:3405–21. doi: 10.1039/c2an35371g.
- Steinbacher S, Seckler R, Miller S, Steipe B, Huber R, Reinemer P. Crystal structure of P22 tailspike protein: interdigitated subunits in a thermostable trimer. Science. 1994;265:383–6. doi: 10.1126/science.8023158.
- Steinbacher S, Baxa U, Miller S, Weintraub A, Seckler R, Huber R. Crystal structure of phage P22 tailspike protein complexed with Salmonella sp. O-antigen receptors. Proc Natl Acad Sci USA. 1996;93:10584–8. doi: 10.1073/pnas.93.20.10584.
- Steinbacher S, Miller S, Baxa U, Budisa N, Weintraub A, Seckler R, Huber R. Phage P22 tailspike protein: crystal structure of the head-binding domain at 2.3 Å, fully refined structure of the endorhamnosidase at 1.56 Å resolution, and the molecular basis of O-antigen recognition and cleavage. J Mol Biol. 1997;267:865–80. doi: 10.1006/jmbi.1997.0922.
- Suhanovsky MM, Teschke CM. An intramolecular chaperone inserted in bacteriophage P22 coat protein mediates its chaperonin-independent folding. J Biol Chem. 2013;288:33772–83. doi: 10.1074/jbc.M113.515312.
- Suttle CA. Marine viruses - major players in the global ecosystem. Nat Rev Microbiol. 2007;5:801–12. doi: 10.1038/nrmicro1750.
- Switt AI, Sulakvelidze A, Wiedmann M, Kropinski AM, Wishart DS, Poppe C, Liang Y. Salmonella phages and prophages: genomics, taxonomy, and applied aspects. Methods Mol Biol. 2015;1225:237–87. doi: 10.1007/978-1-4939-1625-2_15.
- Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9. doi: 10.1093/molbev/msr121.
- Thouand G, Vachon P, Liu S, Dayre M, Griffiths MW. Optimization and validation of a simple method using P22::luxAB bacteriophage for rapid detection of Salmonella enterica serotypes A, B, and D in poultry samples. J Food Prot. 2008;71:380–5. doi: 10.4315/0362-028x-71.2.380.
- Tikhe CV, Martin TM, Gissendanner CR, Husseneder C. Complete genome sequence of Citrobacter phage CVT22 isolated from the gut of the Formosan subterranean termite, Coptotermes formosanus Shiraki. Genome Announc. 2015;3:e00408–15. doi: 10.1128/genomeA.00408-15.
- Timme RE, Pettengill JB, Allard MW, Strain E, Barrangou R, Wehnes C, Van Kessel JS, Karns JS, Musser SM, Brown EW. Phylogenetic diversity of the enteric pathogen Salmonella enterica subsp. enterica inferred from genome-wide reference-free SNP characters. Genome Biol Evol. 2013;5:2109–23. doi: 10.1093/gbe/evt159.
- Uetake H, Nakagawa T, Akiba T. The relationship of bacteriophage to antigenic changes in Group E salmonellas. J Bacteriol. 1955;69:571–9. doi: 10.1128/jb.69.5.571-579.1955.
- Valenzuela C, Ugalde JA, Mora GC, Alvarez S, Contreras I, Santiviago CA. Draft genome sequence of Salmonella enterica serovar Typhi strain STH2370. Genome Announc. 2014;2:e00104–14. doi: 10.1128/genomeA.00104-14.
- van der Wilk F, Dullemans AM, Verbeek M, van den Heuvel JF. Isolation and characterization of APSE-1, a bacteriophage infecting the secondary endosymbiont of Acyrthosiphon pisum. Virology. 1999;262:104–13. doi: 10.1006/viro.1999.9902.
- Villafane R, Costa S, Ahmed R, Salgado C. Conservation of the N-terminus of some phage tail proteins. Arch Virol. 2005;150:2609–21. doi: 10.1007/s00705-005-0597-7.
- Walter M, Fiedler C, Grassl R, Biebl M, Rachel R, Hermo-Parrado XL, Llamas-Saiz AL, Seckler R, Miller S, van Raaij MJ. Structure of the receptor-binding protein of bacteriophage Det7: a podoviral tail spike in a myovirus. J Virol. 2008;82:2265–73. doi: 10.1128/JVI.01641-07.
- Wang L, Andrianopoulos K, Liu D, Popoff MY, Reeves PR. Extensive variation in the O-antigen gene cluster within one Salmonella enterica serogroup reveals an unexpected complex history. J Bacteriol. 2002;184:1669–77. doi: 10.1128/JB.184.6.1669-1677.2002.
- Wilhelm SW, Jeffrey WH, Suttle CA, Mitchell DL. Estimation of biologically damaging UV levels in marine surface waters with DNA and viral dosimeters. Photochem Photobiol. 2002;76:268–73. doi: 10.1562/0031-8655(2002)076<0268:eobdul>2.0.co;2.
- Wommack KE, Colwell RR. Virioplankton: viruses in aquatic ecosystems. Microbiol Mol Biol Rev. 2000;64:69–114. doi: 10.1128/mmbr.64.1.69-114.2000.
- Wyk P, Reeves P. Identification and sequence of the gene for abequose synthase, which confers antigenic specificity on group B salmonellae: homology with galactose epimerase. J Bacteriol. 1989;171:5687–93. doi: 10.1128/jb.171.10.5687-5693.1989.
- Yamada M, Fujisawa H, Kato H, Hamada K, Minagawa T. Cloning and sequencing of the genetic right end of bacteriophage T3 DNA. Virology. 1986;151:350–61. doi: 10.1016/0042-6822(86)90055-3.
- Yamamoto K. The origin of bacteriophage P221. Virology. 1967;33:545–547. doi: 10.1016/0042-6822(67)90132-8.
- Yamamoto N. Genetic evolution of bacteriophage. I. Hybrids between unrelated bacteriophages P22 and Fels 2. Proc Natl Acad Sci USA. 1969;62:63–9. doi: 10.1073/pnas.62.1.63.
- Yamashita E, Nakagawa A, Takahashi J, Tsunoda K, Yamada S, Takeda S. The host-binding domain of the P2 phage tail spike reveals a trimeric iron-binding structure. Acta Crystallogr Sect F Struct Biol Cryst Commun. 2011;67:837–41. doi: 10.1107/S1744309111005999.
- Yarmolinsky MB, Sternberg N. Bacteriophage P1. In: Calendar R, editor. The Bacteriophages. Vol. 2. Plenum Press; New York, NY: 1988. pp. 291–438.
- Yasuike M, Nishiki I, Iwasaki Y, Nakamura Y, Fujiwara A, Sugaya E, Kawato Y, Nagai S, Kobayashi T, Ototake M, Nakai T. Full-genome sequence of a novel myovirus, GF-2, infecting Edwardsiella tarda: comparison with other Edwardsiella myoviral genomes. Arch Virol. 2015;160:2129–33. doi: 10.1007/s00705-015-2472-5.
- Yim L, Betancor L, Martinez A, Giossa G, Bryant C, Maskell D, Chabalgoity JA. Differential phenotypic diversity among epidemic-spanning Salmonella enterica serovar enteritidis isolates from humans or animals. Appl Environ Microbiol. 2010;76:6812–20. doi: 10.1128/AEM.00497-10.
- Yordpratum U, Tattawasart U, Wongratanacheewin S, Sermswan RW. Novel lytic bacteriophages from soil that lyse Burkholderia pseudomallei. FEMS Microbiol Lett. 2011;314:81–8. doi: 10.1111/j.1574-6968.2010.02150.x.
- Zayas M, Villafane R. Identification of the Salmonella phage epsilon34 tailspike gene. Gene. 2007;386:211–7. doi: 10.1016/j.gene.2006.09.013.
- Zhang J, Cao Z, Xu Y, Li X, Li H, Wu F, Wang L, Cao F, Li Z, Li S, Jin L. Complete genomic sequence of the Vibrio alginolyticus lytic bacteriophage PVA1. Arch Virol. 2014;159:3447–51. doi: 10.1007/s00705-014-2207-z.
- Zhou Z, McCann A, Weill FX, Blin C, Nair S, Wain J, Dougan G, Achtman M. Transient Darwinian selection in Salmonella enterica serovar Paratyphi A during 450 years of global spread of enteric fever. Proc Natl Acad Sci USA. 2014;111:12199–204. doi: 10.1073/pnas.1411012111.
- Zhou Z, McCann A, Litrup E, Murphy R, Cormican M, Fanning S, Brown D, Guttman DS, Brisse S, Achtman M. Neutral genomic microevolution of a recently emerged pathogen, Salmonella enterica serovar Agona. PLoS Genet. 2013;9:e1003471. doi: 10.1371/journal.pgen.1003471.
- Zinder N, Lederberg J. Genetic exchange in Salmonella. J Bacteriol. 1952;64:679–699. doi: 10.1128/jb.64.5.679-699.1952.