Abstract
Bacteriophage AR9 and its close relative PBS1 have been extensively used to construct early Bacillus subtilis genetic maps. Here, we present the 251,042 bp AR9 genome, a linear, terminally redundant double-stranded DNA containing deoxyuridine instead of thymine. Multiple AR9 genes are interrupted by non-coding sequences or sequences encoding putative endonucleases. We show that these sequences are group I and group II self-splicing introns. Eight AR9 proteins are homologous to fragments of bacterial RNA polymerase (RNAP) subunits β/β′. These proteins comprise two sets of paralogs of the largest RNAP subunits, with each paralog encoded by two disjoint phage genes. Thus, AR9 is a phiKZ-related giant phage that relies on two multisubunit viral RNAPs to transcribe its genome independently of the host transcription apparatus. Purification of one of the PBS1/AR9 RNAPs has been reported previously, which makes AR9 a promising object for further studies of RNAP evolution, assembly and mechanism.
Keywords: Bacteriophage, Genome, Bacillus subtilis, Introns, RNA polymerase, Transcription, Evolution
1. Introduction
Giant or “jumbo” bacteriophages, defined as phages with double-stranded (ds) DNA genomes larger than 200 kbp, belong to the family Myoviridae (Hendrix, 2009). Phylogenetically, the giant phage group is highly diverse and includes multiple distantly related viruses (Hendrix, 2009). The genomes of several giant phages harbor genes for proteins homologous to the two largest subunits of cellular DNA-dependent RNA polymerases (RNAPs) (Krylov et al., 2007). The current databases contain 16 such phages, all infecting Proteobacteria: Pseudomonas aeruginosa phages phiKZ (Mesyanzhinov et al., 2002), EL (Hertveldt et al., 2005), PA7 (Kwan et al., 2006), and PhiPa3 (Monson et al., 2011); P. chlororaphis phage 201phi2-1 (Thomas et al., 2008); P. fluorescens phage OBP (Cornelissen et al., 2012); Salmonella phage SPN3US (Lee et al., 2011); Yersinia enterocolitica phage phiR1–37 (Skurnik et al., 2012); Cronobacter sakazakii phage CR5 (Lee et al., 2016); Halocynthia phage JM-2012 (NC_017975); Vibrio sp. phage VP4B (KC131130.1); Erwinia amylovora phages phiEaH2 (Dömötör et al., 2012), phiEaH1 (Meczker et al., 2014), and Ea35–70 (Yagubi et al., 2014); and Ralstonia solanacearum phages RSL2 (BAQ02576) and RSF1 (AP014927). Below, we will refer to these phages as “phiKZ-related”.
The largest subunits of cellular (bacterial, archaeal, and eukaryal) RNAPs are very large, evolutionarily conserved proteins. Together, they form the RNAP catalytic center. The largest RNAP subunits are naturally split, i.e., encoded by separate genes, in a number of organisms (Lane and Darst, 2010). Moreover, splits can be introduced into the largest subunits of bacterial RNAPs by means of genetic engineering without impairing the basic RNAP function (Severinov et al., 1997). Among the 8 polypeptides homologous to RNAP subunits encoded by every phiKZ-related phage, there are two sets that together could form complete or almost complete counterparts of the largest bacterial RNAP subunits β and β′. Thus, these phages seem to encode two distinct RNAPs. In cases where it has been studied, one set of putative RNAP subunits is found in phage virions (Mesyanzhinov et al., 2002; Thomas et al., 2010; Skurnik et al., 2012). These proteins can be predicted to comprise a virion RNAP that is injected into the cell together with phage DNA and transcribes early phage genes (a similar strategy is employed, for example, by the Escherichia coli phage N4 (Falco et al., 1977)). The second set of putative RNAP subunits could form a non-virion (nv) RNAP, which would transcribe viral genes expressed at later stages of infection. Again, a similar strategy is used by phage N4.
Recently, we have shown that the development of the phiKZ phage is resistant to rifampicin, a potent inhibitor of bacterial RNAP, thus providing the first experimental evidence that phiKZ and, by extension, its relatives rely on phage-encoded transcription machinery and are independent of the host transcription apparatus (Ceyssens et al., 2014). Furthermore, we purified nvRNAP from phiKZ-infected cells and showed that it contained the four phage proteins jointly comprising the β and β′ subunits and a fifth subunit with no sequence similarity to functionally characterized proteins. In vitro, this enzyme was able to specifically recognize late phage promoters (Yakunina et al., 2015).
Both RNAPs of phage N4, as well as all other phage-encoded RNAPs, including the best-characterized T7 enzyme, are homologous to the DNA polymerases of the A family containing the core Palm domain (Steitz, 2009), which is unrelated to the multisubunit cellular RNAPs. RNAPs of phiKZ-related phages are distinct from all other known phage RNAPs as they are evolutionarily related to cellular RNAPs and belong to the DPBB (double-psi beta-barrel) fold family of polymerases (Yakunina et al., 2015). However, these multisubunit phage RNAPs show notable differences from their cellular counterparts. First, none of the phages encodes a recognizable homolog of α, a subunit that is strictly required for assembly of the largest subunits of cellular RNAPs. Second, the phages do not encode homologs of bacterial sigma factors, which are required for promoter-specific transcription initiation by bacterial RNAPs. Thus, the assembly, promoter recognition, and, possibly, other aspects of the transcription catalyzed by giant phage RNAPs must differ from corresponding processes in cellular RNAPs. Studies of giant phage RNAPs might thus provide novel insights into transcription mechanism, regulation, and evolution.
The Bacillus subtilis bacteriophage AR9 was isolated in 1968 in the USSR (Belyaeva and Azizbekyan, 1968) and appeared to be very similar to the PBS1 phage (Rima and van Kleeff, 1971). PBS1 and its clear plaque derivative PBS2 were discovered by Takahashi in 1961 (Takahashi, 1961, 1963). In the 1970s, purification of a rifampicin-resistant multisubunit RNAP with a unique subunit composition from B. subtilis infected with bacteriophage PBS2 was reported (Clark et al., 1974). Similarly to our recent findings with the phiKZ infection, it has been reported that the PBS2 infection is fully resistant to rifampicin (Price and Frabotta, 1972). PBS1 is a very large phage: the molecular weight of its DNA has been determined to be 1.9 × 10^8 (Yamagishi, 1967), which is comparable to the size of giant phage genomes. The PBS1/2 DNA contains deoxyuridine instead of thymidine, as does the DNA of the phiKZ-related phage phiR1–37 (Kiljunen et al., 2005). In contrast to lytic phiKZ-related phages, PBS1 appears to be a lysogenic phage capable of general transduction. It is the latter property that made this phage popular in the early period of B. subtilis studies as it allowed the construction of genetic maps of the bacterial chromosome (Dubnau et al., 1967).
Because PBS1/2 (and, presumably, AR9) encode a unique multisubunit RNAP and are very large phages, we hypothesized that they may be related to the phiKZ phage. To test this hypothesis, we obtained a sample of the AR9 phage and determined its full genomic sequence. Here, we report the results of the analysis of the AR9 genome. We find that AR9 indeed encodes two sets of homologs of bacterial RNAP large subunits as well as numerous other proteins with homologs in phiKZ-related phage genomes, and thus clearly belongs to this phage group. AR9 proteins from one RNAP set apparently are subunits of the enzyme whose purification has been reported earlier (Clark, 1978). The results of our work open the way for purification and analysis of this unique enzyme by modern methods.
2. Materials and methods
2.1. Bacterial strains, phage and growth conditions
Bacteriophage AR9 was generously provided by David Dubnau from the Public Health Research Institute Center, New Jersey Medical School – Rutgers, NJ. B. subtilis 168 was used as a host. To prepare AR9 lysates, a single plaque was resuspended in 100 μl of overnight B. subtilis 168 culture, inoculated in 100 ml of LB medium, and incubated with shaking at 37 °C until complete lysis occurred (5–6 h). Cell debris was removed by centrifugation at 5000g for 20 min. The resulting phage lysate stock (5×10^9–2×10^10 PFU/ml) was stored at +4 °C.
2.2. AR9 genome DNA purification and sequencing
For further purification, AR9 virions were precipitated from the lysate with polyethylene glycol 8000 and purified using CsCl density-gradient centrifugation as described in Sambrook et al. (1989). CsCl was removed by dialysis against storage buffer (50 mM Tris–HCl (pH 8.0), 100 mM NaCl, 10 mM MgCl2). AR9 genomic DNA was prepared from CsCl-purified phage by extraction with phenol/chloroform and subsequent precipitation with ethanol as described (Sambrook and Russell, 2006). To ensure robust amplification of uracil-containing AR9 DNA, KAPA HiFi Uracil+ polymerase was used instead of KAPA HiFi DNA polymerase during the library preparation step. In addition to a 500-bp fragment library, 6 and 10 kb mate-pair libraries were prepared using the Nextera Mate Pair Sample Preparation Kit. Libraries were sequenced on the Illumina MiSeq benchtop sequencer with the sequencing-by-synthesis technology. Reads that passed Illumina quality control filtering were used as raw data for further bioinformatics analysis. To determine the phage genome ends, PCR with a pair of primers annealing close to each end (Pright_end TGGTCCAATTCCATACTCAATTATACTTC; Pleft_end CATCACTTGTTACACCCACTAGTAATG) was performed.
2.3. AR9 genome analysis, gene prediction and annotation
After filtering for quality control, sequence reads were assembled by Consed version 23.0 software into a single contig. An average genome coverage of 153× was obtained. Gene prediction was performed using the RAST web server (http://www.phantome.org/PhageSeed/Phage.cgi?page=phast). Predicted proteins were searched against the NR (non-redundant) database at the NCBI and, separately, against protein sets of phiKZ-related phages using the PSI-BLAST program (Schäffer et al., 2001) with an E-value cutoff of 0.001 and with composition-based statistics and low complexity filtering disabled. The results were reviewed and, in cases of uncertainty, were re-analyzed by re-running PSI-BLAST against NR, with composition-based statistics and low complexity filtering enabled, and/or against proteins of tailed phages only. Additionally, proteins with functionally uncharacterized homologs only were analyzed using the Conserved Domain Database (CDD) search at the NCBI (Marchler-Bauer et al., 2009) and the HHpred profile-profile search (Söding et al., 2005) in order to identify remote sequence similarities. The TMHMM program (Krogh et al., 2001) was used to predict membrane proteins. Transcription terminator sequences were predicted using ARNold (http://rna.igmors.u-psud.fr/toolbox/arnold/). Overrepresented intergenic motifs were found using the MEME program (Bailey et al., 2009). Intron sequences were predicted using Rfam (http://rfam.xfam.org/). The RNAP subunit sequences were aligned using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/), with manual correction. Phylogenetic trees were constructed by the neighbor-joining method (http://www.ebi.ac.uk/Tools/phylogeny/clustalw2_phylogeny/). Protein secondary structure was predicted using the Jpred4 program (Drozdetskiy et al., 2015).
2.4. Mass spectrometric identification of phage particle proteins
An aliquot of solution containing purified virions was heated at 98 °C for 5 min in Laemmli sample buffer and proteins were resolved by denaturing (SDS) PAGE in a 12% gel. The gel was stained with Coomassie brilliant blue and manually dissected into 77 slices with care taken to avoid transecting visible Coomassie-stained bands. Individual slices were prepared for mass-spectrometric analysis by trypsin “in gel” digest methods (Shevchenko et al., 1996). Mass spectrometric analysis was conducted on Fourier Transform (Ion Cyclotron Resonance) Mass Spectrometer (Varian 902-MS) equipped with MALDI and 9.4 T magnet (FTMS). Proteins were identified using the Mascot version 2.2.1 (Matrix Science) against the NCBI database and an in-house database of AR9 proteins. Search criteria for both Mascot searches were as follows: trypsin digestion with one missed cleavage allowed, the maximum peptide mass tolerance was ±10 ppm. A Mascot score of more than 30 was used as a threshold for identification.
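The two acceptance criteria stated above (a peptide mass tolerance of ±10 ppm and a Mascot score above 30) can be illustrated with a minimal sketch. It is not the Mascot scoring procedure itself; the function names and the example masses are hypothetical and serve only to show how a parts-per-million tolerance is applied.

```python
def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Signed mass error in parts per million (ppm)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass: float, theoretical_mass: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the observed mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


def accepted(score: float, threshold: float = 30.0) -> bool:
    """True if an identification score exceeds the threshold (more than 30 in the text)."""
    return score > threshold


if __name__ == "__main__":
    # Hypothetical peptide masses (Da), chosen only for illustration.
    print(within_tolerance(1500.010, 1500.000))  # about 6.7 ppm off
    print(within_tolerance(1500.020, 1500.000))  # about 13.3 ppm off
    print(accepted(30.0), accepted(45.2))
```

Because the tolerance is relative, the absolute mass window widens with peptide mass: at 10 ppm it is about ±0.015 Da for a 1500 Da peptide.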
2.5. AR9 RNA purification and RT-PCR
For RNA purification, cells were grown to OD595 of 0.6 and infected with AR9 at an MOI of 10. Infection was stopped at various times by rapid chilling. Cells were collected by centrifugation at 5000g for 15 min and used for total RNA purification with the hot-phenol method (Sambrook and Russell, 2006). For cDNA synthesis, 5 μg of total RNA was reverse-transcribed in the presence of random hexamer primers with 100 U of Maxima enzyme (Invitrogen) according to the manufacturer’s protocol. The resulting cDNA was used for amplification with gene-specific primer pairs.
3. Results
3.1. General features of the AR9 genome
When DNA extracted from highly purified AR9 virions was subjected to Illumina sequencing using a standard protocol, no phage DNA sequence was produced. Instead, the reads corresponded to B. subtilis host genomic sequences and yielded the host genome sequence at high coverage. Because AR9 is a transducing phage, the host DNA must have been packaged into the virions and preferentially sequenced, presumably because phage genomic DNA contains uracil instead of thymine. A special procedure was therefore used to sequence the AR9 DNA (see Materials and Methods). Reads were assembled into a single linear contig of 251,042 bp. Primer extension reactions towards the ends of the assembled genomic sequence were performed. The results indicated that both ends of the genome harbor extensive (more than 1 kbp) direct repeats (data not shown) whose boundaries were not mapped precisely. On the nucleotide level, the AR9 sequence has only one extensive region of similarity to sequences available in databases. This region is 100% identical to the only region of PBS1/2 that has been sequenced and deposited in databases, a 255 bp-long uracil glycosylase inhibitor (UDI) gene and 465 bp of flanking sequences (IPR024062, 1udi).
292 AR9 open reading frames (ORFs) were predicted to encode phage proteins, with 182 predicted genes (62.6%) oriented rightward and the remaining ones oriented leftward (Fig. 1). Genes transcribed from different strands were distributed along the genome in an apparently random way. Several transcription terminators could be predicted, mostly in areas separating convergently transcribed genes. The predicted AR9 genes were numbered consecutively, from left to right (each gene name included the prefix “g” followed by a number). The mean gene density per kilobase of the AR9 genome is 1.167, which is typical of phages (Lucks et al., 2008). About 44% of the predicted AR9 proteins have no detectable homologs in protein sequence databases. Most of the AR9 gene products (gps) are unique, but 7 families of paralogs were detected, including 6 unique families with no detectable homologs. Most of these families appear to have evolved by tandem duplication. Since the similarity between paralogs rarely exceeds 30% amino acid identity, the duplication events were not recent. The largest AR9 paralog family includes 6 products of adjacent genes (although some of them are probably fragmented), namely gp030, gp031, gp032, gp033, gp034, and gp035, which are alpha/beta proteins according to the secondary structure prediction. Only one tRNA gene (tRNA-Asn), located between g243 and g244, was identified.
Fig. 1.

The AR9 genome. The 251,042 bp AR9 genome is schematically presented. ORFs are indicated as arrows colored according to functional predictions, the direction of an arrow shows the direction of transcription. Light green – genes coding enzymes of nucleotide metabolism; dark green – genes coding proteins involved in DNA replication and repair; red - virion protein genes; yellow - genes for predicted RNAP subunits; yellow arrows with red outline indicate virion RNAP subunits genes. Introns are labeled with the orange-colored font. Overrepresented motifs that likely function as early promoters are indicated as blue flags.
Shine-Dalgarno sequences could be identified for all predicted ORFs. Predicted start codons included AUG (88.7%), UUG (8.6%), and GUG (2.7%), consistent with B. subtilis start codon usage frequencies (Kunst et al., 1997). The overall G+C content of the phage genome is 27.75%, which is significantly lower than that of the host (43.51%). No significant difference in the G+C content between oppositely transcribed units was observed; 81 AR9 genes have significantly higher than average G+C content (up to 43.3% for g123, which encodes a hypothetical protein; see also below).
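The per-gene G+C comparison described above can be sketched as follows. This is an illustrative calculation only: the criterion used here (G+C above the genome average by an optional margin) is an assumption, since the text does not state how "significantly higher" was determined, and the function names and toy sequences are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T, U)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGTU")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def genes_above_average(genes: dict, genome_gc: float, margin: float = 0.0) -> list:
    """Names of genes whose G+C percentage exceeds genome_gc + margin.

    `margin` is a hypothetical stand-in for a significance criterion.
    """
    return sorted(name for name, seq in genes.items()
                  if gc_content(seq) > genome_gc + margin)


if __name__ == "__main__":
    # Toy sequences; 27.75 is the genome-wide G+C percentage reported in the text.
    toy_genes = {"gA": "GGCCAT", "gB": "ATATAT", "gC": "ATGCATAT"}
    print(genes_above_average(toy_genes, genome_gc=27.75))
```

Counting U alongside T lets the same function handle both DNA and transcript sequences.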
The annotated nucleotide sequence of AR9 has been submitted to the GenBank database under the accession number KU878088.
3.2. Proteins of the AR9 virions
SDS-PAGE of purified phage particles and mass spectrometric analysis of protein bands were used to catalogue AR9 virion proteins (Fig. 2). At least 43 virion components were identified (Fig. 2). While some of these are homologous to other phage proteins with assigned functions, such as the contractile tail sheath structural protein gp126 and the portal vertex protein gp128, most have no identifiable homologs. Proteins homologous to AR9 virion components gp117, gp205, gp213, gp218, gp251, gp259, and gp281 have been identified in phiR1–37 virions, and a homolog of gp214 has been identified in the virions of OBP (Skurnik et al., 2012; Cornelissen et al., 2012). The functions of these proteins are unknown, but based on our experimental data (Fig. 2), we postulate that gp117, which forms a major band on SDS gels, is a precursor of the major head subunit.
Fig. 2.

Identification of AR9 virion proteins. Proteins from purified AR9 particles were separated by SDS-PAGE in 12% acrylamide gel and stained with Coomassie blue. The molecular weights (in kDa) of marker (M) proteins are indicated on the left. The numbers on the right indicate genes whose products (gp’s) were detected in individual gel bands by proteometric analysis. Gene products found in multiple bands on the gel are indicated by colored fonts.
Comparisons of tryptic maps and predicted translated sequences showed that, similar to other phages, some AR9 virion proteins are subject to proteolytic processing (Thomas et al., 2010). Polypeptides matching gp189, gp117, gp216, and gp214 were identified in multiple SDS-gel bands (labeled in different colors in Fig. 2), so the corresponding proteins apparently contain multiple sites of proteolysis. For gp276, gp128, gp251, gp057, gp145, gp096, gp114, gp077, and gp115, mass peaks matching N-terminal tryptic peptides were not detected by mass spectrometry, suggesting that these proteins may also be subject to controlled proteolysis/maturation.
AR9 gp114 is homologous to the bacteriophage T4 gp21 zymogen, the protease that degrades the scaffold and cleaves most of the other head proteins, including gp23 and gp24, making room in the prohead cavity. Thus, gp114 is predicted to function as the assembly protease. Another virion protein, gp051, belongs to the SprT-like peptidase family and could also participate in the cleavage of structural proteins (Rao and Black, 1985).
Virion assembly is often aided by chaperonins. The gp228 of AR9 is a putative GroEL chaperone. Genes encoding similar proteins are found in many Myoviridae genomes. The best-studied homolog is phage T4 gp31, which is essential for the folding of the T4 major capsid protein (gp23) and its insertion into a growing T4 head (Linder et al., 1994).
24 virion proteins are encoded by genes with higher than average G+C content. A tendency of the G+C content of phage structural genes to match that of the bacterial host more closely has been noted previously, and it was hypothesized that this mimicry could help increase the level of production of virion proteins, which have to be abundant (Lucks et al., 2008). Many of the AR9 virion protein genes with higher than average G+C content form clusters (Fig. 1), suggestive of operon formation and/or coregulation.
3.3. Predicted functions of AR9 gene products
Sequence similarity searches allowed prediction of at least a general function for over 30% of AR9 proteins as described below.
3.3.1. AR9 RNA polymerases
Our main impetus to sequence the AR9 genome was to determine whether it is related to phiKZ, which encodes two unusual multisubunit RNAPs. Indeed, the AR9 genome encodes a full complement of RNAP-like proteins found in other phiKZ-related phages. These include gp078, C-terminal portion of the RNAP β subunit; gp089, C-terminal portion of β; gp105, N-terminal portion of β; gp145, N-terminal portion of the β′ subunit; gp154, C-terminal part of β′; gp189, C-terminal portion of β; gp264, C-terminal part of β′, and gp270, N-terminal portion of β′.
In all experimentally characterized phiKZ-related phages, one set of RNAP subunits is found in the virion (Mesyanzhinov et al., 2002; Thomas et al., 2010; Skurnik et al., 2012). Proteometric analysis of AR9 virions revealed the presence of gp078, gp145, gp189, and gp264. Together, these proteins correspond to full-length β and β′-like subunits (Fig. 3). Thus, these proteins form the AR9 virion RNAP (vRNAP). The RNAP-like part of gp078 is fused to an N-terminal HD domain, a predicted nuclease. This domain appears to be absent from gp078 present in AR9 virions since i) no corresponding tryptic peptides were detected by mass spectrometry (while 14 unique peptides that cover 24% of the rest of gp078 sequence were detected) and ii) the SDS gel mobility of gp078 from virions corresponds to a Mr of 50 kDa compared to the calculated 60 kDa of full-sized protein.
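The peptide-coverage argument used above for gp078 (detected peptides cover part of the protein but none map to the N-terminal domain) can be sketched as a simple calculation. The function names and the toy sequences are hypothetical, and matching is by exact substring, which is an assumption that ignores modified residues.

```python
def covered_positions(protein: str, peptides: list) -> list:
    """Boolean mask marking residues covered by at least one peptide match."""
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            # Continue searching so repeated occurrences are also marked.
            start = protein.find(pep, start + 1)
    return covered


def sequence_coverage(protein: str, peptides: list) -> float:
    """Percentage of residues covered by the detected peptides."""
    if not protein:
        raise ValueError("empty protein sequence")
    return 100.0 * sum(covered_positions(protein, peptides)) / len(protein)


def uncovered_n_terminus(protein: str, peptides: list) -> int:
    """Number of N-terminal residues preceding the first covered residue."""
    mask = covered_positions(protein, peptides)
    return mask.index(True) if True in mask else len(protein)


if __name__ == "__main__":
    toy_protein = "MKTAYIAKQR"    # toy sequence, not gp078
    toy_peptides = ["YIA", "KQR"]  # hypothetical detected peptides
    print(sequence_coverage(toy_protein, toy_peptides))
    print(uncovered_n_terminus(toy_protein, toy_peptides))
```

A long uncovered N-terminal stretch is only suggestive on its own; in the text it is the agreement with the lower apparent molecular weight on the gel that supports the absence of the domain.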
Fig. 3.

Comparison of virion and non-virion RNAP subunits of phiKZ-related phages with Thermus thermophilus β and β′ RNAP subunits. On the left, the AR9 and Thermus thermophilus (T.th) RNAP subunits β (A) and β′ (B) are shown as arrows; evolutionarily conserved sequence regions are labeled according to the Lane and Darst nomenclature: 1–16 for β and 1–20 for β′. Active center regions are shown as yellow ovals. The sequence of gp078 was trimmed to remove the N-terminal nuclease domain. Phylogenetic trees of the β and β′ subunits from all phiKZ-related phages, constructed by the neighbor-joining method, are shown on the right. The AR9 subunits are highlighted by red underline. The sequences of the β and β′ subunits from T.th were selected as an outgroup.
Other AR9 proteins homologous to cellular RNAP largest subunits must form the non-virion RNAP (nvRNAP). Recently, a nvRNAP encoded by the phiKZ phage was purified. Besides subunits corresponding to bacterial RNAP β and β′, an additional subunit, gp68, was present in the enzyme (Yakunina et al., 2015). The phiKZ gp68 shows no sequence similarity to proteins of known function but is homologous to AR9 gp226, suggesting that the latter is also a subunit of the AR9 nvRNAP. Sequences of the homologs of bacterial RNAP subunits β and β′ from all 16 sequenced phiKZ-related phages were compared and several sequences were corrected, mainly in intron-flanking areas (see below) and in the N-terminal parts of the proteins (Fig. 3). To ascertain the evolutionary conservation and functionally identify the β- and β′-like subunits of AR9, we compared the sequences of the phiKZ-related phage RNAP subunits with those of the corresponding subunits of Thermus thermophilus RNAP and localized the universally conserved RNAP regions (Figs. S3–S6) (Lane and Darst, 2010). All the major regions of conservation, including the amino acid motifs that comprise the catalytic center, were detected in the giant phage sequences, indicating that, notwithstanding the split genes for both largest subunits, each of these phages encodes two functional RNAPs. The presence of homologs of both RNAPs of phiKZ-related phages is a diagnostic feature that clearly identifies AR9 as a new member of the group, the first one infecting a Gram-positive host. The N-terminal HD nuclease domain fused to gp078, the C-terminal fragment of vRNAP β, is a unique feature of AR9. The topologies of the phylogenetic trees for the β and β′ subunits are fully congruent and compatible with the simple evolutionary scenario in which the vRNAP and nvRNAP emerged via two gene duplications in an ancestral phage genome (Fig. 3).
Given the different positions of the splits in the β subunits of vRNAP and nvRNAP, it appears most likely that the genes were split after the duplications, conceivably due to enhanced intron mobility in the ancestral phage. In contrast, the vRNAP and nvRNAP β′ subunits are split in the same area, which may imply that they were split before the duplication of the ancestral gene.
The genomes of all phiKZ-related phages contain multiply repeated conserved motifs in front of predicted or, in the case of phiKZ, validated early genes (Ceyssens et al., 2014; Cornelissen et al., 2012). We selected 200 bp regions upstream of all predicted AR9 genes and searched for overrepresented sequence motifs using the MEME program (Bailey et al., 2009). Only one motif, a highly conserved A-T-rich sequence, was identified (Figs. 1 and 4). This motif occurs 34 times in AR9 intergenic regions. This consensus is not detectable in other phiKZ-related phages, which, however, possess distinct AT-rich motifs of their own (Fig. 4). One gene located downstream of this recurrent motif, g104, encodes the uracil-DNA-glycosylase inhibitor UDI, which has been identified as an early gene in PBS1 (Friedberg et al., 1975). The A-T-rich motif thus likely defines the AR9 early promoter recognized by vRNAP. Nine other genes located downstream of putative early promoters encode proteins involved in DNA metabolism or replication, namely DNA gyrase subunit B (gp024), AAA ATPase (gp028), PhoH-like protein (gp074), AAA ATPase (gp083), DNA polymerase exonuclease subunit (gp132), DNA polymerase elongation subunit (gp152), ribonuclease H (gp241), Hef-like homing endonuclease (gp265), and the N-terminal fragment of nvRNAP β′ (gp270). Products of two genes located downstream of putative early promoters, gp057 and gp214, were detected in AR9 virions. Interestingly, one early promoter motif is located in front of the HNH homing endonuclease gene 190 but inside the coding sequence of the interrupted gene 189 (see below).
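The extraction of 200 bp upstream regions that precedes the motif search can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: coordinates are taken to be 1-based and inclusive with start ≤ end, the genome is treated as linear (regions are truncated at the ends), and the function names and the toy genome are hypothetical. The resulting sequences would then be written to a FASTA file and passed to a motif finder such as MEME.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome: str, start: int, end: int, strand: str,
                    length: int = 200) -> str:
    """Sequence of up to `length` bp immediately upstream of a gene.

    start, end: 1-based inclusive gene coordinates on the top strand (start <= end).
    strand: '+' for rightward genes, '-' for leftward genes.
    The result is always given in the gene's own 5'-to-3' orientation.
    """
    if strand == "+":
        lo = max(0, start - 1 - length)
        return genome[lo:start - 1]
    if strand == "-":
        hi = min(len(genome), end + length)
        return reverse_complement(genome[end:hi])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy_genome = "AACCGGTTAC"  # toy sequence, not AR9
    print(upstream_region(toy_genome, 5, 6, "+", length=3))
    print(upstream_region(toy_genome, 5, 6, "-", length=3))
```

For leftward genes the upstream region lies to the right of the gene on the top strand, so it is reverse-complemented to keep all regions in the same orientation relative to the direction of transcription.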
Fig. 4.

Putative early (vRNAP) promoter consensus sequences of phiKZ-related phages. Consensus sequences for overrepresented motifs from intergenic regions of indicated phages are shown.
13 putative early promoters are located in intergenic areas separating 22 oppositely transcribed AR9 gene clusters. Thus, in these intergenic regions, an oppositely oriented promoter of a different temporal class should be present. The remaining 9 intergenic areas are each expected to contain two oppositely oriented promoters of middle or late expression classes. However, we failed to identify any candidate promoter motifs in these areas.
3.4. DNA replication and repair
A number of AR9 genes encode proteins involved in DNA replication. The AR9 family B DNA polymerase is split, as in several other phages (Petrov et al., 2010), including the Luz24-like phages of P. aeruginosa and the phiEco32 phage of E. coli (Savalia et al., 2008), and consists of two subunits: gp132, containing a 3′–5′ exonuclease domain and a part of the elongation domain, and gp152, containing the rest of the elongation domain. There is no AR9 gene encoding a recognizable homolog of the 5′–3′ exonuclease subunit of the PIN superfamily, a common domain of PolA, cellular DNA polymerase I.
The synthesis of DNA during AR9 replication is likely initiated from RNA primers made by gp225, a predicted DnaG-like primase. Gp241 is an RNase H, which is likely responsible for the removal of primers. The best-studied phage RNase H, encoded by the T4 phage, degrades both RNA·DNA and DNA·DNA duplexes from their 5′ termini, producing short oligonucleotide products (Hollingsworth and Nossal, 1991). This process requires the single-strand DNA-binding (SSB) protein (gp32 in T4), which binds DNA behind RNase H and converts it into a processive exonuclease (Gangisetty et al., 2005); the SSB is required also during the elongation stage of DNA replication. However, we were unable to identify an SSB homolog in the AR9 genome suggesting that the phage either uses the host SSB or encodes an SSB that is either extremely diverged or not homologous to known SSB proteins. Gp096 is an NAD-dependent DNA-ligase that may ligate the Okazaki fragments. Another key protein implicated in AR9 DNA replication is gp007, a distant homolog of the bacterial replicative helicase DnaB. Homologs of this protein are encoded by all sequenced phiKZ-related phages.
DNA hairpin structures can inhibit DNA replication and are also intermediates of recombination. Several AR9 genes encode proteins implicated in DNA hairpin resolution. These include gp209, a Holliday junction resolvase of the RusA family; gp233, a PD-[DE]xK superfamily endonuclease distantly related to phage endonuclease I, the enzyme involved in the processing of four-way Holliday DNA junctions; gp086, a UvsX-like recombinase that has been characterized in detail in T4 phage (Liu et al., 2013; Ando and Morrical, 1998); gp004, the DNA processing chain A protein DprA; and gp099, a non-homologous end joining protein of the Ku70/Ku80 family. Most of these proteins are encoded in the genomes of other phiKZ-related phages (Table S1). Similar to most other phiKZ relatives, AR9 encodes two subunits of the ATP-dependent dsDNA exonuclease repair complex SbcCD: the MPP (metallophosphatase) family nuclease SbcD (gp002) and the ATPase subunit SbcC (gp139). The SbcCD complex acts as a 3′–5′ double-stranded exonuclease that can open up hairpin structures. AR9 also encodes DNA gyrase subunits A (gp044) and B (gp024). The specific functions of phage gyrases remain uncharacterized.
Gp016 is a member of the RecD/TraA family of superfamily I helicases. Gp156 is a close homolog of a phiR1–37 protein annotated as helicase subunit of a type III restriction enzyme likely involved in host DNA degradation (Skurnik et al., 2012).
Several AR9 DNA-binding proteins could be predicted. Gp014, gp027, gp149, gp249, and gp257 are helix-turn-helix domain proteins. Gp103 is a homolog of the bacterial DNA-binding protein HU known as the integration host factor. Gp041, gp042, gp063, gp085, gp136, gp141, gp172, and gp183 contain a zinc finger domain that could be involved in DNA or RNA binding.
3.4.1. Nucleotide metabolism
Several AR9 genes encoding enzymes of nucleotide metabolism were identified, in particular a guanylate kinase (gp238) and a CMP/dCMP deaminase (gp221). Homologs of gp221 are also encoded by thymidine-containing genomes of some T4-like phages. Thus, gp221 could balance the levels of dUTP and dCTP during phage reproduction rather than supplying deoxyuridine precursors for incorporation in the AR9 DNA. Gp222, gp223, and gp224 are, respectively, ribonucleotide-diphosphate reductase (RNR) subunit beta and alpha, and flavodoxin co-factor. These three proteins are predicted to form a class I RNR complex, which catalyzes the biosynthesis of deoxyribonucleotides from the corresponding ribonucleotides ensuring a supply of precursors for phage DNA synthesis. Most phages encoding RNR enzyme complexes belong to Myoviridae and class I is the most common RNR class observed in phages (70%) (Dwivedi et al., 2013). Gp174 is highly similar to Bacillus-type 3′–5′ oligoribonuclease. This enzyme could degrade short RNAs into mononucleotides, supplying nucleotides for AR9 RNA synthesis via a salvage pathway. Gp104 is identical to the uracil-DNA-glycosylase (UDGase) inhibitor UDI from PBS1/2, as mentioned above.
3.4.2. Host lysis
AR9 encodes a large and diverse set of proteins that could be involved in host lysis. These include six proteins implicated in cell wall lysis, including three distinct N-acetylmuramyl-l-alanine amidases, gp020 (similar to a cell wall hydrolase involved in spore germination), gp195 (an endolysin, contains the canonical N-terminal glycoside hydrolase domain and a C-terminal cell wall binding LysM domain), and gp272 (in which the catalytic domain appears to be inactivated). All three proteins are most closely similar to homologs from firmicutes, suggesting acquisition of the respective genes from bacterial hosts. Two other predicted host lysis proteins are gp025, a member of the mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase family, and gp049, a predicted peptidoglycan-binding protein containing the LysM domain. Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase and the homologous flagellar protein J (FlgJ) both hydrolyze peptidoglycan, facilitating flagellar rod assembly (Nambu et al., 1999). Given that AR9 is a flagellar-specific phage, gp025 could be involved in peptidoglycan hydrolysis facilitating the injection of AR9 DNA after phage attachment to the host cell (Raimondo et al., 1968). However, gp025 was not detected among the virion proteins, suggesting that this protein is either present in phage particles in low abundance or that it performs a different function. Another protein implicated in host lysis is gp067, a homolog of poly-γ-glutamate hydrolase P (PghP) of B. subtilis phage ϕNIT1, which hydrolyzes the γ-glutamyl peptide linkage of extracellular poly-γ-glutamate produced by some Bacilli (including B. subtilis 168) and facilitates infection and growth of the phage by allowing infection of encapsulated cells formed at high salt concentration (Kimura and Itoh, 2003). The gp196 protein is a homolog of N-acetylmuramoyl-l-alanine amidases (cd06523), which include bacteriophage endolysins that show potent lytic activity toward Bacilli.
Finally, gp082 is a potential holin that might be involved in phage exit from the cell. Other potential candidates for holin function are gp008, gp047, gp048, gp066, and gp248, which are small hydrophobic proteins with multiple transmembrane domains. The large number of AR9 proteins implicated in cell lysis might be specifically advantageous to a phage that infects spore-forming bacteria.
3.5. The AR9 introns
AR9 encodes 7 predicted homing endonucleases from two distinct families, namely the HNH family (gp190, gp271) and the Hef (VSR) family (gp079, gp146, gp147, gp188, and gp265). Four AR9 genes are interrupted by endonuclease genes (Fig. 1). Gene 145, which encodes the N-terminal fragment of vRNAP β’, contains two closely spaced homing endonuclease genes. Some AR9 genes also appear to contain frame-shift mutations, as suggested by the analysis of their polypeptide product sequences. Closer inspection revealed the presence of sequences similar to group I introns flanking the homing endonuclease genes. Regions around frame-shift sites without homing endonuclease genes also contain group I intron-like sequences (Gardner et al., 2009). Most of the putative AR9 introns are similar to well-characterized introns in the large ribonucleotide reductase NrdE subunit gene of Staphylococcus phage Twort (which carries the HNH endonuclease I-TwoI; Landthaler and Shub, 1999) and in the small ribonucleotide reductase subunit NrdB gene of T4 (which carries the HNH endonuclease I-TevIII; Sjöberg et al., 1986). Such introns are widespread in bacteria and phages, including all known phiKZ-related phages. For example, gene 179 of phage phiKZ encodes an HNH endonuclease and is located inside an intron inserted in the N-terminal fragment of the vRNAP β’ subunit gene 180; genes 312 and 313 of phage phiR1–37 likely encode a single protein that is produced from an mRNA that arises after splicing of an unannotated intron; genes 273 and 274 of phage 201phi2-1 are separated by an annotated intron and likely also encode a single protein. A putative second intron sequence interrupting g145 of AR9 (N-terminal fragment of vRNAP β’) is a group II intron (Table 1). A similar validated intron carrying a transposase gene is located in the Bacillus sp. EA1 recA gene (Ng et al., 2007). The g145 intron, however, contains a Hef family endonuclease gene.
Two introns in g044, one in g078, one in g189, two in g205 and one in g270 appear to be unrelated to known group I or group II introns according to the Rfam prediction (Table 1).
Table 1.
Intron positions in the AR9 genome and similarities of intron sequences to other known sequences.
| Gene carrying the intron (product; exon coordinates) | Intron coding gene | Rfam prediction | Similar sequences |
|---|---|---|---|
| g044, DNA gyrase subunit A (35264..35482, 35837..35934, 36293..38405) | – | –; – | Similar sequences, although not annotated as introns, are present in the Cryptosporidium varanii 18S ribosomal RNA gene and many bacterial 16S rRNA genes. |
| g078, HD family phosphohydrolase domain fused to DNA-directed RNAP beta subunit, C-terminus (62476..63717, 64046..64153, 65371..65571) | gp079, putative endonuclease (64355..65179); – | Group I; – | Intron sequences of many Bacillus and Staphylococcus phages, experimentally characterized introns of Twort nrdE and T4 nrdB |
| g114, prohead core scaffold protein and protease (88502..89080, 89426..89653) | – | Group I | Intron sequences of many phages, experimentally characterized introns of phage Twort nrdE and T4 nrdB, a sequence in g179 of phage phiKZ |
| g121, terminase large subunit (94802..95404, 95694..95721, 96034..96078, 96399..96473, 96793..98003) | gp122, hypothetical protein (95827..95946); – | Group I; Group I; Group I; Group I | Intron sequences of many phages, including experimentally characterized introns of phage Twort nrdE and T4 nrdB, and a sequence of g273-g274 intron of phage 201phi2-1 |
| g145, DNA-directed RNA polymerase beta’ subunit, N-terminus (118211..118372, 120262..120446, 121999..123163) | gp146, putative homing endonuclease (118698..120182) | Group I | Intron sequences of many phages, including experimentally characterized introns of Twort nrdE and T4 nrdB |
| | gp147, putative homing endonuclease (120534..121442) | Group II | Similar sequences, although not annotated as introns, are present in Clostridium perfringens, Bacillus thuringiensis and other bacterial and plasmid genomes. Experimentally characterized group II intron in Bacillus sp. EA1 recA gene |
| g189, DNA-directed RNA polymerase beta subunit, N-terminus (149146..150057, 150356..150577, 150931..151108, 152162..152172, 152508..152585, 152969..155764) | gp190, HNH homing endonuclease (151338..152015); – | Group I; Group I; –; Group I; Group I | Intron sequences of many phages, including experimentally characterized introns of Twort nrdE and T4 nrdB and a sequence in g312-g313 of phage phiR1–37. Intron sequences of Staphylococcus and other phages and Bacillus sp. including AR9 host B. subtilis 168 (nrdF gene for ribonucleoside-diphosphate reductase small subunit). |
| g205, virion protein (173106..174131, 174513..174572, 175055..175441) | gp206, hypothetical protein (174592..174735); – | –; – | Intron sequences of many phages, including experimentally characterized introns of Twort nrdE and T4 nrdB and sequence in g179 of phage phiKZ and g334-g336 of phage phiR1–37 |
| g270, DNA-directed RNA polymerase beta’ subunit, N-terminus (225343..225504, 226402..227520) | gp271, HNH homing endonuclease (225738..226292) | – | Intron sequence of Vibrio phage 11895-B1 g00138 (no experimental validation) |
To validate predicted AR9 introns, PCR reactions were performed with appropriate gene-specific primer pairs on first-strand cDNA prepared using hexameric primers and total RNA isolated from AR9-infected cells collected at different time points of infection. In all cases, the sizes of PCR fragments amplified from cDNA templates were smaller than the sizes of the corresponding fragments amplified from the genome (Fig. 5(A)), thus demonstrating RNA splicing and validating intron predictions. The PCR fragments amplified from cDNA were sequenced, and the exact intron junctions were identified (Table 1). At no time point were PCR products corresponding to unspliced RNA detectable (Fig. S1), indicating that at least under our conditions of infection, the presence of an intron is unlikely to have a strong inhibitory effect on the translation of the intron-containing phage genes.
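The size difference expected between genomic and cDNA amplicons can be estimated directly from the exon coordinates in Table 1. The following is a minimal sketch rather than the procedure actually used: it assumes, purely for illustration, that the primers anneal at the outer ends of the first and last exons (the real primer positions are not given here), and the function name `amplicon_sizes` is hypothetical.

```python
# Exon coordinates (1-based, inclusive) copied from Table 1.
EXONS = {
    "g114": [(88502, 89080), (89426, 89653)],
    "g270": [(225343, 225504), (226402, 227520)],
    "g189": [(149146, 150057), (150356, 150577), (150931, 151108),
             (152162, 152172), (152508, 152585), (152969, 155764)],
}


def amplicon_sizes(exons):
    """Return (genomic_bp, spliced_bp) for a product spanning all exons.

    Assumption: primers sit at the first base of the first exon and the
    last base of the last exon; real primers would give shorter products.
    """
    genomic = exons[-1][1] - exons[0][0] + 1
    spliced = sum(end - start + 1 for start, end in exons)
    return genomic, spliced


for gene, exons in EXONS.items():
    genomic, spliced = amplicon_sizes(exons)
    print(f"{gene}: genomic {genomic} bp, spliced {spliced} bp, "
          f"introns account for {genomic - spliced} bp")
```

Under these assumptions the cDNA product is always shorter than the genomic product by the summed intron length, which is the shift that the gel comparison relies on.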
Fig. 5.

In vivo splicing of AR9 introns. (A) Electrophoretic analysis of PCR products obtained from AR9 genomic DNA and from cDNA synthesized from RNA purified during the infection, using primers that anneal to sequences flanking the predicted introns in phage genes. The sizes of the PCR products are indicated. A 1 kb Plus DNA ladder was used as a marker (M), with sizes (in bp) shown on the right. (B) Schematic representation of the genomic region encoding the most interrupted gene, g189. Below, the spliced mRNA product is shown.
Comparison of sequences of AR9 genes containing predicted introns with the corresponding cDNA-derived sequences showed that in several cases genes are interrupted by very closely spaced introns with short coding sequences in between. The shortest such coding sequence fragment was 11 bp long (positions 152162..152172 in g189; Table 1). The AR9 gene g189, which encodes the N-terminal fragment of vRNAP β, contains five juxtaposed introns (Fig. 5(B)), which is the record for currently known genes containing self-splicing introns. Until now, the gene with the greatest known number of such introns (three) was ORF142 of S. aureus bacteriophage Twort (Landthaler and Shub, 1999).
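The intron counts and the lengths of the coding fragments lying between introns follow from the exon coordinates in Table 1. The sketch below is illustrative only; the helper names are hypothetical, and the coordinates are taken as 1-based and inclusive, which is an assumption about the table's convention.

```python
# Exon coordinates (assumed 1-based, inclusive) copied from Table 1.
G189_EXONS = [(149146, 150057), (150356, 150577), (150931, 151108),
              (152162, 152172), (152508, 152585), (152969, 155764)]
G121_EXONS = [(94802, 95404), (95694, 95721), (96034, 96078),
              (96399, 96473), (96793, 98003)]


def intron_lengths(exons):
    """Length of each gap between consecutive exons."""
    return [nxt_start - prev_end - 1
            for (_, prev_end), (nxt_start, _) in zip(exons, exons[1:])]


def internal_exon_lengths(exons):
    """Lengths of exon fragments flanked by introns on both sides."""
    return [end - start + 1 for start, end in exons[1:-1]]


for name, exons in (("g189", G189_EXONS), ("g121", G121_EXONS)):
    print(name,
          "introns:", len(intron_lengths(exons)),
          "internal exon fragments (bp):", internal_exon_lengths(exons))
```

With these coordinates, g189 has five gaps between exons and g121 has four, and the shortest fragment flanked by introns is the fourth exon of g189.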
Phage phiR1–37, the closest relative of AR9, also encodes several homing nucleases, namely gp151, gp171, gp179, gp230, gp313, and gp336, only some of which are encoded within introns of other genes. For example, the gene coding for phiR1–37 gp313 interrupts the gene encoding the putative DNA gyrase subunit B and is therefore located in an intron, whereas phiR1–37 gp336 is a free-standing endonuclease. Phylogenetically, gp079 and phiR1–37_gp313 are more closely related to each other than gp079 is to gp147, or phiR1–37_gp313 to phiR1–37_gp336. However, the gene contexts of g079 and phiR1–37_g313 are different (Fig. S2). Intron-encoded endonuclease gp146 belongs to the family of Hef-like homing endonucleases. Two of its homologs (gp188 and gp265) are free-standing AR9 nucleases. Two homologous proteins, gp171 and gp230, are also encoded in the phage phiR1–37 genome. PhiR1–37_g171 interrupts the DNA polymerase gene (phiR1–37_gp170/phiR1–37_gp172), whereas phiR1–37_g230 interrupts the gene for the C-terminal fragment of the vRNAP β subunit (phiR1–37_gp231) (Fig. S2). The fact that homologs of AR9 endonucleases encoded by genes located in self-splicing introns are also present in intergenic regions of AR9 and in non-homologous genes of the phiR1–37 genome indicates that these introns are active mobile elements and that intron-encoded endonucleases are capable of introducing DNA breaks at different positions of the genome, contributing to the diversity of the phages in this group.
4. Discussion
In this paper we present the genome of the B. subtilis AR9 phage. Previous studies have suggested that AR9, isolated in the USSR, might be very close to the better known PBS1/2 phages, which were isolated in Japan and have been extensively used for B. subtilis genetic engineering via generalized transduction (Dubnau et al., 1967). Although the complete PBS1/2 sequence is unknown, the only PBS1/2 fragment available in databases (~700 bp) is identical to the corresponding fragment of AR9. This supports the conjecture and suggests that the two phages may be very closely related despite having been isolated at different times and at geographically distant sites.
Our main impetus to determine the AR9 sequence came from the pioneering work of Losick and colleagues, who reported purification of a unique RNAP from PBS2-infected cells (Clark et al., 1974). Based on subunit composition, this enzyme was clearly distinct from both the single-subunit phage RNAPs and cellular multisubunit RNAPs. Much later, independent studies of phages of the so-called “giant phage” group suggested that phages related to P. aeruginosa phage phiKZ encode two sets of unusual proteins with similarities to the largest, catalytic subunits of cellular RNAPs (Mesyanzhinov et al., 2002). The non-virion RNAP of phiKZ has indeed been recently purified and shown to contain 5 subunits and to recognize late phage promoters in vitro (Yakunina et al., 2015). No virion RNAP has been purified from any phiKZ-related phage yet. However, the growth of phiKZ is resistant to the host RNAP inhibitor rifampicin (Ceyssens et al., 2014), suggesting that the phage relies on vRNAP for transcription of its early genes. Except for phiKZ, PBS1/2 is the only known dsDNA phage that is independent of transcription by the host RNAP (Price and Frabotta, 1972).
We hypothesized that the multisubunit PBS1/2 RNAP could be related to one of the RNAPs of phiKZ or its relatives. Analysis of the AR9 genome supports this conjecture and generally shows that AR9 is a clear relative of the phiKZ group phages. For bacteriophages that belong to a single phylogenetic group, a set of core proteins, which are essential for the function of the virus, typically can be identified (Comeau et al., 2007). For most phages, genes encoding core proteins are either clustered together or their relative positions in the genome are preserved between different phages. Among the 292 predicted AR9 proteins, 18 have readily identifiable orthologs in most of the currently sequenced genomes of phiKZ relatives. The very small size of the core gene set of the giant phages is unusual for bacterial viruses but is similar to the relative core size of the giant dsDNA viruses of eukaryotes in the proposed order “Megavirales” (Koonin and Yutin, 2010). The core gene set of phiKZ-related phages includes all RNAP subunits (9 proteins of which 8 comprise the two sets of split large subunits), phosphoesterase, DnaB-like replicative helicase, large terminase subunit, the split family B DNA polymerase, the SbcCD complex ATPase, RNA helicase, ribonuclease H, and only one (excluding vRNAP genes) virion protein (gp205 of AR9).
A comparison of the AR9 genome with its closest sequenced relative, phage phiR1–37, revealed 76 orthologous genes, which are mostly dispersed along the genome, though a few syntenic regions are also present. Both phiR1–37 and AR9 contain deoxyuridine in their DNA instead of thymidine. Both AR9 and PBS1/2 encode the uracil glycosylase inhibitor UDI, which is synthesized early in infection and binds to and blocks the DNA-binding site of UDGase, completely abrogating the enzyme activity (Savva and Pearl, 1995). Surprisingly, phiR1–37 does not encode a predicted UDI (Skurnik et al., 2012). Another giant phage with a genome in which thymine has been replaced by uracil is staphylococcal phage S6. The genome of S6 has not been sequenced yet, although it has been suggested that this phage is related to PBS1 (Uchiyama et al., 2014). Whether S6 encodes a UDI is unknown, but one could expect that there are multiple, unrelated mechanisms of host uracil glycosylase inhibition that remain to be characterized. Like the PBS1/2 UDI, such proteins may find applications in biotechnology.
Both the fast evolution of some genes, especially those encoding structural proteins, and an apparently high rate of horizontal gene transfer contribute to the overall weak genomic conservation between AR9 and other phiKZ-related phages. The AR9 genome contains multiple genes that seem to have been recently transferred from other phage or bacterial genomes. The closest relatives of eight AR9 genes are found in various Bacillus sp. phages unrelated to phiKZ (0305phi8-36, phiNIT1, SPO1, SP10, G, phiGATE, and B4). For many other AR9 gene products, the best hits are to Bacillus and Clostridium proteins, which are likely encoded by prophages. A segment of the AR9 genome encoding gp053-gp062 is similar to a segment of DNA of a Clostridium species that contains mainly genes coding for proteins homologous to virion proteins of the giant B. thuringiensis phage 0305phi8-36 (Thomas et al., 2007).
In the small set of core proteins found in the group of phiKZ-related phages considered in this work, 9 proteins are RNAP subunits, implying that host-independent transcription by multisubunit phage RNAPs is a common strategy unifying the entire group. Biochemical study of these enzymes should provide new insights into the evolution, mechanism, and regulation of transcription. The enzyme purified from PBS2-infected cells by Clark et al. (1974) likely corresponds to nvRNAP. Their enzyme had an estimated molecular weight of 315 kDa and contained five subunits of 80 kDa (P80), 76 kDa (P76), 58 kDa (P58), 53 kDa (P53), and 48 kDa (P48) present in equimolar amounts (Clark et al., 1974; Clark, 1978). A minor form without P53 has also been purified and found to be transcription-competent. Comparing the sizes of the PBS2 RNAP subunits with those of AR9 proteins homologous to RNAP subunits, we predict that P80 is gp089, P76 is gp105, P58 is gp154, and P48 is gp270. Together, these proteins comprise complete counterparts of the bacterial RNAP β and β’-like polypeptides. The phiKZ nvRNAP contains a fifth subunit, gp68, with no detectable similarity to proteins of known function; by analogy, P53 should correspond to AR9 gp226, the ortholog of phiKZ gp68.
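The size-based matching can be illustrated with a short calculation. The sketch below checks that the five reported subunit masses add up to the 315 kDa estimate and derives an approximate mass for gp270 from its exon coordinates in Table 1. Two inputs are assumptions of this sketch, not values from the analysis: an average residue mass of 110 Da (a common rule of thumb) and exon coordinates that include the stop codon.

```python
# Subunit masses (kDa) reported for the PBS2 enzyme (Clark et al., 1974).
PBS2_SUBUNITS_KDA = {"P80": 80, "P76": 76, "P58": 58, "P53": 53, "P48": 48}

# Assumption: rule-of-thumb average mass of one amino acid residue.
AVG_RESIDUE_KDA = 0.110

# g270 exon coordinates (1-based, inclusive) copied from Table 1.
GP270_EXONS = [(225343, 225504), (226402, 227520)]


def estimated_mass_kda(exons):
    """Rough protein mass from the spliced coding length.

    Assumption: the last exon ends with a stop codon, so one codon is
    subtracted before converting codons to residues.
    """
    coding_nt = sum(end - start + 1 for start, end in exons)
    residues = coding_nt // 3 - 1
    return residues * AVG_RESIDUE_KDA


total_kda = sum(PBS2_SUBUNITS_KDA.values())
print(f"Sum of PBS2 subunit masses: {total_kda} kDa")
print(f"Estimated gp270 mass: {estimated_mass_kda(GP270_EXONS):.1f} kDa")
```

Under these assumptions the gp270 estimate falls close to the 48 kDa reported for P48, in line with the assignment above; a sequence-based mass calculation would be needed for anything more precise.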
Multiple homing endonuclease genes have been annotated in phiKZ-related phages and some intron predictions have been made, but no experimental validation has been reported. The experiments described here validate the introns in the AR9 genome and raise interesting questions about intron dynamics in this phage group, given that intron positions vary between phages. The presence of introns within essential genes confirms a common theme seen in many prokaryotic genomes, namely that self-splicing introns insert into highly conserved regions within functionally important genes, particularly those involved in genome replication and gene expression. Targeting functionally important genes could be an evolutionary strategy to maximize spread and to prevent deletion of mobile elements, because highly similar genes and target sites will be present in related genomes (Edgell et al., 2010). The fact that RNAP subunit genes are particularly enriched with introns, including multiple introns, testifies to the functional significance of these enzymes for the development of phiKZ and its relatives. Furthermore, the conspicuous split of all genes coding for large RNAP subunits in all phages from this group may reflect deletions of ancestral introns accompanied by genome rearrangements.
Funding
This work was supported by NIH (GM RO159295), by the Russian Academy of Sciences Molecular and Cellular Biology program Grant to KS and by the Ministry of Education and Science of the Russian Federation, Grant 14.B25.31.0004. EVK and KSM are supported by intramural funds of the US Department of Health and Human Services (to the National Library of Medicine).
Appendix A. Supplementary material
Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.virol.2016.04.030.
Footnotes
Competing interests
None declared.
References
- Ando RA, Morrical SW, 1998. Single-stranded DNA binding properties of the UvsX recombinase of bacteriophage T4: binding parameters and effects of nucleotides. J. Mol. Biol 283, 4.
- Bailey TL, et al., 2009. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res, 37.
- Belyaeva NN, Azizbekyan RR, 1968. Fine structure of new Bacillus subtilis phage AR9 with complex morphology. Virology 34, 1.
- Ceyssens P-J, et al., 2014. Development of giant bacteriophage ϕKZ is independent of the host transcription apparatus. J. Virol 88 (18).
- Clark S, 1978. Transcriptional specificity of a multisubunit RNA polymerase induced by Bacillus subtilis bacteriophage PBS2. J. Virol 25, 1.
- Clark S, Losick R, Pero J, 1974. New RNA polymerase from Bacillus subtilis infected with phage PBS2. Nature 1 (252), 5478.
- Comeau AM, Bertrand C, Letarov A, Tétart F, Krisch HM, 2007. Modular architecture of the T4 phage superfamily: a conserved core genome and a plastic periphery. Virology, 362.
- Cornelissen A, et al., 2012. Complete genome sequence of the giant virus OBP and comparative genome analysis of the diverse phiKZ-related phages. J. Virol 86, 3.
- Dömötör D, et al., 2012. Complete genomic sequence of Erwinia amylovora phage PhiEaH2. J. Virol 86, 19.
- Drozdetskiy A, et al., 2015. JPred4: a protein secondary structure prediction server. Nucleic Acids Res. 43 (W1).
- Dubnau D, et al., 1967. Genetic mapping in Bacillus subtilis. J. Mol. Biol 27, 1.
- Dwivedi B, et al., 2013. A bioinformatic analysis of ribonucleotide reductase genes in phage genomes and metagenomes. BMC Evolut. Biol 13, 1.
- Edgell DR, Gibb EA, Belfort M, 2010. Mobile DNA elements in T4 and related phages. Virol. J, 7.
- Falco SC, Vander Laan K, Rothman-Denes LB, 1977. Virion-associated RNA polymerase required for bacteriophage N4 development. Proc. Natl. Acad. Sci. U.S.A 74, 2.
- Friedberg EC, Ganesan AK, Minton K, 1975. N-Glycosidase activity in extracts of Bacillus subtilis and its inhibition after infection with bacteriophage PBS2. J. Virol 16, 2.
- Gangisetty O, et al., 2005. Maturation of bacteriophage T4 lagging strand fragments depends on interaction of T4 RNase H with T4 32 protein rather than the T4 gene 45 clamp. J. Biol. Chem 280 (13), 12876–12887.
- Gardner PP, et al., 2009. Rfam: updates to the RNA families database. Nucleic Acids Res. 37 (Database).
- Hendrix RW, 2009. Jumbo bacteriophages. Curr. Top. Microbiol. Immunol, 328.
- Hertveldt K, et al., 2005. Genome comparison of Pseudomonas aeruginosa large phages. J. Mol. Biol 354, 3.
- Hollingsworth H, Nossal N, 1991. Bacteriophage T4 encodes an RNase H which removes RNA primers made by the T4 DNA replication system in vitro. J. Biol. Chem 266, 3.
- Kiljunen S, et al., 2005. Yersiniophage ϕR1–37 is a tailed bacteriophage having a 270 kb DNA genome with thymidine replaced by deoxyuridine. Microbiology 151, 12.
- Kimura K, Itoh Y, 2003. Characterization of poly-gamma-glutamate hydrolase encoded by a bacteriophage genome: possible role in phage infection of Bacillus subtilis encapsulated with poly-gamma-glutamate. Appl. Environ. Microbiol 69, 5.
- Koonin EV, Yutin N, 2010. Origin and evolution of eukaryotic large nucleo-cytoplasmic DNA viruses. Intervirology 53, 5.
- Krogh A, et al., 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol 305, 3.
- Krylov VN, et al., 2007. “phiKZ-like viruses”, a proposed new genus of myovirus bacteriophages. Arch. Virol 152, 10.
- Kunst F, et al., 1997. The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature 390, 6657.
- Kwan T, et al., 2006. Comparative Genomic Analysis of 18 Pseudomonas aeruginosa Bacteriophages. J. Bacteriol 188 (3).
- Landthaler M, Shub DA, 1999. Unexpected abundance of self-splicing introns in the genome of bacteriophage Twort: introns in multiple genes, a single gene with three introns, and exon skipping by group I ribozymes. Proc. Natl. Acad. Sci. U.S.A 96, 12.
- Lane WJ, Darst SA, 2010. Molecular evolution of multisubunit RNA polymerases: sequence analysis. J. Mol. Biol 395, 4.
- Lee J-H, et al., 2011. Complete genome sequence of Salmonella bacteriophage SPN3US. J. Virol 85, 24.
- Lee J-H, et al., 2016. A novel bacteriophage targeting Cronobacter sakazakii is a potential biocontrol agent in foods. Appl. Environ. Microbiol 82, 1.
- Linder CH, et al., 1994. A late exclusion of bacteriophage T4 can be suppressed by Escherichia coli GroEL or Rho. Genetics 137, 3.
- Liu J, Berger CL, Morrical SW, 2013. Kinetics of presynaptic filament assembly in the presence of single-stranded DNA binding protein and recombination mediator protein. Biochemistry 52, 45.
- Lucks JB, et al., 2008. Genome landscapes and bacteriophage codon usage. Plos Comput. Biol 4 (2).
- Marchler-Bauer A, et al., 2009. CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res. 37, D205–D210.
- Meczker K, et al., 2014. The genome of the Erwinia amylovora phage PhiEaH1 reveals greater diversity and broadens the applicability of phages for the treatment of fire blight. FEMS Microbiol. Lett 350 (1).
- Mesyanzhinov VV, et al., 2002. The genome of bacteriophage phiKZ of Pseudomonas aeruginosa. J. Mol. Biol 317, 1.
- Monson R, et al., 2011. The Pseudomonas aeruginosa generalized transducing phage phiPA3 is a new member of the phiKZ-like group of “jumbo” phages, and infects model laboratory strains and clinical isolates from cystic fibrosis patients. Microbiology 157 (Pt 3).
- Nambu T, et al., 1999. Peptidoglycan-hydrolyzing activity of the FlgJ protein, essential for flagellar rod formation in Salmonella typhimurium. J. Bacteriol 181, 5.
- Ng B, et al., 2007. Reverse transcriptases: intron-encoded proteins found in thermophilic bacteria. Gene 393, 1–2.
- Petrov VM, Ratnayaka S, Karam JD, 2010. Genetic insertions and diversification of the PolB-type DNA polymerase (gp43) of T4-related phages. J. Mol. Biol 395 (3).
- Price AR, Frabotta M, 1972. Resistance of bacteriophage PBS2 infection to rifampicin, an inhibitor of Bacillus subtilis RNA synthesis. Biochem. Biophys. Res. Commun 48 (6).
- Raimondo LM, Lundh NP, Martinez RJ, 1968. Primary adsorption site of phage PBS1: the flagellum of Bacillus subtilis. J. Virol 2, 3.
- Rao VB, Black LW, 1985. Evidence that a phage T4 DNA packaging enzyme is a processed form of the major capsid gene product. Cell 42, 3.
- Rima BK, van Kleeff BH, 1971. Similarity of Bacillus subtilis bacteriophages PBS1, 3NT and I10. Some remarks on the morphology of phage heads. Antonie Van Leeuwenhoek 37 (3).
- Söding J, Biegert A, Lupas AN, 2005. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res, 33.
- Sambrook J, Russell DW, 2006. Purification of nucleic acids by extraction with phenol:chloroform. CSH Protoc. 2006 (1).
- Sambrook J, Fritsch EF, Maniatis T, 1989. Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
- Savalia D, Westblade LF, Goel M, Florens L, Kemp P, Akulenko N, Pavlova O, Padovan JC, Chait BT, Washburn MP, Ackermann HW, Mushegian A, Gabisonia T, Molineux I, Severinov K, 2008. Genomic and proteomic analysis of phiEco32, a novel Escherichia coli bacteriophage. J. Mol. Biol 377 (3).
- Savva R, Pearl LH, 1995. Nucleotide mimicry in the crystal structure of the uracil-DNA glycosylase inhibitor protein complex. Nat. Struct. Biol 2, 9.
- Schäffer AA, et al., 2001. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 29, 14.
- Severinov K, Mooney R, Darst SA, Landick R, 1997. Tethering of the large subunits of Escherichia coli RNA polymerase. J. Biol. Chem 272, 39.
- Shevchenko A, Jensen ON, Podtelejnikov AV, Sagliocco F, Wilm M, Vorm O, Mortensen P, Shevchenko A, Boucherie H, Mann M, 1996. Linking genome and proteome by mass spectrometry: large-scale identification of yeast proteins from two dimensional gels. Proc. Natl. Acad. Sci. U.S.A 93, 25.
- Sjöberg BM, et al., 1986. The bacteriophage T4 gene for the small subunit of ribonucleotide reductase contains an intron. EMBO J. 5, 8.
- Skurnik M, et al., 2012. Characterization of the genome, proteome, and structure of yersiniophage R1–37. J. Virol 86, 23.
- Steitz TA, 2009. The structural changes of T7 RNA polymerase from transcription initiation to elongation. Curr. Opin. Struct. Biol 19 (6), 171.
- Takahashi I, 1963. Transducing phages from Bacillus subtilis. J. Gen. Microbiol 31, 211–217.
- Thomas JA, et al., 2007. Complete genomic sequence and mass spectrometric analysis of highly diverse, atypical Bacillus thuringiensis phage 0305phi8-36. Virology 368, 2.
- Thomas JA, et al., 2008. Characterization of Pseudomonas chlororaphis myovirus 201ϕ2-1 via genomic sequencing, mass spectrometry, and electron microscopy. Virology 376, 2.
- Thomas JA, et al., 2010. Proteome of the large Pseudomonas Myovirus 201ϕ2–1: delineation of proteolytically processed virion proteins. Mol. Cell. Proteom 9 (5).
- Uchiyama J, et al., 2014. Intragenus generalized transduction in Staphylococcus spp. by a novel giant phage. ISME J. 8, 9.
- Yagubi AI, et al., 2014. Complete genome sequence of Erwinia amylovora bacteriophage vB_EamM_Ea35–70. Genome Announc. 2, 4.
- Yakunina M, et al., 2015. A non-canonical multisubunit RNA polymerase encoded by a giant bacteriophage. Nucleic Acids Res.
- Yamagishi H, 1967. Molecular Weight of Bacteriophage PBS1 Deoxyribonucleic Acid, 1(4).
