Abstract
RB49 is a virulent bacteriophage that infects Escherichia coli. Its virion morphology is indistinguishable from the well-known T-even phage T4, but DNA hybridization indicated that it was phylogenetically distant from T4 and thus it was classified as a pseudo-T-even phage. To further characterize RB49, we randomly sequenced small fragments corresponding to about 20% of the ≈170-kb genome. Most of these nucleotide sequences lacked sufficient homology to T4 to be detected in an NCBI BlastN analysis. However, when translated, about 70% of them encoded proteins with homology to T4 proteins. Among these sequences were the numerous components of the virion and the phage DNA replication apparatus. Mapping the RB49 genes revealed that many of them had the same relative order found in the T4 genome. The complete nucleotide sequence was determined for the two regions of the RB49 genome that contain most of the genes involved in DNA replication. This sequencing revealed that RB49 has homologues of all the essential T4 replication genes, but, as expected, their sequences diverged considerably from their T4 homologues. Many of the nonessential T4 genes are absent from RB49 and have been replaced by unknown sequences. The intergenic sequences of RB49 are less conserved than the coding sequences, and in at least some cases, RB49 has evolved alternative regulatory strategies. For example, an analysis of transcription in RB49 revealed a simpler pattern of regulation than in T4, with only two, rather than three, classes of temporally controlled promoters. These results indicate that RB49 and T4 have diverged substantially from their last common ancestor. The different T4-type phages appear to contain a set of common genes that can be exploited differently, by means of plasticity in the regulatory sequences and the precise choice of a large group of facultative genes.
Approximately 170 bacteriophages with morphologies similar to T4 have been identified (1). These T4-like phages have been isolated on a wide range of bacterial hosts that grow in diverse environments (1, 2, 72). T4, the type phage of this family, is probably the best-understood virulent phage. Its genome has been entirely sequenced, and its life cycle is extremely well understood; however, until recently little was known about all the other T4-type phages.
Genomic hybridization and PCR analysis revealed that the T4-type phages vary considerably in their distance from T4 (54, 62, 72). Based on such data and limited sequence analysis, we can distinguish subgroups of the T4 type (54, 62, 72). The T-even subgroup shares considerable nucleic acid sequence homology with T4; for example, quantitative hybridization between the genomes of T2, T4, and T6 indicates that more than 90% of their sequences are nearly identical (18). As a result of their close relationship to T4, the T-even genomes can usually be analyzed and sequenced using PCR primers based on the T4 sequence (62, 71, 72). Although such comparisons confirmed that most of the T4-type phages were very close to T4, a few of them, such as RB69 (81) and SV14 (54), were clearly chimeras.
The genomes of these phages have blocks of sequence that diverge significantly from T4. The origin of these sequences became obvious with the characterization of the pseudo-T-even phages (54). The members of this subgroup of the T4-type phages (e.g., RB49 and 44rr2.8t) are more diverse in their host range than the T-even phages, and their genomes are phylogenetically distant from T4 (54, 62). Only a few genes of phage RB49, for example, still retain sufficient homology to hybridize with T4 DNA under stringent conditions (54).
The sequencing of the most conserved segments of the RB49 genome revealed that they encoded the structural proteins of the head (gp23 and gp24), the collar (gp20), and the contractile tail (gp18 and gp19) (54). Homologues of most of the structural components of the T4 virion were thought to be present in the pseudo-T-even phages (54). A small number of plasmids containing randomly cloned DNA of RB49 have been previously analyzed (54), and these had two sorts of sequences. A minority contained sequences that lacked homology to any entry in the NCBI database. The majority contained distant homologues to both nonstructural and structural genes of T4.
Aside from their significant divergence in nucleotide sequence, the T-even and the pseudo-T-even phages also differ in the modifications of their DNA. The T-even genomes contain hydroxymethylcytosine in place of cytosine, and these residues are generally glucosylated, which provides additional protection against host restriction systems (12, 29). The DNA of the pseudo-T-even phages does not appear to have these nucleotide modifications (54). Consequently, the pseudo-T-even phages must have evolved a transcription and replication apparatus that was adapted to this difference in the DNA template they use. Furthermore, Southern analysis of several different pseudo-T-even phages revealed that they are as distant from each other as they are from T4 (54). Their phylogenetic distance from T4, and the genome plasticity that this implies, motivated us to further investigate the pseudo-T-even subgroup of phages.
In this communication we have used an efficient partial sequencing strategy to compare the genomes of the T-even phage T4 and the pseudo-T-even phage RB49. Our snapshot of the RB49 genome indicates that it has diverged substantially and relatively uniformly from T4, making it the first phage of this type to be analyzed. In addition to extensive random genomic sequencing, this study involved the targeted sequencing and characterization of two regions that contain the DNA replication genes. Although the genomes of T4 and RB49 share many features, they have important differences. A functional analysis of the RB49 promoters revealed a fundamental difference in the regulation of gene expression in T4 and RB49. RB49 employs only two classes of temporally regulated promoters, rather than the three in T4. The evolutionary processes capable of generating such diversity within a phage family are discussed.
MATERIALS AND METHODS
Phages and bacteria.
The bacteriophages T4 and RB49 were obtained from R. H. Epstein (University of Geneva, Geneva, Switzerland) and K. Carlson (University of Uppsala, Uppsala, Sweden), respectively. The Escherichia coli strain BE was used to prepare phage stocks and as the host for most of the infections with either phage T4 or RB49. E. coli DH5α (Life Technologies) was used for electrotransformation of the RB49 genomic library. The wild-type E. coli strain AC21 and the temperature-sensitive RNase E mutant (rne-3071) strain AC22 (14) were used as hosts to test, by reverse transcriptase 5′-end mapping, the involvement of RNase E in the processing of the gene 32 transcripts.
Primers.
Primers used for the reverse transcription experiments and to obtain the gene 43 and gene 32 PCR fragments are shown in Table 1. A large number of additional primers, not listed here, were used to sequence the gp43 and gp32 regions. The sequences of these primers and of others used to investigate the genomic organization of RB49 by PCR are available on request from the authors.
TABLE 1.
PCR primers used
Primer | Gene | Sequence (5′→3′) | Corresponding figure |
---|---|---|---|
49gp23rev2 | 23 | TCTGCTTAGACGCAGTTGCGATTTC | 5A |
49gp24rev1 | 24 | GCTACTGAGGAAGTAGATTCCCGCA | 5B |
49gp32rev5 | 32 | AAACCGTTGCCACCTTTCAGTGC | 4A |
49gp32rev6 | 32 | GCTCTTGCTTTTGCGGCGTCAATAA | 6B |
49gp43rev1 | 43 | GCGGCATGATGAAATAATGTTGGC | 4B |
49gp61.11 | 61 | AGCATCGCCAGCGGTCATATCTT | 2A |
49gp43.18 | 43 | GACTTTCGTCAAACCACTTGAAGG | 2A |
49gp43.21 | 43 | CATCCATCCCTAGCGCCTCTTG | 2A |
49gp460 | 46 | GCTTGATATTCGTACGCTAGGCG | 2A |
49gp341 | 34 | CTCTGTGCGGTGATAATACGGC | 2B |
49gp321 | 32 | GGCTTTACGCTTAATCAGTTGCC | 2B |
49gpnrdA1 | nrdA | TTTCTCCAGATATTCCACTTCTTCA | 2B |
49gp322 | 32 | TGATTTTGATTCTTGCCCTGTGTG | 2B |
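As a quick sanity check on the oligonucleotides in Table 1, the length and GC content of each primer can be computed directly from its sequence. The sketch below is illustrative only: the helper names `gc_fraction` and `summarize` are ours and are not part of the original methods, and only three of the primers are copied in to keep the example short.

```python
# Three primer sequences copied from Table 1 (subset chosen for brevity).
PRIMERS = {
    "49gp23rev2": "TCTGCTTAGACGCAGTTGCGATTTC",
    "49gp32rev5": "AAACCGTTGCCACCTTTCAGTGC",
    "49gp341": "CTCTGTGCGGTGATAATACGGC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def summarize(primers: dict) -> dict:
    """Map each primer name to (length in nt, GC fraction)."""
    return {name: (len(seq), gc_fraction(seq)) for name, seq in primers.items()}


if __name__ == "__main__":
    for name, (length, gc) in summarize(PRIMERS).items():
        print(f"{name}: {length} nt, GC = {gc:.0%}")
```

GC content and length are only rough guides to primer behaviour; the annealing temperature actually used for the long PCR is the one given in the text.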
Construction of RB49 library.
RB49 DNA was extracted from a phage stock purified by differential centrifugation (54). One volume of Roti-Phenol (Roth) was added to 4 ml of the RB49 stock (10⁹ phages/ml). The sample was immediately centrifuged for 2 min at 4°C at ∼2,500 × g. The supernatant was collected, and one-half volume of phenol and one-half volume of chloroform were added. The samples were immediately centrifuged as before and the supernatant was collected. The phenol extraction was repeated three times. The DNA was precipitated with 0.75 volume of isopropanol (room temperature) and centrifuged for 30 min at 4°C at ∼6,000 × g. The pellet was dried and resuspended in 500 μl of Milli-Q water. Then 1 μg of this DNA was digested with the restriction enzyme Sau3A (New England Biolabs; 4 U/μg of DNA) for 10 min in a final volume of 30 μl under the conditions recommended by the supplier. This digestion yielded heterogeneous fragments with a mean size of less than 1 kb.
A 1/10 fraction of this digestion was then incubated in the presence of 40 U of T4 DNA ligase, its appropriate buffer (New England Biolabs), and 100 ng of pUC19 vector (New England Biolabs) linearized at the BamHI site (according to New England Biolabs) in a final volume of 20 μl. The DNA was then precipitated by adding 1/10 volume of sodium acetate (3.3 M, pH 5) and 2.5 volumes of cold ethanol. The sample was centrifuged for 30 min at 4°C at ∼6,000 × g. The pellet was dried and resuspended in 20 μl of Milli-Q water. E. coli DH5α was electrotransformed with 2 μl of the sample by use of the Gene Pulser device (Bio-Rad) according to the manufacturer's recommendations.
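The cloning strategy relies on the fact that Sau3A cuts at every GATC, leaving 5′-GATC overhangs that are compatible with a BamHI-cut (G^GATCC) vector. A minimal sketch of an in-silico digest follows; the toy sequence is invented for illustration and is not RB49 DNA, and the function names are hypothetical.

```python
SITE = "GATC"  # Sau3A recognition sequence; cleavage occurs 5' of the G


def cut_positions(seq: str, site: str = SITE) -> list:
    """Return the 0-based start index of every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def complete_digest(seq: str, site: str = SITE) -> list:
    """Fragments of a linear molecule after cutting at every site."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, site) + [len(seq)]
    # Skip empty fragments (e.g. when the molecule begins with a site).
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def expected_site_spacing(site_length: int = 4) -> int:
    """Mean distance between sites in random DNA with equal base frequencies."""
    return 4 ** site_length


if __name__ == "__main__":
    toy = "AAGATCTTTTGATCCGGATCA"  # made-up sequence, not phage DNA
    print(cut_positions(toy))
    print(complete_digest(toy))
    print(expected_site_spacing())
```

Under the equal-base-frequency assumption, a four-base site occurs about once every 256 bp, so a short digestion that leaves fragments averaging several hundred base pairs would have cut only a fraction of the available sites. The real base composition of RB49 DNA may shift this estimate.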
Sequencing of the library.
The chimeric plasmids carrying the phage inserts were extracted using miniprep spin columns (Qiagen) according to the manufacturer's protocol. A 1.5-ml volume of an overnight culture was extracted and the plasmid DNA was resuspended in 50 μl of Milli-Q water. The DNA sequencing was performed on 5 μl of this material. The universal and reverse primers were both end-labeled with 25 μCi of [γ-33P]dATP using T4 polynucleotide kinase (New England Biolabs) as recommended by the manufacturer. The plasmid inserts were sequenced from both primers using the USB Thermosequenase cycle sequencing kit (US78500) and the three-deoxynucleoside triphosphate (dNTP) internal label cycle sequencing protocol recommended by the supplier.
Large PCR fragments.
PCR products larger than 3 kb were obtained with the Expand Long Template PCR System kit (Boehringer Mannheim). The reactions were prepared according to the manufacturer's recommendations with 5 μl of an RB49 stock (10⁹ phages/ml) as the template and buffer number 3 of the kit. The reactions were carried out in 0.2-ml microcentrifuge tubes. The procedure involved 30 cycles in a Perkin-Elmer GeneAmp PCR System 2400. The first 10 cycles had a 30-s step at 94°C for denaturation, 10 s at 54°C for annealing, and 10 min at 68°C for extension. This was followed by 20 cycles with 30 s at 94°C, 10 s at 54°C, and 15 min at 68°C.
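The two-stage cycling program can be written down as data, which makes it easy to confirm the cycle count and to estimate the minimum run time. This is a sketch under one stated assumption: only the nominal hold times are summed, so temperature ramps and any initial or final holds (not described in the text) are ignored. The names `PROGRAM`, `total_cycles`, and `hold_seconds` are ours.

```python
# Each stage: (number of cycles, [(step label, hold time in seconds), ...])
PROGRAM = [
    (10, [("denature 94C", 30), ("anneal 54C", 10), ("extend 68C", 10 * 60)]),
    (20, [("denature 94C", 30), ("anneal 54C", 10), ("extend 68C", 15 * 60)]),
]


def total_cycles(program) -> int:
    """Total number of cycles across all stages."""
    return sum(cycles for cycles, _ in program)


def hold_seconds(program) -> int:
    """Sum of nominal hold times; ramp times are not included."""
    return sum(
        cycles * sum(seconds for _, seconds in steps)
        for cycles, steps in program
    )


if __name__ == "__main__":
    print(total_cycles(PROGRAM), "cycles")
    print(hold_seconds(PROGRAM) / 3600, "hours of hold time")
```

The long extension steps dominate the run: with these values the hold times alone add up to 7 h.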
Direct DNA sequencing of PCR fragments.
The protocol we used was developed in this laboratory and is described in detail by Bouet et al. (6).
Bacterial growth and phage infection.
Bacterial growth and phage infection were done by a modification of the protocol of Belin et al. (5). Bacterial strain BE was grown in Luria-Bertani (LB) medium at 30°C with aeration to a cell density of 10⁸/ml, centrifuged, and concentrated fourfold in the same medium. This culture was then mixed with phage to give a final multiplicity of infection of 10 and was incubated at 12°C for 10 min without agitation to allow phage adsorption to the host. Infection was initiated by shifting the culture from 12°C to 30°C while simultaneously providing vigorous aeration. At 2 min after infection the percentage of surviving bacteria was 10 to 20%.
Bacterial strains AC21 and AC22 were grown in LB medium at 30°C with aeration to a cell density of 10⁸/ml and put on ice for 5 min. They were then centrifuged and resuspended in one-quarter volume of the same medium and incubated for 10 min at 43.5°C with aeration. The culture was then placed on ice for 5 min before being mixed with phage (T4 or RB49) to give a final multiplicity of infection of 10. The phages were adsorbed to the host cells by a 10-min incubation at 12°C, and infection was initiated by shifting the culture to 43.5°C.
RNA isolation.
Isolation of total RNA from RB49- and T4-infected cells was performed essentially as described by Hagen and Young (31). At the indicated times after infection, a 4-ml portion of the infected cells was removed for cell lysis and nucleic acid extraction. About 200 to 300 μg of RNA was recovered and resuspended in 200 μl of diethylpyrocarbonate (DEPC)-treated water.
5′-End mapping of transcripts by primer extension.
5′-End mapping of transcripts by primer extension was done by a modification of the protocol of Gutierrez et al. (30). The appropriate primer (5 pmol) was end-labeled with 25 μCi of [γ-32P]dATP using T4 polynucleotide kinase (New England Biolabs) as recommended by the manufacturer. Then 10 μg of RNA isolated from phage-infected cells was incubated for 5 min at 80°C with 0.5 pmol of the labeled primer. The samples were then frozen in dry ice and defrosted gently at 4°C. The primer was extended at an appropriate temperature (42°C for all the reactions except for the gene 32 experiments, where the best results were obtained at 48°C) for 15 min in the presence of 10 U of avian myeloblastosis virus (AMV) reverse transcriptase (Promega), AMV reverse transcriptase buffer (Promega), and 0.5 mM each dNTP. Then an additional 5 U of the enzyme was added to each sample, and the extension reaction was continued for an additional 15 min. The samples were dried in a Speedvac for 30 min and resuspended in 4 μl of Milli-Q water and 4 μl of the sequencing stop solution of the USB Thermosequenase cycle sequencing kit. The samples were then incubated at 95°C for 3 min and loaded on a denaturing polyacrylamide gel (6.5%).
Nucleotide sequence accession numbers.
The nucleotide sequences obtained in this study have all been deposited in the NCBI database. The accession numbers are given in the appropriate figure legends and Table 2.
TABLE 2.
Random sequencing of RB49 genomic libraryᵃ
Identified protein sequence | Segment size, amino acids (% amino acid identity with T4) | Function of T4 proteinᵇ |
---|---|---|
gp1 | 37 (27) | dNMP kinase |
gp50 = gp4 = gp65 | 22 (45) | Head completion; function unknown |
gp53 | 69 (55) | Baseplate, wedge |
Hypothetical protein (gp5-segC) | 50 (50) | Unknown |
gp6 | 103 (67); 130 (63) | Baseplate |
gp7 | 83 (62); 73 (41); 64 (40) | Baseplate |
gp8 | 45 (60) | Baseplate |
gp9 | 28 (75) | Baseplate, long tail fiber attachment |
gp12 | 64 (44) | Short tail fibers |
Wac | 63 (28) | Whisker antigen |
gp13* | 53 (66); 75 (61); 121 (51); 236 (49) | Head completion |
gp17* | 58 (79); 79 (70); 19 (84); 71 (73) | Terminase |
gp18* | 25 (61); 70 (63); 56 (42) | Tail sheath monomer |
gp19* | 47 (68) | Tail tube monomer |
gp20 | 38 (66); 110 (50); 98 (46) | Head portal protein |
gp23 | 110 (60); 27 (81); 68 (75); 57 (68) | Major capsid subunit |
gp24 | 30 (56); 32 (62) | Vertex head subunit |
INH | 50 (50) | Inhibitor of gene 21 protease |
UVS W | 89 (56); 48 (64) | Recombination protein; unknown |
UVS Y | 35 (45); 32 (50) | Recombination protein; helper of UvsX |
gp25 | 21 (66); 32 (39) | Baseplate, wedge, lysozyme |
gp26 | 33 (36) | Baseplate central hub |
gp29 | 49 (20) | Baseplate, determinant of tail length |
gp48 | 47 (46) | Baseplate, tail tube fiber |
gp54 | 37 (56) | Baseplate, tail tube |
Hypothetical protein (gp31-cd) | 41 (36) | Unknown function |
cd | 46 (65) | dCMP deaminase |
gp63 | 54 (51) | RNA ligase; helps tail fiber attachment |
NrdB | 50 (90) | Ribonucleoside diphosphate reductase B |
gp32 | 29 (66) | Single-strand-binding protein |
gp34 | 88 (41) | Proximal tail fiber subunit |
gp38 | 29 (100) | Bacterial receptor recognizing (K3) |
gp52 | 35 (48); 65 (41) | DNA topoisomerase subunit |
RIIA | 31 (43) | Membrane protein |
gp39 | 23 (47) | DNA topoisomerase subunit |
DexA | 29 (41.5) | Exonuclease A |
DAM | 57 (56) | DNA A-methylase |
gp61 = gp58 | 69 (58) | Helicase-primase subunit |
Hypothetical protein (gp58-gp41) | 26 (61) | Unknown |
gp43 | 20 (80); 28 (71) | DNA polymerase |
gp46 | 51 (72) | Recombination protein |
gp47 | 71 (67); 65 (55) | Recombination protein |
SunY = NrdD | 39 (64); 96 (52); 115 (71); 78 (57) | Anaerobic NTP reductase |
Hypothetical protein (NRDC-TK) | 32 (53) | Unknown |
Tk | 42 (61) | Thymidine kinase |
Hypothetical protein (Vs-RegB) | 97 (77) | Unknown |
ᵃ A total of 66 chimeric plasmids with phage DNA inserts ranging in size from approximately 500 to 3,000 bp were sequenced with the reverse and universal primers of pUC18. The larger inserts were not entirely sequenced. The BlastN and BlastX programs were used for the sequence analysis (3, 26). In a number of cases, different segments of the same gene were identified among the clones. Furthermore, some of the clones had multiple, small, and clearly noncontiguous inserts of RB49 DNA. The length (in amino acids) of each of these separate, identifiable phage sequences is indicated and also, in parentheses, the percent amino acid identity with its T4 counterpart. The terminology adopted to designate T4 genes was that used in Molecular Biology of Bacteriophage T4 (39). The RB49 genes were arranged in the order of their homologues on the T4 genome, starting with gene 1. The function of the corresponding gene in T4 is indicated in the right column. These sequences have been deposited in EMBL/GenBank with accession numbers AY051145 to AY051197. The sequences that had already been deposited in the database are indicated with asterisks. About 30% of the sequences had no detectable homology to any known T4 gene, and these are not listed in this table but were also deposited in EMBL/GenBank with accession numbers BH153088 to BH153131.
ᵇ dNMP, deoxymononucleotide phosphate; NTP, nucleoside triphosphate.
RESULTS
Random genomic sequencing of RB49.
To obtain an overview of the genome of phage RB49, 66 chimeric plasmids containing small (<1 kb) inserts of RB49 DNA were analyzed. These sequences represented about 20% of the 170-kb RB49 genome. Although the sequences generally lacked sufficient homology to T4 to be detected in an NCBI BlastN analysis, the BlastX program found that 70% of them encoded polypeptides related to T4 proteins. As shown in Table 2, 46 different RB49 genes were identified as homologous to T4. Most of these RB49 proteins had sequences with between 50 and 70% identity to their T4 homologues. Approximately half of the identified RB49 genes were homologues of the T4 late genes, and the large majority of these encoded the virion proteins. These genes are located in T4 within a contiguous 40-kb segment (Table 2 and Fig. 1). Among the various identified RB49 genes, the structural proteins had the highest levels of homology to their T4 counterparts (Table 2). T4 and RB49 have virtually identical morphology, so this sequence conservation of the virion proteins is not surprising. The numerous and strong interactions between them could place tight sequence constraints on these proteins. As shown in Fig. 1, homologues of various other T4 genes are distributed about the RB49 genome and most, if not all, regions of the RB49 genome contain them.
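Because several genes in Table 2 are represented by more than one sequenced segment, a per-gene identity figure has to combine the segments in some way. The sketch below shows one option, a length-weighted mean; this weighting is our assumption and is not stated to be the method used in this study. The cell strings are copied from Table 2, and the function names are hypothetical.

```python
import re

# Matches entries such as "103 (67)" or "29 (41.5)" inside a Table 2 cell.
SEGMENT = re.compile(r"(\d+)\s*\(([\d.]+)\)")


def parse_segments(cell: str) -> list:
    """Parse a Table 2 cell into (length in aa, percent identity) pairs."""
    return [(int(n), float(p)) for n, p in SEGMENT.findall(cell)]


def weighted_identity(cell: str) -> float:
    """Length-weighted mean percent identity over all segments of a gene."""
    segments = parse_segments(cell)
    total = sum(length for length, _ in segments)
    if total == 0:
        raise ValueError(f"no segments found in {cell!r}")
    return sum(length * pct for length, pct in segments) / total


# A few rows copied from Table 2.
ROWS = {
    "gp6": "103 (67); 130 (63)",
    "gp23": "110 (60); 27 (81); 68 (75); 57 (68)",
    "gp43": "20 (80); 28 (71)",
    "DexA": "29 (41.5)",
}

if __name__ == "__main__":
    for gene, cell in ROWS.items():
        print(f"{gene}: {weighted_identity(cell):.1f}% over "
              f"{sum(n for n, _ in parse_segments(cell))} aa")
```

Short random segments sample only part of a protein, so these values need not match the identity obtained once a gene is sequenced in full.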
FIG. 1.
Organization of T4 and RB49 genomes. The exterior circle (continuous line) and the internal circle (dotted lines and stippled boxes) depict, respectively, the T4 and RB49 genomes that have a similar size. The radial lines indicate the positions of the T4 genes thus far identified by random genomic sequencing as having homologues in RB49. The sequences of the two genes indicated with an asterisk (gene 30 [g30] and nrdA) have been obtained, respectively, from C. Thiemer (personal communication) and C. Monod (EMBL accession number Z78083). Where two nearby sequences in the RB49 genome could be shown by PCR to be linked, the genomic interval between them is indicated by a stippled box. The arrowheads above the stippled box indicate the limits of the PCR fragments. When linkage could not be established because no PCR product was obtained, the corresponding genome interval is indicated by an open gap. The two genomic intervals of RB49 indicated by black boxes were completely sequenced in this study. Additional regions of the RB49 genome have been sequenced, but these are not indicated in this figure and will be published elsewhere (Desplats et al., unpublished data).
The remaining RB49 sequences that we analyzed have either diverged too much to be identified as T4 homologues, or the original T4-like versions of these sequences have been lost in RB49 and replaced by unrelated genes. Either mechanism could explain why several dispensable regions of the T4 genome were so poorly represented among the random RB49 sequences (Fig. 1). For example, no T4 homologues were identified in the region of the RB49 genome between the genes 39 and dam.
Genomic organization of RB49.
The random sequencing indicates that much of the T4 and RB49 genomes evolved from a common precursor. During their descent from this ancestor, the two genomes could either have remained relatively fixed in their organization or they could have undergone rearrangements. To investigate this question we have examined the linkage of various RB49 genes by a PCR technique. To do this, we used pairs of RB49 primers whose homologous sequences are located within 10 kb of each other on the T4 genome. As shown in Fig. 1, only two of these reactions failed to give a PCR product (genes cd and 30; genes 47 and sunY). The simplest explanation for this is that these genes are not located relative to each other as they are in T4. In the remainder of the reactions, the PCR fragments obtained have roughly the same size as would be predicted on the basis of the T4 genome. Thus, the global organization of the RB49 genome is very similar to that of T4. Nevertheless, the numerous small variations from the expected size of the PCR fragments indicate that these genomes have frequently undergone insertions and/or deletions of gene-sized sequences.
Sequencing of the DNA polymerase region of the RB49 genome.
To examine the RB49 genome in greater detail, we focused on two genomic regions and completely sequenced them. First, we sequenced the region that encodes the DNA polymerase (gp43) and its associated proteins in the replication complex: gp41 (helicase); gp44 and gp62 (the clamp loader); gp45 (sliding clamp); and gp46 (exonuclease subunit) (Fig. 1 and Materials and Methods). The RB49 “gene 43” segment (Fig. 1 and 2A) is significantly smaller (11.9 kb) than its T4 counterpart (18.5 kb). Although the NCBI BlastN analysis detected no extended homologies to the T4 DNA sequence, the ClustalX program aligns the T4 and RB49 sequences to reveal numerous small blocks of nucleotide sequence homology (data not shown).
FIG. 2.
Schematic diagrams of the genomic regions containing the principal replication genes of phages T4 and RB49. (A) Diagrams of the T4 and RB49 loci containing the genes encoding the DNA polymerase and the accessory replication proteins. A box proportional to the length of the coding sequence represents each gene in the region. The name of the gene is indicated either inside the corresponding box or just below it. A stippled box indicates a putative RB49 gene lacking homology to all entries in the database. A grey box represents a gene in RB49 with a homologue in T4. The percentage of amino acid identity of the RB49 protein to the corresponding T4 homologue is indicated below the grey boxes. A box with a broken end indicates that the sequence of the corresponding gene is not complete. The black boxes identify the T4 genes that are absent from the RB49 sequence. The thin grey (RB49) and black (T4) lines that connect the boxes depict the intergenic spacer sequences. These lines are differently shaded to indicate that they have little nucleotide sequence homology. The bent arrowheads indicate the positions of the promoters in this region, and they are oriented in the direction of their transcription. The temporal class of each promoter is marked above the arrow (E, early promoters; M, middle promoters; L, late promoters). If this letter is circled, the function of the corresponding promoter in RB49 has been confirmed by 5′-end mapping of the transcripts. The sequence of this region of the RB49 genome was obtained from two different PCR products. The first product was obtained with the primers 49gp460 and 49gp43.21, while the second PCR product was obtained with primers 49gp61.11 and 49gp43.18 (see Table 1 for the primer sequences). The sequences of these oligonucleotides were based on the random clones that were identified in this work as having homology to gene 46, gene 43, and gene 61. These sequences have been deposited in EMBL/GenBank with accession number AF410869.
(B) Diagrams of the gene 32 region of the genome of phages T4, T2, and RB49. The notation is the same as used in the panel above. The T2 gene 32 sequence has been determined (50), but the remainder of the operon has only been characterized by partial sequencing and PCR analysis (47). For the T2 sequence, the small dotted line depicts the intergenic spacer between gene 32 and gene 59, whose nucleotide sequence is completely different from that of T4 (47). The black lines at the extremity of the T2 locus indicate that we have not compared these sequences in T2 and T4. The sequence of this region of the RB49 genome was obtained from two different overlapping PCR products. The first product was obtained with the primers 49gp341 and 49gp321 (see Table 1 for the primer sequences), while the second PCR product was obtained with primers 49gpnrdA1 (based on the RB49 gene nrdA sequence obtained previously by Monod et al. [54]) and 49gp322 (based on a random clone analyzed in this work). These sequences have been deposited in EMBL/GenBank with accession number AF410870.
When translated, however, this segment was found to contain homologues of many of the T4 replication proteins (Fig. 2A). The order of these RB49 replication genes is essentially the same as in T4. Insertions and deletions at several sites account for the difference in the size of the region in the two phages. For example, the adjacent nonessential T4 replication genes dCMP hydroxymethylase (gene 42) and β-glucosyltransferase (gene β-gt) are absent from RB49 (Fig. 2A). These genes encode the enzymes that produce the modified base hydroxymethylcytosine and mediate its glucosylation (12, 29), and their absence is consistent with the lack of nucleotide modifications of the RB49 DNA (54). In the RB49 genome both of these genes have been replaced by a small intergenic sequence. In other instances, nonessential T4 genes have been replaced in RB49 by open reading frames (ORFs) of unknown function (the stippled boxes in Fig. 2). These RB49 ORFs appear to be functional genes since they are preceded by good Shine-Dalgarno sequences.
RB49 replisome proteins.
Only a few replication genes of phages with T4-like morphology have been sequenced (52, 64, 76, 81). Among the closely related T-even phages the amino acid sequences of the replication homologues typically diverge from T4 proteins by less than 5%. However, in the chimeric T4-like phage RB69 (64, 81), the sequences can diverge significantly more than this (e.g., 20%). The replication genes of RB49 are clearly the most divergent T4-type sequences thus far analyzed. Although the RB49 gp43 is the most conserved replication protein, it still has 44% divergence from the T4 sequence. All the RB49 polymerase accessory proteins differ by 50% or more from their T4 homologues (Fig. 2A).
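The identity and divergence percentages quoted here depend on how aligned positions are counted. The following is a minimal sketch, assuming that columns containing a gap in either sequence are excluded from the comparison; the text does not state which convention was used, so this is one reasonable choice among several. The two aligned strings are invented for illustration and are not real gp43 sequences.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of ungapped aligned columns in which both residues match."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # gapped columns are not counted (assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def percent_divergence(a: str, b: str, gap: str = "-") -> float:
    """Complement of percent identity over the same columns."""
    return 100.0 - percent_identity(a, b, gap)


if __name__ == "__main__":
    # Made-up aligned fragments, for illustration only.
    seq_a = "MKEF-YISIE"
    seq_b = "MKDFLYVS-E"
    print(percent_identity(seq_a, seq_b))
    print(percent_divergence(seq_a, seq_b))
```

Counting gapped columns as mismatches, or dividing by the length of the shorter sequence, would give lower identity values for the same alignment.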
Alignment of the T4 and RB49 gp43 proteins (Fig. 3A) shows numerous differences in this key component of the replisome. Highly divergent segments are interspersed among much more conserved sequence blocks. In T4, this protein has both DNA polymerase and proofreading activities and also interacts with the accessory proteins (58, 61). A previous comparison of the T4 and RB69 gp43 protein sequences suggested that these two enzyme functions were separated into a two-domain protein structure (46, 61, 76). More recent x-ray crystallographic studies suggest a much more complex five-domain organization of the protein (40, 77). Our compilation of the three known gp43 variants (T4, RB69, and RB49) confirms the conservation of the diagnostic motifs of B family polymerases (21, 36) (Fig. 3A). In particular, the residues believed to be involved in the active sites (75, 76) are well conserved in all three phages. This analysis also confirms the existence of a nonconserved block of 70 amino acids (residues 482 to 552 in the T4 sequence) previously suggested by the T4 and RB69 comparison (76). Domain deletion and swapping experiments have demonstrated that this segment plays an important role in the replication activity of the protein (76). Our expanded sequence comparison reveals additional segments of the gene whose sequence can vary substantially, for example, the 30-amino-acid segment (152 to 182) located in the N-terminal portion of the protein. No residue in this segment is conserved in all of the gp43 sequences. Eventual comparison of the RB49 gp43 with the RB69 structure and the construction of gp43 chimeras by swapping such nonconserved segments between the different versions of gp43 should define the function of the variable segments.
FIG. 3.
Sequence alignment of the DNA polymerase (gp43) of T4, RB69, and RB49 (A) and sequence alignment of the single-strand-binding protein (gp32) of T4 and RB49 (B). (A) The sequence comparison of the gp43 protein sequence of the phages T4, RB69, and RB49. The sequence of the T4 gp43 protein is presented in the one-letter code, and the RB69 and RB49 sequences are aligned below it. The residues indicated in red are amino acids that are conserved in all three phages, while those in green are conserved in only two of the phages. The residues in black are amino acids that are not conserved. Several landmarks (76) are indicated above the T4 sequence; they are labeled EXO I, II, and III (conserved exonuclease motifs in DNA polymerase) and POL I, II, III, and IV (conserved sequence motifs in B DNA polymerases) (7), also referred to as the polymerase α family (37, 75). The polymerase I, II, III, and IV motifs have been implicated in enzyme activity by mutational studies on the T4 enzyme (61). The black dots below the polymerase and exonuclease motifs identify the residues that when mutated affect these functions (67, 68). The yellow boxes indicate the two most divergent regions of the protein. (B) The sequence comparison of the gp32 protein sequences in phages T4 and RB49. The sequence of the T4 gp32 protein is presented in the one-letter code, and the RB49 sequence is aligned below it. The red letters identify residues that are conserved between the two phages, and the black letters indicate residue divergence. Landmarks are indicated above the T4 sequence, and the corresponding residues are color coded: L1 (NH2 LAST motif), grey box; L2 (internal LAST motif), blue box; I, II, III, IV (region forming four α-helices), uncolored box; ZF (zinc finger motif), brackets. The six tyrosine residues known to be important for the DNA-RNA binding activity of the T4 protein are indicated by green boxes. 
When the tyrosine residue is conserved in the RB49 sequence, the letter is also in a green box, but when another aromatic residue replaces a tyrosine, this is indicated by an orange box. The amino acid residues in black are not conserved.
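The three-colour scheme used for panel A (conserved in all three phages, conserved in only two, or not conserved) amounts to a simple per-column classification of the alignment. A hedged sketch of that logic is shown below; the class labels, function names, and toy sequences are ours and do not reproduce the actual gp43 alignment.

```python
from collections import Counter


def classify_column(residues, gap: str = "-") -> str:
    """Return 'all', 'some', or 'none' for one alignment column.

    'all'  : every sequence has the same residue (red in panel A)
    'some' : at least two sequences share a residue (green in panel A)
    'none' : no residue is shared (black in panel A)
    """
    counts = Counter(r for r in residues if r != gap)
    if not counts:
        return "none"
    top = counts.most_common(1)[0][1]
    if top == len(residues):
        return "all"
    if top >= 2:
        return "some"
    return "none"


def classify_alignment(seqs) -> list:
    """Classify every column of a set of equal-length aligned sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have the same length")
    return [classify_column(column) for column in zip(*seqs)]


if __name__ == "__main__":
    # Invented fragments standing in for T4, RB69, and RB49 rows.
    toy = ["MKEFY", "MKDLY", "MREW-"]
    print(classify_alignment(toy))
```

Runs of consecutive 'none' columns in such an output correspond to the kind of divergent blocks highlighted in the figure.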
Similar observations emerge from the comparative sequence analysis of other replisome proteins such as those of the gp62/gp44 protein complex (data not shown). Some of the conserved motifs in these protein sequences are known to be involved in the ATPase activity of the complex. Sequence comparisons of the other replisome accessory proteins will become much more informative once protein function and structure data become available for them.
Sequencing of the RB49 gene 32 region.
We used similar methods (see Materials and Methods) to identify and isolate the gene 32 region of the RB49 genome (Fig. 1 and Fig. 2B). The single-stranded-DNA-binding protein gp32 plays a central role in T4 DNA replication, recombination, repair, and late transcription (41, 55, 80). Its amino acid sequence is extremely well conserved (>95%) among the numerous T-even phages thus far examined (50, 73). The 3.5-kb RB49 genomic segment that we sequenced extends from the RNase H gene to gene 32 and is slightly larger than the analogous 3.4-kb region of the T4 genome. As with the other RB49 DNA sequences, the gene 32 segment has little homology to the T4 nucleotide sequence, and the relation to the T4 genome becomes obvious only after translation. In addition to gene 32, this segment contains gene 59, which plays an important role in DNA replication, and gene 33, which encodes a transcription factor required for late gene expression (34, 80). As shown in Fig. 3B, RB49 lacks the T4 gene 32.1 but contains a novel large ORF of unknown function inserted between the RNase H and dsbA genes.
The alignment of the RB49 and T4 gp32 sequences reveals 60% amino acid identity (Fig. 2B and Fig. 3B). The T4 protein is organized in a three-domain structure. The amino-terminal domain of the protein is composed of basic residues and contains a LAST motif, (Lys/Arg)3(Ser/Thr)2 (15). This element is believed to be involved in the protein's cooperative binding to DNA (25, 48, 69, 73, 79). The carboxyl-terminal portion of the protein is very acidic (16) and is involved both in the protein's ability to denature double-stranded DNA and in its interaction with the various replication and recombination proteins (9, 10). The central domain of the T4 gp32 contains both a zinc finger motif and a fairly regularly spaced series of six tyrosine residues, features that are believed to be important to its nucleic acid binding activity (23, 24). A second LAST motif is also present in this domain and could interact either with the acidic carboxyl-terminal part of the same monomer when the protein is not bound to DNA or with the nucleic acid backbone when the protein is bound to DNA (15).
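The identity figure quoted above depends on how alignment columns are counted. As a minimal sketch, assuming identity is computed over ungapped columns of a pairwise alignment (the text does not state which convention was used), the calculation could look like the following. The function name and the toy sequences are hypothetical and are not the real gp32 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns in which neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by the full alignment length) give lower values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, for illustration only.
toy_a = "MFKRKST-EL"
toy_b = "MFKKKSTAEV"
print(round(percent_identity(toy_a, toy_b), 1))
```

In the toy alignment, 7 of the 9 ungapped columns are identical.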
As shown in Fig. 3B, except for the amino-terminal LAST motif, all of the elements believed to be important for the DNA-binding activity of the T4 gp32 are conserved in RB49. These include the zinc finger motif, the internal LAST motif, and the series of aromatic residues. Most of the differences between the RB49 and T4 gp32 sequences are located in the carboxyl terminus, where only 50% of the last 48 residues are identical. In addition, RB49 has an 18-amino-acid insertion within one of the three α-helices in the carboxyl-terminal domain of the protein. Nevertheless, the number of negatively charged residues in this domain is similar to that in T4 gp32, and thus, in spite of the sequence divergence, the acidic character of the domain is preserved. To summarize, the C-terminal domain of gp32, which mediates its protein-protein interactions, shows more sequence plasticity than the remainder of the protein.
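The comparison of acidic character described above amounts to counting negatively charged residues (Asp and Glu) in the carboxyl-terminal segment of each protein. A minimal sketch follows; the function name, the default window of 48 residues (taken from the residue count quoted in the text), and the toy sequence are assumptions for illustration.

```python
def acidic_residues_in_tail(protein: str, tail_len: int = 48) -> int:
    """Count Asp (D) and Glu (E) residues in the last `tail_len` residues."""
    if tail_len <= 0:
        raise ValueError("tail_len must be positive")
    tail = protein.upper()[-tail_len:]
    return sum(1 for residue in tail if residue in "DE")


# Hypothetical toy sequence, not a real gp32 sequence.
toy_protein = "MKTAYDEEDLSDE"
print(acidic_residues_in_tail(toy_protein, tail_len=6))  # tail is "EDLSDE"
```

Comparing the counts for two homologues gives a rough indication of whether the acidic character is preserved even where the residue-by-residue identity is low.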
Regulatory sequences.
In T4, many of the regulatory signals are located in the intergenic spacers. The analysis of the gp43 and gp32 replication regions of RB49 indicates that their intergenic sequences have diverged even more from those of T4 than the coding sequences. In particular, the presumed promoter sequences in RB49 differ strikingly from their T4 counterparts (Table 3). In T4, the gp43 and gp32 regions contain early and middle mode T4 promoters (35, 39), but in RB49 no T4 consensus promoter sequences of either type are found. However, upstream of many RB49 replication genes, there are motifs identical to the consensus sequence recognized by the E. coli σ70 (Table 3 and Fig. 2). Interestingly, we also found sequences identical to the T4 late promoter consensus located upstream of the RB49 replication genes 44 and regA (Table 3 and Fig. 2); these promoter sequences are absent from the corresponding region in T4. Many of the putative promoter sequences of RB49 have an upstream AT-rich sequence, as do many strong T4 promoters.
TABLE 3.
Comparison of sequences of the different temporal classes of promoters in T4 and RB49 (a)
(a) The consensus sequences of the early, middle, and late promoters of T4 are compared to those of the early and late promoters of RB49.
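The kind of promoter scan described in this section can be sketched as a simple exact-match search. This is not the authors' procedure, which is not detailed here. The sketch assumes the textbook E. coli σ70 consensus (a −35 box TTGACA and a −10 box TATAAT separated by 16 to 18 bp) and the T4 late promoter consensus TATAAATA; the function name and the toy sequence are hypothetical.

```python
import re

# Lookaheads allow overlapping matches to be reported.
SIGMA70_LIKE = re.compile(r"(?=(TTGACA[ACGT]{16,18}TATAAT))")  # -35 box, spacer, -10 box
T4_LATE_LIKE = re.compile(r"(?=(TATAAATA))")                    # T4 late consensus


def find_promoter_motifs(dna: str):
    """Return (class, 0-based start, matched sequence) for exact consensus matches."""
    dna = dna.upper()
    hits = []
    for match in SIGMA70_LIKE.finditer(dna):
        hits.append(("early_sigma70_like", match.start(), match.group(1)))
    for match in T4_LATE_LIKE.finditer(dna):
        hits.append(("late_T4_like", match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy sequence containing one motif of each class.
toy = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC" + "TATAAATA" + "G"
for promoter_class, start, sequence in find_promoter_motifs(toy):
    print(promoter_class, start, sequence)
```

Real promoters rarely match a consensus at every position, so a practical scan would tolerate mismatches or use a position weight matrix; exact matching is used here only to keep the sketch short.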
5′-End mapping of RB49 transcripts.
The transcription of a number of RB49 genes was examined by primer extension to determine whether the putative early and late promoter sequences were functional. Figure 4 shows such an analysis of the RB49 promoter sequences located upstream of genes 32 and 43. As expected, both of the promoters with an E. coli-type consensus sequence are transcribed early, being active from 3 to 12 min after infection. In the case of gene 43, we detected a second, larger extension product. This probably corresponds to a transcript produced from an early promoter located much further upstream of the gene; when the reverse transcriptase uses this polycistronic mRNA as a template, it may pause at the secondary structure located just upstream of the early promoter (Fig. 4B). For gene 32, a second mRNA species (Fig. 2B and Fig. 4A; see below) corresponds to a transcript initiated from a late promoter consensus sequence located in the 3′ end of gene 59. Thus, in RB49, unlike the situation in T4, there are both early and late monocistronic gene 32 mRNA species. Several of the other putative early promoters indicated in Fig. 2A and B were examined, and all of these were functional during the early period of RB49 infection (data not shown). We conclude that the RB49 early promoter has the same consensus sequence as the E. coli σ70 promoter.
FIG. 4.
Reverse transcriptase mapping of the 5′ ends of the early gene 32 and gene 43 transcripts. Primer extensions were performed with Moloney murine leukemia virus reverse transcriptase. Total RNA was extracted at different times after infection of strain BE by RB49, and 10 μg was hybridized to 0.5 pmol of the radiolabeled oligonucleotide. The primers used were either (A) 49gp32rev5 (complementary to the gene 32 mRNA from nucleotides +45 to +68 downstream of the gene 32 initiation codon) or (B) 49gp43rev1 (complementary to the mRNA from nucleotides +111 to +135 from the gene 43 initiation codon) (see Table 1 for the primer sequences). The extension products were analyzed on a 6% polyacrylamide-7 M urea sequencing gel. The sequence ladders were obtained by sequencing PCR products obtained with oligonucleotides specific for RB49 and located just upstream and downstream of the region of interest. The −10 and −35 boxes are indicated by brackets on the sequence ladder, and their nucleotide sequence is shown on the left of the panel. The arrow, at the right of the gel, marks the positions of the 5′ ends. The position of the 5′ end relative to the initiation codon and the temporal class of the promoter producing this transcript are indicated in parentheses to the right of the arrow. In panel B the star just above the arrow indicates an additional signal that may correspond to pausing of the reverse transcriptase at a potentially strong stem-loop structure. The nucleotides that could form this secondary structure are indicated by dots on the sequence ladder. The time after infection (in minutes) when each RNA sample was collected is indicated at the top of the corresponding lane in the gel.
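A primer described as complementary to the mRNA between two coordinates is the reverse complement of the corresponding stretch of the coding strand. The sketch below illustrates this under the assumption that the A of the initiation codon is numbered +1; the legend does not state the numbering convention. The function names and the toy sequence are hypothetical, and the actual primer sequences are those listed in Table 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_primer(coding_strand: str, atg_index: int, start: int, end: int) -> str:
    """Primer complementary to mRNA positions +start..+end (1-based, inclusive).

    Assumption: position +1 is the A of the initiation codon, which sits at
    0-based index `atg_index` in `coding_strand`.
    """
    if not 1 <= start <= end:
        raise ValueError("need 1 <= start <= end")
    lo = atg_index + start - 1
    hi = atg_index + end
    if hi > len(coding_strand):
        raise ValueError("requested region extends past the sequence")
    return reverse_complement(coding_strand[lo:hi])


# Hypothetical toy coding strand: 3 nt of leader followed by ATG...
toy_coding = "CCC" + "ATGAAACGTTTG"
print(antisense_primer(toy_coding, atg_index=3, start=4, end=9))
```

With this convention, a primer spanning +45 to +68 would be 24 nucleotides long.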
The 5′ ends of several late transcripts were also analyzed. Figure 5 shows the results obtained for genes 23 and 24. The late promoter consensus sequences 5′ to both of these genes initiate transcription starting at 12 min after infection and continuing until lysis. Thus, the late transcription of both phages T4 and RB49 seems to involve recognition of the same consensus sequence. This result is compatible with the identification of a gp33 homologue in RB49 (Fig. 2) and suggests that this factor, which is involved in the enhancement of late transcription in T4, fulfills an analogous role in phage RB49.
FIG. 5.
Mapping of the 5′ ends of the late transcripts of gene 23 and gene 24 by primer extension with reverse transcriptase. Total RNA was extracted at various times after infection of strain BE by phage RB49, and about 10 μg was hybridized to 0.5 pmol of radioactively labeled oligonucleotide. The primer extension was performed with the Moloney murine leukemia virus reverse transcriptase using either (A) 49gp23rev2 (complementary to the gene 23 mRNA from nucleotides +64 to +89 from the gene 23 initiation codon) or (B) 49gp24rev1 (complementary to the gene 24 mRNA from nucleotides +26 to +51 from the gene 24 initiation codon) (see Table 1 for the primer sequences). The extension products were analyzed on a 6% polyacrylamide-7 M urea sequencing gel. The sequence ladders were obtained by sequencing PCR products obtained with oligonucleotides specific for RB49 located just upstream and downstream of the relevant late promoters. The location of the extended −10 sequence (Juke box) of the late promoter is indicated by the brackets at the left of each panel. The arrows at the right of the gel mark the 5′ ends of the transcripts initiated from these promoters. The position of the 5′ end relative to the initiation codon is indicated in parentheses to the right of the arrow, while the L indicates that a late promoter produces the transcript. The time, in minutes after infection, when the RNAs were extracted is indicated at the top of each lane of the gel.
Absence of middle promoters.
Extensive scanning of the sequences of the RB49 genome revealed no T4-like intermediate mode promoter sequences. Moreover, we have sequenced (data not shown) the entire genomic region of RB49 that surrounds the location of the middle mode transcription factor (gpmot) in T4 (19, 20) (EMBL/GenBank accession number Z78088). Although several of the other expected homologous genes were found, no counterpart of the mot gene is present in this region of the RB49 genome. The same region of the T4 genome contains the gene AsiA, which encodes a protein that interacts with σ70 and is necessary for correct recognition of the middle promoters (17, 59). Again, there was no obvious homologue of this gene in RB49.
Posttranscriptional control.
In T4 infection, there exist posttranscriptional control mechanisms that fine-tune phage gene expression by means of mRNA processing or translational repression (53). The substantial differences that we found in the transcriptional regulation of RB49 motivated a comparison of mRNA processing in cells infected with RB49 and T4.
One of the more striking examples of mRNA processing influencing gene expression involves T4 gene 32. There are four different primary transcripts produced in T4-infected cells that contain the gene 32 mRNA sequence (5, 13). These transcripts can be cleaved by E. coli RNase E at several sites, including a prominent one located 71 nucleotides upstream of the gene 32 initiation codon (5, 27, 57). This cleavage stabilizes the 3′ portion of the transcript containing the gene 32 sequence and initiates the degradation of the 5′ portion of the polycistronic transcripts containing the upstream genes (57).
The presence of a functional early promoter in the RNase H gene and the absence of an obvious transcription terminator downstream of this sequence led us to believe that, in addition to the proximal early and late monocistronic transcripts, a polycistronic gene 32 species also exists (Fig. 2B). This would explain the series of minor 5′ ends upstream of the gene 32 late transcript that were detected by primer extension. We examined the possibility that RNase E processes either this large gene 32 transcript or the monocistronic ones, as it does in T4 (5, 57). To do this, we compared the results of reverse transcription experiments using phage mRNA prepared from isogenic host strains that were either wild type or mutant for RNase E. As shown in Fig. 6, the pattern of primer extension was essentially identical with the two mRNA preparations as templates. This indicates that in RB49 infection, unlike that of T4, the host RNase E has no major role in processing the gene 32 transcription unit.
FIG. 6.
Analysis of RNase E processing in the 5′ leader region of the gene 32 mRNA of T4 (A) and RB49 (B). (A) The 5′ ends of the gene 32 mRNAs of T4 were mapped using primer +21, which is complementary to the first 21 nucleotides of T4 gene 32 (47). The RNAs were extracted from T4 infections of rne+ and rnets host strains at 12 min after infection at 43°C. The different species of monocistronic gene 32 mRNAs are indicated by arrows. The numbers in parentheses refer to the coordinates of the 5′ ends relative to the ATG of gene 32. The temporal class of promoter responsible for the primary transcript is indicated. When the transcript corresponds to the processing of a larger primary transcript, this is indicated by a letter P at the 5′ end. The middle and late transcripts are indicated, respectively, by the letters M and L. (B) The 5′ ends of the gene 32 mRNAs of RB49 were mapped using primer 32rev6, which is complementary to the segment between nucleotides −124 and −149 from the ATG of the RB49 gene 32. The primer extensions were carried out with RNAs extracted from rne+ and rnets strains infected with RB49 for 12 min at 43°C. Asterisks indicate the numerous minor 5′ ends detected. The only major species of gene 32 mRNA is indicated by the arrow. The number in parentheses refers to the coordinates of the 5′ end relative to the ATG of gene 32, and the letter L indicates that this 5′ end corresponds to initiation at a late promoter.
In T4, gene 32 is autogenously regulated at the translational level (42, 43, 44, 45, 63). gp32 is able to bind in a cooperative manner to the 5′ end of its own mRNA. As diagrammed in Fig. 7, gp32 binding to the A+U-rich sequence flanking the ribosomal binding site of the gene 32 mRNA is believed to be nucleated at an upstream stem-loop structure that contains a pseudoknot (50, 65, 66). None of these features involved in gene 32 translational self-regulation are obvious in the translation initiation region of RB49 gene 32. No stable secondary structures were predicted in the translation initiation region of RB49 gene 32. Moreover, as shown in Fig. 7, there are no contiguous repeated A+U-rich motifs (e.g., UUAAA) in this region of the RB49 sequence as there are in T4. Thus, the immediate vicinity of the RB49 gene 32 translation initiation region does not seem to provide an extended unstructured mRNA sequence to allow the cooperative binding of gp32 that blocks translation initiation.
FIG. 7.
5′ leader region of the processed T4 gene 32 mRNA (A) and sequence upstream of RB49 gene 32 (B). (A) This diagram is adapted from Miller et al. (53). The 5′ end of the processed T4 gene 32 has a number and letter in parentheses above the first residue that refer to the position (−71) of the RNase E processing event (P) relative to the gene 32 ATG. The gene 32 initiation codon is marked with asterisks, as is the Shine-Dalgarno sequence. The two hairpins believed to be formed by the gene 32 mRNA are represented. Additional base pairing between the loop of the hairpin and four upstream nucleotides could form an RNA pseudoknot. The unstructured region flanking the translation initiation region is illustrated as accommodating the cooperative binding of nine gp32 monomers (shown as stippled ellipsoids). The positions of the nucleotides forming the 5′ hairpin are also indicated. The boxes indicate the positions of the four UUAAA motifs believed to be important for gp32 binding to its own translation initiation region. (B) The diagram represents the RB49 gene 32 late transcript. The letter and the number (−233) in parentheses above the first residue refer to the relative location from the ATG of the 5′ end of the late gene 32 transcript (L). The proximal early promoter sequence −35 and −10 boxes are marked Pe, and the 5′ end of the corresponding transcript is marked by a vertical arrow. Asterisks indicate the residues of the Shine-Dalgarno sequence and the initiation codon. The inverted horizontal arrows and the question mark designate the location of a predicted hairpin structure. Boxes indicate the positions of the six dispersed UUAAA motifs that are found in close proximity to the T4 gene 32 translation initiation region.
Nevertheless, the transcript produced from the more distal late promoter is 200 nucleotides longer and contains irregularly spaced repetitions of various A+U-rich motifs, including UUAAA, and a putative secondary structure could form at its 5′ end. Thus, if translational repression of gene 32 expression occurs in RB49, it must involve regulatory elements that differ in their position and sequence from those employed in T4. A further investigation of the regulation of gene 32 expression in RB49 is in progress.
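The distinction drawn above between contiguous and dispersed UUAAA motifs can be checked with a short scan of a leader sequence. The following is a minimal sketch; the function names and the toy leader are hypothetical, and the input is assumed to be the mRNA-sense sequence (DNA input is converted by replacing T with U).

```python
def motif_positions(rna: str, motif: str = "UUAAA"):
    """0-based start positions of every occurrence of `motif` in `rna`."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    positions = []
    index = rna.find(motif)
    while index != -1:
        positions.append(index)
        index = rna.find(motif, index + 1)
    return positions


def spacings(positions, motif_len: int = 5):
    """Number of nucleotides separating consecutive motifs (0 means contiguous)."""
    return [later - (earlier + motif_len)
            for earlier, later in zip(positions, positions[1:])]


# Hypothetical toy leader: two contiguous motifs, then a dispersed third one.
toy_leader = "GGUUAAAUUAAACCCCCCUUAAAG"
found = motif_positions(toy_leader)
print(found, spacings(found))
```

A run of zero or near-zero spacings corresponds to the contiguous arrangement described for T4, whereas large and irregular spacings correspond to the dispersed arrangement described for RB49.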
DISCUSSION
Our partial sequencing of the genome of the pseudo-T-even bacteriophage RB49 has provided a clear snapshot of this genome. The overall genetic content and the structure of this genome are now apparent. The complete sequencing of the RB49 genome is eventually envisioned in a program to systematically compare a series of genomes of T4-type phages (J. Karam and H. Krisch, personal communication), but it is difficult to imagine that our current conclusions about the RB49 genome will change in any significant way as a result of the completion of the genome sequence.
We have demonstrated that phages T4 and RB49 are clearly derived from a common ancestor but that their subsequent evolution has caused these genomes to diverge considerably. These two distantly related genomes share a common set of essential genes that are distributed throughout the chromosome. Nevertheless, some of the T4-type sequences, such as the nonstructural genes or the regulatory signals, have changed considerably more than others, and nearly 30% of the RB49 genes are without homologues in the database. Many of these ORFs of unknown function are in sites that are occupied by nonessential genes in the T4 genome. These novel ORFs may have been transferred horizontally into the RB49 genome.
Comparison of the sequences of the replication proteins that have substantially diverged between RB49 and T4 should help to define the sequence motifs that are critical to protein activity, especially when such an analysis can be combined with structural information. Previous attempts to do this for the sequences of T-even proteins were largely unsuccessful, because the replication proteins were too conserved within this subgroup of the T4-type phages. Gene 32 illustrates this problem: although the sequence of gene 32 has been determined for a number of T-even phages (44, 50, 73), only in the gene of the pseudo-T-even phage RB49 is there any substantial variation from the T4 sequence. In spite of these divergences, the gp32 motifs involved in the DNA-binding activity of the protein are well conserved. Thus, it is reasonable to suppose that the T4 and RB49 gp32 proteins perform essentially the same function.
Nevertheless, some of the sequence variations may be related to subtle differences in the way the T4 and RB49 proteins function. For example, the absence of base modifications in the RB49 DNA could necessitate changes in the sequence motifs involved in binding to single-stranded DNA. Moreover, the RB49 gp32 sequence may be adapted to specific interactions within the RB49 replication complex that differ from those of its T4 counterpart. This could explain why the segments of gp32 involved in protein-protein interactions with the other replication proteins are more plastic than the sequences involved in its DNA-binding activity. Similarly, the differences between T4 and RB49 gene 43 proteins are nonrandomly distributed in the sequence. Both the T4 and RB49 DNA polymerases contain numerous motifs that are widely conserved in B-family polymerases.
Nevertheless, the two phages' enzymes can substitute only poorly for each other (J. Karam et al., personal communication). This may be a consequence of the considerable divergence of the various RB49 accessory proteins from their T4 homologues. It has been shown that the T4 and RB69 versions of proteins gp44 and gp62, which are far more homologous than the T4 and RB49 versions, cannot substitute for each other to form an active gp44/gp62 heteromer (81). Such results suggest that the various components of the replication machinery of RB49 and T4 may not be as interchangeable as one might have imagined based on the fact that they fulfill similar functions in the replication complex.
The most striking difference between phages RB49 and T4 is that the regulation of the transcription of RB49 relies on only two classes of promoters instead of three, as in T4 (56). In T4, the early promoter consensus sequence is apparently optimized to ensure its preferential recognition by the RNA polymerase compared to the endogenous E. coli promoters (78). However, the RB49 early promoter sequence is identical to the consensus of the E. coli σ70-dependent promoter. Furthermore, since the RB49 DNA is not modified, it is difficult to imagine how, early in infection, there could be efficient discrimination between the phage and host promoters. This may explain why the titers of RB49 stocks produced on E. coli are typically a factor of 10 lower than those produced by T4.
However, the use of an E. coli promoter consensus sequence for early phage transcripts may be advantageous in other hosts because this promoter sequence functions well in numerous bacterial species, including distant ones (4, 22, 28, 51, 60). Thus, RB49's utilization of this early promoter sequence may expand its host-range compared to T4. Cryptic promoters are found in front of some T4 genes that appear to be recognized only when the phage DNA is not modified (13). This suggests that a mechanism of early transcription initiation similar to that in RB49 was also present in an ancestral version of T4. These two phages may have chosen very different survival strategies: T4 is very efficient but can produce numerous progeny in only a limited set of hosts, E. coli and its closest relatives, whereas RB49 might be able to infect a wider range of hosts but has a significantly lower progeny yield in E. coli.
The distribution of the promoters in RB49 is different from that in T4, probably in part to compensate for the lack of middle promoters. In the gene 43 region of the RB49 genome, a number of early promoters are arranged so as to ensure adequate synthesis of all the replication genes for the early phase of infection. The presence of two late promoters in the same region could significantly extend the period of expression of these genes. Similarly, RB49 gene 32 has proximal early and late promoter sequences that replace the intermediate and late promoters of the T4 gene. Since the early and middle periods of transcription in a T4 infection partially overlap, this switch in promoters may have only modest effects on the kinetics of gp32 synthesis, especially if, as in T4, the gene 32 mRNA is not rapidly degraded.
The observations reported here indicate that there is a high level of plasticity in phage regulatory sequences. This phenomenon is particularly well illustrated by the comparison of gene 32 transcription among the various T4-type phages. In the T-even subgroup, the regulatory region just upstream of gene 32 is dimorphic (47, 62). The T2 and T4 versions of this locus differ from each other by the presence in T4 of the nonessential gene 32.1 (47). In T2 the gene 59-gene 32 intergenic sequence contains an intercalated middle and late promoter that both transcribe monocistronic gene 32 mRNAs (47). The same classes of promoters transcribe gene 32 in T4, but in this case these promoters are embedded within the inserted gene 32.1 sequence (5, 47, 49, 74). Since the same classes of promoters transcribe gene 32 in both of the T-even versions of this locus, we had assumed that some strong regulatory constraint imposed this conservation. The novel organization of gene 32 transcription in RB49 shows that this is not the case.
Our results also demonstrate that RB49 has, in some cases, evolved posttranscriptional regulatory strategies that differ from those of T4 (e.g., RNase E processing of the 5′ leader of gene 32 mRNA). A detailed analysis of the self-regulation of gene 43 in RB49 will be reported in a separate publication (Petrov et al., unpublished data), and similar studies are now being performed on the posttranscriptional regulation of gp32 synthesis (Desplats et al., unpublished data). Some variations in regulatory strategies have been previously observed (47, 81). For example, T4 and RB69 control the synthesis of the polymerase accessory proteins gp45, gp44, and gp62 differently (81). The apparent absence of intermediate promoter sequences in this part of the RB69 replication region has been noted (81). A simple explanation for this could be that RB69 has swapped this segment of its genome for an analogous segment derived from a pseudo-T-even phage related to RB49. The results reported here suggest that the regulation of homologous genes in the T4-type phages may vary much more than would have been imagined previously. The evolutionary significance of such variation needs to be further investigated.
Swapping of analogous modular sequences by homologous recombination may occur frequently among the T4-type phages (62). Regions of sequence homology between these phages, either in regulatory or in coding sequences, could occasionally mediate these genetic exchanges. For example, alignments of the RB49 and T4 nucleotide gene sequences have revealed small “islands” (approximately 20 bp) of strong conservation (up to 80% of nucleotide identity) in spite of the high global level of nucleotide sequence divergence. Such motifs might act as preferred sites of recombination as do the His-box or the glycine island motifs previously characterized in the long tail fiber locus of these phages (70, 71). Such conserved sequences might delimit protein domains that could be exchanged to generate functional chimeric proteins. In such a scenario, even the distant pseudo-T-even phages could provide the T-even phages with a highly diverse but nevertheless generally compatible source of modules. Modular exchange of segments of the viral genome could generate enormous diversity within the context of a fairly standard T4-type genome. In addition, since the sequences involved in the regulation of the phage expression are not as rigidly constrained as had been previously imagined, the same module in different phage genomes may have different patterns of expression. Such context-dependent modulation of gene expression could be yet another important mechanism by which modularity generates phage diversity.
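The small conserved "islands" mentioned above can be located in a pairwise nucleotide alignment with a sliding-window identity scan. The sketch below is one way to do this and is not the authors' method. The window size of 20 and the identity threshold of 0.8 are taken from the approximate figures quoted in the text; the function name and the toy alignment are hypothetical.

```python
def conserved_windows(aligned_a: str, aligned_b: str,
                      window: int = 20, min_identity: float = 0.8):
    """Return (0-based start, identity) for each window meeting the threshold.

    Columns containing a gap ('-') are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = [int(x == y and x != "-") for x, y in zip(aligned_a, aligned_b)]
    hits = []
    running = sum(matches[:window])  # matches in the first window
    for start in range(len(matches) - window + 1):
        if start > 0:
            # slide the window: add the new rightmost column, drop the old leftmost
            running += matches[start + window - 1] - matches[start - 1]
        identity = running / window
        if identity >= min_identity:
            hits.append((start, identity))
    return hits


# Hypothetical toy alignment: 20 identical columns followed by 20 mismatches.
toy_a = "ACGT" * 5 + "A" * 20
toy_b = "ACGT" * 5 + "C" * 20
print(conserved_windows(toy_a, toy_b))
```

Overlapping windows that pass the threshold can then be merged into islands whose boundaries are candidate sites for the modular exchanges discussed here.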
Recent comparative studies indicate that different phage families can have major differences in the characteristics of the genomic plasticity that they manifest (8, 33, 38). For example, the existence of a common set of essential genes distinguishes the T4-type genomes from those of the well-known lambdoid phages. The lambdoid genome is an ordered array of modular elements that are essentially independent of each other (11). The genes encoded by functionally equivalent lambdoid modules need not share a common phylogenetic origin and have evolved independently of each other. Thus, in the lambdoid genome each of the various functional elements has its own evolutionary history, whereas in the T4-type genomes an ensemble of essential genes appears to have coevolved. This difference in evolutionary strategy is probably necessitated by the greater size of the T4-type genome and the more intricate pattern of interactions between its diverse products.
Recently, two new subgroups of T4-type phage have been identified that are more distant from T4 than the pseudo-T-even phages. These subgroups are called the schizo-T-even phages (phages with a generally T4-like morphology but having elongated heads) and the exo-T-even phages (phages with only a vaguely T4-like morphology, having isometric heads and longer contractile tails) (32, 72). Since these phages infect bacterial species that are very distant from enterobacteria (vibrios, aeromonads, and cyanobacteria), this difference in the ecological niche they occupy could partly explain their massive divergence from T4 (1, 2, 32). At this time, very little else is known about these most distant members of the T4 phage family. Without doubt the lessons learned from this study of the RB49 genome will facilitate the analysis of the genomes of these new subgroups of T4-type phages. Such studies should help us to understand the fundamental evolutionary processes that created all of the diverse T4-type phages.
Acknowledgments
We thank Jim Karam, A. J. Carpousis, Hans Ackermann, Patricia Bordes, and Carla Theimer for their diverse contributions. Our colleagues Caroline Monod, Dave Lane, and Jean-Pierre Claverys tried to improve the manuscript. We thank Vasiliy Petrov and Jim Karam, who confirmed our sequence for RB49 gene 43 in New Orleans; the sequence actually entered in the database reflects a few of their corrections. We also thank Yvette de Preval, who synthesized the oligonucleotides, and David Villa, who participated in the preparation of the figures.
This research was supported by the CNRS and by grants from the Ministère de la Recherche (PRFMMIP) and the GIP HMR and the Toulouse Genopole for DNA sequencing facilities.
REFERENCES
- 1.Ackermann, H. W., and H. M. Krisch. 1997. A catalogue of T4-type bacteriophages. Arch. Virol. 142:2329-2345. [DOI] [PubMed] [Google Scholar]
- 2.Ackermann, H. W. 1998. Tailed bacteriophages: the order Caudovirales. Adv. Virus Res. 51:135-201. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Altschul, S. F. W., W. Miller Gish, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410. [DOI] [PubMed] [Google Scholar]
- 4.Bagdasarian, M. M., E. Amann, B. Lurz, R. Ruckert, and M. Bagdasarian. 1983. Activity of the hybrid trp-lac (tac) promoter of Escherichia coli in Pseudomonas putida. Construction of broad-host-range, controlled-expression vectors. Gene 26:273-282. [DOI] [PubMed] [Google Scholar]
- 5.Belin, D., E. A. Mudd, P. Prentki, Y. Yi-Yi, and H. M. Krisch. 1987. Sense and antisense transcription of bacteriophage T4 gene 32. Processing and stability of the mRNAs. J. Mol. Biol. 194:231-243. [DOI] [PubMed] [Google Scholar]
- 6.Bouet, J. Y., J. Woszczyk, F. Repoila, V. Francois, J. M. Louarn, and H. M. Krisch. 1994. Direct PCR sequencing of the ndd gene of bacteriophage T4: identification of a product involved in bacterial nucleoid disruption. Gene 141:9-16. [DOI] [PubMed] [Google Scholar]
- 7.Braithwaite, D. K., and J. Ito. 1993. Compilation, alignment, and phylogenetic relationships of DNA polymerases. Nucleic Acids Res. 21:787-802. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Brussow, H., and F. Desiere. 2001. Comparative phage genomics and the evolution of Siphoviridae: insights from dairy phages. Mol. Microbiol. 39:213-222. [DOI] [PubMed] [Google Scholar]
- 9.Burke, R. L., B. M. Alberts, and J. Hosoda. 1980. Proteolytic removal of the COOH terminus of the T4 gene 32 helix-destabilizing protein alters the T4 in vitro replication complex. J. Biol. Chem. 255:11484-11493. [PubMed] [Google Scholar]
- 10.Burke, R. L., M. Munn, J. Barry, and. B. M. Alberts. 1985. Purification and properties of the bacteriophage T4 gene 61 RNA priming protein. J. Biol. Chem. 260:1711-1722. [PubMed] [Google Scholar]
- 11.Campbell, A. 1994. Comparative molecular biology of lambdoid phages. Annu. Rev. Microbiol. 48:193-222. [DOI] [PubMed] [Google Scholar]
- 12.Carlson, K., E. A. Raleigh, and S. Hattman. 1994. Restriction and modification, p. 14-27. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 13.Carpousis, A. J., E. A. Mudd, and H. M. Krisch. 1989. Transcription and messenger RNA processing upstream of bacteriophage T4 gene 32. Mol. Gen. Genet. 219:39-48. [DOI] [PubMed] [Google Scholar]
- 14.Carpousis, A. J., G. Van Houwe, C. Ehretsmann, and H. M. Krisch. 1994. Copurification of E. coli RNAase E and PNPase: evidence for a specific association between two enzymes important in RNA processing and degradation. Cell 76:889-900. [DOI] [PubMed] [Google Scholar]
- 15.Casas-Finet, J. R., K. R. Fischer, and R. L. Karpel. 1992. Structural basis for the nucleic acid binding cooperativity of bacteriophage T4 gene 32 protein: the (Lys/Arg)3(Ser/Thr)2 (LAST) motif. Proc. Natl. Acad. Sci. USA 89:1050-1054. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Chase, J. W., and K. R. Williams. 1986. Single-stranded DNA binding proteins required for DNA replication. Annu. Rev. Biochem. 55:103-136. [DOI] [PubMed] [Google Scholar]
- 17.Colland, F., G. Orsini, E. N. Brody, H. Buc, and A. Kolb. 1998. The bacteriophage T4 AsiA protein: a molecular switch for sigma 70-dependent promoters. Mol. Microbiol. 27:819-829. [DOI] [PubMed] [Google Scholar]
- 18.Cowie, D. B., R. J. Avery, and S. P. Champe. 1971. DNA homology among the T-even bacteriophages. Virology 45:30-37. [DOI] [PubMed] [Google Scholar]
- 19.De Franciscis, V., and E. Brody. 1982. In vitro system for middle T4 RNA. I. Studies with Escherichia coli RNA polymerase. J. Biol. Chem. 257:4087-4096. [PubMed] [Google Scholar]
- 20.De Franciscis, V., R. Favre, M. Uzan, J. Leautey, and E. Brody. 1982. In vitro system for middle T4 RNA. II. Studies with T4-modified RNA polymerase. J. Biol. Chem. 257:4097-4101. [PubMed] [Google Scholar]
- 21.Filée, J., P. Forterre, T. Seng-lin, and J. Laurent. Evolution of DNA polymerase families: insights into ancient relationships between cellular and viral proteins. J. Mol. Evol., in press. [DOI] [PubMed]
- 22.Frey, J., E. A. Mudd, and H. M. Krisch. 1988. A bacteriophage T4 expression cassette that functions efficiently in a wide range of gram-negative bacteria. Gene 62:237-247. [DOI] [PubMed] [Google Scholar]
- 23.Gauss, P., K. B. Krassa, D. S. McPheeters, M. A. Nelson, and L. Gold. 1987. Zinc (II) and the single-stranded DNA binding protein of bacteriophage T4. Proc. Natl. Acad. Sci. USA 84:8515-8519. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Giedroc, D. P., K. M. Keating, K. R. Williams, W. H. Konigsberg, and J. E. Coleman. 1986. Gene 32 protein, the single-stranded DNA binding protein from bacteriophage T4, is a zinc metalloprotein. Proc. Natl. Acad. Sci. USA 83:8452-8456. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Giedroc, D. P., R. Khan, and K. Barnhart. 1990. Overexpression, purification, and characterization of recombinant T4 gene 32 protein(22-301) (g32P-B). J. Biol. Chem. 265:11444-11455. [PubMed] [Google Scholar]
- 26.Gish, W., and D. J. States. 1993. Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-272. [DOI] [PubMed] [Google Scholar]
- 27.Gorski, K., J. M. Roch, P. Prentki, and H. M. Krisch. 1985. The stability of bacteriophage T4 gene 32 mRNA: a 5′ leader sequence that can stabilize mRNA transcripts. Cell 43:461-469. [DOI] [PubMed] [Google Scholar]
- 28.Gragerov, A. I., A. A. Chenchik, V. A. Aivasashvilli, R. S. Beabealashvilli, and V. G. Nikiforov. 1984. Escherichia coli and Pseudomonas putida RNA polymerases display identical contacts with promoters. Mol. Gen. Genet. 195:511-515. [DOI] [PubMed] [Google Scholar]
- 29.Greenberg, G. R., P. He, J. Hilfinger, and M. Tseng. 1994. Deoxyribonucleoside triphosphate synthesis and phage T4 DNA replication, p. 14-27. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 30.Gutierrez, C., S. Gordia, and S. Bonnassie. 1995. Characterization of the osmotically inducible gene osmE of Escherichia coli K-12. Mol. Microbiol. 16:553-563. [DOI] [PubMed] [Google Scholar]
- 31.Hagen, F. S., and E. T. Young. 1978. Effect of RNase III on efficiency of translation of bacteriophage T7 lysozyme mRNA. J. Virol. 26:793-804. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Hambly, E., F. Tétart, C. Desplats, W. H. Wilson, H. M. Krisch, and N. H. Mann. 2001. A conserved genetic module that encodes the major virion components in both the coliphage T4 and the marine cyanophage S-PM2. Proc. Natl. Acad. Sci. USA 98:11411-11416. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Hendrix, R. W., M. C. Smith, R. N. Burns, M. E. Ford, and G. F. Hatfull. 1999. Evolutionary relationships among diverse bacteriophages and prophages: all the world's a phage. Proc. Natl. Acad. Sci. USA 96:2192-2197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Herendeen, D. R., K. P. Williams, G. A. Kassavetis, and E. P. Geiduschek. 1990. An RNA polymerase-binding protein that is required for communication between an enhancer and a promoter. Science 248:573-578. [DOI] [PubMed] [Google Scholar]
- 35.Hsu, T., and J. D. Karam. 1990. Transcriptional mapping of a DNA replication gene cluster in bacteriophage T4. Sites for initiation, termination, and mRNA processing. J. Biol. Chem. 265:5303-5316. [PubMed] [Google Scholar]
- 36.Ito, J., and D. K. Braithwaite. 1991. Compilation and alignment of DNA polymerase sequences. Nucleic Acids Res. 19:4045-4057. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Joyce, C. M., and T. A. Steitz. 1994. Function and structure relationships in DNA polymerases. Annu. Rev. Biochem. 63:777-822. [DOI] [PubMed] [Google Scholar]
- 38.Juhala, R. J., M. E. Ford, R. L. Duda, A. Youlton, G. F. Hatfull, and R. W. Hendrix. 2000. Genomic sequences of bacteriophages HK97 and HK022: pervasive genetic mosaicism in the lambdoid bacteriophages. J. Mol. Biol. 299:27-51. [DOI] [PubMed] [Google Scholar]
- 39.Karam, J. D., J. W. Drake, K. N. Kreuzer, G. Mosig, D. H. Hall, F. A. Eiserling, L. Black, E. K. Spicer, E. M. Kutter, K. Carlson, and E. S. Miller (ed.). 1994. Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 40.Karam, J. D., and W. H. Konigsberg. 2000. DNA polymerase of the T4-related bacteriophages. Prog. Nucleic Acid Res. Mol. Biol. 64:65-96. [DOI] [PubMed] [Google Scholar]
- 41.Kreuzer, K. N., and S. W. Morrical. 1994. Initiation of DNA replication, p. 28-43. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 42.Krisch, H. M., and G. Van Houwe. 1976. Stimulation of the synthesis of bacteriophage T4 gene 32 protein by ultraviolet light irradiation. J. Mol. Biol. 108:67-81. [DOI] [PubMed] [Google Scholar]
- 43.Krisch, H. M., G. Van Houwe, D. Belin, W. Gibbs, and R. H. Epstein. 1977. Regulation of the expression of bacteriophage T4 genes 32 and 43. Virology 78:87-98. [DOI] [PubMed] [Google Scholar]
- 44.Krisch, H. M., and B. Allet. 1982. Nucleotide sequences involved in bacteriophage T4 gene 32 translational self-regulation. Proc. Natl. Acad. Sci. USA 79:4937-4941. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Lemaire, G., L. Gold, and M. Yarus. 1978. Autogenous translational repression of bacteriophage T4 gene 32 expression in vitro. J. Mol. Biol. 126:73-90. [DOI] [PubMed] [Google Scholar]
- 46.Lin, T. C., G. Karam, and W. H. Konigsberg. 1994. Isolation, characterization, and kinetic properties of truncated forms of T4 DNA polymerase that exhibit 3′-5′ exonuclease activity. J. Biol. Chem. 269:19286-19294. [PubMed] [Google Scholar]
- 47.Loayza, D., A. J. Carpousis, and H. M. Krisch. 1991. Gene 32 transcription and mRNA processing in T4-related bacteriophages. Mol. Microbiol. 5:715-725. [DOI] [PubMed] [Google Scholar]
- 48.Lonberg, N., S. C. Kowalczykowski, L. S. Paul, and P. H. von Hippel. 1981. Interactions of bacteriophage T4-coded gene 32 protein with nucleic acids. III. Binding properties of two specific proteolytic digestion products of the protein (G32P*I and G32P*III). J. Mol. Biol. 145:123-138. [DOI] [PubMed] [Google Scholar]
- 49.Mattson, T., G. Van Houwe, and R. H. Epstein. 1978. Isolation and characterization of conditional lethal mutations in the mot gene of bacteriophage T4. J. Mol. Biol. 126:551-570. [DOI] [PubMed] [Google Scholar]
- 50.McPheeters, D. S., G. Gosch, and L. Gold. 1988. Nucleotide sequences of the bacteriophage T2 and T6 gene 32 mRNAs. Nucleic Acids Res. 16:9341. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Mermod, N., J. L. Ramos, P. R. Lehrbach, and K. Timmis. 1986. Vector for regulated expression of cloned genes in a wide range of gram-negative bacteria. J. Bacteriol. 167:447-454. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Miller, E. S., and C. E. Jozwik. 1990. Sequence analysis of conserved regA and variable orf43.1 genes in T4-like bacteriophages. J. Bacteriol. 172:5180-5186. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Miller, E. S., J. D. Karam, and E. Spicer. 1994. Control of translation initiation: mRNA structure and protein repressors, p. 193-208. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 54.Monod, C., F. Repoila, M. Kutateladze, F. Tétart, and H. M. Krisch. 1997. The genome of the pseudo-T-even bacteriophages, a diverse group that resembles T4. J. Mol. Biol. 267:237-249. [DOI] [PubMed] [Google Scholar]
- 55.Mosig, G. 1994. Homologous recombination, p. 54-83. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 56.Mosig, G., and D. H. Hall. 1994. Gene expression: A paradigm of integrated circuits, p. 127-131. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 57.Mudd, E. A., P. Prentki, D. Belin, and H. M. Krisch. 1988. Processing of unstable bacteriophage T4 gene 32 mRNAs into a stable species requires Escherichia coli ribonuclease E. EMBO J. 7:3601-3607. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Nossal, N. G. 1994. The bacteriophage T4 DNA replication fork, p. 43-53. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 59.Ouhammouch, M., G. Orsini, and E. N. Brody. 1994. The asiA gene product of bacteriophage T4 is required for middle mode RNA synthesis. J. Bacteriol. 176:3956-3965. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Peschke, U., V. Beuck, H. Bujard, R. Gentz, and S. Le Grice. 1985. Efficient utilization of Escherichia coli transcriptional signals in Bacillus subtilis. J. Mol. Biol. 186:547-555. [DOI] [PubMed] [Google Scholar]
- 61.Reha-Krantz, L. J. 1994. Genetic dissection of T4 DNA polymerase structure-function relationships, p. 307-312. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 62.Repoila, F., F. Tétart, J.-Y. Bouet, and H. M. Krisch. 1994. Genomic polymorphism in the T-even bacteriophages. EMBO J. 13:4181-4192. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Russel, M., L. Gold, H. Morrissett, and P. Z. O'Farrell. 1976. Translational, autogenous regulation of gene 32 expression during bacteriophage T4 infection. J. Biol. Chem. 251:7263-7270. [PubMed] [Google Scholar]
- 64.Selick, H. E., G. D. Stormo, R. L. Dyson, and B. M. Alberts. 1993. Analysis of five presumptive protein-coding sequences clustered between the primosome genes, 41 and 61, of bacteriophages T4, T2, and T6. J. Virol. 67:2305-2316. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Shamoo, Y., K. R. Webster, K. R. Williams, and W. H. Konigsberg. 1991. A retrovirus-like zinc domain is essential for translational repression of bacteriophage T4 gene 32. J. Biol. Chem. 266:7967-7970. [PubMed] [Google Scholar]
- 66.Shamoo, Y., A. Tam, W. H. Konigsberg, and K. R. Williams. 1993. Translational repression by the bacteriophage T4 gene 32 protein involves specific recognition of an RNA pseudoknot structure. J. Mol. Biol. 232:89-104. [DOI] [PubMed] [Google Scholar]
- 67.Spacciapoli, P., and N. G. Nossal. 1994. Interaction of DNA polymerase and DNA helicase within the bacteriophage T4 DNA replication complex. Leading strand synthesis by the T4 DNA polymerase mutant A737V (tsL141) requires the T4 gene 59 helicase assembly protein. J. Biol. Chem. 269:447-455. [PubMed] [Google Scholar]
- 68.Spacciapoli, P., and N. G. Nossal. 1994. A single mutation in bacteriophage T4 DNA polymerase (A737V, tsL141) decreases its processivity as a polymerase and increases its processivity as a 3′→5′ exonuclease. J. Biol. Chem. 269:438-446. [PubMed] [Google Scholar]
- 69.Spicer, E., K. R. Williams, and W. H. Konigsberg. 1979. T4 gene 32 protein trypsin-generated fragments. Fluorescence measurement of DNA-binding parameters. J. Biol. Chem. 254:6433-6436. [PubMed] [Google Scholar]
- 70.Tétart, F., F. Repoila, C. Monod, and H. M. Krisch. 1996. Bacteriophage T4 host range is expanded by duplications of a small domain of the tail fiber adhesin. J. Mol. Biol. 258:726-731. [DOI] [PubMed] [Google Scholar]
- 71.Tétart, F., C. Desplats, and H. M. Krisch. 1998. Genome plasticity in the distal tail fiber locus of the T-even bacteriophage: recombination between conserved motifs swaps adhesin specificity. J. Mol. Biol. 282:543-556. [DOI] [PubMed] [Google Scholar]
- 72.Tétart, F., C. Desplats, M. Kutateladze, C. Monod, H. W. Ackermann, and H. M. Krisch. 2001. Phylogeny of the major head and tail genes of the wide-ranging T4-type bacteriophages. J. Bacteriol. 183:358-366. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Theimer, C. A., Y. D. Wang, W. Hoffman, H. M. Krisch, and D. P. Giedroc. 1998. Non-nearest neighbor effects on the thermodynamics of unfolding of a model mRNA pseudoknot. J. Mol. Biol. 279:545-564. [DOI] [PubMed] [Google Scholar]
- 74.Uzan, M., Y. d'Aubenton-Carafa, R. Favre, V. de Franciscis, and E. Brody. 1985. The T4 mot protein functions as part of a pre-replicative DNA-protein complex. J. Biol. Chem. 260:633-639. [PubMed] [Google Scholar]
- 75.Wang, T. S., S. W. Wong, and D. Korn. 1989. Human DNA polymerase alpha: predicted functional domains and relationships with viral DNA polymerases. FASEB J. 3:14-21. [DOI] [PubMed] [Google Scholar]
- 76.Wang, C. C., L. S. Yeh, and J. D. Karam. 1995. Modular organization of T4 DNA polymerase. Evidence from phylogenetics. J. Biol. Chem. 270:26558-26564. [DOI] [PubMed] [Google Scholar]
- 77.Wang, J., A. K. Sattar, C. C. Wang, J. D. Karam, W. H. Konigsberg, and T. A. Steitz. 1997. Crystal structure of a pol alpha family replication DNA polymerase from bacteriophage RB69. Cell 89:1087-1099. [DOI] [PubMed] [Google Scholar]
- 78.Wilkens, K., and W. Rüger. 1994. Transcription from early promoters, p. 132-142. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 79.Williams, K. R., L. O. Sillerud, D. E. Schafer, and W. H. Konigsberg. 1979. DNA binding properties of the T4 DNA helix-destabilizing protein. A calorimetric study. J. Biol. Chem. 254:6426-6432. [PubMed] [Google Scholar]
- 80.Williams, K. P., G. A. Kassavetis, D. R. Herendeen, and E. P. Geiduschek. 1994. Regulation of late-gene expression, p. 161-176. In J. D. Karam et al. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, D.C.
- 81.Yeh, L. S., T. Hsu, and J. D. Karam. 1998. Divergence of a DNA replication gene cluster in the T4-related bacteriophage RB69. J. Bacteriol. 180:2005-2013. [DOI] [PMC free article] [PubMed] [Google Scholar]