Abstract
Like most phages with double-stranded DNA, phage T4 exits the infected host cell by a lytic process requiring, at a minimum, an endolysin and a holin. Unlike most phages, T4 can sense superinfection (which signals the depletion of uninfected host cells) and responds by delaying lysis and achieving an order-of-magnitude increase in burst size using a mechanism called lysis inhibition (LIN). T4 r mutants, which are unable to conduct LIN, produce distinctly large, sharp-edged plaques. The discovery of r mutants was key to the foundations of molecular biology, in particular to discovering and characterizing genetic recombination in T4, to redefining the nature of the gene, and to exploring the mutation process at the nucleotide level of resolution. A number of r genes have been described in the past 7 decades with various degrees of clarity. Here we describe an extensive and perhaps saturating search for T4 r genes and relate the corresponding mutational spectra to the often imperfectly known physiologies of the proteins encoded by these genes. Focusing on r genes whose mutant phenotypes are largely independent of the host cell, the genes are rI (which seems to sense superinfection and signal the holin to delay lysis), rIII (of poorly defined function), rIV (same as sp and also of poorly defined function), and rV (same as t, the holin gene). We did not identify any mutations that might correspond to a putative rVI gene, and we did not focus on the famous rII genes because they appear to affect lysis only indirectly.
INTRODUCTION
The classical T-even phages T2, T4, and T6 infect many strains of Escherichia coli and related genera and produce plaques with fuzzy edges. Occasional mutants produce larger plaques with sharp edges. When liquid cultures of exponentially growing host cells are infected with the wild-type phage, lysis occurs only after several hours and results in titers that can exceed 10¹¹ virions/ml. However, when such cultures are infected with the large-plaque mutants, they typically clear much more rapidly and display titers on the order of 10⁹ virions/ml. These mutants were accordingly named “r” for rapid lysis (20) and reflect the failure of the process of lysis inhibition (LIN), wherein cells that are reinfected (“superinfected”) several minutes after the first infection greatly extend their latent periods and produce much larger burst sizes (12). This is a highly effective mechanism that senses the impending local disappearance of uninfected host cells and adopts a strategy that maximizes the final yield of progeny. As such, it has long been of interest not only to virologists but also to students of evolution and life histories.
Genetic studies in the 1940s to 1960s, mainly using T2 and T4, defined three r loci, rI, rII, and rIII, while subsequent studies implicated three more loci, rIV, rV, and rVI. Based mainly on summarized T4 genomics (28), map locations of confirmed r genes (rI to rV) and related genes are shown in Fig. 1 and some parameters of genes rI through rV are summarized in Table 1. We sought to determine whether this roster was complete. The matter of discovery was complicated from the beginning by the properties of rII mutations, which produce the R phenotype on some hosts (such as E. coli B) but not on others (such as E. coli K-12, where lysis inhibition is close to normal). It was further complicated by the rV locus, where a small number of missense mutations produce the R phenotype whereas numerous null alleles such as chain terminators are lethal. Our search was fortuitously initiated by a decision to use the rI locus as a mutation reporter because there seemed to be no constraints on the kinds of mutations it could report, including many kinds of base pair substitutions (BPSs), and because rI mutations appear to display no epistatic interactions with genes underlying T4 DNA transactions, as do the previously widely used rII mutations (5, 6). Using r mutants harvested from E. coli strains that do not display the R phenotype of rII mutants, DNA sequencing revealed that about one-third of new r mutants carried alleles of loci other than rI. We used locus-specific sequencing to assign a number of these mutations to known loci and ultimately used whole-genome sequencing to examine the residual mutants whose mutations had not been thus localized.
Fig. 1.
Map of the T4 r genes and some ancillary partners. Genes are centered on their midpoints. Genes in boldface are the classical r genes, although the role of the rII genes in lysis inhibition is almost certainly indirect. Gene e encodes the lytic lysozyme, and the gene pair pseT.2 and pseT.3 promote the last step in cell lysis, but none of these three genes yield r mutations, nor does the rI.−1 gene in the rI operon at the top. PL and T indicate positions of a late promoter and a transcription terminator, respectively.
Table 1.
T4 r genes
Gene^a | Other name(s) | Location^b (start→stop) | Size (nt) | Protein size (aa) |
---|---|---|---|---|
rI | tk.−2 | 59,495→59,202 | 294 | 97 |
rIIA | | 2,189→12 | 2,178 | 725 |
rIIB | | 168,903→167,965 | 939 | 312 |
rIII | | 131,033→130,785 | 249 | 82 |
rIV | sp, 61.3 | 20,051→19,758 | 294 | 97 |
rV | t | 160,221→160,877 | 657 | 218 |
^a See text concerning the rI.1 gene, which yields many alleles with R phenotypes but whose status as an r gene is uncertain.
^b Values are along the 168,903-nt T4 chromosome (28). See text for canonical references and for a description of the entire rI operon.
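The sizes in Table 1 follow directly from the map coordinates. The short Python sketch below illustrates the arithmetic; the function names are hypothetical, and it assumes that each coordinate pair is inclusive and spans exactly one stop codon, an assumption that reproduces every size listed in the table.

```python
# Coordinates (start, stop) copied from Table 1; names of helpers are illustrative.
GENES = {
    "rI": (59495, 59202),
    "rIIA": (2189, 12),
    "rIIB": (168903, 167965),
    "rIII": (131033, 130785),
    "rIV": (20051, 19758),
    "rV": (160221, 160877),
}


def gene_length_nt(start, stop):
    """Inclusive length of a gene in nucleotides, regardless of orientation."""
    return abs(start - stop) + 1


def protein_length_aa(length_nt):
    """Residue count, assuming the ORF length includes one stop codon."""
    return length_nt // 3 - 1


for name, (start, stop) in GENES.items():
    nt = gene_length_nt(start, stop)
    print(name, nt, protein_length_aa(nt))
```

For rI, for example, this gives 294 nt and 97 residues, matching the table.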
MATERIALS AND METHODS
Mutants.
Diverse studies over the past decade in which the rI gene was used as a mutation reporter generated a collection of spontaneous mutants bearing r mutations that did not sequence to the rI gene (3, 4, 5, 21, 42, 43, 50). These studies employed a variety of genetic backgrounds (including mutators) that are largely irrelevant to the present analysis because they did not appear to strongly disfavor the production of either BPSs or small indels. The mutants were recovered from lawns of E. coli BB or K-12 cells, thus excluding the rII mutations that produce an R+ phenotype on those hosts. In addition to the desired rI mutants, unsystematic samples of other r mutants lacking an rI mutation were selected for further study.
In some spectra we display only BPSs, although indels were sometimes frequent. Multiple mutations were observed at some sites (hot spots), especially in rI, but we usually display only single rather than cumulative BPSs; the cumulative counts can be assessed by reference to the source publications (3, 4, 5, 21, 42, 43, 50) or the figure legends.
Sequencing.
We first sequenced mutant DNAs (in both directions) for the well-characterized genes rI and rI.1 (5), rIII (39), rIV (23), and rV (13) using the primers listed in Table 2.
Table 2.
Primers
Gene(s) | Primer use | Primer sequences (5′→3′) |
---|---|---|
rI.1, rI | Amplification | ATTTATGTTTCTTTGTGTAG + CCTAAGTATTCATCTGCCTTTG |
rI.1, rI | Sequencing | ATTTATGTTTCTTTGTGTAG + CCTAAGTATTCATCTGCCTTTG |
rIII | Amplification | TGCGAAGTTGGTGATTTG + AACCATAAACCCGTCATC |
rIII | Sequencing | TGCGAAGTTGGTGATTTG + AACCATAAACCCGTCATC |
rIV | Amplification | TGTGAAGTGGATTGCCGAG + TCGGCTCATAGGACACAAC |
rIV | Sequencing | TGTGAAGTGGATTGCCGAG + GAGCAGCACAAGAAGC, GCTTCTTGTGCTGCTC + TCGGCTCATAGGACACAAC |
rV | Amplification | GTTCTTTGCTGTTTTAG + GAGTGGTATAGTTAATG |
rV | Sequencing | GTTCTTTGCTGTTTTAG + GAGGGTTTTGAGGGTGT |
Mutants containing mutations not localized in this manner were then subjected to genomic sequencing. Sequencing primers were designed using Primer3 (40) and were purchased from IDT (Coralville, IA). The average amplicon length was about 700 bp, with an average amplicon overlap of about 150 bp. Liquid-handling steps were automated on the BioMek FX robot (Beckman Coulter, Brea, CA). Cycle sequencing reactions were performed at 1/64 Big Dye reaction scale (cycle sequence version v1.1; Applied Biosystems, Carlsbad, CA). Thermocycling steps were run on MJ Tetrads (Bio-Rad, Hercules, CA). A magnetic-bead purification system (Agencourt Bioscience, Beverly, MA) was used for post-PCR and cycle sequencing reaction cleaning. Bead-cleaned sequencing reactions were run on 48-capillary ABI 3730 sequencers (Applied Biosystems). Both strands were sequenced. Sequence data files were uploaded into the Phred/Phrap/Consed/PolyPhred suite of tools (32) on a Linux server for quality analysis and polymorphism detection. Briefly, trace file analysis and single-nucleotide polymorphism (SNP) detection were conducted with Phred/Phrap (15, 16) and PolyPhred (32), respectively, and Consed (19) provided the interface for viewing the trace files and SNP detection results.
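To give a sense of the scale implied by the amplicon figures, the following Python sketch estimates how many amplicons a uniform tiling would need. It is only an approximation under stated assumptions: every amplicon is taken to be exactly 700 bp with exactly 150 bp of overlap (the text gives only averages), the genome is treated as a linear 168,903-nt sequence, and the function name is hypothetical.

```python
import math


def amplicons_needed(genome_len, amplicon_len=700, overlap=150):
    """Minimum number of equal-length amplicons that tile a linear sequence
    when each amplicon overlaps the previous one by `overlap` bases."""
    if amplicon_len <= overlap:
        raise ValueError("amplicon length must exceed the overlap")
    if genome_len <= amplicon_len:
        return 1
    step = amplicon_len - overlap  # new sequence contributed by each extra amplicon
    return 1 + math.ceil((genome_len - amplicon_len) / step)


# Idealized tiling of the 168,903-nt T4 chromosome (assumed uniform amplicons).
print(amplicons_needed(168903))  # 307
```

Under these assumptions, each strand of one genome would require on the order of 300 amplicons.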
RESULTS
The rI operon.
T4 open reading frames (ORFs) tend to reside in clusters that have the same transcriptional orientation. Whether or not clustered genes are cotranscribed, uncharacterized ORFs located clockwise to a named gene g on the standard circular map are designated g.1, g.2, etc., while those located anticlockwise to g are designated g.−1, g.−2, etc. In the case of the rI region, transcription is anticlockwise. The genes rI.1, rI, and rI.−1 appear to constitute an operon with a T4 late promoter (PL) upstream of rI.1 and a typical T4 transcription terminator (T) downstream of rI.−1 (Fig. 1) (28, 33). No farther-upstream promoter without an intervening terminator has been identified for this region (28). The rI and rI.−1 genes overlap slightly: the end of rI and the start of rI.−1 share two bases within the sequence AATGAA. None of our mutations mapped within rI.−1, and a mutated rI.−1 did not express an R phenotype (33), so rI.−1 cannot be an r gene in the classical sense.
(i) rI.
The middle gene, rI, encodes a transmembrane protein that was proposed to be involved in sensing superinfection and signaling lysis inhibition (33). Because of its use as a mutation reporter, many different alleles were discovered, including numerous missense and chain-terminating BPSs (Fig. 2 and Table 3), small and large indels, and complex mutations consisting of two or more nearby or adjacent changes. The R phenotype lends itself well to the display of mutants with less than the null phenotype, and many of the missense mutations appear to be mildly leaky, producing smaller and/or less sharply edged plaques. To date, of the 882 possible rI BPSs (including the synonymous substitutions, which are almost never detected as single mutations), 143 have been detected phenotypically as either missense or chain-terminating mutations. An important aspect of the rI gene is its propensity to produce complex mutations that can be modeled as the result of transient template switching between direct or reverse, perfect or imperfect, or tandem or spaced repeats and including numerous GCG → CTA mutations at positions 146 to 148 (5, 42, 43). This propensity is enhanced by the unusually rich density of potential secondary structure in the individual strands of the rI gene and its immediate upstream region (43).
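The count of possible BPSs follows from the gene length, since each of the 294 bases can be replaced by any of three others. The Python sketch below reproduces that arithmetic and the fraction detected to date; the function names are illustrative only.

```python
def possible_bps(length_nt):
    """Each base can change to any of the three other bases."""
    return 3 * length_nt


def fraction_detected(detected, length_nt):
    """Fraction of all possible BPSs that have been recovered."""
    return detected / possible_bps(length_nt)


print(possible_bps(294))                             # 882
print(round(100 * fraction_detected(143, 294), 1))   # 16.2
```

The 143 BPSs detected phenotypically therefore represent roughly one-sixth of the 882 possible substitutions.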
Fig. 2.
BPSs in the T4 rI gene. BPSs are indicated by letters above the gene sequence. G233C was described previously (33). Italic letters are BPSs that generated chain-terminating codons, whereas all other BPSs generated missense mutations (Table 3). Numerous indels and complex mutations have also been observed.
Table 3.
T4 rI missense mutations
Site | Mutation | Amino acid change | Site | Mutation | Amino acid change | Site | Mutation | Amino acid change |
---|---|---|---|---|---|---|---|---|
002 | T→C | Met→Thr | 098 | T→C | Phe→Ser | 194 | A→T | Lys→Ile |
002 | T→A | Met→Lys | 100 | A→G | Met→Val | 202 | C→T | Pro→Ser |
003 | G→A | Met→Ile | 101 | T→C | Met→Thr | 202 | C→A | Pro→Thr |
003 | G→T | Met→Ile | 102 | G→T | Met→Ile | 203 | C→T | Pro→Leu |
008 | T→C | Leu→Ser | 103 | G→A | Glu→Lys | 203 | C→G | Pro→Arg |
011 | A→C | Lys→Thr | 104 | A→G | Glu→Gly | 203 | C→A | Pro→Gln |
013 | G→T | Ala→Ser | 106 | T→C | Ser→Pro | 205 | T→C | Cys→Arg |
019 | G→T | Ala→Ser | 107 | C→T | Ser→Phe | 205 | T→G | Cys→Gly |
019 | G→A | Ala→Thr | 107 | C→A | Ser→Tyr | 205 | T→A | Cys→Ser |
020 | C→T | Ala→Val | 109 | G→A | Gly→Ser | 206 | G→A | Cys→Tyr |
029 | C→A | Ala→Asp | 109 | G→T | Gly→Cys | 206 | G→T | Cys→Phe |
031 | A→G | Met→Val | 110 | G→A | Gly→Asp | 211 | T→C | Ser→Pro |
031 | A→C | Met→Leu | 110 | G→T | Gly→Val | 221 | A→G | Glu→Gly |
032 | T→G | Met→Arg | 119 | A→C | His→Pro | 223 | T→C | Cys→Arg |
035 | T→G | Leu→Arg | 124 | T→C | Tyr→His | 224 | G→A | Cys→Tyr |
035 | T→C | Leu→Pro | 124 | T→G | Tyr→Asp | 224 | G→T | Cys→Phe |
043 | T→C | Ser→Pro | 125 | A→G | Tyr→Cys | 230 | A→G | Glu→Gly |
050 | T→A | Val→Asp | 142 | A→G | Lys→Glu | 233 | G→C | Arg→Pro |
059 | C→T | Pro→Leu | 145 | A→C | Ser→Arg | 235 | G→A | Gly→Ser |
059 | C→A | Pro→Gln | 146 | G→C | Ser→Thr | 235 | G→T | Gly→Cys |
068 | A→T | Glu→Val | 146 | G→T | Ser→Ile | 235 | G→C | Gly→Arg |
070 | G→A | Ala→Thr | 147 | C→A | Ser→Arg | 236 | G→A | Gly→Asp |
071 | C→T | Ala→Val | 154 | T→C | Ser→Pro | 236 | G→T | Gly→Val |
074 | A→T | Asn→Ile | 157 | T→C | Ser→Pro | 236 | G→C | Gly→Ala |
077 | T→C | Val→Ala | 158 | C→A | Ser→Tyr | 239 | C→T | Ala→Val |
077 | T→A | Val→Asp | 166 | T→A | Phe→Ile | 242 | A→G | Glu→Gly |
079 | G→T | Asp→Tyr | 168 | C→A | Phe→Leu | 243 | G→T | Glu→Asp |
079 | G→A | Asp→Asn | 169 | T→A | Tyr→Asn | 247 | G→A | Ala→Thr |
080 | A→G | Asp→Gly | 170 | A→G | Tyr→Cys | 247 | G→T | Ala→Ser |
080 | A→T | Asp→Val | 175 | T→G | Phe→Val | 248 | C→A | Ala→Glu |
082 | C→T | Pro→Ser | 176 | T→C | Phe→Ser | 248 | C→T | Ala→Val |
082 | C→A | Pro→Thr | 178 | A→C | Met→Leu | 253 | T→C | Ser→Pro |
083 | C→T | Pro→Leu | 178 | A→G | Met→Val | 256 | T→C | Tyr→His |
083 | C→A | Pro→His | 179 | T→C | Met→Thr | 256 | T→A | Tyr→Asn |
085 | C→A | His→Asn | 180 | G→A | Met→Ile | 257 | A→G | Tyr→Cys |
087 | T→A | His→Gln | 181 | A→G | Arg→Gly | 260 | C→A | Ala→Asp |
089 | T→C | Phe→Ser | 182 | G→A | Arg→Lys | 268 | A→G | Met→Val |
090 | T→A | Phe→Leu | 187 | A→C | Thr→Pro | 272 | A→T | Asn→Ile |
091 | G→T | Asp→Tyr | 190 | T→C | Tyr→His | 275 | T→A | Ile→Asn |
092 | A→G | Asp→Gly | 191 | A→G | Tyr→Cys | 281 | T→C | Leu→Ser |
092 | A→C | Asp→Ala | | | | | | |
092 | A→T | Asp→Val | | | | | | |
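The amino acid changes in Table 3 can be checked mechanically against the standard genetic code. The Python sketch below shows one way to do so; the helper names are hypothetical, and apart from the ATG initiation codon (implied by the Met entries at sites 002 and 003), the codons used in the examples are illustrative rather than taken from the rI sequence.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def site_to_codon_position(site):
    """Map a 1-based nucleotide site to (codon number, position within codon)."""
    return (site - 1) // 3 + 1, (site - 1) % 3 + 1


def classify_bps(codon, pos_in_codon, new_base):
    """Return (mutant codon, old residue, new residue, class) for one substitution."""
    mutant = codon[:pos_in_codon - 1] + new_base + codon[pos_in_codon:]
    old, new = CODON_TABLE[codon], CODON_TABLE[mutant]
    if new == old:
        kind = "synonymous"
    elif new == "*":
        kind = "chain-terminating"
    else:
        kind = "missense"
    return mutant, old, new, kind


# Site 002 lies at position 2 of codon 1 (ATG); T->C gives ACG, i.e., Met->Thr.
print(site_to_codon_position(2))
print(classify_bps("ATG", 2, "C"))
```

One-letter residue codes are used in the output (M for Met, T for Thr, and * for a stop codon).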
(ii) rI.1.
A substantial number of mutations generating the R phenotype mapped to the promoter region and to rI.1, including one BPS that converted the stop codon of the upstream tk gene into a missense codon, several BPSs that converted the second of the two rI.1 stop codons into an alternative stop codon or into a missense codon, and one at the first base of the nine-base spacer between rI.1 and rI (Fig. 3). The most striking result was that numerous mutations in the last 18 rI.1 codons produced synonymous codons. In addition, a mutation at position 115 that generates a stop codon resides in a mutant with a weak R phenotype. These observations throw doubt on the function of the rI.1-encoded protein in mediating lysis inhibition under laboratory conditions. No pattern of switching from favored to unfavored T4 synonymous codons was to be seen in these alleles (analysis not shown). Three other, smaller ORFs in the same region both lack suitable locations relative to a promoter and would also harbor synonymous mutations with an R phenotype. To evaluate the possibility that parts of rI.1 mRNA modulate some aspect of T4 gene expression, the wild-type equivalents of sequence segments containing synonymous mutations with R phenotypes were used to search the T4 genome for complements of length sufficient to signal potential interactions, but no hits were obtained. Like rI, however, rI.1 contains considerable capacity for secondary structure; our results are not presented here, but some involving both rI and rI.1 were presented previously (43). Some of these secondary structures overlie rI.1 synonymous mutations, but we have not investigated their potential to diminish the expression of rI.
Fig. 3.
BPSs in and around the T4 rI.1 gene. BPSs producing an R phenotype are indicated by letters above the sequence; a few indels and complex mutations are not shown. TAG starting the top line of sequence is the end of the tk gene. A T4 late promoter is centered on position −21 (underlined). The rI.1 gene starts with the ATG on the second line of sequence and ends with the two chain-terminating codons on the last line of sequence; it is followed by a nine-base spacer that leads into the rI gene. BPSs in italic at positions 97 and 115 generate chain-terminating codons, but the ochre mutation at 115 has a weak R phenotype. Mutations in boldface generate synonymous codons.
(iii) The rI operon terminator.
Two mutations mapped to the terminator (Fig. 4). It is not obvious why mutations within the terminator would produce an R phenotype, and the region downstream from the terminator offers no clue. The BPS at base 407 might suffice to destabilize the initial stem-loop region, while the cluster of five BPSs contains two (at bases 407 and 409) that also might destabilize the initial stem-loop region. A cluster of five closely spaced BPSs in a single mutant strongly signals an ectopic template (42, 43), and a search of the T4 genome quickly revealed a direct repeat of 42 bases corresponding exactly to the mutated sequence in the terminator mutation. Not surprisingly, this donor sequence is itself a terminator (28).
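A search for an ectopic template of this kind amounts to looking for exact copies of the mutated segment elsewhere in the genome. The Python sketch below is a minimal illustration and not the procedure actually used here; the function names are hypothetical, both strands are searched even though the donor found in this case was a direct repeat, and the example sequences are invented.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_donors(genome, mutant_segment):
    """Return sorted (0-based position, strand) pairs for every exact match of
    the mutated segment ('+') or its reverse complement ('-') in the genome."""
    hits = []
    queries = (("+", mutant_segment), ("-", reverse_complement(mutant_segment)))
    for strand, query in queries:
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)
    return sorted(hits)


# Toy example with made-up sequences: one direct and one inverted copy.
toy_genome = "CCGATTACACC" + "TTTGTAATCTT"
print(find_donors(toy_genome, "GATTACA"))  # [(2, '+'), (13, '-')]
```

In practice the site of the mutation itself would be excluded from the hits, and imperfect matches would require an alignment tool rather than exact string search.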
Fig. 4.
Two terminator mutations with an R phenotype. The wild-type sequence begins with the TAA stop codon of the rI.−1 gene and is followed by a T4 Rho-independent transcription terminator starting with a G·C-rich stem-loop whose stem components are indicated by arrows and ending with an underlined A·T-rich segment (28). One mutant contained A → T at base 407 relative to the start of the rI.−1 gene (of length 387 bases). Another mutant bore the five substitutions joined by horizontal lines and spanning bases 407 through 424. The capital letters in the bottom sequence correspond to the 42 bases of a T4 terminator at genomic position 15319 to 15278, and the lowercase letters indicate its flanking sequences, whereas the corresponding bases in the rI terminator are 58811 to 58770 (see text for interpretation).
(iv) stI.
Rare mutants display cleared sectors in the turbid perimeters of wild-type plaques, grow poorly on host cells approaching stationary phase, and suffer an extended latent period that can, however, be overcome by secondary r mutations (47). One cluster of such “star” mutations (stI) was reported to map sufficiently close to the T4 rI gene that it is probably adjacent (24, 47). Thus, stI is probably the same as either rI.1 or rI.−1. However, the star mutants have not been revisited for several decades, and their precise mapping and related physiology remain ambiguous.
rIIA and rIIB.
The rII locus comprises two adjacent genes that encode membrane-associated proteins of poorly understood function. rII mutations display the R phenotype on only a few E. coli strains and appear to compromise lysis inhibition indirectly, perhaps by perturbing membrane functions (33, 44). Because of their inability to grow on λ lysogens, rII mutants were employed in numerous classical studies of intragenic recombination (7) and mutation (9, 10, 17, 18) requiring the measurement of small frequencies of r+ virions. However, the usefulness of the rII system as a mutation reporter gradually eroded as more and more examples of complex epistatic interactions between mutations in rII and in diverse genes encoding proteins involved in DNA transactions arose (6) and because of the roughly 3-kb size of the rII locus, which rendered sequencing tedious. Despite a large literature on forward and reverse mutation in the rII region, the conditionality of the R phenotype and the resulting implication that its role in lysis inhibition is indirect lead us not to consider the rII genes further here.
rIII.
After rII, rIII was the second r gene to be sequenced (39). Its role in lysis inhibition remains poorly defined, but the R phenotype is generated by 12 missense mutations, two chain termination mutations, and one frameshift mutation scattered across the gene (Fig. 5). rIII mutations seem to produce the R phenotype on all tested E. coli strains, but the plaques are somewhat smaller and less sharply edged in our hands than indicated in the canonical report. As a consequence, we may have undersampled rIII mutants with less than the null phenotype.
Fig. 5.
Mutations in the T4 rIII gene. Mutations are shown by letters above the nucleotide sequence. Mutations in italic at sites 34 and 48 produce chain termination codons, while the remaining BPSs produce missense mutations. Mutations at sites 34, 48, 125, 126, 209, and 244 were described in the canonical report (39), and the remainder are from the present compilation. Because the mutation at 200 arose in combination with a frameshift mutation at positions 204 to 205 in rIV, its phenotype as a single rIII mutation is uncertain. Similarly, the mutation at 209 arose in combination with one of three occurrences of the ochre mutation at position 34, so its phenotype as a single mutation is also uncertain. The insertion of AC into ACAC at positions 238 to 241 results in the replacement of Leu-Lys by His at the end of the protein and replaces TAA with TGA as the termination signal.
rIV.
rIV was first defined by mutants with a temperature-sensitive R phenotype (25) and was later found to be the same as sp (for spackle), an immediate-early gene whose mutations allow limited lysis of cells infected with mutants defective in the e (lysozyme) gene and sometimes also exhibit a temperature-sensitive R phenotype (14, 23, 35). (sp is mistakenly equated with rV instead of rIV in Table 1 but not in Table 2 of reference 28.) The C-terminal 75-residue polypeptide of the rIV-encoded protein was predicted to localize in the periplasmic space after the removal of an N-terminal signal sequence (23), and rIV/sp clearly directs multiple aspects of membrane-mediated phenotypes and shares some functions with those of the imm (immunity) gene. These include establishing resistance to lysis from without by T4 “ghost” particles (which can adsorb but lack injectable DNA) and blocking productive infection by superinfecting particles (reviewed in reference 1), as well as roles in LIN (2).
The rIV mutation spectrum appears in Fig. 6. It has two remarkable aspects. First, with the exception of the engineered chain termination mutation at positions 41 to 42, all eight other mutations are indels scattered widely across the sequence, whereas the large majority of mutations in most T4 mutational spectra (and most spectra in general) are BPSs. One possibility is that rIV is overloaded with frameshift-prone regions, which in T4 consist of A·T homonucleotide runs of length five (warm spots) or six (hot spots). Compared to rI (three 5-mers and no 6-mers) and rIII (one 5-mer and no 6-mers), rIV (with four 5-mers and one 6-mer) bears a typical load of frameshift-prone regions, which together provide five of its eight indels. A second possibility is that the present collection of alleles is a statistically freakish sample. A third possibility is that the protein is unusually insensitive to missense mutations, at least with respect to their potential to produce the R phenotype at our standard plating temperature of 37°C; the rII locus is somewhat similar (44). However, rIV does contain 33 well-scattered codons that can mutate to stop codons by single BPSs, a number similar to that in the identically sized rI gene, so that a typical ratio of BPSs to indels should have produced some chain termination mutations.
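Counts of frameshift-prone runs such as those quoted above can be obtained by scanning a sequence for maximal runs of A or of T. The Python sketch below is one way to do this; the function name is hypothetical, runs longer than six bases are simply not tallied, and the example sequence is invented rather than taken from any r gene.

```python
import re


def at_run_counts(seq, lengths=(5, 6)):
    """Count maximal runs of A or of T whose length is one of `lengths`."""
    counts = {n: 0 for n in lengths}
    for match in re.finditer(r"A+|T+", seq.upper()):
        n = len(match.group())
        if n in counts:
            counts[n] += 1
    return counts


# Made-up sequence containing one 5-mer (AAAAA) and one 6-mer (TTTTTT).
print(at_run_counts("GAAAAACGTTTTTTCAAAT"))  # {5: 1, 6: 1}
```

Because a run of A on one strand is a run of T on the other, scanning a single strand for both bases covers all A·T homonucleotide runs.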
Fig. 6.
Mutations in the T4 rIV gene. All single-base additions and deletions detected to date are indicated by “+” and “−,” respectively, centered above single-base runs where appropriate; “−(GT)” in the fifth line of sequence indicates a two-base deletion, which was accompanied by a second r gene mutation, C → A at rIII base 200. The mutation at bases 41 to 42 is a double replacement engineered to generate a TAG chain termination mutation; it and the −A deletion at positions 262 to 266 were described in the canonical report (23), while the other mutations are from the present collection.
The second anomaly in the rIV spectrum is the three mutations in the six-A·T run at codons 2 and 3. There are three six-A·T runs in the rII genes, and they account for half of all spontaneous rII mutations, a ratio shaped in part by the bias against BPS detectability in the rII genes. Although these three rII runs vary by 10-fold in specific mutability, they appear to produce similar numbers of single-base additions and deletions, with a possible modest bias toward deletions (44, 45). Thus, the absence of deletions at the rIV hot spot was surprising. One possibility is a sampling anomaly. Another is that the deletions but not the additions result in a toxic polypeptide, a phenomenon already encountered with a frameshift mutation in the rIIB gene (31). Yet another possibility is translation reinitiation (30), wherein the −1 reading frame creates a UAG termination codon at new codon 10, which is followed about 9 codons farther along by an AUG codon in the normal rIV reading frame that might suffice to reinitiate a functional protein.
rV.
Like rIV, rV was also first defined by mutants with a temperature-sensitive R phenotype (25). It was later found to be identical to the t gene (13, 29), which encodes the holin that passes the lysozyme to its target, thus triggering lysis (22), and which interacts with the protein encoded by rI (33, 38, 48). (t in this case stands not for time of lysis but for the unfortunate Tithonus from Greek mythology.) Although several other phenotypes can result from some missense t mutations, conditionally null alleles are lethal in a nonsuppressing host (13). Not surprisingly, therefore, the five r alleles of t are all missense mutations (Fig. 7). These five mutations are confined to the first 37% of the gene, but because only one of them (at position 115) was recovered twice, it is likely that the spectrum is far from saturated; nevertheless, given the 657-nucleotide (nt) size of the gene, the small number of alleles found to date probably also reflects a sharply restricted mutational target for the R phenotype.
Fig. 7.
Mutations with R phenotypes in the T4 t gene. Mutations are shown by letters above the nucleotide sequence. The underlined sequence, corresponding to amino acids 35 to 55, defines the probable transmembrane domain in the T protein (48). Mutations at nucleotides 14, 115, and 224 are from the canonical rV report (13), and the others are from the present collection (including one repeat of the mutation at position 115). All of the entries are missense mutations.
Other r genes?
We next considered an early report of mutations at a locus called rVI that mapped between genes 39 and 56 (53). The defining mutants seem to have been lost, and the candidate map region is bereft of useful markers. We therefore turned to genomic sequencing of the 16 remaining unmapped mutations in mutants with possible R phenotypes, all of which had wild-type sequences for the r genes considered above. Two of these mutants bore mutations of the rI terminator as described above. The remaining 14 bore various background mutations that were also present in their parental and/or sibling strains lacking any R-like phenotype and that had accumulated in our T4D and T4B and mutagenized derivatives over more than half a century. All of these mutants had ambiguous R phenotypes, and several produced small but sharp-edged plaques, as often seen in segregants from T4D × T4B crosses that might have occurred in their lineages.
Two other T4 genes linked to lysis physiology, pseT.2 and pseT.3, are members of a family widely distributed among phages with double-stranded DNA (8, 46). They encode N-terminal transmembrane domains and act to modulate lysis in the presence of higher concentrations of divalent cations (such as 10 mM Mg²⁺) that may occur in some natural environments, although not in laboratory broths. In the case of phage λ, their proteins interact so as to span the inner and outer membranes and appear to promote a final step in lysis leading to membrane fusion and expedited release of progeny particles. The T4 pseT.2 and pseT.3 genes share a short sequence that includes the end of pseT.3 and the start of pseT.2 with a two-base overlap, with the gene pair being preceded by both early and late promoters and followed at some distance by a terminator (28). Our sequencing revealed no r mutations in either gene, and a mutant with both T4 genes deleted did not display an R phenotype (46).
DISCUSSION
Lysis and LIN.
Phages with double-stranded DNA encode a lysis system based on a holin (sometimes including an antiholin), an endolysin, and some ancillary proteins (54). The physiological and molecular foundations of phage-induced lysis have been extensively studied by the Ryland Young laboratory, most deeply in phage T4, which displays LIN, and in phage λ, which does not. LIN is initiated by superinfection of T4-infected cells later than about 3 min after the primary infection and is extended by continued superinfection (1, 22).
The T4 holin T (encoded by t, also known as rV) bears an N-terminal domain of about 34 amino acids that localizes to the cytoplasm, a transmembrane domain (probably amino acids 35 to 55) that localizes to the membrane, and a C-terminal domain of about 163 amino acids that localizes to the periplasmic space (37, 48). t was originally identified as a gene required for lysis; its mutational inactivation both blocked lysis and allowed long-continued phage synthesis (22). It was eventually identified as encoding a holin by its ability to replace the holin gene of phage λ (36). T was initially proposed to accumulate as growing membrane-anchored rafts that eventually reached a size sufficient to promote reorganization into a structure bearing a central hole large enough to allow the endolysin protein to pass through the membrane and degrade the cell wall (54) but is now recognized to accumulate as dispersed membrane-embedded molecules that eventually reorganize into rafts (52). The holins of phage λ and probably phage T4 turn out to produce extraordinarily large holes and may lead to a general membrane disruption rather than or in addition to the holes themselves (11, 34, 41, 51). Different t alleles result in lysis at different times, and the holin seems to be the sole determinant of time of lysis following infection by a single phage particle (37, 54).
The RI protein is an antiholin that can trigger LIN. The N-terminal end of the RI protein bears a signal anchor release domain (residues 2 to 22) that causes secretion of RI into the periplasm in a membrane-tethered form, followed by direct release into the periplasm without associated proteolysis (48). When RI is expressed from a plasmid, LIN occurs even without superinfection, so that RI plus T suffice to bring about LIN (38). RI interacts directly with T via their respective C-terminal periplasmic domains, and the RI C-terminal sequence after the signal anchor release domain suffices for LIN (38, 48). RI is normally present in very small amounts and is rendered unstable (with a half life of about 2 min) by its signal anchor release domain, which seems to recruit or enable the DegP protease (38, 49). Thus, the unknown LIN-inducing signal resulting from superinfection may act either to activate RI (33) or to stabilize it (49). RIII, whose specific function is unknown but which produces a transient LIN and whose mutants produce a rather weak R phenotype (33, 38; this report), has been predicted to reside in the cytoplasm, to interact with the T protein, and to act by stabilizing the LIN state (38).
The expression patterns of the components of the lysis system are somewhat unclear. t was described as driven by a late promoter with terminators both directly after t and shortly upstream of the promoter (28); it was also described as expressed both early and late but not at middle times, although the early signal was probably artifactual (25). The endolysin-encoding gene e was described as transcribed both early and late, but E is also regulated translationally and is transcribed mainly late (26, 27); however, e is also described as driven by a cluster of late promoters preceded by a terminator, with the next promoter considerably downstream of e (28). The expression of the rI operon may also be complicated. A late promoter sits immediately upstream and a terminator sits immediately downstream of the operon, with a rather distant upstream early promoter blocked by an intervening upstream terminator (28). However, cited but unpublished reverse transcription-PCR (RT-PCR) data indicated rI expression both early and late (33) and microarray analysis indicated that rI and rI.1 are expressed at middle times while rI.−1 is expressed late (26). rIII is closely followed by two terminators and is preceded closely by a late promoter, then by a middle promoter, then somewhat farther back by an early promoter, and then by a terminator (28). In microarray analysis, rIII is expressed at middle times (26). Given the vagaries of microarray analysis and in silico analysis of gene regulation, the predominant picture of these genes as expressed late may be subtly supplemented by some functionally significant earlier expression.
Insights from mutation spectra.
A conventional mutation spectrum based on a protein-encoding gene displays a large number of BPSs. Among these is a much smaller number of chain termination mutations, which are particularly useful for estimating BPS mutation rates because chain terminators are usually detected with high efficiency, except occasionally near the end of the gene. Indels, mostly ±1 nt, usually comprise 5 to 20% of spontaneous mutations, with the value depending primarily upon the efficiency with which the BPSs are detected and the density of indel-prone short-sequence repeats. Deviations from these patterns may suggest physiological aspects of the function of a gene.
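The bookkeeping behind such a spectrum can be sketched in a few lines of Python. The sketch below enumerates every single-base substitution in an ORF and tallies it as synonymous, missense, nonsense (chain terminating), or stop-loss under the standard genetic code; it also counts the codons at which a single BPS can create a chain termination codon. The function name `classify_bps`, the category labels, and the toy sequence are illustrative assumptions and are not taken from the analyses reported here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_bps(orf):
    """Tally all single-base substitutions in an ORF by coding consequence.

    Returns (counts, n_codons), where n_codons is the number of codons at
    which at least one single substitution creates a stop codon.
    """
    orf = orf.upper()
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of 3")
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0, "stop_loss": 0}
    nonsense_codons = set()
    for i in range(0, len(orf), 3):
        codon = orf[i:i + 3]
        wild = CODON_TABLE[codon]
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
                if mutant == wild:
                    counts["synonymous"] += 1
                elif mutant == "*":
                    counts["nonsense"] += 1
                    nonsense_codons.add(i // 3)
                elif wild == "*":
                    counts["stop_loss"] += 1
                else:
                    counts["missense"] += 1
    return counts, len(nonsense_codons)


# Toy ORF (hypothetical, not a T4 sequence): Met-Trp-stop.
counts, n_codons = classify_bps("ATGTGGTAA")
print(counts, n_codons)
```

Applied to an r gene sequence, the nonsense tally gives the number of BPS paths to chain termination codons against which the recovered chain terminators can be compared.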
(i) rI.
The rI mutation spectrum is conventional and displays a fairly typical recovery of BPSs, i.e., about 21% of all possible BPSs, and is not yet saturated. Some recovered rI BPS mutations are leaky, the plaques being smaller and softer edged than with the strong R phenotype. In comparison, the well-saturated lacZα spectrum, based on mutations in a transgene in phage M13 growing in E. coli, detects about 35% of BPSs, including many with very weak phenotypes (3, 4, 5), while the nearly saturated URA3 spectrum in Saccharomyces cerevisiae detects about 18% of BPSs (S. A. Lujan and T. A. Kunkel, unpublished data) and may fail to detect even slightly leaky mutants because it results from a lethal selection. The rI mutations are mainly missense and are scattered across the entire gene, indicating that a broad range of alterations can abolish gene function both in the N-terminal signal anchor release domain and throughout the rest of the gene, which encodes at least two functions, interacting with the T protein and receiving and transmitting the LIN-inducing superinfection signal.
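As a rough guide to what such a recovery percentage means, each nucleotide can mutate to three others, so a target of L nt offers 3L possible BPSs. The following Python sketch does this arithmetic for rI, taking the 294-nt length cited in this report. Whether the 21% figure was computed against exactly this denominator is an assumption, and the function names are hypothetical.

```python
def possible_bps(length_nt):
    """Number of distinct single-base substitutions in a target of length_nt."""
    return 3 * length_nt


def recovered_fraction(distinct_observed, length_nt):
    """Fraction of all possible BPSs represented among the distinct observed ones."""
    return distinct_observed / possible_bps(length_nt)


RI_LENGTH_NT = 294  # rI length as cited in this report

total = possible_bps(RI_LENGTH_NT)
# Approximate number of distinct BPSs implied by a recovery of about 21%
# (an illustrative back-calculation, not a reported count).
implied = round(0.21 * total)
print(total, implied, round(recovered_fraction(implied, RI_LENGTH_NT), 3))
```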
(ii) In and around rI.1.
The mutations in the last base of the upstream tk gene and in the late-promoter region are likely to perturb the transcription of the rI operon, producing the R phenotype. In addition, many of the BPSs within the rI.1 ORF produce synonymous codons: 10 out of 21 altogether and 10 out of 18 of the mutated sites in the last 58 nt (27%) of the gene, including the two terminal stop codons. Moreover, one of the two internal chain termination mutations is leaky, and no other chain termination mutations were observed at any of the remaining 33 sites where a BPS could generate a chain termination codon. The synonymous mutations did not present a regular pattern of switching between high-usage and low-usage codons. Because of the possibility that these synonymous mutations might affect sequences involved in some aspect of RNA-mediated gene expression, we conducted searches for regions of homology between this region and both the T4 and the host genome, but we encountered no interesting hits. The rI.1 mutation spectrum therefore suggests that the rI.1 ORF is not an r gene in the classical sense. One possible explanation would be that the formation or disruption of secondary structures in this region blocks transcription into the rI ORF.
(iii) rI operon transcription terminator mutations.
The two mutations shown in Fig. 4 might be able to disrupt the function of the rI operon transcription terminator, and one of them also provides a further example of a templated complex mutation. However, they pose an irritating question: why should transcriptional readthrough generate an R phenotype?
In sum, only one of the three genes of the rI operon displays clear functional properties and traditional mutational propensities. Perhaps the other two modulate lysis and LIN in environments encountered regularly by T4 in nature but poorly explored in the laboratory, including anaerobiosis and desiccation.
(iv) rIII.
Of the 15 rIII mutations defined to date by sequencing by us and others, two are of unknown phenotype because they arose in combination with a null allele at some other site (one site in rIII and one site in rIV). Although rIII (249 nt) is similar in size to rI (294 nt), recorded rI mutations clearly outnumber recorded rIII mutations. In both our laboratory and that of Ryland Young, rIII mutations tended to display a weaker R phenotype than is characteristic of strong rI mutations. Because only 2 of 26 BPS paths to chain termination codons were detected and single-base indels were seen in neither an (A)5 nor two (A)4 indel-prone sequences, null mutations by themselves do not seem to suffice for detection. Perhaps a generally weak R phenotype for all rIII mutations underlies the paucity of detected rIII mutations.
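Indel-prone homopolymer runs such as (A)5 and (A)4 can be located with a simple scan. The Python sketch below lists every run of a single base at or above a chosen length. The function name, the default threshold of 4, and the example sequence are assumptions made for illustration; the sequence is not that of rIII.

```python
import re


def homopolymer_runs(seq, min_len=4):
    """Return (1-based start, base, run length) for each run of >= min_len identical bases."""
    seq = seq.upper()
    # One base followed by at least (min_len - 1) repeats of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start() + 1, m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq)]


# Hypothetical sequence containing one (A)5 and one (A)4 run.
example = "GCAAAAATTGAAAACC"
for start, base, length in homopolymer_runs(example):
    print(f"({base}){length} starting at nt {start}")
```

Comparing the positions of such runs with the positions of recovered ±1-nt indels shows whether the expected hot spots are represented in a spectrum.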
(v) rIV.
The rIV spectrum is remarkable for its complete lack of BPSs, especially because BPSs strongly outnumber indels in the large majority of mutational spectra from many organisms. As expected, the sequences (A)6 and (A)5 are indel prone. Although the chain termination mutation at codon 48 was engineered by a double BPS, rIV contains 39 codons at which chain termination codons can arise by single BPS mutations. Thus, sequence does not explain the predilection for indels. At least three improbable possibilities can be entertained: the spectrum represents an extreme sampling deviation (compared to an expectation of 1 indel and 7 BPSs, 8 indels and 0 BPSs give P ≈ 0.0002), rIV is intrinsically hardly mutable by BPSs (which is unprecedented in diverse organisms), or most BPSs produce either a dominant-lethal or a wild-type phenotype (for which we know of no precedents).
(vi) rV.
Because null alleles of rV are lethal, neither indels nor chain termination mutations are expected. Indeed, all five of the r mutations in t are BPSs producing missense mutations. The cytoplasmic domain and the transmembrane domain each harbor one missense r allele, while the periplasmic domain harbors three. Thus, the interactions between the RI and T proteins might engage all three domains of each protein, a pattern not addressed in current models, or polypeptide structural changes might propagate into adjacent domains.
(vii) Other r genes?
It would not be surprising to discover other genes that sport mutant alleles displaying an R phenotype that is conditional on the medium, the growth temperature, or some other physiological variable that might affect, for instance, membrane integrity. These might be special alleles of an otherwise non-r gene, as in the case of the rV mutations, or might produce phenotypically weak R plaques. In addition, T4 has numerous very small ORFs and might have even smaller transcripts encoding untranslated RNAs, and any such short sequences might be r genes with mutational target sizes so small as to have escaped detection. However, the roster of r genes described here seems to be otherwise complete.
ACKNOWLEDGMENTS
We especially thank Geraldine Carver, now retired, who conducted the sequencing that assigned most r mutants to known genes. We also thank Ry Young for numerous very helpful suggestions and Libertad García-Villada and Jana Stone for critical readings of the paper.
This research was supported by funds allocated to project numbers Z01ES061054, Z01ES065011, and Z01ES065016 of the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences.
Footnotes
Published ahead of print on 13 May 2011.
REFERENCES
- 1. Abedon S. T. 1994. Lysis and the interaction between free phages and infected cells, p. 397–405 In Karam J. D. (ed.), Molecular biology of bacteriophage T4. American Society for Microbiology, Washington, DC
- 2. Abedon S. T. 1999. Bacteriophage T4 resistance to lysis-inhibition collapse. Genet. Res. Camb. 74:1–11
- 3. Bebenek A., et al. 2002. Dissecting the fidelity of bacteriophage RB69 DNA polymerase: site-specific modulation of fidelity by polymerase accessory proteins. Genetics 162:1003–1018
- 4. Bebenek A., Carver G. T., Kadyrov F. A., Kissling G. E., Drake J. W. 2005. Processivity clamp gp45 and ssDNA-binding-protein gp32 modulate the fidelity of bacteriophage RB69 DNA polymerase in a sequence-specific manner. Genetics 169:1815–1824
- 5. Bebenek A., et al. 2001. Interacting fidelity defects in the replicative DNA polymerase of bacteriophage RB69. J. Biol. Chem. 276:10387–10397
- 6. Bebenek A., Smith L. A., Drake J. W. 1999. Bacteriophage T4 rnh (RNase H) null mutations: effects on spontaneous mutation and epistatic interaction with rII mutations. J. Bacteriol. 181:3123–3128
- 7. Benzer S. 1955. Fine structure of a genetic region in bacteriophage. Proc. Natl. Acad. Sci. U. S. A. 41:344–354
- 8. Berry J., Summer E. J., Struck D. K., Young R. 2008. The final step in the phage infection cycle: the Rz and Rz1 lysis proteins link the inner and outer membranes. Mol. Microbiol. 70:341–351
- 9. Brenner S., Stretton A. O. W., Kaplan S. 1965. Genetic code: the ‘nonsense’ triplets for chain termination and their suppression. Nature 206:994–998
- 10. Crick F. H. C., Barnett L., Brenner S., Watts-Tobin R. J. 1961. General nature of the genetic code for proteins. Nature 192:1227–1232
- 11. Dewey J. S., et al. 2010. Micron-scale holes terminate the phage infection cycle. Proc. Natl. Acad. Sci. U. S. A. 107:2219–2223
- 12. Doermann A. H. 1948. Lysis and lysis inhibition with Escherichia coli bacteriophage. J. Bacteriol. 55:257–276
- 13. Dressman H. K., Drake J. W. 1999. Lysis and lysis inhibition in bacteriophage T4: rV mutations reside in the holin t gene. J. Bacteriol. 181:4391–4396
- 14. Emrich J. 1968. Lysis of T4-infected bacteria in the absence of lysozyme. Virology 35:158–165
- 15. Ewing B., Green P. 1998. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8:186–194
- 16. Ewing B., Hillier L., Wendl M. C., Green P. 1998. Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res. 8:175–185
- 17. Freese E. 1959. The difference between spontaneous and base analogue induced mutation of phage T4. Proc. Natl. Acad. Sci. U. S. A. 45:622–633
- 18. Freese E., Bautz-Freese E., Bautz E. 1961. Hydroxylamine as a mutagenic and inactivating agent. J. Mol. Biol. 3:133–143
- 19. Gordon D., Abajian C., Green P. 1998. Consed: a graphical tool for sequence finishing. Genome Res. 8:195–202
- 20. Hershey A. D. 1946. Mutation of bacteriophage with respect to type of plaque. Genetics 31:620–640
- 21. Jacewicz A., Makiela K., Kierzek A., Drake J. W., Bebenek A. 2007. The roles of Tyr391 and Tyr619 in RB69 DNA polymerase replication fidelity. J. Mol. Biol. 368:18–29
- 22. Josslin R. 1970. The lysis mechanism of phage T4: mutants affecting lysis. Virology 40:719–726
- 23. Kai T., Ueno H., Otsuka Y., Morimoto W., Yonesaki T. 1999. Gene 61.3 of bacteriophage T4 is the spackle gene. Virology 260:254–259
- 24. Krylov V. N. 1971. Star mutants of the bacteriophage T4B. Genetika 7:112–119
- 25. Krylov V. N., Zapadnaya A. A. 1965. Bacteriophage T4B r-mutations sensitive to temperature (rts). Genetika 1:7–11
- 26. Luke K., et al. 2002. Microarray analysis of gene expression during bacteriophage T4 infection. Virology 299:182–191
- 27. McPheeters D. S., Christensen A., Young E. T., Stormo G., Gold L. 1986. Translational regulation of expression of the bacteriophage T4 lysozyme gene. Nucleic Acids Res. 14:5813–5826
- 28. Miller E. S., et al. 2003. Bacteriophage T4 genome. Microbiol. Mol. Biol. Rev. 67:86–156
- 29. Montag D., Degen M., Henning U. 1987. Nucleotide sequence of gene t (lysis gene) of the E. coli phage T4. Nucleic Acids Res. 15:6736
- 30. Napoli C., Gold L., Singer B. S. 1981. Translational reinitiation in the rIIB cistron of bacteriophage T4. J. Mol. Biol. 149:433–449
- 31. Nelson M. A., Singer B. S., Gold L., Pribnow D. 1981. Mutations that detoxify an aberrant T4 membrane protein. J. Mol. Biol. 149:377–403
- 32. Nickerson D. A., Tobe V. O., Taylor S. L. 1997. PolyPhred: automating the detection and genotyping of single nucleotide substitutions using fluorescence-based resequencing. Nucleic Acids Res. 25:2745–2751
- 33. Paddison P., et al. 1998. The roles of the bacteriophage T4 r genes in lysis inhibition and fine-structure genetics: a new perspective. Genetics 148:1539–1550
- 34. Pang T., Savva C. G., Fleming K. G., Struck D. K., Young R. 2009. Structure of the lethal phage pinhole. Proc. Natl. Acad. Sci. U. S. A. 106:18966–18971
- 35. Peterson R. F., Cohen P. S., Ennis H. L. 1972. Properties of phage T4 messenger RNA synthesized in the absence of protein synthesis. Virology 48:201–206
- 36. Ramanculov E., Young R. 2001. Functional analysis of the phage T4 holin in a λ context. Mol. Genet. Genomics 265:345–353
- 37. Ramanculov E., Young R. 2001. Genetic analysis of the phage T4 holin: timing and topology. Gene 265:25–36
- 38. Ramanculov E., Young R. 2001. An ancient player unmasked: T4 rI encodes a t-specific antiholin. Mol. Microbiol. 41:575–583
- 39. Raudonikiene A., Nivinskas R. 1993. The sequences of gene rIII of bacteriophage T4 and its mutants. Gene 134:135–136
- 40. Rozen S., Skaletsky H. 2000. Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol. 132:365–386
- 41. Savva C. G., et al. 2008. The holin of bacteriophage lambda forms rings with large diameter. Mol. Microbiol. 69:784–793
- 42. Schultz G. E., Jr., Carver G. T., Drake J. W. 2006. A role for replication repair in the genesis of templated mutations. J. Mol. Biol. 358:963–973
- 43. Schultz G. E., Jr., Drake J. W. 2008. Templated mutagenesis in bacteriophage T4 involving imperfect direct or indirect sequence repeats. Genetics 178:661–673
- 44. Singer B. S., Shinedling S. T., Gold L. 1983. The rII genes: a history and a prospectus, p. 327–333 In Mathews C. K., Kutter E. M., Mosig G., Berget P. B. (ed.), Bacteriophage T4. American Society for Microbiology, Washington, DC
- 45. Streisinger G., Owen J. E. 1985. Mechanisms of spontaneous and induced frameshift mutation in bacteriophage T4. Genetics 109:633–659
- 46. Summer E. J., et al. 2007. Rz/Rz1 lysis gene equivalents in phages of gram-negative hosts. J. Mol. Biol. 373:1098–1112
- 47. Symonds N. 1958. The properties of a star mutant of phage T2. J. Gen. Microbiol. 18:330–345
- 48. Tran T. A. T., Struck D. K., Young R. 2005. Periplasmic domains define holin-antiholin interactions in T4 lysis inhibition. J. Bacteriol. 187:6631–6640
- 49. Tran T. A. T., Struck D. K., Young R. 2007. The T4 RI antiholin has an N-terminal signal anchor release domain that targets it for degradation by DegP. J. Bacteriol. 189:7618–7625
- 50. Trzemecka A., Jacewicz A., Carver G. T., Drake J. W., Bebenek A. 2010. Nullifying the mutator activity of a DNA polymerase pocket mutation by a nearby replacement that by itself hardly impacts fidelity. J. Mol. Biol. 404:778–793
- 51. Wang I.-N., Deaton J., Young R. 2003. Sizing the holin lesion with an endolysin–β-galactosidase fusion. J. Bacteriol. 185:779–787
- 52. White R., et al. 2011. Holin triggering in real time. Proc. Natl. Acad. Sci. U. S. A. 108:798–803
- 53. Yankovsky N. K., Krylov V. N. 1975. Genetical and physiological study of mutations of phage T4 suppressing the ligase defect of gene stII mutants. Genetika 11:51–60
- 54. Young R. 2002. Bacteriophage holins: deadly diversity. J. Mol. Microbiol. Biotechnol. 4:21–36