Abstract
Activation-induced deaminase (AID) initiates immunoglobulin somatic hypermutation (SHM). Since in vitro AID was shown to deaminate cytosines on single-stranded DNA or the nontranscribed strand, it remained a puzzle how in vivo AID targets both DNA strands equally. Here we investigate the roles of transcription and DNA sequence in cytosine deamination. Strikingly different results are found with different substrates. Depending on the target sequence, the transcribed DNA strand is targeted as well as or better than the nontranscribed strand. The preferential targeting is not related to the frequency of AID hot spots. Comparison of cytosine deamination by AID and bisulfite shows different targeting patterns suggesting that AID may locally unwind the DNA. We conclude that somatic hypermutation on both DNA strands is the natural outcome of AID action on a transcribed gene; furthermore, the DNA sequence or structure and topology play major roles in targeting AID in vitro and in vivo. On the other hand, the lack of mutations in the first ∼100 nucleotides and beyond about 1 to 2 kb from the promoter of immunoglobulin genes during SHM must be due to special conditions of transcription and chromatin in vivo.
The variable regions of immunoglobulin (Ig) genes encode the antigen binding sites of antibodies for estimated billions of different antigenic determinants. Thousands to millions of different antibody binding sites are created when the hundred or so variable, diversity, and joining genes for Ig heavy and light chains are recombined and diversified by nucleotide deletions and insertions at the V(D)J joints in developing B lymphocytes. In mature B lymphocytes, the rearranged V(D)J sequences are extensively further diversified by somatic hypermutation (SHM) during exposure to antigens that react with the specific Igs comprising the B-cell receptors.
SHM is initiated by the activation-induced cytosine deaminase (AID) (11, 18; reviewed in reference 25). It is likely that in vivo AID deaminates cytosines in DNA since inactivation of the uracil glycosylase Ung greatly increases the proportion of transitions from C to T (16). This is probably due to copying the uracil (product of cytosine deamination) with adenine during replication. Furthermore, expression of AID in uracil glycosylase-deficient Escherichia coli results in mutations (14, 17, 23), most of which are also due to replication of uracil (14). While AID clearly can act on DNA, a role as an editor for the mRNA of a specific endonuclease (1) cannot be ruled out. In any case, whether it acts in vivo as a DNA cytosine deaminase or whether it creates an endonuclease or other factor that promotes SHM, three major questions about targeting of the somatic hypermutation process remain to be explained. All three must be understood in view of the finding that Ig gene transcription is required for SHM (2, 13). First, how does SHM target Ig genes and a few other genes expressed in mutating B lymphocytes without affecting all transcribed genes? There must be SHM-specific cis elements in all SHM target genes. One potential targeting element is a binding site for certain helix-loop-helix proteins present in all Ig enhancers (10). However, since this 6-bp element is abundant in the genome, it remains to be determined whether it is essential for SHM and how it would act as a targeting signal. Second, within the genes altered by SHM, only about 1 to 2 kb downstream of the promoter is targeted. This mutation distribution has been shown to relate to initiation of transcription (13), but how the mutation process is terminated in the middle of a transcribed gene is not understood.
This report addresses a third issue of SHM targeting, the DNA strand specificity of AID. In vitro experiments by several laboratories have shown that AID targets single-stranded DNA but not double-stranded DNA (4, 7, 15). Double-stranded DNA was targeted in vitro when it was transcribed, but then mainly on the nontranscribed strand (3, 5, 6, 23). Also, when AID is expressed in E. coli, it can target endogenous or introduced genes, but again, mainly on the nontranscribed strand (14, 17, 23). In vivo, however, clearly both DNA strands are targeted (24). This poses the question of whether the in vitro experiments can give reliable clues about the in vivo situation.
Our laboratory has shown recently that AID targets double-stranded DNA in vitro on both strands when the DNA is supercoiled (21). In the plasmid substrates we used, the cytosine deaminations were seen mainly in two reportedly negatively supercoiled regions. Based on this finding, we proposed that during SHM in vivo, negative supercoils upstream of the elongating RNA polymerase during transcription may provide access to AID. Here we investigate the role of the DNA sequence and transcription in AID targeting using two plasmid constructs in which either the T7 or the T3 promoter drives transcription of different target sequences. Each plasmid carries two antibiotic resistance markers: one constitutively active and one that requires AID deamination to create a functional ATG translation initiation codon. This independence of the analysis from antibiotic selection allowed us to show that both the transcribed and the nontranscribed DNA strands can be efficiently targeted by AID during transcription. The additional unexpected finding was that the primary sequence greatly influences which DNA strand is targeted by AID and that this difference is independent of the distribution of AID hot spots.
MATERIALS AND METHODS
Plasmid construction.
pKM2 and its precursor pKM1 (21) were modified for the current in vitro transcription study. For pKMT7 (Fig. 1A), the T7 promoter downstream and in the opposite direction of the Kanr gene in pKM2 was deleted using the QuikChange site-directed mutagenesis kit (Stratagene), and a linker containing XhoI, PstI, SpeI, EcoRI, and HindIII sites was inserted into the DraI site 44 bp upstream of hypermutable region 1 (21). The T7 promoter was then added between the SpeI and EcoRI sites, and the direction is shown in Fig. 1A. pKMK1 (Fig. 1B) was derived from pKM1 (described in Fig. 1B of reference 21) in which the original translation start codon for the Ampr gene is maintained and the Kanr start codon was replaced with ACG located in a SnaBI site. Because of the altered initiation codon, E. coli cells transformed with pKMT7 do not grow in ampicillin, but grow in kanamycin, and the reverse is true for pKMK1 transformants.
FIG. 1.
Plasmids used in this study and their cytosine deaminations. (A and B) Plasmids pKMT7 and pKMK1 were used to determine mutations by AID in the Ampr and Kanr genes, respectively. The start of transcription and transcription direction for the T7 and T3 promoters in vitro are indicated by bent arrows. The direction of transcription of the Ampr and Kanr genes in E. coli is indicated by straight arrows. C to F and G to L show cytosine deaminations near the Ampr gene in pKMT7 (left panels) and the Kanr gene in pKMK1 (right panels), respectively. *C* and *E*, AID treatment in Goodman buffer (4). The numbers in parentheses are the number of mutated colonies whose plasmids were sequenced. Brackets indicate the antibiotic in which the E. coli cells transformed with the AID-treated plasmids were selected. Mutations on the top of each line are from the top strand; the ones on the bottom of each line are from the bottom strand. The top strand is the nontranscribed strand in pKMT7 transcribed by T7 polymerase, and that in pKMK1 is transcribed by the T3 polymerase. The top strand is the transcribed strand in pKMK1 when it is transcribed by the T7 polymerase. The ACG initiation codons that need to be changed to ATG for production of the selectable markers Ampr and Kanr are at positions 2715 to 2717 and 1213 to 1215, respectively.
In vitro transcription.
The transcription buffer used was described by Bransteitter et al. (3). A 10-μl reaction mixture contained 1× transcription buffer (50 mM HEPES, pH 7.5, 1 mM dithiothreitol [DTT], 10 mM MgCl2), 40 U of RNase inhibitor (Roche Diagnostics, Mannheim, Germany), 20 U of T3 or T7 RNA polymerase (Roche Diagnostics, Mannheim, Germany), 100 fmol of DNA, 250 μM rNTPs, and 20 μCi of [32P]UTP. Transcription was carried out at 37°C for 60 min. Two units of RNase-free DNase I (Ambion) were added immediately after the reaction, and the mixtures were incubated at 37°C for a further 30 min. The samples were loaded on a 6% polyacrylamide sequencing gel containing 8 M urea. The gel was run at 100 W for 3.5 h and then exposed to X-ray film for 15 to 30 min.
AID treatment in the presence or absence of transcription.
Glutathione S-transferase (GST)-AID protein was obtained as described previously (21). One batch of GST-AID protein was pretreated with RNase A before purification. Another batch was treated with RNase A (1 μg/10 μl) at 37°C for 30 min just before the reaction with AID. To detect AID activity during transcription, AID (450 ng), 20 U of T3 or T7 RNA polymerase (Roche Diagnostics, Mannheim, Germany), 250 μM rNTPs, and 100 fmol of DNA were mixed on ice in 1× transcription buffer (50 mM HEPES, pH 7.5, 1 mM DTT, 10 mM MgCl2). In the case of AID pretreated with RNase A before purification, 40 U of RNase inhibitor (Roche Diagnostics, Mannheim, Germany) was added. The RNA polymerases were left out in AID reactions with supercoiled DNA without transcription. In the case of the pKMT7 plasmid without transcription, no C deaminations were obtained with AID in the transcription buffer. However, this plasmid was efficiently deaminated by AID without transcription (Fig. 1C and E) in the following buffer (4): 10 mM Tris-HCl, pH 8.0, 1 mM EDTA, 1 mM DTT, and 1 μg/10 μl of RNase A. The reaction mixtures were incubated at 37°C for 2 h. The treated plasmids were purified using phenol-chloroform-isoamyl alcohol (25:24:1) before being transformed into the UDG-deficient E. coli strain BW504 (a gift from Ashok S. Bhagwat). AID treated with RNase before purification or directly before the AID assay gave the same results in several assays. The data shown here were all obtained with RNase addition 30 min before the AID reaction.
Primers.
The primers for removing the T7 promoter 3′ of the Kanr gene were 5′-GGTGGAGCTCCAATTCGTACAATTCACTGG-3′ and 5′-CCAGTGAATTGTACGAATTGGAGCTCCACC-3′, and those for replacing the Kanr start codon with a SnaBI site were 5′-CCCCTCGGTACGTATGAACAAGATGG-3′ and 5′-CCATCTTGTTCATACGTACCGAGGGG-3′.
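As a quick consistency check on the primer sequences listed above, the following minimal sketch confirms that each mutagenesis primer pair is an exact reverse complement, as expected for complementary primers used in site-directed mutagenesis, and that the Kanr primer carries a SnaBI recognition site (TACGTA) containing the ACG codon. The function and dictionary names are illustrative and are not part of the original methods.

```python
# Complement table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer pairs as listed in the Primers paragraph (5' to 3').
PRIMER_PAIRS = {
    "T7 promoter removal": (
        "GGTGGAGCTCCAATTCGTACAATTCACTGG",
        "CCAGTGAATTGTACGAATTGGAGCTCCACC",
    ),
    "Kanr start codon to SnaBI": (
        "CCCCTCGGTACGTATGAACAAGATGG",
        "CCATCTTGTTCATACGTACCGAGGGG",
    ),
}

SNABI_SITE = "TACGTA"  # SnaBI recognition sequence


def pairs_are_complementary(pairs: dict) -> dict:
    """Map each pair name to True if the second primer is the reverse complement of the first."""
    return {name: reverse_complement(fwd) == rev for name, (fwd, rev) in pairs.items()}


if __name__ == "__main__":
    for name, ok in pairs_are_complementary(PRIMER_PAIRS).items():
        print(f"{name}: reverse-complementary = {ok}")
    kan_fwd = PRIMER_PAIRS["Kanr start codon to SnaBI"][0]
    print("SnaBI site present:", SNABI_SITE in kan_fwd)
```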
Bisulfite sequencing.
Sodium bisulfite sequencing was performed as described previously (12), except that the treatment with 2.3 M sodium bisulfite was for only 3 h at 37°C. As a negative control, we used the linearized plasmids. The regions to be analyzed were amplified using specific PCR primers. For the pKMT7 vector, either the region from nucleotides (nt) 2045 to 3060 was amplified with forward primer 5′-CGTCGTTTTACAACGTCGTG-3′ and reverse primer 5′-TCATGCCATCCGTAAGATGC-3′, or the region from nt 2067 to 3069 was amplified with converted primers 5′-TGTGGTGGAGTTTTAATTTGTA-3′ and 5′-TCTCTTACTATCATACCATCC-3′. For the pKMK1 plasmid, the region from nt 715 to 1910 was amplified using primers 5′-GAGCGTCGATTTTTGTGATG-3′ and 5′-CCAAGCTCTTCAGCAATATC-3′; the region from nt 1676 to 2525 was amplified using primers 5′-GGATGATCTGGACGAAGAG-3′ and 5′-AACAAGAGTCCACTATTAA-3′. PCR products were electrophoresed on a 1% agarose gel and isolated using the QIAGEN fragment isolation kit (QIAGEN, Valencia, CA). PCR fragments were subcloned into the TOPO-TA PCR4 vector (Invitrogen, Carlsbad, CA) and sequenced using either the M13 forward primer or M13 reverse primer.
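The term "converted primers" can be illustrated with a small in silico sketch. It assumes complete bisulfite conversion, so that every cytosine on the strand of interest reads as thymine after PCR. The 19-nt fragment used here is the start of the T7-promoter-removal primer listed under Primers; that it lies within the amplified pKMT7 region is an inference from the sequence match with the converted forward primer, not a statement from the methods.

```python
def bisulfite_convert(seq: str) -> str:
    """Simulate full bisulfite conversion of one strand: every C is read as T after PCR."""
    return seq.upper().replace("C", "T")


# First 19 nt of the T7-promoter-removal primer (unconverted plasmid sequence).
UNCONVERTED_FRAGMENT = "GGTGGAGCTCCAATTCGTA"

# Converted forward primer used for pKMT7 bisulfite PCR.
CONVERTED_FORWARD_PRIMER = "TGTGGTGGAGTTTTAATTTGTA"

if __name__ == "__main__":
    converted = bisulfite_convert(UNCONVERTED_FRAGMENT)
    print(converted)
    print("Matches converted primer:", converted in CONVERTED_FORWARD_PRIMER)
```

Primers designed this way anneal to templates regardless of which internal cytosines were deaminated only if the primer-binding sites themselves are treated as fully converted, which is the simplifying assumption made in this sketch.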
Sequence analysis.
Plasmid DNAs were sequenced using an automatic sequencing machine (3730XL; Applied Biosystems), and sequence analysis was performed using Sequencher 4.1. (For the sequences, see Fig. S1 to S13 in the supplemental material.) The only mutations seen are C-to-T and G-to-A transitions. T replacing C is seen in the DNA strand that had acquired a uracil. In the UDG-deficient E. coli strain, the U is retained and copied into A, which is copied into T in the next round of replication. An A-for-G replacement is seen where in the opposing strand a C-to-U deamination had occurred. In this fashion, C to T represents U in the strand whose sequence is shown (top strand) and G to A represents C to U in the opposite (bottom) strand.
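The strand assignment rule described above can be sketched as a short function: a C-to-T change in the displayed (top) strand is scored as a deamination on the top strand, and a G-to-A change as a deamination on the bottom strand. This is an illustrative sketch only; it assumes the reference and clone sequences are already aligned and of equal length, and the function name and example sequences are hypothetical.

```python
def assign_deaminations(reference: str, clone: str) -> dict:
    """Classify each difference between aligned top-strand sequences by deaminated strand.

    Positions are 1-based. Any change other than C-to-T or G-to-A is reported
    under "other", since it is not explained by cytosine deamination.
    """
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned and of equal length")
    calls = {"top": [], "bottom": [], "other": []}
    for pos, (ref, obs) in enumerate(zip(reference.upper(), clone.upper()), start=1):
        if ref == obs:
            continue
        if ref == "C" and obs == "T":
            calls["top"].append(pos)      # C deaminated to U on the top strand
        elif ref == "G" and obs == "A":
            calls["bottom"].append(pos)   # C deaminated to U on the bottom strand
        else:
            calls["other"].append(pos)
    return calls


if __name__ == "__main__":
    # Toy example: position 2 is C-to-T, position 5 is G-to-A.
    print(assign_deaminations("ACGTGCA", "ATGTACA"))
```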
RESULTS
In all experiments reported here, supercoiled plasmid DNA was treated in vitro with human AID (the AID plasmid was a gift of M. Goodman [4, 21]) in the presence or absence of simultaneous transcription. The treated plasmid DNAs were purified and used to transform uracil DNA-glycosylase-deficient E. coli (a gift of A. Bhagwat).
An ampicillin resistance gene is targeted by AID on the transcribed DNA strand as well as the nontranscribed DNA strand.
Previously we found that an apparently negatively supercoiled region upstream and just into the coding region of the ampicillin resistance (Ampr) gene is hypermutable by AID in the plasmid pKM2 (21). In order to evaluate whether transcription affects the mutation pattern, we modified pKM2 by placing a T7 promoter upstream of the Ampr gene (and removing a T7 promoter downstream of the Kanr gene) (Fig. 1A, pKMT7). In pKMT7, the Ampr gene can be transcribed from the T7 promoter with T7 RNA polymerase. The plasmid has ACG replacing the translation initiation codon (ATG) of Ampr (Fig. 1A). E. coli cells transformed with the pKMT7 plasmid do not grow in ampicillin (not shown). The supercoiled plasmid pKMT7 was treated with AID in vitro in transcription buffer without RNA polymerase and tested for mutations by transformation of uracil DNA-glycosylase-deficient E. coli cells, which were selected in ampicillin. (The promoter driving Ampr expression when the treated plasmids are expressed in E. coli is the original plasmid promoter). In a large sample, only two colonies were found and they had recombined the plasmid, somehow creating an initiation codon for the Ampr gene without cytosine deamination (not shown) (Fig. 1C). However, when the Ampr gene in pKMT7 was transcribed by T7 polymerase and the transformed bacteria were selected in ampicillin, many colonies were seen (Fig. 1D) and all, except one, had reverted the ACG to an ATG initiation codon (for the sequences see Fig. S2 in the supplemental material). One sequence had an ACG-to-ACA change. Presumably this threonine codon or another mutation upstream allowed initiation of translation in E. coli. Besides the initiation codon change, many other mutations had occurred in these sequences (Table 1, D; see Fig. S2 in the supplemental material). Most of the mutations were on the coding strand (Fig. 1D and Table 1, D). 
The strand bias was likely due to the ampicillin selection of the ATG codon created from ACG by AID on the coding strand. The Ampr coding strand is the nontranscribed strand with respect to the T7 promoter. Thus, transcription greatly increases the access of AID to the DNA and in this particular target sequence appears to be required for cytosine deamination.
TABLE 1.
Cytosine deaminations of in vitro AID-treated ± RNA polymerase-transcribed plasmids
| Plasmid (no. of hot spots coding/template strand)^a | Expt^b | Transcribed | RNA polymerase | Selection^c | No. of mutations in coding/template strand^d | Total bp sequenced in mutated colonies^e | No. of mutations/10^3 nt^f |
|---|---|---|---|---|---|---|---|
| pKMT7 (47/50) | *C* | − | − | Amp | 142/13 | 19,680 | 7.9 |
| | C | − | − | Amp | 0 | 0 | |
| | D | + | T7 | Amp | 232/10 | 13,120 | 18.4 |
| | *E* | − | − | Kan | 25/31 | 11,480 | 4.9 |
| | F | + | T7 | Kan | 30/78 | 12,300 | 8.8 |
| pKMK1 (57/63 or 103/114) | G | − | − | Kan | 56/0 | 15,580 | 3.6^g |
| | H | + | T7 | Kan | 14/0 | 18,150 | 0.8^g |
| | I | + | T3 | Kan | 36/0 | 14,760 | 2.4^g |
| | J | − | − | Amp | 91/15 | 13,200 | 8.0 |
| | K | + | T7 | Amp | 363/47 | 21,450 | 19.1 |
| | L | + | T3 | Amp | 82/10 | 9,900 | 9.3 |

^a Number of C’s in WRC/GYW hot spots in the coding/template strands in 820 bp of pKMT7 (5.7/6.1% of total bp) and 820 bp (7.0/7.7%) or 1,650 bp (6.2/6.9%) of pKMK1, respectively.
^b DNAs were purified after in vitro treatments and transformed into UDG-minus E. coli. For details, see Fig. 1C to 1L. *C* and *E* were treated with AID in the Goodman buffer (see Materials and Methods).
^c The antibiotic that selects for the ACG-to-ATG mutation is underlined.
^d Mutations in the Ampr gene in pKMT7 and in the Kanr gene in pKMK1. Mutations in the transcribed strand are underlined.
^e DNA was sequenced from each of 6 to 24 bacterial colonies (see Fig. 1).
^f Mutations in mutated colonies. Underlined results were obtained with antibiotic that selects for expression of the sequenced target gene (see footnote c). nt, nucleotides.
^g More mutations inactivate the target gene?
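The last column of Table 1 follows directly from the two preceding columns: the sum of coding- and template-strand mutations divided by the total nucleotides sequenced, scaled to 10^3 nt. The sketch below recomputes these values from the counts in the table. The dictionary layout and function names are illustrative; the unstarred experiment C is omitted because no mutated colonies were sequenced.

```python
# Experiment label -> (coding-strand mutations, template-strand mutations, total nt sequenced),
# copied from Table 1. Starred labels denote AID treatment in the Goodman buffer.
TABLE1 = {
    "*C*": (142, 13, 19680),
    "D": (232, 10, 13120),
    "*E*": (25, 31, 11480),
    "F": (30, 78, 12300),
    "G": (56, 0, 15580),
    "H": (14, 0, 18150),
    "I": (36, 0, 14760),
    "J": (91, 15, 13200),
    "K": (363, 47, 21450),
    "L": (82, 10, 9900),
}


def mutations_per_1000_nt(coding: int, template: int, total_nt: int) -> float:
    """Mutation frequency per 10^3 nt, rounded to one decimal as in Table 1."""
    return round(1000 * (coding + template) / total_nt, 1)


def coding_fraction(coding: int, template: int) -> float:
    """Fraction of all mutations that fall on the coding strand."""
    return coding / (coding + template)


FREQUENCIES = {expt: mutations_per_1000_nt(*counts) for expt, counts in TABLE1.items()}

if __name__ == "__main__":
    for expt, (coding, template, _) in TABLE1.items():
        print(f"{expt}: {FREQUENCIES[expt]} per 10^3 nt, "
              f"coding-strand fraction {coding_fraction(coding, template):.2f}")
```

The coding-strand fraction is included only as a convenient summary of strand bias; it is not a statistic reported in the table.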
When bacteria transformed with the AID-treated T7 polymerase-transcribed pKMT7 plasmid were selected in kanamycin, both DNA strands showed similar frequencies of cytosine deaminations, with preference for mutations in the transcribed strand (Fig. 1F and Table 1, F; see Fig. S4 in the supplemental material). Selection in kanamycin reveals the real pattern of mutations in the Ampr gene, because mutations in the Ampr gene are not selected for or against in this situation. Thus, contrary to previous findings (3, 5, 6, 17, 23), the nontranscribed strand is not always the preferred target for AID.
To test whether cytosine deaminations could be obtained in the original Goodman buffer (4) used for the analysis of the Ampr gene in a different plasmid, pKM2 (21), the experiment was repeated with that buffer. Indeed, Ampr colonies were recovered without transcription, and their Ampr gene showed mutations of the ACG codon to ATG and multiple additional C-to-T mutations and a few G-to-A mutations due to cytosine deaminations on the coding or template strand, respectively (Fig. 1*C*; Table 1, *C*; and see Fig. S1 in the supplemental material). Likewise, Kanr colonies showed mutations in the Ampr gene (Fig. 1*E* and Table 1, *E*; and see Fig. S3 in the supplemental material). Again, selection in ampicillin showed mutations mainly on the nontranscribed (i.e., coding) strand, while growth of the E. coli cells in kanamycin revealed mutations on both strands. We do not know why the transcription buffer prevents mutations of the supercoiled plasmid. Perhaps the conformation of the sequences is altered in this buffer, preventing AID from recognizing cytosines. In any case, transcription overcomes the problem.
A kanamycin resistance gene is targeted by AID almost exclusively on the coding strand, regardless of transcription and the direction of the transcripts.
In order to determine the influence of DNA sequence, we investigated cytosine deaminations in the Kanr gene whose initiator ATG had been changed to ACG in the plasmid pKMK1 (Fig. 1B). E. coli cells transformed with AID-treated pKMK1 plasmids were recovered either in kanamycin, requiring the expression of Kanr for growth, or in ampicillin, allowing the recovery of mutations in Kanr without selective pressure. In order to further test the role of transcription in AID activity, the pKMK1 test plasmid contains two promoters for either the T3 polymerase (transcription in the natural direction) or the T7 polymerase (transcription in the opposite direction) (Fig. 1B).
We find that the Kanr gene is targeted by AID mainly on the coding (top) strand under all conditions tested (Fig. 1G to L and Table 1, G to L; and see Fig. S5 to S10 in the supplemental material). As expected, selection in kanamycin enriches for sequences that have mutations on the Kanr coding strand, including a change of the ACG-to-ATG initiator codon (Fig. 1G to I). This is true for supercoiled DNA without transcription (Fig. 1G and Table 1, G) (for this sequence transcription buffer allows cytosine deaminations without transcription) and when transcribed from the T3 promoter upstream of the Kanr gene (Fig. 1I and Table 1, I). Similar proportions of cytosine deaminations are also seen with kanamycin selection when Kanr antisense is transcribed in vitro from a T7 promoter downstream of the gene where the mutations occurred on the strand transcribed by the T7 polymerase (Fig. 1H and Table 1, H). (The promoter driving Kanr expression when the treated plasmids are expressed in E. coli is the original Lac promoter.) Unexpectedly, when the bacteria were grown in ampicillin, a very high proportion of mutations were revealed on the strand transcribed by the T7 polymerase (the Kanr coding strand), compared with the nontranscribed strand (Fig. 1K and Table 1, K). These findings suggest that AID can readily access the transcribed strand. It appears that the preferred targeting by AID of the Kanr coding strand under all conditions may be a property of the primary sequence or structure of the Kanr gene since selection in ampicillin of sequences treated with AID during transcription from the T3 promoter also shows they have most cytosine deaminations on the coding strand, which in this case is the nontranscribed strand (Fig. 1L and Table 1, L).
To determine whether an unusual secondary structure of the DNA existed in this region, nucleotides 2100 to 2400 (where most of the mutations due to AID were observed in pKMK1) were analyzed for stem-loop structures, using the IDT software. Three potential stem-loops were observed, but the loops in these structures had both C's and G's, and 6 of the 10 AID hot spots were in loops of the bottom strand, the strand that was barely mutated by AID (not shown). Thus, these structures are unlikely to be the reason for deamination of C's mainly on the top strand.
Interestingly, the distribution of the nucleotides targeted by AID is different in the transcribed DNA as compared with the targeting without transcription (Fig. 1J versus K and L). Active transcription was proven by efficient incorporation of [32P]UTP into plasmid RNAs when the plasmids were transcribed either by the T3 or T7 RNA polymerase (not shown). Thus, transcription alters the targeting.
AID and bisulfite have different cytosine deamination preferences.
These data show that which DNA strand is targeted by AID strongly depends on the primary sequence (and presumably structure) of the target gene. In order to gain insight into the sequences that are preferred by AID, we compared AID-mediated deamination in the supercoiled plasmid DNAs with bisulfite-mediated deaminations. Bisulfite deaminates almost all cytosines in single-stranded DNA and very few in linear double-stranded DNA (20, 26; data not shown). We find that in supercoiled DNA bisulfite can deaminate cytosines (Fig. 2). Thus, unpaired cytosines are apparently present at least transiently in the supercoiled plasmids. Interestingly, the targeting by AID and bisulfite is different. With pKMT7, the highest deamination efficiency for bisulfite is around nt ∼2250 to 2380, a region in which very little AID activity is seen (Fig. 2A and B). With pKMK1, the extreme preference that AID displays for the coding (top) strand and clustering of deaminations between nt ∼801 and 1201, ∼1550 and 1650, and ∼2100 and 2450 is not seen with bisulfite (Fig. 2D, E, and F). There are also fewer deaminations with bisulfite than with AID (2.5 × 10^-3/nt and 4.3 × 10^-3/nt with bisulfite versus 4.9 × 10^-3/nt and 8 × 10^-3/nt with AID; Fig. 2 and Table 1). Interestingly, AID prefers WRC/GYW hot spots: 33% and 29% of mutations with AID are in hot spots in pKMT7 and pKMK1, respectively. The sequenced regions of pKMT7 and pKMK1 have only 12% and 11% of C and G in hot spots, respectively. Bisulfite does not greatly prefer these hot spots: 14% and 20% of deaminations by bisulfite are in AID hot spots in pKMT7 and pKMK1, respectively. Apparently, AID sees cytosines in supercoiled DNA differently from bisulfite.
FIG. 2.
Comparison of cytosine accessibility to AID and sodium bisulfite. Supercoiled plasmids were treated with AID or bisulfite (see Materials and Methods). The regions of the plasmids analyzed are indicated. The bars above (below) the lines indicate C-to-T conversion on the top (bottom) strand of the plasmids. (A) Conversion of cytosines by AID in supercoiled, untranscribed pKMT7 selected in kanamycin (enlarged version of Fig. 1*E*). (B) Conversion of cytosines by sodium bisulfite in pKMT7; 17 clones were analyzed, and a total of 35 mutations were obtained (2.5 × 10^-3 mutations/bp). Of these mutations, 5 (14%) were in AID hot spots. (C) AID WRC/GYW hot spots in pKMT7. (D) Conversion of cytosines by AID in supercoiled, untranscribed pKMK1 selected in ampicillin (enlarged version of Fig. 1J). (E and F) Conversion of cytosines by sodium bisulfite in pKMK1. Twenty-six and 29 clones for the 5′ and 3′ pKMK1 were analyzed, and a total of 169 mutations were obtained (4.3 × 10^-3 mutations/bp). Of these, 34 (20%) were in AID hot spots. (G) AID hot spots in pKMK1.
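The WRC/GYW hot spot definition used in this comparison (W = A or T, R = A or G, Y = C or T) can be made concrete with a short scan of a top-strand sequence. In this sketch, a C is counted as a top-strand hot spot when it is the last base of WRC, and a G is counted as a bottom-strand hot spot when it is the first base of GYW, which corresponds to WRC on the complementary strand. The function names and the toy sequence are illustrative assumptions; the plasmid sequences themselves are not reproduced here.

```python
W = set("AT")  # weak: A or T
R = set("AG")  # purine: A or G
Y = set("CT")  # pyrimidine: C or T


def hotspot_positions(top_strand: str) -> tuple:
    """Return 1-based positions of hot spot C's (top strand) and G's (bottom-strand C's)."""
    seq = top_strand.upper()
    top_hits, bottom_hits = [], []
    for i, base in enumerate(seq):
        # WRC on the top strand: the C is preceded by R, which is preceded by W.
        if base == "C" and i >= 2 and seq[i - 2] in W and seq[i - 1] in R:
            top_hits.append(i + 1)
        # GYW on the top strand (WRC on the bottom strand): G followed by Y, then W.
        if base == "G" and i + 2 < len(seq) and seq[i + 1] in Y and seq[i + 2] in W:
            bottom_hits.append(i + 1)
    return top_hits, bottom_hits


def percent_in_hotspots(in_hotspots: int, total: int) -> int:
    """Percentage of deaminations that fall in hot spots, rounded to a whole number."""
    return round(100 * in_hotspots / total)


if __name__ == "__main__":
    # Toy sequence with two overlapping AGCT-type motifs.
    print(hotspot_positions("AGCTGCATT"))
    # Bisulfite counts reported in the Fig. 2 legend.
    print(percent_in_hotspots(5, 35), percent_in_hotspots(34, 169))
```

Applied to the bisulfite counts in the figure legend, the percentage helper reproduces the reported 14% and 20%.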
DISCUSSION
In the experiments presented here, transcription increases the frequency of mutations by AID in vitro. For the Ampr gene in pKMT7, transcription increases C deaminations strongly, by more than 18.4-fold (Table 1, mutations in D versus C). Likewise, with pKMK1, transcription increases mutation by AID (Table 1, L and K versus J). (This is only seen with pKMK1 when bacterial colonies are not selected in kanamycin; see below). Also, the pattern of targeting is influenced by transcription. This is especially obvious when there is no selection of the transcribed gene (Fig. 1J versus K or L).
We had postulated that in vivo negatively supercoiled DNA that arises in the wake of transcript elongation by RNA polymerase may allow AID to access both DNA strands (21). In vitro, given that in several cases mutations are seen upstream of the promoter, targeting of negative supercoils arising by transcription is indeed a possibility. While there is no barrier to the elimination of negative supercoils arising behind the RNA polymerase by encountering positive supercoils arising in front of the polymerase in the unconstrained plasmids, such supercoils will temporarily pile up and may have a sufficiently long half-life to allow access to AID (8). One would of course expect that the nontranscribed single strand in the transcription bubble could be accessible to AID (3, 5, 6, 23). However, as shown here, the transcribed strand is accessible to AID as well (Fig. 1). In fact, for pKMK1 transcribed by the T7 polymerase, the transcribed strand is the greatly preferred target (Fig. 1K and Table 1, K). It seems unlikely that the transcribed strand is accessible in the portion of the transcription bubble where nascent RNA is annealed to the DNA (about 9 nt) since the RNA/DNA hybrid presumably excludes AID. Thus, it is likely that negative supercoiling arising upstream of the transcribing polymerase creates AID-accessible “flipped-out” cytosines.
By comparing the experiments reported here to those in previous reports, the conclusion from previous experiments that the nontranscribed strand is preferentially targeted by AID apparently cannot be generalized. As it turns out, the strong preference for deaminated C's on the nontranscribed strand when that strand was the coding strand of the Kanr gene (17, 23) was apparently mainly due to the particular structure of the kanamycin resistance gene. In the published experiments, the bacteria were selected for expression of the Kanr gene. Only those bacterial colonies would have been selected that reverted the antibiotic resistance gene to activity. Particular C deaminations that did not restore function or that caused changes detrimental to the function of the Kanr gene would not have been noticed. That the Kanr gene is rather susceptible to inactivation by AID is shown by comparing the observed mutations in the pKMK1 gene when the bacteria are grown in kanamycin and selected for Kanr gene function (Table 1, G, H, and I), with mutations seen without selection (Table 1, J, K, and L). In all cases, fewer mutations were seen in the Kanr gene when pKMK1 was selected in kanamycin. Presumably, AID-induced mutations in many cases inactivated the Kanr gene, so that colonies with a high mutation load would not have arisen in kanamycin. (Such an effect was not seen for the Ampr gene in pKMT7. Here the ampicillin-selected colonies had a higher mutation frequency than the kanamycin-selected colonies. In pKMT7, the sequences targeted are mainly upstream of the Ampr translation initiation. This region appears to be more tolerant of mutations than the coding sequences of the Kanr gene. Apparently, the chance for creating the initiator ATG by AID-mediated cytosine deamination is not very high in pKMT7; thus, selection for ampicillin resistance may select for more highly mutated plasmids.)
The pattern of targeting by AID is different in vitro from the observed somatic hypermutation pattern in vivo. In vivo, there are essentially no mutations upstream of the promoter, the first 100 to 200 nucleotides are spared from mutations, and the mutations extend up to 1 to 2 kb from the promoter. The molecular basis for the rules of this mutation distribution is not known. In vitro, none of these rules appear to apply (Fig. 1). In several cases, mutations are seen upstream of the promoter; in no case is there a consistent gap of mutations immediately 3′ of the promoter, and the majority of mutations are within less than 1 kb downstream. It is likely that in vivo, chromatin, especially the positioning and modifications of nucleosomes, and the factors that interact with the RNA polymerase, influence the SHM distribution in a major way.
However, two parameters that do play a role in the in vitro experiments are most likely also important for SHM in vivo. First, clearly there is a strong influence on AID targeting of the primary sequence of the DNA, presumably caused by its micro-differences, potentially involving base stacking. The AID targeting differences between the Ampr and Kanr genes and between the top and bottom strands of these genes are not due to different distributions of AID hot spots. In fact, the hot spots are in very similar frequencies in both regions and in their respective top and bottom strands (Table 1). It would be interesting to know why the Kanr gene is so highly mutable on the coding strand. Presumably, in vivo such an extreme bias would be modified by nucleosomes distorting the naked DNA structure. Thus, perhaps in vivo, short stretches of preference for one strand also exist, but overall both strands are more or less equally targeted. Interestingly, once mutations have arisen in a sequence, its structure as perceived by AID also may be altered and thus the patterning may change in the course of the somatic mutation process.
The second parameter that likely also applies in vivo is the topology of the DNA. Negatively supercoiled DNA arising during transcription may indeed play a role in AID targeting in vivo. In addition, in nucleosomes the DNA is overstretched in places (9) and overall more stretched than naked DNA (19). It remains an open question how much of the requirement for transcription in somatic hypermutation is due to such topology effects. Since not all genes transcribed in a cell that undergoes somatic hypermutation are mutated, one component of the transcription effect is likely due to the specific attraction of AID to immunoglobulin and certain other genes either directly or indirectly by specific cis-elements and the transcription machinery (10). The other components of the role of transcription likely are changes of the DNA topology which alter the accessibility of AID. The differences in cytosine deamination patterns between bisulfite and AID suggest different interactions with the DNA target. Unfortunately, the comparison between AID and bisulfite cannot be done under identical conditions, since AID requires relatively low salt and bisulfite needs to be at a high concentration. Despite this caveat, since in the experiments both treatments were carried out at 37°C and supercoiled DNA was recovered and assayed after either treatment, the supercoiling along the double-stranded DNA circle was presumably the same or similar. Relaxed double-stranded DNA is a poor substrate for both bisulfite (20, 26; data not shown) and AID (4, 6, 7, 17, 21). In contrast, as shown in Fig. 2, supercoiled DNA can be accessed by both. Presumably, the cytosines that can be deaminated by either AID or bisulfite are at least transiently flipped out. However, bisulfite deaminates fewer cytosines, most of which are different from the ones AID deaminates. It is possible that AID has a helix-unwinding activity, similar to single-strand binding protein (SSB) of E. coli, which denatures DNA at 37°C without ATP (22; suggestion by Martin Gellert). Perhaps AID can gain access to flipped-out cytosines which exist in supercoiled, but not relaxed, DNA and then unwind the DNA locally for a certain distance. This scenario is compatible with the more clustered cytosine deaminations seen with AID compared with bisulfite. A possible DNA-unwinding as well as processive activity of AID (15), its interaction with the single-strand binding protein RPA (5), and the role of nucleosomes in vivo are open questions that require further investigation.
Acknowledgments
We thank M. Gellert for discussions and the suggestion that AID may have a helix-unwinding activity. We are grateful to M. Goodman for the AID plasmid; A. Bhagwat for UDG-minus E. coli; W. Buikema of the DNA Sequencing facility of the Cancer Research Center of the University of Chicago for advice; M. Lieber for technical suggestions concerning bisulfite sequencing; and T. E. Martin, S. Longerich, and P. Engler for critical reading of the manuscript.
S.R. is supported by a postdoctoral fellowship from the Cancer Research Institute. This research was supported by NIH grants AI47380 and AI053130.
Footnotes
Supplemental material for this article may be found at http://mcb.asm.org/.
REFERENCES
- 1. Begum, N. A., K. Kinoshita, M. Muramatsu, H. Nagaoka, R. Shinkura, and T. Honjo. 2004. De novo protein synthesis is required for activation induced cytidine deaminase-dependent DNA cleavage in immunoglobulin class switch recombination. Proc. Natl. Acad. Sci. USA 101:13003-13007.
- 2. Betz, A., C. Milstein, R. Gonzalez-Fernandes, R. Pannell, T. Larson, and M. Neuberger. 1994. Elements regulating somatic hypermutation of an immunoglobulin κ gene: critical role for the intron enhancer/matrix attachment region. Cell 77:239-248.
- 3. Bransteitter, R., P. Pham, P. Calabrese, and M. F. Goodman. 2004. Biochemical analysis of hypermutational targeting by wild type and mutant activation-induced cytidine deaminase. J. Biol. Chem. 279:51612-51621.
- 4. Bransteitter, R., P. Pham, M. Scharff, and M. Goodman. 2003. Activation-induced cytidine deaminase deaminates deoxycytidine on single-stranded DNA but requires the action of RNase. Proc. Natl. Acad. Sci. USA 100:4102-4107.
- 5. Chaudhuri, J., C. Khuong, and F. W. Alt. 2004. Replication protein A interacts with AID to promote deamination of somatic hypermutation targets. Nature 430:992-998.
- 6. Chaudhuri, J., M. Tian, C. Khuong, K. Chua, E. Pinaud, and F. Alt. 2003. Transcription-targeted DNA deamination by the AID antibody diversification enzyme. Nature 421:726-730.
- 7. Dickerson, S., E. Market, E. Besmer, and F. N. Papavasiliou. 2003. AID mediates hypermutation by deaminating single stranded DNA. J. Exp. Med. 197:1291-1296.
- 8. Kouzine, F., J. Liu, S. Sanford, C. Hye-Jung, and D. Levens. 2004. The dynamic response of upstream DNA to transcription-generated torsional stress. Nat. Struct. Mol. Biol. 11:1092-1100.
- 9. Luger, K., A. Maeder, R. Richmond, D. Sargent, and T. Richmond. 1997. Crystal structure of the nucleosome core particle at 2.8 Å resolution. Nature 389:251-260.
- 10. Michael, N., H. Shen, S. Longerich, N. Kim, A. Longacre, and U. Storb. 2003. The E box motif CAGGTG enhances somatic hypermutation without enhancing transcription. Immunity 19:235-242.
- 11. Muramatsu, M., K. Kinoshita, S. Fagarasan, S. Yamada, Y. Shinkai, and T. Honjo. 2000. Class switch recombination and somatic hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell 102:553-563.
- 12. Padjen, K., S. Ratnam, and U. Storb. 2005. DNA methylation precedes chromatin modifications under the influence of the strain-specific modifier, Ssm1. Mol. Cell. Biol. 25:4782-4791.
- 13. Peters, A., and U. Storb. 1996. Somatic hypermutation of immunoglobulin genes is linked to transcription initiation. Immunity 4:57-65.
- 14. Petersen-Mahrt, S. K., R. S. Harris, and M. S. Neuberger. 2002. AID mutates E. coli suggesting a DNA deamination mechanism for antibody diversification. Nature 418:99-103.
- 15. Pham, P., R. Bransteitter, J. Petruska, and M. Goodman. 2003. Processive AID-catalysed cytosine deamination on single-stranded DNA simulates somatic hypermutation. Nature 424:103-107.
- 16. Rada, C., G. Williams, H. Nilsen, D. Barnes, T. Lindahl, and M. Neuberger. 2002. Immunoglobulin isotype switching is inhibited and somatic hypermutation perturbed in UNG-deficient mice. Curr. Biol. 12:1748-1755.
- 17. Ramiro, A., P. Stavropoulos, M. Jankovic, and M. Nussenzweig. 2003. Transcription enhances AID-mediated cytidine deamination by exposing single-stranded DNA on nontemplate strand. Nat. Immunol. 4:452-456.
- 18. Revy, P., T. Muto, Y. Levy, F. Geissman, A. Plebani, O. Sanal, N. Catalan, M. Forveille, R. Dufourcq-Lagelouse, A. Gennery, I. Tezcan, F. Ersoy, H. Kayserili, A. Ugazio, N. Brousse, M. Muramatsu, L. Notarangelo, K. Kinoshita, T. Honjo, A. Fischer, and A. Durandy. 2000. Activation-induced cytidine deaminase (AID) deficiency causes the autosomal recessive form of hyper-IgM syndrome (HIGM2). Cell 102:565-575.
- 19. Richmond, T., and C. Davey. 2003. The structure of DNA in the nucleosome core. Nature 423:145-150.
- 20. Shapiro, R., B. Braverman, B. J. Louis, and E. R. Servis. 1973. Nucleic acid reactivity and conformation. II. Reaction of cytosine and uracil with sodium bisulfite. J. Biol. Chem. 248:4060-4064.
- 21. Shen, H., and U. Storb. 2004. Activation-induced cytidine deaminase (AID) can target both DNA strands when the DNA is supercoiled. Proc. Natl. Acad. Sci. USA 101:12997-13002.
- 22. Sigal, N., H. Delius, T. Kornberg, M. Gefter, and B. Alberts. 1972. A DNA-unwinding protein isolated from Escherichia coli: its interaction with DNA and with DNA polymerases. Proc. Natl. Acad. Sci. USA 69:3537-3541.
- 23. Sohail, A., J. Klapacz, M. Samaranayake, A. Ullah, and A. Bhagwat. 2003. Human activation-induced cytidine deaminase causes transcription-dependent, strand-biased C to U deaminations. Nucleic Acids Res. 31:2990-2994.
- 24. Storb, U., A. Peters, N. Kim, H. M. Shen, G. Bozek, N. Michael, J. Hackett, E. Klotz, L. Loeb, and T. Martin. 1999. Molecular aspects of somatic hypermutation of Ig genes. Cold Spring Harbor Lab. Symp. Quant. Biol. 64:227-234.
- 25. Storb, U., and J. Stavnezer. 2002. Immunoglobulin genes: generating diversity with AID and UNG. Curr. Biol. 12:R725-R727.
- 26. Yu, K., F. Chedin, C. Hsieh, T. Wilson, and M. Lieber. 2003. R-loops at immunoglobulin class switch regions in the chromosomes of stimulated B cells. Nat. Immunol. 4:442-451.