Author manuscript; available in PMC: 2012 Sep 15.
Published in final edited form as: J Immunol. 2011 Aug 15;187(6):3247–3255. doi: 10.4049/jimmunol.1101568

Separation of mutational and transcriptional enhancers in immunoglobulin genes

Naga Rama Kothapalli, Kaitlin M Collura, Darrell D Norton, Sebastian D Fugmann
PMCID: PMC3169751  NIHMSID: NIHMS312003  PMID: 21844395

Abstract

Secondary immunoglobulin (Ig) gene diversification relies on activation-induced cytidine deaminase (AID) to create U:G mismatches that are subsequently fixed by mutagenic repair pathways. AID activity is focused to Ig loci by cis-regulatory DNA sequences named targeting elements. Here we show that in contrast to prevailing thought in the field, the targeting elements in the chicken IGL locus are distinct from classical transcriptional enhancers. These mutational enhancer elements (MEEs) are required over and above transcription to recruit AID-mediated mutagenesis to Ig loci. We identified a small 222 bp fragment in the chicken IGL locus that enhances mutagenesis without boosting transcription, and this sequence represents a key component of a MEE. Lastly, MEEs are evolutionarily conserved amongst birds, both in sequence and function, and contain several highly conserved sequence modules that are likely involved in recruiting trans-acting targeting factors. We propose that MEEs represent a novel class of cis-regulatory elements whose function is to control genomic integrity.

Introduction

Activation-induced cytidine deaminase (AID) is a DNA mutator enzyme that initiates the secondary antibody gene diversification processes of somatic hypermutation (SHM), immunoglobulin gene conversion (GCV), and class switch recombination (CSR) (1–3). AID converts cytosines to uracils in the context of single-stranded DNA that is generated during transcription, and the resulting U:G mismatches are fixed by direct replication or repaired by error-prone DNA repair pathways (4, 5). SHM and GCV are closely related, and alter the nucleotide sequence in the VJ (or VDJ) exon of immunoglobulin (Ig) genes, thus modifying antibody specificity. In contrast, CSR leaves the antigen binding sites unaltered, and swaps the constant region from the Cμ isotype to Cγ, Cε, or Cα, altering the effector function of the encoded antibody.

While these processes are restricted to Ig genes, and in the case of CSR specifically to the immunoglobulin heavy chain (IGH) locus, low levels of AID activity have also been reported for a much larger range of genes (6–8). Such “mis-targeting” can lead to point mutations (e.g. in BCL6) or translocations (C-MYC/IGH) and likely represents a key event in the formation of B cell lymphomas characterized by such genetic alterations (9–12). The very robust targeting of AID-mediated sequence diversification processes to Ig loci has been the focus of intense investigation over almost two decades and is thought to be mediated by cis-regulatory DNA sequences, in particular transcription enhancers (13, 14). We recently identified the first such targeting element in the Ig light chain (IGL) genes of chicken DT40 B cells (15). This cis-regulatory sequence resides within a 4 kb region downstream of a transcriptional enhancer element, and this region is now referred to as the chicken IGL 3′-regulatory region (3′RR). Two independent research groups confirmed our findings and recently reported large non-coding DNA elements with targeting activity which overlap with our 3′RR and are referred to as Diversification Activator (DIVAC) (16) or “Region A” (17), respectively. However, several outstanding questions remained: (1) the identity of the minimal DNA sequence critical for targeting, (2) the trans-acting factors binding to this sequence and their mode of action, and (3) whether transcriptional enhancers and mutational enhancers are unified in the very same DNA elements.

Here we provide the first definitive evidence that distinct mutational enhancer elements (MEEs) are required over and above transcriptional enhancer activity to recruit AID-mediated sequence diversification to the chicken IGL locus. We show that two independent and redundant MEEs exist in this locus. In addition, we defined the location of one of them to within 1.5 kb, and have experimental evidence that a critical part of the MEE resides within a small 222 bp region. Lastly, we demonstrate that the targeting activity is evolutionarily conserved, as an orthologous fragment from zebra finch (Taeniopygia guttata) is functional when placed into the IGL locus of DT40 cells. Sequence alignments suggest that small modules highly conserved during evolution provide the platform on which trans-acting targeting factors assemble to mediate SHM/GCV targeting.

Materials and Methods

Oligonucleotides

The sequences of all oligonucleotides used in this study are listed in Table I.

Table I.

Oligonucleotides

Luciferase constructs (see Figure 1)
IgL promoter
PLF gactctcgagtgggaaatactggtgatagg
PLRH gactaagcttggcggaatcccagcagctgtgtgtc
Enh (467bp enhancer)
CHEFB1 gactagatctcagctggggccacacaaagag
CHERB1 gactggatccctggaagcaggcaggagtcgtg
Fragment 1
KR1F gtcaggatcccccggcaagtggcggctgct
KR1R gtcaagatctacccacagctggccgtggcatc
Fragment 2
KR2F gtcaggatcctccgccacaaccgctccgca
KR2R gtcaagatctagcgtcctgctggacagcaggc
Fragment 3
KR3F gtcaggatccgcccactctcattgcggtgct
KR3R gtcaagatctaaatgtacgcagcccaggag
Fragment 4
KR4F gtcaggatcccacggcacaaaggtgtttat
KR4R gtcaagatctatgaggatccgctttgctaatgagcagaag
Fragment 5
KR5F gtcaggatccgtgccctggtgctctgcaat
KR5R gtcaagatcttccagtctgcagcgtgtgcat
Fragment 6
KR6F gtcaggatcctgtggtcactgctgggctct
KR6R gtcaagatctaagcagagccaggagcagga
Fragment 7
KR7F gtcaggatcctggctgcggtcagcacatct
KR7R gtcaagatctactgtgggcagcaggctgaa
Fragment A
KR7.2F gactggatccgagacacggatggagcagtgtg
KR7R gtcaagatctactgtgggcagcaggctgaa
Fragment B
KR4F gtcaggatcccacggcacaaaggtgtttat
KR7R gtcaagatctactgtgggcagcaggctgaa
Fragment C
KR7.1F gactggatccgctccgaggccacaagccct
KR7R gtcaagatctactgtgggcagcaggctgaa
Fragment D
KR7.1F gactggatccgctccgaggccacaagccct
KR7.2R gactagatctcacactgctccatccgtgtctc
Fragment E
KR7F gtcaggatcctggctgcggtcagcacatct
KR3R gtcaagatctaaatgtacgcagcccaggag
Fragment F
KR7F gtcaggatcctggctgcggtcagcacatct
KR7.1R gactagatctagggcttgtggcctcggagc
ZFTE fragment (from zebra finch IgL locus)
ZFTEF ggatccaaactaatcagtccctgcctgc
ZFTER1 ggatcctgtggctctgctgggaatg
Targeting vectors for deletion analysis
ΔME6.1 and ME6.1S (downstream homology arm)
Eraf53B gactggatcccagcaattcacagaaacattg
Erar21X gactctcgagcagggctgcaataaaggtgag
ΔME6.2 (downstream homology arm)
hypR3A aattggatcccagggagctcacctttatt
hypF1A gatactcgagttgtatttcccatcctggtg
Δ1.5K (upstream homology arm)
C9N gactgcggccgctcaacagatcagcactggagac
D3KR gactggatccgcgtggtgggagcgggcagg
Δ1.5K (downstream homology arm)
ERAF54B gactggatcccaggctctggtcccatctcactg
Erar21X gactctcgagcagggctgcaataaaggtgag
Southern and Northern Blot probes
CL probe (used for Northern Blots to determine IGL transcript levels and Δ1.5K Southern blots)
CCLF1 cccaccgtcaaaggaggagctg
CHVLR1 cagtagatctttagcactcggacctcttcagg
GAPDH probe
CHGAPDHF accagggctgccgtcctctc
CHGAPDHR ttctccatggtggtgaagac
ER probe (used for Southern Blot genotyping of all clones except for Δ1.5K)
erar4 agcacagaacaggcacgtgct
QG11R gacgttgatgtggacgatgtg

Plasmids

To generate the enhancer-less pIgLP luciferase reporter construct, a 425 bp IgL promoter fragment was amplified from DT40 genomic DNA using the oligonucleotide pair PLFX/PLRH, and cloned as a XhoI/HindIII fragment into pGL3-Basic (Promega) digested with the same enzymes. To create all other reporter plasmids, the fragments Enh, 1–7, A, B, C, D, E, F, and ZFTE were amplified from DT40 genomic DNA (or zebra finch genomic DNA, in the case of ZFTE) using the primer pairs listed (see Table I), cloned into the pCR4-TOPO vector (Invitrogen), sequenced, and subsequently inserted as BamHI/BglII fragments into the BamHI site of pIgLP. As we were unable to amplify fragment 3′RR+ directly from genomic DNA, it was assembled from three sub-fragments using the internal unique NdeI and BsmBI sites.
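
The cloning scheme above relies on restriction sites built into the 5′ ends of the primers. As a minimal sketch (not part of the original methods), the following Python snippet checks which recognition sequences occur in a few primers copied from Table I. The function name `find_sites` and the choice of primers are illustrative assumptions; the recognition sequences are the standard ones for these enzymes.

```python
# Standard recognition sequences for the enzymes named in the cloning steps.
SITES = {
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
    "NotI": "GCGGCCGC",
}


def find_sites(primer: str) -> dict[str, int]:
    """Return {enzyme: 0-based position of the first match} for each site in the primer.

    Hypothetical helper for illustration; only the given strand is searched.
    """
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


# Primer sequences copied from Table I.
primers = {
    "PLF": "gactctcgagtgggaaatactggtgatagg",
    "PLRH": "gactaagcttggcggaatcccagcagctgtgtgtc",
    "KR1F": "gtcaggatcccccggcaagtggcggctgct",
    "KR1R": "gtcaagatctacccacagctggccgtggcatc",
}

for name, seq in primers.items():
    print(name, find_sites(seq))
```

In these four primers the site follows a four-nucleotide 5′ extension, with the promoter pair carrying XhoI/HindIII sites and the fragment pair carrying BamHI/BglII sites, matching the cloning steps described.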

The gene targeting plasmids for ΔME6.2/ΔME6.1/ΔME6.1S were generated using the targeting vector for the DT40 ΔME cell line as a backbone, which contains the intronless VJ-C exon in its left arm (15). The right arm was replaced with the respective PCR fragments that were cloned as BamHI/XhoI fragments. The loxP-flanked puromycin selection cassette was cloned as a BamHI (from pLoxPuro) (18) or a BamHI/BglII fragment (from pBSK BBH puro) into the unique BamHI site located between the left and the right arm. Where required, the SV40 enhancer was inserted as a BamHI/BglII fragment in the unique BamHI site next to the puromycin selection cassette.

The gene targeting plasmid for the Δ1.5K line was generated by cloning respective PCR products as NotI/BamHI and BamHI/XhoI fragments into pBluescript and inserting the puromycin selection cassette as described above. For the knock-in constructs, the enhancer fragments B, C, and ZFTE were inserted as BamHI/BglII fragments into the unique BamHI site next to the puromycin cassette in the ΔME6.1 targeting vector.

Southern Blots

Ten micrograms of genomic DNA were digested with the respective restriction enzymes and separated on 0.8% agarose gels in 1×TBE. Subsequently, the gel was treated with HCl to fragment high molecular weight DNA, and the DNA was denatured and transferred to GeneScreen Plus (Perkin Elmer) membranes using alkaline transfer buffer (0.4M NaOH, 1mM EDTA). Probes were amplified from genomic DNA using the primer pairs listed (see Table I), and radiolabeled with [32P]-α-dCTP using the NEBlot kit (New England Biolabs). Blots were hybridized at 62°C overnight, washed, and the band patterns were detected using phosphorimaging screens and a Storm 860 phosphorimager (GE Healthcare Life Sciences).

Northern Blots

Total RNA was isolated using RNA-Bee (Tel-Test) according to the manufacturer’s instructions. Ten micrograms of RNA were loaded on 1% agarose gels with formaldehyde and separated in 1×MOPS at 80V. RNA was transferred to GeneScreen Plus membranes in 10×SSC (150mM sodium citrate, 1.5M NaCl). Blots were hybridized (see Methods – Southern Blots) with an IGL constant region probe (amplified using primer pair CCLF1/CHVLR1) and a GAPDH probe (amplified using primer pair CHGAPDHF/CHGAPDHR). Blots were visualized using phosphorimaging, and quantified using ImageQuant software (Molecular Dynamics).

Cell culture

DT40 cells were grown in RPMI 1640 medium (Mediatech) supplemented with 10% fetal bovine serum (Invitrogen), 1% chicken serum (Sigma), 10mM Hepes, 2mM L-Glutamine, and penicillin/streptomycin. Cell cultures were maintained at 41°C in 5% CO2.

Luciferase assays

All luciferase assays were performed using the Dual-Luciferase Assay Kit (Promega) according to the manufacturer’s protocol. Briefly, 1×10⁶ cells were transfected with 1µg of luciferase reporter plasmids and 1µg of pRL-SV40 (Promega) using the Amaxa Nucleofector T kit (Lonza) with program B023. After 48 hours, cells were harvested, lysed, and luciferase activities were measured using a Monolight 2010 luminometer (BD Biosciences).
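
The Figure 1 legend states that firefly luciferase activities were normalized first to Renilla luciferase and then to the enhancer-less promoter-only reporter, with values reported as the average (±s.d.) of three independent experiments. The following Python sketch illustrates that two-step normalization. All readings are hypothetical numbers, the function name is invented for illustration, and normalizing within each experiment before averaging is an assumption, since the text does not specify the order.

```python
from statistics import mean, stdev


def fold_over_promoter(test_readings, promoter_readings):
    """Two-step normalization, done per experiment (an assumption).

    Each reading is a (firefly, renilla) pair. Step 1 divides firefly by Renilla;
    step 2 divides by the promoter-only ratio from the same experiment.
    """
    folds = []
    for (ff, rl), (p_ff, p_rl) in zip(test_readings, promoter_readings):
        folds.append((ff / rl) / (p_ff / p_rl))
    return folds


# Hypothetical luminometer readings for three independent experiments.
promoter_only = [(1000.0, 500.0), (1200.0, 600.0), (900.0, 450.0)]
enhancer = [(8000.0, 500.0), (9000.0, 600.0), (7200.0, 450.0)]

folds = fold_over_promoter(enhancer, promoter_only)
print(f"relative activity: {mean(folds):.2f} ± {stdev(folds):.2f}")
```

By construction the promoter-only reporter has a relative activity of one, so each fragment's value reads directly as fold enhancement.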

Gene-targeting

The ΔM cell line was used as the parent for all clones generated in this study, except for the Δ1.5K line for which wildtype cells are the parental line. Transfections were performed by electroporation (580V, 25µF, ∞Ω) with 25–30µg of linearized targeting plasmids. Stable integrants were selected using 0.5µg/ml puromycin, and the genotypes of individual clones determined by Southern Blot analyses. Clones that carried a randomly integrated copy in addition to the targeted integration were discarded. The puromycin resistance cassettes were removed using recombinant cell-permeable Cre protein as described previously (15).

SHM/GCV analyses

The assays for GCV (by FACS) and SHM+GCV (by DNA sequencing) were performed and analyzed as described previously (15). For the IGH locus we consider any sequence that has more than one nucleotide change as a single event, as we do not have the sequence information of all pseudo VH elements that would be required to assign a distinct donor sequence to each observed nucleotide change. Hence the IGH frequencies represent the results of conservative lower-limit calculations.
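
The conservative IGH counting rule can be illustrated with a short Python sketch. It assumes that an event frequency is the number of events divided by the total number of base pairs sequenced; the per-sequence change counts and the 500 bp read length are hypothetical. Counting every nucleotide change as its own event in the uncollapsed case is a simplification, because the actual analysis (described in reference 15) may group changes differently.

```python
BACKGROUND = 5e-5  # events/bp, the AID-/- background level quoted in this study


def event_frequency(changes_per_sequence, sequence_length_bp, collapse_multiple=False):
    """Events per bp sequenced.

    With collapse_multiple=True, every mutated sequence contributes exactly one
    event, however many nucleotide changes it carries (the conservative IGH rule).
    Otherwise each change is counted separately (a simplification for illustration).
    """
    if collapse_multiple:
        events = sum(1 for n in changes_per_sequence if n > 0)
    else:
        events = sum(changes_per_sequence)
    total_bp = len(changes_per_sequence) * sequence_length_bp
    return events / total_bp


# Hypothetical data: nucleotide changes found in six sequenced clones of 500 bp each.
changes = [0, 0, 3, 1, 0, 2]

conservative = event_frequency(changes, 500, collapse_multiple=True)
uncollapsed = event_frequency(changes, 500)
print(conservative, uncollapsed, conservative > BACKGROUND)
```

The collapsed value can never exceed the uncollapsed one, which is why the IGH frequencies are lower-limit estimates.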

Sequence alignments

Bioinformatic sequence analyses were performed using the web-based tools Pipmaker (http://bio.cse.psu.edu/pipmaker) (19), MULAN (http://mulan.dcode.org/), and multiTF (http://multitf.dcode.org/) (20).

GenBank submission

The sequences of the condor IGL locus fragments were deposited at GenBank (http://www.ncbi.nlm.nih.gov) under the accession numbers HQ414233 and JF693631.

Results

The chicken IGL locus

The rearranged chicken IGL locus consists of a single set of functional leader L, VJ and constant region exons (Fig. 1A). Upstream of these coding segments are 25 pseudo V elements, and downstream of them a 467 bp minimal enhancer (Enh) element has been identified (Fig. 1A) (21). The 3′RR that contains a targeting element for SHM/GCV is located downstream of this enhancer (15), and the only readily identifiable sequence element in this area is a non-functional copy of a CR1 retrotransposon (22). This CR1 retrotransposon is not unique to the DT40 cell line, as it is also present in the published chicken genome assembly. However, the functional importance (if any) of this element for the IGL locus is unclear. Lastly, the 3′RR is also essential for transcription of the IGL locus (15), and hence our approach was to identify sequence elements of functional importance within this region.

Figure 1. Chicken IGL transcriptional enhancer analysis using luciferase assays.

(A) Schematic representation of the rearranged IGL allele in wildtype DT40 cells. Exons are shown as filled boxes, and labeled as leader (L), variable region (VJ), and constant region (C). Previously described transcriptional control elements are shown as filled ovals, and labeled as matrix attachment region (MAR) and enhancer (Enh), respectively. The location of the 3′ regulatory region (3′RR) is indicated by a bracket, and the CR1 retrotransposon is shown as a filled box. (B+C) The individual IGL fragments that were tested for their enhancer activities are shown below the locus with their assigned designations. Luciferase reporter constructs carrying the indicated IGL fragments were transiently transfected into DT40 cells together with the pRL-SV40 control vector. The firefly luciferase activities were first normalized to those of Renilla luciferase (cotransfected as a separate plasmid) and subsequently to the value of the enhancer-less reporter carrying only the IGL promoter (P). Each value represents the average (±s.d.) of three independent experiments.

The chicken DT40 cell line serves as our experimental system. Although it is a cell line model, it is commonly thought that the molecular mechanism of SHM and GCV in these cells, in particular with respect to targeting, is likely a good reflection of what occurs in primary B cells. Our experiments utilize the endogenous IGL promoter and IGL gene as a mutation read-out instead of employing a more artificial EGFP reporter system.

Transcription enhancers in the chicken IGL locus

To define the transcriptional enhancers in the 3′RR and its vicinity, luciferase assays were employed (Fig. 1), and Fragment 7, which contains the previously defined minimal enhancer (21), showed the highest luciferase activity, roughly two-fold higher than Enh alone (Fig. 1B). Further dissections defined fragment B (Fig. 1C) as the minimal DNA fragment providing the strongest enhancer activity in our transient luciferase assays. The increase in enhancer activity of fragment B over fragment 7 is likely a result of a deletion of negative regulatory elements in the 3′ end of fragment 7.

Separation of transcription and mutation enhancer function

To determine whether fragment B contained both transcription and mutation enhancer activity in the context of the endogenous locus, knock-in clones were generated in which fragment B was placed into the “empty” IGL locus of the DT40 ΔME6.1 line (FragB, Fig. 2A, and Fig. S1A). The ΔME6.1 cells showed neither transcription nor SHM/GCV (Fig. 3 and discussed below), and thus served as a platform to test the properties of respective cis-regulatory elements. Steady-state transcript levels of the IGL gene in FragB cells were comparable to those in the parental ΔM cell line (Fig. 2B). Thus, fragment B acts as a transcriptional enhancer in the context of the endogenous IGL locus.

Figure 2. Transcriptional and mutational enhancers in the IGL locus of DT40 cells.

(A) Schematic representation of the rearranged IGL allele in wildtype DT40 cells and the individual indicated clones, drawn to scale. The regions/elements shown in grey are deleted in the respective lines. (B) Steady-state transcript levels of IGL were determined by Northern blot analysis, and normalized to those of GAPDH. The level for the parental ΔM genotype is arbitrarily set to one, and each bar is the average (±s.d.) of triplicate experiments for two cell clones per genotype. (C) Mutation event frequencies in the IGL (filled bars) and IGH (white bars) genes were determined by DNA sequencing after four weeks of culture starting from single cell clones. The compiled frequencies are obtained from the analyses of two independent clones of each genotype. The background level of 5×10⁻⁵ events/bp is shown as a dotted line and was obtained by sequencing the IGL locus of AID−/− DT40 cells after four weeks of culture (23). (D) Surface IGM reversion analysis was done by flow cytometry using PE-labeled anti-chicken IGM antibodies. The percentage of IGM+ cells after four weeks is shown for one representative subclone of each genotype, and the average (±s.d.) obtained from a total of 20–24 subclones of two independent clones per genotype is shown below.

Figure 3. Systematic deletion analysis of the chicken IGL locus in DT40 cells.

Schematic representation of the rearranged IGL allele in wildtype DT40 cells and selected DT40 deletion mutants, drawn to scale. Deleted regions are shown in grey. The steady-state IGL transcript levels determined by Northern blot analysis are shown on the right (relative to that observed in ΔM). The mutation event frequencies in both the IGL and IGH locus were determined by sequencing these genes after four weeks of culture from individual single cell clones (two per genotype). Note that the background (defined by the respective analysis of AID−/− DT40 cells) is at 5×10⁻⁵ events/bp, and all IGL mutation frequencies that are above this threshold are highlighted by a grey frame. N.D. = not determined. *Only one clone of this genotype was analyzed. §Data from (15).

To test whether this element is also sufficient to drive SHM/GCV, subclones of the FragB lines were continuously cultured for four weeks starting from single cell clones. The VJ exon in these clones contains a premature stop codon, and its reversion by GCV leads to the appearance of surface IGM+ cells. While flow cytometry analyses of parental ΔM clones showed on average 3.72% IGM+ cells after four weeks, such IGM+ populations did not arise in FragB lines (Fig. 2D). DNA sequencing confirmed this absence of SHM and GCV (Fig. 2C), with mutation frequencies being measured below the background of 5×10⁻⁵ events/bp observed in AID−/− DT40 cells (23). The IGH locus mutation frequency of 3.47×10⁻⁴ events/bp (Fig. 2C) was comparable to that of 4.79–6.22×10⁻⁴ events/bp routinely observed in DT40 cells (15), indicating that the lack of SHM/GCV in the IGL locus was caused solely by the manipulation of the IGL locus itself. Thus we concluded that fragment B is sufficient to drive normal levels of IGL transcription, but is unable to support SHM/GCV. This provides definitive evidence for the concept that transcriptional enhancer function and mutational enhancer function are physically separable in the DT40 system.

To determine whether the addition of non-coding sequences could indeed increase mutation levels without altering transcription, we generated fragment C which extended 222 bp beyond the 3′ end of fragment B (Fig. 2A). As predicted, IGL transcript levels in the corresponding FragC DT40 knock-in line (Fig. 2A, 2B, and Fig. S1B) remained unaltered. This indicated that these additional 222 bp of DNA are dispensable for transcription. Importantly, however, AID-mediated sequence diversification was partially restored in these FragC clones. An IGM+ cell population became readily detectable in our sensitive flow cytometry-based IGM reversion assay (Fig. 2D), and reached levels of about 50 % of the parental ΔM clone that was used to generate this genotype by gene-targeting. Although the DNA sequencing approach is less sensitive (as it is a PCR-based assay), the mutation event frequency measured in FragC lines was still clearly above the background of our system. In summary, this demonstrated that a small fragment from the 3′RR contains at least parts of a MEE critical for SHM/GCV targeting, and that addition of this sequence to the transcriptional enhancer now recruits AID activity without altering transcription.

A pair of MEEs in the chicken IGL locus

To determine where within the 3′RR the MEE (i.e. the SHM/GCV targeting element) resides, we generated DT40 lines with smaller deletions of this region (Fig. 3 and Fig. S1C–E). Importantly, even deleting as little as 1.5 kb of the 3′RR in the ΔME6.1S line resulted in a complete disruption of SHM/GCV specifically in the IGL locus (Fig. 3). Thus we concluded that a MEE for SHM/GCV resides in the 1.5 kb between the Enh enhancer and the CR1 retrotransposon. This location is fully consistent with our observations in the FragC line in which a small fragment from this 1.5 kb element clearly enhanced mutation (Fig. 2D).

To determine whether this MEE is also essential for AID-mediated sequence diversification in the context of the intact wildtype IGL locus, we generated the Δ1.5K DT40 line (Fig. 3 and Fig. S2A). Somewhat surprisingly, this genotype was still able to drive SHM/GCV in the IGL locus robustly at 70% of the level observed in wildtype DT40 cells (4.07×10⁻⁴ events/bp sequenced, (15)). This strongly suggested that a second targeting element exists upstream of the minimal Enh enhancer element.

The ΔM line lacking the VJ-C intron and the ΔE line lacking the Enh enhancer showed wildtype levels of SHM/GCV, but the deletion of 2.3 kb of non-coding DNA between the constant region exon and the enhancer in the ΔME line reduced SHM/GCV to 50% (15). As this is comparable to what we observed in our Δ1.5K line, we inferred that an additional MEE resides in that area. Hereafter we will refer to the MEEs upstream and downstream of the enhancer as 5′MEE and 3′MEE, respectively. As the individual deletion of each element resulted in similarly modest effects on SHM/GCV, we inferred that these elements might be redundant in function. Alternatively, one MEE could drive SHM whereas the other supports GCV. Importantly, however, the ΔME and Δ1.5K lines, lacking the 5′MEE or 3′MEE, respectively, show evidence for both SHM and GCV (Fig. 4). Thus we conclude that these two elements are largely redundant.

Figure 4. Mutation event patterns in the ΔME and Δ1.5K DT40 lines.

Mutation events identified in the VJ region of the indicated cell lines after four weeks of continuous culture were classified as templated and non-templated events as indicated. Note that the data for the ΔME line is from (15).

Evolutionary conservation of targeting activities

To determine whether MEEs are a unique feature of chicken B cells, a search for such elements in another species was initiated. As standard sequence comparison algorithms were unable to detect significant similarities between the non-coding regions of the IGL locus of chicken and those of humans, mice, rats, lizards, and teleost fish, we decided to focus on a more closely related species, the zebra finch (Taeniopygia guttata). An 8 kb contig of the zebra finch IGL locus was assembled by a combination of in silico and PCR experiments, and largely matched the current genome assembly (which became available while these studies were ongoing). A sequence comparison using Pipmaker revealed four areas of strong homology between this region and the chicken IGL locus (Fig. 5A). Interestingly, there is a stretch of strong homology upstream of the Enh enhancer in an area where we predicted the location of the 5′MEE (discussed below), but also to the 1.5 kb region of the 3′RR to which we mapped the chicken 3′MEE. Hereafter we focus on the latter homologous sequence element, and refer to it as the zebra finch targeting element (ZFTE).

Figure 5. Zebra finch IGL targeting elements for AID-mediated sequence diversification.

(A) Dot-plot comparison of the zebra finch and chicken IGL locus. A conserved sequence element that overlaps with the chicken 3′RR is marked as zebra finch targeting element (ZFTE). (B) Schematic representation of the IGL locus in the ZFTE DT40 line and the corresponding inactive ΔME6.1 line. The ZFTE element is shown as a filled box. (C) Steady-state transcript levels of IGL normalized to those of GAPDH. The level for the ΔM genotype is arbitrarily set to one, and each bar is the average (±s.d.) of triplicate experiments for two independent cell clones. (D) Mutation event frequencies in the IGL (black bar) and IGH (white bars) genes as determined by DNA sequencing (see Figure 3). The background level of 5×10⁻⁵ events/bp is shown as a dotted line. Note that the ΔM data are the same as presented in Fig. 2. (E) Scatter plot of the percentage of IGM+ cells in 24 subclones of DT40 ZFTE cells after three weeks of continuous culture as determined by flow cytometry. Data from twelve subclones of ΔME6.1 representing the corresponding “empty” IGL serve as a control. The average percentage was calculated and is shown as a black bar.

To test whether the ZFTE is able to promote GCV and SHM in the context of the chicken locus, knock-in cell lines were generated in which the chicken targeting element was replaced with this fragment (Fig. 5B, and Fig. S2B). The ZFTE was fully able to promote transcription of the chicken IGL gene (Fig. 5C), and evidence for ongoing sequence diversification by SHM/GCV was readily observed by flow cytometry (Fig. 5E). Furthermore, DNA sequencing also showed clear evidence for ongoing SHM/GCV (Fig. 5D), albeit at levels below those observed in the parental ΔM line (15). Interestingly, AID-mediated mutagenesis in the IGH locus of the ZFTE DT40 cells was similarly reduced in two independently generated lines (Fig. 5E), but the reason for this effect remains unknown. Overall, these observations strongly suggest that the MEEs for AID-mediated sequence diversification are evolutionarily conserved.

A common strategy to find conserved binding sites in cis-regulatory elements in rapidly evolving loci is to look at comparisons between closely related species. Thus we isolated and sequenced BAC clones containing the IGL locus of the California condor (Gymnogyps californianus) and identified a region that showed striking homology to both the chicken and the zebra finch locus (a detailed description of the California condor IGL locus will be published elsewhere). Three-way alignments using MULAN revealed two stretches with strong sequence homology, one upstream and one downstream of the Enh enhancer (Fig. 6A). Strikingly, the downstream region partially overlaps with the small 222 bp sequence element that exhibited MEE activity in the FragC line (Fig. 6B), and hence we conclude that this area of homology might represent at least parts of the 3′MEE. In this region, the in silico approach to identify conserved transcription factor binding sites using multiTF predicted binding sites for E2F, FoxO forkhead proteins, and a GATA factor (Fig. 6A). Furthermore, one stretch of nucleotides (AGXTTGTAAACAXGCTGA) stood out as it showed almost perfect conservation across all three species. No match to this sequence was found, however, in any of the available databases scanned. Importantly, no evolutionarily conserved sites for any of the transcription factors that had been previously implicated in targeting, including E2A (24–26), NFκB, Mef2, and Oct1/2 (17), were present in this region. Lastly, based on our current delineation of the border between the transcriptional enhancer and the 3′MEE, the GATA site is a strong candidate for conferring MEE function, as it is present only in the IGL locus of the FragC lines, which showed SHM/GCV, but not in the FragB lines, which show only transcription (Fig. 6B).
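
As a minimal sketch of how a sequence could be searched for the conserved motif AGXTTGTAAACAXGCTGA, the following Python snippet treats each X as "any nucleotide" (an assumption about the notation) and scans both strands with a regular expression. This is not the method used in the study, which relied on MULAN and multiTF; the function names and the test sequences are invented for illustration.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif into a regex, reading X as any of A, C, G, T (assumed)."""
    return re.compile("".join("[ACGT]" if ch == "X" else ch for ch in motif.upper()))


def revcomp(seq: str) -> str:
    """Reverse complement of an upper- or lower-case DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq: str, motif: str):
    """Return (strand, start) for non-overlapping motif matches on both strands.

    Minus-strand positions are 0-based offsets within the reverse complement.
    """
    pattern = motif_to_regex(motif)
    hits = [("+", m.start()) for m in pattern.finditer(seq.upper())]
    hits += [("-", m.start()) for m in pattern.finditer(revcomp(seq))]
    return hits


# Invented test sequences, not taken from any of the three IGL loci.
print(scan("ttAGCTTGTAAACATGCTGAcc", "AGXTTGTAAACAXGCTGA"))
print(scan("aaCAGCTGaa", "CAGCTG"))
```

The second call uses the E-box motif CAGCTG, which is its own reverse complement, so a single site is reported once on each strand.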

Figure 6. Evolutionary conservation of mutation enhancer elements.

(A) The predicted location of the 5′MEE and 3′MEE (red bars) as determined by three-way comparison between chicken (GenBank NW_001471461.1, nt 1149180–1151691 and nt 1148230–1148400), zebra finch (GenBank NW_002197395.1, nt 5112960–5107578 and nt 5106902–5107106), and California condor (GenBank JF693631, nt 1–800, and GenBank HQ414233, nt 478–680) using MULAN. Evolutionarily conserved transcription factor binding sites predicted by multiTF are shown in blue, and a highly conserved sequence motif that might play a role in MEE function is highlighted in orange. A chicken IGL silencer element annotated in GenBank L26587 is shown as a black line. (B) Location of the evolutionarily conserved sequence element with respect to the FragB and FragC clones (see Fig. 2), which show equal transcription but are deficient or active for SHM/GCV, respectively.

With respect to the upstream region of homology, it is conceivable that this area also corresponds to a cis-regulatory element, and hence it might include the 5′MEE of the chicken IGL locus (Fig. 6A). It is important to note, however, that a transcriptional silencer has been annotated in this region as well (Fig. 6A). Interestingly, this putative 5′MEE contains a conserved E-box motif (CAGCTG) that could function similarly to the E-box motifs in the murine Igκ enhancers (24), but surprisingly, there is no overlap between the predicted transcription factor binding sites in the evolutionarily conserved putative 5′MEE and 3′MEE. In summary, our sequence alignments suggest that the sequences of the IGL MEEs are evolutionarily conserved, and that their function might be mediated by highly conserved modules containing binding sites for known transcription factors and targeting factors of unknown identity.

Discussion

AID is a DNA mutator that poses an enormous threat to genomic integrity in B cells when its expression is turned on upon activation. A widely entertained working model proposes that cis-regulatory sequences might play a key role in restricting these processes to Ig genes (27). Our experimental data presented here now show that MEEs and the transcriptional enhancers in the chicken IGL locus are physically separable, i.e. MEEs are critical elements beyond transcriptional enhancers to recruit AID-mediated mutagenesis. Currently, we cannot rule out that MEEs also influence transcription. Overall, our findings require a drastic revision of the long-standing model that Ig enhancer elements (defined and identified by narrowing in on sequence elements with strong classical transcription enhancing features) also harbor the cis-regulatory sequences that target SHM, GCV, and CSR to these gene loci (13). The observation of AID-dependent mutation events in non-Ig genes (e.g. in BCL6) suggests that MEEs are also present outside Ig loci, and might be more widely distributed than previously anticipated (6, 8). It is currently unknown whether such non-Ig MEEs are evolutionarily conserved (i.e. whether, for example, the BCL6 gene of all jawed vertebrates is mutated by AID), but it seems rather unlikely as they might be subject to negative selective pressure. MEEs might also be the defining features of genomic loci with increased instability, like breakpoint cluster regions and recombination hotspots, and might direct translocations that ultimately lead to lymphoma formation.

One of the key questions with respect to targeting of AID-dependent sequence diversification is the identity of minimal MEEs, i.e., the minimal sequence elements that are necessary and sufficient to target SHM, GCV, and CSR to Ig loci. Our analyses revealed the presence of two distinct and largely redundant MEEs in the chicken IGL locus, the 5′MEE and 3′MEE, upstream and downstream of the previously described minimal transcriptional enhancer. These data are consistent with some of the published data obtained in the context of a largely modified IGL locus including a GFP reporter gene (16). Furthermore, downstream of the full transcriptional enhancer defined in this study we identified a small 222 bp fragment that likely represents a central component of the 3′MEE. Ongoing systematic deletion studies in this region of the IGL locus will help in fine-mapping the boundaries and location of this element.

Using an evolutionary approach, we showed that a 1.7 kb fragment from the zebra finch IGL locus is functional as a SHM/GCV targeting element when placed in chicken cells. The chicken belongs to the Galloanserae, while the zebra finch and California condor are members of the Neoaves, and the last common ancestor of these birds is thought to date back 105 million years (28). Three-way alignments between the Ig gene locus sequences of these three bird species focused our attention on smaller, highly conserved regions upstream and downstream of the minimal Enh enhancer. The relevance of these regions for IGL biology was strongly supported by the fact that the downstream homology area overlaps with the location of the 3′MEE determined in our functional studies. Importantly, the putative 5′MEE and 3′MEE both harbor short sequence modules with nearly complete evolutionary conservation containing putative binding sites for CDP, CEBPB, E2A, HSF1, E2F, FoxO, and GATA transcription factors, while other modules remain orphans (Fig. 6). It is tempting to speculate that subsets of these (individually or in combination) act as binding sites for trans-acting factors mediating targeting. Interestingly, although the 5′MEE and the 3′MEE are largely redundant, there is no overlap between the predicted transcription factor binding sites for the upstream and downstream MEE. As the corresponding transcription factors are rather broadly used in lymphocytes and not unique to activated B cells, we predict that the as-of-yet uncharacterized conserved sequences may hold the key to understanding the mechanism of SHM/GCV targeting.
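The idea of narrowing a three-way alignment down to short, highly conserved regions can be sketched as a sliding-window identity scan. This is an illustrative simplification only: it is not the alignment software used in this study, it assumes the three sequences have already been aligned to equal length with "-" as the gap character, and the function name, window size, threshold, and toy sequences are hypothetical.

```python
def conserved_windows(aligned, window=20, min_identity=0.9):
    """Return (start, identity) for windows in which the fraction of
    columns identical across all aligned sequences is >= min_identity."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    # A column counts as conserved if every sequence has the same base and no gap.
    identical = [
        len(set(column)) == 1 and column[0] != "-"
        for column in zip(*(seq.upper() for seq in aligned))
    ]
    results = []
    for start in range(length - window + 1):
        identity = sum(identical[start:start + window]) / window
        if identity >= min_identity:
            results.append((start, identity))
    return results


# Hypothetical toy alignment of three species, not real IGL sequence.
toy_alignment = ["ACGTACGTAA", "ACGTACGTCC", "ACGTACGTGG"]
print(conserved_windows(toy_alignment, window=4, min_identity=1.0))
```

Adjacent qualifying windows would normally be merged into a single conserved region; that step is omitted here to keep the sketch short.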

Sequence comparisons between the bird MEEs and mammalian Ig loci do not reveal any regions of high similarity, but we predict that MEEs exist not only in mammalian Ig genes but in the Ig loci of all jawed vertebrates. As SHM is conserved throughout the jawed vertebrate lineage, we favor a model in which the targeting factors recruited by these cis-regulatory sequences are evolutionarily conserved as well. Hence, the identification of the minimal set of targeting factor binding sites required for MEE function will facilitate bioinformatics approaches to determine the location of clusters of such sites (and hence the MEEs) in all Ig loci. Such an approach would also represent a tool for identifying MEEs in non-Ig genes like BCL6. In an alternative model, functionally conserved factors with altered DNA sequence specificity would mediate targeting, and comparing the Ig loci of multiple phylogenetically closely related species (as done for birds in this manuscript) would point to conserved cis-regulatory sequences including MEEs. A comparison of the Igκ locus of mouse (Mus musculus) and rat (Rattus norvegicus) does reveal a large number of conserved non-coding sequence segments, but the lack of genomic sequence from a third rodent currently precludes a more stringent approach towards identifying candidates for MEEs in their Ig loci.
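The bioinformatics approach envisioned above, locating clusters of targeting factor binding sites, could take a form like the following minimal sketch. Since the minimal set of sites is not yet known, the motif strings below are arbitrary placeholders rather than real binding-site models, and the function name, window size, and threshold are hypothetical assumptions; a realistic search would likely use position weight matrices and scan both strands.

```python
def find_site_clusters(sequence, motifs, window=100, min_distinct=3):
    """Return (start, motif_names) for each hit that opens a window of
    `window` bp containing at least `min_distinct` different motifs."""
    seq = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        position = seq.find(motif)
        while position != -1:
            hits.append((position, name))
            position = seq.find(motif, position + 1)
    hits.sort()
    clusters = []
    for index, (start, _) in enumerate(hits):
        # Collect the distinct motifs whose hits begin inside this window.
        names = sorted({name for pos, name in hits[index:] if pos < start + window})
        if len(names) >= min_distinct:
            clusters.append((start, names))
    return clusters


# Placeholder motifs and a toy sequence, used only to show the call.
placeholder_motifs = {"siteA": "CAGCTG", "siteB": "TGACGT", "siteC": "GGGAAA"}
toy_sequence = (
    "A" * 50 + "CAGCTG" + "A" * 10 + "TGACGT" + "A" * 10 + "GGGAAA"
    + "A" * 300 + "CAGCTG"
)
print(find_site_clusters(toy_sequence, placeholder_motifs))
```

In the toy sequence only the first group of three sites qualifies as a cluster; the isolated final site is ignored because no other motif falls within its window.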

Lastly, the mode of action of MEEs at the molecular level remains elusive. As AID is more widely distributed over the genome than previously thought (6, 8, 29), the simplest model, the specific recruitment of the AID enzyme to Ig gene loci, is likely incorrect. Thus, Ig locus-specific activation of AID and/or recruitment of error-prone polymerases represent plausible mechanisms of MEE function. This would be consistent with a model in which MEEs are also present and active in non-Ig loci, even outside the B cell lineage, where AID is unlikely to be the initiating factor for genomic instability. The recent description of DNA zip codes, small DNA sequences that drive the localization of repressed genes to the nuclear pore in budding yeast (30), raises the possibility that MEEs might act in a similar fashion to localize Ig loci to distinct subnuclear regions in which error-prone DNA repair occurs.

Supplementary Material

1
2
3

Acknowledgements

We thank Drs. David R. Wilson, Ranjan Sen, and Shu Yuan Yang for helpful discussions, suggestions, and comments on this manuscript. We are indebted to Dr. Arthur P. Arnold for providing us with genomic DNA from a zebra finch, and to Dr. Oliver A. Ryder for the California condor genomic DNA sample.

Abbreviations

3′RR: 3′ regulatory region
AID: activation-induced cytidine deaminase
CSR: class switch recombination
Enh: chicken IGL transcription enhancer
GCV: immunoglobulin gene conversion
MEE: mutation enhancer element
SHM: somatic hypermutation
ZFTE: zebra finch targeting element

Footnotes

GenBank submission

The sequences of the condor IGL locus fragments were deposited at GenBank under the accession numbers HQ414233 and JF693631.

The authors declare no competing financial conflicts of interest.

This work was entirely supported by the intramural research program of the NIH, National Institute on Aging.

References

1. Muramatsu M, Kinoshita K, Fagarasan S, Yamada S, Shinkai Y, Honjo T. Class switch recombination and hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell. 2000;102:553–563. doi: 10.1016/s0092-8674(00)00078-7.
2. Revy P, Muto T, Levy Y, Geissmann F, Plebani A, Sanal O, Catalan N, Forveille M, Dufourcq-Labelouse R, Gennery A, Tezcan I, Ersoy F, Kayserili H, Ugazio AG, Brousse N, Muramatsu M, Notarangelo LD, Kinoshita K, Honjo T, Fischer A, Durandy A. Activation-induced cytidine deaminase (AID) deficiency causes the autosomal recessive form of the Hyper-IgM syndrome (HIGM2). Cell. 2000;102:565–575. doi: 10.1016/s0092-8674(00)00079-9.
3. Arakawa H, Hauschild J, Buerstedde JM. Requirement of the activation-induced deaminase (AID) gene for immunoglobulin gene conversion. Science. 2002;295:1301–1306. doi: 10.1126/science.1067308.
4. Di Noia JM, Neuberger MS. Molecular mechanisms of antibody somatic hypermutation. Annu Rev Biochem. 2007;76:1–22. doi: 10.1146/annurev.biochem.76.061705.090740.
5. Longerich S, Basu U, Alt F, Storb U. AID in somatic hypermutation and class switch recombination. Curr Opin Immunol. 2006;18:164–174. doi: 10.1016/j.coi.2006.01.008.
6. Liu M, Duke JL, Richter DJ, Vinuesa CG, Goodnow CC, Kleinstein SH, Schatz DG. Two levels of protection for the B cell genome during somatic hypermutation. Nature. 2008;451:841–845. doi: 10.1038/nature06547.
7. Wang CL, Harper RA, Wabl M. Genome-wide somatic hypermutation. Proc Natl Acad Sci U S A. 2004;101:7352–7356. doi: 10.1073/pnas.0402009101.
8. Hasham MG, Donghia NM, Coffey E, Maynard J, Snow KJ, Ames J, Wilpan RY, He Y, King BL, Mills KD. Widespread genomic breaks generated by activation-induced cytidine deaminase are prevented by homologous recombination. Nat Immunol. 2010;11:820–826. doi: 10.1038/ni.1909.
9. Robbiani DF, Bothmer A, Callen E, Reina-San-Martin B, Dorsett Y, Difilippantonio S, Bolland DJ, Chen HT, Corcoran AE, Nussenzweig A, Nussenzweig MC. AID is required for the chromosomal breaks in c-myc that lead to c-myc/IgH translocations. Cell. 2008;135:1028–1038. doi: 10.1016/j.cell.2008.09.062.
10. Pasqualucci L, Bhagat G, Jankovic M, Compagno M, Smith P, Muramatsu M, Honjo T, Morse HC 3rd, Nussenzweig MC, Dalla-Favera R. AID is required for germinal center-derived lymphomagenesis. Nat Genet. 2008;40:108–112. doi: 10.1038/ng.2007.35.
11. Shen HM, Peters A, Baron B, Zhu X, Storb U. Mutation of BCL-6 gene in normal B cells by the process of somatic hypermutation of Ig genes. Science. 1998;280:1750–1752. doi: 10.1126/science.280.5370.1750.
12. Pasqualucci L, Migliazza A, Fracchiolla N, William C, Neri A, Baldini L, Chaganti RS, Klein U, Kuppers R, Rajewsky K, Dalla-Favera R. BCL-6 mutations in normal germinal center B cells: evidence of somatic hypermutation acting outside Ig loci. Proc Natl Acad Sci U S A. 1998;95:11816–11821. doi: 10.1073/pnas.95.20.11816.
13. Odegard VH, Schatz DG. Targeting of somatic hypermutation. Nat Rev Immunol. 2006;6:573–583. doi: 10.1038/nri1896.
14. Dunnick WA, Collins JT, Shi J, Westfield G, Fontaine C, Hakimpour P, Papavasiliou FN. Switch recombination and somatic hypermutation are controlled by the heavy chain 3' enhancer region. J Exp Med. 2009;206:2613–2623. doi: 10.1084/jem.20091280.
15. Kothapalli N, Norton DD, Fugmann SD. Cutting Edge: A cis-Acting DNA Element Targets AID-Mediated Sequence Diversification to the Chicken Ig Light Chain Gene Locus. J Immunol. 2008;180:2019–2023. doi: 10.4049/jimmunol.180.4.2019.
16. Blagodatski A, Batrak V, Schmidl S, Schoetz U, Caldwell RB, Arakawa H, Buerstedde JM. A cis-acting diversification activator both necessary and sufficient for AID-mediated hypermutation. PLoS Genet. 2009;5:e1000332. doi: 10.1371/journal.pgen.1000332.
17. Kim Y, Tian M. NF-kappaB family of transcription factor facilitates gene conversion in chicken B cells. Mol Immunol. 2009;46:3283–3291. doi: 10.1016/j.molimm.2009.07.027.
18. Arakawa H, Lodygin D, Buerstedde JM. Mutant loxP vectors for selectable marker recycle and conditional knock-outs. BMC Biotechnol. 2001;1:7. doi: 10.1186/1472-6750-1-7.
19. Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R, Hardison R, Miller W. PipMaker--a web server for aligning two genomic DNA sequences. Genome Res. 2000;10:577–586. doi: 10.1101/gr.10.4.577.
20. Ovcharenko I, Loots GG, Giardine BM, Hou M, Ma J, Hardison RC, Stubbs L, Miller W. Mulan: multiple-sequence local alignment and visualization for studying function and evolution. Genome Res. 2005;15:184–194. doi: 10.1101/gr.3007205.
21. Bulfone-Paus S, Reiners-Schramm L, Lauster R. The chicken immunoglobulin lambda light chain gene is transcriptionally controlled by a modularly organized enhancer and an octamer-dependent silencer. Nucleic Acids Res. 1995;23:1997–2005. doi: 10.1093/nar/23.11.1997.
22. Ichiyanagi K, Okada N. Mobility pathways for vertebrate L1, L2, CR1, and RTE clade retrotransposons. Mol Biol Evol. 2008;25:1148–1157. doi: 10.1093/molbev/msn061.
23. Gopal AR, Fugmann SD. AID-mediated diversification within the IgL locus of chicken DT40 cells is restricted to the transcribed IgL gene. Mol Immunol. 2008;45:2062–2068. doi: 10.1016/j.molimm.2007.10.017.
24. Tanaka A, Shen HM, Ratnam S, Kodgire P, Storb U. Attracting AID to targets of somatic hypermutation. J Exp Med. 2010;207:405–415. doi: 10.1084/jem.20090821.
25. Michael N, Shen HM, Longerich S, Kim N, Longacre A, Storb U. The E box motif CAGGTG enhances somatic hypermutation without enhancing transcription. Immunity. 2003;19:235–242. doi: 10.1016/s1074-7613(03)00204-8.
26. Yabuki M, Ordinario EC, Cummings WJ, Fujii MM, Maizels N. E2A acts in cis in G1 phase of cell cycle to promote Ig gene diversification. J Immunol. 2009;182:408–415. doi: 10.4049/jimmunol.182.1.408.
27. Yang SY, Schatz DG. Targeting of AID-mediated sequence diversification by cis-acting determinants. Adv Immunol. 2007;94:109–125. doi: 10.1016/S0065-2776(06)94004-8.
28. van Tuinen M. Birds (Aves). In: Hedges SB, Kumar S, editors. The Time Tree of Life. New York, NY: Oxford University Press; 2009. pp. 409–411.
29. Pavri R, Gazumyan A, Jankovic M, Di Virgilio M, Klein I, Ansarah-Sobrinho C, Resch W, Yamane A, San-Martin BR, Barreto V, Nieland TJ, Root DE, Casellas R, Nussenzweig MC. Activation-Induced Cytidine Deaminase Targets DNA at Sites of RNA Polymerase II Stalling by Interaction with Spt5. Cell. 2010;143:122–133. doi: 10.1016/j.cell.2010.09.017.
30. Ahmed S, Brickner DG, Light WH, Cajigas I, McDonough M, Froyshteter AB, Volpe T, Brickner JH. DNA zip codes control an ancient mechanism for gene targeting to the nuclear periphery. Nat Cell Biol. 2010;12:111–118. doi: 10.1038/ncb2011.
