Abstract
Genome-wide mutations and selection within a population are the basis of natural evolution. A similar process occurs during antibody affinity maturation when immunoglobulin genes are hypermutated and only those B cells which express antibodies of improved antigen-binding specificity are expanded. Protein evolution might be simulated in cell culture, if transgene-specific hypermutation can be combined with the selection of cells carrying beneficial mutations. Here, we describe the optimization of a GFP transgene in the B cell line DT40 by hypermutation and iterative fluorescence activated cell sorting. Artificial evolution in DT40 offers unique advantages and may be easily adapted to other transgenes, if the selection for desirable mutations is feasible.
INTRODUCTION
Natural evolution, based on the selection of beneficial mutations within a population of genetic variants, has created the amazing diversity of life on our planet. While natural selection works on whole genomes, the evolution of individual proteins can be tracked by the analysis of intra-species polymorphisms and inter-species divergence. A fascinating example of the evolution of a single protein by mutation and selection is the affinity maturation of antibodies in B cells (1). Developing B cells activate and diversify their immunoglobulin (Ig) genes by recombination, but only cells encoding antibodies with antigen-binding specificity are stimulated to expand after antigen challenge. These B cells hypermutate their Ig genes, and cycles of hypermutation and selection continue until antibodies of sufficiently high antigen-binding affinity emerge.
Recombinant DNA technologies are able to recapitulate the evolution of proteins by mutagenesis and selection in vitro. Most approaches combine the expression of diverse gene libraries in phage, bacteria or translation systems with selection for the encoded proteins (2,3). If the structure-function relationship of a protein is sufficiently understood, site-directed mutagenesis of critical amino acids can be used instead or in addition to random mutagenesis (4). The potential of in vitro protein evolution is illustrated by the example of the green fluorescent protein (GFP) (5) which could be changed into yellow, cyan or blue variants by random and site-directed mutagenesis and selection in vitro (6–8).
Affinity maturation of antibodies can be simulated ex vivo in cultures of hypermutating B cell lines by the enrichment of cells expressing antigen-specific antibodies (9,10). B cell lines could also be used for the optimization of non-immunoglobulin proteins, if the hypermutating activity were directed toward transfected transgenes and cells carrying beneficial mutations were selected. Advantages of this approach are the possibilities to generate an enormous amount of genetic diversity and to select for improved protein variants within a living cell culture. It was recently reported that a gene encoding a red fluorescent protein (FP) was transferred into the hypermutating human B cell line RAMOS, and the selection of cells emitting red-shifted fluorescence yielded the most far red-shifted FP protein known to date (11).
Despite this impressive result, hypermutation is difficult to control in RAMOS, because transgenes usually integrate at random chromosomal sites outside the hypermutating Ig loci. In contrast, transgenes can be easily inserted into the Ig loci of the chicken B cell line DT40 (12). DT40 diversifies its rearranged Ig light chain gene by pseudo V (ψV) gene-templated gene conversion (13), but if gene conversion is blocked due to the inactivation of RAD51 paralogues or the deletion of gene conversion donors, hypermutation occurs (14,15). Ig gene diversification by conversion or hypermutation requires expression of the AID gene (15,16). Based on these results, we reasoned that transgenes inserted into the Ig loci of DT40 without nearby gene conversion donors would be diversified by hypermutation in AID positive cells. To test this hypothesis, we inserted the enhanced GFP (eGFP) gene (17) into the rearranged Ig light chain locus and searched for cells displaying increased fluorescence. Only three rounds of fluorescence activated cell sorting (FACS) were sufficient to isolate cells expressing eGFP variants whose fluorescent intensity appears to be superior to the best GFPs currently available for vertebrate cell labeling.
MATERIALS AND METHODS
Cell lines
Cells were cultured in chicken medium (RPMI-1640 or DMEM/F-12 with 10% fetal bovine serum, 1% chicken serum, 2 mM L-glutamine, 0.1 mM β-mercaptoethanol and penicillin/streptomycin) at 41°C with 5% CO2. The AID expressing DT40 clones, AIDR1CL1 and AIDR1CL2, used in the study for hypermutation were derived from the AID knockout cell clone DT40Cre1AID−/− (16) by stable transfection of a floxed AID-IRES (internal ribosome entry site)-gpt (guanine phosphoribosyl transferase) bicistronic cassette. The IRES sequence was derived from Encephalomyocarditis virus. Because the bicistronic cassette is driven by the strong β-actin promoter and AID is the first gene downstream of the promoter, AID protein expression in AIDR1CL1 and AIDR1CL2 is higher than in wild-type DT40 cells (unpublished data). The cell clones express the Cre recombinase as a MerCreMer fusion protein which is inactive due to its retention in the endoplasmic reticulum in the absence of estrogen derivatives (18). However, since background activity of MerCreMer can lead to undesired excision of the floxed AID-gpt expression cassette during prolonged culture, we selected for cells retaining AID by culturing in media containing 0.5 µg/ml of mycophenolic acid for 3 days following each preparative FACS sort.
An AID negative subclone of AIDR1CL1 was generated by culturing the cells in chicken medium containing 1 µM 4-hydroxytamoxifen (SIGMA) for 2 days and subsequent subcloning.
Targeted integration of transgenes into the rearranged Ig locus
Cells were transfected by electroporation using the Gene Pulser Xcell (BIO-RAD) at 25 µF and 700 V (16) and stable transfectants were selected using 1 µg/ml of puromycin. Transfectants having integrated the pHypermut1- or pHypermut2-derived constructs by targeted integration were identified by PCR using primer pairs P1/P2 and P1/P3, respectively. The frequency of targeted integration after transfection of pHypermut constructs was consistently more than 70% (data not shown). Since the constructs can target either the rearranged or the unrearranged Ig locus of DT40, integration into the rearranged Ig light chain locus was verified by PCR amplification of the VJ intervening sequence of the unrearranged locus using the primer pair P4/P5.
Flow cytometry
To quantify the appearance of GFP negative cells within proliferating cultures, the AIDR1IgLeGFP1, AID−/−IgLeGFP1 and ψV−AIDR1IgLeGFP clones were subcloned by limited dilution, and 24 subclones of each were analyzed by FACS 2 weeks after subcloning. The preparative FACS sorts to select cells of increased fluorescence were performed using the MoFlo high-speed cell sorter (Cytomation).
PCR, cloning and sequencing
The screenings for targeted integration and the amplification of the eGFP transgenes were performed with the Expand Long Template PCR System (Roche) under the following conditions: 2 min of initial incubation at 93°C; 35 cycles consisting of 10 s at 93°C, 30 s at 65°C and 5 min (plus an added 20 s per cycle) at 68°C; and a final elongation step of 7 min at 68°C. The primer pair for the amplification of the eGFP gene from sorted cells was P6/P7. The PCR products were digested with HindIII and XbaI, cloned into the pUC119 plasmid vector and sequenced using primers P8 and P9. The mutated eGFP genes were then amplified using Pfu Ultra hotstart polymerase (Stratagene) and primer pair P10/P11, digested with AvrII and cloned into the NheI site of pHypermut2. The orientation and the sequence of the mutant eGFPs in pHypermut2 were verified by sequencing using primers P12 and P13.
Site-directed mutagenesis
To combine the codon changes found in different eGFP variants, we designed primers (P14–P26) which included the intended mutations in the center of the primer sequence. In the first step, parts of the eGFP gene were amplified by PCR using at least one mutation-containing primer. In the second step, PCR fragments containing mutations were mixed with other PCR fragments to cover the full eGFP coding sequence. This mixture served as a template for chimeric PCR using the primer pair of P10/P11. The full length mutant eGFP PCR fragments were digested with AvrII and cloned into the NheI site of pHypermut2. The orientation and the sequence of the mutant eGFPs in pHypermut2 were confirmed by sequencing.
Primers
P1 GGGACTAGTAAAATGATGCATAACCTTTTGCACA
P2 CGATTGAAGAACTCATTCCACTCAAATATACCC
P3 CCCACCGACTCTAGAGGATCATAATCAGCC
P4 TACAAAAACCTCCTGCCAGTGCAAGGAGCAGCTGATGGTTTTTACTGTCT
P5 GGGGGATCCAGATCTGTGACCGGTGCAAGTGATAGAAACT
P6 GGGAAGCTTTGGGAAATACTGGTGATAGGTGGAT
P7 GGGTCTAGACCTCTCAGCTTTTTCAGCAGAATAACCTCC
P8 GGTATAAAAGGGCATCGAGGTCCCCGGCAC
P9 AGTTCGAGGGCGACACCCTGGTGAACCGCA
P10 GAACCTAGGGCCACCATGGTGAGCAAGGGCGAGGA
P11 GAACCTAGGACTTGTACAGCTCGTCCATGCCG
P12 CCTAGCTCGATACAATAAACGCCATTTGAC
P13 TGGCTTCGGTCGGAGCCATGGAGATC
P14 AACGGCATCAAGGcGAACTTCAAGATC
P15 GATCTTGAAGTTCgCCTTATGCCGTT
P16 CCCGACCACATGAAGgAGCACGACTTCTTC
P17 GAAGAAGTCGTGCTcCTTCATGTGGTCGGG
P18 GATCACATGGTCCTGgTGGAGTTCGTGACC
P20 GGTCACGAACTCCAcCAGGACCATGTGATC
P21 GCCGACCACTACCAGgAGAACACCCCCATC
P22 GATGGGGGTGTTCTcCTGGTAGTGGTCGGC
P23 CTGAgCACCCAGTCCaCCCTGAgCAAAGAC
P24 GTCTTTGcTCAGGGtGGACTGGGTGcTCAG
P25 ATCCTGGGGCACAAGgTGGAGTACAACT
P26 AGTTGTACTCCAcCTTGTGCCCCAGGAT
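The mutagenic primers above appear to come in complementary forward/reverse pairs, with lower-case letters marking the intended base changes. A minimal Python sketch of how such a pair could be checked is given below, using P16 and P17 from the list. The helper names `reverse_complement` and `lowercase_positions` are hypothetical, and reading the lower-case bases as the mutated positions is an assumption based on the description of the primer design, not something the list states explicitly.

```python
# Hypothetical helpers; primer sequences are copied from the primer list above.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, preserving upper/lower case."""
    return seq.translate(COMPLEMENT)[::-1]


def lowercase_positions(seq: str) -> list[int]:
    """0-based positions of lower-case bases (assumed to mark intended mutations)."""
    return [i for i, base in enumerate(seq) if base.islower()]


P16 = "CCCGACCACATGAAGgAGCACGACTTCTTC"
P17 = "GAAGAAGTCGTGCTcCTTCATGTGGTCGGG"

# A forward/reverse mutagenic pair should be exact reverse complements.
is_complementary_pair = reverse_complement(P16) == P17

# Position of the assumed mutation within the 30-nt forward primer.
mutation_sites = lowercase_positions(P16)

print(is_complementary_pair, mutation_sites, len(P16))
```

For P16 the single lower-case base sits at the midpoint of the 30-nt primer, which agrees with the statement that the mutations were placed in the center of the primer sequence.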
Color spectrum
Excitation and emission spectra were analyzed by the luminescence spectrometer LS50B (Perkin Elmer). One million cells were washed once with PBS, resuspended in 2 ml of PBS, and then used for spectrum analysis. The relative ability to excite by lasers of different wavelengths was measured at fixed emission of 540 nm wavelength. The relative emission intensities at different wavelengths were measured after fixed excitation at 460 nm wavelength.
RESULTS
Targeted integration of the eGFP gene into the rearranged Ig light chain locus
The eGFP coding sequence was inserted into the pHypermut1 vector and transfected into the DT40 variant, AIDR1CL1, which is deleted for both alleles of the endogenous AID gene and expresses AID as a floxed cDNA expression cassette (16). Clones which had integrated the construct into the rearranged Ig light chain locus were identified by PCR. Transcription of the eGFP gene in these clones is supposed to be driven by the Ig light chain promoter and terminated by the SV40 polyA signal (Figure 1A). The clones contained two floxed transgenes, the bsr marker gene co-inserted in the rearranged Ig light chain locus and the AID-IRES-gpt gene of the AIDR1 progenitor clone. To remove the bsr marker alone or together with the AID expression cassette, Cre recombinase was induced by tamoxifen before subcloning by limited dilution (16). In this way, AID positive (AIDR1IgLeGFP1) and negative (AID−/−IgLeGFP1) clones could be isolated which had deleted the bsr marker gene from the light chain locus.
Figure 1.
Strategy for artificial evolution of eGFP gene. (A) A physical map of the chicken rearranged Ig light chain locus, the pHypermut1-eGFP targeting construct and the rearranged Ig light chain locus after targeted integration and marker excision is shown. The positions of primers used for the identification of targeted integration events are shown by arrows. (B) FACS profiles of AIDR1IgLeGFP1 and AID−/−IgLeGFP1 clones. The average percentages of events falling into the GFPhigh and GFPlow gates based on the measurement of 24 subclones are shown. (C) Sorting strategy for cells of increased fluorescence activity.
Evidence for hypermutation of the eGFP transgene
The AIDR1IgLeGFP1 and AID−/−IgLeGFP1 clones were subcloned by limited dilution and 24 subclones of each were analyzed by FACS for GFP brightness 14 days after subcloning. Subclones of the AIDR1IgLeGFP1 clone contained cell populations showing lower (3.8%) and higher (0.5%) fluorescence than the dominant GFP positive cell population, whereas subclones of the AID−/−IgLeGFP1 clone consisted only of a homogeneous GFP positive population (Figure 1B). This suggests that the eGFP gene inserted into the rearranged Ig light chain locus is diversified in AIDR1IgLeGFP1 cells giving rise to variants of increased and decreased fluorescent intensity.
Mutation activity of artificial evolution system
To analyze the mutation spectrum of the eGFP transgenes, the coding sequence was amplified by PCR from AIDR1IgLeGFP1 and AID−/−IgLeGFP1 cells 6 weeks after subcloning, cloned into a plasmid vector and sequenced. A total of 13 nt changes were found within the 0.7 kb eGFP coding region of 39 sequences from the AIDR1IgLeGFP1 clone (Table 1). Assuming a DT40 doubling time of 10 h, the mutation rate of the eGFP coding region in the rearranged Ig light chain locus of AIDR1IgLeGFP1 cells is approximately 4.7 × 10⁻⁶ mutations per base pair and division. This mutation rate is approximately three times lower than the mutation rate of the rearranged VJ segment in the pseudogene deleted ψV−AIDR cell line (15), perhaps reflecting the lower incidence of hypermutation hotspots in the eGFP sequence. In both cases, the majority of mutations occurred at C/G base pairs (Table 1). Only 1 nt change, which possibly represents a PCR artifact, was found in 26 sequences of the AID−/−IgLeGFP1 clone, confirming that the mutation activity of the eGFP transgene is dependent on AID.
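The rate estimate in the preceding paragraph can be reproduced with a few lines of arithmetic. The Python sketch below is an illustration only: the function name `mutation_rate` is hypothetical, and it assumes that 6 weeks corresponds to 42 days of continuous exponential growth, that the coding region is taken as 700 bp (the "0.7 kb" of the text), and that the sequenced clones are a representative sample.

```python
def mutation_rate(mutations: int, sequences: int, region_bp: int,
                  culture_days: float, doubling_time_h: float) -> float:
    """Mutations per base pair per cell division (simple point estimate)."""
    divisions = culture_days * 24 / doubling_time_h   # about 100 divisions in 6 weeks
    mutations_per_bp = mutations / (sequences * region_bp)
    return mutations_per_bp / divisions


# Values from the text: 13 changes in 39 sequences of ~0.7 kb after 6 weeks,
# with an assumed doubling time of 10 h.
rate = mutation_rate(mutations=13, sequences=39, region_bp=700,
                     culture_days=42, doubling_time_h=10)
print(f"{rate:.1e} mutations per bp per division")
```

Under these assumptions the estimate agrees with the approximately 4.7 × 10⁻⁶ quoted in the text; using the 723 bp length listed in Table 1 instead of 700 bp lowers it slightly, to about 4.6 × 10⁻⁶.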
Table 1.
Mutation profile
| Cell source | Culture or sort | Gene | Mutations | Number of sequences | Mutations/sequence | Mutations at C | Mutations at G | Mutations at A | Mutations at T | Duplication | Deletion |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AID−/−IgLeGFP1 | 6-week culture | eGFP (723 bp) | 1 | 26 | 0.04 | 0 | 0 | 1 | 0 | 0 | 0 |
| AIDR1IgLeGFP1 | 6-week culture | eGFP (723 bp) | 13 | 39 | 0.33 | 5 | 8 | 0 | 0 | 1 | 0 |
| AIDR1IgLeGFP1 | Sort I | eGFP (723 bp)a | 15 | 95 | 0.16 | 8 | 6 | 1 | 0 | 0 | 0 |
| AIDR1IgLeGFP1 | Sort I | polyA-L (300 bp)a | 88 | 95 | 0.93 | 1 | 87 | 0 | 0 | 0 | 0 |
| AIDR1IgLeGFP1 | Sort III | eGFP (723 bp) | 221 | 85 | 2.60 | 130 | 93 | 6 | 4 | 0 | 0 |
aThese sequences were derived from different regions of the same plasmids.
Sorting and sequence analysis of cells displaying increased fluorescence
The FACS analysis suggested that a few cells within the AIDR1IgLeGFP1 culture had accumulated mutations which increased either the abundance or the fluorescent intensity of the eGFP protein. To enrich these cells, 10 million AIDR1IgLeGFP1 cells were sorted by preparative FACS using a gate which included only the 0.05% brightest cells (Figure 1C). The cells collected by the FACS sort were expanded and cycles of preparative cell sorting and expansion were repeated twice. Three consecutive sorts were performed independently for three AIDR1IgLeGFP1 subclones.
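To give a sense of scale for the sorting gate described above, the following Python sketch computes how many cells a 0.05% gate would be expected to collect from 10 million input cells. The variable names are hypothetical, and the calculation assumes ideal sorter recovery with no cell loss, which will not hold exactly in practice.

```python
# Numbers taken from the text; ideal recovery is an assumption.
cells_sorted = 10_000_000      # input cells per preparative sort
gate_percent = 0.05            # gate covering the brightest 0.05% of cells

expected_cells = round(cells_sorted * gate_percent / 100)
print(expected_cells)
```

Each round therefore starts the next expansion from only a few thousand cells, which is why the sorted population has to be expanded before the following sort.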
To trace the rise of mutations responsible for the increased fluorescence, the coding sequences of the eGFP genes were amplified by PCR from cells after the first and the third sort, cloned into a plasmid vector and sequenced. Unexpectedly, few mutations and no obvious mutation signatures were found in the eGFP coding sequence after the first sort (Table 1) despite the fact that the sorted cells displayed increased fluorescence (Figure 1C). We therefore extended the sequence analysis to the region downstream of the eGFP coding sequence by primer walking. This analysis revealed frequent mutations at the splice donor site of the Ig light chain leader intron (88 mutations/95 seq) (Figure 2A, Table 1). Although the role of these mutations downstream of the eGFP polyA signal remains speculative, it is possible that they enhance fluorescence by increasing eGFP mRNA stability.
Figure 2.
Mutations downstream and within the eGFP transgene. (A) Mutations identified at the exon/intron border of the Ig light chain leader sequence. The number of times a mutation was found is shown by superscript. (B) Mutations within the eGFP coding sequence. Mutations were mapped below the reference eGFP sequence together with the corresponding amino acid codons. The eGFP sequence which we used had one codon (Val) inserted next to the start codon to yield an optimal translation-initiation sequence (Kozak motif). This Val was numbered as codon 1a according to Crameri (26) for the easier comparison to wild-type GFP and previously reported GFP mutants. (C) Pedigrees for the evolution of the eGFP transgenes in culture. The number of times each sequence was found within each subclone is indicated at the right side of the circles. The amino acid changes of each step are shown beside the arrows. Sequences identified in more than one subclone are named v1–v9.
In contrast, most of the sequences after the third sort showed mutations in the eGFP coding sequence (221 mutations/85 seq) (Figure 2B, Table 1). Comparison of the sequences of each sorted subclone allowed us to reconstruct evolutionary trees showing the stepwise accumulation of eGFP codon changes (Figure 2C). Many of the codon changes were not shared by different subclones and their functional significance remains uncertain. However, a number of codon changes occurred independently in different subclones suggesting that they are related to the increased fluorescence of the sorted cells. Mutated eGFP sequences which were found more than once were named GFP variants (GFPv1–9).
Development of a new vector for transgene evolution in DT40
Although we were able to use pHypermut1 for the enhancement of eGFP, this vector is not ideally suited for transgene evolution in DT40, as mutations outside the transgene coding region had apparently been selected for after the first sort. We therefore developed a new targeting vector named pHypermut2 in which the expression of the transgene is unlikely to be influenced by mutations of the rearranged Ig light chain gene, because transgene transcription is driven by the Rous sarcoma virus (RSV) promoter in the opposite direction of the light chain gene (Figure 3A). pHypermut2 allows transgenes to be cloned into multiple cloning sites (NheI, EcoRV and BglII). The vector's puromycin resistance (puroR) cassette, located downstream of the transgene, can be excised by Cre recombinase (19), but it is unlikely to interfere with transgene transcription or mutagenesis. A convenient feature of pHypermut2 is the presence of a SpeI site upstream of the RSV promoter which can be used for the insertion of gene conversion donor sequences (20).
Figure 3.
Hypermutation of transgenes using pHypermut2. (A) Plasmid map of the pHypermut2 vector. Target genes for artificial evolution can be cloned into the NheI, EcoRV or BglII sites. Potential gene conversion donor sequences can be cloned into the SpeI site. (B) A physical map of the rearranged Ig light chain locus, the pHypermut2-eGFP construct and the rearranged Ig light chain locus after targeted integration. The positions of primers used for the identification of targeted integration events are shown by arrows. (C) FACS profile of the ψV−AIDR1IgLeGFP clone having integrated the pHypermut2-eGFP construct into the rearranged Ig light chain locus after 2 weeks of culture. The average percentage of events falling into the GFPlow gate based on the measurement of 24 subclones is shown.
We inserted the eGFP transgene into pHypermut2, and transfected the construct into the ψV−AIDR1 cell line (15) (Figure 3B). To estimate the hypermutation rate within the eGFP transgene, one of the stably transfected clones, ψV−AIDR1IgLeGFP, having integrated the construct into the rearranged Ig light chain locus, was subcloned by limited dilution. FACS analysis of 24 subclones after 14 days of culture revealed that on average 10.9% of the cells showed decreased or lost fluorescence (Figure 3C). This result indicates that transgenes inserted by pHypermut2 into the rearranged Ig light chain locus are efficiently diversified by hypermutation.
Confirmation of the variant GFP phenotypes
The phenotype of mutations is best confirmed by re-transfection into the genetically stable AID−/− clone and the analysis of transfectants which have integrated the gene as a single copy into the rearranged locus. The pHypermut2 vector is well suited for this task and all isolated GFP variants, as well as the eGFP and the Emerald (8) control sequences, were cloned into pHypermut2. These constructs were then transfected into AID−/−. More than 50% of all stable transfectants had integrated the constructs into the rearranged light chain locus and these clones were named according to the inserted transgene (AID−/−IgLeGFP, AID−/−IgLEmerald, AID−/−IgLGFPv1, etc.).
The fluorescent intensity of the different transfectants was compared by FACS (Figure 4A). Two independent transfectants of each variant GFP gene were included in the experiment to account for variation among transfectants of the same gene. This analysis revealed that the Emerald transfectant was 1.2 times brighter than the eGFP transfectant and that four of the nine GFP variant transfectants (AID−/−IgLGFPv2, AID−/−IgLGFPv5, AID−/−IgLGFPv6 and AID−/−IgLGFPv8) were brighter than the eGFP transfectant. Three variant GFP transfectants (AID−/−IgLGFPv5, AID−/−IgLGFPv6 and AID−/−IgLGFPv8) had even higher fluorescent intensity than the Emerald transfectant (Table 2). Of all variant GFP transfectants, AID−/−IgLGFPv6 had the highest fluorescent intensity (2.5-fold brighter than AID−/−IgLeGFP) and AID−/−IgLGFPv5 had the second highest intensity (2.0-fold brighter than AID−/−IgLeGFP) (Figure 4A, Table 2). Little variation in fluorescence was observed among transfectants of the same transgene.
Figure 4.
Analysis of variant GFP transfectants. (A) FACS analysis of control and variant GFP transfectants. The average fluorescence of AID−/−IgLeGFP is indicated by a green line for easier comparison. (B) Flow chart of site-directed mutagenesis of GFP variants. (C) Relative fluorescence of the transfectants normalized to the fluorescence of AID−/−IgLeGFP. (D) Excitation and emission spectra. (E) Image of single cells by fluorescence microscopy. The image of the same single cells is shown with fluorescence activation (upper row) and without fluorescence activation (lower row).
Table 2.
Amino acid changes and brightness of GFP variants
| GFP variant | Amino acid changes | Number of sequences | Subclone | Relative brightnessa |
|---|---|---|---|---|
| eGFP | Change from wild-type GFP: 1aG, F64L, S65T, H231L, 239S | – | – | 1.0 ± 0.1 |
| | Change from eGFP: | | | |
| GFPv1 | Q80E | 28 | 1, 2, 3 | 0.7 ± 0.0 |
| GFPv2 | Q80E, Q184E, A206T | 19 | 1 | 1.2 ± 0.0 |
| GFPv3 | Q80E, E124D, L141V, Y237stop | 11 | 3 | 0.3 ± 0.0 |
| GFPv4 | Q80E, S202T | 4 | 1, 2 | 0.9 ± 0.1 |
| GFPv5 | Q80E, V163A | 4 | 1 | 2.0 ± 0.0 |
| GFPv6 | Q80E, Y145F, S202T | 3 | 2 | 2.5 ± 0.0 |
| GFPv7 | Q80E, S208T | 2 | 3 | 0.7 ± 0.1 |
| GFPv8 | Q80E, L221V | 2 | 3 | 1.5 ± 0.3 |
| GFPv9 | Q80E, L141V, S202I | 5 | 3 | 1.0 ± 0.1 |
| GFPv10 | Q80E, Y145F, V163A, S202T | – | – | 3.0 ± 0.1 |
| GFPv11 (GFPnovo1) | Y145F, V163A, S202T | – | – | 3.2 ± 0.0 |
| GFPv12 (GFPnovo2) | Y145F, V163A, S202T, L221V | – | – | 3.3 ± 0.0 |
| GFPv13 | L141V, Y145F, V163A, Q184E, S202T, A206T, L221V | – | – | 2.9 ± 0.1 |
| Emerald | Change from wild-type GFP: S65T, S72A, N149K, M153T, I167T | – | – | 1.2 |
aSD based on the analysis of two clones.
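The comparisons drawn from Table 2 can be checked mechanically. The Python sketch below copies the mean relative brightness values of the sorted variants GFPv1–9 and the two controls from the table and lists which variants exceed each control. The dictionary and function names are hypothetical, and the standard deviations are ignored, so this is a simple threshold comparison rather than a statistical test.

```python
# Mean relative brightness copied from Table 2 (eGFP transfectant = 1.0).
RELATIVE_BRIGHTNESS = {
    "eGFP": 1.0, "Emerald": 1.2,
    "GFPv1": 0.7, "GFPv2": 1.2, "GFPv3": 0.3,
    "GFPv4": 0.9, "GFPv5": 2.0, "GFPv6": 2.5,
    "GFPv7": 0.7, "GFPv8": 1.5, "GFPv9": 1.0,
}


def brighter_than(reference: str, names: list[str]) -> list[str]:
    """Names whose mean relative brightness strictly exceeds the reference."""
    cutoff = RELATIVE_BRIGHTNESS[reference]
    return [name for name in names if RELATIVE_BRIGHTNESS[name] > cutoff]


sorted_variants = [f"GFPv{i}" for i in range(1, 10)]   # variants isolated by FACS
above_egfp = brighter_than("eGFP", sorted_variants)
above_emerald = brighter_than("Emerald", sorted_variants)
ranked = sorted(sorted_variants, key=RELATIVE_BRIGHTNESS.get, reverse=True)

print(above_egfp)
print(above_emerald)
print(ranked[:3])
```

With these values, four of the nine sorted variants exceed eGFP and three exceed Emerald, with GFPv6 and GFPv5 ranking first and second.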
To compare the fluorescent intensity of eGFP and GFPv6 in human cells, the pHypermut2 constructs of GFPv6 and eGFP were transiently transfected into the human embryonic kidney cell line HEK293T. By FACS analysis, cells transfected with the GFPv6 construct showed on average more fluorescence than cells transfected with the eGFP construct (data not shown).
Increased fluorescence by the combination of mutations
As certain combinations of codon changes were not found among the GFP variants, we wondered whether the accumulation of mutations in a single sequence may lead to further enhancement of fluorescent intensity (Figure 4B). At first, we combined Y145F and S202T of GFPv6 with V163A of GFPv5 to produce GFPv10. Because GFPv1 harboring the single Q80E mutation showed less fluorescent intensity than eGFP, we subtracted Q80E from GFPv10 to generate GFPv11. We also added L221V of GFPv8 to GFPv11, thereby generating GFPv12. Finally, we added Q184E and A206T of GFPv2 as well as L141V of GFPv9 to GFPv12 to generate GFPv13. These new variants were cloned into pHypermut2 and inserted into the rearranged Ig light chain locus of the clone AID−/−.
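The construction of the combined variants described above amounts to simple set operations on amino acid changes. The Python sketch below illustrates this using the changes listed in Table 2; the variable names and the `by_position` helper are hypothetical, and the sketch only tracks which substitutions are present, not the underlying DNA sequences.

```python
# Amino acid changes relative to eGFP, as listed in Table 2.
GFPv2 = {"Q80E", "Q184E", "A206T"}
GFPv5 = {"Q80E", "V163A"}
GFPv6 = {"Q80E", "Y145F", "S202T"}
GFPv8 = {"Q80E", "L221V"}
GFPv9 = {"Q80E", "L141V", "S202I"}

# Step 1: combine Y145F and S202T of GFPv6 with V163A of GFPv5.
GFPv10 = GFPv6 | GFPv5
# Step 2: remove Q80E, which on its own reduced fluorescence (GFPv1).
GFPv11 = GFPv10 - {"Q80E"}
# Step 3: add L221V of GFPv8.
GFPv12 = GFPv11 | (GFPv8 - {"Q80E"})
# Step 4: add Q184E and A206T of GFPv2 and L141V of GFPv9.
GFPv13 = GFPv12 | {"Q184E", "A206T"} | {"L141V"}


def by_position(changes: set[str]) -> list[str]:
    """Sort changes such as 'V163A' by their residue number."""
    return sorted(changes, key=lambda change: int(change[1:-1]))


for name, changes in [("GFPv10", GFPv10), ("GFPv11", GFPv11),
                      ("GFPv12", GFPv12), ("GFPv13", GFPv13)]:
    print(name, ", ".join(by_position(changes)))
```

The resulting sets match the amino acid changes listed for GFPv10 to GFPv13 in Table 2.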
Transfectants of all new GFP variants showed higher fluorescent intensity than AID−/−IgLGFPv6 by FACS analysis (Figure 4C, Table 2). AID−/−IgLGFPv10 had 3.0-fold higher fluorescent intensity than AID−/−IgLeGFP, but AID−/−IgLGFPv11 and AID−/−IgLGFPv12 were even brighter, displaying 3.2- and 3.3-fold more fluorescence respectively than AID−/−IgLeGFP. AID−/−IgLGFPv13 did not surpass the fluorescent intensity of the other variant GFP transfectants. The variants GFPv11 and GFPv12 which offer the most potential for the labeling of vertebrate cells were named GFPnovo1 and GFPnovo2, respectively.
Spectral properties of the new GFP variants
The excitation and emission maxima of eGFP are at 488 and 509 nm, respectively (7,8). To determine the spectral profiles of the new variant GFP proteins, the control and variant gene transfectants (AID−/−IgLeGFP, AID−/−IgLEmerald, AID−/−IgLGFPv5, AID−/−IgLGFPv6, AID−/−IgLGFPnovo1 and AID−/−IgLGFPnovo2) were analyzed using a luminescence spectrometer (Figure 4D). This analysis showed that the fluorescent intensity, but not the excitation and emission spectra, had been changed in the variant GFP transfectants. The lack of mutants showing altered excitation or emission properties is most likely due to our selection strategy, which only enriched cells for increased fluorescent intensity.
We also examined cells of the eGFP, Emerald and variant GFP transfectants by fluorescence microscopy (Figure 4E). Consistent with the FACS and spectrometer analysis, cells of the GFPv5, GFPv6, GFPnovo1 and GFPnovo2 transfectants produced brighter images than cells of the eGFP and Emerald transfectants, but there was no noticeable change of color.
DISCUSSION
We describe here an artificial evolution system in the B cell line DT40 which offers advantages over alternative approaches and could be of generic value for the optimization of many types of proteins in cell culture. The main advantages are the high transgene-specific mutation rates, the easy confirmation of mutant phenotypes and the selection for beneficial mutations within the environment of live vertebrate cells. We have illustrated the utility of the system by generating new GFP variants which display more than 3-fold higher fluorescence activity than the best GFPs currently available for bio-imaging of vertebrate cells.
We have chosen eGFP as a first example for protein evolution in DT40, because the protein has been extensively optimized by random and site-directed mutagenesis and selection of cells displaying brighter fluorescence is straightforward. Only three rounds of FACS sorting during 2 months of culturing were sufficient to isolate mutants of increased fluorescent intensity. Some of the identified mutations are identical to mutations previously reported by others for GFP variants with an altered excitation and/or emission spectrum. For example, the V163A codon substitution is included in the GFP variants Cycle3 (7), T-Sapphire (20), Venus (21–23), ECFP (8) and W1C (8); Y145F is shared by the GFP variants Sapphire (20), p4-3 (8) and EBFP (7,8); L221V is shared by EGFPevo mutants (24). It has been proposed that the V163A substitution facilitates the correct folding and maturation of GFP (6). Other amino acid changes such as S202T and A206T are first described in this study. Although the crystal structure of GFP is resolved (25), it is not easy to explain the increased fluorescence of our variant GFP transfectants without more detailed studies of the encoded mRNAs and proteins. As the configuration of the GFP chromophore has already been optimized, the most likely reason seems to be improved protein maturation or stability. The amino acid substitutions A206T and L221V are localized in a region known to influence GFP dimerization and their effect might be due to the enhancement of dimer formation, but other effects cannot be ruled out. Regardless of the molecular basis, the increased fluorescence conferred by the new GFP variants, and in particular GFPnovo1 and GFPnovo2, should be useful in situations where maximal sensitivity is required for bio-imaging.
Compared to alternative artificial evolution techniques, protein optimization in hypermutating B cell lines like DT40 has the advantage that it is performed within the real life environment of a vertebrate cell. This can be important as the properties of many proteins are affected by intra-cellular parameters such as pH and salt concentrations, folding, post-translational modifications and degradation. Although optimization of proteins has been achieved in hypermutating mammalian B cell lines such as Ramos (11) or 18–81 (24), DT40 offers the advantages that transgenes can be inserted as single copies into the Ig light chain locus and AID expression can be controlled. In the future, transgene diversification in DT40 may be accomplished by a combination of hypermutation and gene conversion if homologous conversion donor sequences are inserted upstream of the transgene. It has already been demonstrated that gene conversion can be used for the diversification of transgenes in DT40 (20).
The high efficiency of targeted gene integration in DT40 allows one to standardize the hypermutation of transgenes and the confirmation of mutant phenotypes (Figure 5). We envision that the transgene of interest is first cloned into the pHypermut2 vector and then transfected into a DT40 clone which conditionally expresses AID. Transfectants, which hypermutate the transgene after targeted integration into the rearranged Ig light chain locus, are subjected to iterative cycles of cell selection and expansion. When a cell population of the desired phenotype is isolated, hypermutation is shut off by excision of the AID expression cassette and the cells are subcloned. Transgene sequences from the most promising subclones are amplified by PCR and cloned into the pHypermut2 vector for sequencing. In the end, the mutated transgenes within the pHypermut2 vector are transfected back into AID negative cells to verify their phenotype.
Figure 5.

Scheme of artificial evolution in DT40. The approach only requires a cell line such as AIDR which conditionally expresses AID and the pHypermut2 targeting vector.
In principle, DT40 can be used for the evolution of any coding or non-coding sequence whose transcription is tolerated. However, evolution of transgenes encoding non-fluorescent proteins is only feasible if rare cells carrying desirable mutations can be enriched from the bulk culture. As FACS sorting is ideal for selection, one might attempt to tie the optimization of non-fluorescent proteins to fluorescent signals. For example, a cell surface receptor expressed on the surface of DT40 might be optimized using a fluorescently labeled ligand. Most likely, the selection strategies will need to be customized for different types of proteins and the outcome of these efforts will determine the success of artificial evolution in DT40.
ACKNOWLEDGEMENTS
We are grateful to Claire Brellinger for excellent technical assistance. This work was supported by the Geninteg grant of the FP6 framework and the DGF program ‘Networks in genome expression and maintenance’. Funding to pay the Open Access publication charges for this article was provided by XXX.
Conflict of interest statement. None declared.
REFERENCES
- 1. Milstein C, Rada C. The maturation of the antibody response. In: Honjo T, Alt FW, editors. Immunoglobulin Genes. 2nd ed. London: Academic Press; 1995. pp. 57–81.
- 2. Matsuura T, Yomo T. In vitro evolution of proteins. J. Biosci. Bioeng. 2006;101:449–456. doi: 10.1263/jbb.101.449.
- 3. Dufner P, Jermutus L, Minter RR. Harnessing phage and ribosome display for antibody optimisation. Trends Biotechnol. 2006;24:523–529. doi: 10.1016/j.tibtech.2006.09.004.
- 4. Dwyer MA, Looger LL, Hellinga HW. Computational design of a biologically active enzyme. Science. 2004;304:1967–1971. doi: 10.1126/science.1098432.
- 5. Shimomura O. Structure of the chromophore of Aequorea green fluorescent protein. FEBS Lett. 1979;104:220–222.
- 6. Zacharias DA, Tsien RY. Molecular biology and mutation of green fluorescent protein. Methods Biochem. Anal. 2006;47:83–120.
- 7. Patterson GH, Knobel SM, Sharif WD, Kain SR, Piston DW. Use of the green fluorescent protein and its mutants in quantitative fluorescence microscopy. Biophys. J. 1997;73:2782–2790. doi: 10.1016/S0006-3495(97)78307-3.
- 8. Cubitt AB, Woollenweber LA, Heim R. Understanding structure-function relationships in the Aequorea victoria green fluorescent protein. Methods Cell Biol. 1999;58:19–30. doi: 10.1016/s0091-679x(08)61946-9.
- 9. Cumbers SJ, Williams GT, Davies SL, Grenfell RL, Takeda S, Batista FD, Sale JE, Neuberger MS. Generation and iterative affinity maturation of antibodies in vitro using hypermutating B-cell lines. Nat. Biotechnol. 2002;20:1129–1134. doi: 10.1038/nbt752.
- 10. Seo H, Masuoka M, Murofushi H, Takeda S, Shibata T, Ohta K. Rapid generation of specific antibodies by enhanced homologous recombination. Nat. Biotechnol. 2005;23:731–735. doi: 10.1038/nbt1092.
- 11. Wang L, Jackson WC, Steinbach PA, Tsien RY. Evolution of new nonantibody proteins via iterative somatic hypermutation. Proc. Natl Acad. Sci. USA. 2004;101:16745–16749. doi: 10.1073/pnas.0407752101.
- 12. Buerstedde J-M, Takeda S. Increased ratio of targeted to random integration after transfection of chicken B cell lines. Cell. 1991;67:179–188. doi: 10.1016/0092-8674(91)90581-i.
- 13. Buerstedde J-M, Reynaud CA, Humphries EH, Olson W, Ewert DL, Weill JC. Light chain gene conversion continues at high rate in an ALV-induced cell line. EMBO J. 1990;9:921–927. doi: 10.1002/j.1460-2075.1990.tb08190.x.
- 14. Sale JE, Calandrini DM, Takata M, Takeda S, Neuberger MS. Ablation of XRCC2/3 transforms immunoglobulin V gene conversion into somatic hypermutation. Nature. 2001;412:921–926. doi: 10.1038/35091100.
- 15. Arakawa H, Saribasak H, Buerstedde J-M. Activation-induced cytidine deaminase initiates immunoglobulin gene conversion and hypermutation by a common intermediate. PLoS Biol. 2004;2:E179. doi: 10.1371/journal.pbio.0020179.
- 16. Arakawa H, Hauschild J, Buerstedde J-M. Requirement of the activation-induced deaminase (AID) gene for immunoglobulin gene conversion. Science. 2002;295:1301–1306. doi: 10.1126/science.1067308.
- 17. Cormack BP, Valdivia RH, Falkow S. FACS-optimized mutants of the green fluorescent protein (GFP). Gene. 1996;173:33–38. doi: 10.1016/0378-1119(95)00685-0.
- 18. Zhang Y, Riesterer C, Ayrall AM, Sablitzky F, Littlewood TD, Reth M. Inducible site-directed recombination in mouse embryonic stem cells. Nucleic Acids Res. 1996;24:543–548. doi: 10.1093/nar/24.4.543.
- 19. Arakawa H, Lodygin D, Buerstedde J-M. Mutant loxP vectors for selectable marker recycle and conditional knock-outs. BMC Biotechnol. 2001;1:7. doi: 10.1186/1472-6750-1-7.
- 20. Kanayama N, Todo K, Takahashi S, Magari M, Ohmori H. Genetic manipulation of an exogenous non-immunoglobulin protein by gene conversion machinery in a chicken B cell line. Nucleic Acids Res. 2006;34:e10. doi: 10.1093/nar/gnj013.
- 21. Zapata-Hommer O, Griesbeck O. Efficiently folding and circularly permuted variants of the Sapphire mutant of GFP. BMC Biotechnol. 2003;3:5. doi: 10.1186/1472-6750-3-5.
- 22. Nagai T, Ibata K, Park ES, Kubota M, Mikoshiba K, Miyawaki A. A variant of yellow fluorescent protein with fast and efficient maturation for cell-biological applications. Nat. Biotechnol. 2002;20:87–90. doi: 10.1038/nbt0102-87.
- 23. Rekas A, Alattia JR, Nagai T, Miyawaki A, Ikura M. Crystal structure of venus, a yellow fluorescent protein with improved maturation and reduced environmental sensitivity. J. Biol. Chem. 2002;277:50573–50578. doi: 10.1074/jbc.M209524200.
- 24. Wang CL, Yang DC, Wabl M. Directed molecular evolution by somatic hypermutation. Protein Eng. Des. Sel. 2004;17:659–664. doi: 10.1093/protein/gzh080.
- 25. Ormo M, Cubitt AB, Kallio K, Gross LA, Tsien RY, Remington SJ. Crystal structure of the Aequorea victoria green fluorescent protein. Science. 1996;273:1392–1395. doi: 10.1126/science.273.5280.1392.
- 26. Crameri A, Whitehorn EA, Tate E, Stemmer WP. Improved green fluorescent protein by molecular evolution using DNA shuffling. Nat. Biotechnol. 1996;14:315–319. doi: 10.1038/nbt0396-315.




