ACS Omega. 2017 Jul 5;2(7):3183–3191. doi: 10.1021/acsomega.7b00508

Spiked Genes: A Method to Introduce Random Point Nucleotide Mutations Evenly throughout an Entire Gene Using a Complete Set of Spiked Oligonucleotides for the Assembly

Edson Cárcamo, Abigail Roldán-Salgado, Joel Osuna, Iván Bello-Sanmartin, Jorge A Yáñez, Gloria Saab-Rincón, Héctor Viadiu, Paul Gaytán †,*
PMCID: PMC6044943  PMID: 30023688

Abstract

In vitro mutagenesis methods have revolutionized biological research and the biotechnology industry. In this study, we describe a mutagenesis method based on synthesizing a gene using a complete set of forward and reverse spiked oligonucleotides that have been modified to introduce a low ratio of mutant nucleotides at each position. This novel mutagenesis scheme, named “Spiked Genes”, yields a library of clones with an enhanced mutation distribution due to its unbiased nucleotide incorporation. Using the far-red fluorescent protein emKate as a model, we demonstrated that Spiked Genes yields richer libraries than those obtained via enzymatic methods. We obtained a library without bias toward any nucleotide or base pair and with even mutation, transition, and transversion frequencies. Compared with enzymatic methods, the proposed synthetic approach for the creation of gene libraries represents an improved strategy for screening protein variants and does not require a starting template.

Introduction

The ability to produce a large number of protein variants that are screened to optimize the desired protein function continues to expand our technological progress. Directed molecular evolution relying on random and semirandom mutagenesis of genes has become a powerful strategy for modifying protein properties.1,2 There are already many proteins whose function has been modified by selecting variants from a created protein library. For example, variant proteins with stability to high temperatures,3 extreme pH,4,5 and organic solvents6,7 have been obtained. Moreover, variant enzymes with improved activities8,9 or specificity changes10,11 are available. Other physical properties, such as oligomerization of a target protein12,13 or color of fluorescent proteins have also been modified.14,15

Although screening and selection are intrinsically related to the particular properties of the target protein, the generation of genetic diversity at the gene level is a more general issue and depends on the mutagenesis method employed.1,2,16 Soon after the advent of the polymerase chain reaction (PCR) technique, it was determined that random point mutations were generated during the in vitro amplification process,17,18 and protocols were developed to take advantage of such mutations, creating the error-prone PCR (epPCR) mutagenesis approach.19,20 Now, it is well known that manganese ions and unbalanced dNTP ratios,21,22 nucleotide analogs,23 and low-fidelity DNA polymerases24,25 promote the random incorporation of erroneous nucleotides.

On the other hand, early studies on oligonucleotide synthesis performed on a solid phase demonstrated that when added in equimolar concentrations, the four DNA monomer phosphoramidites dA, dC, dG, and dT have similar chemical reactivity.26 In 1986, taking advantage of the equal nucleotide reactivity, the 30 bp glucocorticoid response element of the mouse mammary tumor virus was randomized with spiked oligonucleotides, wherein each position was doped during automated DNA synthesis with low levels of the other three phosphoramidites.27 Sequencing of 546 clones revealed 88 different base substitutions out of the 90 possible single mutants, confirming that the method robustly produces all of the expected mutations. A similar mutagenesis approach was reported the following year by Hill et al.28 and 4 years later by Dale et al.29 Using genes as templates, researchers incorporated spiked oligonucleotides by PCR-based methods to modify specific regions of the encoded proteins.30 More recently, Hidalgo et al.31 and Jin et al.32 reused the spiked oligonucleotides to achieve a focused and directed evolution in certain regions of the regulatory elements and enzymes via in vitro recombination with oligonucleotides bearing the wild-type sequence.

In the recent era of synthetic biology, using only the sequence information without any gene template, libraries of synthetic genes have been prepared by two similar approaches termed “Synthetic Shuffling”33 and “Assembly of Designed Oligonucleotides”.34 In both methods, the gene library is assembled from appropriately prepared oligonucleotides containing saturated degeneracies at the desired nucleotide positions, selected by sequence alignments of homologous genes. The precise annealing required to form full-length genes is achieved by controlled overlapping of the designed oligomers. However, contrary to the spiked oligonucleotides, ordinary primers synthesized with saturated degeneracies at some positions rapidly increase the number of variants at the nucleotide level35 and enrich the libraries with protein variants harboring multiple amino acid mutations, which usually affect either the folding or function of most encoded proteins. The fundamental difference between spiked and ordinary degenerate oligonucleotides lies in the level of doping of the wild-type nucleotides. In spiked oligonucleotides, each wild-type nucleotide is “doped” during the chemical synthesis with a very low ratio of the other three bases, yielding oligonucleotide mutants harboring very few changes per chain and a high ratio of chains containing the wild-type sequence. In ordinary commercial degenerate oligonucleotides, by contrast, the target nucleotides are completely replaced with an equimolar ratio of the desired mixed bases: N: A/G/C/T, V: A/C/G, D: A/G/T, H: A/T/C, B: C/G/T, M: A/C, R: A/G, W: A/T, S: C/G, Y: C/T, or K: G/T, producing oligonucleotide mutants harboring multiple mutations per chain and practically no wild-type sequence when more than 12 nucleotides are randomized with N.
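The contrast between the two kinds of oligonucleotides can be made concrete with a short calculation. The following Python sketch is illustrative only and is not part of the original work. It assumes that every position mutates independently, uses the 0.25% per-base doping applied in this study, and takes a hypothetical oligonucleotide length of 60 nt (the actual sequences are listed in Table S1).

```python
def wild_type_fraction_spiked(length: int, dope_each: float = 0.0025) -> float:
    """Fraction of chains carrying no substitution when every position is
    doped with `dope_each` of each of the three non-wild-type bases."""
    return (1.0 - 3.0 * dope_each) ** length


def wild_type_fraction_degenerate(n_randomized: int, n_bases: int = 4) -> float:
    """Fraction of wild-type chains when `n_randomized` positions are fully
    replaced by an equimolar mix of `n_bases` bases (N corresponds to 4)."""
    return (1.0 / n_bases) ** n_randomized


if __name__ == "__main__":
    # Hypothetical 60-nt spiked oligonucleotide (length is an assumption).
    print(f"spiked, 60 nt: {wild_type_fraction_spiked(60):.3f}")
    # Twelve positions randomized with N, as mentioned in the text.
    print(f"12 x N       : {wild_type_fraction_degenerate(12):.2e}")
```

Under these assumptions, most spiked chains remain wild type or carry a single change, whereas a wild-type chain is vanishingly rare once 12 positions are randomized with N.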

In this article, for the first time, we report the strategy of using a complete set of spiked oligonucleotides to introduce mutations throughout an entire synthetic gene, an approach that according to a review written by Wong et al. has not been reported yet.36 This method is not limited to a target region, as previously described,30–32 or to a reduced number of target nucleotides.33,34 This synthetic strategy, which we have named “Spiked Genes”, was applied to the monomeric enhanced red fluorescent protein emKate.37,38 This relatively small protein composed of 243 amino acids, including a polyhistidine tag on its carboxy end, was selected as a model test not only because its small gene is suitable for chemical assembly but also because emKate harbors the mutation S158A with respect to the original mKate protein,39 a mutation that makes it more fluorescent than its parental protein.38 Moreover, some amino acid replacements on its scaffold produce variants of a different color,37 similar to the protein DsRed that served as a scaffold for producing a palette of fluorescent proteins termed mFruits.14

Results and Discussion

Gene Library Synthesis

The synthetic emKate gene library was assembled, as described in the Methods section, by a one-step strategy using a diluted equimolar pool of 22 internal spiked oligonucleotides and two outermost spiked oligonucleotides as extension primers (Figure 1).37 The optimal concentration of the internal oligonucleotides was 4 nM, 100-fold lower than that of the external primers. Assembly was less efficient at higher internal oligonucleotide concentrations, likely due to mispairing between some of them and the external primers. We did not observe any difference in assembly efficiency between the wild-type37 and spiked mutant oligonucleotides.

Figure 1. Strategy for the assembly of the synthetic gene library. Each continuous arrow represents a spiked oligonucleotide, wherein each position was doped with 0.25% of each of the other three bases. Primer sequences are listed in Table S1. (A) Scheme of the single-step PCR assembly reaction to synthesize the emKate gene library. (B) PCR reactions, using three different starting concentrations of the internal primers (4, 8, and 16 nM) and outermost primers at 400 nM, analyzed by agarose gel electrophoresis and the GeneRuler 100 bp Plus DNA ladder as the molecular marker.

To evaluate the efficiency of Spiked Genes for generating random point nucleotide substitutions, we compared our method, over the emKate gene, with three enzymatic epPCR-based methods: two commercial mutagenesis kits (GeneMorph II and Diversify) and the traditional epPCR method.20 The Diversify PCR Random Mutagenesis Kit is based on the methods of Leung et al.19 and Cadwell and Joyce20 and relies on the use of Titanium Taq polymerase and different concentrations of manganese and dGTP to achieve different mutation rates. GeneMorph II is a commercial mutagenesis kit containing Mutazyme II: a blend of two error-prone DNA polymerases.25 One of these enzymes is Mutazyme I, which favors mutations at G’s and C’s, whereas the other is a novel Taq DNA polymerase mutant that exhibits increased misinsertion and misextension frequencies at A’s and T’s compared with those of the wild-type Taq.

Analysis of the Libraries

To evaluate the mutagenic efficiency of the Spiked Genes method versus the three error-prone enzymatic methods, we used three parameters to define the quality of DNA libraries:16,40 (1) the mutation distribution, which evaluates the capacity of the method to spread point mutations along the complete gene sequence, (2) the mutation rate, which evaluates how closely the method achieves the expected number of point mutations per kb, and (3) the mutational diversity, also known as the mutational spectrum, which evaluates the capacity of the method to produce all types of transition and transversion mutations.41 To measure these parameters while avoiding misrepresented data, we randomly selected several clones (independent of the phenotype displayed) from the antibiotic-supplemented plate. A total of 57 867 bases were sequenced for this analysis (Table 1).

Table 1. Comparison of Mutational Spectra and Mutation Rate Generated by the Spiked Genes and Three Different Enzymatic Mutagenesis Approaches.

| | classic epPCR (a) | Diversify (Clontech) | GeneMorph II (Stratagene) | Spiked Genes method | expected values (b) |
|---|---|---|---|---|---|
| Bias Indicators | | | | | |
| Ts/Tv | 1.81 | 3.16 | 1.01 | 0.61 | 0.5 |
| AT → GC/GC → AT | 4.07 | 8.70 | 2.14 | 0.88 | 0.908 |
| A → N, T → N (c) | 80.2 | 89.7 | 68.2 | 46.9 | 47.58 |
| G → N, C → N (c) | 19.7 | 10.3 | 31.8 | 53.1 | 52.38 |
| Types of Mutations | | | | | |
| transitions (c) | | | | | |
| A → G, T → C | 46.0 | 69.2 | 29.5 | 17.7 | 15.86 |
| G → A, C → T | 18.4 | 6.8 | 20.9 | 20.4 | 17.46 |
| transversions (c) | | | | | |
| A → T, T → A | 26.3 | 17.1 | 34.3 | 13.6 | 15.86 |
| A → C, T → G | 7.9 | 3.4 | 4.4 | 15.6 | 15.86 |
| G → C, C → G | 0 | 1 | 4.3 | 11.6 | 17.46 |
| G → T, C → A | 1.4 | 2.5 | 6.6 | 21.1 | 17.46 |
| insertions and deletions | | | | | |
| deletions | 1 | 8 | 6 | 19 | 0 |
| insertions | 0 | 0 | 1 | 1 | 0 |
| Data Analyzed | | | | | |
| clones | 18 | 19 | 23 | 22 | |
| sequenced bases | 13 122 | 13 395 | 16 038 | 15 312 | |
| point mutations | 76 | 204 | 184 | 155 | |
| Mutation Frequency (Mutations/kb) | | | | | |
| exhibited (d) | 5.8 (4.5; 7.2) | 15.2 (13.2; 17.4) | 11.4 (9.8; 13.2) | 10.1 (8.6; 11.8) | |
| expected (e) | 6.6 | 8 | 16 | 15 | |

(a) Using the conditions reported by Cadwell and Joyce.20

(b) Expected values calculated from the proportion of each nucleotide in the emKate gene, assuming an equal mutation probability in every position.

(c) In percentage.

(d) 95% confidence intervals are shown between brackets, assuming that the mutation frequency follows a Poisson distribution.22,59

(e) Expected mutation frequency of each approach based on a previous report,20 instruction manuals of commercial kits and the theoretical rates of Spiked Genes.

Mutation Distribution

The first point to highlight from the sequencing results was the homogeneous distribution of mutations generated along the entire gene (Figure 2). Using our Spiked Genes method, we did not observe accumulation or reduction of mutations at any location of the gene, even at the boundaries of the oligonucleotides used. Similarly even distributions were obtained for the libraries created by the three enzymatic mutagenesis methods tested, although, for the enzymatic methods, some bases were more prone to be replaced than others due to the nucleotide specificity of the DNA polymerases, which is dependent on the surrounding sequence.24

Figure 2. Experimental mutation distribution of the libraries constructed with the four methods examined. Each bar represents a point nucleotide substitution at a certain location of the gene from the initial ATG codon. The bar height represents the number of substitutions found in each position.

Mutation Rate

For our Spiked Genes method, the expected mutation rate was 15 changes/kb due to the 0.75% doping ratio per base per strand and therefore 1.5% per base-pair. The experimental results exhibited an average replacement rate of 10.1 point mutations per kb, which was lower than the expected value, as shown in Table 1. This difference between the theoretical and experimental values reflects a selective pressure during the PCR assembly process that favors pairing of complementary primers with the fewest mismatches. The difference between the theoretical and experimental mutation rates would probably decrease using less stringent annealing conditions during the PCR step.
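The rate arithmetic above, together with the Poisson confidence intervals mentioned in footnote d of Table 1, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis script: the counts are copied from Table 1, and the exact (Garwood-type) Poisson interval solved by bisection is an assumption about how such an interval may be computed, since the source only states that a Poisson distribution was assumed.

```python
import math

# Point mutations and sequenced bases per library, copied from Table 1.
LIBRARIES = {
    "classic epPCR": (76, 13122),
    "Diversify": (204, 13395),
    "GeneMorph II": (184, 16038),
    "Spiked Genes": (155, 15312),
}


def expected_rate_per_kb(dope_each: float = 0.0025) -> float:
    """Theoretical rate: three mutant bases per position on each of two strands."""
    per_strand = 3 * dope_each          # 0.75% per base per strand
    return 2 * per_strand * 1000        # 1.5% per base pair -> changes per kb


def mutation_frequency(mutations: int, bases: int) -> float:
    """Observed point mutations per kb of sequenced DNA."""
    return 1000 * mutations / bases


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), with terms evaluated in log space."""
    return sum(math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
               for i in range(k + 1))


def _solve_increasing(f, lo: float, hi: float, iterations: int = 100) -> float:
    """Bisection root finder for an increasing function f on [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def poisson_interval(count: int, alpha: float = 0.05):
    """Exact two-sided confidence interval for a Poisson mean given one count."""
    if count == 0:
        lower = 0.0
    else:
        lower = _solve_increasing(
            lambda lam: (1 - poisson_cdf(count - 1, lam)) - alpha / 2,
            0.0, float(count))
    upper = _solve_increasing(
        lambda lam: alpha / 2 - poisson_cdf(count, lam),
        float(count), count + 10 * math.sqrt(count) + 10)
    return lower, upper


if __name__ == "__main__":
    print(f"expected Spiked Genes rate: {expected_rate_per_kb():.1f} per kb")
    for name, (mutations, bases) in LIBRARIES.items():
        low, high = poisson_interval(mutations)
        print(f"{name}: {mutation_frequency(mutations, bases):.1f} per kb "
              f"({1000 * low / bases:.1f}; {1000 * high / bases:.1f})")
```

Small differences from the published intervals may arise from rounding or from a different interval construction.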

Because of the 1.0% experimental mutation rate, the mutagenesis of the 0.7 kb emKate gene (with our method) resulted in an average of seven nucleotide changes at the DNA level and four or five amino acid changes at the protein level, which is appropriate for performing experiments of directed evolution in most protein targets.40 However, if a lower mutagenic rate is desired for elucidating protein function–structure relationships, the doping ratio may be simply reduced during the oligonucleotide synthesis to achieve one or two amino acid changes per protein variant.
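To make the per-clone arithmetic explicit, the Python sketch below estimates how nucleotide changes would be spread across clones. It is a simplified model rather than a result from this study: it assumes that mutations are independent and Poisson distributed, and the helper `doping_for_target` is a hypothetical function that inverts the theoretical relationship between doping and mutation rate, ignoring the shortfall observed experimentally.

```python
import math


def expected_changes(gene_bp: int, dope_each: float) -> float:
    """Theoretical mean nucleotide changes per gene: three mutant bases
    per position on each of the two strands."""
    return gene_bp * 2 * 3 * dope_each


def doping_for_target(target_changes: float, gene_bp: int) -> float:
    """Hypothetical helper: per-base doping of each mutant nucleotide that
    would give `target_changes` per gene under the theoretical model."""
    return target_changes / (gene_bp * 2 * 3)


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k changes in a clone, assuming a Poisson model."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


if __name__ == "__main__":
    # Observed rate of about 1.0% over the 0.7 kb gene gives about 7 changes.
    observed_mean = 0.010 * 700
    for k in range(0, 15, 2):
        print(f"P({k:2d} changes) = {poisson_pmf(k, observed_mean):.4f}")
    # Doping that would theoretically give 2 nucleotide changes per gene.
    print(f"doping for 2 changes: {doping_for_target(2, 700):.5f}")
```

In practice the doping level would need to be adjusted empirically, because the observed rate in this study was lower than the theoretical one.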

Such fine control over the mutagenic rate of the Spiked Genes method is hardly achievable by the enzymatic mutagenesis approaches. In our experiment, the epPCR protocol reported in the literature produced, on average, 5.8 replacements per kb (0.8 replacements fewer than the expected 6.6 changes/kb). However, the range of mutagenic rates with the epPCR methods is very broad.24 Using similar mutagenic conditions, with different target genes and Taq DNA polymerases from different suppliers, we have observed mutation frequencies as high as 25 changes per kb (results not shown). Neither of the commercial epPCR mutagenesis kits tested achieved the predicted value, even when the mutagenesis conditions were set to reach the maximal mutation rate: the 11.4 mutations per kb obtained with the GeneMorph II kit was lower than the expected 16 mutations per kb, whereas the 15.2 mutations per kb obtained with the Diversify kit surpassed the expected 8 mutations per kb (Table 1).

Mutational Diversity

The mutational spectrum is the most important parameter to assess the quality of mutagenesis libraries. The ideal mutagenesis method should be able to replace any base in the gene with any of the other three bases at the same frequency. Mutational diversity can be examined by different parameters.41,42 One is the ratio of transitions (Ts) to transversions (Tv). Transition mutations are purine (G and A) to purine changes and pyrimidine (T and C) to pyrimidine changes, whereas transversions are changes from purine to pyrimidine and from pyrimidine to purine. In this regard, the theoretical transitions to transversions ratio is 0.5 because there are four possible transitions (A → G, G → A, C → T, and T → C) and eight possible transversions (A → C, C → A, T → A, A → T, C → G, G → C, T → G, and G → T). The analysis of the libraries constructed for this work showed that our Spiked Genes method, with a 0.61 Ts/Tv ratio, yielded the most diverse library. In contrast, the classical and Diversify epPCR methods produced less diverse libraries, with Ts/Tv ratios of 1.81 and 3.16, respectively. In other words, these epPCR libraries contained 64.4 and 76.0% transitions, although the expected value is 33.3%.
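The transition/transversion bookkeeping described above can be checked with a short Python sketch. It is illustrative and not taken from the original analysis: the percentages are copied from Table 1, and the key format (for example `"A>G,T>C"`) is a naming convention chosen here.

```python
from itertools import permutations

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")

# Percentage of each substitution class per library, copied from Table 1.
SPECTRA = {
    "classic epPCR": {"A>G,T>C": 46.0, "G>A,C>T": 18.4, "A>T,T>A": 26.3,
                      "A>C,T>G": 7.9, "G>C,C>G": 0.0, "G>T,C>A": 1.4},
    "Diversify": {"A>G,T>C": 69.2, "G>A,C>T": 6.8, "A>T,T>A": 17.1,
                  "A>C,T>G": 3.4, "G>C,C>G": 1.0, "G>T,C>A": 2.5},
    "GeneMorph II": {"A>G,T>C": 29.5, "G>A,C>T": 20.9, "A>T,T>A": 34.3,
                     "A>C,T>G": 4.4, "G>C,C>G": 4.3, "G>T,C>A": 6.6},
    "Spiked Genes": {"A>G,T>C": 17.7, "G>A,C>T": 20.4, "A>T,T>A": 13.6,
                     "A>C,T>G": 15.6, "G>C,C>G": 11.6, "G>T,C>A": 21.1},
}


def classify(ref: str, alt: str) -> str:
    """Return 'transition' for purine<->purine or pyrimidine<->pyrimidine
    changes and 'transversion' otherwise."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def theoretical_ts_tv() -> float:
    """Ts/Tv when all 12 possible substitutions are equally likely."""
    kinds = [classify(ref, alt) for ref, alt in permutations("ACGT", 2)]
    return kinds.count("transition") / kinds.count("transversion")


def ts_tv(spectrum: dict) -> float:
    """Ts/Tv ratio of a spectrum keyed like 'A>G,T>C' (first pair decides)."""
    ts = sum(v for k, v in spectrum.items() if classify(k[0], k[2]) == "transition")
    tv = sum(v for k, v in spectrum.items() if classify(k[0], k[2]) == "transversion")
    return ts / tv


if __name__ == "__main__":
    print(f"theoretical Ts/Tv: {theoretical_ts_tv():.2f}")
    for name, spectrum in SPECTRA.items():
        print(f"{name}: Ts/Tv = {ts_tv(spectrum):.2f}")
```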

Another way to evaluate the mutational diversity is by calculating the ratio of AT → GC to GC → AT transition mutations (AT → GC/GC → AT ratio), which would be 1 for a perfectly unbiased mutagenic method targeting a gene with the same proportion of AT and GC pairs. As observed with the Ts/Tv ratio, Spiked Genes yielded the most even library, producing an experimental AT → GC/GC → AT ratio of 0.88 versus the expected 0.91 value calculated for the nucleotide composition of the emKate gene (A 26.13%, T 21.45%, G 27.72%, C 24.66%) (Table 1). In contrast, the classical epPCR and Diversify methods yielded GC-enriched libraries.

Finally, mutational diversity can also be assessed by measuring the ratio between the frequency of mutating A’s and T’s versus the frequency of mutating G’s and C’s (AT → NN/GC → NN ratio). This ratio should also be 1 for an unbiased mutagenic method that targets a gene with equal concentrations of ATs and GCs. For the Spiked Genes method, we measured the AT → NN value as 46.90%, which was very close to the 47.58% expected value for nucleotide composition of the emKate gene, whereas the GC → NN experimental value was 53.10% as compared with the expected value of 52.38%.21,22,43 For the enzymatic methods, GeneMorph II produced less biased results than classical epPCR or the Diversify kit that generated a higher AT → GC bias.
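The expected values used in these comparisons follow directly from the nucleotide composition of the gene. The Python sketch below reproduces that calculation under the assumption stated in footnote b of Table 1, namely an equal mutation probability at every position and, as an additional assumption made here, an equal chance of each of the three possible replacement bases. The composition percentages are those quoted in the text and are used without renormalization.

```python
# Nucleotide composition of the emKate gene (percent), as quoted in the text.
COMPOSITION = {"A": 26.13, "T": 21.45, "G": 27.72, "C": 24.66}


def expected_spectrum(composition: dict) -> dict:
    """Expected share (percent) of each mutation class for an unbiased method
    in which every base is equally likely to become any of the other three."""
    at = composition["A"] + composition["T"]
    gc = composition["G"] + composition["C"]
    return {
        "A>N,T>N": at,
        "G>N,C>N": gc,
        "A>G,T>C": at / 3,
        "A>T,T>A": at / 3,
        "A>C,T>G": at / 3,
        "G>A,C>T": gc / 3,
        "G>C,C>G": gc / 3,
        "G>T,C>A": gc / 3,
        # Ratio of AT -> GC transitions to GC -> AT transitions.
        "AT>GC/GC>AT": at / gc,
        # One transition class and two transversion classes per base group.
        "Ts/Tv": (at / 3 + gc / 3) / (2 * at / 3 + 2 * gc / 3),
    }


if __name__ == "__main__":
    for label, value in expected_spectrum(COMPOSITION).items():
        print(f"{label:12s} {value:.3f}")
```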

We dissected our results to analyze the individual mutations found in the coding strand, normalizing each type of mutation with respect to the wild-type nucleotide content of the gene. Figure 3 shows that the Spiked Genes method yielded the broadest mutagenesis spectrum among all methods, whereas the enzymatic methods had a poor representation of the transversions C → G, G → T, G → C, T → G, and C → A, a result that is in concurrence with that reported by Alexander et al.44 for GeneMorph II. At the protein level, the nucleotide bias of enzymatic methods favors neutral amino acid replacements due to the protecting nature found in the degeneracy of the genetic code, leaving unexplored a large fraction of the sequence space. Again, the analysis of individual mutations reflected a clear advantage of the Spiked Genes method over that of the enzymatic-based approaches to produce more complete libraries of protein variants, as reported in the supplementary Table S2, wherein all of the amino acid replacements found in the four libraries were analyzed. Clearly, Spiked Genes created amino acid substitutions unachievable by the other methods.

Figure 3. Experimental/expected frequency ratios for each of the 12 possible nucleotide mutations. The dotted line indicates the ideal normalized value for every substitution.

Although nonbiased mutagenic libraries are mainly used to perform directed evolution studies, controlled-bias mutagenic libraries are occasionally preferred to favor certain amino acid mutations.36,45,46 Thus, it is important to have an adjustable method for setting specific transition/transversion bias to achieve the maximal coverage of desired amino acid substitutions. For this matter, the Spiked Genes method can be finely tuned to produce the desired mutation frequency and mutational bias by simply changing, during oligonucleotide synthesis, the proportion of the spiked bases. Moreover, if an important functional region is discovered after the first screening of the library, our method allows the mutational enrichment of this specific region by reusing some of the spiked oligonucleotides used in the initial assembly, which may be selectively incorporated by additional rounds of mutagenesis via overlapped PCR.47

Together, all of the presented results confirm that the Spiked Genes method is a superior mutagenesis approach to produce libraries with evenly distributed random point nucleotide substitutions and affords an accurate statistical description of the library composition, contrary to the libraries built by epPCR approaches.40 Unfortunately, our method also showed a higher deletion rate than the enzymatic methods. It is well known that synthetic oligonucleotides contain a high rate of single-nucleotide deletions, caused by inefficient coupling of the incoming nucleoside phosphoramidites followed by inefficient capping of the growing chains that failed to couple.48–50 The deletion rate can be reduced during synthesis by increasing the concentration of the nucleoside phosphoramidites to improve coupling yield at every nucleotide addition and by using a double capping step to improve blocking.51,52 Deletion mutations can also be reduced by postsynthesis purification of the oligonucleotides by size exclusion chromatography, high-performance liquid chromatography, or polyacrylamide gel electrophoresis, as has already been done to assemble complete virus genomes.53 These improvements will render higher quality libraries.

Mutant Analysis

After transforming Escherichia coli competent cells with only one-tenth of the ligation reaction, nonbiased random mutagenesis of the emKate gene using the Spiked Genes method produced 181 colored colonies out of 1995 colonies selected on kanamycin. From the analyzed colonies, 149, 16, 4, and 12 were red, green, yellow, and orange variants, respectively. As expected, one clone containing the wild-type emKate intact sequence emerged after sequencing the plasmid from 12 red colonies, whereas the rest of the red clones contained an average of 2 to 3 amino acid replacements and only 2 clones contained 6 and 8 changes. Most of the substitutions corresponded to neutral replacements.

The color variation observed in some of the mutant clones (Figure 4A) was the result of incomplete maturation of the DsRed-type chromophore present in the emKate protein.37 In some variants, the chromophore was trapped as a GFP-like immature intermediate (Figure S1). As a consequence, some variants showed an orange phenotype that corresponded to a mixture of two proteins: one that remained in the GFP-like state, with an absorbance peak of approximately 470–500 nm and another that reached the DsRed-like state, with an absorbance peak of approximately 580–600 nm. The simultaneous emission of green and red lights produced the orange phenotype. Proteins displaying slow and incomplete maturation of their chromophores have been extensively used as fluorescent timers,54 and perhaps some of our variants could also be used as timers.

Figure 4. emKate mutants generated by the Spiked Genes method. (A) E. coli colonies expressing wild-type GFP, a nonfluorescent GFP mutant containing the multiple mutations L64A/C65M/Y66G/G67V, wild-type emKate, and the representative fluorescent mutants found in the library. (B) Absorbance spectra of the mutants and reference proteins shown in panel A.

The different levels of protonation of the chromophore also influenced the observed phenotypes, as revealed by the absorption properties of the green and yellow variants (Figure 4B), which exhibited different ratios of the GFP-type chromophore in its ionized state at approximately 488 nm and its neutral state at approximately 382 nm. Although the absorbance properties of green and yellow variants were very similar, their phenotype was clearly different, as shown in the streaked colonies in Figure 4A. Because green variants were colored but nonfluorescent, perhaps their chromophores achieved a trans configuration around the double bond formed between the Cα and Cβ of Tyr64 (equivalent to Tyr66 in GFP) during the oxidation step, whereas the yellow variants achieved a cis configuration. It is well known that most nonfluorescent chromoproteins display a GFP-type chromophore in the trans configuration, whereas most fluorescent proteins display a chromophore with cis configuration.55,56

Here, it is important to mention that emKate’s chromophore is formed by post-translational modification of the amino acids Met63, Tyr64, and Gly65, whose equivalent positions in the reference GFP protein are Ser65, Tyr66, and Gly67. In this context, all amino acid changes that gave rise to drastic changes of phenotype compared with the red color of emKate were located in the vicinity of the chromophore, either on the α-helix holding the chromophore or at positions on the β-strands surrounding the chromophore, whose side chains face the interior of the β-barrel and are near the chromophore. For instance, the yellow phenotype of emKate-yellow1 was generated by the mutation S66G, adjacent to the chromophore moiety, whereas the green and orange phenotypes found in emKate-green1 and emKate-orange1 were generated, respectively, by mutations Q39H and A158V (Figure S2), located in β-strands 2 and 7.

These substitutions probably caused the loss of hydrogen bonds with the chromophore or crowded that region, disturbing the appropriate conformation of the GFP-like intermediate and hampering the second oxidation step, which would produce the red chromophore.57 Surprisingly, a green shift was also found in a variant containing the external mutation A142P, which presumably destabilized the β-barrel, causing loss of the hydrogen bond between the phenolate ion of the chromophore and Ser143. The alteration of the H-bond network results in the destabilization of the cis configuration and probably neutralization of the phenolate.58

Conclusions

In this study, we describe a new method to perform random mutagenesis that we call the Spiked Genes method, and we compared its efficiency against that of enzymatic epPCR mutagenesis methods. Using oligonucleotides with 0.75% mutant nucleotides at each position, we obtained a mutant library that (with respect to other methods) showed: first, a higher control of mutation rate; second, a wider and homogeneous mutation distribution along the entire gene; and third, a more even mutation diversity, with minimal mutation bias. Moreover, our method can be modulated to achieve the desired mutation and transition/transversion ratios. In this era of synthetic biology, the continuous drop in the price of oligonucleotides makes this random mutagenesis strategy a suitable option for the improvement of protein properties. The power of the strategy was manifested in the high phenotypic variations found in the library. We have created a rich library of emKate variant proteins that once characterized could potentially be used as reporters of cellular processes.

Methods

Oligonucleotide Synthesis

All oligonucleotides used for the gene library assembly, forward and reverse (shown in Table S1), were synthesized with a MerMade 192 DNA synthesizer (BioAutomation), using standard DMTr-protected phosphoramidites dABz, dCAc, dGdmf, and dT. However, contrary to conventional synthesis, each nucleotide was doped with 0.25% of each of the other three nucleotides, producing an overall contamination of 0.75% per base. After deprotection and analysis by denaturing polyacrylamide gel electrophoresis, all oligonucleotides were observed as a single band and were used for the subsequent experiments without further purification.
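The effect of this doping scheme on individual chains can be simulated with a short Python sketch. It is an idealized model and not a description of the synthesizer chemistry: it assumes that the four phosphoramidites couple with equal efficiency, that positions are independent, and that no deletions occur. The function name `synthesize_spiked` and the example template are hypothetical.

```python
import random

BASES = "ACGT"


def synthesize_spiked(template: str, dope_each: float = 0.0025, rng=None) -> str:
    """Return one simulated chain in which each position keeps the wild-type
    base with probability 1 - 3 * dope_each and otherwise receives one of
    the other three bases with equal probability."""
    rng = rng or random.Random()
    chain = []
    for base in template:
        if rng.random() < 3 * dope_each:
            # Doped position: pick one of the three non-wild-type bases.
            chain.append(rng.choice([b for b in BASES if b != base]))
        else:
            chain.append(base)
    return "".join(chain)


def count_substitutions(template: str, chain: str) -> int:
    """Number of positions at which the chain differs from the template."""
    return sum(1 for ref, alt in zip(template, chain) if ref != alt)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulation is repeatable
    template = "ATGGTGAGCGAGCTGATTAAGGAGAACATGCACATGAAGCTGTACATGGAGGGCACCGTG"  # made-up 60-nt example
    chains = [synthesize_spiked(template, rng=rng) for _ in range(10000)]
    mutated = sum(1 for chain in chains if chain != template)
    print(f"chains with at least one substitution: {mutated / len(chains):.3f}")
```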

Library of Synthetic Genes Encoding emKate

The library of synthetic genes encoding emKate (accession number EU383029, DNA Data Bank of Japan) was assembled in a single PCR step, as described in a previous report.37 Oligonucleotides in one strand were designed to overlap 30 nt with the oligonucleotides in the complementary strand. To improve gene expression, the least frequent E. coli codons were substituted with more frequently used codons. An overlapping extension was carried out using an equimolar pool of the 22 internal oligonucleotides as templates (mFw2 to mFw12 and mRv1 to mRv11) (Table S1). Three different equimolar concentrations of the internal oligonucleotides were tested (4, 8, and 16 nM), plus 400 nM of the outermost primers (mFw1 and mRv12), which, for cloning purposes, also included NdeI and XhoI restriction sites at the 5′ and 3′ ends of the synthetic genes, respectively. mRv12 also contained the information to introduce a polyhistidine tag −SGGSHHHHHH at the carboxy end of the variant proteins. The final 100 μL PCR mixture contained Vent DNA polymerase (New England BioLabs), buffer, MgSO4, and dNTPs, as recommended by the supplier. The PCR reaction was performed under the following conditions: 1 cycle: 94 °C for 3 min; 25 cycles: 94 °C for 1 min, 58 °C for 1 min, 72 °C for 1 min; 1 cycle: 72 °C for 5 min.
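The overlapping design described above can be illustrated with a Python sketch that tiles a sequence into alternating forward and reverse oligonucleotides. It is a simplified reconstruction and not the authors' design tool: it assumes that every oligonucleotide is exactly twice the 30-nt overlap (60 nt), which the text does not state, and the names `tile_gene` and `Oligo` are hypothetical. The real sequences, including restriction sites and the tag, are those in Table S1.

```python
from typing import List, NamedTuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


class Oligo(NamedTuple):
    name: str       # e.g. "Fw1" or "Rv12" (naming mirrors mFw/mRv in the text)
    strand: str     # "+" for forward, "-" for reverse
    start: int      # 0-based start on the top strand
    sequence: str   # written 5' to 3'


def tile_gene(gene: str, overlap: int = 30) -> List[Oligo]:
    """Split `gene` into alternating forward/reverse oligonucleotides so that
    each one overlaps its neighbours on the opposite strand by `overlap` nt.
    Assumes an oligonucleotide length of 2 * overlap."""
    oligo_len = 2 * overlap
    oligos = []
    for index, start in enumerate(range(0, len(gene) - overlap, overlap)):
        segment = gene[start:start + oligo_len]
        number = index // 2 + 1
        if index % 2 == 0:
            oligos.append(Oligo(f"Fw{number}", "+", start, segment))
        else:
            oligos.append(Oligo(f"Rv{number}", "-", start, reverse_complement(segment)))
    return oligos


if __name__ == "__main__":
    # Made-up 750-nt sequence standing in for the synthetic gene.
    demo_gene = "".join("ACGT"[(i * i + i // 7) % 4] for i in range(750))
    for oligo in tile_gene(demo_gene):
        print(oligo.name, oligo.strand, oligo.start, len(oligo.sequence))
```

With a 750-nt input and these assumptions, the tiling gives 12 forward and 12 reverse oligonucleotides, with the first forward and last reverse ones at the outermost positions, which mirrors the mFw1 to mFw12 and mRv1 to mRv12 naming used here.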

Both the gene library and the pJOQ plasmid were double-digested with the NdeI and XhoI restriction endonucleases for 12 h at 37 °C and purified from agarose gel, using the EZ-10 spin column PCR purification kit (Bio Basic Inc.). Next, 200 ng of the gene library was ligated into 200 ng of the vector in a 20 μL reaction at 16 °C for 20 h. Using electroporation, 2 μL of the ligation reaction was used to transform 100 μL of MC1061 E. coli cells. After recovering in 1 mL of LB media at 37 °C for 1 h, the transformed cells were grown at 37 °C for 20 h on LB-Kanamycin plates. After transformation, plasmids from several independent colonies (fluorescent and nonfluorescent) were isolated for DNA sequencing.

Enzymatic Mutagenesis

The wild-type gene emKate was subjected to three different epPCR-based mutagenesis approaches, two using commercial kits and one established in our laboratory. We followed the instructions to obtain the highest mutation rate recommended for two commercial kits: 16/kb for the GeneMorph II mutagenesis kit from Stratagene and 8/kb for the Diversify PCR Random Mutagenesis Kit from Clontech. As a third protocol, we evaluated the classical epPCR approach at the predicted mutation rate of 6.6/kb, as described by Cadwell and Joyce.20 All libraries generated from the three epPCR methods were cloned, selected, and evaluated, as previously described for the synthetic library.

Acknowledgments

Technical assistance by Eugenio López-Bustos, Santiago Becerra-Ramírez, Ana Yanci Alarcón, Leopoldo Güereca, and Humberto Flores-Soto is highly appreciated. The present research was financially supported by the Core Facility of Oligonucleotide Synthesis and DNA Sequencing of the Institute of Biotechnology, belonging to the National Autonomous University of Mexico.

Supporting Information Available

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acsomega.7b00508.

  • Spiked oligonucleotides synthesized to assemble the emKate gene via the Spiked Genes method. Each wild-type nucleotide was doped with 0.25% of each of the other nucleotides (Table S1); experimental amino acid replacements obtained with the four mutagenesis methods tested over the emKate gene (Table S2); the mechanism proposed for the post-translational maturation of the red chromophore found in emKate protein (Figure S1); alignment of sequences corresponding to the mutants streaked in Figure 4 (Figure S2) (PDF)

Author Contributions

§ E.C. and A.R.-S. contributed equally to this work and both must be considered as first authors.

The authors declare no competing financial interest.

References

  1. Neylon C. Chemical and Biochemical Strategies for the Randomization of Protein Encoding DNA Sequences: Library Construction Methods for Directed Evolution. Nucleic Acids Res. 2004, 32, 1448–1459. 10.1093/nar/gkh315.
  2. Packer M. S.; Liu D. R. Methods for the Directed Evolution of Proteins. Nat. Rev. Genet. 2015, 16, 379–394. 10.1038/nrg3927.
  3. Arnold F. H. Enzyme Engineering Reaches the Boiling Point. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 2035–2036. 10.1073/pnas.95.5.2035.
  4. Torres-Salas P.; Mate D. M.; Ghazi I.; Plou F. J.; Ballesteros A. O.; Alcalde M. Widening the pH Activity Profile of a Fungal Laccase by Directed Evolution. ChemBioChem 2013, 14, 934–937. 10.1002/cbic.201300102.
  5. Bai W.; Cao Y.; Liu J.; Wang Q.; Jia Z. Improvement of Alkalophilicity of an Alkaline Xylanase Xyn11A-LC from Bacillus sp. SN5 by Random Mutation and Glu135 Saturation Mutagenesis. BMC Biotechnol. 2016, 16, 77. 10.1186/s12896-016-0310-9.
  6. Moore J. C.; Arnold F. H. Directed Evolution of a Para-Nitrobenzyl Esterase for Aqueous-Organic Solvents. Nat. Biotechnol. 1996, 14, 458–467. 10.1038/nbt0496-458.
  7. Kumar R.; Singh R.; Kaura J. Characterization and Molecular Modelling of an Engineered Organic Solvent Tolerant, Thermostable Lipase with Enhanced Enzyme Activity. J. Mol. Catal. B: Enzym. 2013, 97, 243–251. 10.1016/j.molcatb.2013.09.001.
  8. Cheng Q.; Gao H.; Hu N. A Trehalase from Zunongwangia sp.: Characterization and Improving Catalytic Efficiency by Directed Evolution. BMC Biotechnol. 2016, 16, 9. 10.1186/s12896-016-0239-z.
  9. Axarli I.; Muleta A. W.; Chronopoulou E. G.; Papageorgiou A. C.; Labrou N. E. Directed Evolution of Glutathione Transferases towards a Selective Glutathione-Binding Site and Improved Oxidative Stability. Biochim. Biophys. Acta, Gen. Subj. 2017, 1861, 3416–3428. 10.1016/j.bbagen.2016.09.004.
  10. Dröge M. J.; Boersma Y. L.; van Pouderoyen G.; Vrenken T. E.; Rüggeberg C. J.; Reetz M. T.; Dijkstra B. W.; Quax W. J. Directed Evolution of Bacillus subtilis lipase A by Use of Enantiomeric Phosphonate Inhibitors: Crystal structures and phage display selection. ChemBioChem 2006, 7, 149–157. 10.1002/cbic.200500308.
  11. Campeotto I.; Bolt A. H.; Harman T. A.; Dennis C.; Trinh C. H.; Phillips S. E. V.; Nelson A.; Pearson A. R.; Berry A. Structural Insights into Substrate Specificity in Variants of N-Acetylneuraminic Acid Lyase Produced by Directed Evolution. J. Mol. Biol. 2010, 404, 56–69. 10.1016/j.jmb.2010.08.008.
  12. Campbell R. E.; Tour O.; Palmer A. E.; Steinbach P. A.; Baird G. S.; Zacharias D. A.; Tsien R. Y. A Monomeric Red Fluorescent Protein. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 7877–7882. 10.1073/pnas.082243699.
  13. Ai H. W.; Henderson J. N.; Remington S. J.; Campbell R. E. Directed Evolution of a Monomeric, Bright and Photostable Version of Clavularia Cyan Fluorescent Protein: Structural Characterization and Applications in Fluorescence Imaging. Biochem. J. 2006, 400, 531–540. 10.1042/BJ20060874.
  14. Shaner N. C.; Campbell R. E.; Steinbach P. A.; Giepmans B. N. G.; Palmer A. E.; Tsien R. Y. Improved Monomeric Red, Orange and Yellow Fluorescent Proteins Derived from Discosoma sp Red Fluorescent Protein. Nat. Biotechnol. 2004, 22, 1567–1572. 10.1038/nbt1037.
  15. Subach O. M.; Gundorov I. S.; Yoshimura M.; Subach F. V.; Zhang J.; Gruenwald D.; Souslova E. A.; Chudakov D. M.; Verkhusha V. V. Conversion of Red Fluorescent Protein into a Bright Blue Probe. Chem. Biol. 2008, 15, 1116–1124. 10.1016/j.chembiol.2008.08.006.
  16. Lutz S.; Patrick W. M. Novel Methods for Directed Evolution of Enzymes: Quality, not Quantity. Curr. Opin. Biotechnol. 2004, 15, 291–297. 10.1016/j.copbio.2004.05.004.
  17. Keohavong P.; Thilly W. G. Fidelity of DNA Polymerases in DNA Amplification. Proc. Natl. Acad. Sci. U.S.A. 1989, 86, 9253–9257. 10.1073/pnas.86.23.9253.
  18. Eckert K. A.; Kunkel T. A. High Fidelity DNA Synthesis by the Thermus Aquaticus DNA Polymerase. Nucleic Acids Res. 1990, 18, 3739–3744. 10.1093/nar/18.13.3739.
  19. Leung D. W.; Chen E.; Goeddel D. V. A Method for Random Mutagenesis of a Defined DNA Segment Using a Modified Polymerase Chain Reaction. Technique 1989, 1, 11–15.
  20. Cadwell R. C.; Joyce G. F. Randomization of Genes by PCR Mutagenesis. PCR Methods Appl. 1992, 2, 28–33. 10.1101/gr.2.1.28.
  21. Lin-Goerke J. L.; Robbins D. J.; Burczak J. D. PCR-Based Random Mutagenesis Using Manganese and Reduced dNTP Concentration. BioTechniques 1997, 23, 409–412.
  22. Shafikhani S.; Siegel R. A.; Ferrari E.; Schellenberger V. Generation of Large Libraries of Random Mutants in Bacillus subtilis by PCR-Based Plasmid Multimerization. BioTechniques 1997, 23, 304–310.
  23. Zaccolo M.; Williams D. M.; Brown D. M.; Gherardi E. An Approach to Random Mutagenesis of DNA Using Mixtures of Triphosphate Derivatives of Nucleoside Analogues. J. Mol. Biol. 1996, 255, 589–603. 10.1006/jmbi.1996.0049.
  24. Emond S.; Mondon P.; Pizzut-Serin S.; Douchy L.; Crozet F.; Bouayadi K.; Kharrat H.; Potocki-Veronese G.; Monsan P.; Remaud-Simeon M. A Novel Random Mutagenesis Approach Using Human Mutagenic DNA Polymerases to Generate Enzyme Variant Libraries. Protein Eng., Des. Sel. 2008, 21, 267–274. 10.1093/protein/gzn004.
  25. Instruction Manual. In GeneMorph II Random Mutagenesis Kit; Stratagene: 2009; pp 1–16.
  26. Zon G.; Gallo K. A.; Samson C. J.; Shao K. L.; Summers M. F.; Byrd R. A. Analytical Studies of ‘Mixed Sequence’ Oligodeoxyribonucleotides Synthesized by Competitive Coupling of Either methyl- or beta-cyanoethyl-N,N-diisopropylamino Phosphoramidite Reagents, Including 2′-Deoxyinosine. Nucleic Acids Res. 1985, 13, 8181–8196. 10.1093/nar/13.22.8181.
  27. Hutchison C. A.; Nordeen S. K.; Vogt K.; Edgell M. H. A Complete Library of Point Substitution Mutations in the Glucocorticoid Response Element of Mouse Mammary-Tumor Virus. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 710–714. 10.1073/pnas.83.3.710.
  28. Hill D. E.; Oliphant A. R.; Struhl K. Mutagenesis with Degenerate Oligonucleotides: an Efficient Method for Saturating a Defined DNA Region with Base Pair Substitutions. Methods Enzymol. 1987, 155, 558–568. 10.1016/0076-6879(87)55036-4.
  29. Dale S. J.; Belfield M.; Richardson T. C. Oligonucleotide-Directed Random Mutagenesis Using a High-Efficiency Procedure. Methods 1991, 3, 145–153. 10.1016/S1046-2023(05)80167-7.
  30. Lanio T.; Jeltsch A. PCR-Based Random Mutagenesis Method Using Spiked Oligonucleotides to Randomize Selected Parts of a Gene without any Wild-Type Background. BioTechniques 1998, 25, 962–965.
  31. Hidalgo A.; Schließmann A.; Molina R.; Hermoso J.; Bornscheuer U. T. A One-Pot, Simple Methodology for Cassette Randomisation and Recombination for Focused Directed Evolution. Protein Eng., Des. Sel. 2008, 21, 567–576. 10.1093/protein/gzn034.
  32. Jin P.; Kang Z.; Zhang J.; Zhang L.; Du G.; Chen J. Combinatorial Evolution of Enzymes and Synthetic Pathways Using One-Step PCR. ACS Synth. Biol. 2016, 5, 259–268. 10.1021/acssynbio.5b00240.
  33. Ness J. E.; Kim S.; Gottman A.; Pak R.; Krebber A.; Borchert T. V.; Govindarajan S.; Mundorff E. C.; Minshull J. Synthetic Shuffling Expands Functional Protein Diversity by Allowing Amino Acids to Recombine Independently. Nat. Biotechnol. 2002, 20, 1251–1255. 10.1038/nbt754.
  34. Zha D.; Eipper A.; Reetz M. T. Assembly of Designed Oligonucleotides as an Efficient Method for Gene Recombination: a New Tool in Directed Evolution. ChemBioChem 2003, 4, 34–39. 10.1002/cbic.200390011.
  35. del Rio G.; Osuna J.; Soberon X. Combinatorial Libraries of Proteins: Analysis of Efficiency of Mutagenesis Techniques. BioTechniques 1994, 17, 1132–1139.
  36. Wong T. S.; Zhurina D.; Schwaneberg U. The Diversity Challenge in Directed Protein Evolution. Comb. Chem. High Throughput Screening 2006, 9, 271–288. 10.2174/138620706776843192.
  37. Gaytán P.; Roldan-Salgado A. Elimination of Redundant and Stop Codons during the Chemical Synthesis of Degenerate Oligonucleotides. Combinatorial Testing on the Chromophore Region of the Red Fluorescent Protein mKate. ACS Synth. Biol. 2013, 2, 453–462. 10.1021/sb3001326.
  38. Pletnev S.; Shcherbo D.; Chudakov D. M.; Pletneva N.; Merzlyak E. M.; Wlodawer A.; Dauter Z.; Pletnev V. A Crystallographic Study of Bright Far-Red Fluorescent Protein mKate Reveals pH-Induced cis-trans Isomerization of the Chromophore. J. Biol. Chem. 2008, 283, 28980–28987. 10.1074/jbc.M800599200.
  39. Shcherbo D.; Merzlyak E. M.; Chepurnykh T. V.; Fradkov A. F.; Ermakova G. V.; Solovieva E. A.; Lukyanov K. A.; Bogdanova E. A.; Zaraisky A. G.; Lukyanov S.; Chudakov D. M. Bright Far-Red Fluorescent Protein for Whole-Body Imaging. Nat. Methods 2007, 4, 741–746. 10.1038/nmeth1083.
  40. Nov Y. Probabilistic Methods in Directed Evolution: Library Size, Mutation Rate, and Diversity. Methods Mol. Biol. 2014, 1179, 261–278. 10.1007/978-1-4939-1053-3_18.
  41. Verma R.; Wong T. S.; Schwaneberg U.; Roccatano D. The Mutagenesis Assistant Program. Methods Mol. Biol. 2014, 1179, 279–290. 10.1007/978-1-4939-1053-3_19.
  42. Copp J. N.; Hanson-Manful P.; Ackerley D. F.; Patrick W. M. Error-Prone PCR and Effective Generation of Gene Variant Libraries for Directed Evolution. Methods Mol. Biol. 2014, 1179, 3–22. 10.1007/978-1-4939-1053-3_1.
  43. Casson L. P.; Manser T. Evaluation of Loss and Change of Specificity Resulting from Random Mutagenesis of an Antibody VH Region. J. Immunol. 1995, 155, 5647–5654.
  44. Alexander D. L.; Lilly J.; Hernandez J.; Romsdahl J.; Troll C. J.; Camps M. Random Mutagenesis by Error-Prone Pol Plasmid Replication in Escherichia coli. Methods Mol. Biol. 2014, 1179, 31–44. 10.1007/978-1-4939-1053-3_3.
  45. Wong T. S.; Roccatano D.; Zacharias M.; Schwaneberg U. A Statistical Analysis of Random Mutagenesis Methods Used for Directed Protein Evolution. J. Mol. Biol. 2006, 355, 858–871. 10.1016/j.jmb.2005.10.082.
  46. Wong T. S.; Roccatano D.; Schwaneberg U. Are transversion mutations better? A Mutagenesis Assistant Program analysis on P450 BM-3 heme domain. Biotechnol. J. 2007, 2, 133–142. 10.1002/biot.200600201.
  47. Ho S. N.; Hunt H. D.; Horton R. M.; Pullen J. K.; Pease L. R. Site-Directed Mutagenesis by Overlap Extension Using the Polymerase Chain-Reaction. Gene 1989, 77, 51–59. 10.1016/0378-1119(89)90358-2.
  48. Ellington A.; Pollard J. D. Introduction to the Synthesis and Purification of Oligonucleotides. In Current Protocols in Nucleic Acid Chemistry; John Wiley & Sons, Inc., 2001.
  49. Young L.; Dong Q. Two-Step Total Gene Synthesis Method. Nucleic Acids Res. 2004, 32, e59. 10.1093/nar/gnh058.
  50. Caruthers M. H.; Barone A. D.; Beaucage S. L.; Dodds D. R.; Fisher E. F.; McBride L. J.; Matteucci M.; Stabinsky Z.; Tang J. Y. Chemical Synthesis of Deoxyoligonucleotides by the Phosphoramidite Method. Methods Enzymol. 1987, 154, 287–313.
  51. Gaytán P.; Contreras-Zambrano C.; Ortiz-Alvarado M.; Morales-Pablos A.; Yanez J. TrimerDimer: an Oligonucleotide-Based Saturation Mutagenesis Approach that Removes Redundant and Stop Codons. Nucleic Acids Res. 2009, 37, e125. 10.1093/nar/gkp602.
  52. Yu D.; Tang J. Y.; Iyer R. P.; Agrawal S. Diethoxy N, N-diisopropyl Phosphoramidite as an Improved Capping Reagent in the Synthesis of Oligonucleotides Using Phosphoramidite Chemistry. Tetrahedron Lett. 1994, 35, 8565–8568. 10.1016/S0040-4039(00)78437-1.
  53. Smith H. O.; Hutchison C. A. 3rd; Pfannkoch C.; Venter J. C. Generating a Synthetic Genome by Whole Genome Assembly: phiX174 Bacteriophage from Synthetic Oligonucleotides. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 15440–15445. 10.1073/pnas.2237126100.
  54. Terskikh A.; Fradkov A.; Ermakova G.; Zaraisky A.; Tan P.; Kajava A. V.; Zhao X. N.; Lukyanov S.; Matz M.; Kim S.; Weissman I.; Siebert P. “Fluorescent Timer”: Protein that Changes Color with Time. Science 2000, 290, 1585–1588. 10.1126/science.290.5496.1585.
  55. Miyawaki A.; Shcherbakova D. M.; Verkhusha V. V. Red Fluorescent Proteins: Chromophore Formation and Cellular Applications. Curr. Opin. Struct. Biol. 2012, 22, 679–688. 10.1016/j.sbi.2012.09.002.
  56. Wiedenmann J.; Ivanchenko S.; Oswald F.; Schmitt F.; Röcker C.; Salih A.; Spindler K. D.; Nienhaus G. U. EosFP, a Fluorescent Marker Protein with UV-inducible Green-to-Red Fluorescence Conversion. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 15905–15910. 10.1073/pnas.0403668101.
  57. Subach F. V.; Malashkevich V. N.; Zencheck W. D.; Xiao H.; Filonov G. S.; Almo S. C.; Verkhusha V. V. Photoactivation Mechanism of PAmCherry Based on Crystal Structures of the Protein in the Dark and Fluorescent States. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 21097–21102. 10.1073/pnas.0909204106.
  58. Wang Q.; Byrnes L. J.; Shui B.; Röhrig U. F.; Singh A.; Chudakov D. M.; Lukyanov S.; Zipfel W. R.; Kotlikoff M. I.; Sondermann H. Molecular Mechanism of a Green-Shifted, pH-Dependent Red Fluorescent Protein mKate Variant. PLoS One 2011, 6, e23513. 10.1371/journal.pone.0023513.
  59. Brown L. D.; Cai T. T.; DasGupta A. Interval Estimation for a Binomial Proportion. Statistical Science 2001, 16, 101–133. 10.1214/ss/1009213286.

Articles from ACS Omega are provided here courtesy of American Chemical Society
