Abstract
Multiplexed gene expression optimization via modulation of gene translation efficiency through ribosome binding site (RBS) engineering is a valuable approach for optimizing artificial properties in bacteria, ranging from genetic circuits to production pathways. Established algorithms design smart RBS-libraries based on a single partially-degenerate sequence that efficiently samples the entire space of translation initiation rates. However, the sequence space that is accessible when integrating the library by CRISPR/Cas9-based genome editing is severely restricted by DNA mismatch repair (MMR) systems. MMR efficiency depends on the type and length of the mismatch and thus effectively removes potential library members from the pool. Rather than working in MMR-deficient strains, which accumulate off-target mutations, or depending on temporary MMR inactivation, which requires additional steps, we eliminate this limitation by developing a pre-selection rule of genome-library-optimized-sequences (GLOS) that enables introducing large functional diversity into MMR-proficient strains with sequences that are no longer subject to MMR-processing. We implement several GLOS-libraries in Escherichia coli and show that GLOS-libraries indeed retain diversity during genome editing and that such libraries can be used in complex genome editing operations such as concomitant deletions. We argue that this approach allows for stable and efficient fine tuning of chromosomal functions with minimal effort.
Introduction
Modifying ribosome binding site (RBS) strength is a popular and efficient approach to tune gene expression levels in prokaryotic systems1,2. Small changes in as few as 6 to 8 bp in a spatially well-defined region result in the up- or down-regulation of translation3. Since the translation initiation rate (TIR) of any RBS can be approximately predicted solely from its nucleotide sequence3, RBS engineering is a useful tool for rationally optimizing a broad range of artificial functions, such as the signalling characteristics of genetic circuits4 or pathway fluxes for production of valuable products5. Still, in many cases the required level of translation is not known nor is the prediction of the RBS strength exact. In addition, confounding factors such as metabolic burden, essentiality of involved genes, side product formation and toxicity may apply6, and predicted expression levels may not be met due to a lack of availability of free ribosomes7. Therefore, a library approach is often needed to optimize the TIR for each of the involved genes. As full randomization of even a short part of the RBS produces a set of sequences that is too large to evaluate for a single gene, let alone for gene combinations, and in addition is heavily biased towards non-functional or weak RBSs8, a number of tools exist that use the mentioned approximate prediction of RBS strengths to design smaller libraries with a high level of functional sequences whose predicted TIRs still span the entire accessible range. They include the “RBS library calculator”5,9, the “MAGE Oligo Design Tool” (MODEST)10, the “Empiric Model and Oligos for Protein Expression Changes” (EMOPEC)11, and “Reduced Libraries” (RedLibs)8.
In many cases, in particular those in which genetic stability of engineered strains is of importance, such as for industrial biotechnology12,13, the relevant functions that need tuning are located on the chromosome, presenting a considerable practical obstacle to efficient engineering14. However, the development of multiplex automated genome engineering (MAGE), by which small changes in the prokaryotic genome can be efficiently introduced using the lambda red-supported exploitation of single stranded DNA oligonucleotides as fake Okazaki fragments2,15, has allowed extending RBS engineering to chromosomal genes, including large scale multiplexing2,5,16,17. However, the mechanism of action of this targeted mutagenesis method inherently leads to effects that depend on the sequence of the mutagenic oligonucleotide: the target cell counters mutagenesis by removing mismatches during replication using its MMR enzyme MutS, and the efficiency of this process depends on the length and nature of the mismatch2,15,18–20. Consequently, the members of libraries of mutagenic oligonucleotides will meet with quite different fates upon transformation of the target cell. The effect can be avoided in MMR-deficient (MMR−) strains, but this results in an increased overall mutation rate (approximately four unwanted point mutations in the genome per MAGE cycle21–24 (see also Supplementary Fig. S1)) and can undermine selection (see below). The MMR system can be temporarily inactivated22–24, thus limiting the potential for undesired mutations, but this adds additional complexity to the process.
Here, we develop an alternative protocol for genome engineering that allows ignoring the MMR system altogether and thus [i] is optimally suited for manipulations in practically relevant MMR+ strains and [ii] eliminates the potential for sequence-based bias from library strategies. Specifically, MutS does not recognize insertions or mismatches that are greater than 5 bp25. Therefore, using oligonucleotides that introduce the desired mutation(s) on an (at least) 6 bp-long mismatch should eliminate sequence bias. Practically speaking, this rule, which we term the GLOS rule (for genome library optimized sequences) can be easily implemented as an additional requirement for available oligonucleotide selection algorithms, in our case for RBS engineering using the RedLibs algorithm8 (Fig. 1).
For practical implementation, we combine our MAGE-based library designs with CRISPR/Cas9-based counter-selection (CRISPR-optimized MAGE or CRMAGE26), which has been shown to deliver excellent allelic replacement (AR) efficiencies (more than 95% and 60% for inserting 1 and 6 bp mismatches, respectively26). The GLOS strategy fits well with this approach: by design, all oligonucleotides have the same mismatch length, so AR efficiency and gRNA/target binding should be similar, avoiding additional sources of bias (Supplementary Table S1).
Here, we demonstrate the scope of the GLOS approach with a particular focus on avoiding the introduction of sequence bias into library approaches. We validate the GLOS method by modulating the RBS of E. coli’s chromosomal lacZ gene, and then illustrate its practical utility by modulating a production pathway, specifically the production pathway for the industrially produced vitamin riboflavin (vitamin B2, Supplementary Fig. S2)27.
Results and Discussion
GLOS allows unbiased and efficient sampling of the functional space by oligonucleotide-directed mutagenesis in an MMR+ strain
While it was shown previously that both the length and nature of mismatches affect the frequency of repair in MMR+ strains2,15,18–20, we wanted to confirm that this indeed has a strong impact on library diversity. We chose translation from the chromosomal lacZ gene as a test case, as the effects on translation are easy to analyse and LacZ is not essential. Therefore, the AR frequency would not be strongly influenced by changes in expression level, as would be the case for essential proteins (e.g. because low expression levels can cause growth defects). Moreover, it was previously shown that lacZ can be modified with high AR efficiency28. This was important to exclude variations in AR efficiency due to the position on the genome10,29.
Next, we analysed the effect of an active MMR system on RBS engineering if GLOS was not applied. We randomized positions −10 to −15 relative to the start of E. coli’s lacZ open reading frame (ORF) with fully degenerate bases (“N6-library”) and obtained the predicted TIRs for each RBS sequence from an RBS calculator5,30. We used this dataset as input for the RedLibs algorithm to design an oligonucleotide encoding a smart 18-member lacZ RBS library with a TIR distribution as uniform as possible (“N6-RedLibs”, Fig. 2). This library was introduced into an MMR+ and an MMR− strain (EcNR1 and Ec−, respectively) and successful mutagenesis was selected for using CRMAGE (Fig. 2). Sanger-sequencing of 96 randomly selected clones obtained for the MMR− strain showed that in two independent experiments, AR efficiencies of at least 98% were achieved and 16 and 18 of the 18 library members, respectively, were recovered. In contrast, in the MMR+ strain an average AR efficiency of only 48% was achieved, and only 5 and 9 out of 18 possible sequences could be recovered from the two experiments. Clearly, the MMR system had substantially reduced the diversity of variants that could be sampled with a given number of analyses and substantially increased the number of wild-type clones, further reducing analysis effectiveness.
While an active MMR system clearly decreases library diversity, it might have beneficial effects with respect to other aspects of sequence integrity. Specifically, chemical oligonucleotide synthesis is error-prone and introduces indels on the genome via MAGE close to the target site23. We analysed indel frequency within the genomic sequence equivalent to mutagenic oligonucleotides for the 4 times 96 clones obtained before, and found indels in 16.5% of the MMR− clones versus only 7.5% for the MMR+ clones (Supplementary Fig. S3a). In other words, MMR+ strains have an improved capacity to remove erroneous oligonucleotides, but not sufficient to compensate for the loss in library diversity. In summary, library-based RBS genome editing in regular, MMR+ strains is severely hampered.
For comparison, we applied the GLOS rule to the N6-library (see above), which left us with three nucleotides instead of four at each position (Fig. 1). This results in 3^6 = 729 oligonucleotides with a 6 bp mismatch at the position of the RBS (“GLOS library”). This GLOS library was then further reduced using RedLibs to generate a single oligonucleotide encoding an 18-member library with TIRs again distributed as uniformly as possible. The resulting smart library was integrated into the genome of MMR+ EcNR1 by CRMAGE, again in two independent experiments. We observed an improvement of the AR efficiencies to over 98%, with 16 and 18 of the 18 library members incorporated (Fig. 2d). Clearly, the GLOS rule allows maintaining library diversity and analysis effectiveness.
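The size of the GLOS library follows directly from the rule: three allowed bases at each of six positions. The following minimal sketch enumerates such a library and checks that every member carries a 6 bp mismatch. It is an illustration only, not the RedLibs implementation; the function names and the example wild-type stretch are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def glos_library(wild_type):
    """Return every sequence that differs from wild_type at every position."""
    wild_type = wild_type.upper()
    if not wild_type or any(base not in BASES for base in wild_type):
        raise ValueError("wild_type must be a non-empty string over A, C, G, T")
    # Per position, keep the three bases that are not the wild-type base.
    allowed = [[b for b in BASES if b != wt_base] for wt_base in wild_type]
    return ["".join(combo) for combo in product(*allowed)]


def mismatch_count(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    example_wt = "AGGAAA"  # hypothetical wild-type stretch, illustration only
    library = glos_library(example_wt)
    print(len(library))  # 3**6 = 729
    print(all(mismatch_count(s, example_wt) == 6 for s in library))  # True
```

In the workflow described here, a set like this would then be passed, together with predicted TIRs, to a library-reduction step such as RedLibs.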
Remaining deviations from ideal sequence distributions are partially due to biases in folding energy and chemical synthesis
When we inspected the recovered sequences in more detail, we observed that while we found most or even all of the sequences that were expected to be present in the library, the abundances were not uniformly distributed. We therefore investigated some potential factors that might affect the distribution (summarized in Supplementary Table S1), specifically under- or overrepresentation of a sequence within the oligonucleotide pool after chemical synthesis and different folding energies (∆G) of the oligonucleotides. To start with the latter, unstructured oligonucleotides (with higher folding energies) can hybridize to the target more easily and therefore show higher AR efficiencies20. Indeed, analysis of the ∆G values of all library members suggested that oligonucleotides with lower folding energy (−7.03 kcal mol−1 and −6.05 kcal mol−1) were not integrated as efficiently (0 to 5% abundance in the library) as oligonucleotides with higher folding energy (−4.33 kcal mol−1; 0 to 40% abundance) (Supplementary Fig. S4a). These results suggest that ∆G has an influence on the final distribution, and that a higher ∆G is indeed beneficial but not a sufficient requirement for a high AR efficiency.
We next examined whether the composition of the initial pool of chemically synthesized DNA oligonucleotides was a source of bias. One potential reason for such a bias could again be the ∆G of the oligonucleotide, since highly structured oligonucleotides are harder to synthesize. We analysed Illumina next-generation sequencing (NGS) data of the synthesized oligonucleotides to detect bias before genome editing. Even though we found up to a 3-fold difference in the oligonucleotide distribution after NGS, there was no correlation with ∆G (Supplementary Fig. S5a). Next, we compared the abundances after chemical synthesis with the Sanger-sequencing data of the strains obtained after genome editing (Supplementary Fig. S5b). We could not identify a strong correlation between sequence abundances before and after editing, but did note that the four sequences that we had not found in the four experiments with N6-RedLibs-treated MMR− and GLOS-RedLibs-treated MMR+ strains were underrepresented by 5% to 37% in the initial oligonucleotide pool (in an ideal uniform distribution pool of 18 oligonucleotides, each sequence would constitute 5.5% of the population; an underrepresentation of 50% of a particular oligonucleotide, for example, would therefore correspond to an overall abundance of 2.75%). Similarly, the sequences that were recovered most frequently (N6-RedLibs library and MMR− strain: TTTGAG; GLOS-RedLibs library and MMR+ strain: CTGGGA) were 79% and 45% overrepresented in the pool (Fig. 2c,e). Taken together, the composition of the oligonucleotide library after synthesis as well as oligonucleotide folding energies probably contributed to some extent to the absence or presence of sequences after genome editing.
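The under- and overrepresentation figures above are relative deviations from the uniform share of 1/18 (about 5.56%, quoted in the text as 5.5%). A small sketch of that arithmetic follows; the function names are hypothetical and the numbers in the usage lines are illustrative, not measured data.

```python
def uniform_share(n_members):
    """Share of each member in an ideal, uniformly distributed pool."""
    return 1.0 / n_members


def relative_representation(observed_share, n_members):
    """Signed deviation from the uniform share.

    -0.5 means 50% underrepresented, +0.79 means 79% overrepresented.
    """
    return observed_share / uniform_share(n_members) - 1.0


def share_from_relative(relative, n_members):
    """Inverse of relative_representation: absolute share in the pool."""
    return (1.0 + relative) * uniform_share(n_members)


if __name__ == "__main__":
    # A member that is 50% underrepresented in an 18-member pool.
    print(round(100 * share_from_relative(-0.5, 18), 2))  # 2.78 (percent)
    # A member that is 79% overrepresented in the same pool.
    print(round(100 * share_from_relative(0.79, 18), 2))  # 9.94 (percent)
```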
In conclusion, including GLOS in the design of a smart RBS library improves the AR efficiency after CRISPR/Cas9 counterselection from 48% to 98% in an MMR+ strain, and the library coverage from 39% to 94%. Looking at a sample of roughly 200 clones (for an 18-membered library), the entire functional space can be covered with reasonable certainty. Careful analysis suggests that deviations from an ideal distribution of sequences after genome editing can be linked to some extent to external factors (such as oligonucleotide selection and/or folding energy and synthesis).
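The coverage figures are the mean number of recovered members over the two experiments divided by the library size (5 and 9 of 18, and 16 and 18 of 18). The statement about roughly 200 clones can be illustrated with a simple sampling estimate. The sketch below assumes an ideal uniform library in which every clone carries a library member, which is a simplification: the pools observed here were skewed, and skew lowers the probability of seeing every member. Function names are hypothetical.

```python
from math import comb


def coverage(recovered_counts, library_size):
    """Mean fraction of library members recovered across experiments."""
    mean_recovered = sum(recovered_counts) / len(recovered_counts)
    return mean_recovered / library_size


def prob_full_coverage(library_size, n_clones):
    """P(every member is seen) for n_clones draws from a UNIFORM library.

    Inclusion-exclusion over the number k of members that are missed.
    """
    m = library_size
    return sum(
        (-1) ** k * comb(m, k) * ((m - k) / m) ** n_clones
        for k in range(m + 1)
    )


if __name__ == "__main__":
    print(round(100 * coverage([5, 9], 18)))    # 39 (percent), N6-RedLibs in MMR+
    print(round(100 * coverage([16, 18], 18)))  # 94 (percent), GLOS-RedLibs in MMR+
    print(prob_full_coverage(18, 200))          # close to 1 under the uniform assumption
```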
Application of GLOS to tuning the riboflavin pathway
To test the efficiency of the GLOS-RedLibs strategy on a pathway rather than a single gene, we chose riboflavin production in E. coli (Supplementary Fig. S2a). All genes are endogenous to E. coli, and the product is secreted and easily quantifiable by absorbance or fluorescence measurement in the supernatant31, which makes it a suitable model pathway. To identify bottlenecks in the riboflavin production pathway, we first analysed the riboflavin production levels when separately overexpressing ribA, ribB, ribC, ribD, and ribE from low-copy plasmids controlled by an inducible promoter (Supplementary Fig. S2b). In the case of ribB, the genomic negative feedback loop (riboregulator)32 is not included on the plasmid. Indeed, overexpression of ribA, ribB, or ribC led to 2-, 3- and 2-fold increased levels of secreted riboflavin, respectively. We chose ribA and ribB for further experiments.
Implementation of a GLOS-RedLibs RBS library for ribA
We designed a GLOS-RedLibs RBS library for ribA, targeting positions −11 to −16 relative to the start of the ribA ORF, and integrated it into an MMR+ strain (Fig. 3a). We sequenced 88 clones and observed over 98% AR efficiency with 15 of the 18 library members incorporated (Fig. 3b). As previously observed, we found a higher folding energy of the oligonucleotide to be beneficial for the AR efficiency (Supplementary Fig. S4c). In addition, comparison of sequence abundances in the oligonucleotide pool before and in the E. coli population after genome editing showed that two of the three missing sequences were already underrepresented before editing (40% and 52%, Fig. 3f). However, in contrast to the earlier experiments with lacZ, ribA is an essential gene, and therefore clones with a change in TIR might have growth advantages or disadvantages that could have contributed to a non-ideal library sequence distribution in the analysed population. To explore this further, we individually integrated the three missing sequences into the E. coli genome in front of ribA and determined riboflavin production levels and growth curves. Since all sequences could be incorporated and did not affect growth (Fig. 3e), we exclude differences in growth behaviour as a reason for the fact that they were missing before, and conclude that the imbalance of sequence distribution already before genome editing was the most important source of library bias.
The 88 clones of the library plus the parental clone were examined for riboflavin production (Fig. 3c). The levels of secreted riboflavin varied by a factor of 2.5. When we separately looked at the riboflavin production level of the three strains that were not included in the first library, their levels did not diverge from the overall library range (Fig. 3d). We chose the clone (AACAGA) with the highest riboflavin production (2.5-fold over parent) for further analysis. Notably, although our GLOS-RedLibs library has a reduced potential sequence space compared to an N6-RedLibs library (due to the initial reduction in candidate sequences by applying GLOS), we were able to achieve the expression level observed with plasmid-based overexpression (Supplementary Fig. S2).
Combination of deletion and library insertion
Next, we tested whether the efficient introduction of broad sequence variation by our GLOS method could be implemented in more complex editing processes, specifically if combined with introducing a deletion in the same step. Therefore, we designed an oligonucleotide to delete the FMN riboregulator (220 bp) in front of ribB and to concomitantly encode a GLOS-RedLibs RBS library targeting positions −5 to −11 relative to the start of the ribB ORF, and integrated this library into MMR+ EcNR1 (Fig. 4a). We sequenced 96 of the recovered clones, and observed an AR efficiency of 86% with 14 of 18 possible library members incorporated (Fig. 4b). Similar to lacZ and ribA, the sequences overrepresented in this set of strains had higher folding energies for the oligonucleotide (Supplementary Fig. S4d), and oligonucleotide sequencing showed that all sequences not found in the 96 sequenced clones (AGGAAG, AGGAGG, AGAAGC, AGGACG) had already been underrepresented (10% to 25%) in the oligonucleotide pool (Fig. 4d).
Screening for riboflavin production levels among the recovered strains, we found clones producing up to 2.6-fold more riboflavin than the wild type, indicating that the library nearly covered the riboflavin production range (1- to ~3-fold wild type) expected from the initial plasmid-based overexpression experiments. Production levels of the best producer (AGGACA) were verified in triplicate (Fig. 5a). We conclude that complex modifications such as library-based RBS modification with concomitant deletion are possible. However, since the AR efficiency drops in this case, more clones need to be screened to cover the library.
Combination of experimentally optimized RBSs leads to higher riboflavin production levels than the combination of RBSs that are predicted to show highest TIRs
Since ribA and ribB are both on the negative strand on replichore 1 and 2, respectively, their RBSs cannot be simultaneously mutated with high efficiency. Therefore, to further improve riboflavin production we combined the RBSs in front of ribA and ribB found in the respective best producer strains. We also constructed a strain that contained in front of ribA and ribB the two RBSs that were predicted to show the highest TIRs after GLOS-RedLibs library construction (GAGGGA for ribA and AGGAAC for ribB). We found that the strain containing the optimized RBSs showed a 9.3-fold increase in riboflavin production over wild type, while the strain containing the RBSs with the highest predicted TIRs showed only a 6.7-fold increase in production (Fig. 5a). Neither mutant exhibited a growth defect when compared to the wild type (Fig. 5b). Even though we realize that this does not constitute a full analysis, we note that combining RBSs that were experimentally optimized results in our case in a better productivity than combining RBSs that were predicted to have the highest TIRs. This confirms the added value of a library-based optimization approach compared to merely combining modifications that produce optimal results when looked at in isolation.
The engineered MMR+ strain has a strongly improved mutational profile
Finally, in order to confirm our original argument that the frequency of undesired mutations is much lower after MAGE-based genome editing in an MMR+ than in an MMR− strain, we re-constructed the different RBSs in front of ribA and ribB and the deletion of the riboregulator in front of ribB in an Ec− strain, in which we had previously inactivated the mutS gene in two additional MAGE cycles. This way, both strains had undergone the same number of CRMAGE cycles (amounting to eight transformations per strain). The MMR− strain had undergone in addition two transformations to inactivate mutS, which is a necessary prerequisite to inactivate the MMR system. Therefore, analysing the genomic sequence of the two strains for off-target mutations should accurately reflect the impact of engineering in an MMR− background. We performed Illumina-based genome sequencing on both strains and compared the sequences to the parental EcNR1 strain. Indeed, the MMR+ strain exhibited only 4 off-target mutations, while the MMR− strain exhibited 57 off-target mutations (Supplementary Fig. S1 and Supplementary Table S5).
Conclusions
The GLOS rule presented here enables fast and efficient construction of smart RBS libraries in MMR+ strains while reducing the number of off-target mutations and retaining a higher tendency to repair oligonucleotide-induced indels. Moreover, it was previously shown that an active MMR system leads to higher killing efficiency for the CRISPR/Cas9 counter-selection33. We show that complex modifications such as RBS library construction with simultaneous sequence deletion can be performed in a single step with high AR efficiencies, demonstrating the potential of this method.
In terms of drawbacks, it should be noted that, due to the requirement of a 6 bp mismatch, the covered TIR range might not be maximal, especially if the targeted Shine-Dalgarno sequence is already similar to the optimal consensus sequence and therefore predicted to be very efficient. In such a case, the members of a GLOS-RBS library might suffer disproportionately from a 6 bp mismatch. However, this can be counteracted by slightly shifting the target region for randomization towards the start codon. Moreover, we believe that the maximal TIRs are unlikely to be required for most applications. Very high TIR values often do not lead to higher activities for various reasons, including metabolic burden or translation being no longer rate-limiting3. Furthermore, we argue that this limitation is a reasonable trade-off given that our method drastically reduces off-target mutations and thus might become valuable for many industrial production strains.
It should be mentioned that the recent combination of homologous recombination and CRISPR/Cas9 counterselection represents a different approach to apply library strategies to chromosomal targets34,35, which circumvents the issue of library bias due to an active MMR system altogether. However, it remains for now unclear which alternative biases are introduced by replacing oligonucleotide-directed mutagenesis with homologous recombination. In the present method, the different influences have been carefully characterized and it is clear that efficient library strategies can be easily implemented.
Finally, we demonstrated the functionality of this method in E. coli, but it could be applied to any bacterium that has a similar MMR system. We therefore conclude that GLOS is a helpful tool for many kinds of genomic RBS optimization applications.
Methods
Chemicals and media
Restriction enzymes, Taq polymerase, Q5 high-fidelity DNA polymerase and T4 DNA ligase were obtained from New England Biolabs (BioConcept AG, Allschwil, Switzerland) and used according to the manufacturer’s instructions. Chemicals were purchased in the highest purity available from Sigma-Aldrich (Buchs, Switzerland) or BD Bioscience (Allschwil, Switzerland). Low salt Difco Luria broth base, Miller (LB Miller) was used for all CRMAGE experiments, where required supplemented with chloramphenicol at 20 µg mL−1, kanamycin at 50 µg mL−1, streptomycin 50 µg mL−1, or carbenicillin at 50 µg mL−1 for antibiotic selection. Sucrose was added to a final concentration of 5 g L−1 for selection of loss of plasmid pSEVA431_SacB. Medium M9 GCA contained 1× M9 salts36, 10 mg L−1 thiamine, 2 mg L−1 biotin, 0.5 g L−1 glucose and 1 g L−1 casamino acids. Desalted oligonucleotides were purchased from Sigma-Aldrich (Haverhill, UK) or Microsynth (Balgach, Switzerland). MAGE oligonucleotides contained four phosphorothioated bases at the 5′-end.
Strains and plasmids
Strains and plasmids used in this study are shown in Supplementary Table S2. The gRNAs for the CRISPR/Cas9 system were designed and checked for off-target effects with the web tool COD (Cas9 Online Designer, cas9.wicp.net;37). They were inserted into the plasmid pCRISPR according to the Addgene protocol38. Briefly, pCRISPR_X plasmids were constructed by digestion of pCRISPR with BsaI followed by insertion of a double-stranded linker assembled from oligonucleotide pairs recruited from primers 11–14. Plasmid pSEVA431_SacB was constructed by restriction digest-based cloning. The sacB gene including a constitutive promoter was amplified from plasmid pKO3, and HindIII and EcoRI restriction sites were included by PCR (Q5 polymerase, primer 23 + 24, see Supplementary Table S3). The PCR was done as follows: step 1: 30 s at 95 °C; step 2: 10 s at 95 °C; step 3: 30 s at 50 °C; step 4: 90 s at 72 °C; repeat steps 2 to 4 25 times; step 5: 2 min at 72 °C and storage at 8 °C. The PCR fragment was purified, digested with EcoRI and HindIII, and ligated into equally digested pSEVA431. For the plasmids containing the riboflavin genes, the genome of E. coli MG1655 was used as a template for the genes encoding the biosynthetic pathway enzymes RibA, RibB, RibC, RibD, and RibE. Each of the rib genes was cloned into a separate operon as a transcriptional fusion with the gene for the red fluorescent protein mKate239 downstream of the rib gene. The operons had a common structure where the rib genes were fused to a 5′ region containing a cloning site (underlined), and the rib RBS sequence (bold) (TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATAAGCTT). Between the two genes, an intergenic region (TAATAAGCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATAAGCTT) containing the mkate2 RBS sequence (bold) was integrated. After the mkate2 gene, a 3′ region (TAGTAAGCTAGCTCGAATTC) containing a cloning site (underlined) was inserted.
To express the operons, we constructed vector pSEVA261_Ptet, which allows induction of gene expression via the tet-system in response to anhydrotetracycline (aTc). It was constructed by isolation of the Ptet promoter together with tetR from plasmid pAB92 by restriction digest with SpeI and EcoRI and ligation into plasmid pSEVA261, linearized with the same enzymes. Integration of the operons as ribX-mKate2 fragments into pSEVA261_Ptet produced plasmids pSEVA261_ribX, where X = A, B, C, D, or E.
In silico RBS library preparation
For all involved genes, the nucleotides of the RBS were identified and fully diversified (N6) for the N6-libraries. For the GLOS-libraries, a partially degenerate sequence was identified so that each sequence in the library is a 6-bp mismatch to the wild-type sequence. Effectively, this means that at each position we allowed all bases except the original one; for example, if the original base was an A, the randomization would include T, C and G at this position. Applying this rule at 6 consecutive positions ensured that a 6 bp mismatch was created for each library member. The fully degenerate sequence, and the partially degenerate sequence that was designed according to GLOS rules (see above), were submitted to the RBS Calculator version 1.0 in the “Predict: RBS Library” mode5 (salislab.net/software) in order to calculate TIR values. For each gene, this data set was then used as input for the RedLibs algorithm version 1.1 as described in the documentation provided with the algorithm’s script (Supplementary file S1; github.com/dgerngross/RedLibs). The RedLibs algorithm was set to generate reduced libraries of 18 sequences whose predicted TIRs distribute as uniformly as possible between the minimal and maximal value of the input library (Supplementary Table S4). The resulting reduced degenerate DNA sequences (one for each gene) were used to design MAGE oligonucleotides (oligonucleotide 1–8, Supplementary Table S3).
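The rule "all bases except the original one" maps onto the standard IUPAC three-base ambiguity codes (B = not A, D = not C, H = not G, V = not T). The sketch below derives the GLOS starting sequence for a given wild-type stretch and checks whether an arbitrary degenerate sequence, such as one proposed by a library-reduction tool, still satisfies the rule. It is not part of RedLibs or the RBS Calculator; the function names and the example sequences are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Code that allows every base except the given one.
NOT_BASE = {"A": "B", "C": "D", "G": "H", "T": "V"}


def glos_degenerate(wild_type):
    """Degenerate sequence that excludes the wild-type base at each position."""
    return "".join(NOT_BASE[base] for base in wild_type.upper())


def library_size(degenerate):
    """Number of distinct sequences encoded by a degenerate sequence."""
    size = 1
    for code in degenerate.upper():
        size *= len(IUPAC[code])
    return size


def satisfies_glos(degenerate, wild_type):
    """True if no encoded sequence matches wild_type at any position."""
    if len(degenerate) != len(wild_type):
        raise ValueError("sequences must have equal length")
    pairs = zip(degenerate.upper(), wild_type.upper())
    return all(wt_base not in IUPAC[code] for code, wt_base in pairs)


if __name__ == "__main__":
    wt = "AGGAAA"  # hypothetical wild-type stretch, illustration only
    start = glos_degenerate(wt)
    print(start, library_size(start))      # BHHBBB 729
    print(satisfies_glos("SHHTCT", wt))    # True; this example encodes 18 members
    print(satisfies_glos("NNNNNN", wt))    # False; N6 includes wild-type bases
```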
Chromosomal integration: CRMAGE
E. coli (strains EcNR1 or Ec−, each transformed with plasmid pCas9) were grown in 3 mL of LB Miller supplemented with chloramphenicol at 32 °C until an OD600 of approximately 0.6 was reached. Cells were heat-shocked for 15 min at 42 °C to induce production of the Red Beta protein. The induced cells were made electrocompetent by washing 3 times with 1 mL of ice-cold water. Then, each oligonucleotide was added to the cells to a final concentration in the cuvette of 2 µM and the cells were electroporated (1 mm gap cuvettes; Cell Projects, Harrietsham, United Kingdom; 1.8 kV and 4–6 ms pulse). Cells were recovered by addition of 3 mL of fresh LB Miller medium supplemented with chloramphenicol for one further CRISPR/Cas9-assisted MAGE cycle. For the second cycle, cells were electroporated with 2 µM oligonucleotide and 50 to 100 ng of the pCRISPR plasmid. Cells were incubated for recovery at 32 °C; after 1 h, kanamycin was added for selection of pCRISPR, and the incubation was continued overnight. On the next day, cells were plated on selective medium plates with kanamycin and chloramphenicol. Clones were analysed by colony PCR (Multiplex PCR kit; Qiagen, Hombrechtikon, Switzerland). AR efficiency is defined as the ratio of the number of mutants to the total number of clones. PCR primers (15–20) are listed in Supplementary Table S3. The PCR program was as follows: step 1: 15 min at 95 °C; step 2: 30 s at 95 °C; step 3: 30 s at 50 °C; step 4: 60 s at 72 °C; repeat steps 2 to 4 30 times; step 5: 10 min at 72 °C and storage at 8 °C. To perform the next round of CRMAGE, the strain was cured of pCRISPR by transformation with pSEVA431_SacB, which can be selected for by streptomycin. Due to incompatibility of the plasmid origins of replication, pCRISPR was lost during streptomycin selection. Afterward, the strain was grown without antibiotic, and cells that had lost pSEVA431_SacB were selected on medium containing sucrose. Loss of pSEVA431_SacB was confirmed by loss of the antibiotic resistance.
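The AR efficiency defined above, together with the count of recovered library members, can be computed from a list of target-site sequences read from individual clones. The following sketch assumes that a clone counts as a mutant whenever its target site differs from the wild type; this assumption, the function names and the example data are all hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter


def ar_efficiency(clone_sequences, wild_type):
    """Ratio of mutant clones to all analysed clones."""
    if not clone_sequences:
        raise ValueError("at least one clone is required")
    mutants = sum(seq != wild_type for seq in clone_sequences)
    return mutants / len(clone_sequences)


def recovered_members(clone_sequences, library):
    """Count how often each designed library member occurs among the clones."""
    members = set(library)
    return Counter(seq for seq in clone_sequences if seq in members)


if __name__ == "__main__":
    # Made-up example: 10 clones, wild type "AGGAAA", 3 designed members.
    wt = "AGGAAA"
    library = ["CTTCGC", "TCCGTT", "GAATCG"]
    clones = [wt] * 2 + ["CTTCGC"] * 5 + ["TCCGTT"] * 3
    print(ar_efficiency(clones, wt))                 # 0.8
    found = recovered_members(clones, library)
    print(len(found), "of", len(library), "members recovered")  # 2 of 3
```

Clones carrying indels would count as mutants under this definition without matching any library member, so the two numbers are best reported side by side.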
Sanger sequencing
A PCR on the genomic regions of interest was performed with Taq DNA polymerase (primer 17–22 see Supplementary Table S3). The PCR program had the following steps: step 1: 5 min at 95 °C; step 2: 30 s at 95 °C; step 3: 30 s at 50 °C; step 4: 30 s at 68 °C; repeat steps 2 to 4 28 times; step 5: 10 min at 68 °C and storage at 8 °C. PCR products were sequenced by GATC (Konstanz, Germany) or Microsynth (Balgach, Switzerland).
Next generation sequencing (NGS)
To analyse the off-target effects caused by the MMR− background, we sequenced the genomes of EcNR1, Ec−, Ec+ribAB and Ec−ribAB on an Illumina MiSeq platform (Illumina RTA Version: 1.18.54, Sequencer: GFB MiSeq, Run type: PE-250). Whole genome data were analysed with the software breseq (version 0.26.0)40. In order to investigate the composition of the oligonucleotide libraries, ssDNA was filled up by a primer extension reaction with one primer (primer 25–27 see Supplementary Table S3). Sequencing adapters were ligated PCR-free to the dsDNA fragments using the Hyper Prep Kit (KAPA Biosystems, Wilmington, US) and the fragments were sequenced by Illumina MiSeq as described above. Oligonucleotide sequencing data were analysed with in-house software for NGS data analysis (S. Schmitt, manuscript in preparation).
Riboflavin quantification
Strains were incubated in 96-deep well plates (System Duetz, EnzyScreen, Haarlem, Netherlands) for 24 h at 30 °C in M9 GCA medium. Riboflavin was quantified in the cell-free supernatant using an Infinite M1000 PRO microplate reader (TECAN, Maennedorf, Switzerland) measuring fluorescence (excitation wavelength 440 nm and emission wavelength 535 nm). Additionally, the OD600 of the culture was measured.
Growth curves
Growth curves were analysed in 1 mL LB medium at 30 °C and shaking at 800 rpm using a BioLector (m2p-labs GmbH, Baesweiler, Germany) device in a 48-well FlowerPlate (m2p-labs GmbH, Baesweiler, Germany).
Data availability statement
All data generated or analysed during this study are included in this published article (and its Supplementary Information files). The algorithm’s script used in this published article is provided at github.com/dgerngross/RedLibs.
Acknowledgements
We thank George Church for providing strain EcNR1 (Addgene strain #26930), Luciano Marraffini for pCas9 (Addgene plasmid #42876) and pCRISPR (Addgene plasmid #42875), Victor de Lorenzo for pSEVA431, Andreas Bosshart for pAB92 and Diana Ostos Rangel for the pSEVA261_ribX plasmids. We thank Lukas A. Widmer for critical reviewing of the RedLibs v1.1.0 script and this manuscript. We thank Irene Wüthrich, Tsvetan Kardashliev and Markus Jeschek for critical reading of the manuscript and scientific discussions as well as Christian Beisel and the Genomics Facility Basel (D-BSSE, ETH Zurich) for NGS and Susana Posada Cespedes for her help with the NGS analysis. This work was supported by the Swiss National Science Foundation (project PROTSWITCH #310030_143645) and the NCCR Molecular Systems Engineering.
Author Contributions
S.O.E., D.G. and T.R. conceived the study. S.O.E. and T.R. designed and S.O.E. performed the experiments. S.O.E., T.R. and S.P. critically assessed the data, and S.P. oversaw the study. S.S. established the plasmid-based riboflavin production. S.O.E. and S.S. analysed the NGS data. D.G. developed the updated RedLibs algorithm. S.O.E., T.R. and S.P. wrote the manuscript. All authors read and approved the final manuscript.
Competing Interests
The authors declare that they have no competing interests.
Footnotes
Electronic supplementary material
Supplementary information accompanies this paper at 10.1038/s41598-017-12395-3.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Pfleger BF, Pitera DJ, Smolke CD, Keasling JD. Combinatorial engineering of intergenic regions in operons tunes expression of multiple genes. Nat. Biotechnol. 2006;24:1027–1032. doi: 10.1038/nbt1226.
- 2. Wang HH, et al. Programming cells by multiplex genome engineering and accelerated evolution. Nature. 2009;460:894–898. doi: 10.1038/nature08187.
- 3. Salis HM, Mirsky EA, Voigt CA. Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotechnol. 2009;27:946–950. doi: 10.1038/nbt.1568.
- 4. Fernandez-Rodriguez J, Moser F, Song M, Voigt CA. Engineering RGB color vision into Escherichia coli. Nat. Chem. Biol. 2017;13:706–708. doi: 10.1038/nchembio.2390.
- 5. Farasat I, et al. Efficient search, mapping, and optimization of multi-protein genetic systems in diverse bacteria. Mol. Syst. Biol. 2014;10:731–731. doi: 10.15252/msb.20134955.
- 6. Wu G, et al. Metabolic burden: Cornerstones in synthetic biology and metabolic engineering applications. Trends Biotechnol. 2016;34:652–664. doi: 10.1016/j.tibtech.2016.02.010.
- 7. Ceroni F, Algar R, Stan G-B, Ellis T. Quantifying cellular capacity identifies gene expression designs with reduced burden. Nat. Methods. 2015;12:1–8. doi: 10.1038/nmeth.3339.
- 8. Jeschek M, Gerngross D, Panke S. Rationally reduced libraries for combinatorial pathway optimization minimizing experimental effort. Nat. Commun. 2016;7:1–10. doi: 10.1038/ncomms11163.
- 9. Salis HM. The ribosome binding site calculator. Methods Enzymol. 2011;498:19–42. doi: 10.1016/B978-0-12-385120-8.00002-4.
- 10. Bonde MT, et al. Direct mutagenesis of thousands of genomic targets using microarray-derived oligonucleotides. ACS Synth. Biol. 2015;4:17–22. doi: 10.1021/sb5001565.
- 11. Bonde MT, et al. Predictable tuning of protein expression in bacteria. Nat. Methods. 2016;13:233–236. doi: 10.1038/nmeth.3727.
- 12. Waites, M. J., Morgan, N. L., Rockey, J. S. & Higton, G. Industrial microbiology: An introduction. International Journal of Food Microbiology 77, (2002).
- 13. Silva F, Queiroz JA, Domingues FC. Evaluating metabolic stress and plasmid stability in plasmid DNA production by Escherichia coli. Biotechnol. Adv. 2012;30:691–708. doi: 10.1016/j.biotechadv.2011.12.005.
- 14. Oesterle, S. et al. Toward Genome-Based Metabolic Engineering in Bacteria, in Submitted, vol. 101, Elsevier Ltd, doi:10.1016/bs.aambs.2017.07.001 (2017).
- 15. Sawitzke JA, et al. Recombineering: In vivo genetic engineering in E. coli, S. enterica, and beyond. Methods Enzymol. 2007;421:171–199. doi: 10.1016/S0076-6879(06)21015-2.
- 16. Wei T, Cheng B-Y, Liu J-Z. Genome engineering Escherichia coli for L-DOPA overproduction from glucose. Sci. Rep. 2016;6:30080. doi: 10.1038/srep30080.
- 17. Ng CY, Farasat I, Maranas CD, Salis HM. Rational design of a synthetic Entner-Doudoroff pathway for improved and controllable NADPH regeneration. Metab. Eng. 2015;29:86–96. doi: 10.1016/j.ymben.2015.03.001.
- 18. Wang HH, Xu G, Vonner AJ, Church G. Modified bases enable high-efficiency oligonucleotide-mediated allelic replacement via mismatch repair evasion. Nucleic Acids Res. 2011;39:7336–7347. doi: 10.1093/nar/gkr183.
- 19. Costantino N, Court DL. Enhanced levels of lambda Red-mediated recombinants in mismatch repair mutants. Proc. Natl. Acad. Sci. 2003;100:15748–15753. doi: 10.1073/pnas.2434959100.
- 20. Gallagher RR, Li Z, Lewis AO, Isaacs FJ. Rapid editing and evolution of bacterial genomes using libraries of synthetic DNA. Nat. Protoc. 2014;9:2301–2316. doi: 10.1038/nprot.2014.082.
- 21. Isaacs FJ, et al. Precise manipulation of chromosomes in vivo enables genome-wide codon replacement. Science. 2011;333:348–353. doi: 10.1126/science.1205822.
- 22. Nyerges Á, et al. Conditional DNA repair mutants enable highly precise genome engineering. Nucleic Acids Res. 2014;42:e62. doi: 10.1093/nar/gku105.
- 23. Lennen RM, et al. Transient overexpression of DNA adenine methylase enables efficient and mobile genome engineering with reduced off-target effects. Nucleic Acids Res. 2016;44:e36–e36. doi: 10.1093/nar/gkv1090.
- 24. Nyerges Á, et al. A highly precise and portable genome engineering method allows comparison of mutational effects across bacterial species. Proc. Natl. Acad. Sci. 2016;113:2502–2507. doi: 10.1073/pnas.1520040113.
- 25. Sawitzke JA, et al. Probing cellular processes with oligo-mediated recombination and using the knowledge gained to optimize recombineering. J. Mol. Biol. 2011;407:45–59. doi: 10.1016/j.jmb.2011.01.030.
- 26. Ronda C, Pedersen LE, Sommer MOA, Nielsen AT. CRMAGE: CRISPR optimized MAGE recombineering. Sci. Rep. 2016;6:19452. doi: 10.1038/srep19452.
- 27. Hohmann, H.-P. & Stahmann, K.-P. In Comprehensive Natural Products II 115–139 (Elsevier, 2010). doi: 10.1016/B978-008045382-8.00667-5.
- 28. Carr PA, et al. Enhanced multiplex genome engineering through co-operative oligonucleotide co-selection. Nucleic Acids Res. 2012;40:e132–e132. doi: 10.1093/nar/gks455.
- 29. Bassalo MC, et al. Rapid and efficient one-step metabolic pathway integration in E. coli. ACS Synth. Biol. 2016;5:561–568. doi: 10.1021/acssynbio.5b00187.
- 30. Espah Borujeni A, Channarasappa AS, Salis HM. Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream standby sites. Nucleic Acids Res. 2014;42:2646–2659. doi: 10.1093/nar/gkt1139.
- 31. Koziol J. Studies on flavins in organic solvents-I*. spectral characteristics of riboflavin, riboflavin tetrabutyrate and lumichrome. Photochem. Photobiol. 1966;5:41–54. doi: 10.1111/j.1751-1097.1966.tb05759.x.
- 32. Pedrolli D, et al. The ribB FMN riboswitch from Escherichia coli operates at the transcriptional and translational level and regulates riboflavin biosynthesis. FEBS J. 2015;282:3230–3242. doi: 10.1111/febs.13226.
- 33. Li Y, et al. Metabolic engineering of Escherichia coli using CRISPR–Cas9 meditated genome editing. Metab. Eng. 2015;31:13–21. doi: 10.1016/j.ymben.2015.06.006.
- 34. Garst AD, et al. Genome-wide mapping of mutations at single-nucleotide resolution for protein, metabolic and genome engineering. Nat. Biotechnol. 2016;35:1–14. doi: 10.1038/nbt.3718.
- 35. Liang L, et al. CRISPR EnAbled Trackable genome Engineering for isopropanol production in Escherichia coli. Metab. Eng. 2017;41:1–10. doi: 10.1016/j.ymben.2017.02.009.
- 36. Sambrook, J. F. & Russell, D. W. Molecular Cloning: A Laboratory Manual 3rd edition (Cold Spring Harbor Laboratory, 2001).
- 37. Guo D, et al. Online high-throughput mutagenesis designer using scoring matrix of sequence-specific endonucleases. J. Integr. Bioinform. 2015;12:1–24. doi: 10.1515/jib-2015-283.
- 38. Jiang W, Bikard D, Cox D, Zhang F, Marraffini LA. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat. Biotechnol. 2013;31:233–239. doi: 10.1038/nbt.2508.
- 39. Shcherbo D, et al. Far-red fluorescent tags for protein imaging in living tissues. Biochem. J. 2009;418:567–574. doi: 10.1042/BJ20081949.
- 40. Deatherage DE, Barrick JE. Identification of mutations in laboratory-evolved microbes from next-generation sequencing data using breseq. Methods Protoc. Methods Mol. Biol. 2014;1151:165–188. doi: 10.1007/978-1-4939-0554-6_12.
- 41. Biochemical Nomenclature and Related Documents. (Press, Portland, 1992).