Abstract

Synthetic biology uses genetically encoded devices and circuits to implement novel complex functions in living cells and organisms. A hallmark of these genetic circuits is the interaction among their individual parts, according to predefined rules, to process cellular information and produce a circuit output or response. As the number of individual components in a genetic circuit increases, so does the number of interactions needed to achieve the correct behavior, and hence, a greater need to fine-tune the levels of expression of each component. Transcriptional promoters play a key regulatory role in genetic circuits, as they influence the levels of RNA and proteins produced. In multicellular organisms, such as plants, they can also determine developmental, spatial, and tissue-specific patterns of gene expression. The 35S promoter from the Cauliflower Mosaic Virus (CaMV 35S) is widely used in plant biotechnology to direct high levels of gene expression in a variety of plant species. We produced a library of 21 variants of the CaMV 35S promoter by introducing all single nucleotide substitutions to the promoter’s TATA box sequence. We then characterized the activity of all variants in homozygous transgenic plants and showed that some of these variants have lower activity than the wild type in plants. These promoter variants could be used to fine-tune the behavior of synthetic genetic circuits in plants.
Keywords: plant promoter, CaMV 35S promoter, TATA box, genetic circuits, plant synthetic biology, control of gene expression
Introduction
Synthetic biology aims to apply well-established engineering concepts and approaches to genetically re-program biological systems with novel protein functions, expression patterns, responses, metabolic pathways, and other capabilities for applications in industries such as agriculture, manufacturing, and medicine.1 These novel, usually complex behaviors are enabled by genetic circuits assembled from various genetic parts that interact with each other according to predefined instructions to process cellular information. As the complexity of a synthetic genetic circuit increases, so does the probability of failure in its intended function, as small deviations in the behavior of its individual parts are compounded in the overall system. Therefore, efforts have been made to quantitatively define or characterize standard biological parts, which may be assembled according to simple rules and perform in predictable ways. The intention of these efforts has been to eliminate the need for researchers to build and test genetic parts individually and on an as-needed basis and, instead, provide a toolbox of genetic parts complete with data sheets that can be assembled into composite parts to create complex yet predictable genetic circuits.2
Significant progress has been made in the development of genetic parts and circuits for use in prokaryotic and yeast cells;3 however, similar advances in plant synthetic biology are lagging.4 Biological and physical features of plants, such as slow growth rates, long life cycles, and multicellularity, have imposed constraints and complicated the development of these tools.5 In addition, the site of integration of transgenes into the plant genome remains mostly random, resulting in often highly variable measurements of genetic part activity among independent transgenic plants and impacting the predictability of function.6 To overcome these challenges, it has been suggested that improvements to the methodology of building complex plant multigenic traits first need to be addressed through the implementation of logical and systematic engineering strategies tailored to plants.7 An important facet of this goal is the thorough quantitative definition of genetic parts that are used to build larger, predictable genetic circuits, which produce a predetermined amount of the desired product with no off-target or toxic effects on the plant cells. The availability of well-characterized parts allows establishment of the first stage of higher-order abstraction in line with the engineering approach synthetic biology aims for.5
All introduced genetic parts and circuits must function somewhat independently from the host cell regulation, a concept known as orthogonality.8 While computational software can aid in predicting cellular interactions and genetic regulation in multigene circuits, the process still requires much trial and error.9 Transcriptional promoters play an important role in the regulation of genetic circuits, as they control transcription initiation rates and therefore, influence the levels of gene expression. They determine whether the expression of a gene is constitutive, induced, or context-specific, and as such they are the fine-tuning knobs of many genetic circuits. Promoters typically contain many cis-regulatory sequences that affect their activity, and by their nature, they must interact with the host cell’s transcriptional machinery. Therefore, unintended interactions between the host’s transcription factors and an introduced synthetic promoter must be tested to ensure orthogonality.
In an attempt to standardize measurements of promoter activity in plants, we recently proposed that the 35S promoter from the Cauliflower Mosaic Virus (CaMV) be established as the standard reference promoter in plant synthetic biology, to streamline mathematical modeling of genetic circuits.10 This concept has been applied to prokaryotic systems, where the bacterial promoter part BBa_J23101 serves as the internal reference standard promoter against which all other promoter activities are measured.11 The use of this fully characterized genetic part has facilitated data collection that is universally understood. The viral CaMV 35S promoter is one of the most widely used heterologous promoters in plant biotechnology as it directs high transcription rates and constitutive expression across most plant tissues, with a few exceptions.12,13 The CaMV 35S promoter contains several regulatory domains, including a TATA box (sequence 5′-TATATAA-3′), located in the core promoter region and centered 27 bp upstream of the transcription start site. The TATA box is an important 7-base-pair cis-regulatory sequence of promoters, as it is recognized by the TATA-binding protein (TBP), an essential component of the eukaryotic TFIID general transcription factor complex.14 Mutagenesis studies in plants, HeLa cells, and yeast have shown that many single nucleotide mutations within the TATA box of promoters are tolerated and may even result in higher transcription rates, specifically, those that agree with the consensus TATA box sequence (5′-TATAWAW-3′, where W = A or T).
The exception to this rule is a T-to-A change at the third position, which has been shown to reduce transcription rates significantly.15,16 Overall, TATA box sequences with double substitutions have been shown to result in no detectable transcription, and single mutations at positions 2 through 5 cause a reduction in transcription rates, presumably because TBP cannot bind to the altered sequences or does not recognize them at all.17,18
With the goal of constructing plant-functional promoters with varying levels of transcriptional activity, here we generated a library of 21 promoter variants by introducing all possible single-nucleotide mutations in the TATA box of the CaMV 35S promoter. We then measured their transcriptional activities in homozygous transgenic Arabidopsis thaliana plants using firefly luciferase as a quantitative reporter and compared them to the promoter containing the wild-type TATA box sequence (Figure 1). We show that certain single mutations within the 7-bp TATA box sequence alter the CaMV 35S promoter activity significantly (e.g., T at position 5), whereas nucleotides at other positions (e.g., A at position 7) may be modified without significant loss of activity. Importantly, some of these variants result in low, but still measurable promoter activities in plants, an important property for fine-tuning the expression levels of individual genes in synthetic multigene circuits.
Figure 1.

Plasmid components. Each plasmid contained the Firefly luciferase gene driven by a CaMV 35S promoter (CaMV 35S) containing a single-nucleotide change in the TATA box, and a TNos terminator sequence (Fluc TU). The wild-type 7-bp TATA box sequence is indicated in blue font in the box above the CaMV 35S promoter, and single nucleotide changes to this sequence are indicated in red font, with V representing an A, C, or G nucleotide, and B representing a T, C, or G nucleotide. In addition, a Renilla luciferase gene driven by the Nopaline Synthase promoter (NOS) and a TNos terminator was kept constant in all plasmids (Rluc TU). Relative positions to the transcription start site (+1) are indicated above the CaMV 35S promoter.
Results and Discussion
Transient Gene Expression Assays in Protoplasts
The effect of TATA box single-nucleotide changes on the activity of the CaMV 35S promoter was initially determined by transient gene expression in Arabidopsis protoplasts and quantified with the Dual-Luciferase Reporter Assay System (Promega), which allows measurement of both firefly and Renilla luciferase activities in the same sample. Firefly luciferase expression was driven by TATA box variant CaMV 35S promoters, whereas the Nos promoter was used to control the expression of Renilla luciferase and remained constant in all plasmids (Figure 1). Renilla luciferase expression was used as an internal normalization factor to account for differences in plasmid copy numbers introduced into protoplasts. Normalized firefly luciferase expressed from the CaMV 35S promoter containing the wild-type TATA box sequence (i.e., 5′-TATATAA-3′) was used as a basis of comparison with the activity of the variant promoters. Transient expression in protoplasts provided preliminary insight into changes in gene expression driven by the variant CaMV 35S promoters and tested whether any of the single-nucleotide variants eliminated promoter activity entirely. We did not test two simultaneous nucleotide substitutions as these have been previously shown to abolish promoter activity.18,19
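The normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in this study: the function names and the luminometer readings are hypothetical, and it assumes that each sample is first expressed as a firefly/Renilla ratio and that the mean wild-type ratio is then set to 1.0.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Return the firefly/Renilla luminescence ratio for one sample."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def relative_to_wild_type(variant_samples, wild_type_samples):
    """Mean normalized variant activity, with the wild-type mean set to 1.0."""
    wt_mean = mean(normalized_activity(f, r) for f, r in wild_type_samples)
    var_mean = mean(normalized_activity(f, r) for f, r in variant_samples)
    return var_mean / wt_mean


# Made-up (firefly, Renilla) readings, for illustration only.
wild_type = [(1000.0, 100.0), (1200.0, 100.0), (800.0, 100.0)]
variant = [(250.0, 100.0), (300.0, 120.0), (200.0, 80.0)]

relative = relative_to_wild_type(variant, wild_type)
print(relative)  # 0.25, i.e., a 4-fold reduction relative to wild type
```

Dividing by the Renilla signal first means that a sample that simply received more plasmid scales both readings together and leaves the ratio unchanged.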
While transient protoplast assays can display large variability, Figure 2 shows that the A4T variant resulted in the highest promoter activity, with an average 1.2-fold higher promoter activity relative to the wild-type TATA box sequence. The T5A variant had the second highest activity, 1.1-fold higher, although neither change was considered statistically significant. As expected, many more single-nucleotide changes resulted in significantly lower promoter activity compared to the wild-type sequence. Notably, significant reductions in the CaMV 35S promoter activity were observed for the A4C, A4G, and T5C variants, with reductions of 7.7-, 5.6-, and 5.3-fold, respectively, relative to the wild-type sequence. No C or G substitutions resulted in higher promoter activity than the wild-type TATA box sequence, and there were no substitutions that eliminated promoter activity completely, though A4C resulted in only 13% of promoter activity relative to the wild-type TATA box sequence.
Figure 2.
Transient gene expression in protoplasts. Firefly luciferase activity was normalized to Renilla luciferase measurements, and the average normalized firefly luciferase activity of the CaMV 35S promoter containing the wild-type TATA box was set to 1.0 (labeled as WT). Each bar represents the average of three independent experiments (depicted as dots) and each bar color indicates groups of nucleotide changes to the same position. Each variant was tested in triplicate in three independent experiments. *P-value < 0.05, **P-value < 0.01. Error bars indicate standard error of the mean.
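As a quick check on how the fold reductions quoted above relate to percent-of-wild-type values, the conversion is simply the reciprocal. The sketch below uses the three fold reductions reported for the transient assays; the helper name is hypothetical.

```python
def fold_reduction_to_fraction(fold):
    """Convert an n-fold reduction into the remaining fraction of activity."""
    if fold <= 0:
        raise ValueError("fold reduction must be positive")
    return 1.0 / fold


# Fold reductions reported for the transient protoplast assays.
reported = [("A4C", 7.7), ("A4G", 5.6), ("T5C", 5.3)]

for name, fold in reported:
    fraction = fold_reduction_to_fraction(fold)
    print(f"{name}: {fraction:.0%} of wild-type activity")
# A4C: 13% of wild-type activity
# A4G: 18% of wild-type activity
# T5C: 19% of wild-type activity
```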
Stable Gene Expression in Transgenic Plants
Following initial transient expression assays in protoplasts, the same plasmids containing the variant TATA box in the CaMV 35S promoter were individually transformed into A. thaliana plants using the Agrobacterium-mediated floral dip method. Initial screenings of firefly luciferase expression using a NightOWL (Berthold) imaging system were conducted on several independent transgenic T1 plants from each variant, followed by the selection of transgenic lines containing a single T-DNA insertion based on segregation of the kanamycin resistance trait in subsequent generations. At least five independent, homozygous T3 transgenic lines for each promoter variant and the wild type were used for further analyses. Although the T-DNA insertion sites were not determined for these lines, we believe that by examining at least five independent lines, we can observe the general trends of these variants. Figure 3 shows the average luciferase activity for each of the CaMV 35S TATA box variants and wild-type sequence in protein extracts of 14-day-old seedlings (source data are available in Table S2).
Figure 3.
Average luciferase expression from CaMV 35S promoter TATA box variants in homozygous transgenic plants. Firefly luciferase activity was normalized to Renilla luciferase measurements. The average normalized firefly luciferase activity of the CaMV 35S promoter containing the wild-type TATA box sequence was set to 1.0 (labeled as WT). Each boxplot color indicates groups of nucleotide changes to the same position. Each boxplot represents the average luciferase activity of five independent transgenic lines (8 replicates per line). *P-value < 0.05, **P-value < 0.01.
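The text states that single-insertion lines were selected from segregation of the kanamycin resistance trait but does not describe the statistical criterion. One common way to frame such a screen, shown here purely as an assumption, is a chi-square goodness-of-fit test against the 3:1 resistant-to-sensitive ratio expected in the progeny of a plant hemizygous for a single dominant insertion. The seedling counts and function names below are hypothetical.

```python
# Chi-square critical value for P = 0.05 with 1 degree of freedom.
CRITICAL_VALUE_P05_DF1 = 3.841


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 ratio."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no seedlings scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def consistent_with_single_insertion(resistant, sensitive):
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_VALUE_P05_DF1


# Hypothetical counts of kanamycin-resistant and -sensitive seedlings.
print(consistent_with_single_insertion(76, 24))  # True
print(consistent_with_single_insertion(95, 5))   # False
```

A line that passes this test is only consistent with a single locus; it does not rule out tightly linked multiple insertions.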
Transient gene expression in leaf protoplasts has been extensively used for the characterization of gene function in plants, and although protoplast batch-to-batch variability has been reported, overall expression trends are maintained.20 Here, however, measured luciferase activity varied substantially between transient assays and stable transgenic plants. For example, the T1G variant showed slightly higher, albeit not significant, promoter activity than the wild-type sequence in stably transformed plants, whereas its activity was approximately 2-fold lower than the wild type in transient assays. Similarly, the A4T variant had the highest promoter activity in transient assays but one of the lowest activities in transgenic plants. One possible reason for these discrepancies is the presence of different cell types in the two assays. Protoplasts were prepared only from leaves and the measured promoter activities reflect function in this tissue, whereas whole seedlings were used for promoter activity measurements in transgenic plants, which include additional tissues such as roots and hypocotyls.
Eight variants (T1G, A4C, A4G, T5A, A6G, A7C, A7G, and A7T) had promoter activity similar to or greater than (average > 0.80) the wild-type TATA box sequence in transgenic plants. The T5A variant resulted in the highest promoter activity, an average 2.89-fold higher than the wild-type sequence (P-value = 0.16); however, it also had the largest variation in measured activity. The T1G and A6G variants had slightly higher, although not significant, promoter activity compared with the wild-type TATA box sequence, with 1.46- and 1.29-fold higher promoter activity, respectively. The T1A variant showed the lowest promoter activity of all variants, with 3.83-fold lower promoter activity relative to the wild-type TATA box sequence (P-value = 0.009). The A4T and T5C variants had the second- and third-lowest promoter activity, with 3.32- and 2.83-fold lower activity, respectively (P-value = 0.0012 and 0.0022, respectively). The results show that changes in the second and third positions in the TATA box always reduce promoter activity regardless of nucleotide substitution, and changes in the seventh position do not reduce promoter activity significantly; all C substitutions lower promoter activity, except in the seventh position, where the change resulted in promoter activity similar to that of the wild-type TATA box sequence (1.0029-fold).
Previous research on single nucleotide changes to the TATA box of several different promoters has shown that some altered sequences can still initiate transcription efficiently and variations in the sequence are naturally found in the promoter regions of plant genes. For example, TATA box variant A4T, which results in the sequence 5′-TATTTAA-3′, is naturally found in the rice phenylalanine ammonia-lyase promoter and can efficiently initiate transcription in whole-cell extracts in rice.19 Here, variant A4T resulted in the highest promoter activity in transient transformations but had only 30% promoter activity of the wild-type TATA box in stable transgenic plants.
It has been suggested that A-to-T and T-to-A transversions in the TATA box sequence would result in transcription rates and gene expression similar to the consensus TATA box sequence because the two nucleotides have similar stereochemical properties in the minor groove.21 However, Mukumoto et al.18 observed that the T3A variant in the TC7 promoter had very low transcriptional efficiency with only 15% transcriptional activation relative to the consensus TATA box sequence. Additionally, through co-crystallization of the TATA-binding protein (TBP) with naturally occurring TATA elements, it has been determined that Arabidopsis TBP cannot co-crystallize with the T3A variant or a variant with all A substitutions except for the first position (i.e., 5′-TAAAAAAA-3′).22 In the present study, the T3A variant resulted in 61% promoter activity relative to the wild-type TATA box sequence in transgenic plants (Figure 3).
TBP has been shown to have a preference for binding flexible sequences (i.e., AT-rich) that are characterized by weaker stacking energies, allowing them to be bent.21 While TBP can bind to the altered sequences, it has been shown that transcription factors TFIIA and TFIIB cannot, which inhibits the formation of the pre-initiation complex (PIC) of RNA polymerase II and, in turn, transcription.23,24 TBP phenylalanine side chains interact with the first-, second-, and seventh-position nucleotides of the TATA box, which disrupts base-stacking interactions in the minor groove of DNA; this interaction causes the DNA to bend sharply.24 These side-chain interactions could explain why most nucleotide substitutions in the first and second positions of the CaMV 35S TATA box resulted in lower promoter activity than the wild-type TATA box sequence in both transient and stable transformations, with T1G being the only exception. Our results also suggest that the identity of the nucleotide in the seventh position may be more flexible, as changes in this position did not result in lower promoter activity. This flexibility is also reflected in the consensus TATA box sequence of TATAWAW, where W = A or T.
Previous co-crystallization experiments22 also showed that some C and G substitutions in the TATA box are not tolerated, namely, A2C, A2G, A4G, and T5C substitutions, as they were not able to initiate transcription or co-crystallize with TBP. Here, these substitutions resulted in measurable promoter activity in transgenic plants, with only A2G and T5C having significantly lower activity than the wild-type sequence (Figure 3).
The variability observed among independent transgenic lines containing the same variant promoter could be explained, at least in part, by the different sites of integration of the T-DNA during Agrobacterium transformation. Position effects, caused by host trans-acting mechanisms or T-DNA neighboring heterochromatic DNA, can cause gene silencing and are well documented in plants.25 This high variability is evident by observing a plot of the measured firefly luciferase activity (without normalization) in independent transgenic plants containing the same variant promoter (Figure S1). Nonetheless, we observed consistent changes in promoter activity in some variants relative to the wild-type sequence, and their relative transcription rates can serve as a starting point for fine-tuning gene expression in more complex genetic circuits in plants.
Characterization of CaMV 35S Promoter TATA Box Variants
With the goal of establishing CaMV 35S promoter TATA box variants that can be used to fine-tune gene expression in plants, we selected two of the variants, T5A and T5C, which showed higher and lower activity than the wild-type sequence, respectively, for further comparison with the wild-type TATA box sequence. Two independent, homozygous T3 plant lines from each sequence variant and the wild type were grown on agar plates for luciferase and transcriptional activity measurements at 7, 14, and 28 days after germination by luminescence imaging using the NightOWL imaging system, dual-luciferase assays, and quantitative RT-PCR (qRT-PCR).
Figure 4 shows that, in general, transcriptional activity from the variant promoters and luciferase measurements showed good correlation, i.e., plants with lower transcript levels had lower luciferase activity. Both T5C transgenic lines consistently displayed significantly lower luciferase transcript and protein activity levels compared to transgenic lines containing the wild-type TATA box sequence. This behavior was observed in all plants at all ages tested. Plants containing the CaMV 35S promoter with the T5A mutation displayed similar or higher luciferase activity than the wild-type sequence at 7 and 14 days; however, by 28 days, T5A plants had significantly reduced luciferase activity. Despite the similarities in luciferase activity between transgenic plants with the T5A and wild-type TATA box sequences at earlier plant ages, the transcriptional activity of the wild-type promoter was consistently higher than T5A, as assessed by qRT-PCR.
Figure 4.
Expression analysis of transgenic plants containing the T5A and T5C variants and wild-type sequence of the CaMV 35S promoter TATA box at different plant ages. (A) Bar graphs showing quantitative RT-PCR analysis of firefly luciferase expression in homozygous T3 transgenic lines containing the wild-type (gray; WT), T5C (green), and T5A (red) TATA box sequences. Numbers in parentheses indicate the transgenic line number. (B) Boxplots showing firefly luciferase activity of the same transgenic lines containing the T5A, T5C, and wild-type TATA box sequences. Firefly luciferase activity was normalized by the Renilla luciferase activity. Top panels represent 7-day-old plants (7d), middle panels represent 14-day-old plants (14d), and bottom panels represent 28-day-old plants (28d). Y-axis range in graphs depicting luciferase activity was adjusted for better data visualization. Each bar represents the mean of three replicates for qRT-PCR (depicted as dots) and each boxplot represents six replicates for luciferase activity measurements. Error bars in (A) represent the standard error of the mean. Different letters above bars indicate significant differences at P-value <0.05.
Despite a more or less constant accumulation of the firefly luciferase transcript at the three plant ages tested (Figure 4A), all six transgenic lines displayed the highest luciferase activity at 14 days, and the activity decreased in 28-day-old plants (Figure 4B). The decrease in firefly luciferase activity in 28-day-old plants could also be observed by whole-plant luminescence imaging of all six transgenic lines (Figure S2). The higher luciferase activity measured at 14 days compared to 7 and 28 days was unexpected and might be explained by either a decreased protein turnover rate or an increase in protein synthesis at 14 days. Although the turnover rate of endogenous proteins has been shown to change with leaf age in Arabidopsis,26 we do not know how this would affect an introduced protein such as firefly luciferase. In addition, we do not believe this increase in luciferase activity at 14 days is related to the plants’ vegetative-to-reproductive transition, as Arabidopsis plants grown under long-day conditions typically flower around 3 weeks after germination (21 days), and our plants had no visible bolts at 14 days. Finally, changes in the overall metabolism of the plants related to aging and the availability of cofactors (e.g., ATP) should not have affected our luciferase measurements, as these were performed in vitro using whole protein extracts and the Dual-Luciferase Reporter Assay System (Promega).
It is important to note that the consensus eukaryotic TATA box sequence, which includes plant TATA box sequences, has been determined to be 5′-TATAWAW-3′, where W = A or T. The variable positions in this consensus sequence correspond to the T5A (TATAAAA) and A7T (TATATAT) variants tested here, which, individually, showed the highest luciferase activity in stable transgenic plants. Interestingly, Arabidopsis TBP has been shown to be able to recognize the consensus eukaryotic TATA sequence effectively, and it was found to be able to replace human TBP in HeLa-derived in vitro transcription systems.18 Testing the T5A and A7T mutations simultaneously would show whether the double mutation results in promoter activity significantly higher than that of the native CaMV 35S promoter sequence and would provide more information on TATA box variants.
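The relationship between the consensus sequence and the variants discussed above can be checked with a short pattern match. This is only an illustrative sketch: the regular expression encodes 5′-TATAWAW-3′ with W = A or T as stated in the text, and the function and dictionary names are hypothetical.

```python
import re

# 5'-TATAWAW-3' with W = A or T, written as a regular expression.
CONSENSUS = re.compile(r"TATA[AT]A[AT]")


def matches_consensus(tata_box):
    """True if the 7-nt sequence matches the TATAWAW consensus exactly."""
    return CONSENSUS.fullmatch(tata_box) is not None


# Sequences taken from the text (wild type and selected variants).
sequences = {
    "WT": "TATATAA",
    "T5A": "TATAAAA",
    "A7T": "TATATAT",
    "T5C": "TATACAA",
    "A4T": "TATTTAA",
}

for name, seq in sequences.items():
    print(name, seq, matches_consensus(seq))
# WT TATATAA True
# T5A TATAAAA True
# A7T TATATAT True
# T5C TATACAA False
# A4T TATTTAA False
```

Of the sequences shown, only the wild type, T5A, and A7T fall within the consensus, which is in line with their comparatively high activity in stable transgenic plants.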
The Cauliflower Mosaic Virus (CaMV) has evolved to achieve high levels of transcriptional activity in plant cells as part of its infection process. The CaMV 35S promoter TATA box, with sequence 5′-TATATAA-3′, is recognized efficiently by the plant’s TATA-binding protein (TBP). Therefore, it is not surprising that single nucleotide changes in the CaMV 35S promoter generally resulted in lower promoter activity in both transient expression assays in plant protoplasts and in stable transgenic plants. None of the single nucleotide changes we tested here completely abolished the promoter activity, although a few of them resulted in less than half of the promoter activity achieved with the wild-type TATA box sequence. One of the TATA box variants, T5A, which was selected for showing higher activity than the wild-type sequence, albeit with high variability (Figure 3), did not result in higher transcription or luciferase activities in this focused study. At 14 days, the luciferase activity of the two selected T5A lines was similar to the WT. Among the possible explanations for this difference is that the results shown in Figure 3 are an average of five independent transgenic lines (8 replicates from each line), whereas Figure 4 depicts the average activity for each individual plant line. There was high variability among the five T5A transgenic lines originally tested, and this likely contributed to the higher average observed.
A library of promoters with varying activities in plants is useful to fine-tune the expression levels of a gene(s) of interest to achieve desired phenotypes. In the context of synthetic biology and metabolic engineering, these CaMV 35S variants can be used to drive the expression of genes that result in the production of metabolites, proteins, and chemicals, without negative effects on the host organism. Moreover, synthetic promoters with low transcriptional activity may be needed for fine-tuning genetic circuits of higher complexity; for example, a repressor protein, expressed constitutively at low levels in the cell, may help lower the background output activity in positive feedback circuits and prevent spurious activation of the feedback mechanism. Similar mutations in the promoter TATA box have been used to tune and control gene expression noise (or variability) in yeast promoters.27,28 The authors demonstrated that mutations that reduced the activity of the promoter controlling expression of a transcription factor allowed the noise in the expression of its target gene to be tuned. It will be interesting to see if a CaMV 35S TATA box variant with lower transcriptional activity, such as T5C, also has similar noise-tuning properties in plant cells. The TATA box sequence variants shown here for the CaMV 35S promoter might also be tested for similar changes in promoter strength in other promoters commonly used for transgene expression in plants, such as the Nopaline Synthase (NOS) and the Figwort Mosaic Virus 34S (FMV) promoters,29 and even be implemented in synthetic plant promoters, as the TATA box has been shown to be a modular core component of synthetic promoters, along with transcription factor binding sites.30,31
Materials and Methods
Plant and Growth Conditions
A. thaliana ecotype Col-0 was used for leaf protoplast isolation and transient expression assays, as well as for stable plant transformation using Agrobacterium tumefaciens. Seeds were surface-sterilized with 10% bleach, rinsed, and plated on half-strength Murashige and Skoog (MS) medium containing 1% sucrose and 0.8% plant agar, and then kept at 4 °C for 2 days before being transferred to light racks. Fourteen-day-old seedlings were transferred to 4-inch pots containing Pro-Mix BX substrate (Premier Tech) and grown in a Conviron ATC26 growth chamber at 22 °C, under short-day conditions (10 h light/14 h dark). After floral dip transformations, plants were grown under LED grow lights in racks at 22 °C, under long-day conditions (16 h light/8 h dark), until seed harvesting.
Plasmid Construction
A TATA box-containing fragment of the CaMV 35S promoter, flanked by EcoRV and SpeI restriction sites, was PCR amplified from the GB0030 plasmid in the GoldenBraid (GB) 2.0 kit (Addgene, Watertown, MA). Oligonucleotide primers were used in PCR reactions to introduce single nucleotide changes into every position of the TATA box (5′-TATATAA-3′) creating two different-sized fragments (105 and 155 bp), which were fused by overlapping extension PCR32 using Phusion High-Fidelity DNA Polymerase (Thermo Scientific). EcoRV and SpeI were then used to replace the original TATA box in the CaMV 35S promoter in GB0030 with each TATA box variant, and the single nucleotide mutations of each variant were confirmed by sequencing (GeneWiz).
Plasmids containing the CaMV 35S promoter with each of the single-nucleotide variant TATA boxes (Level 0) were used in one-pot GoldenBraid 2.0 multipartite reactions,33 according to GoldenBraid protocols generated online (https://gbcloning.upv.es/do/multipartite/), to assemble transcriptional units (TU = promoter + ORF + terminator). Level 1 firefly luciferase transcriptional units (Fluc TU) were assembled in the pDGB3_α1 destination plasmid (21 total), each containing a CaMV 35S promoter with a variant TATA box, firefly luciferase coding sequence (GB0096), and the Nos (Nopaline Synthase) terminator (GB0037). A single Level 1 Renilla luciferase (Rluc) TU was similarly assembled in the pDGB3_α2 destination plasmid and is composed of three genetic parts: the Nopaline Synthase (Nos) promoter (GB0072), Renilla luciferase coding sequence (GB0056), and Nos terminator. Successful multipartite assemblies for Fluc and Rluc TUs were confirmed by colony PCR and restriction digests, after which these Level 1 plasmids were used for bipartite assemblies, to combine the constant Rluc TU individually with all Fluc TUs containing the TATA box variant CaMV 35S promoters, into the pDGB1_Ω2 destination plasmid. Successful combination of both TUs was again confirmed by colony PCR and restriction digests, followed by a similar bipartite reaction with a plasmid containing the NptII TU (GB1181; kanamycin selection marker in plants) into the pDGB_α1 plasmid to generate the final binary plasmids (21 total) containing the Fluc (with TATA box variants), Rluc, and NptII TUs. All 21 confirmed plasmids were used in transient assays and stable plant transformations. Plasmid names indicate the TATA box modification: the original nucleotide, its position in the 7-nucleotide TATA box sequence, and the new nucleotide (e.g., T1A indicates a CaMV 35S promoter variant where the T nucleotide at position 1 of the TATA box was mutated to an A nucleotide) (Table S1).
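The naming scheme and the size of the library follow directly from the 7-nt TATA box: three alternative nucleotides at each of seven positions give 21 variants. The sketch below enumerates them; it is an illustration of the nomenclature only, and the function and variable names are hypothetical.

```python
WILD_TYPE = "TATATAA"  # CaMV 35S TATA box sequence given in the text
BASES = "ACGT"


def single_nucleotide_variants(seq):
    """Return {name: sequence} for every single-base substitution of seq.

    Names follow the scheme original base + 1-based position + new base,
    e.g. T1A for T at position 1 changed to A.
    """
    variants = {}
    for index, original in enumerate(seq):
        for new in BASES:
            if new == original:
                continue
            name = f"{original}{index + 1}{new}"
            variants[name] = seq[:index] + new + seq[index + 1:]
    return variants


library = single_nucleotide_variants(WILD_TYPE)
print(len(library))    # 21
print(library["T5A"])  # TATAAAA
print(library["A4T"])  # TATTTAA
```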
Transient Expression Assays
Transient expression assays of all 21 plasmids described above were conducted in Arabidopsis mesophyll protoplasts, essentially as previously described.34 Eight-week-old plants grown under short days were used for protoplast isolation, and each reaction contained approximately 30,000 protoplasts that were transfected with 15 μg of plasmid DNA in a 15 mL conical tube using 20% PEG (poly(ethylene glycol) 8000). After overnight incubation of the protoplasts in the dark with gentle agitation (40 rpm), luciferase activity was measured.
Luciferase Activity Measurements
Luminescence from firefly and Renilla luciferases was measured using the Dual-Luciferase Reporter Assay System (Promega). For transfected protoplasts, cells were lysed in Passive Lysis Buffer (PLB) for 30 min prior to luminescence measurements. For stably transformed plants, 2-week-old homozygous T3 plants grown on MS media agar plates were flash-frozen in liquid nitrogen, ground into a fine powder, resuspended in PLB, and incubated with gentle agitation for 30 min at room temperature. After centrifugation, 20 μL of sample was added to a black 96-well plate, followed by automated sequential dispensing of 50 μL of each substrate (LARII for Fluc; Stop & Glo for Rluc), and luminescence was measured using BioTek’s SynergyMX plate reader. Firefly luciferase expression in transgenic plants was also imaged with a CCD camera in a Berthold NightOWL LB 983 instrument after plants were sprayed with a luciferin substrate (200 μM; GoldBioTechnology) and dark-adapted for 10 min in a dark chamber. Acquisition parameters were as follows: exposure time of 0.1 s and illumination intensity of 10%, with cosmic suppression and flatfield correction enabled. Luminescence readings were taken at 10, 15, and 20 min after dark adaptation. Luminescence parameters were as follows: high gain, slow readout, 4 × 4 binning, and exposure times of 15 s after 10 min, 5 s after 15 min, and 1 s after 20 min.
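As a rough illustration of how dual-luciferase readings are commonly processed, the Python sketch below divides each Fluc reading by the Rluc reading from the same well and expresses each variant relative to the wild-type promoter. The exact normalization used in this study is not spelled out in this section, so the Fluc/Rluc ratio, the function names, and the example readings are all assumptions made for illustration.

```python
# Illustrative sketch (assumed workflow): normalize Fluc to the constant Rluc
# control from the same sample, then express variants relative to wild type.
from statistics import mean


def normalized_activity(fluc: float, rluc: float) -> float:
    """Fluc/Rluc ratio for one well (hypothetical helper)."""
    if rluc <= 0:
        raise ValueError("Rluc reading must be positive to normalize")
    return fluc / rluc


def relative_to_wild_type(samples: dict[str, list[tuple[float, float]]],
                          wild_type_key: str = "WT") -> dict[str, float]:
    """Mean Fluc/Rluc ratio of each construct divided by the wild-type mean."""
    ratios = {
        name: mean(normalized_activity(fluc, rluc) for fluc, rluc in reads)
        for name, reads in samples.items()
    }
    wild_type_ratio = ratios[wild_type_key]
    return {name: ratio / wild_type_ratio for name, ratio in ratios.items()}


# Made-up (Fluc, Rluc) luminescence readings, two replicates per construct.
example_reads = {
    "WT": [(1000.0, 100.0), (1200.0, 100.0)],
    "T1A": [(550.0, 100.0), (550.0, 100.0)],
}
relative = relative_to_wild_type(example_reads)
print(relative)
```

Normalizing to a co-delivered Rluc unit is intended to correct for differences in transfection efficiency or sample input between wells.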
Stable Plant Transformations
Plasmids containing the promoter variants were electroporated into A. tumefaciens GV3101 cells, and Arabidopsis plants were stably transformed using the Agrobacterium-mediated floral dip method.35,36 Transgenic plants were selected on MS agar media containing 60 mg/L kanamycin, and segregation of the kanamycin resistance trait was used to select homozygous plant lines for each of the TATA box variants.
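Segregation-based selection can be sketched as follows. The text states only that segregation of kanamycin resistance was used to identify homozygous lines; the chi-square test against a 3:1 resistant:sensitive ratio (expected for a single-locus insertion in the progeny of a hemizygous plant), the all-resistant criterion for homozygosity, and the seedling counts below are assumptions reflecting common practice, not the authors' stated procedure.

```python
# Illustrative sketch (assumed criteria): score kanamycin segregation data.
CRITICAL_CHI2_DF1_P05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Chi-square statistic against a 3:1 resistant:sensitive expectation."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no seedlings scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def fits_single_locus(resistant: int, sensitive: int) -> bool:
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_CHI2_DF1_P05


def looks_homozygous(resistant: int, sensitive: int) -> bool:
    """True if every scored seedling in the next generation is resistant."""
    return resistant > 0 and sensitive == 0


# Made-up seedling counts for illustration.
print(fits_single_locus(74, 26))   # True: close to 3:1
print(fits_single_locus(95, 5))    # False: far from 3:1
print(looks_homozygous(60, 0))     # True: no sensitive seedlings
```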
Quantitative RT-PCR
Whole plants were harvested at 7, 14, and 28 days post-germination. Total RNA was extracted using the Plant RNeasy Mini Kit (Qiagen) with on-column DNase I treatment, followed by cDNA synthesis using the One-Taq RT-PCR kit (New England Biolabs). Quantitative PCR reactions were carried out in a QuantStudio 3 (Applied Biosystems) using Luna Universal qPCR Master Mix (New England Biolabs) according to the manufacturer’s instructions. Firefly luciferase relative expression was calculated as described in Hellemans et al.37 using the geometric mean of three endogenous controls, TUBULIN β-9 CHAIN (AT4G20890), UBIQUITIN-CONJUGATING ENZYME 21 (AT5G25760), and ELONGATION FACTOR 1-α (AT5G60390), with primer efficiencies within the 90–110% range.
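The efficiency-corrected calculation can be sketched in Python as below, following the general form of the Hellemans et al. model: each gene's relative quantity is its amplification base raised to the Cq difference between a calibrator and the sample, and the target is divided by the geometric mean of the reference-gene quantities. The function names, the choice of calibrator, and the Cq values are hypothetical and are not taken from the study's data.

```python
# Illustrative sketch of efficiency-corrected relative quantification with
# multiple reference genes (general form of the qBase model; values made up).
from statistics import geometric_mean


def amplification_base(slope: float) -> float:
    """Amplification base E from a standard-curve slope (E = 2 means 100%)."""
    return 10 ** (-1.0 / slope)


def efficiency_percent(slope: float) -> float:
    """Primer efficiency in percent, (E - 1) * 100."""
    return (amplification_base(slope) - 1.0) * 100.0


def relative_quantity(base: float, cq_calibrator: float, cq_sample: float) -> float:
    """RQ = E ** (Cq_calibrator - Cq_sample) for one gene."""
    return base ** (cq_calibrator - cq_sample)


def normalized_relative_quantity(target_rq: float, reference_rqs: list[float]) -> float:
    """Target RQ divided by the geometric mean of the reference-gene RQs."""
    return target_rq / geometric_mean(reference_rqs)


# A slope of about -3.32 corresponds to roughly 100% efficiency.
print(round(efficiency_percent(-3.3219), 1))

# Hypothetical Cq values: target gene plus three reference genes.
target = relative_quantity(2.0, cq_calibrator=25.0, cq_sample=23.0)
references = [
    relative_quantity(2.0, 20.0, 19.0),
    relative_quantity(2.0, 22.0, 22.0),
    relative_quantity(2.0, 21.0, 22.0),
]
nrq = normalized_relative_quantity(target, references)
print(nrq)
```

Using the geometric rather than the arithmetic mean keeps a single strongly varying reference gene from dominating the normalization factor.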
Acknowledgments
The authors thank Trina Cupit for help growing all of the plants. This work was supported by start-up funds to M.S.A. by the BioDiscovery Institute and Department of Biological Sciences at the University of North Texas.
Glossary
Abbreviations Used
- CaMV: cauliflower mosaic virus
- NOS: nopaline synthase
- WT: wild type
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acssynbio.2c00457.
Plasmids used and corresponding mutations, source data for normalized firefly luciferase measurements, graph of non-normalized luciferase data, and images of luciferase expression in transgenic plants (PDF)
Author Contributions
M.S.A. designed the research; S.C.A. and S.S.F. performed experiments; S.C.A., S.S.F., and M.S.A. analyzed the data; S.C.A., S.S.F., and M.S.A. wrote the manuscript. All authors contributed to the article and approved the final manuscript.
The authors declare no competing financial interest.
References
- Endy D. Foundations for engineering biology. Nature 2005, 438, 449–453. 10.1038/nature04342.
- Canton B.; Labno A.; Endy D. Refinement and standardization of synthetic biological parts and devices. Nat. Biotechnol. 2008, 26, 787–793. 10.1038/nbt1413.
- McCarty N. S.; Ledesma-Amaro R. Synthetic Biology Tools to Engineer Microbial Communities for Biotechnology. Trends Biotechnol. 2019, 37, 181–197. 10.1016/j.tibtech.2018.11.002.
- Liu W.; Stewart C. N. Plant synthetic biology. Trends Plant Sci. 2015, 20, 309–317. 10.1016/j.tplants.2015.02.004.
- Boehm C. R.; Pollak B.; Purswani N.; Patron N.; Haseloff J. Synthetic botany. Cold Spring Harb. Perspect. Biol. 2017, 9, a023887. 10.1101/cshperspect.a023887.
- Gelvin S. B. Integration of Agrobacterium T-DNA into the Plant Genome. Annu. Rev. Genet. 2017, 51, 195–217. 10.1146/annurev-genet-120215-035320.
- Roell M. S.; Zurbriggen M. D. The impact of synthetic biology for future agriculture and nutrition. Curr. Opin. Biotechnol. 2020, 61, 102–109. 10.1016/j.copbio.2019.10.004.
- Costello A.; Badran A. H. Synthetic Biological Circuits within an Orthogonal Central Dogma. Trends Biotechnol. 2021, 39, 59–71. 10.1016/j.tibtech.2020.05.013.
- Brophy J. A. N.; Voigt C. A. Principles of genetic circuit design. Nat. Methods 2014, 11, 508–520. 10.1038/nmeth.2926.
- Amack S. C.; Antunes M. S. CaMV35S promoter – A plant biology and biotechnology workhorse in the era of synthetic biology. Curr. Plant Biol. 2020, 24, 100179. 10.1016/j.cpb.2020.100179.
- Kelly J. R.; Rubin A. J.; Davis J. H.; Ajo-Franklin C. M.; Cumbers J.; Czar M. J.; de Mora K.; Glieberman A. L.; Monie D. D.; Endy D. Measuring the activity of BioBrick promoters using an in vivo reference standard. J. Biol. Eng. 2009, 3, 4. 10.1186/1754-1611-3-4.
- Benfey P. N.; Chua N. H. The cauliflower mosaic virus 35S promoter: Combinatorial regulation of transcription in plants. Science 1990, 250, 959–966. 10.1126/science.250.4983.959.
- Auriac M. C.; Timmers A. C. J. Nodulation studies in the model legume Medicago truncatula: Advantages of using the constitutive EF1α promoter and limitations in detecting fluorescent reporter proteins in nodule tissues. Mol. Plant-Microbe Interact. 2007, 20, 1040–1047. 10.1094/MPMI-20-9-1040.
- Nikolov D. B.; Hu S. H.; Lin J.; Gasch A.; Hoffmann A.; Horikoshi M.; Chua N. H.; Roeder R. G.; Burley S. K. Crystal structure of TFIID TATA-box binding protein. Nature 1992, 360, 40–46. 10.1038/360040a0.
- Yamaguchi Y.; Itoh Y.; Takeda Y.; Yamazaki K. I. TATA sequence requirements for the initiation of transcription for an RNA polymerase II in vitro transcription system from Nicotiana tabacum. Plant Mol. Biol. 1998, 38, 1247–1252. 10.1023/A:1006056128129.
- Kiran K.; Ansari S. A.; Srivastava R.; Lodhi N.; Chatuverdi C. P.; Sawant S. V.; Tuli R. The TATA-box sequence in the basal promoter contributes to determining light-dependent gene expression in plants. Plant Physiol. 2006, 142, 364–376. 10.1104/pp.106.084319.
- Hirose S.; Takeuchi K.; Suzuki Y. In vitro characterization of the fibroin gene promoter by the use of single-base substitution mutants. Proc. Natl. Acad. Sci. U.S.A. 1982, 79, 7258–7262. 10.1073/pnas.79.23.7258.
- Mukumoto F.; Hirose S.; Imaseki H.; Yamazaki K. I. DNA sequence requirement of a TATA element-binding protein from Arabidopsis for transcription in vitro. Plant Mol. Biol. 1993, 23, 995–1003. 10.1007/BF00021814.
- Zhu Q.; Dabi T.; Lamb C. TATA box and initiator functions in the accurate transcription of a plant minimal promoter in vitro. Plant Cell 1995, 7, 1681–1689. 10.1105/tpc.7.10.1681.
- Schaumberg K. A.; Antunes M. S.; Kassaw T. K.; Xu W.; Zalewski C. S.; Medford J. I.; Prasad A. Quantitative characterization of genetic parts and circuits for plant synthetic biology. Nat. Methods 2016, 13, 94–100. 10.1038/nmeth.3659.
- Pastor N.; Pardo L.; Weinstein H. Does TATA matter? A structural exploration of the selectivity determinants in its complexes with TATA box-binding protein. Biophys. J. 1997, 73, 640–652. 10.1016/S0006-3495(97)78099-8.
- Patikoglou G. A.; Kim J. L.; Sun L.; Yang S. H.; Kodadek T.; Burley S. K. TATA element recognition by the TATA box-binding protein has been conserved throughout evolution. Genes Dev. 1999, 13, 3217–3230. 10.1101/gad.13.24.3217.
- Bernués J.; Carrera P.; Azorín F. TBP binds the transcriptionally inactive TA5 sequence but the resulting complex is not efficiently recognised by TFIIB and TFIIA. Nucleic Acids Res. 1996, 24, 2950–2958. 10.1093/nar/24.15.2950.
- Hieb A. R.; Halsey W. A.; Betterton M. D.; Perkins T. T.; Kugel J. F.; Goodrich J. A. TFIIA Changes the Conformation of the DNA in TBP/TATA Complexes and Increases their Kinetic Stability. J. Mol. Biol. 2007, 372, 619–632. 10.1016/j.jmb.2007.06.061.
- de Buck S.; Windels P.; de Loose M.; Depicker A. Single-copy T-DNAs integrated at different positions in the Arabidopsis genome display uniform and comparable β-glucuronidase accumulation levels. Cell. Mol. Life Sci. 2004, 61, 2632–2645. 10.1007/s00018-004-4284-8.
- Li L.; Nelson C. J.; Trosch J.; Castleden I.; Huang S.; Millar H. Protein degradation rate in Arabidopsis thaliana leaf growth and development. Plant Cell 2017, 29, 207–228. 10.1105/tpc.16.00768.
- Blake W. J.; Balázsi G.; Kohanski M. A.; Isaacs F. J.; Murphy K. F.; Kuang Y.; Cantor C. R.; Walt D. R.; Collins J. J. Phenotypic Consequences of Promoter-Mediated Transcriptional Noise. Mol. Cell 2006, 24, 853–865. 10.1016/j.molcel.2006.11.003.
- Murphy K. F.; Adams R. M.; Wang X.; Balázsi G.; Collins J. J. Tuning and controlling gene expression noise in synthetic gene networks. Nucleic Acids Res. 2010, 38, 2712–2726. 10.1093/nar/gkq091.
- Sanger M.; Daubert S.; Goodman R. M. Characteristics of a strong promoter from figwort mosaic virus: comparison with the analogous 35S promoter from cauliflower mosaic virus and the regulated mannopine synthase promoter. Plant Mol. Biol. 1990, 14, 433–443. 10.1007/BF00028779.
- Jores T.; Tonnies J.; Wrightsma T.; Buckler E. S.; Cuperus J. T.; Fields S.; Queitsch C. Synthetic promoter designs enabled by a comprehensive analysis of plant core promoters. Nat. Plants 2021, 7, 842–855. 10.1038/s41477-021-00932-y.
- Mogno I.; Vallania F.; Mitra R. D.; Cohen B. A. TATA is a modular component of synthetic promoters. Genome Res. 2010, 20, 1391–1397. 10.1101/gr.106732.110.
- Heckman K. L.; Pease L. R. Gene splicing and mutagenesis by PCR-driven overlap extension. Nat. Protoc. 2007, 2, 924–932. 10.1038/nprot.2007.132.
- Sarrion-Perdigones A.; Vazquez-Vilar M.; Palaci J.; Castelijns B.; Forment J.; Ziarsolo P.; Blanca J.; Granell A.; Orzaez D. Goldenbraid 2.0: A comprehensive DNA assembly framework for plant synthetic biology. Plant Physiol. 2013, 162, 1618–1631. 10.1104/pp.113.217661.
- Yoo S. D.; Cho Y. H.; Sheen J. Arabidopsis mesophyll protoplasts: A versatile cell system for transient gene expression analysis. Nat. Protoc. 2007, 2, 1565–1572. 10.1038/nprot.2007.199.
- Clough S. J.; Bent A. F. Floral dip: A simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J. 1998, 16, 735–743. 10.1046/j.1365-313x.1998.00343.x.
- Zhang X.; Henriques R.; Lin S. S.; Niu Q. W.; Chua N. H. Agrobacterium-mediated transformation of Arabidopsis thaliana using the floral dip method. Nat. Protoc. 2006, 1, 641–646. 10.1038/nprot.2006.97.
- Hellemans J.; Mortier G.; de Paepe A.; Speleman F.; Vandesompele J. qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data. Genome Biol. 2007, 8, R19. 10.1186/gb-2007-8-2-r19.