A group of ERF transcription factors involved in plant defense has similar but divergent DNA-binding specificities, and amino acid residues in the DNA-binding domain are critical for such divergence.
Abstract
Transcription factors (TFs) recognize target DNA sequences with distinct DNA-binding domains (DBDs). The DBD of Arabidopsis (Arabidopsis thaliana) ETHYLENE RESPONSE FACTOR1 (AtERF1) uses three consecutive β-strands to recognize a GCC-containing sequence, but tobacco (Nicotiana tabacum) ERF189 and periwinkle (Catharanthus roseus) Octadecanoid-derivative Responsive Catharanthus AP2-domain protein3 (ORCA3) of the same TF subgroup appear to target similar but divergent DNA sequences. Here, we examined how DNA-binding specificities of these TFs have diverged in each plant lineage to regulate distinct defense metabolisms. Extensive mutational analyses of these DBDs suggest that two modes of protein-DNA interactions independently contribute to binding specificity and affinity. Substitution of a conserved arginine to lysine in the first β-strand of ERF189 relaxes its interaction with the second GC pair of the GCC DNA sequence. By contrast, an increased number of basic amino acids in the first two β-strands of ORCA3 allows this TF to recognize more than one GCC-related target, presumably via increased electrostatic interactions with the negatively charged phosphate backbone of DNA. Divergent DNA-binding specificities of the ERFs may have arisen through mutational changes of these amino acid residues.
Biological processes ranging from development to metabolism rely on reprogramming of the transcriptome, which is governed largely by transcription factors (TFs). TFs control gene expression at the level of transcription by recognizing specific DNA sequences, or cis-elements, in promoters of target genes through DNA-binding domains (DBDs) with various structural motifs (Yamasaki et al., 2012). Understanding specific binding of TFs to the DNA sequences and the underlying biophysical phenomena is necessary to interpret the genetic information in the genome directing gene regulation (Segal and Widom, 2009). Recently, systematic analyses of the DNA binding of a large number of TFs using protein-binding microarrays (Badis et al., 2009) and SELEX (Jolma et al., 2013) have shown promise in this direction.
TF genes account for over 5% of the total genes in the plant genome (Riechmann and Ratcliffe, 2000) and are usually present as large superfamilies, each member of which serves a specific regulatory role. Plant-specific TFs of the APETALA2/Ethylene Response Factor (AP2/ERF) superfamily are defined by the presence of a conserved AP2/ERF DBD of about 60 amino acid residues (Ohme-Takagi and Shinshi, 1995; Nakano et al., 2006). The AP2/ERF superfamily can be further divided into several subfamilies, including the most prevalent ERF subfamily. Many members of the ERF subfamily act as monomers that recognize cis-elements with a GCC box (5′-AGCCGCC-3′; Ohme-Takagi and Shinshi, 1995; Hao et al., 1998; Fujimoto et al., 2000). An NMR study of the solution structure of the DBD of AtERF1 in complex with a GCC box revealed a unique mode of DNA-protein interaction (Allen et al., 1998). The DBD consists of three-stranded antiparallel β-sheet and one α-helix parallel to the sheet (Fig. 1A; Supplemental Fig. S1). Amino acid residues, mainly Arg and Trp, that directly contact base moieties of DNA are found in the β-sheet (Fig. 1B) and are important, along with other residues in the DBD, for DNA-binding properties (Allen et al., 1998; Sakuma et al., 2002; Hao et al., 2002; Liu et al., 2006; Wang et al., 2009).
Figure 1.
The DBD of group IXa ERF proteins. A, A three-dimensional structure of the DBD of AtERF1 bound with GCC box (1GCC.pdb). Each segment of secondary structure is a different color. B, Multiple sequence alignment of the DBDs. The sequences were aligned with ClustalW (Thompson et al., 1994). Residues identical or similar at least in four sequences are shaded in black or gray, respectively, and dashes indicate gaps introduced to maximize the alignment. Asterisks below the alignment denote residues that directly contact base moieties of DNA based on a structural study (Allen et al., 1998), and secondary structures are indicated as follows: arrows denote β-strands and the bar shows α-helix. The residues (numbered from the N-terminal end of the domain) that were examined in following mutational analyses are marked with arrowheads. AtERF1 (At4g17500) and AtERF13 (At2g44840) are from Arabidopsis (At), ORCA3 (EU072424) from periwinkle (Cr), Sl1g90340 (Solyc01g090340) from tomato (Sl), and the others are from tobacco (Nt). Only for the Arabidopsis and tomato genes, the abbreviations for organisms are included in gene names as prefixes. Sequences of tobacco ERFs can be found in the Database of Tobacco Transcription Factors (Rushton et al., 2008) under the names used here. [See online article for color version of this figure.]
The IXa group of the ERF subfamily (Nakano et al., 2006) includes a handful of TFs mainly involved in plant defense, such as AtERF1 and AtERF13 from Arabidopsis (Arabidopsis thaliana; Fujimoto et al., 2000; Lee et al., 2010), Octadecanoid-derivative Responsive Catharanthus AP2-domain protein3 (ORCA3) from periwinkle (Catharanthus roseus; van der Fits and Memelink, 2000), ERF189, ERF115, ERF179, and ERF163 from tobacco (Nicotiana tabacum; Shoji et al., 2010), and Sl1g90340 from tomato (Solanum lycopersicum; Fig. 1B). AtERF1 is a founding member of the AP2/ERF superfamily and recognizes a GCC box found in a broad range of defense genes, though its involvement in plant defense has yet to be confirmed (Fujimoto et al., 2000; Oñate-Sánchez and Singh, 2002; Gutterson and Reuber, 2004; McGrath et al., 2005). ORCA3 is a regulator of jasmonate-inducible indole alkaloid biosynthesis in periwinkle (van der Fits and Memelink, 2000), and multiple ERF genes in tobacco are clustered at the nicotine-controlling NICOTINE2 locus, as master regulators of nicotine alkaloid production, which is also inducible by jasmonate (Shoji et al., 2010).
Three GC-rich boxes in the promoters of target genes are known to be bound by group IXa ERFs. A P box (5′-CCGCCCTCCA-3′) in the promoter of the tobacco putrescine N-methyltransferase2 (PMT2) gene is recognized by ERF189, ERF115, ERF179, ERF163, ORCA3, and AtERF13, but not by AtERF1 (Shoji et al., 2010; Shoji and Hashimoto, 2012a). A CS1 box (5′-TAGACCGCCT-3′) is part of a jasmonate- and elicitor-responsive element targeted by ORCA3 in the promoter of the strictosidine synthase (STR) gene (van der Fits and Memelink, 2001). Finally, AtERF1 recognizes a GCC box (5′-AGCCGCC-3′; Fujimoto et al., 2000; Gutterson and Reuber, 2004).
TFs with DBDs related in amino acid sequence often bind similar but distinct DNA sequences. Little is known about how such divergence of DNA-binding specificities arose and how it is translated into functional distinction of the TFs. In this study, we examined the DNA-binding specificities of group IXa ERFs, demonstrating their differences in binding to multiple GC-rich sequences. Mutational analysis of the ERFs revealed amino acid residues important for the differential DNA binding, and these were found to be involved in interactions with bases and phosphate backbones of DNA. The divergent DNA-binding specificities of group IXa ERFs appear to have arisen through mutational changes of these residues.
RESULTS
Group IXa ERF Genes in Various Flowering Plants
To clarify the distribution of group IXa ERF genes among flowering plants, all genes of the group in the genomes of rice (Oryza sativa), maize (Zea mays), Arabidopsis, Brassica rapa, poplar (Populus trichocarpa), and tomato were retrieved from public databases, and the relationship of these and other members from tobacco, periwinkle, and Artemisia annua was examined by aligning the sequences of the DBD (Supplemental Fig. S1) and generating a phylogenetic tree (Fig. 2). This analysis divided group IXa into two clades, with clade 1 represented by AtERF1. Clade 2 was further divided into four subclades: clade 2-1, including ERF189 and ERF115; clade 2-2a, including ERF179; clade 2-2b, including Sl1g90340; and clade 2-3, including ERF163, ORCA3, and AtERF13. The alignment for the mentioned members is shown in Figure 1B. Note that clade 2-3 is defined more loosely than other subclades. Although three rice ERF genes had been classified into group IXa previously (Nakano et al., 2006), reexamination here revealed that only two of them, Os2g43790 and Os4g46220, are in group IXa, while the other one, Os1g54890, actually belongs to group IXc.
Figure 2.
Phylogenetic tree of group IXa ERFs. Amino acid sequences of the DBD were aligned with ClustalW (Thompson et al., 1994; Supplemental Fig. S1), and based on the alignment, a tree was generated using MEGA4 software (Tamura et al., 2007) with the neighbor-joining algorithm. Bootstrap values are indicated at branch nodes, and the scale bar indicates the number of amino acid substitutions per site. Gene names are prefixed according to species: Aa, A. annua; At, Arabidopsis; Br, B. rapa; Cr, periwinkle; Nt, tobacco; Os, rice; Pt, poplar; Zm, maize. Group VIII AtERF4, group IXb AtERF5, and group IXc AtORA59 are included as outgroup members. [See online article for color version of this figure.]
The positions of group IXa ERF genes on chromosomes in rice, maize, Arabidopsis, B. rapa, poplar, and tomato are illustrated in Supplemental Figure S2. All clade 1 ERFs, except for Br022115 in B. rapa, are located in close proximity to group IXb members on the chromosomes, whereas clade 2 ERFs are far from them, indicating a clear distinction between the two clades in terms of gene evolution. We also found tandem clustering of multiple clade 2 ERF genes in poplar and tomato, which had been speculated for tobacco ERFs at the NICOTINE2 locus (Shoji et al., 2010). Although the majority of clade 2 ERFs are in the clusters, a few genes of clade 2, e.g. AtERF13 in Arabidopsis, Br037630 and Br004873 in B. rapa, and Sl5g50790 in tomato, are present as singletons.
Differential Binding of Group IXa ERFs to Three GC-Rich Boxes
To compare the in vitro binding of group IXa ERFs to the P, CS1, and GCC boxes, electrophoretic mobility shift assays (EMSAs) were carried out using select group IXa ERFs (ERF189, ERF115, ERF179, Sl1g90340, ERF163, ORCA3, AtERF13, and AtERF1) and three probes, P, CS1, and GCC, each containing one of the boxes. The purities of the recombinant ERFs are shown in Supplemental Figure S3. As the GCC probe, a 10-bp sequence containing a GCC box (5′-AGAGCCGCCA-3′, in which the box is AGCCGCC) from the tobacco β-1,3-glucanase (GLN2) promoter was used (Table I). The GCC probe was bound by all ERFs except clade 2-1 members ERF189 and ERF115, whereas the CS1 probe was bound only by clade 2-3 members ERF163, ORCA3, and AtERF13 (Fig. 3A). Consistent with previous reports, the P probe was recognized by all but clade 1 AtERF1 (Fig. 3A). These results, summarized in Figure 3B, suggest similar but distinct DNA-binding specificities of the group IXa ERFs.
Table I. Sequences bound by group IXa ERFs in vitro.
Prefixes to gene names indicate organisms (tobacco [Nt] and periwinkle [Cr]). Positions of 5′ ends of the indicated sequences relative to the first ATG and orientations (forward [F] and reverse [R]) of boxes in the context of promoters are indicated to the left of the sequences. TESS scores are indicated as follows: a, based on data from ERF189 at the P box; b, ORCA3 at the P box; c, ORCA3 at the CS1; d, ORCA3 at the GCC box, and e, AtERF1 at the GCC box. Sequences of 7 bp that give rise to the scores in d and e are in the rightmost column. Nd, Not detected.
| Name | Gene | Position and Orientation | Sequence (5′ to 3′) | TESS Score a | TESS Score b | TESS Score c | TESS Score d | TESS Score e | Sequence for Scores d and e |
|---|---|---|---|---|---|---|---|---|---|
| P type | | | | | | | | | |
| P | NtPMT2 | −133 F | CCGCCCTCCA | 7.45 | 11.56 | nd | 4.39 | 3.42 | (CGCCCTC) |
| Q1 | NtQPT2 | −131 R | TAGCACTCCA | 5.58 | 10.96 | nd | nd | nd | |
| Q2 | NtQPT2 | −160 F | AAGCACTCCA | 7.18 | 10.69 | nd | nd | nd | |
| Q3 | NtQPT2 | −238 R | TAGCACTCCA | 5.58 | 10.96 | nd | nd | nd | |
| O3 | NtODC1 | −969 R | TAGCCAGCCT | 7.45 | 8.92 | nd | 3.02 | nd | (GCCAGCC) |
| O5 | NtODC2 | −919 R | TAGCCAGCCT | 7.45 | 8.92 | nd | 3.02 | nd | (GCCAGCC) |
| M1 | NtMATE1 | −206 R | TAGCACTCCA | 7.75 | 7.44 | nd | 2.99 | 2.89 | (AGCACCC) |
| M2 | NtMATE1 | −704 R | GAGCACACCT | 9.04 | 9.27 | nd | nd | nd | |
| CC3 | CrCPR | −346 R | ACGCCTACCA | 3.58 | 8.79 | nd | nd | nd | |
| Consensus | | | N(A/C)GC(A/C)NNCC(A/C) | | | | | | |
| CS1 type | | | | | | | | | |
| CS1 | CrSTR | −146 F | TAGACCGCCT | 6.20 | nd | 9.11 | 5.85 | 7.29 | (GACCGCC) |
| CC2 | CrCPR | −342 R | AAGAACGCCT | 7.36 | nd | 12.36 | 2.54 | 4.73 | (GAACGCC) |
| CC1 | CrCPR | −224 F | ACGCCGGCGA | nd | 7.28 | nd | 6.16 | 4.97 | (CGCCGGC) |
| GCC | | | | | | | | | |
| GCC | NtGLN2 | −1179 F | AGAGCCGCCA | nd | nd | nd | 10.29 | 10.50 | (AGCCGCC) |
Figure 3.
In vitro binding of group IXa ERFs to P, CS1, and GCC boxes. A, EMSA was performed to examine the binding between ERFs and oligonucleotide probes containing a P box (5′-CCGCCCTCCA-3′) from the tobacco PMT2 promoter, a CS1 box (5′-TAGACCGCCT-3′) from the periwinkle STR promoter, or a GCC box (5′-AGCCGCC-3′) from the tobacco GLN2 promoter. Sequences and other information about the boxes are in Table I. Clade numbers are indicated in parentheses. The relative levels of binding of the P probe by different ERFs, except AtERF1, which did not bind the probe, compared in the same blots are shown in Supplemental Figure S9C. B, Summary of the results shown in A. Nucleotides in P and CS1 boxes that are different from those in the corresponding positions of the GCC box are underlined.
There are other non-GCC boxes that are recognized by AP2/ERF TFs: Dehydration-Responsive Element (DRE; 5′-[A/G]CCGAC-3′; Liu et al., 1998), C-Repeat/DRE/HVCBF2 (CBF2; 5′-GTCGAC-3′; Xue, 2003; Yu et al., 2012), RAV1/AAT (RAA; 5′-CAACA-3′; Kagaya et al., 1999; Yu et al., 2012), and Coupled Element1 (CE1; 5′-CCACC-3′; Shen and Ho, 1995; Lee et al., 2010). No clear binding to probes containing DRE, CBF2, RAA, or CE1 boxes was detected for ERFs examined by EMSA (Supplemental Fig. S4).
In Vitro Binding of Mutant Versions of ERF189, ORCA3, and AtERF1
Each of the three probes was recognized by a unique subset of group IXa ERFs (Fig. 3). This led us to ask what structural differences among the ERF subsets account for the probe discrimination. To address this, we focused on the DNA-contacting N-terminal half of the DBD (Fig. 1B), where the three β-strands form an interface to bind the DNA (Allen et al., 1998). Only six residues in that region differ among the subsets of group IXa ERFs (Fig. 1B).
To clarify the amino acid residues that are critical for DNA binding, EMSA was performed with mutant versions of ORCA3 (mORCA3), in which each of the six residues was substituted with that of other clade ERFs (Fig. 4). ORCA3 bound all three probes as shown (Fig. 3A), as did mORCA3/Arg-1-to-His (R1H) and mORCA3/Lys-3-to-Arg (K3R; Supplemental Fig. S5A), indicating no detectable influence of the R1H and K3R substitutions. By contrast, the Lys-3-to-Ile (K3I) substitution caused nearly complete loss of binding to CS1 (Fig. 4A). The Arg-6-to-Lys (R6K) and Arg-7-to-Gln (R7Q) substitutions disrupted binding of all probes except P and GCC, respectively (Fig. 4A). Both Lys-12-to-Thr (K12T) and Ala-14-to-Ser (A14S) substitutions hampered binding to all three probes (Fig. 4A). Even double substitution of these close-together residues (K12T/A14S), which mimicked the sequences of clade 2-1 and 2-2b ERFs in that region (Fig. 1B), did not lead to binding (Supplemental Fig. S5A).
Figure 4.
In vitro binding of mutant versions of ORCA3 and ERF189 to P, CS1, and GCC boxes. A, EMSA was performed to examine binding of ORCA3, ERF189, and their mutant versions to P, CS1, and GCC probes. To indicate the positions of substitutions, amino acid residues are numbered from the N terminus of the AP2/ERF DBD (Fig. 1B). The relative abundance of the complexes on each blot was determined by comparing the intensities of the retarded bands and is indicated above the images. The values for P probe (or GCC probe for mORCA3/R7Q) were set to 100. B, Relative levels of binding of different ERFs to P probe for ORCA3, ERF189, and their mutant versions, compared in the same blots.
Because three substitutions K3I, R6K, and R7Q in ORCA3, wherein the original residues were changed to the corresponding residues of ERF179, ERF189, and AtERF1, respectively, altered binding of ORCA3 to the probes, we examined how the reverse substitutions, Ile-3-to-Lys (I3K) in ERF179, Lys-6-to-Arg (K6R) in ERF189, and Gln-7-to-Arg (Q7R) in AtERF1, affected the binding patterns of the original ERFs (Fig. 4A; Supplemental Fig. S5A). Two substitutions, I3K in ERF179 and Q7R in AtERF1, did not affect the binding patterns (compare Fig. 3A with Supplemental Fig. S5A). However, in contrast to the wild-type ERF189 binding of only the P box, the K6R substitution in ERF189 enabled strong binding to all three probes (Fig. 4A).
Overall, these results suggest that the amino acid residues at positions 3, 6, 7, 12, and 14 are important for ERF binding to the probes. In particular, the changes of binding patterns that were caused by substitutions K3I, R6K, and R7Q in ORCA3 and K6R in ERF189 were mostly consistent with the distinct DNA-binding specificities of the corresponding ERFs. On the other hand, two reverse substitutions (I3K in ERF179 and Q7R in AtERF1) had no effects on the binding patterns.
In Vivo Binding of ERF189, ORCA3, and AtERF1 and Their Mutant Derivatives
We then asked whether and how well the in vitro binding results (Figs. 3 and 4) reflected in vivo DNA-protein interaction. First, we performed transient transactivation assays using tobacco Bright Yellow-2 (BY-2) cultured cells, into which combinations of reporter genes under the control of a P, CS1, or GCC box and ERF effector genes were delivered by particle bombardment. To examine P box-dependent expression, the PMT2 promoter, including the box, was tested for activation by ERF189, ORCA3, AtERF1, and their mutant derivatives (Fig. 5). As previously reported (Shoji et al., 2010; Shoji and Hashimoto, 2012a), ERF189 and ORCA3 activated the reporter about 8- and 5-fold, respectively, whereas AtERF1 did not. The R7Q substitution in ORCA3 markedly diminished the induction to 1.2-fold, whereas other substitutions, K3I and R6K in ORCA3 and K6R in ERF189, did not significantly alter the induction, in accordance with the in vitro binding results (Fig. 4A; Supplemental Fig. S5A). No induction was observed of a mutant reporter in which the P box was disrupted (Fig. 5), indicating the requirement of the box for the responses. To monitor the expression dependent on CS1 and GCC boxes, promoters containing four copies of each box were used to construct reporter vectors CS1x4-35Smini-GUS and GCCx4-35Smini-GUS. Unexpectedly, even ORCA3 and AtERF1 failed to activate CS1x4-35Smini-GUS and GCCx4-35Smini-GUS, respectively (Supplemental Fig. S6). Activation of a similarly designed reporter construct with a GCC box tetramer by AtERF1 was reported (Fujimoto et al., 2000), and we could not find any particular reasons for the inconsistency other than different experimental conditions.
Figure 5.
Transient transactivation of the PMT2 promoter with ERF effectors and their mutant derivatives in tobacco BY-2 cells. Cultured tobacco BY-2 cells were bombarded with a combination of a GUS-expressing reporter plasmid, a LUC-expressing reference plasmid, and either an ERF-expressing effector or an empty plasmid (EV). The reporter GUS gene was driven either by the wild-type PMT2 promoter (PMT2pro236-GUS) or by a PMT2 promoter in which the ERF-binding P box was mutated (PMT2pro236m4-GUS). GUS activity in the cell extracts is shown relative to the LUC activity. The values for the empty plasmid are set to 1. Error bars indicate the SD for three independent biological replicates. Significant differences among the effectors were determined at P < 0.05 by one-way ANOVA, followed by the Tukey-Kramer test, and are indicated by different letters.
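The normalization described in this legend can be sketched in Python as follows. This is a minimal illustration rather than the analysis actually used in the study: the function name `fold_induction`, the data layout, and all readings are hypothetical, and the sketch assumes that the GUS activity of each replicate is divided by its LUC activity before averaging.

```python
from statistics import mean, stdev

def fold_induction(samples, reference="EV"):
    """Return {effector: (mean fold induction, SD)} relative to the reference."""
    # Dividing GUS by LUC in each replicate corrects for delivery efficiency.
    ratios = {name: [gus / luc for gus, luc in reps]
              for name, reps in samples.items()}
    # The mean ratio for the empty plasmid defines a fold induction of 1.
    baseline = mean(ratios[reference])
    return {name: (mean(r) / baseline, stdev(r) / baseline)
            for name, r in ratios.items()}

# Hypothetical (GUS, LUC) readings for three replicates; not data from the study.
readings = {
    "EV": [(10.0, 100.0), (12.0, 100.0), (8.0, 100.0)],
    "effector": [(80.0, 100.0), (90.0, 100.0), (70.0, 100.0)],
}
result = fold_induction(readings)
```

With these invented readings, the effector gives about an 8-fold induction over the empty plasmid; the statistical comparison (ANOVA and Tukey-Kramer) is not reproduced here.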
Next, we carried out transactivation assays in yeast (Saccharomyces cerevisiae). A modified firefly luciferase (mLUC; see “Materials and Methods”) was placed under the control of promoters containing four copies of the P box in Px4-mini-mLUC, the CS1 box in CS1x4-mini-mLUC, or the GCC box in GCCx4-mini-mLUC. Yeast harboring a pair of reporter and ERF effector plasmids was grown, and LUC activity of the culture was determined by measuring luminescence emitted after addition of luciferin (Fig. 6). CS1x4-mini-mLUC was significantly activated by ORCA3 and mERF189/K6R but not by the others, and the activity induced by ORCA3 was about 5 times that induced by mERF189/K6R. GCCx4-mini-mLUC was activated by AtERF1, ORCA3, mERF189/K6R, and mORCA3/R7Q to varying extents. The lack of significant activation of reporters mutated in the CS1 or GCC box demonstrated the dependence of the activation on the boxes. High LUC activity was detected for Px4-mini-mLUC, even with the empty effector vector (Fig. 6B), and thus we did not perform the assay with this reporter.
Figure 6.
Transactivation assay in yeast. Yeast Y187 strain was transformed with a combination of a mLUC-expressing reporter and either an ERF-expressing effector or an empty plasmid (EV). A, The reporter mLUC gene was driven by either four copies of the CS1 or the GCC box placed upstream of a minimal promoter of the HIS3 locus (CS1x4-mini-mLUC, GCCx4-mini-mLUC) or their mutant derivatives (mCS1x4-mini-mLUC, mGCCx4-mini-mLUC). B, The reporter mLUC gene was driven by four copies of the P box placed upstream of a minimal promoter of the HIS3 locus (Px4-mini-mLUC). LUC activities of yeast cultures from three independent colonies were measured, and error bars indicate the SD. Only the values above those of negative controls, which were measured using buffer without luciferin, are considered significant. The value for CS1x4-mini-mLUC activated by ORCA3 was set to 100. Significant differences among the effectors were determined at P < 0.05 by one-way ANOVA, followed by the Tukey-Kramer test, and are indicated by different letters. ND, Not detected.
In Vitro Binding Specificities of ERF189, ORCA3, and AtERF1 Determined with a Series of Single Nucleotide-Substituted Probes
To determine in vitro binding specificities of ERF189, ORCA3, and AtERF1, EMSA was performed with series of wild-type and mutant probes that contained every possible nucleotide substitution (A, T, G, or C) in the boxes (Supplemental Figs. S7 and S8), as in Shoji and Hashimoto (2011b), except that the sequences flanking the box were randomized in this study (see “Materials and Methods”). Binding specificities of TFs are often expressed as position weight matrices (PWMs), which are based on the assumption that interactions between each base of DNA and a TF are independent (Benos et al., 2002). Based on PWMs reflecting the binding profiles obtained by EMSA, the binding specificities of ERF189, ORCA3, and AtERF1 were represented as sequence logos (Fig. 7).
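The independence assumption behind a PWM can be made concrete with a short Python sketch. It is illustrative only: the conversion of EMSA band intensities into log-odds weights shown here (pseudocount, uniform background of 0.25) is an assumption, not the procedure documented in this study, and the three-position binding table is invented.

```python
import math

BASES = "ACGT"

def build_pwm(binding, pseudocount=0.01, background=0.25):
    """Turn per-position relative binding values into log2-odds weights."""
    pwm = []
    for column in binding:
        total = sum(column[b] + pseudocount for b in BASES)
        pwm.append({b: math.log2(((column[b] + pseudocount) / total) / background)
                    for b in BASES})
    return pwm

def score(pwm, seq):
    """Sum the per-position weights; each base contributes independently."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must equal PWM width")
    return sum(col[base] for col, base in zip(pwm, seq))

# Hypothetical relative binding (wild-type base = 100) for a toy 3-bp site.
binding = [
    {"A": 5, "C": 5, "G": 100, "T": 5},
    {"A": 10, "C": 100, "G": 10, "T": 10},
    {"A": 5, "C": 100, "G": 5, "T": 5},
]
pwm = build_pwm(binding)
```

Because the score is a plain sum over positions, changing one base alters the total by exactly the difference between the two weights at that position, regardless of the neighboring bases. This is the simplification that ignores any interdependence between positions.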
Figure 7.
Sequence logos representing in vitro binding specificities of ERF189, ORCA3, and AtERF1. EMSA was used to examine binding between ERF and a set of wild-type and single nucleotide-substituted mutant probes (Supplemental Figs. S7 and S8). The sequences of the wild-type probes are indicated below the logos, and sequences of the boxes are underlined and shown in black letters. The binding profiles obtained by EMSA were expressed as PWMs. Multiple alignments of 100 hypothetical sequences reflecting the EMSA-derived PWMs were used to generate sequence logos by WebLogo (Crooks et al., 2004). Lowercase letters (a–e) in parentheses indicate TESS scores listed in Table I, which were calculated based on the PWMs. The binding specificity of ERF189 to the P box-containing sequence was obtained in a previous study (Shoji and Hashimoto, 2011b). [See online article for color version of this figure.]
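One way to produce a set of hypothetical sequences that reflects a PWM, as input for a logo generator, is to sample each position independently from its base probabilities. The Python sketch below assumes this sampling approach; the function name `sample_sequences`, the fixed seed, and the probability matrix are all hypothetical, and the study may have assembled its 100 sequences differently (for example, deterministically in proportion to the weights).

```python
import random

BASES = "ACGT"

def sample_sequences(prob_matrix, n=100, seed=0):
    """Draw n sequences, choosing each position independently from its column."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    sequences = []
    for _ in range(n):
        chars = [rng.choices(BASES, weights=[col[b] for b in BASES])[0]
                 for col in prob_matrix]
        sequences.append("".join(chars))
    return sequences

# Hypothetical base probabilities for a toy 3-bp site (each column sums to 1).
prob_matrix = [
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
]
hypothetical = sample_sequences(prob_matrix)
```

The resulting list can be written out one sequence per line and supplied as an alignment to a logo tool.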
The obtained binding specificity of ORCA3 to the P box, 5′-(C/A)GC(C/A)NNCC-3′, was quite similar to that of ERF189 to the same box (5′-[A/C]GC[C/A][C/A]NCC-3′; Shoji and Hashimoto, 2011b; Fig. 7). The binding specificity of ORCA3 to the CS1 box was 5′-GNACGCC-3′, which was clearly distinct from those of ORCA3 to the P and GCC boxes (Fig. 7). ERF189, ORCA3, and AtERF1 were also examined by EMSA with a set of probes based on the GCC box (Supplemental Fig. S8). ERF189 did not bind the GCC box, and only weak binding of ERF189 to a small number of mutant probes was observed. ORCA3 and AtERF1 robustly bound the probes, allowing us to determine their specificities. They showed nearly identical specificities (5′-GCCGCC-3′), which were less flexible than those at other sites and clearly different from those of ORCA3 for P and CS1 boxes (Fig. 7).
In Vitro Binding Sequences of Group IXa ERFs in Promoters of Alkaloid Biosynthesis Genes
The in vitro binding specificities of ERF189, ORCA3, and AtERF1 (Shoji and Hashimoto, 2011b; Fig. 7) allowed us to predict potential binding sites in possible target promoters. The promoter sequences (up to 1 kb from the first ATG) of alkaloid biosynthesis and transport genes from tobacco and periwinkle, which are presumed to be regulated by relevant group IXa ERFs, were searched computationally using Transcriptional Element Search Software (TESS) software, which gave scores to candidate sequences based on the EMSA-derived PWMs. All sequences with scores higher than 7.0 in any searches are listed in Table I. For the GCC box, a core 7-bp sequence was used for the searches. Tobacco A622, encoding a PIP-family reductase (Kajikawa et al., 2009), periwinkle Trp decarboxylase (TDC), and periwinkle desacetoxyvindoline 4-hydroxylase (D4H) are also regulated by group IXa ERFs, but no binding sites were predicted in their promoters (up to approximately 1.4 kb of A622, 1.0 kb of TDC, and 0.6 kb of D4H were examined).
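The general idea of such a search, sliding a PWM along both strands of a promoter and keeping windows that score above a cutoff, can be sketched in Python. This is not the TESS algorithm, and its scores are not comparable to those in Table I: the toy weight matrix (+2 for a matching base, −2 otherwise), the example promoter, and the function names are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def scan(promoter, pwm, cutoff):
    """Return (forward-strand start, strand, site, score) for windows above cutoff."""
    width = len(pwm)
    hits = []
    for strand, seq in (("F", promoter), ("R", reverse_complement(promoter))):
        for i in range(len(seq) - width + 1):
            window = seq[i:i + width]
            total = sum(col[base] for col, base in zip(pwm, window))
            if total > cutoff:
                # Report reverse-strand hits in forward-strand coordinates.
                start = i if strand == "F" else len(promoter) - i - width
                hits.append((start, strand, window, total))
    return hits

# Toy matrix for the core GCCGCC: +2.0 for the expected base, -2.0 otherwise.
toy_pwm = [{b: (2.0 if b == expected else -2.0) for b in "ACGT"}
           for expected in "GCCGCC"]
forward_hits = scan("TTGCCGCCAA", toy_pwm, cutoff=7.0)
reverse_hits = scan(reverse_complement("TTGCCGCCAA"), toy_pwm, cutoff=7.0)
```

Scanning both strands matters because the boxes listed in Table I occur in either orientation relative to the gene.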
In tobacco, all eight sequences (P, Q1, Q2, Q3, O3, O5, M1, and M2) that have been reported as ERF189-binding sequences (Shoji and Hashimoto, 2011b, 2012a), but none apart from those, were identified in promoters of PMT2, quinolinate phosphoribosyl transferase2 (QPT2), Orn decarboxylase (ODC), and multidrug and toxic compound extrusion-type transporter1 (MATE1; Shoji et al., 2009; Table I). In periwinkle, four sequences, CS1 in the STR promoter and CC1 to CC3 in the cytochrome P450 reductase (CPR) promoter, were predicted (Table I).
To validate the predictions, binding between probes containing each predicted sequence and group IXa ERFs was examined by EMSA (Fig. 8; Supplemental Fig. S9). Clade 2-1 ERF189 and ERF115 were excluded from this assay, because binding between ERF189 and the tested sequences has previously been examined (Shoji and Hashimoto, 2012a). The eight tobacco probes were bound by all but AtERF1, though the binding strengths varied. AtERF13 displayed relatively weak binding to most of the probes (Supplemental Fig. S9, A and C). Three of the probes from periwinkle, CS1, CC1, and CC2, were bound by clade 2-3 members ERF163, ORCA3, and AtERF13, while the fourth, CC3, was bound by all but AtERF1, similar to the results for the tobacco probes (Fig. 8; Supplemental Fig. S9B).
Figure 8.
In vitro binding of group IXa ERFs to predicted binding sequences in the promoters of alkaloid biosynthesis genes. EMSA was performed to examine binding between ERFs and oligonucleotide probes containing each binding sequence predicted in promoters of alkaloid biosynthesis genes in tobacco (P, Q1/Q3, Q2, O3/O5, M1, and M2) and periwinkle (CS1, CC1, CC2, and CC3; Supplemental Fig. S9). Sequences and other information about the predicted sequences are in Table I. Relative abundance of the complexes on each blot was determined by comparing the intensities of the retarded bands (Supplemental Fig. S9). As a reference, probe P (or the GCC probe for AtERF1) was included in the blots. The values for the P probe (or GCC probe for AtERF1) were set to 100. Asterisks are above the bars for AtERF1. Binding of ERF115 to the P box-type probes was not examined. Relative levels of binding to the GCC box are not shown. Clade numbers are indicated in parentheses. Nd, Not determined.
Based on the similarity of the binding patterns to P and CS1, the binding sequences found in the alkaloid gene promoters could be classified into two types, P type and CS1 type (Fig. 8; Table I). The P type included the eight tobacco sequences and CC3, which were scored highly by TESS with specificities obtained at the P box and had 5′-(A/C)GC(A/C)NNCCA-3′ as a consensus (Table I). The consensus matched especially well with the specificity of ORCA3 obtained for the P box (Fig. 7). The CS1 type included three sequences, CS1, CC1, and CC2, none of which matched the P-type consensus (Table I). Although CS1 and CC2 are similar and have 5′-(A/T)AGA(A/C)CGCCT-3′ in common, the CC1 sequence is quite divergent from them, and a consensus of the three could not be defined. The CS1-type sequences were identified by the prediction program as putative binding sites, not only based on the specificity for the CS1 box, but also with the specificities for P and GCC boxes (Table I), reflecting the intermediate nature of the CS1 type between the P type and the GCC box. By contrast, for the GCC box, which is not present in the searched alkaloid gene promoter regions, high binding scores were predicted only with the less flexible specificities obtained from the bona fide GCC box (Fig. 7; Table I).
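A consensus written in this notation can be turned into a regular expression to test whether a given box contains it. The Python sketch below is a simple illustration under the assumption that a strict pattern match is an acceptable stand-in for the consensus; the helper name `consensus_to_regex` is hypothetical, and the three probe sequences are taken from Table I.

```python
import re

def consensus_to_regex(consensus):
    """Convert notation such as (A/C)GC(A/C)NNCCA into a compiled regex."""
    # (A/C) becomes the character class [AC].
    pattern = re.sub(r"\(([ACGT/]+)\)",
                     lambda m: "[" + m.group(1).replace("/", "") + "]",
                     consensus)
    # N stands for any of the four bases.
    return re.compile(pattern.replace("N", "[ACGT]"))

P_TYPE = consensus_to_regex("(A/C)GC(A/C)NNCCA")

boxes = {
    "P": "CCGCCCTCCA",
    "CS1": "TAGACCGCCT",
    "GCC": "AGAGCCGCCA",
}
matches = {name: bool(P_TYPE.search(seq)) for name, seq in boxes.items()}
```

Here the P box contains the P-type consensus, whereas the CS1 and GCC probes do not. A strict match of this kind is more rigid than PWM scoring, which tolerates weaker positions.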
DISCUSSION
Divergent DNA-Binding Specificities of Group IXa ERFs
We demonstrated that group IXa ERFs differentially bind to multiple GC-rich sequences, which can be grouped into three types, namely P type, CS1 type, and the GCC box, indicating divergent DNA-binding specificities of these TFs (Figs. 3, 5, 6, and 8; Table I). In vitro DNA-binding specificities of ERF189, ORCA3, and AtERF1 were determined using series of single nucleotide-substituted probes at the defined binding boxes (Fig. 7). Based on the resulting specificities, sequences of P and CS1 types could be identified in promoters of alkaloid biosynthesis genes and added to the repertoire of known GC-rich sequences bound by these TFs (Fig. 8; Table I).
Every binding sequence that was predicted computationally (TESS scores > 7.0) was bound by more than one of the examined ERFs, demonstrating the reliability of the predictions. Furthermore, TESS software gave scores of less than 5.0 to sequences of probes including non-GCC boxes (DRE, CBF2, RAA, and CE1), which were shown not to be bound by the ERFs (Supplemental Table S1). The reliability of the scoring was improved over previous attempts (Shoji and Hashimoto, 2011b, 2012a), in part because of the addition of four specificities (Fig. 7) and the adoption of a stricter cutoff value. In vivo relevance of the binding sequences as cis-elements has been demonstrated for the P box in PMT2, the Q1, Q2, and Q3 boxes in QPT2, and the CS1 box in STR (van der Fits and Memelink, 2001; Shoji et al., 2010; Shoji and Hashimoto, 2011b), but experimental proof remains necessary for the other sequences.
The importance of probe sequence may explain the disparities between our results and those obtained in previous studies, which pointed to specific nucleotides (the second G, fifth G, and seventh C [Hao et al., 1998] or a CG step in the center [Yang et al., 2009] of the GCC box [5′-AGCCGCC-3′]) as being critical or indispensable for recognition. Although such tendencies could be observed in our results, there were small but significant discrepancies among these findings and ours. In addition, no binding sequences were predicted in A622, TDC, and D4H promoter regions. Although there is no experimental evidence of direct binding of ERFs to A622 and D4H promoter regions, in vitro binding of ORCA3 to the proximal TDC promoter has been clearly demonstrated (van der Fits and Memelink, 2000). It is possible that the promoter regions that we searched were not long enough to include the binding sequences or that sequences with low scores were omitted by our cutoff. As exemplified by the fact that the ORCA3 specificities were variable among the three boxes (Fig. 7), the results also imply the existence of multiple distinct sequence motifs recognized by a single TF (Badis et al., 2009). As such, we cannot rule out the existence of still more sequences that these TFs can bind. In addition, our methodology using probes substituted at one nucleotide position and PWMs ignores position interdependence (Badis et al., 2009; Jolma et al., 2013) and therefore could fail to detect some subsets of possible binding sequences.
Structural Basis of Divergent DNA-Binding Specificities
We found that the similar but different DNA-binding specificities of group IXa ERFs are closely related to the amino acid sequences of the DBD (Figs. 1, 3, 5, 6, and 8). Through mutational analyses, a few amino acid residues critical for such differences were defined: at least six substitutions at five positions (K3I, R6K, R7Q, K12T, and A14S in ORCA3 and K6R in ERF189) influenced DNA-binding specificities (Figs. 4–6). The only structure available for this superfamily is that of the DBD of AtERF1, a member of group IXa, bound to a GCC box (Allen et al., 1998). This prompted us to model the structures of representative group IXa ERFs ORCA3 and ERF189 with DNA and compare their structures to elucidate the structural effects of the critical residues on DNA binding. Using the structure of the AtERF1 complex (1GCC.pdb) as a template, homology models of ORCA3 and ERF189 complexes were generated.
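The substitution notation used here (for example, K3I denotes Lys at position 3 replaced by Ile, with positions numbered from the N terminus of the DBD) can be made explicit with a short Python sketch. The helper name `apply_substitution` is an assumption, and the sequence string is a placeholder rather than the ORCA3 DBD: only the residues named in the text (K3, R6, R7, K12, and A14) are filled in, and every other position is written as X.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(dbd, code):
    """Return dbd with a substitution such as 'K3I' applied.

    The wild-type residue in the code is checked against the sequence so
    that a substitution defined for another protein is rejected.
    """
    match = SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"unrecognised substitution: {code!r}")
    wild_type, replacement = match.group(1), match.group(3)
    position = int(match.group(2))
    if not 1 <= position <= len(dbd):
        raise ValueError(f"{code}: position outside the sequence")
    found = dbd[position - 1]
    if found != wild_type:
        raise ValueError(
            f"{code}: expected {wild_type} at position {position}, found {found}"
        )
    return dbd[:position - 1] + replacement + dbd[position:]


# Placeholder, NOT a real sequence: only K3, R6, R7, K12, and A14 of ORCA3
# are taken from the text; all other positions are 'X'.
ORCA3_PLACEHOLDER = "XXKXXRRXXXXKXA"

for code in ("K3I", "R6K", "R7Q", "K12T", "A14S"):
    print(code, apply_substitution(ORCA3_PLACEHOLDER, code))
```

Applying K6R, which is defined for ERF189, to this placeholder raises a `ValueError` because position 6 holds R rather than K. The check is a simple guard against mixing up substitutions between proteins.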
In the AtERF1 and ORCA3 complexes, the guanidyl group of Arg-6 (R6) contacts the O6 atom of the guanine base of a CG pair at the third position in the GCC box through a hydrogen bond. The side chain of R6 also interacts with the cytosine base moiety of a GC pair at the fourth position in the box through hydrophobic interaction (Fig. 9, A and B; Allen et al., 1998). This Arg residue in the first β-strand is conserved in all AP2/ERF TFs except clade 2-1 ERFs (Fig. 1A; Supplemental Fig. S1), suggesting the requirement of R6 for recognition by most ERFs, including AtERF1. However, the findings that clade 2-3 ERFs also can bind to P-type sequences without the GC pair (Figs. 3 and 8; Table I) and that the specificity of ORCA3 for the P box is not stringent at the fourth position (Fig. 7), indicate that the GC pair is dispensable for recognition by ORCA3 and related ERFs. As discussed below, stronger affinity of this ERF subset may allow such relaxed recognition. In contrast to R6, Lys-6 (K6) in clade 2-1 ERF189 may favor the adenine of a TA pair at the fourth position, which appears in the P box and four other P-type sequences (Table I), because the ε-amino group of K6 can interact with the N7 atom of adenine through hydrogen bonding (Fig. 9B). Such pairings between Lys and adenine are prevalent in reported structures (Mao et al., 2003). ERF189 also recognizes P-type sequences without the TA pair (Table I), and its specificity for the P box did not show a firm preference for a TA pair at the position (Fig. 7), indicating that the pairing is favored but not exclusive. Apart from R6 and K6, there are no apparent changes in residues directly contacting the bases of DNA among the ERF structures, not even in the region responsible for recognition of nearly one-half of the binding sequences, where P, CS1, and GCC boxes also show distinctions between each other to some extent (Figs. 3B and 7).
Figure 9.
Structure of AtERF1 and model structures of ORCA3 and ERF189 in complex with the GCC box (GCC) or its substituted derivative (TCC). The model structures were built by the SWISS-MODEL server, using the structure of AtERF1 bound with GCC box (1GCC.pdb) as a template. In the ERF189 complex, a GC pair in the GCC box-containing DNA molecule (5′-TAGCCGCCAGC-3′; the pair is underlined) was replaced with a TA pair using the 3D-DART server. Base moieties of the pairs are labeled. Amino acid residues of interest are shown in sticks and labeled. Nitrogen and oxygen atoms are indicated in blue and red, respectively. To show possible access to the nearby phosphate backbone, side chains of R7 and Q7 were moved arbitrarily and colored in yellow in A. Hydrogen bonding between the basic side chain of K6 and the adenine base moiety is shown as a red dotted line, and hydrophobic interaction between the side chain of R6 and the base moiety of cytosine is shown as a blue dotted curve in B.
In addition to sequence-specific recognition dependent on interaction with base moieties of DNA, which is exemplified by cases mentioned above, TFs also interact with the backbone of DNA in a non-sequence-specific manner (von Hippel and Berg, 1989). Basic amino acids at positions 3, 7, and 12 (K3, R7, and K12 in ORCA3, R3 and K12 in AtERF1, and R7 in ERF189) in the first two β-strands may enhance the ERFs’ general affinities for DNA by electrostatically interacting with its negatively charged phosphate backbone (Fig. 9A). Three basic residues, K3, R7, and K12, may enable ORCA3 to bind strongly to P, CS1, and GCC boxes, and therefore single substitutions to nonbasic residues at any of these positions (K3I, R7Q, and K12T) could cause loss of affinity. The K12T substitution caused nearly complete loss of binding to all of the probes, while K3I and R7Q substitutions disrupted binding to certain but not all probes (CS1 for K3I and CS1 and P for R7Q). The differential loss of binding may be explained by different affinities of ORCA3 for the individual probes, which remains to be examined with quantitative analyses.
Evolution of Group IXa ERFs
Mutations of TFs and TF-targeted cis-elements alter expression profiles of targeted genes, resulting in rearrangement of regulatory networks (Dowell, 2010). Because such regulatory mutations are difficult to define due to our limited understanding of DNA-TF interactions, their significance tends to be underestimated. After arising from an original copy through gene duplication, novel TF genes usually undergo functional diversification by changing expression patterns or functional properties based on protein structure, such as DNA-binding specificity and transactivation activity. In general, DNA-binding specificities of TFs evolve very slowly (Amoutzias et al., 2007). Dimer orientation and spacing preferences are sometimes divergent among related TFs that bind to DNA as dimers (Jolma et al., 2013). Because ERFs bind to DNA as monomers (Hao et al., 1998), the divergence that we found here relates to the primary binding specificities of the TFs as monomers and seems rare in that sense (Maerkl and Quake, 2009; Baker et al., 2011).
Group IXa ERF genes from various flowering plants are grouped into five clades. Signature residues for the clades critical to DNA-binding specificities along with other properties are summarized in Figure 10. The residues are conserved in nearly all examined sequences, except for some members mostly from clade 1 (ERF168, Pt14s04630, Br024954, Br040159, Sl8g78180, ERF34, ERF66, ERF123, ERF125, and ERF10/ERF108/ERF146; DBD sequences of the latter three are identical). Based on the distribution of the genes among species, we infer that clade 2 ERFs are specific to dicots and that clade 2-3 ERFs occur widely in various dicot lineages, including woody poplar, whereas clade 2 members outside of clade 2-3 are found only in the family Solanaceae. In particular, clade 2-1 ERFs bearing a unique K6 are specific to the genus Nicotiana, represented by tobacco, and are not found in other Solanaceae species, such as tomato. This Nicotiana spp.-specific distribution was also confirmed by extensive database searching at the SOL Genomics Network (http://solgenomics.net/).
Figure 10.
Properties of group IXa ERFs. Characteristics of each clade are summarized. The amino acid residues at four positions (3, 6, 7, and 12; numbered from the N terminus of the DBD) that are important for DNA-binding specificity, and thus clade distinction, are shown. Basic residues at three positions that interact with the phosphate backbone of DNA are shown with asterisks, while the K6 unique to clade 2-1 is shown in gray. In clade 2-3 ERFs, position 3 is occupied by Arg or Lys, but here only Arg is shown. Confirmed presence of ERF genes of the indicated types in five plant species is indicated by bars. [See online article for color version of this figure.]
Divergent DNA-binding specificities of group IXa ERFs may mirror their functional divergence in biological contexts. Group IXa ERFs play regulatory roles in various forms of jasmonate-dependent chemical defense (De Geyter et al., 2012; Shoji and Hashimoto, 2012b). Tobacco ERF189 and periwinkle ORCA3 are under the control of the basic Helix-Loop-Helix family TF MYC2 in jasmonate signaling (Shoji and Hashimoto, 2011a; Zhang et al., 2011), regulating distinct alkaloid pathways that have evolved independently in each plant lineage (i.e. tobacco and periwinkle belong to the families Solanaceae and Apocynaceae, respectively). Despite their conserved action in the same signaling cascade, ERF189 and ORCA3 have recruited different sets of downstream target genes constituting individual metabolic pathways. Mutational changes are more likely to occur in cis-elements than in TFs, in part because of the short and degenerate nature of cis-elements and the limited effects of the mutations, which influence only the genes controlled by those cis-elements (Wray, 2007). Gain and loss of cis-elements allow dynamic rearrangement of the connections among TFs and their target genes and thus enable recruitment of the genes into regulons under the control of certain TFs (Shoji and Hashimoto, 2011b). A dominant role of clade 2-1 ERFs, including ERF189, as nicotine regulators (Shoji et al., 2010) may explain the presence of only P-type sequences, exclusive targets of this clade of ERFs, in the promoters of tobacco alkaloid genes. On the other hand, elimination of the GCC box from the promoters may keep alkaloid regulation free from influence by a large number of GCC box-recognizing ERFs in tobacco and periwinkle.
Functions of group IXa ERFs are not restricted to alkaloid regulation, of course. The GCC box-dependent regulation by clade 1 and some clade 2 ERFs, some of which are inducible by jasmonate as well, is responsible for the expression of a wide range of defense genes, such as genes for Pathogenesis-Related proteins (Gutterson and Reuber, 2004). In addition, clade 1 AaERF1 from A. annua regulates jasmonate-inducible production of artemisinin, a sesquiterpene lactone used for malaria treatment (Yu et al., 2012). Apart from jasmonate responses, clade 2-3 AtERF13 is assumed to be involved in abiotic stress tolerance, probably controlling multiple downstream genes, because its overexpression conferred hypersensitivity to abscisic acid (Lee et al., 2010).
The different but partially overlapping binding patterns of group IXa ERFs to three types of sequences, P type, CS1 type, and the GCC box (Fig. 10), suggest a gradual transition of binding specificity during evolution, where TFs with intermediate or broader specificities appeared in between TFs with distinct narrower specificities. Such a partial overlap of binding specificities is observed for other TFs, and gradual evolution of their specificities is also postulated (Slattery et al., 2011). According to the presumed distribution of group IXa ERF genes among flowering plants (Fig. 10), we can infer gene appearance during evolution in the following order: clade 1, clade 2-3, clade 2-2, and clade 2-2b or clade 2-1. In this scenario, clade 2-3 ERFs that can bind to all three types of sequence first arose from clade 1 ERFs that bind only to the GCC box by acquiring broader DNA-binding specificities around the time of differentiation between monocots and dicots. Next, partial loss of such broad binding activity led to the generation of clades 2-2, 2-2b, and 2-1 in certain dicot lineages (e.g. Solanaceae and Nicotiana spp.).
Clustering of multiple clade 2 ERFs in a certain chromosomal region was speculated to occur in tobacco (Shoji et al., 2010) and was confirmed in poplar and tomato genomes (Supplemental Fig. S2). There are relatively large numbers of clade 2 genes in some species (e.g. five in poplar, six in tomato, and at least 13 in tobacco), in which most of the ERFs are clustered in tandem, implying relatively recent generation of the genes through repeated gene duplications. In tomato and tobacco, because the ERF clusters include members from different clades (e.g. clades 2-3, 2-2, 2-2b, and 2-1), DNA-binding specificities are divergent even among the clustered genes, and that fact implies their rapid diversification, which could occur by subtle mutational changes in a relatively small number of the signature residues (Fig. 10). Why did clade 2 ERF genes extensively duplicate in certain lineages? An increase in gene number, or gene dosage, leads to higher accumulation of the gene products, or proteins (Kondrashov et al., 2002). In line with this notion, the increased number of tobacco ERF genes may match the requirement for substantial nicotine production in this species. Gene duplications also allow the functional diversification of the genes (Innan and Kondrashov, 2010). The clustered ERF genes of tobacco may have overlapping but distinct roles in nicotine regulation, possibly reflecting divergent DNA-binding specificities, because patterns of jasmonate-dependent induction and effects of overexpression on nicotine production are different among the genes (Shoji et al., 2010). Little is known regarding the biological significance of the lineage-specific expansion and diversification of clade 2, which is especially apparent in the family Solanaceae (i.e. tomato and tobacco). Elucidation of clade 2 ERFs’ functions other than in nicotine regulation, which occurs only in the genus Nicotiana, is awaited.
MATERIALS AND METHODS
EMSA
The pET32-based expression vectors for ERF189, ERF115, ERF179, ERF163, ORCA3, AtERF13, and AtERF1 to express recombinant proteins fused to a thioredoxin, an S-tag, and a His-tag at their N-terminal ends have been described (Shoji et al., 2010; Shoji and Hashimoto, 2012a). Because we failed to express the fusion of full-length Sl1g90340 as a soluble protein using the same vector arrangement, an expression vector was generated by cloning a portion of Sl1g90340 (corresponding to amino acid residues 40–219) into the BamHI and EcoRI sites of pET32b. To generate the mutant versions, PCR-based site-directed mutagenesis (Hemsley et al., 1989) using a high-fidelity Prime Star Max DNA polymerase (Takara) was performed, with the relevant expression vectors as templates. Sequences of the primers used for the mutagenesis are listed in Supplemental Table S2. Recombinant protein was expressed in Escherichia coli BL21 Star (DE3; Novagen), affinity purified, quantified, and stained with Coomassie Brilliant Blue R250 after separation on a 12% (w/v) SDS-PAGE gel (Shoji et al., 2010). The purities of the recombinant proteins are shown in Supplemental Figure S3.
Sense oligonucleotides containing the 10 base sequences shown in Table I and their mutant versions, as described in the text, were flanked by 5′-NNNNNNNN-3′ and 5′-NNNNCCTCGG-3′, where N represents any nucleotide. Sequences including non-GCC boxes shown in Supplemental Table S1 were likewise placed in the center of the oligonucleotides. An antisense oligonucleotide (5′-ACACCGAGG-3′) was biotin-labeled at the 5′ end and annealed to the sense oligonucleotides to generate double-stranded probes (Shoji et al., 2010). The DNA-protein binding assay, gel separation, and detection of DNA-protein complexes have been described (Shoji et al., 2010). The biotin-labeled DNA probes (20 femtomoles) and purified recombinant proteins (2 μg) were used for each binding reaction.
Computational Prediction of ERF Binding Sequences
TESS (http://www.cbil.upenn.edu/tess) was used to search for and score putative ERF binding sequences in the query promoters by weight matrix scoring, adapting EMSA-derived PWMs (Shoji and Hashimoto, 2011b; Supplemental Figs. S7 and S8). Minimum log-likelihood ratio and maximum log-likelihood deficit were set to 2.0 and 8.0, respectively, and the expert parameters were used in the default setting. The promoter sequences from this article can be found in the GenBank/EMBL data libraries under accession numbers AB004323 (PMT2), AJ748263 (QPT2), AB031066 (ODC1), AF233849 (ODC2), AB071165 (A622), AB286963 (MATE1), X53600 (GLN2), Y09417 (CPR), Y10182 (STR), X67662 (TDC), AF008597 (D4H), and L19119 (Hordeum vulgare HVA22).
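Weight matrix scoring of a promoter can be sketched in general terms as follows. This Python example is not the TESS implementation and does not reproduce its score scale. The count matrix is hypothetical (loosely shaped around the GCC box, 5′-AGCCGCC-3′), as are the promoter fragment, the pseudocount, the uniform background, and all function names. The sketch converts base counts into log2 likelihood ratios, sums them over each window on both strands, and keeps windows at or above a cutoff.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2 likelihood ratios."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        matrix.append({
            b: math.log2((column[b] + pseudocount) / total / background)
            for b in BASES
        })
    return matrix


def score(matrix, site):
    """Additive score of one site: the sum of per-position log ratios."""
    return sum(column[base] for column, base in zip(matrix, site))


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def scan(matrix, promoter, min_score):
    """Return (start, strand, site, score) for windows scoring >= min_score."""
    width = len(matrix)
    hits = []
    for start in range(len(promoter) - width + 1):
        window = promoter[start:start + width]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            s = score(matrix, site)
            if s >= min_score:
                hits.append((start, strand, site, round(s, 2)))
    return hits


def column(a=0, c=0, g=0, t=0):
    return {"A": a, "C": c, "G": g, "T": t}


# Hypothetical counts for a 7-bp motif resembling AGCCGCC (not EMSA data).
COUNTS = [
    column(a=8, g=1, t=1), column(g=10), column(c=10), column(c=10),
    column(g=10), column(c=10), column(c=10),
]

# Illustrative cutoff; scores from this toy matrix are not on the TESS scale.
MIN_SCORE = 7.0

matrix = log_odds_matrix(COUNTS)
hits = scan(matrix, "TTAGCCGCCATT", MIN_SCORE)  # hypothetical promoter fragment
for hit in hits:
    print(hit)
```

Because the score is a plain sum over positions, this kind of model shares the additivity assumption discussed for PWMs and cannot represent dependencies between positions.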
Transient Transactivation Assay in Tobacco BY-2 Cells
The reporter plasmids for PMT2pro236-GUS with a PMT2 promoter fragment (–236 to –1; numbered from the first ATG) and its mutant derivative PMT2pro236m4-GUS have been described (Shoji et al., 2010). To generate CS1x4-35Smini-GUS and GCCx4-35Smini-GUS, sense and antisense oligonucleotides that contained four copies of the CS1 or GCC box (Supplemental Table S3) were annealed to generate double-stranded oligonucleotides with cohesive tails for HindIII and SpeI at each end, and the resultant oligonucleotides were inserted into corresponding restriction sites upstream of a minimal Cauliflower mosaic virus (CaMV) 35S promoter (–46 to –1) generated by PCR mutagenesis in pBI221. The effector plasmids for 35S-ERF189, 35S-ORCA3, and 35S-AtERF1, all of which contain the CaMV 35S promoter, have been described (Shoji et al., 2010; Shoji and Hashimoto, 2012a). To introduce the mutations, PCR-based mutagenesis was performed with the appropriate primers (Supplemental Table S2) and the relevant effector vectors as templates. The pBI221-LUC vector harboring LUC under the control of the CaMV 35S promoter was cotransformed as an internal standard. Particle bombardment and subsequent measurement of GUS and LUC activities in extracts of the bombarded tobacco (Nicotiana tabacum) BY-2 cells have been described (Shoji et al., 2010).
Transactivation Assay in Yeast
The vectors pHIS2.1 and pGAD-Rec2-53 included in the Matchmaker One-Hybrid Screening Kit (Clontech) were manipulated. To replace HIS3 in pHIS2.1, the coding sequence of LUC lacking a 9-bp sequence corresponding to the C-terminal three amino acid residues, or mLUC, was amplified from pBI221-LUC by PCR with primers including the appropriate restriction sites. The amplification product was inserted into ApaI and KpnI sites generated by PCR-based mutagenesis between the minimal promoter and the 3′-untranslated region of the HIS3 locus in pHIS2.1, generating mini-mLUC. The removal of the C-terminal residues of LUC is intended to allow mLUC protein to be cytoplasmic to increase its access to the luciferin substrate applied exogenously (Leskinen et al., 2003). Sense and antisense oligonucleotides containing four copies of the P, CS1, or GCC box or their mutant derivatives (Supplemental Table S3) were annealed to generate double-stranded oligonucleotides with cohesive tails for EcoRI and SpeI at each end. The resultant oligonucleotides were placed at the corresponding sites upstream of mini-mLUC to generate the Px4-mini-mLUC, CS1x4-mini-mLUC, and GCCx4-mini-mLUC reporters. To generate the effector plasmids, the p53 coding sequence in pGAD-Rec2-53 was removed and replaced with BamHI and SpeI sites by PCR mutagenesis. Full-length sequences of ERF189, ORCA3, AtERF1, and their mutant versions, amplified from the relevant bacterial expression vectors by PCR with primers attaching a BglII site compatible with BamHI and a SpeI site, were inserted into these sites. Sequence information for primers used for vector construction other than those already listed in Supplemental Tables S2 and S3 is available upon request.
After the reporter and effector plasmids were cotransformed into yeast (Saccharomyces cerevisiae) Y187 strain (Clontech) following a lithium acetate protocol, the transformed colonies were grown on agar plates containing yeast synthetic dropout medium lacking Trp and Leu. Yeast cells were grown in the same liquid medium for 48 h. After the optical density at 600 nm of the culture was adjusted to 0.5 in a total volume of 1.5 mL, the cells were further grown until the optical density reached 1.0. Of this culture, 100-μL aliquots were placed into a 96-well plate and 100 μL of 1 mM D-luciferin in 0.1 M sodium citrate buffer (pH 3.0) was added. After briefly shaking the plate, luminescence from the solutions was measured for 10 s using a LAS-4000 (Fujifilm).
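The volume arithmetic in this protocol can be written out as a small Python sketch. It assumes that optical density is proportional to cell density (so that C1 × V1 = C2 × V2 applies), which is only approximately true for dense cultures. The starting optical density of 2.0 is a hypothetical value, and the function names are illustrative.

```python
def dilution_volumes(od_measured, od_target, total_volume_ml):
    """Volumes of culture and fresh medium that give od_target.

    Uses C1 * V1 = C2 * V2, assuming OD600 scales linearly with cell density.
    """
    if od_target <= 0 or od_measured < od_target:
        raise ValueError("measured OD must be at least the positive target OD")
    culture_ml = total_volume_ml * od_target / od_measured
    medium_ml = total_volume_ml - culture_ml
    return culture_ml, medium_ml


def mixed_concentration(stock_conc, stock_volume, sample_volume):
    """Concentration of a stock after mixing it with a sample volume."""
    return stock_conc * stock_volume / (stock_volume + sample_volume)


# Hypothetical 48-h culture at OD600 = 2.0, adjusted to 0.5 in 1.5 mL.
culture_ml, medium_ml = dilution_volumes(2.0, 0.5, 1.5)
print(f"culture: {culture_ml:.3f} mL, medium: {medium_ml:.3f} mL")

# 100 uL of 1 mM luciferin added to a 100-uL aliquot of culture.
luciferin_mM = mixed_concentration(1.0, 100.0, 100.0)
print(f"luciferin in the well: {luciferin_mM:.2f} mM")
```

With equal volumes of culture and substrate solution, the luciferin concentration in each well is half that of the stock.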
Supplemental Data
The following materials are available in the online version of this article.
Supplemental Figure S1. Multiple sequence alignment of the DBD of group IXa ERF proteins.
Supplemental Figure S2. Positions of group IXa ERF genes on chromosomes in rice, maize, Arabidopsis, B. rapa, poplar, and tomato.
Supplemental Figure S3. Purity of recombinant ERF proteins.
Supplemental Figure S4. In vitro binding of group IXa ERFs to DRE, CBF2, RAV, and CE1 boxes.
Supplemental Figure S5. In vitro binding of mutant versions of ORCA3, ERF179, and AtERF1 to P, CS1, and GCC boxes.
Supplemental Figure S6. Transient transactivation of CS1x4-35Smini-GUS by ORCA3 and GCCx4-35Smini-GUS by AtERF1 in tobacco BY-2 cells.
Supplemental Figure S7. In vitro binding profiles of ORCA3 at P and CS1 boxes.
Supplemental Figure S8. In vitro binding profiles of ORCA3 and AtERF1 at GCC box.
Supplemental Figure S9. In vitro binding of group IXa ERFs to predicted binding sequences in promoters of alkaloid biosynthesis genes.
Supplemental Table S1. Sequences of non-GCC boxes.
Supplemental Table S2. Oligonucleotides used for PCR-based site-directed mutagenesis.
Supplemental Table S3. Oligonucleotides including four copies of P, CS1, and GCC boxes.
Acknowledgments
We thank Kazuyuki Hiratsuka (Yokohama National University) for providing the pBI221-LUC plasmid and advising on the LUC assay in yeast and Masaru Ohme-Takagi (Saitama University) for consulting on vectors containing four copies of the GCC box.
Glossary
- TF
transcription factor
- DBD
DNA-binding domain
- EMSA
electrophoretic mobility shift assay
- BY-2
Bright Yellow-2
- PWM
position weight matrix
- TESS
Transcriptional Element Search Software
- CaMV
Cauliflower mosaic virus
References
- Allen MD, Yamasaki K, Ohme-Takagi M, Tateno M, Suzuki M. (1998) A novel mode of DNA recognition by a β-sheet revealed by the solution structure of the GCC-box binding domain in complex with DNA. EMBO J 17: 5484–5496
- Amoutzias GD, Veron AS, Weiner J, III, Robinson-Rechavi M, Bornberg-Bauer E, Oliver SG, Robertson DL. (2007) One billion years of bZIP transcription factor evolution: conservation and change in dimerization and DNA-binding site specificity. Mol Biol Evol 24: 827–835
- Badis G, Berger MF, Philippakis AA, Talukder S, Gehrke AR, Jaeger SA, Chan ET, Metzler G, Vedenko A, Chen X, et al. (2009) Diversity and complexity in DNA recognition by transcription factors. Science 324: 1720–1723
- Baker CR, Tuch BB, Johnson AD. (2011) Extensive DNA-binding specificity divergence of a conserved transcription regulator. Proc Natl Acad Sci USA 108: 7493–7498
- Benos PV, Bulyk ML, Stormo GD. (2002) Additivity in protein-DNA interactions: how good an approximation is it? Nucleic Acids Res 30: 4442–4451
- Crooks GE, Hon G, Chandonia JM, Brenner SE. (2004) WebLogo: a sequence logo generator. Genome Res 14: 1188–1190
- De Geyter N, Gholami A, Goormachtig S, Goossens A. (2012) Transcriptional machineries in jasmonate-elicited plant secondary metabolism. Trends Plant Sci 17: 349–359
- Dowell RD. (2010) Transcription factor binding variation in the evolution of gene regulation. Trends Genet 26: 468–475
- Fujimoto SY, Ohta M, Usui A, Shinshi H, Ohme-Takagi M. (2000) Arabidopsis ethylene-responsive element binding factors act as transcriptional activators or repressors of GCC box-mediated gene expression. Plant Cell 12: 393–404
- Gutterson N, Reuber TL. (2004) Regulation of disease resistance pathways by AP2/ERF transcription factors. Curr Opin Plant Biol 7: 465–471
- Hao D, Ohme-Takagi M, Sarai A. (1998) Unique mode of GCC box recognition by the DNA-binding domain of ethylene-responsive element-binding factor (ERF domain) in plant. J Biol Chem 273: 26857–26861
- Hao D, Yamasaki K, Sarai A, Ohme-Takagi M. (2002) Determinants in the sequence specific binding of two plant transcription factors, CBF1 and NtERF2, to the DRE and GCC motifs. Biochemistry 41: 4202–4208
- Hemsley A, Arnheim N, Toney MD, Cortopassi G, Galas DJ. (1989) A simple method for site-directed mutagenesis using the polymerase chain reaction. Nucleic Acids Res 17: 6545–6551
- Innan H, Kondrashov F. (2010) The evolution of gene duplications: classifying and distinguishing between models. Nat Rev Genet 11: 97–108
- Jolma A, Yan J, Whitington T, Toivonen J, Nitta KR, Rastas P, Morgunova E, Enge M, Taipale M, Wei G, et al. (2013) DNA-binding specificities of human transcription factors. Cell 152: 327–339
- Kagaya Y, Ohmiya K, Hattori T. (1999) RAV1, a novel DNA-binding protein, binds to bipartite recognition sequence through two distinct DNA-binding domains uniquely found in higher plants. Nucleic Acids Res 27: 470–478
- Kajikawa M, Hirai N, Hashimoto T. (2009) A PIP-family protein is required for biosynthesis of tobacco alkaloids. Plant Mol Biol 69: 287–298
- Kondrashov FA, Rogozin IB, Wolf YI, Koonin EV. (2002) Selection in the evolution of gene duplications. Genome Biol 3: research0008
- Lee SJ, Park JH, Lee MH, Yu JH, Kim SY. (2010) Isolation and functional characterization of CE1 binding proteins. BMC Plant Biol 10: 277
- Leskinen P, Virta M, Karp M. (2003) One-step measurement of firefly luciferase activity in yeast. Yeast 20: 1109–1113
- Liu Q, Kasuga M, Sakuma Y, Abe H, Miura S, Yamaguchi-Shinozaki K, Shinozaki K. (1998) Two transcription factors, DREB1 and DREB2, with an EREBP/AP2 DNA binding domain separate two cellular signal transduction pathways in drought- and low-temperature-responsive gene expression, respectively, in Arabidopsis. Plant Cell 10: 1391–1406
- Liu Y, Zhao TJ, Liu JM, Liu WQ, Liu Q, Yan YB, Zhou HM. (2006) The conserved Ala37 in the ERF/AP2 domain is essential for binding with the DRE element and the GCC box. FEBS Lett 580: 1303–1308
- Maerkl SJ, Quake SR. (2009) Experimental determination of the evolvability of a transcription factor. Proc Natl Acad Sci USA 106: 18650–18655
- Mao L, Wang Y, Liu Y, Hu X. (2003) Multiple intermolecular interaction modes of positively charged residues with adenine in ATP-binding proteins. J Am Chem Soc 125: 14216–14217
- McGrath KC, Dombrecht B, Manners JM, Schenk PM, Edgar CI, Maclean DJ, Scheible WR, Udvardi MK, Kazan K. (2005) Repressor- and activator-type ethylene response factors functioning in jasmonate signaling and disease resistance identified via a genome-wide screen of Arabidopsis transcription factor gene expression. Plant Physiol 139: 949–959
- Nakano T, Suzuki K, Fujimura T, Shinshi H. (2006) Genome-wide analysis of the ERF gene family in Arabidopsis and rice. Plant Physiol 140: 411–432
- Ohme-Takagi M, Shinshi H. (1995) Ethylene-inducible DNA binding proteins that interact with an ethylene-responsive element. Plant Cell 7: 173–182
- Oñate-Sánchez L, Singh KB. (2002) Identification of Arabidopsis ethylene-responsive element binding factors with distinct induction kinetics after pathogen infection. Plant Physiol 128: 1313–1322
- Riechmann JL, Ratcliffe OJ. (2000) A genomic perspective on plant transcription factors. Curr Opin Plant Biol 3: 423–434
- Rushton PJ, Bokowiec MT, Laudeman TW, Brannock JF, Chen X, Timko MP. (2008) TOBFAC: the database of tobacco transcription factors. BMC Bioinformatics 9: 53
- Sakuma Y, Liu Q, Dubouzet JG, Abe H, Shinozaki K, Yamaguchi-Shinozaki K. (2002) DNA-binding specificity of the ERF/AP2 domain of Arabidopsis DREBs, transcription factors involved in dehydration- and cold-inducible gene expression. Biochem Biophys Res Commun 290: 998–1009
- Segal E, Widom J. (2009) From DNA sequence to transcriptional behaviour: a quantitative approach. Nat Rev Genet 10: 443–456
- Shen Q, Ho TH. (1995) Functional dissection of an abscisic acid (ABA)-inducible gene reveals two independent ABA-responsive complexes each containing a G-box and a novel cis-acting element. Plant Cell 7: 295–307
- Shoji T, Hashimoto T. (2011a) Tobacco MYC2 regulates jasmonate-inducible nicotine biosynthesis genes directly and by way of the NIC2-locus ERF genes. Plant Cell Physiol 52: 1117–1130
- Shoji T, Hashimoto T. (2011b) Recruitment of a duplicated primary metabolism gene into the nicotine biosynthesis regulon in tobacco. Plant J 67: 949–959
- Shoji T, Hashimoto T. (2012a) DNA-binding and transcriptional activation properties of tobacco NIC2-locus ERF189 and related transcription factors. Plant Biotechnol 29: 35–42
- Shoji T, Hashimoto T. (2012b) Jasmonate-responsive transcription factors; new tools for metabolic engineering and gene discovery. In Chandra S, Lata H, Varma A, eds, Biotechnology for Medicinal Plants: Micropropagation and Improvement. Springer, Heidelberg, Germany, pp 345–357
- Shoji T, Inai K, Yazaki Y, Sato Y, Takase H, Shitan N, Yazaki K, Goto Y, Toyooka K, Matsuoka K, et al. (2009) Multidrug and toxic compound extrusion-type transporters implicated in vacuolar sequestration of nicotine in tobacco roots. Plant Physiol 149: 708–718
- Shoji T, Kajikawa M, Hashimoto T. (2010) Clustered transcription factor genes regulate nicotine biosynthesis in tobacco. Plant Cell 22: 3390–3409
- Slattery M, Riley T, Liu P, Abe N, Gomez-Alcala P, Dror I, Zhou T, Rohs R, Honig B, Bussemaker HJ, et al. (2011) Cofactor binding evokes latent differences in DNA binding specificity between Hox proteins. Cell 147: 1270–1282
- Tamura K, Dudley J, Nei M, Kumar S. (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 24: 1596–1599
- Thompson JD, Higgins DG, Gibson TJ. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673–4680
- van der Fits L, Memelink J. (2000) ORCA3, a jasmonate-responsive transcriptional regulator of plant primary and secondary metabolism. Science 289: 295–297
- van der Fits L, Memelink J. (2001) The jasmonate-inducible AP2/ERF-domain transcription factor ORCA3 activates gene expression via interaction with a jasmonate-responsive promoter element. Plant J 25: 43–53
- von Hippel PH, Berg OG. (1989) Facilitated target location in biological systems. J Biol Chem 264: 675–678
- Wang S, Yang S, Yin Y, Xi J, Li S, Hao D. (2009) Molecular dynamics simulations reveal the disparity in specific recognition of GCC-box by AtERFs transcription factors super family in Arabidopsis. J Mol Recognit 22: 474–479
- Wray GA. (2007) The evolutionary significance of cis-regulatory mutations. Nat Rev Genet 8: 206–216
- Xue GP. (2003) The DNA-binding activity of an AP2 transcriptional activator HvCBF2 involved in barley is modulated by temperature. Plant J 33: 373–383
- Yamasaki K, Kigawa T, Seki M, Shinozaki K, Yokoyama S. (2012) DNA-binding domains of plant-specific transcription factors: structure, function, and evolution. Trends Plant Sci 12: S1360–S1385
- Yang S, Wang S, Liu X, Yu Y, Yue L, Wang X, Hao D. (2009) Four divergent Arabidopsis ethylene-responsive element-binding factor domains bind to a target DNA motif with a universal CG step core recognition and different flanking bases preference. FEBS J 276: 7177–7186
- Yu ZX, Li JX, Yang CQ, Hu WL, Wang LJ, Chen XY. (2012) The jasmonate-responsive AP2/ERF transcription factors AaERF1 and AaERF2 positively regulate artemisinin biosynthesis in Artemisia annua L. Mol Plant 5: 353–365
- Zhang H, Hedhili S, Montiel G, Zhang Y, Chatel G, Pré M, Gantet P, Memelink J. (2011) The basic helix-loop-helix transcription factor CrMYC2 controls the jasmonate-responsive expression of the ORCA genes that regulate alkaloid biosynthesis in Catharanthus roseus. Plant J 67: 61–71










