The Scientific World Journal. 2012 May 3;2012:983528. doi: 10.1100/2012/983528

Evidence for Directed Evolution of Larger Size Motif in Arabidopsis thaliana Genome

Rajesh Mehrotra 1,2,*, Amit Yadav 1, Purva Bhalothia 1, Ratna Karan 2, Sandhya Mehrotra 1
PMCID: PMC3354754  PMID: 22645502

Abstract

Transcriptional control of gene expression depends on a variety of interactions mediated by the core promoter region, sequence-specific DNA-binding proteins, and their cognate promoter elements. A prominent group of cis-acting elements in plants contains an ACGT core. Cis elements with this core have been shown to be involved in abscisic acid, salicylic acid, and light responses. In this study, we carried out a genome-wide comparison of the frequency of occurrence of two ACGT elements without any spacer and of two ACGT elements separated by spacers of different lengths. In the first step, the frequency of occurrence of the cis element sequences across the whole genome was determined using the BLAST tool. In a second approach, the spacer sequence was randomized before making the query. As expected, the sequence ACGTACGT had the maximum occurrence in the Arabidopsis thaliana genome. As we increased the spacer length one nucleotide at a time, the probability of its occurrence in the genome decreased. This trend continued until an unexpectedly sharp rise in the frequency of (ACGT)N25(ACGT). The higher-than-expected frequency of this larger motif suggests its directed evolution in the Arabidopsis thaliana genome.

1. Introduction

Gene expression in eukaryotic organisms has been a topic of great interest. Careful regulation and recruitment of transcription factors (TFs) to cis regulatory elements in promoter regions generate specificity and diversity [1] in genetic regulation. Promoters are arrays of cis regulatory elements located upstream of a gene and arranged with other specific cis elements. At present, 469 cis elements have been reported in the plant cis regulatory element (PLACE) database. A prominent group of cis-acting elements in plants contains an ACGT core. Several cis elements with this core have been shown to respond to abscisic acid [2–4], salicylic acid [5], and light signals [6]. Foster et al. [7] reported that the bZIP class of transcription factors binds to this core motif. In an elegant study, Krawczyk et al. [8] showed that deletion of two base pairs between the activator sequence-1 (as1) palindromes does not affect binding of activator sequence binding factor (ASF-1) and TGA factors (which bind to the TGACG sequence), whereas insertion decreases factor binding in vitro. In their study, the distance between palindromic centers was 12 base pairs. Mehrotra et al. [9, 10] have shown that these motifs function even when they are placed out of their native context. R. Mehrotra and S. Mehrotra [11] have shown that promoter activation by ACGT in response to salicylic and abscisic acids is differentially regulated by the spacing between these motifs. The motif contributes synergistically to gene expression by stabilising the transcription complex formed on the minimal promoter [10]. The present study is an extension of the aforementioned work. In this study, we carried out a genome-wide comparison of the frequency of occurrence of two ACGT elements without any spacer and of two ACGT elements separated by spacers of different lengths. Based on the data obtained, we report that there has been directed evolution of a larger motif in the Arabidopsis thaliana genome.

2. Materials and Methods

The objective was to find out the frequency of the recurring sequences and then use these recurring sequences with a random minimal promoter to predict transcription factors likely to interact with them.

The genomic sequence of Arabidopsis thaliana, available from The Arabidopsis Information Resource (TAIR, http://www.arabidopsis.org/), was analyzed using BLASTn (available at the NCBI website). All query sequences were run in BLASTn against the whole Arabidopsis thaliana genome to determine their frequency of occurrence. The accession numbers of the Arabidopsis thaliana chromosomes are as follows: chromosome 1: NC_003070.9, chromosome 2: NC_003071.7, chromosome 3: NC_003074.8, chromosome 4: NC_003075.7, and chromosome 5: NC_003076.8.
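As a rough cross-check on such frequencies, exact occurrences of a query can also be tallied directly from a chromosome sequence. The sketch below is not the BLASTn procedure used in this study: it swaps in a plain exact-match scan of both strands, so its counts would not be expected to equal BLAST hit counts. The function names and the toy sequence are illustrative assumptions.

```python
# Exact-match counting on both strands (an illustrative alternative to BLASTn,
# not the method used in the study).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_exact(sequence: str, query: str) -> int:
    """Count occurrences of query in sequence, allowing overlaps."""
    count = 0
    start = sequence.find(query)
    while start != -1:
        count += 1
        start = sequence.find(query, start + 1)
    return count


def count_both_strands(sequence: str, query: str) -> int:
    """Count query on the given strand and on the opposite strand.

    A query that equals its own reverse complement (for example ACGTACGT)
    is counted once, so palindromic queries are not double counted.
    """
    total = count_exact(sequence, query)
    rc = reverse_complement(query)
    if rc != query:
        total += count_exact(sequence, rc)
    return total


# Toy sequence, for illustration only (not genomic data).
toy = "TTACGTACGTACGTAA"
print(count_both_strands(toy, "ACGTACGT"))
```

In practice the sequence argument would be one chromosome read from a FASTA file, and the per-chromosome counts would be summed to give a genome-wide total.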

Randomization of the sequences was carried out using the SHUFFLE program [12]. The different sequences obtained are listed in Table 1. In the next step, we identified the transcription factors predicted to bind to these cis elements separated by spacers of different lengths. A 139 bp long minimal promoter, Pmec [13], was used in this study. The minimal promoter sequence, given after Table 1, was suffixed to each of the sequences shown in Table 1.

Table 1.

Frequency of occurrence of the various promoter sequences in which spacer sequence length between two ACGT palindromes is gradually increased from 5 to 25 nucleotides.

Cis element | Sequence | Chromosome 1 | Chromosome 2 | Chromosome 3 | Chromosome 4 | Chromosome 5 | Total
(ACGT)2 | ACGTACGT | 469 | 312 | 367 | 327 | 410 | 1885
(ACGT)8 | ACGTACGTACGTACGTACGTACGTACGTACGT | 70 | 31 | 12 | 28 | 59 | 200
(ACGT)N5(ACGT) | ACGTGGCTAACGT | 16 | 11 | 13 | 13 | 19 | 72
(ACGT)N10(ACGT) | ACGTGGCTATGGCGACGT | 8 | 5 | 10 | 4 | 12 | 39
(ACGT)N25(ACGT) | ACGTGGCTATGGCGGAGCAAGATTCACTCACGT | 15 | 12 | 13 | 9 | 13 | 62
(ACGT)RN5(ACGT) | ACGT–GCTAG–ACGT | 7 | 5 | 5 | 2 | 4 | 23
(ACGT)RN10(ACGT) | ACGT–TGGGGCCGAT–ACGT | 2 | 2 | 4 | 3 | 3 | 14
(ACGT)RN25(ACGT) | ACGTAGACACGTTGGGGGAACTTACTGCCACGT | 3 | 1 | 7 | 5 | 5 | 21
(ACGT)RN25(ACGT) | ACGT-ATATGAGATCGGCGCTTCACGGAGC-ACGT | 4 | 14 | 6 | 4 | 4 | 32
(ACGT)N5(ACGT) randomized | GGAATCCTTGGCA | 41 | 24 | 30 | 19 | 23 | 137
(ACGT)N10(ACGT) randomized | GCGGGCTATCGGTAGCAT | 2 | 5 | 2 | 0 | 1 | 10
(ACGT)N25(ACGT) randomized | TAAGGCTTAGCCACGCTTAGGGTGTGAGCACAC | 6 | 6 | 3 | 0 | 3 | 18
(TGCA)N25(TGCA) | TGCAGGCTATGGCGGAGCAAGATTCACTCTGCA | 13 | 12 | 9 | 12 | 9 | 55

N5, N10, and N25 denote the spacer length (in nucleotides) between the two ACGT palindromes. RN5, RN10, and RN25 signify that only the spacer sequence was randomized. "(ACGT)N_(ACGT) randomized" signifies that the complete sequence was randomized.

Minimal promoter (Pmec) sequence: TCACTATATATAGGAAGTTCATTTCATTTGGAATGGACACGTGTTGTCATTTCTCAACAATTACCAACAACAACAAACAACAAACAACATTATACAATTACTATTTACAATTACATCTAGATAAACAATGGCTTCCTCC

These extended sequences were scanned against the JASPAR CORE database [14] for transcription factor binding sites, and the predicted TFs were then cross-checked against the results obtained from ConSite [15].
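The two randomization schemes described above (spacer only, and complete sequence) can be sketched as follows. This is a minimal illustration that assumes a composition-preserving permutation and uses Python's random.shuffle in place of the SHUFFLE program [12], whose exact algorithm is not described here; the function names and the seed parameter are hypothetical.

```python
import random
from typing import Optional


def shuffle_spacer(query: str, core: str = "ACGT", seed: Optional[int] = None) -> str:
    """Permute only the spacer, keeping the flanking core motifs in place."""
    if len(query) < 2 * len(core) or not (query.startswith(core) and query.endswith(core)):
        raise ValueError("query must start and end with the core motif")
    spacer = list(query[len(core):len(query) - len(core)])
    random.Random(seed).shuffle(spacer)  # in-place permutation, composition preserved
    return core + "".join(spacer) + core


def shuffle_all(query: str, seed: Optional[int] = None) -> str:
    """Permute every base of the query, so the core motifs may be destroyed."""
    bases = list(query)
    random.Random(seed).shuffle(bases)
    return "".join(bases)


# (ACGT)N5(ACGT) query from Table 1; the seed is arbitrary.
original = "ACGTGGCTAACGT"
print(shuffle_spacer(original, seed=1))
print(shuffle_all(original, seed=1))
```

Passing a seed makes a given shuffled query reproducible, which helps when the same randomized sequence has to be resubmitted.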

3. Results and Discussion

3.1. Promoters with Greater Length between ACGT Motifs Are More Frequent

It has been reported that ACGT cis elements function even when they are placed out of their native sequence context [9, 10]. When two ACGT elements are separated by 5 base pairs and by 10 base pairs, they are induced in response to salicylic acid (SA) and abscisic acid (ABA), respectively. Interestingly, SA mimics the biotic stress response and ABA mimics the abiotic stress response in plants, which makes these spacings of great interest to plant biologists. Paixão and Azevedo [16] showed that multiplicity of cis elements evolved through transitional forms showing redundant cis regulation. In this study, when we compared the frequency of occurrence of two ACGT elements without any spacer and of two ACGT elements separated by spacers of different lengths, we found that the total frequency of occurrence of two ACGT elements in tandem is 1885 (Table 1), while the e value was the same for all alignments obtained on a particular chromosome. When two ACGT elements were separated by spacers of 5, 10, and 25 nucleotides, their frequencies of occurrence were 72, 39, and 62, respectively. An unexpectedly high frequency of occurrence was thus observed when the two ACGT elements were separated by 25 base pairs. According to the rule of probability, the frequency of two ACGT elements separated by 25 base pairs should be lower than when they are separated by 10 base pairs or fewer. Hobo et al. [17] have earlier reported that in ABA-responsive promoters the distance between ACGT elements is 30 base pairs. To address this discrepancy in the data obtained, we randomized the spacer sequence while keeping the ACGT motifs unchanged. The logic of this randomization was to identify how important the distance between the transcription factor binding sites is. After randomization of the spacer, the frequency of occurrence dropped from 72, 39, and 62 to 23, 14, and 21 for (ACGT)N5(ACGT), (ACGT)N10(ACGT), and (ACGT)N25(ACGT), respectively.
This means that, along with the distance between binding motifs, there has been positive selection for the sequence of the spacer in transcriptional regulation. In the next step, we completely randomized the sequence and observed a drop in the frequency of occurrence when the two ACGT elements were separated by 10 and 25 base pairs, whereas there was an unexpected increase in frequency when the ACGT elements were separated by five base pairs. We attribute this increase to the randomization having generated a motif that has been positively selected in evolution.
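The rule of probability invoked above can be made concrete under a simple null model. The sketch below assumes, purely for illustration, that bases are independent and equally likely, and that each query is one fixed sequence; neither assumption comes from the paper. The genome length used is an assumed round figure, not a value taken from the study.

```python
def expected_exact_count(genome_length: int, query_length: int, strands: int = 1) -> float:
    """Expected number of exact occurrences of one fixed query.

    Assumes independent, equiprobable bases, so each start position matches
    with probability (1/4) ** query_length.
    """
    positions = max(genome_length - query_length + 1, 0)
    return strands * positions * 0.25 ** query_length


GENOME_LENGTH = 120_000_000  # assumed round figure, for illustration only

# Query lengths follow Table 1: 8 core bases plus the spacer length.
for label, length in [
    ("(ACGT)2", 8),
    ("(ACGT)N5(ACGT)", 13),
    ("(ACGT)N10(ACGT)", 18),
    ("(ACGT)N25(ACGT)", 33),
]:
    print(f"{label}: {expected_exact_count(GENOME_LENGTH, length):.3g}")
```

Under this model each added spacer nucleotide lowers the expectation by roughly a factor of four, which is the sense in which a fixed 25-nucleotide spacer is expected to be rarer than a fixed 10-nucleotide one. The absolute values are not directly comparable with the BLAST hit counts in Table 1, which may include partial alignments.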

3.2. A and G Are the Preferred Bases

We increased the spacer length one residue at a time and determined the frequency of each resultant sequence in the database. As shown in Table 2, there is a preference for A and G in the spacer region between the two ACGT sequences.

Table 2.

Frequency of occurrence of nitrogenous bases when spacer sequence length between two ACGT palindromes is gradually increased from 5 to 25 nucleotides.

Cis element | Sequence | A | C | G | T | Seq. used | Gap | Count
(ACGT)N5(ACGT) | ACGTGGCT_ACGT | 72 | 42 | 33 | 34 | 72 | 5 | 690
(ACGT)N6(ACGT) | ACGTGGCTA_ACGT | 98 | 65 | 45 | 44 | 44 | 6 | 611
(ACGT)N7(ACGT) | ACGTGGCTAT_ACGT | 92 | 91 | 77 | 80 | 77 | 7 | 824
(ACGT)N8(ACGT) | ACGTGGCTATG_ACGT | 97 | 30 | 64 | 55 | 64 | 8 | 852
(ACGT)N9(ACGT) | ACGTGGCTATGG_ACGT | 39 | 32 | 22 | 32 | 32 | 9 | 602
(ACGT)N10(ACGT) | ACGTGGCTATGGC_ACGT | 34 | 36 | 39 | 66 | 39 | 10 | 600
(ACGT)N11(ACGT) | ACGTGGCTATGGCG_ACGT | 36 | 23 | 38 | 29 | 38 | 11 | 681
(ACGT)N12(ACGT) | ACGTGGCTATGGCGG_ACGT | 56 | 54 | 65 | 45 | 56 | 12 | 638
(ACGT)N13(ACGT) | ACGTGGCTATGGCGGA_ACGT | 78 | 50 | 77 | 59 | 77 | 13 | 652
(ACGT)N14(ACGT) | ACGTGGCTATGGCGGAG_ACGT | 86 | 53 | 96 | 52 | 53 | 14 | 841
(ACGT)N15(ACGT) | ACGTGGCTATGGCGGAGC_ACGT | 56 | 67 | 44 | 66 | 56 | 15 | 709
(ACGT)N16(ACGT) | ACGTGGCTATGGCGGAGCA_ACGT | 60 | 34 | 52 | 34 | 60 | 16 | 843
(ACGT)N17(ACGT) | ACGTGGCTATGGCGGAGCAA_ACGT | 39 | 41 | 42 | 39 | 42 | 17 | 830
(ACGT)N18(ACGT) | ACGTGGCTATGGCGGAGCAAG_ACGT | 49 | 47 | 58 | 48 | 49 | 18 | 719
(ACGT)N19(ACGT) | ACGTGGCTATGGCGGAGCAAGA_ACGT | 50 | 38 | 49 | 44 | 44 | 19 | 695
(ACGT)N20(ACGT) | ACGTGGCTATGGCGGAGCAAGAT_ACGT | 34 | 30 | 44 | 37 | 37 | 20 | 821
(ACGT)N21(ACGT) | ACGTGGCTATGGCGGAGCAAGATT_ACGT | 36 | 40 | 42 | 43 | 40 | 21 | 717
(ACGT)N22(ACGT) | ACGTGGCTATGGCGGAGCAAGATTC_ACGT | 53 | 42 | 42 | 46 | 53 | 22 | 726
(ACGT)N23(ACGT) | ACGTGGCTATGGCGGAGCAAGATTCA_ACGT | 91 | 55 | 60 | 61 | 55 | 23 | 771
(ACGT)N24(ACGT) | ACGTGGCTATGGCGGAGCAAGATTCAC_ACGT | 77 | 64 | 57 | 53 | 53 | 24 | 1171
(ACGT)N25(ACGT) | ACGTGGCTATGGCGGAGCAAGATTCACT_ACGT | 76 | 62 | 58 | 69 | 62 | 25 | 708
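One way to summarize Table 2 is to ask, for each spacer length, which base at the tested position gave the highest frequency, and then to tally those winners. The sketch below does this with the A, C, G, and T values transcribed from Table 2; the helper names are illustrative, and picking the single maximum per row is an assumed summary rule rather than the authors' stated procedure.

```python
from collections import Counter

BASES = "ACGT"

# Frequencies for A, C, G, T at the tested position, keyed by spacer length
# (values transcribed from Table 2).
TABLE2 = {
    5: (72, 42, 33, 34),   6: (98, 65, 45, 44),   7: (92, 91, 77, 80),
    8: (97, 30, 64, 55),   9: (39, 32, 22, 32),  10: (34, 36, 39, 66),
    11: (36, 23, 38, 29), 12: (56, 54, 65, 45),  13: (78, 50, 77, 59),
    14: (86, 53, 96, 52), 15: (56, 67, 44, 66),  16: (60, 34, 52, 34),
    17: (39, 41, 42, 39), 18: (49, 47, 58, 48),  19: (50, 38, 49, 44),
    20: (34, 30, 44, 37), 21: (36, 40, 42, 43),  22: (53, 42, 42, 46),
    23: (91, 55, 60, 61), 24: (77, 64, 57, 53),  25: (76, 62, 58, 69),
}


def preferred_base(counts):
    """Return the base with the highest frequency (ties go to the first in A, C, G, T order)."""
    return BASES[max(range(len(BASES)), key=lambda i: counts[i])]


def tally_preferred(table):
    """Count how often each base is the most frequent one across spacer lengths."""
    return Counter(preferred_base(counts) for counts in table.values())


print(tally_preferred(TABLE2))
```

With these values, A is the most frequent base at 12 of the 21 spacer lengths and G at 6, with T and C accounting for the remaining 3.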

3.3. Increasing Spacing between Motifs Increases Transcription Factor Binding Sites

Potential transcription factor binding sites for all experimental sequences were predicted using JASPAR CORE and subsequently cross-checked with ConSite. The minimal promoter sequence alone possesses 35 potential TF binding sites (Table 3, MPS). Interestingly, the sequence ACGT by itself has no predicted site for transcription factor binding, but when the minimal promoter is suffixed to it, an extra site for squamosa is generated and the total number of transcription factor binding sites increases from 35 in the minimal promoter alone to 36 (Table 3, (ACGT)(MPS)). When two ACGT elements in tandem are placed before the minimal promoter sequence, no net gain of binding sites is observed and the total remains 36 (Table 3, (ACGT)2(MPS)). However, when the ACGT elements are separated by five base pairs (Table 3, (ACGT)N5(ACGT)(MPS)), four additional transcription factor binding sites are generated, while one ATHB-5 binding site present in the preceding construct is lost. The new sites are for the transcription factors bZIP910, EmBP-1, myb.Ph3, and TGA1a. Placement of two ACGT elements separated by 10 base pairs, however, results in the loss of one myb.Ph3 site, and the total number of binding sites decreases to 38 (Table 3, (ACGT)N10(ACGT)(MPS)). When the ACGT elements are separated by 25 base pairs and followed by the minimal promoter, additional sites for ARR10 and Dof3 are generated (Table 3, (ACGT)N25(ACGT)(MPS)).

Table 3.

Alterations in transcription factor binding sites when spacer sequence length between two ACGT palindromes is gradually increased from 5 to 25 nucleotides.

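Model name (frequency) | Minimal promoter sequence (MPS) | (ACGT) | (ACGT)(MPS) | (ACGT)2(MPS) | (ACGT)N5(ACGT)(MPS) | (ACGT)N10(ACGT)(MPS) | (ACGT)N25(ACGT)(MPS)
ARR10 | 0 | 0 | 0 | 0 | 0 | 0 | 1
AGL3 | 2 | 0 | 2 | 2 | 2 | 2 | 2
ATHB-5 | 1 | 0 | 1 | 2 | 1 | 1 | 1
bZIP910 | 0 | 0 | 0 | 0 | 1 | 1 | 1
Dof3 | 1 | 0 | 1 | 1 | 1 | 1 | 2
EmBP-1 | 2 | 0 | 2 | 1 | 2 | 2 | 2
Gamyb | 5 | 0 | 5 | 5 | 5 | 5 | 5
HAT5 | 2 | 0 | 2 | 2 | 2 | 2 | 2
HMG-1 | 6 | 0 | 6 | 6 | 6 | 6 | 6
HMG-I/Y | 6 | 0 | 6 | 6 | 6 | 6 | 6
id1 | 5 | 0 | 5 | 5 | 5 | 5 | 5
myb.Ph3 | 1 | 0 | 1 | 1 | 2 | 1 | 1
PEND | 1 | 0 | 1 | 1 | 1 | 1 | 1
squamosa | 2 | 0 | 3 | 3 | 3 | 3 | 3
TGA1A | 1 | 0 | 1 | 1 | 2 | 2 | 2
Total | 35 | 0 | 36 | 36 | 39 | 38 | 40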
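The gains and losses of predicted binding sites between constructs can be read off Table 3 mechanically. The sketch below transcribes the table into a dictionary and reports, for any two constructs, which models changed and by how much; the data structure and function names are illustrative assumptions, and no JASPAR or ConSite call is made.

```python
CONSTRUCTS = [
    "MPS", "(ACGT)", "(ACGT)(MPS)", "(ACGT)2(MPS)",
    "(ACGT)N5(ACGT)(MPS)", "(ACGT)N10(ACGT)(MPS)", "(ACGT)N25(ACGT)(MPS)",
]

# Predicted site counts per model, one value per construct (from Table 3).
SITES = {
    "ARR10":    [0, 0, 0, 0, 0, 0, 1],
    "AGL3":     [2, 0, 2, 2, 2, 2, 2],
    "ATHB-5":   [1, 0, 1, 2, 1, 1, 1],
    "bZIP910":  [0, 0, 0, 0, 1, 1, 1],
    "Dof3":     [1, 0, 1, 1, 1, 1, 2],
    "EmBP-1":   [2, 0, 2, 1, 2, 2, 2],
    "Gamyb":    [5, 0, 5, 5, 5, 5, 5],
    "HAT5":     [2, 0, 2, 2, 2, 2, 2],
    "HMG-1":    [6, 0, 6, 6, 6, 6, 6],
    "HMG-I/Y":  [6, 0, 6, 6, 6, 6, 6],
    "id1":      [5, 0, 5, 5, 5, 5, 5],
    "myb.Ph3":  [1, 0, 1, 1, 2, 1, 1],
    "PEND":     [1, 0, 1, 1, 1, 1, 1],
    "squamosa": [2, 0, 3, 3, 3, 3, 3],
    "TGA1A":    [1, 0, 1, 1, 2, 2, 2],
}


def column_totals():
    """Total predicted sites per construct, in CONSTRUCTS order."""
    return [sum(row[i] for row in SITES.values()) for i in range(len(CONSTRUCTS))]


def site_changes(before, after):
    """Models whose site count differs between two constructs (after minus before)."""
    i, j = CONSTRUCTS.index(before), CONSTRUCTS.index(after)
    return {tf: row[j] - row[i] for tf, row in SITES.items() if row[j] != row[i]}


print(column_totals())
print(site_changes("(ACGT)2(MPS)", "(ACGT)N5(ACGT)(MPS)"))
```

Comparing (ACGT)2(MPS) with (ACGT)N5(ACGT)(MPS) in this way reproduces the four gained sites and the one lost ATHB-5 site described in the text.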

Based on the data obtained in this study, we report here that there has been directed evolution of bigger size of the motif in the Arabidopsis thaliana genome.

4. Conclusions

The central question in promoter evolution is how cis regulatory element multiplicity evolved. The promoter regions of many genes contain multiple binding sites for the same transcription factor. Multiplicity may have evolved through transitional forms showing redundant cis regulation. In this paper, we focused on the multiplicity of the ACGT cis element and on the distances between copies of the element that occur in natural promoters. We found that ACGT elements separated by 25 base pairs are more frequent than those separated by 10 base pairs, which is contrary to what probability alone would predict. This signifies that this interval was favoured under some evolutionary forces, since this distance may cause changes in the level of gene expression or in its robustness against variation in transcription factor concentration. Selection for different levels of expression of certain genes in certain environments could, over time, generate a positive association between cis element multiplicity and expression level.

Acknowledgments

The authors are grateful to the Department of Science and Technology, New Delhi, India for Grant-in-Aid and financial support to carry out this work bearing the file no. SR/FT/LS-126/2008. The authors are grateful to the administration of Birla Institute of Technology and Sciences, Pilani, Rajasthan for providing logistic support. They are thankful to Professor C. Gatz for critically reading the paper.

References

  • 1. Wray GA, Hahn MW, Abouheif E, et al. The evolution of transcriptional regulation in eukaryotes. Molecular Biology and Evolution. 2003;20(9):1377–1419. doi: 10.1093/molbev/msg140.
  • 2. Guiltinan MJ, Marcotte WR, Quatrano RS. A plant leucine zipper protein that recognizes an abscisic acid response element. Science. 1990;250(4978):267–271. doi: 10.1126/science.2145628.
  • 3. Shen Q, Ho TH. Functional dissection of an abscisic acid (ABA)-inducible gene reveals two independent ABA-responsive complexes each containing a G-box and a novel cis-acting element. Plant Cell. 1995;7(3):295–307. doi: 10.1105/tpc.7.3.295.
  • 4. Busk PK, Pagès M. Regulation of abscisic acid-induced transcription. Plant Molecular Biology. 1998;37(3):425–435. doi: 10.1023/a:1006058700720.
  • 5. Jupin I, Chua NH. Activation of the CaMV as-1 cis-element by salicylic acid: differential DNA-binding of a factor related to TGA1a. EMBO Journal. 1996;15(20):5679–5689.
  • 6. Donald RGK, Cashmore AR. Mutation of either G box or I box sequences profoundly affects expression from the Arabidopsis rbcS-1A promoter. EMBO Journal. 1990;9(6):1717–1726. doi: 10.1002/j.1460-2075.1990.tb08295.x.
  • 7. Foster R, Izawa T, Chua NH. Plant basic leucine zipper proteins gather at ACGT elements. FASEB Journal. 1994;8:192–200. doi: 10.1096/fasebj.8.2.8119490.
  • 8. Krawczyk S, Thurow C, Niggeweg R, Gatz C. Analysis of the spacing between the two palindromes of activation sequence-1 with respect to binding to different TGA factors and transcriptional activation potential. Nucleic Acids Research. 2002;30(3):775–781. doi: 10.1093/nar/30.3.775.
  • 9. Mehrotra R, Kiran K, Chaturvedi CP, et al. Effect of copy number and spacing of the ACGT and GT cis elements on transient expression of minimal promoter in plants. Journal of Genetics. 2005;84(2):183–187. doi: 10.1007/BF02715844.
  • 10. Sawant SV, Kiran K, Mehrotra R, et al. A variety of synergistic and antagonistic interactions mediated by cis-acting DNA motifs regulate gene expression in plant cells and modulate stability of the transcription complex formed on a basal promoter. Journal of Experimental Botany. 2005;56(419):2345–2353. doi: 10.1093/jxb/eri227.
  • 11. Mehrotra R, Mehrotra S. Promoter activation by ACGT in response to salicylic and abscisic acids is differentially regulated by the spacing between two copies of the motif. Journal of Plant Physiology. 2010;167(14):1214–1218. doi: 10.1016/j.jplph.2010.04.005.
  • 12. Doelz L. BioCompanion. Basel, Switzerland: Dr. Ing. U. Doelz; 1990. (Biocomputing Essentials Series).
  • 13. Sawant S, Singh PK, Madanala R, Tuli R. Designing of an artificial expression cassette for the high-level expression of transgenes in plants. Theoretical and Applied Genetics. 2001;102(4):635–644.
  • 14. Bryne JC, Valen E, Tang MHE, et al. JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update. Nucleic Acids Research. 2008;36(1):D102–D106. doi: 10.1093/nar/gkm955.
  • 15. Sandelin A, Wasserman WW, Lenhard B. ConSite: web-based prediction of regulatory elements using cross-species comparison. Nucleic Acids Research. 2004;32:W249–W252. doi: 10.1093/nar/gkh372.
  • 16. Paixão T, Azevedo RBR. Redundancy and the evolution of cis-regulatory element multiplicity. PLoS Computational Biology. 2010;6(7). Article ID e1000848. doi: 10.1371/journal.pcbi.1000848.
  • 17. Hobo T, Asada M, Kowyama Y, Hattori T. ACGT-containing abscisic acid response element (ABRE) and coupling element 3 (CE3) are functionally equivalent. Plant Journal. 1999;19(6):679–689. doi: 10.1046/j.1365-313x.1999.00565.x.
