Nucleic Acids Research. 2013 Nov 3;42(Database issue):D1188–D1192. doi: 10.1093/nar/gkt1027

ppdb: plant promoter database version 3.0

Ayaka Hieno 1, Hushna Ara Naznin 1, Mitsuro Hyakumachi 1,2, Tetsuya Sakurai 3, Mututomo Tokizawa 1, Hiroyuki Koyama 1,2, Naoki Sato 4, Tomoaki Nishiyama 5, Mitsuyasu Hasebe 6,7, Andreas D Zimmer 8, Daniel Lang 8, Ralf Reski 8, Stefan A Rensing 9, Junichi Obokata 10, Yoshiharu Y Yamamoto 1,2,3,*
PMCID: PMC3965062  PMID: 24194597

Abstract

ppdb (http://ppdb.agr.gifu-u.ac.jp) is a plant promoter database that provides information on transcription start sites (TSSs), core promoter structure (TATA boxes, Initiators, Y Patches, GA and CA elements) and regulatory element groups (REGs) as putative and comprehensive transcriptional regulatory elements. Since the last report in this journal, the database has been updated in three areas to version 3.0. First, new genomes have been included in the database, and ppdb now provides information on Arabidopsis thaliana, rice, Physcomitrella patens and poplar. Second, new TSS tag data (34 million tags) from A. thaliana, determined by a high throughput sequencer, have been added to give a ∼200-fold increase in TSS data compared with version 1.0. This results in a much higher coverage of ∼27 000 A. thaliana genes and finer positioning of promoters, even for genes with low expression levels. Third, microarray data-based predictions have been appended as REG annotations that indicate the putative physiological roles of the REGs.

INTRODUCTION

Gene regulation is central to morphogenesis and environmental adaptation in higher plants, and it is controlled by the promoter of each gene. Understanding promoter structure is therefore crucial to understanding these fundamental processes.

There are three aspects to promoter structure: (i) the position, direction and strength of the transcription start sites (TSSs) that indicate actual promoter position; (ii) the type and position of the core promoter elements such as TATA boxes and Initiators (Inrs) that are thought to be the major determinants of the direction and position of promoters and (iii) the type and position of transcriptional regulatory elements that are involved in gene regulation.

In our last report (1), we introduced the plant promoter database (ppdb), which provided promoter information about TSS clusters, core promoter elements [TATA boxes, Inrs, Y Patches, GA and CA elements (2,3)] and regulatory element groups [REGs, putative position-sensitive transcriptional regulatory elements that are extracted by local distribution of short sequences (LDSS) analysis (2)] as putative and comprehensive sets of transcriptional regulatory elements. The original version 1.0 of the database contained information on two plant species, Arabidopsis thaliana and rice.

MAJOR EXTENSIONS FROM VERSION 1.0

The major amendment in version 3.0 is the addition of the Physcomitrella patens and poplar genomes to the database. The data sources for the four genomes, including A. thaliana and rice, are shown in Table 1. The promoter elements of the moss genome were extracted by the LDSS method (2). During extraction, we noticed that a considerable number of moss genes are driven by a similar type of promoter located within long terminal repeats. Because the tight sequence conservation of these promoters is unrelated to promoter function and distorts the extraction, they were excluded from the LDSS analysis. A. thaliana promoter elements have been applied to the poplar genome because the Brassicaceae and Malpighiales are phylogenetically close.
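The idea of position-sensitive extraction can be illustrated with a small sketch. The Python below is not the LDSS implementation of ref. 2. It only counts how often each short sequence occurs at each offset in promoters aligned at the TSS, and scores how concentrated those counts are. The function names, the toy sequences, the use of 4-mers instead of octamers and the peak-to-mean score are all illustrative assumptions.

```python
from collections import defaultdict


def positional_counts(promoters, k=8):
    """Count occurrences of each k-mer at each offset.

    Assumes equal-length sequences aligned so that the same index
    corresponds to the same position relative to the TSS.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in promoters:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]][i] += 1
    return counts


def localization_score(position_counts, n_positions):
    """Peak count divided by the mean count over all offsets.

    A crude stand-in for a localization statistic: values well above 1
    mean the occurrences cluster at one offset.
    """
    total = sum(position_counts.values())
    if total == 0:
        return 0.0
    return max(position_counts.values()) * n_positions / total


# Toy promoters (invented): "TATA" sits at offset 2 in all three.
toy_promoters = ["GGTATACCGGCC", "CCTATAGGCCGG", "AATATATTGGAA"]
toy_counts = positional_counts(toy_promoters, k=4)
n_offsets = len(toy_promoters[0]) - 4 + 1
tata_score = localization_score(toy_counts["TATA"], n_offsets)
```

A set of near-identical promoters, such as those within long terminal repeats, would make every k-mer they contain look sharply localized in a count of this kind, which is consistent with the decision to exclude them.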

Table 1. Source of ppdb version 3.0

Specification | Source | Size
A. thaliana
    Genome sequence and gene annotation | TAIR9 http://www.arabidopsis.org/, (4) |
    TSS information | Selected RAFL cDNA http://rarge.gsc.riken.jp/, (5) | 62 108 (clones^a)
    TSS information | Cap signature CT-MPSS tags (3) | 158 237 (tags^b)
    TSS information | Oligo-Cap Illumina data, Tokizawa M, Yamanaka H, Koyama H, Sakurai T, Kurotani A, Shinozaki K, Suzuki Y, Sugano S, Obokata J, Yamamoto YY (unpublished data) | 34 206 936 (tags^b)
    Promoter elements | A. thaliana LDSS-positive octamers (2,3) | 659 (octamers^c)
    Promoter elements | Annotation for LDSS elements: PLACE http://www.dna.affrc.go.jp/PLACE/, (6) | 21 (only matched motifs^d)
    Promoter elements | Annotation for LDSS elements: stress and hormonal responses (7) | 53 (only matched motifs^c)
Rice (Oryza sativa)
    Genome sequence and gene annotation | RGSP build 4.0 http://rapdb.lab.nig.ac.jp/, (8) |
    TSS information | Carefully selected fl-cDNA (from KOME) http://cdna01.dna.affrc.go.jp/cDNA/, (9) | 17 286 (clones^a)
    Promoter elements | Rice LDSS-positive octamers (2,10) | 660 (octamers^c)
    Promoter elements | Annotation for LDSS elements: PLACE http://www.dna.affrc.go.jp/PLACE/, (6) | 4 (only matched motifs^d)
Moss (P. patens)
    Genome sequence and gene annotation | JGI version 1.1, COSMOSS V1.6 http://www.cosmos.org, (11,12) |
    TSS information | 5′ CAGE (13) | 1 122 382 (tags^b)
    Promoter elements | P. patens LDSS-positive octamers, this work | 198 (octamers^c)
Poplar (Populus trichocarpa)
    Genome sequence and gene annotation | Phytozome6 http://www.phytozome.net/poplar, (14) |
    TSS information | FL-cDNA info from GenBank (15) | 15 256 (clones^a, BP921855–937111); 36 103 (clones^a, DB874873–910976)
    Promoter elements | A. thaliana LDSS-positive octamers (2,3) | 659 (octamers^c)
Orthologue gene
    Orthologue group | Gclust (16) | 336 689 (families^e)

^a clone number, ^b tag number, ^c number of octamer sequences, ^d number of motifs and ^e number of orthologue families.

A new function called ‘Homologue Gene Search’ has been added to facilitate the comparison of promoter structures of orthologous genes within a species or between different species. Orthologue groups have been determined by Gclust, a system that classifies orthologues according to the presence or absence of protein motifs (16).

New A. thaliana TSS data of 34 million tags, a ∼200-fold increase over the previous data, have been added (Figure 1). REG annotations have also been appended; they show functional predictions based on microarray data for responses to plant hormones (AUX: auxin, BR: brassinosteroid, CK: cytokinin, ABA: abscisic acid, ET: ethylene, JA: jasmonic acid, SA: salicylic acid), to a hormone-like chemical (H2O2) and to some environmental stress-related treatments (drought, DREB1A overexpression) (7). Functional annotation is now available for 53 of 308 REGs in version 3.0 (Figure 2).
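The fold increase can be checked against the tag counts listed in Table 1. The short calculation below assumes that the baseline for the comparison is the 158 237 CT-MPSS tags, which the text does not state explicitly; the variable names are illustrative.

```python
# Tag counts as listed in Table 1.
ct_mpss_tags = 158_237        # Cap signature CT-MPSS tags
illumina_tags = 34_206_936    # Oligo-Cap Illumina tags

# Assumption: the "~200-fold" figure compares these two data sets.
fold_increase = illumina_tags / ct_mpss_tags

# Share of REGs that carry a functional annotation in version 3.0.
annotated_fraction = 53 / 308

print(f"fold increase: {fold_increase:.0f}")          # prints "fold increase: 216"
print(f"annotated REGs: {annotated_fraction:.1%}")    # prints "annotated REGs: 17.2%"
```

Under that assumption the ratio is about 216, in line with the rounded figure of ∼200-fold.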

Figure 1.

Indication of individual promoters. An Arabidopsis gene, AT1G02780.1, is shown. The information is composed of five panels: ‘Summary of Gene’, ‘Overview’ and ‘Focused view’, together with ‘Promoter Summary’ (not shown) and ‘Other Reliable Promoter Summary’ (not shown). The top TSS (TSS Peak) is shown in the second column of the ‘Focused view’ as white letters on a red background. New TSS tag data (34 million tags) are shown at the bottom of ‘Focused view’, highlighted in a red rectangle with rounded corners.

Figure 2.

REG information. REG information of the AT5G52310.1 (RD29A) promoter is shown. REG annotations, added in version 3.0, are highlighted in a red rectangle with rounded corners.

BROWSING PROMOTER STRUCTURE

The major function of ppdb is to indicate a possible promoter structure for each gene in a genome, based on the established lists of LDSS-positive elements. The information can be retrieved directly by gene ID (e.g. AT1G67090 or Os01g0791600), or selected from a list returned by ‘Keyword Search’ or ‘Homologue Gene Search’. Pages for individual genes show the following information: (i) DNA sequence, (ii) TSS distribution (direction and strength at a 1-bp resolution), (iii) core promoter structure and (iv) REG data.
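When gene IDs are collected for lookup, it can help to check their format first. The sketch below guesses the species from the ID pattern. The regular expressions are inferred from the example IDs in this article and from the common Arabidopsis and rice locus naming styles; they are assumptions for illustration, not rules taken from ppdb, and moss and poplar IDs are not covered.

```python
import re

# Patterns inferred from example IDs such as AT1G67090, AT1G02780.1
# and Os01g0791600 (assumed formats, not a ppdb specification).
ID_PATTERNS = {
    "Arabidopsis thaliana": re.compile(r"^AT[1-5CM]G\d{5}(\.\d+)?$", re.IGNORECASE),
    "rice": re.compile(r"^Os\d{2}g\d{7}$", re.IGNORECASE),
}


def guess_species(gene_id):
    """Return the species whose ID pattern matches, or None if none does."""
    gene_id = gene_id.strip()
    for species, pattern in ID_PATTERNS.items():
        if pattern.match(gene_id):
            return species
    return None


for example_id in ("AT1G67090", "Os01g0791600", "AT1G02780.1"):
    print(example_id, "->", guess_species(example_id))
```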

In the sequence window, promoter elements, including REGs and core elements, are highlighted in a position-dependent manner by default. Note that promoters without any TSS information show no elements by default. To display the promoter elements of such genes, click the ‘Reliable’ button, which changes the state to ‘All’ (Figure 1, red arrow). This button toggles between ‘Reliable’ and ‘All’. ‘Reliable’ is the default setting, in which only elements at appropriate positions relative to the peak TSS are detected. The ‘All’ setting removes this positional restriction, allowing global detection. The sensitive area in the ‘Reliable’ mode for each element group is described on the front page of the database.
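The difference between the two display modes can be sketched as a positional filter. The code below is a minimal illustration, not the database's implementation: the `Element` class, the function name and, in particular, the sensitive-area ranges are hypothetical, since the actual ranges are given on the ppdb front page rather than in this article.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    group: str      # element group, e.g. "TATA" or "REG"
    position: int   # coordinate of the element in the displayed sequence


# Hypothetical sensitive areas (start, end) relative to the peak TSS.
SENSITIVE_AREA = {"TATA": (-45, -18), "REG": (-400, -40)}


def displayed_elements(elements: List[Element],
                       peak_tss: Optional[int],
                       mode: str = "Reliable") -> List[Element]:
    """Return the elements that would be shown in the given mode."""
    if mode == "All":
        # No positional restriction: every detected element is shown.
        return list(elements)
    if mode != "Reliable":
        raise ValueError(f"unknown mode: {mode!r}")
    if peak_tss is None:
        # Without TSS information nothing can be placed relative to a peak.
        return []
    shown = []
    for element in elements:
        window = SENSITIVE_AREA.get(element.group)
        if window is None:
            continue
        relative = element.position - peak_tss
        if window[0] <= relative <= window[1]:
            shown.append(element)
    return shown


example = [Element("TATA", 70), Element("TATA", 10), Element("REG", 0)]
reliable = displayed_elements(example, peak_tss=100)
no_tss = displayed_elements(example, peak_tss=None)
everything = displayed_elements(example, peak_tss=None, mode="All")
```

With a peak TSS at coordinate 100, only the TATA element at 70 and the REG at 0 fall inside their assumed windows; without a TSS, ‘Reliable’ shows nothing and ‘All’ shows all three.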

The ‘TSS tag distribution’ columns in the ‘Focused view’ show the expression strength of each TSS. The value is the sum over six TSS tag libraries prepared from leaves, roots, inflorescences, etiolated seedlings, and shoots of low light-grown and high light-grown seedlings.
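Summing tag counts over libraries and locating the peak TSS can be sketched as follows. The library names follow the text, but the positions and counts are invented, and the tie-breaking rule in `peak_tss` is an arbitrary choice made for this example.

```python
from collections import Counter

# Toy tag counts per genomic position for each library (invented numbers).
libraries = {
    "leaves": {100: 5, 101: 1},
    "roots": {100: 2, 103: 4},
    "inflorescences": {100: 1},
    "etiolated seedlings": {103: 1},
    "low light shoots": {101: 2},
    "high light shoots": {100: 3, 103: 2},
}


def combine_libraries(libs):
    """Sum tag counts per position over all libraries."""
    total = Counter()
    for counts in libs.values():
        total.update(counts)  # Counter.update adds counts from a mapping
    return total


def peak_tss(total):
    """Position with the highest summed count; ties go to the smaller position."""
    if not total:
        return None
    return min(total, key=lambda pos: (-total[pos], pos))


total = combine_libraries(libraries)
peak = peak_tss(total)
print(dict(total), peak)
```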

The ‘Core promoter information’ table shows the presence or absence of core promoter elements (TATA boxes, Inrs, Y Patches, GA and CA elements).

The ‘REG information’ table shows a REG list together with the corresponding PPDB motifs (2,3) and PLACE motifs (6). REG sequences, as well as PPDB and PLACE motifs, are linked to other pages containing biological information. New REG annotations for A. thaliana obtained from predicted cis-regulatory elements based on microarray data (7) have been included (Figure 2).

Switching to ‘All’ (Figure 1) adds another category, ‘Not Reliable Promoter Summary’, below ‘Other Reliable Promoter Summary’. This category can be used when searching for regulatory elements (REGs) in wider regions or when there is no TSS information for the promoter of interest.

ADDITIONAL PAGES

The complete list of REGs for each genome can be viewed by selecting a cell in the ‘Index of Genes’ table at the top of the page. The lists present the relationships between REG ID, sequence, PPDB motifs, PLACE motifs and functional annotations. Selecting a specific REG entry leads to ‘Summary of the REG’ and ‘Entry Sequences’, which show the complete list of genes containing the corresponding REG, together with gene annotations.

FUNDING

Grant-in-Aid for Scientific Research on Priority Areas ‘Comparative Genomics’ (in part) (to Y.Y.Y. and J.O.); Scientific Research on Priority Areas ‘Perceptive Plants’ (in part) (to Y.Y.Y.); Grant-in-Aid for Publication of Scientific Research Results ‘Databases’ (in part) (to J.O. and Y.Y.Y.) from the Ministry of Education, Culture, Sports, Science and Technology of Japan; Advanced Low Carbon Technology Research and Development Program from Japan Science and Technology Agency (JST ALCA) (in part) (to Y.Y.Y.). Funding for open access charge: Ministry of Education, Culture, Sports, Science and Technology, Japan.

Conflict of interest statement. None declared.

ACKNOWLEDGEMENTS

We would like to acknowledge Mr Takeaki Taniguchi of Mitsubishi Research Institute for excellent technical assistance. We also thank Dr Dong Nguyen Tien and Mr Hirokazu Murakami for maintenance of the database.

REFERENCES

1. Yamamoto YY, Obokata J. ppdb, a plant promoter database. Nucleic Acids Res. 2008;36:D977–D981. doi: 10.1093/nar/gkm785.
2. Yamamoto YY, Ichida H, Matsui M, Obokata J, Sakurai T, Satou M, Seki M, Shinozaki K, Abe T. Identification of plant promoter constituents by analysis of local distribution of short sequences. BMC Genomics. 2007;8:67. doi: 10.1186/1471-2164-8-67.
3. Yamamoto YY, Yoshitsugu T, Sakurai T, Seki M, Shinozaki K, Obokata J. Heterogeneity of Arabidopsis core promoters revealed by high density TSS analysis. Plant J. 2009;60:350–362. doi: 10.1111/j.1365-313X.2009.03958.x.
4. Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, Garcia-Hernandez M, et al. The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 2012;40:D1202–D1210. doi: 10.1093/nar/gkr1090.
5. Seki M, Narusaka M, Kamiya A, Ishida J, Satou M, Sakurai T, Nakajima M, Enju A, Akiyama K, Oono Y, et al. Functional annotation of a full-length Arabidopsis cDNA collection. Science. 2002;296:141–145. doi: 10.1126/science.1071006.
6. Higo K, Ugawa Y, Iwamoto M, Korenaga T. Plant cis-acting regulatory DNA elements (PLACE) database: 1999. Nucleic Acids Res. 1999;27:297–300. doi: 10.1093/nar/27.1.297.
7. Yamamoto YY, Yoshioka Y, Hyakumachi M, Maruyama K, Yamaguchi-Shinozaki K, Tokizawa M, Koyama H. Prediction of transcriptional regulatory elements for plant hormone responses based on microarray data. BMC Plant Biol. 2011;11:39. doi: 10.1186/1471-2229-11-39.
8. International Rice Genome Sequencing Project. The map-based sequence of the rice genome. Nature. 2005;436:793–800. doi: 10.1038/nature03895.
9. Kikuchi S, Satoh K, Nagata T, Kawagashira N, Doi K, Kishimoto N, Yazaki J, Ishikawa M, Yamada H, Ooka H, et al. Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science. 2003;301:376–379. doi: 10.1126/science.1081288.
10. Yamamoto YY, Ichida H, Abe T, Suzuki Y, Sugano S, Obokata J. Differentiation of core promoter architecture between plants and mammals revealed by LDSS analysis. Nucleic Acids Res. 2007;35:6219–6226. doi: 10.1093/nar/gkm685.
11. Rensing SA, Lang D, Zimmer AD, Terry A, Salamov A, Shapiro H, Nishiyama T, Perroud PF, Lindquist EA, Kamisugi Y, et al. The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants. Science. 2008;319:64–69. doi: 10.1126/science.1150646.
12. Zimmer AD, Lang D, Buchta K, Rombauts S, Nishiyama T, Hasebe M, Van de Peer Y, Rensing SA, Reski R. Reannotation and extended community resources for the genome of the non-seed plant Physcomitrella patens provide insights into the evolution of plant gene structures and functions. BMC Genomics. 2013;14:498. doi: 10.1186/1471-2164-14-498.
13. Nishiyama T, Miyawaki K, Ohshima M, Thompson K, Nagashima A, Hasebe M, Kurata T. Digital gene expression profiling by 5'-end sequencing of cDNAs during reprogramming in the moss Physcomitrella patens. PLoS ONE. 2012;7:e36471. doi: 10.1371/journal.pone.0036471.
14. Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science. 2006;313:1596–1604. doi: 10.1126/science.1128691.
15. Nanjo T, Sakurai T, Totoki Y, Toyoda A, Nishiguchi M, Kado T, Igasaki T, Futamura N, Seki M, Sakaki Y, et al. Functional annotation of 19,841 Populus nigra full-length enriched cDNA clones. BMC Genomics. 2007;8:448. doi: 10.1186/1471-2164-8-448.
16. Sato N. Gclust: trans-kingdom classification of proteins using automatic individual threshold setting. Bioinformatics. 2009;25:599–605. doi: 10.1093/bioinformatics/btp047.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
