TABLE 2.
Human and mouse conserved promoter motifs
| Human Transcript | Evolutionarily Conserved Region | Motifs¹ | Position | Sequence |
|---|---|---|---|---|
| | ECR 1 | PPARA | −13233 | TGATCT |
| | [−13399 to −13206] | SP1 | −13118 | CCACAGCCCC |
| | | ERE | −11388 | TT-GGTCA-GGC-TGGTC-TT |
| | ECR 2 | MYB | −10839 | GGCCAGTTC |
| | [−10919 to −10748] | ERE | −10266 | TT-AGTCA-CGC-TGGTC-TC |
| | | Pol II binding site | −9038 | (bp 17,435,552–17,436,153) |
| | ECR 3 | CpG island | −8774 | CpG Repeat |
| | [−9246 to −9038] | Exon 1 | −9151 | |
| Transcript A: NM_148172 | | TSS A | −9205 | TTCCGGGGG (bp 17,435,719) |
| | | ERE | −7647 | GG-GGTCA-TGA-TCACC-TC |
| Consensus | | ERE | −7401 | CG-GGTCA-GGG-TGACC-CT |
| | | ERE | −6707 | GA-GACCA-GCC-TGACC-AA |
| | | ERE | −5879 | CG-GGGTA-TCT-TGACT-GC |
| | | FOXA1 | −3309 | AAGTTGTTTCCATT |
| | ECR 4 | MYB | −3264 | AAACTGCCA |
| | [−3324 to −2994] | SOX6 | −3190 | TGCATTGTTATCA |
| | | GATA | −3186 | TGTTATCATT |
| | | SP1 | −862 | GTGGCGTGAT |
| | | AP1 | −386 | CTGACTCCT |
| | | AP1 | −206 | CCCGAGTCAGC |
| Transcript B: NM_007169 | | TSS+1 | +1 | TTGTCCATG (bp 17,426,514) |
| | | TSS B | +40 | GACCACAA (bp 17,426,470) |
| | ECR 5 | Exon 2 | +50 | |
| | [−35 to +1073] | C/EBP | +175 | AAATTACCA |
| | | AP1 | +631 | CATTAGTCATT |
| | | ERE | +725 | TG-AGACA-GGC-TGACC-TG |
| | | ERE | +741 | GA-GGCCA-TTG-GGACC-TG |
| Transcript C: NM_148173 | ECR 6 | TSS C | +5010 | TGTGGGCGA (bp 17,421,504) |
| | [+5192 to +5579] | GATA | +5284 | TTTTATCTTC |
| | | ERE | +5476 | TG-GGCTA-CGT-GGACC-CC |

| Mouse | Evolutionarily Conserved Region | Motifs¹ | Position | Sequence |
|---|---|---|---|---|
| | ECR 1 | Ppar-α | −12980 | AGATCA |
| | [−12997 to −12808] | Sp1 | −12884 | CCCACC |
| | ECR 2 | Ppar-α | −11788 | AGGTCA |
| | [−11874 to −11709] | TSS A | −9602 | CATCAGATA (bp 59,659,083) |
| | ECR 3 | CpG island | −9622 to −9169 | CpG repeats |
| | [−9592 to −9382] | ERE | −7919 | TA-GGTCA-GGA-TGACC-TT |
| Consensus | | | | |
| Transcript A | | | | |
| | ECR 5 | TSS B (+1) | +1 | CCCAGTGTG (bp 59,649,481) |
| | (−190 to +884) | | | |
| Transcript B: NM_008819 | | TSS B | +110 | TTCCTTCTG (bp 59,649,379) |
| | | Ap1 | +429 | CGTTAGTCACT |
| | | ERE | +790 | TG-AGGCA-GGC-TGACC-AG |
| | | ERE | +813 | CA-GGGCA-CGG-GGACC-TG |
| | | ERE | +5150 | TT-AGTCA-TGT-TGGCT-GC |
| | ECR 6 | | | |
| | (+4848 to +5243) | Gata | +5441 | CTTTTATCTTC |
| Transcript C | | TSS C | +5147 | GCTGATCTC (bp 59,644,334) |
| | | ERE | +5192 | TG-GGTTA-CAT-GGACC-CC |
| | | ERE | +5780 | GG-TGTCA-AGG-TGACC-TA |
| | | ERE | +5860 | AA-GACCA-CTG-TGACC-TC |
| | | ERE | +7961 | CA-GGTGG-GCC-TGACC-CT |
For both the human and mouse PEMT genes, the chromosomal position of each motif is given relative to transcription start site B (+1), the literature-defined major TSS for each gene (36, 37); − and + indicate positions upstream and downstream of this site, respectively. For human, the +1 site corresponds to bp 17,426,514 of chromosome 17. For mouse, it corresponds to bp 59,649,481 on chromosome 11 (UCSC May 2004 release of the Mus musculus genome, mm5), which corresponds to bp 59,853,136 in the current release of the M. musculus genome (mm8). The mouse and human PEMT gene promoters are highly conserved in six distinct evolutionarily conserved regions (ECR 1 to ECR 6). ECR 4 in mouse Pemt has been translocated to a different chromosome (data not shown). There are conserved estrogen response elements (EREs) and transcription factor binding sites (TFBS) within ECR 5 and ECR 6 in the mouse and human proximal promoter regions. The human and murine PEMT genes contain additional EREs, including a consensus ERE in a distal promoter/enhancer region approximately 7.5 kb from TSS B. The human promoter A region contains a CpG island and an experimentally validated 600 bp RNA polymerase II binding site (Pol 9419, P<0.00001) (Figure 3, Table 2) (38). ECR Browser, the rVista TFBS search engine, Dragon ERE Finder, and TRANSFAC were used to identify ECRs, EREs, and TFBS. Underline indicates nucleotides that differ from the estrogen response element consensus sequence. Italicized letters indicate transcription start sites for each proximal promoter region, denoted A, B, and C.
¹ Motif abbreviations for gene names are from LocusLink (www.ncbi.nlm.nih.gov/LocustLink/list.cgi; last accessed 01/02/07).
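
The two conventions described in the legend can be illustrated with a minimal Python sketch. It is not part of the original analysis, and the function names `ere_mismatches` and `relative_to_genomic` are hypothetical. It assumes the palindromic consensus ERE core GGTCAnnnTGACC (which matches the rows labeled as consensus in the table) and the hyphenated flank-half-site-spacer-half-site-flank layout used in the Sequence column. It also assumes that a genomic coordinate is obtained by simply subtracting the relative position from the bp coordinate of the +1 site. That rule reproduces the listed TSS A and TSS C coordinates in both species, but the TSS B rows deviate from it by a few bases, so it should be treated as approximate.

```python
# Assumed consensus ERE core: two 5-nt half-sites around a 3-nt spacer (N = any base).
ERE_CONSENSUS = "GGTCA" + "NNN" + "TGACC"


def ere_mismatches(motif: str) -> list[int]:
    """Return 1-based positions in the 13-nt core that differ from the consensus.

    `motif` follows the table layout: flank-halfsite-spacer-halfsite-flank.
    """
    parts = motif.upper().split("-")
    if len(parts) != 5:
        raise ValueError("expected five hyphen-separated segments")
    core = "".join(parts[1:4])  # drop the two flanking dinucleotides
    if len(core) != len(ERE_CONSENSUS):
        raise ValueError("core must be 5 + 3 + 5 nucleotides long")
    return [
        i
        for i, (observed, expected) in enumerate(zip(core, ERE_CONSENSUS), start=1)
        if expected != "N" and observed != expected
    ]


def relative_to_genomic(position: int, plus_one_bp: int) -> int:
    """Convert a position relative to the +1 site into a bp coordinate.

    Assumption: coordinates decrease in the downstream (+) direction, and no
    correction is made for the absence of a position 0.
    """
    return plus_one_bp - position


HUMAN_PLUS_ONE = 17_426_514  # bp on human chromosome 17, from the legend
MOUSE_PLUS_ONE = 59_649_481  # bp on mouse chromosome 11 (mm5), from the legend

print(ere_mismatches("CG-GGTCA-GGG-TGACC-CT"))       # human ERE at -7401
print(ere_mismatches("TT-GGTCA-GGC-TGGTC-TT"))       # human ERE at -11388
print(relative_to_genomic(-9205, HUMAN_PLUS_ONE))    # human TSS A
print(relative_to_genomic(5147, MOUSE_PLUS_ONE))     # mouse TSS C
```

Positions returned by `ere_mismatches` count from the first nucleotide of the left half-site, so values 1 to 5 fall in the left half-site and 9 to 13 in the right one. Spacer positions are never reported.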