Abstract
We report the genome-wide identification of estrogen receptor α (ERα)-binding regions in mouse liver using a combination of chromatin immunoprecipitation and tiled microarrays that cover all nonrepetitive sequences in the mouse genome. This analysis identified 5568 ERα-binding regions. In agreement with what has previously been reported for human cell lines, many ERα-binding regions are located far away from transcription start sites; approximately 40% of ERα-binding regions are located within 10 kb of annotated transcription start sites. Almost 50% of ERα-binding regions overlap genes. The majority of ERα-binding regions lie in regions that are evolutionarily conserved between human and mouse. Motif-finding algorithms identified the estrogen response element, and variants thereof, together with binding sites for activator protein 1, basic-helix-loop-helix proteins, ETS proteins, and Forkhead proteins as the most common motifs present in identified ERα-binding regions. To correlate ERα binding to the promoters of specific genes with changes in expression levels of the corresponding mRNAs, expression levels of eight selected mRNAs were assayed in livers 2, 4, and 6 h after treatment with the ERα-selective agonist propyl pyrazole triol. Five of the eight selected genes, Shp, Stat3, Ptgds, Pck1, and Pdk4, responded to propyl pyrazole triol after 4 h of treatment. These results extend our previous studies using gene expression profiling to characterize estrogen signaling in mouse liver, by characterizing the first step in this signaling cascade, the binding of ERα to DNA in intact chromatin.
ESTROGEN HAS TRADITIONALLY been connected with female reproduction. The importance of this hormone for nonreproductive processes, such as metabolic disease, was established later. Most of the known effects of estrogen are mediated by two estrogen receptors (ERs), ERα (1) and ERβ (2), that activate the expression of specific sets of genes. After binding ligand, the activated ERs are able to interact with cis-regulatory elements of target genes by direct binding to estrogen-response elements (EREs) or indirectly through interaction with another DNA-bound transcription factor, such as activator protein 1 (AP1) or transcription factor Sp1 (Sp1) complexes, thus facilitating the assembly of basal transcription factors into a stable preinitiation complex followed by increased transcription rates for target mRNAs (3, 4, 5, 6).
A role of estrogen in glucose homeostasis has been demonstrated in rodents and humans (7). However, the signaling cascade, from estrogen binding to ER to improved glucose tolerance, still remains largely elusive.
We have demonstrated that estrogen, via ERα, the major ER expressed in the liver, regulates glucose homeostasis mainly by modulating hepatic insulin sensitivity (8). Further studies in diabetic ob/ob mice led us to propose that estrogens via ERα modulate lipid metabolism in the liver, thereby exerting antidiabetic effects (9). However, the initial step in estrogen signaling in the liver, the recruitment of ERs to DNA in intact chromatin, has not yet been studied.
Recently, chromatin immunoprecipitation (ChIP) has been combined with genomic DNA microarrays (ChIP-on-chip) to pursue whole-genome identification of ER-binding sites and binding sites for other transcription factors in intact chromatin of cultured cell lines (10, 11, 12).
In this study, we applied the ChIP-on-chip approach to identify DNA regions that bind ERα in intact chromatin from mouse liver using Affymetrix whole-genome tiling arrays. To our knowledge, this is the first report that describes the whole-genome identification of ERα-binding sites in intact chromatin from tissue samples.
RESULTS
Genome-Wide Identification and Characterization of ERα-Binding Sites in Mouse Liver Chromatin
We performed ChIP using mouse liver tissues isolated from female mice treated with estradiol for 2 h. Livers from three mice were assayed individually. ERα ChIP-enriched samples and input samples were hybridized to mouse whole-genome tiling arrays. This analysis revealed a total of 5568 ERα ChIP-enriched regions in mouse liver chromatin. The complete list of chromosome coordinates for all enriched regions is available as supplemental Table 1 published as supplemental data on The Endocrine Society’s Journals Online web site at http://mend.endojournals.org.
Identified ERα regions were mapped to RefSeq genes. The Venn diagram in Fig. 1A shows that nearly 50% (2643/5568) of ERα-binding regions overlap RefSeq genes, including 8% (439/5568) that overlap the actual exons. This shows that a large proportion [∼40% (2204/5568)] of ERα-binding regions in liver chromatin is located within introns. Figure 2 shows a large genomic region with zoom-ins of particular ERα-binding regions located within and outside of genes.
We then investigated the distance from ERα-binding regions to the nearest 5′- and 3′-ends, in any direction, of RefSeq genes. The reason for also investigating 3′-ends was to determine whether intronic ERα-binding regions, which, according to Fig. 1, constitute a significant fraction of the total ERα-binding regions, are enriched in the 5′- or 3′-regions of genes.
A cumulative plot showing the fraction of ERα-binding regions found within a certain genomic distance from the 5′- and 3′-end of genes is shown in Fig. 1B. This analysis reveals that: 1) There is a substantial bias toward binding sites being localized in the neighborhood of the 5′-end, the transcription start site (TSS), compared with the 3′-end, of genes; 2) The majority of ERα-binding regions are located far away from both 5′- and 3′-ends of genes. Only about 40% of ERα-binding regions are within 10 kb of a RefSeq TSS.
As shown in Fig. 1A and discussed above, a large proportion of ERα-binding regions were located within genes. Figure 1C shows the location of ERα-binding regions within genes when gene lengths are normalized to 1 (so that the 5′-end = 0 and the 3′-end = 1). The prominent peak close to 0 suggests that there is a clear preference for ERα-binding regions close to the TSS. Apart from this, the density of ERα-binding regions is uniform along the gene span.
Validation of Identified ERα-Binding Regions
We performed independent ChIP assays on the same individual mouse livers as those analyzed by the whole-genome tiling arrays. These samples were analyzed on the mouse promoter arrays. The promoter arrays represent focused versions of the whole-genome tiling arrays that target the promoter regions of more than 25,000 genes. ERα ChIP-enriched regions (n = 948) were identified using the promoter arrays. The complete list of chromosome coordinates for the enriched regions identified by the promoter arrays is available as supplemental Table 2. Importantly, 85% of the sites identified by the promoter arrays (811) were also identified by the whole-genome tiling arrays. Thus, the reproducibility for independently performed ChIP assays analyzed using two types of arrays is good and the analysis using the promoter arrays provides validation for the whole-genome identification of ERα-binding sites.
To further validate the identified ERα-binding sites, the ERα ChIP-enriched regions adjacent to the Shp, Stat3, Acox1, Gpam, Pck1, Pdk4, Pparα, and Ptgds genes, respectively, were confirmed by ordinary ChIP followed by detection of ERα ChIP regions with real-time PCR (Fig. 3). If there were more than two ChIP-enriched regions in a gene, the one closest and upstream of the TSS was used for confirmation.
In this assay, all the selected regions showed significant P values (<0.05) for ERα recruitment in the estrogen-treated samples. At least a 5-fold enrichment with ERα antibodies vs. normal IgG was observed for all tested regions. We also studied the effect of estrogen on DNA binding by ERα in this assay. Significant estradiol-mediated enrichment of ERα binding was observed only for the binding regions in the Shp, Gpam, Pdk4, and Ptgds promoters. There were tendencies for increased recruitment of ERα upon estradiol treatment for all of the tested binding sites (Fig. 3).
Evolutionary Conservation of ERα-Binding Regions
We assessed the evolutionary conservation of ERα-binding regions between human and mouse as well as by multiple alignments between vertebrates.
As ChIP-on-chip has a resolution (most ERα-binding regions derived from the experiment are ∼500 bp) that is much lower than the size of an ER-binding site (e.g. the consensus sequence for the classical ERE is 15 bp), we cannot assess the conservation of the physical binding sites but only the larger regions encompassing the sites. Nevertheless, the mouse-to-human comparison shows that a large proportion of ERα-binding regions are under evolutionary constraints (Fig. 4A). The evolutionary constraints do not prove the existence of the corresponding ERα-binding sites in both genomes. However, because transcription factor-binding sites often are conserved between species, it is a likely explanation (13, 14). The distribution of the mean conservation of ERα-binding regions is bimodal with a large peak consisting of nonalignable regions and a larger peak of evolutionarily constrained sites with 60–80% sequence conservation. As expected, the second peak does not occur when we use randomly selected genomic regions in the same analysis (Fig. 4B). There is no major difference in the conservation level of ERα-binding regions identified using the promoter arrays and the whole-genome tiling arrays, respectively, suggesting that any evolutionary constraints are not higher around annotated promoters (data not shown).
When extending the evolutionary sequence conservation to multiple species, it is evident that ERα-binding regions are still preferentially conserved (Fig. 4C), although the difference is smaller between the random expectation and observed results as compared with the results obtained for the mouse-to-human only comparison.
Computational Identification of Common Motifs in Enriched ERα-Binding Sequences
Regions identified by ChIP-on-chip are much larger than the span of DNA bound by a particular protein. Thus, the exact location of the actual ERα-binding site(s) within regions identified by ChIP-on-chip is unknown. Potential locations can be identified by computational methods (13).
Twenty-four motifs were identified as significantly enriched in the identified ERα-binding regions (supplemental Table 3). Some position weight matrices (PWMs) for different motifs closely resemble each other. For example, the JASPAR’s Pparγ motif differs from the ERE motif mainly by having strong base preferences outside the conserved regions of the ERE (Fig. 5A). By combining the motifs that have similar profiles, all the significantly enriched motifs identified could be subgrouped into six classes: ERE sites, ERE half-sites, Forkhead sites, AP1 sites, bHLH sites, and ETS sites (Fig. 5B). Interestingly, only 4% of ERα-interacting regions include only ERE or ERE half-site motifs. Approximately 80% of ERα-interacting regions include AP1, Forkhead motifs, and/or other enriched binding sites together with ERE-like motifs (data not shown).
In the past, many ER-binding sites have been identified in the proximity of TSS of estrogen target genes on a gene-by-gene basis by classical molecular biology approaches (4, 6, 15). The ability of such sites to mediate transcriptional regulation has been well established. Later genome-wide approaches have revealed that a large fraction of ERα-binding sites are located far away from TSSs. However, because the function of these distal binding sites has not been characterized in detail, it is challenging to link them to target gene(s). In some cases the distal sites overlap with several genes and may therefore regulate more than one gene (16). The following analysis of ERα-binding regions in relation to ER target genes is therefore restricted to the 948 ERα ChIP-enriched regions that were identified by the mouse promoter arrays. Based on the Ensembl gene build 30 (http://www.ensembl.org) for the NCBIv33 mouse, 710 ERα-binding regions could be mapped to promoters of 645 known genes (supplemental Table 4).
ERα-Binding Sites Are Enriched in the Promoters of Genes Involved in Energy Metabolism
We used an overrepresentation analysis procedure to investigate whether ERα was binding to the promoters of groups of functionally related genes. The pathways most significantly enriched [false discovery rate (FDR) < 0.01] for ERα target genes are displayed in Table 1. This analysis revealed that 19 gene ontology (GO) categories, such as organic acid metabolism (GO 0006082), carboxylic acid metabolism (GO 0019752), lipid biosynthesis (GO 0008610), fatty acid metabolism (GO 0006631), and amino acid metabolism (GO 0006520), were significantly enriched for genes that had ERα recruited to their promoter after 2 h of estradiol treatment.
Table 1.
GO Category | P Value | FDR | Total Genes | ERα Target Genes |
---|---|---|---|---|
Organic acid metabolism (GO 0006082) | 1.48 × 10^−11 | 0.0000 | 312 | 42 |
Carboxylic acid metabolism (GO 0019752) | 1.48 × 10^−11 | 0.0000 | 312 | 42 |
Cellular lipid metabolism (GO 0044255) | 3.40 × 10^−11 | 0.0000 | 320 | 42 |
Lipid metabolism (GO 0006629) | 6.76 × 10^−11 | 0.0000 | 381 | 46 |
Generation of precursor metabolites and energy (GO 0006091) | 2.76 × 10^−7 | 0.0000 | 385 | 39 |
Coenzyme metabolism (GO 0006732) | 1.21 × 10^−6 | 0.0000 | 126 | 19 |
Cofactor metabolism (GO 0051186) | 2.39 × 10^−6 | 0.0000 | 144 | 20 |
Aromatic compound metabolism (GO 0006725) | 2.39 × 10^−6 | 0.0000 | 75 | 14 |
Electron transport (GO 0006118) | 5.18 × 10^−6 | 0.0000 | 262 | 28 |
Regulation of body fluids (GO 0050878) | 6.79 × 10^−6 | 0.0000 | 51 | 11 |
Steroid metabolism (GO 0008202) | 2.79 × 10^−5 | 0.0014 | 104 | 15 |
Amino acid metabolism (GO 0006520) | 3.02 × 10^−5 | 0.0013 | 143 | 18 |
Amine metabolism (GO 0009308) | 3.88 × 10^−5 | 0.0017 | 216 | 23 |
Lipid biosynthesis (GO 0008610) | 6.28 × 10^−5 | 0.0022 | 151 | 18 |
Fatty acid metabolism (GO 0006631) | 6.76 × 10^−5 | 0.0021 | 112 | 15 |
Nitrogen compound metabolism (GO 0006807) | 7.87 × 10^−5 | 0.0023 | 226 | 23 |
Amino acid and derivative metabolism (GO 0006519) | 1.08 × 10^−4 | 0.0030 | 186 | 20 |
Response to wounding (GO 0009611) | 1.56 × 10^−4 | 0.0037 | 191 | 20 |
Response to external stimulus (GO 0009605) | 6.71 × 10^−4 | 0.0100 | 278 | 24 |
Significantly enriched GO categories were identified by High Throughput GO Miner. The cutoff FDR for significant enrichment is 0.01. FDR is the one-sided Fisher exact test P value corrected for multiple comparisons.
Due to the hierarchical nature of GO categories, the same gene could be assigned with multiple GO terms. The Venn diagram demonstrates that 19 GO terms, which were enriched for ERα-binding sites in their promoters, fell into two clusters relating to energy metabolism and wound response. Seventeen selected GO terms fell into the cluster relating to energy metabolism (Fig. 6).
Upon closer examination of ERα target genes involved in those 19 identified GO categories, we identified genes involved in lipid metabolism (Ptgds, Acox1, Acox2, Cpt1a), glucose metabolism (Pck1, Mapk14), and genes reported to be involved in both regulation of lipid and glucose metabolism (Pparα and Pgc-1).
ERα Binding to Promoters Is Associated with Elevated Expression Levels of Associated Genes
To investigate whether ERα binding to the promoters of genes identified by ChIP-on-chip analysis correlated with ligand-dependent regulation of the corresponding genes, expression levels of Shp, Stat3, Acox1, Gpam, Pck1, Pdk4, Pparα, and Ptgds were determined 2, 4, and 6 h after treatment of mice with the ERα agonist, propyl pyrazole triol (PPT), or vehicle, respectively. These genes were selected for this assay because binding of ERα to the promoters of these genes had been confirmed by ChIP followed by real-time PCR (Fig. 3).
We observed that PPT treatment induced Shp, Stat3, Pck1, and Ptgds expression as early as after 2 h of treatment. Pdk4 was induced after 2 h of PPT treatment, but this induction did not reach statistical significance. All five genes displayed a statistically significant induction after 4 and 6 h of PPT treatment (Fig. 7). However, PPT did not induce the expression of Acox1, Gpam, or Pparα at tested time points (data not shown).
There was no correlation between the location of the identified binding regions relative to the transcription start sites or the identity and number of computationally identified transcription factor-binding sites and hormone responsiveness (data not shown). However, such correlations are restrained by the limited number of regions investigated, the low resolution of the ChIP-on-chip assay, which does not allow determination of the exact ERα binding sites, and the lack of experimental verification of computationally derived TF-binding sites. Future studies will aim at characterizing the derived binding regions in promoter analysis.
Correlation of Global Gene Expression with Global ERα DNA Binding
We have previously described changes in gene expression in ob/ob mouse liver after long-term estradiol treatment (9) as an approach to delineate the molecular mechanism responsible for the antidiabetic effects of estrogen. Among 209 significantly changed genes identified after long-term estradiol treatment in ob/ob mouse livers, 22 were identified in the present study as harboring ERα-binding sites in their promoter regions (supplemental Table 5). This represents a significant enrichment of ERα DNA-binding sites in the promoters of regulated genes compared with the frequency of ERα DNA-binding regions in the complete set of promoters of genes investigated in this study (P value = 3.34 × 10^−6 calculated by using binom.test in R, www.r-project.org).
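As a rough illustration of the enrichment test described above, the following Python sketch computes a one-sided binomial tail probability for observing 22 or more genes with ERα-bound promoters among 209 regulated genes. The background rate used here (645 genes with ERα-bound promoters out of roughly 25,000 promoters on the array) and the choice of a one-sided test are assumptions made for illustration only; neither is stated in the text, so the result is not expected to reproduce the reported P value exactly.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts taken from the text.
regulated_genes = 209        # genes changed after long-term estradiol treatment
regulated_with_site = 22     # of these, genes with an ERα-binding promoter region

# ASSUMPTION (not stated in the text): background frequency of promoters with
# an ERα-binding region, taken as 645 mapped genes out of ~25,000 promoters.
background_rate = 645 / 25000

p_value = binomial_upper_tail(regulated_with_site, regulated_genes, background_rate)
print(f"one-sided binomial P value: {p_value:.3g}")
```

Under these assumptions the observed fraction (22/209, about 10%) is well above the assumed background rate (about 2.6%), which is the comparison the test formalizes.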
DISCUSSION
We have performed a genome-wide analysis to identify ERα primary binding regions in mouse liver by assaying the association of ERα with nonrepetitive sequences across the whole mouse genome. Recently, genome-wide analyses have been used to identify loci recognized by ERα in cultured cells (10, 12, 17). To our knowledge, this is the first study that approaches a genome-wide identification of ERα-binding sites in tissue samples. In this study, we identified a total of 5568 ERα-enriched regions. The identification of ERα-enriched regions in the promoters of some known direct ER target genes, such as Shp, Stat3, and Ptgds, together with our own validation using Affymetrix promoter arrays and ChIP analysis followed by real-time PCR for selected binding sites, validates the ChIP-on-chip approach for identifying ERα direct target genes (9, 18, 19).
When compared with vehicle-treated samples, the magnitude of the effect of estradiol on ERα binding to DNA in chromatin varied for different assayed regions. This is consistent with what has previously been reported (10, 17). The reason for the varied effect of estrogen on ERα binding to DNA in chromatin remains unknown. However, several examples of ERα interacting with DNA indirectly, via other DNA-bound proteins, have been described. These interactions might not be subject to hormonal regulation. This has been demonstrated for ERα-Sp1 complexes (20).
The recent whole-genome scan of ERα-binding sites in a cell culture model revealed approximately 3700 ERα-binding sites. Comparison of the sets of genes with identified ERα-binding sites located in regions from 6000 bp upstream to 2000 bp downstream of TSS reported by Carroll et al. (12) (data for binding sites at P value 1 × 10^−3 were used in this analysis and extracted from the Brown laboratory web page) with those identified in this study revealed only 45 genes with ERα-binding sites identified in this region in both studies. Whether this reflects tissue and/or species differences remains to be determined. However, we note that the majority of ERα-binding regions in our study are much more conserved between human and mouse than expected by chance.
Carroll et al. reported the identification of ERα DNA-binding regions that were located far away from TSS in the investigated human cell line, with only 4%, or 150 binding sites, mapped to regions 1000 bp upstream of TSS. Our results confirm this, but additionally our data suggest that a significant proportion of sites are located in the neighborhood of TSS, because about 40% of identified ERα-binding regions were located within 10 kb of a TSS. However, it is important to remember that the RefSeq genes used as a reference in this analysis only cover a small part of all transcript variants and start sites. Recent studies have identified a large number of novel promoters, many of which are located within genes (21). For instance, Carninci et al. (22) showed that more than 70% of genes have more than one promoter. Thus, it is possible that many of the ERα-binding regions assumed to be far away from any start sites, in fact, are close to hitherto unannotated alternative promoters.
We also show that almost half of the ERα-binding regions are located within genes. These ERα-binding regions within genes have a clear preference for regions close to the TSS (Fig. 1C), but apart from that, they are uniformly distributed over the whole gene span. This finding suggests the possibility of two populations of identified ER-binding regions: those with clear positional distance constraints close to the TSSs and those without such constraints. We speculate that sites with different constraints might act by different molecular mechanisms.
The ERα-binding regions identified in this study display only limited preferential conservation if conservation over multiple (and more distant) species is assessed. As a reference, very highly conserved regulatory elements such as microRNA sites often have mean PhastCons scores of 0.8 or higher, whereas the mean score of ER regions is just around 0.25. One interpretation is that ER regions are generally conserved between relatively evolutionarily close mammals such as human and mouse, but not between more distantly related species such as fish; more detailed analyses are needed to prove this point.
It has been shown that in addition to ERE, a number of motifs, including Sp1 and AP1, are important regulatory motifs for ER-mediated transcription. Recently the Forkhead-binding motif has been identified as being of potential importance in ER-mediated transcription based on ChIP-on-chip analysis (10, 17). We identified six motifs, the ERE site, the ERE half-site, the AP1 site, bHLH sites, ETS sites, and the Forkhead-binding motif as enriched DNA sequences in ERα-binding regions. Interestingly, most of the ERα-binding sites (>80%) contained an ERE or ERE half-site in combination with other identified enriched motifs. This suggests that AP1, Forkhead proteins, the bHLH family of transcription factors, and proteins of the ETS family are important for estrogen signaling in mouse liver as recently suggested from studies in breast cancer cell lines (4, 10, 17).
Sp1 sites and GC-rich sites have been shown to mediate transactivation of a large number of estrogen-responsive genes in breast cancer (6). Our analysis did not detect Sp1 as an enriched motif in ERα-binding regions, and only 0.02% of identified ERα-binding regions overlapped with CpG islands from the University of California Santa Cruz (UCSC) annotation track. However, the Sp1 motif has low information content and thus needs a strong overrepresentation to be considered significant in this type of analysis. The analysis does not rule out that Sp1 sites are active at some ER-binding regions, but overall indicates that Sp1 and GC-rich sites are not key determinants of estrogen signaling in the liver.
We have previously demonstrated that genes involved in lipid metabolism were regulated by estrogen in livers from diabetic ob/ob mice after long-term estradiol treatment, and we proposed that estrogen exerts its antidiabetic effects by modulating lipid synthesis in the liver (9). We now use a similar approach to demonstrate that genes that recruited ERα to their promoters belong to the same GO categories involved in lipid metabolism, such as lipid biosynthesis (GO 0008610) and lipid metabolism (GO 0006629). This provides further mechanistic details on regulation of lipid metabolism, by estrogen signaling, in mouse liver.
To investigate whether ERα binding to the promoters identified by ChIP-on-chip analysis correlated with ligand-dependent regulation of the corresponding genes, the expression of eight genes important for lipid and glucose metabolism, Shp, Stat3, Acox1, Gpam, Pck1, Pdk4, Pparα, and Ptgds, was determined 2 h, 4 h, and 6 h after treatment of mice with the ERα agonist PPT or vehicle, respectively. Of these genes, Shp, Stat3, Ptgds, Pck1, and Pdk4 all responded to PPT. The rapid induction of Shp, Ptgds, and Stat3 is consistent with previously published reports showing induction of these genes after 2 h estradiol or PPT treatment, respectively, in mouse liver (9, 18). The reason why Acox1, Gpam, and Pparα were not regulated by PPT is not clear, but one possibility is that these genes display a slower time course for regulation. A delay between ER binding and transcription of target genes has been reported (10). This could be due to a requirement for subsequent modification of the receptor complex or for the recruitment of other factors necessary for ER action.
One of the identified ERα target genes, Ptgds, is ultimately responsible for the synthesis of naturally occurring peroxisome proliferator-activated receptor γ ligands. Knockout studies have suggested that Ptgds plays an important role in regulating insulin sensitivity and glucose homeostasis (23). We have previously identified Ptgds as an ERβ-specific target gene in mouse heart (19). Interestingly, in this study, the identified ERα-binding site overlaps the region that contains the previously defined ERβ-selective response element. The fact that Ptgds mRNA was induced at an early time point by the ERα-selective agonist PPT provides strong evidence that Ptgds is an ERα primary target gene in mouse liver. Thus, Ptgds might provide a good model to study receptor selectivity via a distinct DNA response element.
Another one of the identified ERα target genes, Pck1, codes for the rate-limiting enzyme in gluconeogenesis and is therefore critically involved in controlling hepatic glucose production. Importantly, the expression of Pck1 was significantly induced by PPT after 2 h treatment.
Correlation of our previously reported global gene expression profiling data for the effects of long-term estrogen administration on liver from diabetic ob/ob mice with ERα DNA binding reported in this study revealed that about 10% of the genes identified in the ob/ob study had ERα DNA-binding sites in their promoters. We speculate that these genes represent the primary estrogen target genes in that study.
In summary, the identification of primary target genes for ERα in mouse liver should facilitate the dissection of estrogen signaling networks and eventually our understanding of estrogen action in important physiological processes such as lipid and glucose metabolism and associated clinical conditions including various aspects of the metabolic syndrome.
MATERIALS AND METHODS
Animals
Ovariectomized female C57BL/6 mice were purchased from Taconic (Lille Skensved, Denmark). All animals were fed ad libitum and had free access to water. Mice were ovariectomized at 10 wk of age. After recovery for 2 wk, mice were randomly assigned to receive treatment with ER agonists or control vehicle and killed 2, 4, or 6 h after injection.
Mice were injected sc with 100 μg/kg 17β-estradiol (Sigma Chemical Co., St. Louis, MO) or 5 mg/kg of the ERα-selective ligand PPT (Tocris Cookson, Ellisville, MO), respectively (n = 5 per treatment group). All ligands were dissolved in 90% sesame oil/10% ethanol.
Livers were collected and stored at −80 C. The local ethical committee had approved all animal experiments.
ChIP-on-chip
Livers obtained from the 2-h estradiol treatment group were used to perform ChIP as previously described (9). After reversal of the cross-links, the purified ChIP-enriched fragments were amplified and labeled according to the standard Affymetrix protocol (http://www.affymetrix.com/products/arrays/specific/mouse_promoter.affx). Labeled products (8 μg) were hybridized to the Affymetrix mouse tiling 2.0R array sets and the Affymetrix mouse promoter 1.0R arrays, respectively (Affymetrix, Santa Clara, CA). These tiled arrays are designed to contain 25-bp probes located at 35-nucleotide resolution. The mouse tiling 2.0R array set includes approximately 45 million oligonucleotide probes to cover all the nonrepetitive sequences in the mouse genome. The promoter array is a focused version of the whole-genome tiling array, which targets the promoter regions of more than 25,000 genes, with each promoter region covering approximately 6 kb upstream through 2.5 kb downstream of the TSS. ChIP-enriched fragments were derived from livers of three individual mice.
Affymetrix Data Analysis
The scanned output files were analyzed with Tiling Analysis Software version 1.1 (Affymetrix). Probes were mapped to mouse chromosomes according to the NCBIv33 (mm5) genome assembly. The samples (three anti-ERα+ChIP and three genomic input samples) were normalized in groups by quantile normalization followed by a scaling normalization to a target intensity of 100. Two-sample analysis with bandwidth 250 was applied to sample groups to estimate the ChIP enrichment at each probe position for anti-ERα+ChIP vs. genomic input. Because the tiling arrays have one 25-mer probe in every 35 bp of nonrepetitive regions, the coverage of every probe was extended by 5 bp on both ends. A ChIP-enriched region was defined as a run of probes with at least 2-fold enrichment compared with genomic input that lie in close genomic proximity (Max gap = 45) to each other. If two neighboring enriched regions were less than 210 bp apart (Minrun = 210), the enriched regions were merged to form one enriched region. The regions that were enriched in anti-ERα+ChIP vs. genomic input were considered ChIP-enriched regions for ERα.
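The region-calling rules described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not a reimplementation of Tiling Analysis Software: the input format (probe start position plus fold enrichment), the function name, and the choice to measure the gap between probe start positions are all assumptions.

```python
from typing import List, Tuple

Probe = Tuple[int, float]   # (probe start position, fold enrichment vs. input)
Region = Tuple[int, int]    # half-open interval [start, end)

PROBE_LEN = 25        # 25-mer probes
EXTEND = 5            # each probe extended by 5 bp on both ends
MIN_FOLD = 2.0        # at least 2-fold enrichment
MAX_GAP = 45          # maximum distance between neighboring enriched probes
MERGE_DISTANCE = 210  # regions closer than this are merged


def call_regions(probes: List[Probe]) -> List[Region]:
    """Group enriched probes into regions, then merge nearby regions."""
    enriched = sorted(pos for pos, fold in probes if fold >= MIN_FOLD)

    # Step 1: runs of enriched probes whose starts lie within MAX_GAP
    # (ASSUMPTION: the gap is measured between probe start positions).
    runs: List[List[int]] = []
    for pos in enriched:
        if runs and pos - runs[-1][1] <= MAX_GAP:
            runs[-1][1] = pos
        else:
            runs.append([pos, pos])

    # Step 2: convert runs to genomic intervals, including probe extension.
    regions = [(first - EXTEND, last + PROBE_LEN + EXTEND) for first, last in runs]

    # Step 3: merge neighboring regions that are less than 210 bp apart.
    merged: List[Region] = []
    for start, end in regions:
        if merged and start - merged[-1][1] < MERGE_DISTANCE:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


example = [(100, 3.0), (135, 2.5), (170, 1.0), (205, 2.2), (1000, 4.0)]
print(call_regions(example))
```

In the example, the probe at position 170 falls below the 2-fold cutoff and splits the first run, but the two resulting intervals are only 35 bp apart and are therefore merged again in the final step.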
Overlap with Known Genes and Exons
We used the RefSeq track from the UCSC browser (24) to assess transcript overlap. For an identified ERα-binding region to be considered overlapping a gene, it had to overlap any RefSeq span (including introns) by at least 1 bp, regardless of strand. Exon overlap was defined analogously, using the exon definitions in the same track.
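The overlap criterion above (at least 1 bp, regardless of strand) can be illustrated with a short Python sketch. The half-open coordinate convention, the function names, and the three labels are assumptions chosen for illustration; the "intronic" label simply denotes a region that overlaps a gene span without touching any of its exons.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of base pairs shared by two intervals (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def classify(region: Interval, gene_span: Interval, exons: List[Interval]) -> str:
    """Label a binding region relative to one gene, ignoring strand."""
    if overlap_bp(region, gene_span) < 1:
        return "intergenic"
    if any(overlap_bp(region, exon) >= 1 for exon in exons):
        return "exonic"
    return "intronic"


# Hypothetical gene with two exons.
gene = (1000, 5000)
exons = [(1000, 1200), (4800, 5000)]
for region in [(1100, 1600), (2000, 2500), (5000, 5500)]:
    print(region, classify(region, gene, exons))
```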
Distance to Closest 5′- and 3′-Gene Ends
For any ER region, we identified the closest RefSeq 5′- and 3′-end. Distances were calculated using the start or end of ERα-binding regions (whichever was closest).
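A minimal Python sketch of this distance calculation, and of the cumulative fraction of regions within a given distance of a 5′-end, is shown below. The tuple layout for genes, the strand handling (the 5′-end is taken as the transcript start on the plus strand and the transcript end on the minus strand), and the function names are assumptions for illustration.

```python
from typing import List, Tuple

Region = Tuple[int, int]     # (start, end) of an ERα-binding region
Gene = Tuple[int, int, str]  # (transcript start, transcript end, strand)


def five_prime_end(gene: Gene) -> int:
    """Genomic coordinate of the 5'-end, taking strand into account."""
    tx_start, tx_end, strand = gene
    return tx_start if strand == "+" else tx_end


def distance_to_nearest_5prime(region: Region, genes: List[Gene]) -> int:
    """Distance from the closer edge of the region to the nearest 5'-end."""
    start, end = region
    return min(
        min(abs(start - five_prime_end(g)), abs(end - five_prime_end(g)))
        for g in genes
    )


def fraction_within(distances: List[int], cutoff: int) -> float:
    """Fraction of regions whose distance is at most the cutoff."""
    return sum(d <= cutoff for d in distances) / len(distances)


# Hypothetical genes and regions on one chromosome.
genes = [(10000, 20000, "+"), (50000, 60000, "-")]
regions = [(9000, 9500), (61000, 61500), (30000, 30500)]
distances = [distance_to_nearest_5prime(r, genes) for r in regions]
print(distances, fraction_within(distances, 10_000))
```

Evaluating `fraction_within` over a range of cutoffs gives the kind of cumulative curve shown for 5′-ends in Fig. 1B; the same functions apply to 3′-ends if the strand rule is reversed.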
For Fig. 1, B and C, in cases where the ERα-binding regions overlapped a gene span, we assessed the distance from the midpoint of the ER region to the 5′-end of the gene and calculated the fraction of the gene span covered by this distance: S = D/Gl, where D is the distance relative to the 5′-end and Gl is the length of the gene. Note that as ERα-binding regions partially overlapping the gene are also included in the analysis, it is possible to get S values higher than 1, or lower than 0, corresponding to situations where the region overlaps the gene but the midpoint is outside the gene boundary. For the few cases where several RefSeq gene spans overlapped the region, one of these gene spans was selected randomly.
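The normalized position S = D/Gl can be sketched as follows. The strand convention (distance measured from the transcript start on the plus strand and from the transcript end on the minus strand) and the function name are assumptions for illustration.

```python
from typing import Tuple

Region = Tuple[int, int]     # (start, end) of an ERα-binding region
Gene = Tuple[int, int, str]  # (gene start, gene end, strand)


def relative_position(region: Region, gene: Gene) -> float:
    """Return S = D / Gl, with the 5'-end at 0 and the 3'-end at 1."""
    start, end = region
    gene_start, gene_end, strand = gene
    midpoint = (start + end) / 2
    gene_length = gene_end - gene_start
    # D is signed: negative if the midpoint lies upstream of the 5'-end.
    if strand == "+":
        d = midpoint - gene_start
    else:
        d = gene_end - midpoint
    return d / gene_length


print(relative_position((1400, 1600), (1000, 3000, "+")))  # midpoint inside gene
print(relative_position((1400, 1600), (1000, 3000, "-")))  # same region, minus strand
print(relative_position((900, 1050), (1000, 3000, "+")))   # midpoint upstream of 5'-end
```

The last call shows how a region that overlaps the gene by only a few base pairs can yield a value below 0.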
Identification of Binding Sites
The genomic DNA corresponding to every ChIP-enriched region identified by whole tiling array was retrieved from the UCSC genome browser. Clover (25) was used to screen the sequences against a precompiled library of motifs to find the statistically overrepresented motifs.
Position weight matrices (PWMs) (n = 123) were obtained from the JASPAR database (26). The P value indicates the probability that the observed overrepresentation of the motif is achieved by random selection and was estimated using mouse chromosome 19 as the background sequence with 1000 randomizations.
Mouse-Human Evolutionary Conservation Analysis
We used whole-genome alignments (the mouse-human NET alignment track from the UCSC browser) to assess sequence similarity between the identified mouse ERα-binding regions and the corresponding human sequences. Importantly, these alignments are chained local BLASTZ alignments, which means that regions with poor similarity will not be aligned.
There are many ways of interpreting similarity from such alignments. We used the same approach as described in Ref. 27. For a given ERα-enriched region of length l in mouse, we counted the number s of mouse nucleotides that were identical to the aligned human nucleotides. We calculated a conservation index c for each such region: c = s/l.
Unaligned mouse nucleotides were counted as nonidentical. To determine whether the similarity is higher than expected by chance, we randomly selected equally sized regions from the mouse genome and repeated the same analysis.
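The conservation index can be sketched as follows. The input representation (one human character per mouse nucleotide, with "-" marking an unaligned position) and the function name are assumptions made for illustration, not the format of the UCSC NET track.

```python
def conservation_index(mouse_seq: str, human_aligned: str) -> float:
    """c = s / l, where s counts mouse nucleotides identical to the aligned human base.

    human_aligned[i] is the human base aligned to mouse_seq[i], or '-' if the
    mouse nucleotide is unaligned (counted as nonidentical).
    """
    if len(mouse_seq) != len(human_aligned):
        raise ValueError("sequences must have the same length")
    l = len(mouse_seq)
    s = sum(1 for m, h in zip(mouse_seq.upper(), human_aligned.upper())
            if h != "-" and m == h)
    return s / l
```

For the mouse sequence ACGTACGT aligned to ACGA--GT, five of eight mouse nucleotides are identical, giving c = 0.625. The same function can be applied to the randomly selected control regions.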
Multiple Alignment Analysis
To assess evolutionary constraints over larger evolutionary distances, we used the PhastCons track from the UCSC browser (28). Briefly, this is a Hidden Markov Model evaluation of whole-genome alignments between several vertebrate genomes. Conservation values range from 0 to 1, where 1 indicates extreme conservation.
We extracted PhastCons scores for every nucleotide in each ER-enriched region and then evaluated the mean PhastCons score when all ER sites were aligned based on their midpoint. Because some ER sites are longer than others, we padded sites as necessary so that each site had the same length. Nucleotides with no PhastCons value (due to unalignable regions, etc.) were treated as missing values in computing means.
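A minimal sketch of the midpoint-aligned averaging is shown here. The function name and the choice to treat padded positions in the same way as missing PhastCons values are assumptions made for illustration.

```python
from typing import List, Optional


def mean_profile(sites: List[List[Optional[float]]]) -> List[Optional[float]]:
    """Mean score per position after aligning all sites on their midpoints.

    Each site is a list of per-nucleotide scores; None marks a missing value.
    Shorter sites are implicitly padded, and padded positions are ignored.
    """
    half = max(len(s) // 2 for s in sites)
    width = 2 * half + 1
    sums = [0.0] * width
    counts = [0] * width
    for s in sites:
        offset = half - len(s) // 2  # shift so that the site midpoint lands at index `half`
        for i, v in enumerate(s):
            if v is not None:
                sums[offset + i] += v
                counts[offset + i] += 1
    return [sums[i] / counts[i] if counts[i] else None for i in range(width)]
```

Positions covered by no site, or only by missing values, are returned as None rather than as zero, so that they do not lower the apparent conservation at the flanks.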
We picked an equal number of random regions of the same size to determine whether the conservation of ER regions is higher than expected by chance, using the same selection methods as in the mouse-human analysis described above.
Functional Classification of Target Genes
The overrepresentation analysis approach was used to test sets of related genes that recruit ERα into their promoter regions. First, ERα-binding regions identified by the promoter array were mapped to promoters (upstream 6000 bp to downstream 2000 bp of TSS) of 645 known genes based on the Ensembl gene build 30 for the NCBIv33 mouse genome assembly. High-Throughput GoMiner (29) was employed to find enrichment of target genes involved in a particular function using all the a priori defined gene ontology (GO) categories (www.geneontology.org).
Enrichment of target genes involved in an a priori defined GO category was determined by a one-sided Fisher’s exact test. Significantly changed GO categories with between 50 and 385 genes were reported to exclude very small and general pathways, respectively. The P value from the one-sided Fisher’s exact test was reported, as well as the FDR estimated after multiple-comparison correction based on a resampling technique. The relationships of GO terms that had been identified as significantly overrepresented among the target genes were visualized by VennMaster (30) based on Venn diagrams.
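The one-sided Fisher’s exact test for enrichment corresponds to the upper tail of a hypergeometric distribution, which can be sketched as follows. This is not the High-Throughput GoMiner implementation; the function name and argument names are assumptions made for illustration, and the resampling-based FDR is not shown.

```python
from math import comb


def fisher_one_sided(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for enrichment of a GO category among target genes.

    N: total genes in the background, K: background genes in the category,
    n: target genes, k: target genes in the category.
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the hypergeometric probabilities for k, k + 1, ..., min(n, K).
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

For example, with 10 background genes of which 4 belong to the category, drawing 3 target genes and observing at least 2 in the category has probability 40/120 = 1/3.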
ChIP Followed by Real-Time PCR
Forty cycles of three-step real-time PCR assays were carried out on precipitated DNA for the selected ChIP-enriched regions. The annealing temperature was optimized for each primer pair (60 C, with the exception of 56 C for enriched regions mapping to Shp and Pparα).
The sequences of the primers are as follows: Shp: forward, CATGGAAATGGGCATCAATA; reverse, CGTGGCCTTGCTATCACTTT. Stat3: forward, CCTACACTGACAGCCCAACA; reverse, CCACATCCCTCGGTTGTATC. Acox1: forward, CAAATTGGAGCGAAAGGGTA; reverse, CTTTAAATCCAGGCCCAGTG. Gpam: forward, GATGTTGGCTGAAACCCTGT; reverse, GTATCGTTTGAAGCCGGGTA. Pck1: forward, CAACAGGCAGGGTCAAAGTT; reverse, GCACGGTTTGGAACTGACTT. Pdk4: forward, TGATGGTGGCTGTTGTTGTT; reverse, CGTGATTCCCAGACAAACCT. Pparα: forward, TCCATGCTTTCTGCATTGAG; reverse, TGGATGTCACCTGCAAATGT. Ptgds: forward, CTCCTCTGGTGGCTCAGAAC; reverse, GACACTGGCCCGAGTAACAT.
RNA Purification and Quantification
Total RNA was purified using the TRIzol reagent (Invitrogen, Carlsbad, CA) followed by RNeasy Mini kits (QIAGEN, Valencia, CA). Total RNA (2 μg) from each individual animal was reverse transcribed into cDNA using Superscript II (Invitrogen) with random hexamer primers. Primers were designed with the Primer Express 3.0 software (Applied Biosystems, Foster City, CA); primer pairs reside in separate exons and have melting temperature values of 58–60 C.
Real-time PCR assays were conducted using the Applied Biosystems 7500 fast real-time PCR system with SYBR green RT-PCR reagent kit (Applied Biosystems). All real-time PCR reactions were performed in duplicate. Analysis of melting curves demonstrated amplification of one specific gene product for each primer pair. The real-time PCR data were analyzed by an assumption-free analysis method based on the absolute fluorescence as described by Ramakers et al. (31).
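The core of the assumption-free approach is a linear regression of log-transformed fluorescence on cycle number within the log-linear phase of each amplification curve, from which the per-sample PCR efficiency and starting fluorescence are estimated. The sketch below illustrates that regression only; the function name is hypothetical, the selection of the window of linearity is assumed to have been done beforehand, and no normalization step is shown.

```python
from math import log10
from typing import Sequence, Tuple


def fit_log_linear(cycles: Sequence[float],
                   fluorescence: Sequence[float]) -> Tuple[float, float]:
    """Fit log10(F) = log10(N0) + cycle * log10(E) by least squares.

    Inputs must come from the log-linear phase of one amplification curve.
    Returns (E, N0): amplification efficiency per cycle and starting fluorescence.
    """
    xs = list(cycles)
    ys = [log10(f) for f in fluorescence]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** slope, 10 ** intercept
```

An efficiency of 2 corresponds to perfect doubling in each cycle; values below 2 indicate less efficient amplification.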
The sequences of the primers are as follows: Shp: forward, AAGGGCACGATCCTCTTCAA; reverse, CTGTTGCAGGTGTGCGATGT. Stat3: forward, GCCCCGTACCTGAAGACCA; reverse, GACATCGGCAGGTCAATGG. Ptgds: forward, CACAGAGGAGGACATTGTTTTCC; reverse, ACTGACTTCTCTCACCTGCGTTT. Acox1: forward, TGCAGCTCAGAGTCTGTCCAA; reverse, GGCTCGCTTCTCTTGATTTCA. Pdk4: forward, GAGCTGGTATATCCAGAGCCTGAT; reverse, CGAACTTTGACCAGCGTGTCT. Gpam: forward, CCTTCAAGACCGAATGATGTTG; reverse, GGTTTGCAATCAGCCTTCGT. Pck1: forward, GGCCACAGCTGCTGCAG; reverse, GGTCGCATGGCAAAGGG. Pparα: forward, GATTCAGAAGAAGAACCGGAACA; reverse, TGCTTTTTCAGATCTTGGCATTC. Hprt: forward, GCAGTACAGCCCCAAAATGG; reverse, AACAAAGTCTGGCCTGTATCCAA.
Statistics
Results were expressed as mean ± sem. A P value < 0.05 was considered to be significant.
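For reference, the mean and sem of a group of measurements can be computed as in the following sketch, which assumes that the sem is the sample sd (n − 1 denominator) divided by the square root of n. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sem), with sem = sample sd / sqrt(n)."""
    n = len(values)
    return mean(values), stdev(values) / sqrt(n)
```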
Acknowledgments
We thank Boris Lenhard for advice on genome sequence analysis and ChIP-enriched region mapping. We also thank the Bioinformatics and Expression Analysis core facility, Karolinska Institutet, for assisting with the Affymetrix assays.
NURSA Molecule Pages:
Ligands: 17β-Estradiol;
Nuclear Receptors: ERα.
Footnotes
This study was supported by grants from European Union Network of Excellence CASCADE, the Swedish Cancer Fund, Novo Nordisk Foundation, and KaroBio AB. A.S. is supported by a grant from the Novo Nordisk Foundation to the Bioinformatics Centre.
Disclosure statement: H.G., S.F., A.S., and K.D.-W. have nothing to declare. J.-Å.G. is a consultant and shareholder of KaroBio AB.
First Published Online September 27, 2007
Abbreviations: AP1, Activator protein 1; bHLH, basic-helix-loop-helix; ChIP, chromatin immunoprecipitation; ER, estrogen receptor; ERE, estrogen response element; FDR, false discovery rate; GO, gene ontology; PPT, propyl pyrazole triol; PWM, position weight matrix; TSS, transcription start site.
References
- 1. Green S, Walter P, Greene G, Krust A, Goffin C, Jensen E, Scrace G, Waterfield M, Chambon P 1986. Cloning of the human oestrogen receptor cDNA. J Steroid Biochem 24:77–83
- 2. Kuiper GG, Enmark E, Pelto-Huikko M, Nilsson S, Gustafsson JA 1996. Cloning of a novel receptor expressed in rat prostate and ovary. Proc Natl Acad Sci USA 93:5925–5930
- 3. Nelson LR, Bulun SE 2001. Estrogen production and action. J Am Acad Dermatol 45:S116–S124
- 4. Kushner PJ, Agard DA, Greene GL, Scanlan TS, Shiau AK, Uht RM, Webb P 2000. Estrogen receptor pathways to AP-1. J Steroid Biochem Mol Biol 74:311–317
- 5. Scholz A, Truss M, Beato M 1998. Hormone-induced recruitment of Sp1 mediates estrogen activation of the rabbit uteroglobin gene in endometrial epithelium. J Biol Chem 273:4360–4366
- 6. Safe S 2001. Transcriptional activation of genes by 17β-estradiol through estrogen receptor-Sp1 interactions. Vitam Horm 62:231–252
- 7. Louet JF, LeMay C, Mauvais-Jarvis F 2004. Antidiabetic actions of estrogen: insight from human and genetic mouse models. Curr Atheroscler Rep 6:180–185
- 8. Bryzgalova G, Gao H, Ahren B, Zierath JR, Galuska D, Steiler TL, Dahlman-Wright K, Nilsson S, Gustafsson JA, Efendic S, Khan A 2006. Evidence that oestrogen receptor-α plays an important role in the regulation of glucose homeostasis in mice: insulin sensitivity in the liver. Diabetologia 49:588–597
- 9. Gao H, Bryzgalova G, Hedman E, Khan A, Efendic S, Gustafsson JA, Dahlman-Wright K 2006. Long-term administration of estradiol decreases expression of hepatic lipogenic genes and improves insulin sensitivity in ob/ob mice: a possible mechanism is through direct regulation of signal transducer and activator of transcription 3. Mol Endocrinol 20:1287–1299
- 10. Carroll JS, Liu XS, Brodsky AS, Li W, Meyer CA, Szary AJ, Eeckhoute J, Shao W, Hestermann EV, Geistlinger TR, Fox EA, Silver PA, Brown M 2005. Chromosome-wide mapping of estrogen receptor binding reveals long-range regulation requiring the forkhead protein FoxA1. Cell 122:33–43
- 11. Jen KY, Cheung VG 2005. Identification of novel p53 target genes in ionizing radiation response. Cancer Res 65:7666–7673
- 12. Carroll JS, Meyer CA, Song J, Li W, Geistlinger TR, Eeckhoute J, Brodsky AS, Keeton EK, Fertuck KC, Hall GF, Wang Q, Bekiranov S, Sementchenko V, Fox EA, Silver PA, Gingeras TR, Liu XS, Brown M 2006. Genome-wide analysis of estrogen receptor binding sites. Nat Genet 38:1289–1297
- 13. Wasserman WW, Sandelin A 2004. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 5:276–287
- 14. Lenhard B, Sandelin A, Mendoza L, Engstrom P, Jareborg N, Wasserman WW 2003. Identification of conserved regulatory elements by comparative genome analysis. J Biol 2:13
- 15. Klinge CM 2001. Estrogen receptor interaction with estrogen response elements. Nucleic Acids Res 29:2905–2919
- 16. Kikuta H, Laplante M, Navratilova P, Komisarczuk AZ, Engstrom PG, Fredman D, Akalin A, Caccamo M, Sealy I, Howe K, Ghislain J, Pezeron G, Mourrain P, Ellingsen S, Oates AC, Thisse C, Thisse B, Foucher I, Adolf B, Geling A, Lenhard B, Becker TS 2007. Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates. Genome Res 17:545–555
- 17. Laganiere J, Deblois G, Lefebvre C, Bataille AR, Robert F, Giguere V 2005. From the cover: location analysis of estrogen receptor α target promoters reveals that FOXA1 defines a domain of the estrogen response. Proc Natl Acad Sci USA 102:11651–11656
- 18. Lai K, Harnish DC, Evans MJ 2003. Estrogen receptor α regulates expression of the orphan receptor small heterodimer partner. J Biol Chem 278:36418–36429
- 19. Otsuki M, Gao H, Dahlman-Wright K, Ohlsson C, Eguchi N, Urade Y, Gustafsson JA 2003. Specific regulation of lipocalin-type prostaglandin D synthase in mouse heart by estrogen receptor β. Mol Endocrinol 17:1844–1855
- 20. Khan S, Barhoumi R, Burghardt R, Liu S, Kim K, Safe S 2006. Molecular mechanism of inhibitory aryl hydrocarbon receptor-estrogen receptor/Sp1 cross talk in breast cancer cells. Mol Endocrinol 20:2199–2214
- 21. Sandelin A, Carninci P, Lenhard B, Ponjavic J, Hayashizaki Y, Hume DA 2007. Mammalian RNA polymerase II core promoters: insights from genome-wide studies. Nat Rev Genet 8:424–436
- 22. Carninci P, Sandelin A, Lenhard B, Katayama S, Shimokawa K, Poniavic J, Semple CA, Taylor MS, Engstrom PG, Frith MC, Forrest AR, Alkema WB, Tan SL, Plessy C, Kodzius R, Ravasi T, Kasukawa T, Fukuda S, Danamori-Katayama M, Kitazume Y, Kawaii H, Kai C, Nakamura M, Konno H, Nakano K, et al 2006. Genome-wide analysis of mammalian promoter architecture and evolution. Nat Genet 38:626–635
- 23. Ragolia L, Palaia T, Hall CE, Maesaka JK, Eguchi N, Urade Y 2005. Accelerated glucose intolerance, nephropathy, and atherosclerosis in prostaglandin D2 synthase knock-out mice. J Biol Chem 280:29946–29955
- 24. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D 2002. The human genome browser at UCSC. Genome Res 12:996–1006
- 25. Frith MC, Fu Y, Yu L, Chen JF, Hansen U, Weng Z 2004. Detection of functional DNA motifs via statistical over-representation. Nucleic Acids Res 32:1372–1381
- 26. Sandelin A, Alkema W, Engstrom P, Wasserman WW, Lenhard B 2004. JASPAR: an open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res 32:D91–D94
- 27. Frith MC, Ponjavic J, Fredman D, Kai C, Kawai J, Carninci P, Hayashizaki Y, Sandelin A 2006. Evolutionary turnover of mammalian transcription start sites. Genome Res [Erratum (2006) 16:947] 16:713–722
- 28. King DC, Taylor J, Elnitski L, Chiaromonte F, Miller W, Hardison RC 2005. Evaluation of regulatory potential and conservation scores for detecting cis-regulatory modules in aligned mammalian genome sequences. Genome Res 15:1051–1060
- 29. Zeeberg BR, Qin H, Narasimhan S, Sunshine M, Cao H, Kane DW, Reimers M, Stephens RM, Bryant D, Burt SK, Elnekave E, Hari DM, Wynn TA, Cunningham-Rundles C, Stewart DM, Nelson D, Weinstein JN 2005. High-Throughput GoMiner, an ‘industrial-strength’ integrative gene ontology tool for interpretation of multiple-microarray experiments, with application to studies of common variable immune deficiency (CVID). BMC Bioinformatics 6:168
- 30. Kestler HA, Muller A, Gress TM, Buchholz M 2005. Generalized Venn diagrams: a new method of visualizing complex genetic set relations. Bioinformatics 21:1592–1595
- 31. Ramakers C, Ruijter JM, Deprez RH, Moorman AF 2003. Assumption-free analysis of quantitative real-time polymerase chain reaction (PCR) data. Neurosci Lett 339:62–66
- 32. Schneider TD, Stephens RM 1990. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res 18:6097–6100