SUMMARY
The spatiotemporal regulation of gene expression is central for cell-lineage specification during embryonic development and is achieved through the combinatorial action of transcription factors/co-factors and the epigenetic states at cis-regulatory elements. Previously, we reported that Mll2 (KMT2B)/COMPASS is responsible for the implementation of H3K4me3 at promoters of bivalent genes. Here, we show that Mll2/COMPASS can also implement H3K4me3 at some of the non-TSS regulatory elements, a subset of which share epigenetic signatures of active enhancers. Our mechanistic studies reveal that the association of Mll2’s CXXC domain with CpG-rich regions plays an instrumental role for chromatin targeting and subsequent implementation of H3K4me3. Although Mll2/COMPASS is required for H3K4me3 implementation on thousands of sites, it appears to be essential for the expression of only a subset of genes, including those functioning in the control of transcriptional programs during embryonic development, indicating that not all H3K4 trimethylations implemented by Mll2/COMPASS are functionally equivalent.
Graphical abstract
Blurb
Hu et al. analyze the contribution of MLL2’s methyltransferase and CXXC domains to the trimethylation of H3K4 in mouse ES cells and find that, while it trimethylates H3K4 at both bivalent gene promoters and non-TSS elements, it regulates transcription at a limited number of genes, including those required for PGC specification.
INTRODUCTION
Histone H3K4me3 is an evolutionarily conserved chromatin mark from yeast to mammals and is associated with diverse chromatin-based processes, such as chromatin remodeling, transcriptional initiation, histone acetylation, and DNA recombination (Li et al., 2006; Matthews et al., 2007; Vermeulen et al., 2010). In budding yeast, H3K4 methylation is deposited by Set1/COMPASS (complex of proteins associated with Set1) (Krogan et al., 2002; Miller et al., 2001; Roguev et al., 2001; Schneider et al., 2005; Shilatifard, 2012). Drosophila has three Set1-related H3K4 methyltransferases, dSet1, Trithorax (Trx), and Trithorax-related (Trr), each of which forms a COMPASS-like complex (Mohan et al., 2011; Shilatifard, 2012). Mammals have two close relatives for each of the three H3K4 methyltransferases identified in Drosophila, resulting in a total of six COMPASS-like complexes: Set1a and Set1b; Mll1 and Mll2; and Mll3 and Mll4 (Piunti and Shilatifard, 2016; Shilatifard, 2012). Functional diversity within the COMPASS family in metazoans includes deposition of distinct methylation states of H3K4 and targeting distinct genomic regions. The Set1 proteins within COMPASS mediate the bulk of promoter H3K4me3 through crosstalk with the H2B monoubiquitination machinery (Mohan et al., 2011; Wu et al., 2008). Mll1 is required for H3K4me3 at the promoters of a subset of genes in mouse embryonic fibroblasts (MEFs) (Wang et al., 2009), while Mll2 is responsible for implementation of H3K4me3 at bivalently marked gene promoters in mouse embryonic stem cells (mESC) (Hu et al., 2013b). Recently, the Drosophila ortholog of Mll1 and Mll2 was shown to implement H3K4me2 at Polycomb response elements (Rickels et al., 2016). In contrast, the enhancer mark H3K4me1 is largely implemented by Trr in Drosophila and Mll3 and Mll4 in mammals (Herz et al., 2014; Herz et al., 2012; Hu et al., 2013a; Morgan and Shilatifard, 2015). 
From yeast to humans, a direct functional role for H3K4 methylation in transcription remains unclear. Set1/COMPASS is the only H3K4 methyltransferase in yeast and its deletion affects all three states of H3K4 methylation (Schneider et al., 2005; Shilatifard, 2012). Nevertheless, there is no widespread transcriptional alteration in the absence of Set1 in budding yeast (Miller et al., 2001). Likewise, in mammalian cells, the loss of H3K4me3 at promoters has minimal effects on steady-state and regulated transcriptional induction in mESC (Clouaire et al., 2014; Clouaire et al., 2012; Hu et al., 2013b). Therefore, the role of H3K4 methylation in regulating transcription and embryonic development remains elusive. In this study, we uncover an essential role for the catalytic activity of Mll2/COMPASS in H3K4 methylation in the regulation of a limited number of genes, including at enhancers and promoters of genes encoding regulators of PGC specification.
RESULTS
Mll2/COMPASS occupies both promoters and non-TSS regulatory elements in mESC
Histone H3K4me3 accumulates at promoter-proximal regions of active genes but can also be found with H3K27me3 at the lowly transcribed “bivalent” genes in ES cells (Azuara et al., 2006; Bernstein et al., 2006; Santos-Rosa et al., 2002). We previously demonstrated that the H3K4me3 at bivalent promoters in ES cells is implemented by the Mll2 branch of the COMPASS family (Hu et al., 2013b). To gain a broader understanding of the role of Mll2/COMPASS in transcriptional regulation during development, we generated antibodies recognizing two different epitopes in the C-terminal portion of Mll2 (ab CT1 and more C-terminal ab CT2) (Figure S1A). We first confirmed the specificity of the two antibodies in the detection of endogenous Mll2 protein by immunoblotting whole cell extracts from mESC in which Mll2 was depleted by RNAi (Figure S1B). We further validated the two antibodies with immunoprecipitation and found that components of Mll2/COMPASS were co-immunoprecipitated with Mll2 (Figure S1C–S1D).
We identified Mll2 targets by ChIP-seq with each antibody (Figure 1A and S1E). A total of 19,822 binding regions (peaks) were identified with ab CT2 Mll2, among which 70%, 14%, and 16% of peaks are localized to promoters, gene bodies and intergenic regions, respectively (Figure 1A). The high percentage of Mll2 occupancy at promoters was consistent with its activity at bivalent genes in mESC (Hu et al., 2013b). Mll2 peaks localized within some gene bodies or at intergenic regions (non-TSS) demonstrated a lower occupancy than sites of Mll2 occupancy overlapping transcription start sites (TSS) (Figure 1B). Similar results were observed when performing ChIP-seq with ab CT1 (Figure S1E–S1F). Inspection of non-TSS Mll2 peaks near H3f3a and Foxj2 loci reveals that they are co-occupied with the active enhancer marks p300, H3K4me1 and H3K27ac (Figure 1C). Non-TSS Mll2 peaks can be associated with p300, H3K4me1, H3K27ac, and H3K27me3 (Figure 1D and S1G). More of the non-TSS Mll2-associated regions are enriched for the active enhancer marks of p300, H3K4me1, and H3K27ac, than for H3K27me3, a mark of poised enhancers (Rada-Iglesias et al., 2011) (Figure 1D–E and S1G–H).
Figure 1. Mll2/COMPASS catalyzes H3K4me3 at non-TSS Mll2 binding sites.
(A) Pie chart of genome-wide Mll2 distribution in mESC determined by ChIP-seq with ab CT2.
(B) Mll2 occupancy at TSS and non-TSS regions.
(C) Genome browser tracks of Mll2, p300, H3K4me1 and H3K27ac at putative enhancers.
(D–E) Binary enrichment profiles (D) and binding percentages (E) for Mll2, p300, H3K4me1, H3K27ac, H3K27me3, and H3K4me3 ± 5kb regions centered at 6,418 high-confidence non-TSS Mll2 peaks. Enrichment determined by p < 1e-8 and FDR < 0.05.
(F) Percentage of 6,418 high-confidence non-TSS Mll2 bound regions that are enriched for indicated histone marks. Group I regions have enrichment for p300, H3K4me1 and H3K27ac, while Group II regions exhibit occupancy of p300, H3K4me1, H3K27ac and H3K4me3.
(G) Gene expression analysis of RNA-seq data in mESC for all genes, nearest genes of all 6,418 high-confidence non-TSS Mll2 peaks, or Mll2-associated Group I and Group II regions. Boxes display the 25–75% ranked genes with the median indicated as an intersection. P-value determined by Wilcoxon rank-sum test.
(H) ChIP-seq profiles for H3K4me3 and Mll2 at putative enhancer regions from (C) in mESC infected with lentiviral shRNAs targeting Mll2 (shMll2) or control (shGFP).
(I) ChIP-qPCR validation of changes at putative enhancers indicated in (H). ChIP signals were calculated as percentages of input and then normalized to that obtained from shGFP cells and presented in the log2 scale. Error bars represent two technical replicates in a representative experiment of at least three biological replicates.
(J) Heat map of Mll2 and H3K4me3 ChIP-seq occupancy ± 5kb at the 6,418 non-TSS Mll2 peaks in Mll2 and control knockdown cells (Left). H3K4me3 log2 fold change after Mll2 depletion is shown 5kb flanking 6,418 non-TSS Mll2 peaks (Middle). H3K27ac occupancy for the same regions is shown (Right). The heat map is rank-ordered from 6,418 sites with highest to lowest occupancy of Mll2 in mESC.
(K) Boxplot representation of ChIP-seq occupancy for Mll2 and H3K4me3 at 6,418 non-TSS Mll2 targets. P-values determined by Wilcoxon rank-sum test. See also Figure S1.
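The ChIP-qPCR quantification described in (I), percent of input normalized to the shGFP control and shown on a log2 scale, can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the conversion from Ct values to percent input assumes roughly 100% amplification efficiency and a hypothetical input fraction of 1%, and the function names are placeholders.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Percent of input recovered in the IP, from qPCR Ct values.

    Assumption (not stated in the text): ~100% amplification efficiency,
    and the input sample is `input_fraction` of the chromatin used per IP.
    """
    # Shift the input Ct to the value a 100% input sample would give.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def log2_relative_to_control(pct_knockdown, pct_control):
    """log2 of the knockdown signal relative to the control (shGFP) signal."""
    return math.log2(pct_knockdown / pct_control)


# Hypothetical percent-input values for one enhancer amplicon.
change = log2_relative_to_control(pct_knockdown=0.5, pct_control=2.0)
```

A negative value indicates reduced occupancy in shMll2 cells relative to shGFP cells; a value near zero indicates no change.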
Approximately 39% of non-TSS Mll2-associated elements (2,465 out of 6,418) share marks with active enhancers (Figure 1F), half of which are further marked with H3K4me3 (Figure 1F). Similar percentages of co-occurrence of H3K4me3 with marks of active enhancers were observed with Mll2 ab CT1 (Figure S1I) or with H3K4me3 ChIP-seq datasets generated in different mESC lines and by distinct laboratories (Clouaire et al., 2014; Denissov et al., 2014; Mikkelsen et al., 2007) (Figure S1J). Gene expression analysis revealed that genes closest to non-TSS Mll2 sites had an overall higher expression level than all genes, consistent with them being enhancers (Figure 1G). Genes nearest non-TSS Mll2 sites with H3K27ac had further increased expression, consistent with these being active enhancers (Figure 1G). However, the co-occurrence of H3K4me3 with these putative active enhancers was not significantly predictive of higher expression than H3K27ac alone (Figure 1G, group II versus group I), suggesting that for a majority of these putative enhancers, Mll2-dependent H3K4me3 functions redundantly with H3K27ac and is not associated with driving increased expression in the ES cell state.
Mll2 catalyzes H3K4 methylation at non-TSS sites
As Mll2 functions as a methyltransferase for H3K4 and associates with non-TSS Mll2-associated sites enriched for H3K4me3, we reasoned that Mll2 may be the enzyme directly responsible for implementation of H3K4me3 at the non-TSS Mll2-associated sites. To explore this hypothesis, we employed H3K4me3 ChIP-seq in mESC infected with lentiviral shRNA against GFP and Mll2, respectively. Efficient knockdown of Mll2 has previously been described (Hu et al., 2013b) and confirmed at the protein level (Figure S1B). Mll2 depletion resulted in nearly complete loss of its occupancy around H3f3a and Foxj2, indicating the specificity of this antibody for ChIP-seq (Figure 1H). Histone H3K4me3 occupancy at enhancers of H3f3a and Foxj2 was similarly reduced after Mll2 depletion (Figure 1H). In contrast, H3K4me3 levels at the promoters of H3f3a and Foxj2 were unaffected (Figure 1H–1I).
To further investigate the role of Mll2 in H3K4me3 deposition at non-TSS Mll2-associated sites, global analysis of Mll2 and H3K4me3 in shGFP and shMll2-treated cells was performed. Mll2 depletion resulted in broad reduction of H3K4me3 levels and Mll2 occupancy at non-TSS Mll2-associated sites (Figure 1J–1K). To rule out the possibility that H3K4me3 at non-TSS sites was a crosslinking artifact due to enhancer-promoter interactions, ChIP-seq with native MNase-digested chromatin was performed (Figure S1K).
Cxxc1/Cfp1, a subunit found in Set1A and Set1B/COMPASS (Shilatifard 2012), is required for targeting Set1A and Set1B to promoters of active genes and establishing H3K4me3 in mESC (Clouaire et al., 2012). However, at non-TSS Mll2-associated sites, H3K4me3 occupancy remained unchanged in the absence of Cxxc1 (Figure S1L), further indicating the specific requirement for Mll2/COMPASS in H3K4me3 deposition at these regions. Direct regulation of H3K4me3 was further validated by determining Mll2 and H3K4me3 occupancies at non-TSS Mll2 bound regions identified by the CT1 Mll2 antibody (Figure S1M). Together, our data established a requirement of Mll2/COMPASS for H3K4me3 deposition at a subset of non-TSS sites in mESC.
Ectopically expressed Mll2 depends on its CXXC domain for recruitment to chromatin
Menin associates with the Mll1 and Mll2 COMPASS-like complexes and is required for targeting MLL1 fusion proteins to genomic loci associated with leukemic transformation (Yokoyama et al., 2005). We performed ChIP-seq for Menin in mESC and observed that Menin co-occurs with Mll2 at putative enhancers close to H3f3a and Foxj2 (Figure S2A). Although Menin and Mll2 co-occur at many loci (Figure S2B), depletion of Menin by shRNA revealed no major role for Menin in Mll2 recruitment in mESC (Figure S2C–D), suggesting that Menin-dependent recruitment to chromatin could be cellular context dependent.
To identify the molecular mechanisms underlying Mll2 targeting to chromatin, we used CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats) technology (Cong et al., 2013) to generate Mll2 knockout (Mll2KO) mESC, which could be further used in rescue experiments (Figure 2A). Western blot analysis of a representative mESC clone demonstrated a complete loss of Mll2 protein in Mll2KO mESC (Figure 2B). Consistent with the shRNA-mediated Mll2 knockdown studies, Mll2 knockout had no major effect on bulk levels of H3K4 methylation in mESC (Figure 2B). We first tested the requirement for the CXXC domain of Mll2 for its targeting to chromatin, since the homologous domain in Mll1 can bind unmethylated CpG-containing DNA and plays an essential role in the recruitment of MLL1 fusion proteins to HOXA9 in leukemia cells (Ayton et al., 2004; Cierpicki et al., 2010). Although most promoters in mammals are characterized by the presence of CpG islands (Guenther et al., 2007), about half of mammalian CpG islands are found in intragenic and intergenic regions (Illingworth et al., 2010).
Figure 2. Rescue strategy for dissecting Mll2 activities in mESC.
(A) Top panel, graphical representation of the Mll2 locus and two CRISPR/Cas9 target locations. Target sites are indicated in green and PAM sequences in red. Bottom panel, wild-type and mutant sequenced clones. Size of deletion indicated in base pairs (bp).
(B) Western analysis of Mll2 protein and histone H3K4 methylation levels in wild-type (WT) and a representative Mll2 knockout (Mll2KO) mESC clone. Triangles indicate increasing loaded extract.
(C) Schematic of the human MLL2 domain structure and derivatives used in rescue experiments. Amino acids 958 to 1000 were deleted in the MLL2 (ΔCXXC) mutant rescue fragment. The catalytic mutant (Y2604A) is indicated with red star in the SET domain.
(D) Western analysis of Mll2 protein levels in parental and Mll2KO mESC rescued with empty vector, N-terminally Halo-tagged wild-type, ΔCXXC, or catalytically deficient MLL2. All rescue cell lines express human MLL2 near endogenous mouse Mll2 levels.
(E) MLL2 (ΔCXXC) and MLL2 (Y2604A) properly associate within an MLL2/COMPASS-like complex. Mouse V6.5 cells electroporated with indicated plasmids were immunoprecipitated with Halo antibody and eluates were analyzed by immunoblotting using antibodies against Menin, Rbbp5, Ash2l, Wdr5 and Halo tag.
(F) UCSC genome browser tracks of H3K4me3 occupancy near H3f3a in parental and Mll2KO mESC with empty vector, or wild-type, ΔCXXC, or catalytically deficient MLL2.
(G) Heat map of H3K4me3 occupancy at 6,418 non-TSS Mll2 sites in wild-type, KO or rescue cells from (F). H3K4me3 ChIP-seq signal is represented as in Figure 1J. CpG enrichment is shown in the right-most panel.
(H) Boxplot analysis of H3K4me3 occupancy from (G). Boxes represent 25th and 75th percentiles. P-values determined by Wilcoxon rank sum test.
(I) Heat map analysis of Mll2 occupancy at 6,418 non-TSS Mll2 sites in wild-type, KO or rescue cells from (F). Mll2 occupancy represented as in Figure 1J. CpG enrichment is from (G).
(J) Boxplot analysis of Mll2 occupancy from (I). P-values were calculated using the Wilcoxon rank sum test. See also Figure S2.
We reintroduced wild-type, CXXC domain-deficient (ΔCXXC) or catalytically mutant (Y2604A) versions of Halo-tagged human MLL2 (Figure 2C) into Mll2KO mESC. The different rescue constructs were expressed at levels similar to endogenous Mll2 (Figure 2D) and all of the proteins could assemble into COMPASS-like complexes (Figure 2E and S2E). While the wild-type MLL2 construct was able to fully rescue the H3K4me3 defect at non-TSS Mll2-associated sites, the CXXC and Y2604A mutants could not (Figure 2F–2H and S2F). Although the Y2604A and wild-type Halo human MLL2 proteins were properly recruited to Mll2-associated regions, the recruitment of the CXXC mutant was impaired (Figures 2I–2J and S2G). In line with a role for the CXXC domain in recruiting Mll2 to chromatin, the non-TSS Mll2 sites overlap CpG-rich regions (Figure 2G and 2I).
CRISPR-mediated genome editing of endogenous Mll2’s CXXC and catalytic domains
We used CRISPR-mediated homologous recombination to mutate six of the eight conserved cysteine residues to alanine to abolish CXXC function of the endogenous Mll2, and tyrosine 2602 (corresponding to tyrosine 2604 in human MLL2) to alanine to generate catalytically deficient Mll2, respectively (Figures 3A–3B and S3A–S3F). Western blotting in the targeted mESC clones demonstrated that the amino acid substitutions have no observable impact on Mll2 protein levels (Figure 3C). ChIP-seq for H3K4me3 in parental, Mll2 knockout, and two independent clones each for the CXXC domain and catalytic mutants demonstrated a requirement for Mll2’s CXXC domain and catalytic activity for non-TSS H3K4me3 (Figures 3D–E and S3G–S3H). ChIP-seq for Mll2 demonstrated that the catalytic mutant, but not the CXXC mutant, was properly recruited to non-TSS Mll2-associated sites and bivalent promoters (Figures 3F–3G and S3I). Taken together, these findings establish an essential role for the CXXC domain and the methyltransferase activity of Mll2 in ES cells for chromatin targeting and H3K4me3 deposition at non-TSS Mll2-associated sites.
Figure 3. CRISPR editing demonstrates requirements for Mll2’s CXXC and catalytic domains in mESC.
(A) Upper panel: Strategy for disrupting the catalytic activity of Mll2 in mESC. A homology repair template changes the catalytic tyrosine 2602 to alanine. The CRISPR-Cas9 cleavage site is indicated with a red arrowhead. Primers (P1 to P6) used for genotyping are shown as bars with primer pairs color-coded. Lower panel: Agarose gel electrophoresis of PCR genotyping products in clones before (N) and after (F) flippase (Flp)-mediated eviction of the Neo cassette. Note that PCR products fail to amplify with primers P1 and P2 in two targeted ES cell clones due to the large Neo cassette but can be amplified after Flp-mediated excision of Neo, with the PCR product migrating slower than the wild-type product due to a remaining FRT site.
(B) Upper panel: Strategy for mutating the CXXC domain of Mll2 in mESC. The CRISPR-Cas9 cleavage site and primer pairs are indicated as in (A). Lower panel: Agarose gel analysis of PCR genotyping as in (A).
(C) Western analysis of Mll2 protein levels in targeted ES cell clones before and after Flp-mediated eviction of Neo. Note that Mll2 protein is absent in two mESC clones before Neo excision (N1-N2) but mutant Mll2 is present at wild-type levels in four clones (F1-F4) after Flp-mediated excision of Neo.
(D) UCSC genome browser track example of H3K4me3 occupancy near the H3f3a locus in parental, Mll2 knockout, CXXC and catalytic mutant Mll2 mESC.
(E) Heat map representation of H3K4me3 occupancy at Mll2 non-TSS sites in parental, and the indicated mutant Mll2 mESC.
(F) UCSC genome browser track example of Mll2 occupancy near the H3f3a locus in parental and the indicated mutant Mll2 mESC.
(G) Heat map analysis of Mll2 occupancy at Mll2 non-TSS sites in parental, and the indicated mutant Mll2 mESC. ChIP-seq signal represented as in Figure 1J. See also Figure S3.
Mll2 is required for expression of regulators of primordial germ cell specification during differentiation
Mll2 is essential for embryonic development but is not required to maintain the pluripotent state (Glaser et al., 2006; Lubitz et al., 2007). To explore Mll2’s role during early development, we conducted RNA-seq analysis in parental and Mll2KO mESC and during their differentiation (Figure S4A). In agreement with a dispensable role for Mll2 in self-renewal, RNA-seq analysis in Mll2KO mESC demonstrated no significant changes in the expression of pluripotency genes, such as Oct4 and Klf4 (Figure S4B). However, gene expression analysis in Mll2KO mESC identified 286 genes that were downregulated and 327 genes that were upregulated (Figure 4A). Functional annotation of downregulated genes revealed enrichment for different classes of genes including those involved in germ cell development and gamete generation (Castrillon et al., 2000; Fujiwara et al., 1994; Noce et al., 2001; Ohinata et al., 2005; West et al., 2009; Yamaji et al., 2008) (Figure 4B, Table S1). In contrast, genes upregulated in Mll2KO cells were enriched in categories such as brain and pancreas development, and metabolic process (Figure 4B). Consistent with our RNA-seq analysis suggesting a role for Mll2 in gamete generation, Mll2 conditional germline knockouts exhibit sterility in both males and females (Andreu-Vieyra et al., 2010; Glaser et al., 2006). Prdm1, Prdm14, Lin28b and Ddx4, which encode regulators of the specification of primordial germ cells (PGC) from the pluripotent mESC state (Ohinata et al., 2005; West et al., 2009; Yamaji et al., 2008), were among the downregulated genes in the Mll2KO cells (Figure 4C, Table S2).
Figure 4. Mll2 is required for full expression of PGC genes during differentiation.
(A) MA plot of gene expression changes between parental and Mll2KO mESC as determined by RNA-seq. The x axis shows the log2 normalized counts per million (CPM) of averaged expression in the two conditions (A) and the y axis shows the log2 fold change in the two conditions (M). Significantly upregulated and downregulated genes are highlighted in red and green, respectively (FDR < 0.001).
(B) Gene Ontology (GO) functional analysis for differentially expressed genes as determined by Metascape (Tripathi et al., 2015). Each node represents a functional term and its size is proportional to the number of genes falling into that term, with the color representing the cluster identity. Terms with a similarity score > 0.3 are linked by an edge. Log p-values are shown below the GO networks.
(C) UCSC genome browser RNA-seq tracks for regulators of PGC specification in parental and Mll2 knockout mESC.
(D) Representative phase contrast images of embryoid bodies (EBs) at indicated days derived from parental (Mll2 WT) and Mll2KO mESC.
(E) Western analysis of Mll2 levels during mESC differentiation into embryoid bodies. Tubulin serves as a loading control.
(F) Scatter plots showing differential gene expression of EBs derived from parental and Mll2KO mESC at different day points. Differentially expressed genes determined as in (A).
(G) UCSC genome browser RNA-seq tracks at Prdm1 and Prdm14 loci in mESC and indicated days of EB formation. See also Figure S4.
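The MA-plot coordinates described in (A) can be illustrated with a short sketch. It is not the authors' pipeline: the pseudocount, the choice to average the two log2 CPM values (rather than take the log2 of the averaged CPM), and the function names are assumptions made for illustration only.

```python
import math


def cpm(count, library_size):
    """Counts per million for one gene in one library."""
    return count * 1e6 / library_size


def ma_values(cpm_control, cpm_knockout, pseudocount=1.0):
    """Return (A, M) for one gene.

    A: mean of the two log2 CPM values (x axis).
    M: log2 fold change, knockout over control (y axis).
    The pseudocount is an assumption that avoids log2(0) for silent genes.
    """
    log_control = math.log2(cpm_control + pseudocount)
    log_knockout = math.log2(cpm_knockout + pseudocount)
    a = 0.5 * (log_control + log_knockout)
    m = log_knockout - log_control
    return a, m


# Hypothetical gene: 3 CPM in parental cells, 15 CPM in Mll2KO cells.
a, m = ma_values(cpm_control=3.0, cpm_knockout=15.0)
```

Genes with M above zero are more highly expressed in the knockout; calling a gene significantly changed additionally requires a statistical test with multiple-testing correction, which this sketch does not perform.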
To examine potential Mll2 functions during differentiation, specifically in PGC specification, we generated embryoid bodies (EBs) from wild-type and Mll2KO mESC and performed RNA-seq (Figure 4D). Mll2 transcripts were initially upregulated and then downregulated during the course of EB formation and this was also reflected at the protein level as shown by immunoblotting of whole cellular extracts of EBs (Figures 4D, E and S4C–D). Compared to ES cells, many more genes were misregulated later in differentiation in EB stages (Figure 4F). However, we observed that the expression of some of the regulators of PGC specification is reduced from day 0 to day 9 of Mll2KO EB formation, demonstrating a role for Mll2 in the transcriptional activation of germ cell determinants during development (Figure 4G, Table S2). We note that the dependency on Mll2 is most obvious during the differentiation studies as the dependency on Mll2 in the ES cell state can vary with growth conditions.
Although loss of Mll2 does not affect self-renewal of mESC grown under serum + LIF conditions, Mll2KO mESC exhibit faster mRNA depletion for Nanog and Klf4 during differentiation (Figure S4E). Supporting the accelerated differentiation of the Mll2KO mESC, we observed that many markers of endoderm, mesoderm and ectoderm have significantly higher expression at different days of Mll2KO EBs than in wild-type EBs (Figure S4F–H, Table S2). Taken together, these studies suggest a role for Mll2 in PGC induction and the proper orchestration of differentiation for PGC and other lineages.
Both the CXXC domain and the H3K4 methyltransferase activity of Mll2 are required for PGC induction from mESC
To better understand the requirement of Mll2 for expression of genes involved in PGC specification during differentiation, we compared the gene expression profile in parental, Mll2 knockout, CXXC mutant and catalytically deficient Mll2 mESC (Figure S5A). Hierarchical clustering of differentially expressed genes across these cell lines reveals that expression changes in CXXC and catalytically deficient mutant Mll2 cells are similar to those in Mll2KO cells (Figure 5A). The expression of PGC determinants such as Prdm14 and Prdm1 was decreased in mESC lacking either the DNA binding or the methyltransferase activity of endogenous Mll2 (Figures 5B and S5B).
Figure 5. Mll2’s CXXC domain and catalytic activity function in PGC specification.
(A) Heat map of gene expression in mESC after Mll2 knockout, mutation of the catalytic SET domain, or functional disruption of Mll2’s CXXC domain. Differentially expressed genes are separated into two groups based on their up- or down- regulation in Mll2 knockout cells.
(B) UCSC genome browser tracks of total RNA-seq in wild-type and indicated Mll2 mutant conditions. Top four tracks are from Figure 4C.
(C) Representative phase contrast images of 6 day EBs derived from parental, Mll2KO, CXXC and catalytically deficient mESC.
(D) UCSC genome browser RNA-seq tracks at Prdm14 in mESC and 6 day EB.
(E) Schematic of in vitro generation of primordial germ cell-like cells (PGCLC) (upper panel). Expression of representative PGC, endoderm and ectoderm genes in PGCLC aggregates was determined by quantitative reverse transcription PCR and normalization to Gapdh. Expression is relative to expression in PGCLC aggregates derived from parental mESC. Bars represent the standard deviation for two technical replicates, representative of three biological experiments.
(F) FACS analysis of PGCLC derived as in (E) from a representative experiment is shown (left panel) and the average results from two independent quantifications are plotted (right panel). P-values are calculated using the t-test. See also Figure S5.
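The relative expression values in (E), normalized to Gapdh and expressed relative to parental-derived aggregates, could be computed along the following lines. The text does not state the exact calculation; this sketch assumes the commonly used 2^-ΔΔCt approach with roughly 100% amplification efficiency, and all Ct values and names are hypothetical.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_control, ct_reference_control):
    """Relative expression by the 2^-ddCt approach (an assumption here).

    ct_target / ct_reference: Ct of the gene of interest and of the
    reference gene (Gapdh in the text) in the mutant sample.
    *_control: the same two Ct values in the parental sample.
    """
    delta_sample = ct_target - ct_reference
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target gene crosses threshold 2 cycles later
# in the mutant than in the parental sample, with Gapdh unchanged.
fold = relative_expression(ct_target=26.0, ct_reference=18.0,
                           ct_target_control=24.0, ct_reference_control=18.0)
```

A value below 1 corresponds to lower expression in the mutant-derived aggregates than in the parental-derived aggregates.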
To further determine the role of Mll2 in PGC specification during mESC differentiation, we differentiated parental, Mll2 knockout, CXXC and catalytically deficient mutant Mll2 mESC into EBs (Figure 5C). Similar to Mll2KO EBs, EBs derived from CXXC and catalytically deficient mutant Mll2 mESC were heterogeneous in morphology and displayed dark regions in their centers, although the phenotype was less severe than in Mll2KO EBs (Figure 5C). We analyzed RNA-seq data from EBs differentiated from parental, Mll2KO, CXXC and catalytically deficient Mll2 mutant mESC and observed reduced expression of Prdm1 and Prdm14 in day 6 EBs derived from Mll2_mCXXC, Mll2_Y2602A and Mll2KO mESC, indicating an essential role for the CXXC domain and the full enzymatic activity for Mll2-dependent PGC formation (Figures 5D and S5C–S5D).
As EBs represent spontaneous differentiation of ES cells and the number of PGC in EBs was potentially low, we decided to test the function and respective mechanisms (CXXC domain and full methyltransferase activity) of Mll2 in PGC induction in a more controlled PGC differentiation system. Parental, Mll2KO, Mll2_mCXXC and Mll2_Y2602A mESC grown under N2B27+2i+LIF conditions were first induced to differentiate into epiblast-like cells (EpiLC), followed by addition of multiple cytokines to direct PGCLC specification (Figure 5E upper panel). As seen in EBs, expression of PGC determinants Prdm1, Prdm14 and Dppa3 was considerably decreased while markers for endoderm and ectoderm were upregulated in PGCLC aggregates derived from Mll2KO, Mll2_mCXXC and Mll2_Y2602A mESC, further supporting the essential role for Mll2 and its CXXC domain and catalytic activity in PGC induction (Figure 5E bottom panel). As expected, the number of PGCLCs derived from Mll2KO, Mll2_mCXXC and Mll2_Y2602A cells was decreased 4-fold as demonstrated by flow cytometric analysis of PGC markers (Hayashi and Saitou, 2013) (Figure 5F). Collectively, our findings support a role for Mll2, its ability to bind DNA and its methyltransferase activity in PGC specification during ES cell differentiation.
Regulation of Enhancer activity by Mll2/COMPASS during PGC specification
Among the 6,418 non-TSS Mll2-associated sites, 58% were enriched for enhancer marks (Figure 1F). We observed that genes closest to non-TSS Mll2-associated sites had higher expression than all genes; however, loss of Mll2 had no general effect on the expression of the nearest genes in ES cells (Figure S6A). Of the 4,684 genes nearest non-TSS Mll2-associated sites, only 43 genes were significantly downregulated by Mll2 loss (Figure S6B), including genes encoding PGC regulators. The low number of downregulated genes nearest non-TSS sites is consistent with the observation from Figure 1G that the presence of H3K4me3 at putative active enhancers conferred no general advantage on gene expression when compared to other active enhancers. Together, these findings indicate that H3K4me3 at non-TSS sites does not have a general instructive role in transcriptional activation, but that its function instead depends on other, yet-to-be-identified chromatin contexts. However, we cannot rule out that the minimal effect of loss of Mll2 and H3K4me3 on nearby gene expression was the result of a low level of H3K4me3 implemented by other enzymes being sufficient for gene activation, as we rarely observed a complete lack of H3K4me3 at non-TSS Mll2-associated sites upon Mll2 deletion.
We identified putative enhancer regions in mESC that required Mll2 for their H3K4me3 near PGC determinants such as Prdm14 and Prdm1 (Figure 6A). To determine whether they function as true enhancers for regulation of Prdm1 and Prdm14 transcription, we deleted these genomic elements by CRISPR/Cas9 (Figures 6B-6C and S6C). As expected, deletion of these elements led to decreased expression of Prdm1 and Prdm14, indicating that they are bona fide enhancers for Prdm1 and Prdm14 (Figure 6D and S6D). In addition to H3K4me3 loss in Mll2KO cells, enhancers of Prdm1 and Prdm14 exhibited reduced levels of p300 and H3K27ac, two marks associated with active enhancers, while the general enhancer mark, H3K4me1, was largely unaffected (Figure S6E). However, global analysis of H3K4me1 at 6,418 non-TSS Mll2 associated sites demonstrated a very weak, but significant increase after Mll2 loss (Figure S6F and S6G). This could reflect some degree of competition between Mll2/COMPASS and other COMPASS family members such as the monomethylases MLL3 and MLL4 (Hu et al., 2013a).
Figure 6. Mll2 and putative enhancers of PGC determinants.
(A) ChIP-seq genome browser tracks showing Mll2-dependent H3K4me3 at two putative enhancers (En1 and En2) each near Prdm1 and Prdm14 in mESC. A non-bound element (NBE) at each locus is also indicated.
(B) Schematic of CRISPR/Cas9 design for candidate enhancer deletion. Black arrows indicate the location of genomic regions targeted by CRISPR guide RNAs. P1, P2, P3 and P4 represent primers used for genotyping. For each construct, the protospacer sequence and the Cas9-specific proximal-adjacent motif (PAM) are underlined in green and red, respectively.
(C) Genotyping of mESC transfected with candidate enhancer-flanking CRISPR constructs identified five clones with deletion of Prdm14 putative enhancers and three clones with deletion of the distal putative Prdm1 enhancer.
(D) RNA-seq genome browser tracks of Prdm1 and Prdm14 in parental and three representative clones of the Prdm1 or Prdm14 enhancer-deleted mES cells.
(E) Physical interactions of Mll2-regulated putative enhancers with promoters of Prdm1 and Prdm14 as revealed by circularized chromosome conformation capture with high-throughput sequencing (4C) assay. The median and 20th and 80th percentiles of a sliding 10 kb window determine the main trend line. Color scale represents enrichment relative to the maximum median value at a resolution of 12 kb.
(F) Luciferase reporter assay of the role of putative enhancers of Prdm14 (left) and Prdm1 (right) in transcriptional activation in the presence of wild-type, ΔCXXC, or catalytically deficient mutant human MLL2. Enhancers (En), non-binding element (NBE) or promoter (Pr) regions indicated in (A) were cloned into a luciferase reporter vector and co-transfected into HEK293 cells with the indicated MLL2 construct. After 48 h, cells were harvested, and luciferase activity was determined. Relative luciferase induction is plotted as fold change compared to cells transfected with empty control vectors.
(G) Parental, Mll2KO or Prdm14 enhancer-deleted mESC were differentiated into PGCLCs in vitro. After 6 days, PGCLCs were quantified by FACS. Percentage of PGCLCs induced from parental, Mll2KO and Prdm14 enhancer-deleted mESC in a representative experiment and the average results from two independent quantifications are presented. P-values are calculated using the t-test. See also Figure S6.
One mechanism underlying enhancer action is the physical association with targeted gene promoters (Dekker et al., 2002; Ong and Corces, 2011). To explore whether there are physical contacts between enhancers and promoters of PGC determinants, we performed circularized chromosome conformation capture (4C) coupled with high-throughput sequencing in mESC using genomic regions near the promoters of Prdm1 and Prdm14 as viewpoints. Significant 4C–seq signal from both viewpoints was observed at enhancer regions with Mll2-dependent H3K4me3, suggesting that Mll2-regulated enhancers indeed loop into close proximity with promoters of PGC determinant genes (Figure 6E).
To determine whether Mll2, its CXXC domain or its catalytic activity play a direct role in the regulation of cis-regulatory elements of Prdm1 and Prdm14, their enhancers and promoters were individually cloned into luciferase reporter constructs and co-transfected with Halo-tagged human MLL2 or its mutant derivatives in HEK293 cells. We observed that the activity of enhancers and promoters of Prdm1 and Prdm14 can be stimulated by expression of wild type MLL2. This MLL2 activity was abolished by either deletion of its CXXC domain or mutation of the SET domain, indicating that the stimulation of enhancer and promoter activity by MLL2 is dependent on the DNA binding activity of its CXXC domain and the catalytic activity of its SET domain (Figure 6F).
The requirement of Mll2’s methyltransferase activity and DNA binding activity for Prdm1 and Prdm14 enhancer function prompted us to test if establishment of enhancer-promoter interaction requires Mll2 or its methyltransferase activity. We performed 4C with the Prdm14 promoter as viewpoint in wild-type, KO, and CXXC and catalytic Mll2 mutant mESC. We found that Mll2 deletion, inhibition of its chromatin recruitment, or mutation of its catalytic domain has no observable impact on the interaction between the enhancers and the promoter of Prdm14 (Figure S6H). However, further studies are needed to better understand the role of H3K4 methylation in the regulation of enhancer and promoter communication at other loci. To investigate a role of Mll2 and Mll2-dependent enhancers in PGC specification, we differentiated parental, Mll2KO, Prdm1 and Prdm14 enhancer-deleted mESC into PGCLCs in vitro followed by flow cytometric analysis of PGCLC markers. As seen with Mll2KO mESC, Prdm14 enhancer deletion also leads to inefficient specification of PGCLCs (Figure 6G). However, we did not observe a defect in PGCLC differentiation with the deletion of the distal Prdm1 enhancer (En1, data not shown), suggesting that the second enhancer (En2) residing within the gene body of Prdm1 may compensate for the loss of function of En1 during differentiation. In summary, our findings suggest a molecular mechanism whereby Mll2/COMPASS is recruited to CpG islands at enhancers to establish H3K4me3 and activate factors essential for PGC specification (Figure 7).
Figure 7. A working model for Mll2/COMPASS during development.
(1st panel) Mll2/COMPASS can be recruited to both enhancers and promoters through CXXC-dependent binding of CpG-rich regions and implements H3K4me3 for the expression of master regulators of PGC specification. (2nd panel) During differentiation, Mll2−/− cells are not capable of transcriptional activation of PGC factors, resulting in defective PGC specification. (3rd panel) Mll2mCXXC/mCXXC is incompetent to occupy enhancers and promoters of PGC regulators, leading to compromised PGC specification. (4th panel) Catalytically deficient Mll2Y2602A/Y2602A is recruited to enhancers and promoters of PGC regulators, but fails to implement H3K4 methylation and leads to compromised PGC specification.
DISCUSSION
Here, we have demonstrated that Mll2/COMPASS occupies some promoters and non-TSS regulatory elements and implements H3K4me3 at these loci. We found that Mll2 is required for the expression of several genes including the master regulators required for primordial germ cell specification during ES cell differentiation. The DNA-binding activity of the CXXC domain and the catalytic activity are indispensable for Mll2’s function in the regulation of PGC specification during differentiation, suggesting that the recruitment of the catalytic activity of Mll2/COMPASS to these loci is central for its specific transcriptional regulatory properties. Given that Mll2/COMPASS can methylate thousands of promoters and non-TSS regulatory elements, but only a subset of genes require Mll2/COMPASS and its catalytic activity in H3K4 trimethylation for the regulation of gene expression, our findings suggest that not all H3K4 methylated sites are functionally equal.
In line with an active regulatory role for H3K4me3 in PGC gene expression, deletion of Kdm5b, an H3K4me3 demethylase, leads to a global increase of H3K4me3 and a failure to silence the expression of PGC genes (Schmitz et al., 2011). Together, these findings point to an essential role of H3K4me3 in PGC establishment during embryonic differentiation. Since we observe that Mll2 is required for H3K4me3 at both promoters and enhancers of PGC genes, determining the degree to which H3K4me3 changes at promoters or enhancers of these genes is an important area of future investigation.
Nanog is a core pluripotency factor required for both the pluripotent mESC and the unipotent PGCs. Ectopic expression of Nanog in EpiLCs is sufficient to induce PGCLC formation without addition of exogenous cytokines (Murakami et al., 2016). Mechanistic studies revealed that Nanog associates with the enhancers of Prdm1 and Prdm14 and is required for their activation during PGC induction (Murakami et al., 2016). As we noticed an accelerated reduction in Nanog expression in Mll2KO mESC during differentiation (Figure S4E), lowered expression of Nanog may partially account for the failure to fully activate Prdm1 and Prdm14 in the absence of Mll2.
CpG islands are found at promoters of bivalent genes targeted by Mll2 in mESC (Bernstein et al., 2006; Hu et al., 2013b) and artificial integration of synthetic CpG islands into the mESC genome can create novel bivalent domains that are independent of Cxxc1-containing Set1A/Set1B/COMPASS-like complexes (Wachter et al., 2014). We observed that non-TSS regulatory elements targeted by Mll2 in mESC also show enrichment for CpG dinucleotides (Figures 2G and 2I). Mll2 harbors a CXXC domain, which can bind specifically to nonmethylated CpG (Voo et al., 2000) and this domain is essential for targeting Mll2 to these regions.
Since both Mll1 and the Cxxc1 subunit of Set1A- and Set1B/COMPASS have CXXC domains, it is unclear how Mll2 is preferentially recruited for H3K4me3 at non-TSS Mll2-associated sites in ES cells. More than half of non-TSS Mll2-associated sites are enriched for the enhancer mark H3K4me1. We previously demonstrated that Mll3/Mll4/COMPASS-like complexes are recruited to enhancers for H3K4 monomethylation (Herz et al., 2012; Hu et al., 2013a). Determining how enhancers specifically choose Mll2 for their H3K4 trimethylation and Mll3/Mll4 for their H3K4 monomethylation will be important for understanding functional differences among cis-regulatory elements and how they function to regulate pluripotency and development.
The molecular basis for H3K4me3-dependent transcriptional activation of genes during differentiation is currently unknown. Histone H3K4me3 has been previously reported to participate in many biological processes including nucleosome remodeling, transcription initiation, VDJ rearrangement, pre-mRNA splicing and DNA damage response through binding to distinct effector proteins (Li et al., 2006; Matthews et al., 2007; Shi et al., 2006; Sims et al., 2005; Sims et al., 2007; Vermeulen et al., 2010; Vermeulen et al., 2007; Wysocka et al., 2006). Various repressor proteins have been found to bind unmodified H3K4 including BHC80 (component of CoREST), DNMT3L (in complex with DNA de novo methyltransferase DNMT3A and DNMT3B) and UHRF1 (in complex with the DNA maintenance methyltransferase DNMT1) (Lan et al., 2007; Nady et al., 2011; Ooi et al., 2007). H3K4me3 interferes with the binding of these repressors and thus may be involved in inhibition of DNA methylation. In addition, H3K4me3 was recently shown to keep DNMT3A in an autoinhibitory state and removal of histone methylation leads to enzymatic activation of DNMT3A (Guo et al., 2015). Therefore, H3K4me3 may recruit or repel a specific set of downstream effectors from enhancers near PGC determinant genes for transcriptional activation during differentiation.
Our results demonstrate that Mll2/COMPASS is required for the steady-state expression of several different genes during differentiation including genes involved in PGC specification during development. Removal of H3K4me3 from bivalent chromatin by Mll2 depletion in mESC skews cells toward differentiation and leads to prominent defects during development (Glaser et al., 2006; Lubitz et al., 2007). As seen in the pluripotent ES cells, PGCs also maintain a bivalent domain at promoters of key regulators of all three germ layers (Lesch and Page, 2014). We propose that Mll2 is not only required for the establishment of PGCs during differentiation through H3K4 methylation of key PGC genes but it may also be essential for maintaining the bivalent chromatin state for normal PGC function.
STAR*METHODS
CONTACT FOR REAGENT AND RESOURCE SHARING
Further information and requests may be directed to, and will be fulfilled by the lead contact corresponding author Ali Shilatifard (ASH@Northwestern.edu).
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Cell Lines
HEK293 cells and mouse embryonic fibroblasts (MEF) were maintained in DMEM plus 10% FBS. Mouse V6.5 embryonic stem cells were cultured in 0.1% gelatin-coated tissue culture flasks with irradiated MEF feeder cells. Cells were grown in DMEM supplemented with 15% FBS (HyClone), 2 mM L-glutamine, 0.1 mM nonessential amino acids (Stemcell Technologies) and 1,000 U/ml recombinant LIF (Millipore, ESG1107).
Expression Plasmids and short hairpin RNAs
Human full-length MLL2 cDNA was subcloned into the pFN205K EF1a plasmid with an amino-terminal Halo tag between the Sgf I and Pme I sites. Halo-tagged MLL2 (ΔCXXC) and MLL2 (Y2604A) plasmids were generated by deleting amino acids 958 to 1000 of the CXXC domain or replacing tyrosine 2604 within the SET domain, respectively (Dillon et al., 2005). Plasmids and shRNA sequences are provided in the Key Resources Table.
KEY RESOURCES TABLE
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
Rabbit polyclonal anti-Menin | Bethyl Laboratories | Cat# A300–105A; RRID: AB_2143306 |
Rabbit polyclonal anti- RbBP5 | Bethyl Laboratories | Cat# A300–109A; RRID: AB_210551 |
Rabbit polyclonal anti- ASH2L | Bethyl Laboratories | Cat# A300–489A; RRID: AB_451024 |
Rabbit polyclonal anti-WDR5 | Bethyl Laboratories | Cat# A302–429A; RRID: AB_1944302 |
Rabbit monoclonal anti-WDR82 | Cell Signaling Technology | Cat# 99715; RRID: AB_1944302 |
Mouse monoclonal anti-beta Tubulin | DSHB | Cat# E7; RRID: AB_2315513 |
Mouse monoclonal anti-HaloTag | Promega | Cat# G9211 |
Rabbit polyclonal anti-p300 | Santa Cruz | Cat# sc-585 |
Rabbit polyclonal anti-H3K27ac | Abcam | Cat# ab4729 |
Rabbit polyclonal anti-MLL2_ab CT1 | This study | N/A |
Rabbit polyclonal anti-MLL2_ab CT2 | This study | N/A |
Rabbit polyclonal anti-H3K4me1 | (Hu et al., 2013a) | N/A |
Rabbit polyclonal anti-H3K4me2 | (Hu et al., 2013a) | N/A |
Rabbit polyclonal anti-H3K4me3 | (Hu et al., 2013a) | N/A |
PE-conjugated anti-mouse/rat CD61 Antibody | BioLegend | Cat# 104308 |
Anti-Human/Mouse SSEA-1 eFluor® 660 | Affymetrix eBioscience | Cat# 508813-42 |
Biological Samples | ||
Chemicals, Peptides, and Recombinant Proteins | ||
N-2 Supplement (100X) | ThermoFisher | Cat# 17502-048 |
B-27® Supplement (50X), minus vitamin A | ThermoFisher | Cat# 12587010 |
ESGRO® Leukemia Inhibitory Factor (LIF) | Millipore | Cat# ESG1107 |
MEK inhibitor, PD0325901 | Stemgent | Cat# 04–0006 |
GSK-3β inhibitor, CHIR99021 | Stemgent | Cat# 04–0004 |
Human Plasma Fibronectin Purified Protein | Millipore | Cat# FC010 |
KnockOut™ Serum Replacement | ThermoFisher | Cat# 10828028 |
bFGF | ThermoFisher | Cat# 13256-029 |
Activin A | Peprotech | Cat# 120-14 |
Nunc® dishes 96-well low cell-binding plate | Sigma | Cat# Z721093 |
Glasgow’s MEM (GMEM) | ThermoFisher | Cat# 11710-035 |
MEM Non-Essential Amino Acids Solution (NEAA) | ThermoFisher | Cat#11140-050 |
L-Glutamine (200 mM) | ThermoFisher | Cat# 25030081 |
Sodium Pyruvate (100 mM) | ThermoFisher | Cat# 1360070 |
Penicillin-Streptomycin (10,000 U/mL) | ThermoFisher | Cat# 15140122 |
Gibco™ 2-Mercaptoethanol | ThermoFisher | Cat# 21985023 |
BMP-4 | R&D Systems | Cat# 314-BP-010 |
BMP8a | R&D Systems | Cat# 1073-BP-010 |
SCF | R&D Systems | Cat# 455-MC-010 |
EGF | R&D Systems | Cat# 2028-EG-200 |
Protein A/G PLUS-Agarose | Santa Cruz | Cat# sc-2003 |
DNase I (RNase-free) | New England BioLabs | Cat# M0303S |
Micrococcal nuclease (MNase) | Worthington | Cat# 9013-53-0 |
Critical Commercial Assays | ||
Dual-Luciferase® Reporter Assay System | Promega | Cat#E1910 |
PerfeCTa® SYBR® Green SuperMix Reaction Mixes, Quanta Biosciences | VWR | Cat# 01414-144 |
TruSeq® Stranded Total RNA LT - (with Ribo-Zero™ Human/Mouse/Rat) - Set A | Illumina | RS-123-2201 |
TruSeq® Stranded Total RNA LT - (with Ribo-Zero™ Human/Mouse/Rat) - Set B | Illumina | RS-123-2202 |
High-Throughput Library Preparation Kit Standard PCR Amp Module - 96 rxn | KAPA Biosystems | KK8234 |
Deposited Data | ||
Raw and analyzed data | This study | GEO: GSE78708 |
Mouse reference genome NCBI37/mm9 | Genome Reference Consortium | https://www.ncbi.nlm.nih.gov/grc/mouse |
Experimental Models: Cell Lines | ||
V6.5 mouse embryonic stem cells | Novus Biologicals | Cat#NBP1–41162 |
HEK-293 | ATCC | Cat#CRL-1573 |
Puromycin-Resistant Mouse Embryonic Fibroblasts (MEF) | Stem Cell Technologies | Cat# #00325 |
Experimental Models: Organisms/Strains | ||
Recombinant DNA | ||
pFENHK | Promega | N/A |
pFENHK Halo MLL2 | Promega | N/A |
pFENHK _Halo_MLL2(ΔCXXC) | This study | |
pFENHK _Halo_MLL2(Y2604A) | This study | |
pLKO.1 | Addgene | Cat# 10878 |
PX459 | Addgene | Cat# 48139 |
Sequence-Based Reagents | ||
shMenin: ACATATTGCTGCCCGAATTTG | This study | |
shMll2: GGAGAACTCTGATTGAGAAAG | (Hu et al., 2013b) | |
4C sequencing Prdm14 locus Forward: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNCCAAGACTAAGAACAAGCTT | This study | |
4C sequencing Prdm14 locus Reverse: CAAGCAGAAGACGGCATACGAGATCCGGAGTTCCAAAACTAGCA | This study | |
4C sequencing Prdm1 locus Forward: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNTGTACAAGTACTCCAAGCTT | This study | |
4C sequencing Prdm1 locus Reverse: CAAGCAGAAGACGGCATACGAGATTCAAGGCAGAAATGTGTTGTG | This study | |
Mll2KO-crgRNA-Left: 5’GGAGGGGCAGCCCGTGCGCA3’ | This study | |
Mll2KO-crgRNA-Right: 5’CTTGCGGCCCCGACCCCGA3’ | This study | |
Mll2-mCXXC-crgRNA: 5’TCGGGGTTGCTTGCGTGTGC3’ | This study | |
Mll2-Y2602A-crgRNA: 5’TGTAAACGCAACATCGATGC3’ | This study | |
Prdm1-Δenhancer-Left: 5’GGGCTTTTGACCCATAGTGT3’ | This study | |
Prdm1-Δenhancer-Right: 5’TGTATCTGATAGAAGTCCGC3’ | This study | |
Prdm14-Δenhancer-Left: 5’CCACGAGGGGGGATCGTCTC3’ | This study | |
Prdm14-Δenhancer-Right: 5’TAAAATTCAGCCGGACGTGA3’ | This study | |
Software and Algorithms | ||
MACS v1.4.2 | (Zhang et al., 2008) | http://liulab.dfci.harvard.edu/MACS/ |
SICER v1.1 | (Zang et al., 2009) | http://home.gwu.edu/∼wpeng/Software.htm |
GAT | (Heger et al., 2013) | http://code.google.com/p/genomic-association-tester |
TopHat v2.0.13 | (Trapnell et al., 2009) | https://ccb.jhu.edu/software/tophat/index.shtml |
edgeR 3.0.8 | (Robinson et al., 2010) | http://bioconductor.statistik.tu-dortmund.de/packages/2.11/bioc/html/edgeR.html |
Metascape | (Tripathi et al., 2015) | http://metascape.org/gp/index.html#/main/step1 |
Bowtie aligner v0.12.9 | (Langmead et al., 2009) | http://bowtie-bio.sourceforge.net/index.shtml |
4cseqpipe | (van de Werken et al, 2012b) | http://compgenomics.weizmann.ac.il/tanay/?page_id=367 |
Other | ||
Stowers Original Data Repository | http://www.stowers.org/research/publications/libpb-1090 | |
See summary of gene ontology (GO) terms for the differentially expressed genes in Mll2KO mES cells in Table S1 | This study | |
See summary of the gene list of differential expression determined by RNA-seq in Table S2 | This study | |
See summary of the high-throughput sequencing reads in Table S3 | This study | |
shRNAs (listed in Key Resources Table) were cloned into the pLKO.1 vector (Addgene). Mouse V6.5 ES cells were infected with lentiviruses expressing shRNAs for 24 h before selection with 2 µg/ml puromycin for 2 d. For ChIP-seq and RNA-seq experiments, cells were grown for one passage without feeder cells for 30 min before harvesting.
METHOD DETAILS
Antibody Generation
Antibodies against histone H3K4me1, H3K4me2, and H3K4me3 were generated in our lab and have been used and described in our previous studies (Hu et al., 2013a; Hu et al., 2013b). Anti-Mll2 antibodies were raised against carboxyl-terminal fragments of human MLL2. The CT1 antibody (our rabbit #9175) was generated against the MLL2 peptide spanning amino acids 2264–2413. The CT2 antibodies (our rabbits #9172 and #9173; two animals for this MLL2 peptide) were generated against the MLL2 peptide spanning amino acids 2071–2220. The peptides were expressed as His-tag fusion proteins in the pET-16b vector and purified on NTA-agarose according to QIAGEN’s protocol.
Lentivirus-based Knockdowns, Rescue Experiments and Generation of Embryoid Bodies
Lentivirus-mediated knockdown has been described previously (Hu et al., 2013b). The hanging drop method was used to form embryoid bodies (EBs) by culturing 1000 mouse ES cells in 25 µl differentiation medium (without LIF) on the lid of a 150 mm culture dish for the indicated number of days. For generation of human MLL2-rescue mouse ES cells, constructs were linearized with Mlu I restriction enzyme and electroporated into wild-type or Mll2 KO mouse ES cells with the Gene Pulser Xcell™ system (Bio-Rad) at a setting of 250 V and 500 µF capacitance. Twenty-four hours after electroporation, cells were selected with 400 µg/ml G418 for 3 days, followed by growth for one passage for 30 min to remove feeder cells before harvesting.
CRISPR-mediated Knockouts and Knockins
sgRNAs were designed using the CRISPRtool (http://crispr.mit.edu) and cloned into the PX459 vector (Addgene, 48139). Targeting vector and linearized donor vector were co-electroporated into V6.5 mouse embryonic stem cells, and puromycin selection was started 24 hours later and continued for 2 days. For generation of CXXC and catalytically deficient mutants of endogenous Mll2, selection was continued in the presence of neomycin for a week. Single targeted ES cell colonies were screened by PCR with primers (see Figures 4A and 4B) and analyzed by agarose-gel electrophoresis. PCR products were sequenced to further validate the genomic editing.
PGCLC Induction
PGCLC induction was conducted following a previously described protocol (Hayashi et al., 2011; Hayashi and Saitou, 2013). Briefly, mouse ES cells cultured in N2B27 medium (N-2: ThermoFisher Scientific, 17502-048; B27: ThermoFisher, 12587-010) plus LIF and 2i (1 mM MEK inhibitor, Stemgent, PD0325901; 3 mM GSK-3β inhibitor, Stemgent, CHIR99021) were trypsinized and 1 × 10⁵ cells were seeded into one well of a 12-well plate pre-coated with 5 µg/ml plasma fibronectin (Millipore, FC010). After two days of differentiation in N2B27 medium supplemented with 1% knockout serum replacement (KSR, ThermoFisher, 10828-028), 20 ng/ml activin A (Peprotech, cat. no. 120-14) and 12 ng/ml bFGF (ThermoFisher, 13256-029), the resulting EpiLCs were trypsinized and 2 × 10⁴ cells were plated into one well of a 96-well low cell-binding plate (Sigma, Z721093) for 4 days of differentiation into PGCLCs in serum-free GMEM (ThermoFisher, 11710-035) medium supplemented with 15% KSR, 1× NEAA (ThermoFisher, 11140-050), 1× L-glutamine (ThermoFisher, 25030081), 1× sodium pyruvate (ThermoFisher, 1360070), 1× penicillin-streptomycin (ThermoFisher, 15140122), 55 nM 2-Mercaptoethanol (ThermoFisher, 21985023), 1,000 U of LIF, 500 ng/ml BMP4 (R&D, 314-BP-010), 500 ng/ml BMP8a (R&D, 1073-BP-010), 100 ng/ml SCF (R&D, 455-MC-010) and 50 ng/ml EGF (R&D, 2028-EG-200). At the end of day 4 of differentiation, PGCLC aggregates were either harvested for RNA extraction or dissociated for flow cytometric analysis.
FACS Analysis
For quantification of PGCLCs, differentiated aggregates were trypsinized at 37 °C for 8 min and quenched with MEF medium. The dissociated cells were washed two times in FACS buffer (PBS with 0.1% BSA) and stained with PE-conjugated anti-mouse/rat CD61 (BioLegend, 104308, 1:200 dilution) and anti-human/mouse SSEA1 eFluor®660 (eBioscience, 50–8813-42, 1:200 dilution) antibodies for 1 hour on ice, followed by three washes with FACS buffer. Stained cells were fixed in 4% paraformaldehyde for 30 min on ice and washed three times with FACS buffer before analysis on a FACSCanto II cytometer (BD Bioscience).
Immunoprecipitation
Immunoprecipitations were performed as previously described (Hu et al., 2013c). Briefly, V6.5 mouse ES cells were lysed in RIPA (20 mM Tris-HCl [pH 7.4], 150 mM NaCl, 1% NP40, 1% sodium deoxycholate, 0.1% SDS, 1 mM dithiothreitol [DTT]) and protease inhibitors (Sigma). Cleared lysates were incubated with the indicated antibodies and protein A/G PLUS agarose (Santa Cruz, sc-2003). After washing beads, immunoprecipitates were eluted with loading buffer for SDS-PAGE.
Dual Luciferase Assay
For dual luciferase assays, promoters and putative enhancers of Prdm1 and Prdm14 were cloned into the pGL4.12[luc2CP] and pGL4.25[luc2CP/minP] reporter vectors, respectively. The reporter plasmids were transfected with Lipofectamine 2000 into HEK293 cells grown in 24-well plates, together with the pRL-SV40 internal control plasmid and either empty, wild-type, ΔCXXC or catalytically deficient versions of human Halo-tagged MLL2, for 48 hours before harvesting for passive lysis (Promega, E1910) and quantitation of luciferase activity on a GloMax-multi fluorometer (Promega). The total amount of transfected plasmid was adjusted to 1 µg/well for all samples.
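The fold-change readout of this assay can be illustrated with a short sketch. It assumes, as is conventional for dual-luciferase assays but not stated explicitly above, that each firefly reading is first divided by the Renilla reading from the co-transfected pRL-SV40 control, and that the empty-vector sample serves as the reference. The function name and the example readings are hypothetical.

```python
def relative_induction(firefly, renilla, control_firefly, control_renilla):
    """Fold change of the Renilla-normalized firefly signal over the
    empty-vector control (assumed normalization scheme)."""
    if renilla <= 0 or control_firefly <= 0 or control_renilla <= 0:
        raise ValueError("luciferase readings must be positive")
    sample_ratio = firefly / renilla                   # normalize for transfection efficiency
    control_ratio = control_firefly / control_renilla  # same normalization for the reference
    return sample_ratio / control_ratio


# Hypothetical raw luminescence values for one reporter with and without MLL2.
fold = relative_induction(firefly=6000.0, renilla=1000.0,
                          control_firefly=2000.0, control_renilla=1000.0)
```

Dividing by the Renilla signal before taking the fold change keeps well-to-well differences in transfection efficiency from being read as enhancer activity.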
4C Sequencing
4C sequencing was performed as previously described (van de Werken et al., 2012a). Briefly, ∼10⁷ cells were fixed with 2% formaldehyde at room temperature and digested with HindIII (NEB) overnight at 37 °C. The digested chromatin was diluted in 1× T4 DNA ligase buffer (Roche) with addition of T4 DNA ligase (Roche) and incubated overnight at 16 °C. Ligated chromatin was reverse-crosslinked to generate the 3C template, which was further digested with NlaIII (NEB). Samples were then diluted and ligated with T4 DNA ligase (Roche) to generate the 4C template. 4C–PCR was performed with the primers in the Key Resources Table (NNNNNN indicates location of index sequence) and next generation sequencing was performed on a NextSeq 500 (Illumina).
Size Exclusion Chromatography
HEK293 cells were transfected with the indicated plasmids by Lipofectamine 2000 (ThermoFisher, 11668027). Three days after transfection, nuclear extracts were prepared and concentrated with Millipore Ultrafree centrifugal filters (10 kDa nominal molecular mass limit) before applying onto a Superose 6 3.2/300 column (GE Healthcare) that had been equilibrated with buffer (25 mM Tris, pH 7.4, 150 mM NaCl, 10% glycerol, and 1mM EDTA). Fractions were separately collected and analyzed by western blotting with respective antibodies.
Quantitative RT- and ChIP- PCR
Total RNA was isolated from cells or EBs using Trizol, treated with DNase I (NEB, M0303S), and reverse transcribed using Superscript III and random primers (Invitrogen). Resulting cDNA was analyzed by quantitative PCR using Green SuperMix for iQ™ (VWR, 01414-144) on a MyIQ system (Bio-Rad). For quantitative ChIP-PCR, isolated DNA was analyzed using SYBR green on a CFX Connect™ Real-Time PCR detection system (Bio-Rad).
ChIP-seq
ChIP assays were performed as previously described (Lee et al., 2006). Briefly, 2–5 × 10⁷ cells were cross-linked with 1% paraformaldehyde at room temperature for 10 min and quenched by glycine. Fixed chromatin was sheared to generate chromatin fragments of 200–600 bp with a Misonix 3000 sonicator and immunoprecipitated with the indicated antibodies. For native ChIP-seq, nuclei were isolated from 1 × 10⁷ cells and digested with 10 U MNase (Worthington, 9013-53-0) for 15 min followed by full-speed centrifugation at 4 °C for 30 min. Supernatants containing mononucleosomes were used for ChIP with antibodies against H3K4me3. All ChIP-seq libraries were prepared with the KAPA HTP library preparation kit (KAPA Biosystems, KK8234) and sequenced on either the Illumina HiSeq 2500 or NextSeq 500.
QUANTIFICATION AND STATISTICAL ANALYSIS
Methods and software for calculating p values are described in the relevant sections below and in the figure legends. Briefly, error bars for qPCR represent two technical replicates in a representative experiment of at least three biological replicates. FDR values for enrichment in ChIP-seq experiments were determined with MACS (Zhang et al., 2008). P-values for estimating the significance of peak overlap in ChIP-seq data were determined with GAT (Heger et al., 2013). P-values associated with box plots were calculated with the Wilcoxon rank-sum test. P-values for Gene Ontology were determined with the hypergeometric test using the software Metascape (Tripathi et al., 2015). P-values for Venn Diagrams were calculated using the hypergeometric test in R. Adjusted p values for differential expression were determined with the Benjamini-Hochberg method in edgeR 3.0.8 (Robinson et al., 2010).
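Two of the tests named above are simple enough to write out. The sketch below is a standard-library Python illustration of a hypergeometric overlap test and the Benjamini-Hochberg adjustment; it is not the R, Metascape, or edgeR code used in the study, and the function names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(overlap, size_a, size_b, universe):
    """P(X >= overlap) when size_b items are drawn without replacement from a
    universe that contains size_a items of interest."""
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    hits = sum(comb(size_a, k) * comb(universe - size_a, size_b - k)
               for k in range(overlap, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

For a Venn diagram of two gene sets, `size_a` and `size_b` would be the set sizes, `overlap` the number of shared genes, and `universe` the number of genes considered; the choice of universe is an analysis decision that the text does not specify.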
ChIP-Seq Analysis
Sequencing reads were aligned to the mouse genome (UCSC mm9) using the Bowtie aligner v0.12.9 allowing uniquely mapping reads only and allowing up to two mismatches (see Table S3 for alignment statistics) (Langmead et al., 2009). Reads were extended to 150 bases toward the interior of the sequenced fragment and normalized to total reads aligned (reads per million; RPM). External sequencing data was acquired from GEO and ArrayExpress as raw reads and aligned in the same way as internally sequenced samples.
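The read extension and reads-per-million scaling described above can be sketched as follows. This is an illustration only, assuming 0-based half-open coordinates and a strand-aware extension from the 5' end of each read; the function names and the toy read are hypothetical and do not come from the study's pipeline.

```python
def extend_read(start, end, strand, length=150):
    """Extend an aligned read to `length` bases toward the fragment interior.

    Plus-strand reads grow to the right of their start; minus-strand reads
    grow to the left of their end. Coordinates are 0-based, half-open."""
    if strand == "+":
        return start, start + length
    return max(0, end - length), end


def rpm_coverage(reads, region_start, region_end, total_aligned, length=150):
    """Per-base coverage over [region_start, region_end) in reads per million."""
    coverage = [0.0] * (region_end - region_start)
    scale = 1e6 / total_aligned  # each read contributes 1e6 / total aligned reads
    for start, end, strand in reads:
        ext_start, ext_end = extend_read(start, end, strand, length)
        lo = max(ext_start, region_start)
        hi = min(ext_end, region_end)
        for pos in range(lo, hi):
            coverage[pos - region_start] += scale
    return coverage


# One hypothetical 50-base plus-strand read in a library of two million aligned reads.
track = rpm_coverage([(100, 150, "+")], 90, 110, total_aligned=2_000_000)
```

Scaling by the total number of aligned reads is what makes tracks from libraries of different depth comparable in the browser views and heat maps.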
Peak Calling
Peak detection was performed with MACS v1.4.2 (Zhang et al., 2008). Associated control samples were used to determine statistical enrichment at p < 1e-8 and FDR < 0.05. The enrichment of external data (not including H3K4me3) was determined at p < 1e-5 and FDR < 0.05. Enrichment for H3K27me3 was determined with the broad domain peak detector SICER v1.1 (Zang et al., 2009) at FDR < 1e-8, a window size of 200, and a gap size of 600. Non-promoter peaks are those that reside at least 1 kb away from a transcription start site (TSS); the remaining peaks are TSS peaks. Gene annotations and TSS information were from Ensembl 67. Each peak was assigned to the features of its nearest gene. Co-bound peaks were called if the peak regions of two samples overlapped. The significance of peak overlap was tested by GAT (Heger et al., 2013).
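The TSS versus non-TSS split can be written out as a short sketch. It assumes that distance is measured from the nearest edge of the peak to the TSS (zero when the TSS lies inside the peak); the text does not say whether the peak edge or the peak center was used, so this is one possible reading. Function names and coordinates are hypothetical.

```python
def distance_to_interval(pos, start, end):
    """Distance from a position to the closed interval [start, end]."""
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


def classify_peak(peak, tss_list, cutoff=1000):
    """Label a peak as 'TSS' or 'non-TSS' and report its nearest gene.

    peak:     (chrom, start, end)
    tss_list: iterable of (chrom, position, gene_name)
    A peak is 'non-TSS' when its nearest TSS is at least `cutoff` bp away."""
    chrom, start, end = peak
    candidates = [(distance_to_interval(pos, start, end), gene)
                  for c, pos, gene in tss_list if c == chrom]
    if not candidates:
        return "non-TSS", None, None
    dist, gene = min(candidates)
    label = "non-TSS" if dist >= cutoff else "TSS"
    return label, gene, dist


# Hypothetical annotation with two genes on one chromosome.
tss = [("chr1", 5000, "GeneA"), ("chr1", 20000, "GeneB")]
```

A linear scan over all TSSs is used here for clarity; a real genome-wide run would more likely use sorted coordinates with binary search or an interval tool.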
Replicate Information
Mll2 ChIP-seq data sets generated with two independent antibodies were analyzed separately to gain comprehensive information on its genome-wide distribution and to increase our confidence in the specificity of the two antibodies. The overlapping high-confidence peak regions of two biological replicates of H3K4me3 were used for further analysis. The overlapping peaks were merged into one region as the union of all. One replicate was generated in this study; the other is from GEO (GSE48172).
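The replicate handling described above (keep regions found in both replicates, and merge overlapping peaks into their union) can be sketched as a single sweep over sorted intervals. This is a hypothetical illustration assuming half-open coordinates and transitive chaining of overlaps, not the study's own code.

```python
def merge_replicate_peaks(rep1, rep2):
    """Return union regions that contain at least one peak from each replicate.

    rep1, rep2: lists of (chrom, start, end) with half-open coordinates."""
    tagged = sorted([(c, s, e, 1) for c, s, e in rep1] +
                    [(c, s, e, 2) for c, s, e in rep2])
    merged = []
    current = None  # [chrom, start, end, set of replicate ids seen]
    for chrom, start, end, rep in tagged:
        if current and chrom == current[0] and start < current[2]:
            # Overlaps the running cluster: grow it and record the replicate.
            current[2] = max(current[2], end)
            current[3].add(rep)
        else:
            if current and current[3] == {1, 2}:
                merged.append((current[0], current[1], current[2]))
            current = [chrom, start, end, {rep}]
    if current and current[3] == {1, 2}:
        merged.append((current[0], current[1], current[2]))
    return merged
```

Clusters supported by only one replicate are dropped, which is the sense in which the retained regions are "high-confidence".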
ChIP-seq Heat Maps and Enrichment Profiles
For ChIP-seq heat maps, regions of interest were depicted at the center of each peak or nearest TSS and sorted by occupancy. Regions spanning 5 kb on either side of the indicated feature were binned into 25 bp windows. Coverage is normalized into reads per million (RPM). Occupancies were calculated as the mean coverage under each peak region. P-values for boxplots were determined with the Wilcoxon rank sum test. For ChIP-Seq enrichment profiles, regions of interest were shown for each factor as a binary value of enriched/not enriched. The regions were centered at the center of each peak and sorted by the shortest distance of peak center to an annotated TSS. Regions shown are oriented 5’ to 3’ corresponding to the orientation of the nearest gene. Regions spanning 50 kb on either side of feature indicated were binned into 200 bp windows. Each line represents a peak and its surrounding region.
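The binning used for the heat maps can be illustrated with a small sketch: a window of 5 kb on either side of a feature, cut into 25 bp bins, gives 400 values per row. The code assumes a per-base coverage list already scaled to RPM and uses the row mean as a stand-in for occupancy when sorting, which simplifies the "mean coverage under each peak region" described above. All names are hypothetical.

```python
def heatmap_row(coverage, center, flank=5000, bin_size=25):
    """Mean coverage in consecutive bins spanning `flank` bp on either side of `center`.

    coverage: per-base signal for one chromosome (list of floats).
    Positions outside the chromosome are treated as zero."""
    row = []
    for bin_start in range(center - flank, center + flank, bin_size):
        window = [coverage[p] if 0 <= p < len(coverage) else 0.0
                  for p in range(bin_start, bin_start + bin_size)]
        row.append(sum(window) / bin_size)
    return row


def sort_rows_by_occupancy(rows):
    """Order heat-map rows from highest to lowest mean signal."""
    return sorted(rows, key=lambda r: sum(r) / len(r), reverse=True)
```

With the defaults, `heatmap_row` returns 2 × 5000 / 25 = 400 bins; the enrichment profiles described in the same paragraph would correspond to `flank=50000, bin_size=200`, which gives 500 bins.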
RNA-Seq Analysis
Sequencing reads were aligned to the mouse genome UCSC mm9 (see Table S1 for alignment statistics) and to gene annotations from Ensembl 67 using TopHat v2.0.13 (Trapnell et al., 2009) with the options -g 1 -x 1. The R package edgeR 3.0.8 was used to perform differential expression analysis at FDR < 1e-3 (Robinson et al., 2010). To better estimate sample dispersion, all RNA-seq samples were combined to fit one generalized linear model. Genes expressed (≥ 1 RPM) in fewer than two samples were filtered out. External sequencing data were downloaded from GEO (GSE48172) as raw reads and aligned in the same way as internally sequenced samples. Gene Ontology (GO) analysis of differentially expressed genes in Mll2 KO mouse ES cells was performed with Metascape (Tripathi et al., 2015). MA plots were generated by edgeR: A is logCPM, M is logFC. The x axis indicates averaged expression level (log2 normalized counts). The y axis represents log2 fold changes of normalized counts of Mll2 knockout (KO) cells versus the parental cells.
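The expression filter and the two MA-plot coordinates can be sketched in a few lines. This illustration uses a plain pseudocount and is only an approximation: edgeR computes logCPM and logFC with its own normalization and shrinkage, which are not reproduced here. Function names, the pseudocount, and the example values are hypothetical.

```python
from math import log2


def keep_gene(rpm_values, min_rpm=1.0, min_samples=2):
    """Keep a gene only if it reaches `min_rpm` in at least `min_samples` samples."""
    return sum(v >= min_rpm for v in rpm_values) >= min_samples


def ma_point(ko_cpm, wt_cpm, prior=0.5):
    """Return (A, M) for one gene: A is the log2 average expression and
    M is the log2 fold change of KO over parental cells.

    `prior` is a simple pseudocount that avoids log2(0); it is an assumption
    of this sketch, not edgeR's prior-count procedure."""
    a = log2((ko_cpm + wt_cpm) / 2 + prior)
    m = log2((ko_cpm + prior) / (wt_cpm + prior))
    return a, m


# Hypothetical gene with lower expression in knockout cells.
a_value, m_value = ma_point(ko_cpm=3.5, wt_cpm=7.5)
```

A negative M marks a gene that is lower in Mll2 KO cells than in parental cells, matching the orientation of the y axis described above.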
4C–Seq Analysis
Reads were demultiplexed by allowing no mismatches in the library index and 1 mismatch in viewpoint sequences. The library indexes were trimmed before mapping. 4cseqpipe was used for alignment, normalization and generating near-cis domainograms (van de Werken et al., 2012b).
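The demultiplexing rule (exact match on the library index, at most one mismatch in the viewpoint sequence, index trimmed before mapping) can be sketched as below. It assumes the read begins with the 6-base index followed directly by the viewpoint primer sequence, as the forward primer layout in the Key Resources Table suggests. The index value `ACGTAC` and the function names are hypothetical; the viewpoint sequence is the Prdm14 one listed in the table.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_read(read, libraries, index_len=6, max_viewpoint_mismatches=1):
    """Assign a read to a library and trim its index.

    libraries: dict mapping library name -> (index, viewpoint_sequence).
    Returns (name, read_without_index), or None if no library matches."""
    index = read[:index_len]
    for name, (lib_index, viewpoint) in libraries.items():
        if index != lib_index:
            continue  # no mismatches are tolerated in the index
        segment = read[index_len:index_len + len(viewpoint)]
        if len(segment) < len(viewpoint):
            continue  # read too short to contain the viewpoint
        if hamming(segment, viewpoint) <= max_viewpoint_mismatches:
            return name, read[index_len:]
    return None


# Hypothetical index paired with the Prdm14 viewpoint sequence from the primer table.
libraries = {"Prdm14": ("ACGTAC", "CCAAGACTAAGAACAAGCTT")}
```

Requiring an exact index match keeps reads from being assigned to the wrong sample, while allowing a single viewpoint mismatch tolerates an isolated sequencing error in the primer-derived bases.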
Data Sources
Data generated for this study include ChIP-seq data for H3K4me3, MLL2, shMll2_MLL2, H3K4me3 in Mll2WT_Vec, Mll2KO_Vec, Mll2KO_MLL2WT, Mll2KO_MLL2(ΔCXXC) and Mll2_MLL2(Y2604A) ES cells, H3K9ac, H3K27ac and Pol II in ES cells, and H3K4me3 in MEF cells. Other data sets come from previously published studies. Replicates for H3K4me3, shMLL2_H3K4me3 and RNA-seq in ES cells are from GEO accession number GSE48172. p300, H3K4me1 and H3K27ac ChIP-seq data are from GEO accession number GSE24164 (Creyghton et al., 2010). H3K27me3 ChIP-seq data are from GEO accession number GSE12241 (Mikkelsen et al., 2007). H3K4me3 ChIP-seq data in Cxxc1 (Cfp1) WT and KO ES cells are from the ArrayExpress repository under accession E-ERAD-79 (Reynolds et al., 2012). CpG islands were downloaded from the UCSC browser (mm9).
DATA AND SOFTWARE AVAILABILITY
ChIP-seq and RNA-seq data sets have been deposited at GEO with accession number GSE78708.
HIGHLIGHTS
Mll2/COMPASS occupies and catalyzes H3K4me3 at non-TSS elements
The CXXC domain of Mll2/COMPASS mediates its chromatin targeting
Mll2 regulates transcription of some genes including PGC specification genes
PGC specification requires Mll2/COMPASS’s methyltransferase activity
Acknowledgments
We thank Stacy Marshall and Emily Rendleman for help with Illumina sequencing. We are grateful to the Stowers Molecular Biology core and Tissue Culture core for assistance with the generation of some of the NGS data and maintenance of cell lines. We thank Andrea Piunti for critical reading of the manuscript and Laura Shilatifard for editorial assistance. D.H. was supported by the Robert H. Lurie Comprehensive Cancer Center – Translational Bridge Program Fellowship in Lymphoma Research. G.M. received the support of Ia Convocatoria de Ayudas Fundación BBVA a Investigadores, Innovadores y Creadores Culturales. A.V. was supported by NIH training grant T32CA080621. These studies were further supported by grants from the Spanish “Ministerio de Educación y Ciencia” (SAF2013-48926-P) and AGAUR to L.D.C., and NIH grants R01CA101774 to J.D.C., R50CA211428 to E.R.S., and R35CA197569 to A.S.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
AUTHOR CONTRIBUTIONS
Conceptualization, D.H. and A.S.; Methodology, D.H., K.C., M.A.M., G.M., and A.G.V.; Investigation, D.H., K.C., M.A.M., G.M., and A.G.V.; Formal Analysis, X.G.; Validation, K.C., E.T.B., and E.R.S.; Data Curation, X.G.; Writing – Original Draft, D.H.; Writing – Review & Editing, D.H., M.A.M., K.C., E.R.S., and A.S.; Funding Acquisition, L.D.C., J.D.C., E.R.S., and A.S.; Resources, D.H., K.C., E.T.B., J.D.C., and A.S.; Supervision, L.D.C., J.D.C., and A.S.
REFERENCES
- Andreu-Vieyra CV, Chen R, Agno JE, Glaser S, Anastassiadis K, Stewart AF, Matzuk MM. MLL2 is required in oocytes for bulk histone 3 lysine 4 trimethylation and transcriptional silencing. PLoS Biol. 2010:8. doi: 10.1371/journal.pbio.1000453.
- Ayton PM, Chen EH, Cleary ML. Binding to nonmethylated CpG DNA is essential for target recognition, transactivation, and myeloid transformation by an MLL oncoprotein. Molecular and cellular biology. 2004;24:10470–10478. doi: 10.1128/MCB.24.23.10470-10478.2004.
- Azuara V, Perry P, Sauer S, Spivakov M, Jorgensen HF, John RM, Gouti M, Casanova M, Warnes G, Merkenschlager M, et al. Chromatin signatures of pluripotent cell lines. Nat Cell Biol. 2006;8:532–538. doi: 10.1038/ncb1403.
- Bernstein BE, Mikkelsen TS, Xie X, Kamal M, Huebert DJ, Cuff J, Fry B, Meissner A, Wernig M, Plath K, et al. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell. 2006;125:315–326. doi: 10.1016/j.cell.2006.02.041.
- Castrillon DH, Quade BJ, Wang TY, Quigley C, Crum CP. The human VASA gene is specifically expressed in the germ cell lineage. Proceedings of the National Academy of Sciences of the United States of America. 2000;97:9585–9590. doi: 10.1073/pnas.160274797.
- Cierpicki T, Risner LE, Grembecka J, Lukasik SM, Popovic R, Omonkowska M, Shultis DD, Zeleznik-Le NJ, Bushweller JH. Structure of the MLL CXXC domain-DNA complex and its functional role in MLL-AF9 leukemia. Nature structural & molecular biology. 2010;17:62–68. doi: 10.1038/nsmb.1714.
- Clouaire T, Webb S, Bird A. Cfp1 is required for gene expression-dependent H3K4 trimethylation and H3K9 acetylation in embryonic stem cells. Genome biology. 2014;15:451. doi: 10.1186/s13059-014-0451-x.
- Clouaire T, Webb S, Skene P, Illingworth R, Kerr A, Andrews R, Lee JH, Skalnik D, Bird A. Cfp1 integrates both CpG content and gene activity for accurate H3K4me3 deposition in embryonic stem cells. Genes & development. 2012;26:1714–1728. doi: 10.1101/gad.194209.112.
- Cong L, Ran FA, Cox D, Lin S, Barretto R, Habib N, Hsu PD, Wu X, Jiang W, Marraffini LA, et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339:819–823. doi: 10.1126/science.1231143.
- Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Carey BW, Steine EJ, Hanna J, Lodato MA, Frampton GM, Sharp PA, et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proceedings of the National Academy of Sciences of the United States of America. 2010;107:21931–21936. doi: 10.1073/pnas.1016071107.
- Dekker J, Rippe K, Dekker M, Kleckner N. Capturing chromosome conformation. Science. 2002;295:1306–1311. doi: 10.1126/science.1067799.
- Denissov S, Hofemeister H, Marks H, Kranz A, Ciotta G, Singh S, Anastassiadis K, Stunnenberg HG, Stewart AF. Mll2 is required for H3K4 trimethylation on bivalent promoters in embryonic stem cells, whereas Mll1 is redundant. Development. 2014;141:526–537. doi: 10.1242/dev.102681.
- Dillon SC, Zhang X, Trievel RC, Cheng X. The SET-domain protein superfamily: protein lysine methyltransferases. Genome biology. 2005;6:227. doi: 10.1186/gb-2005-6-8-227.
- Fujiwara Y, Komiya T, Kawabata H, Sato M, Fujimoto H, Furusawa M, Noce T. Isolation of a DEAD-family protein gene that encodes a murine homolog of Drosophila vasa and its specific expression in germ cell lineage. Proceedings of the National Academy of Sciences of the United States of America. 1994;91:12258–12262. doi: 10.1073/pnas.91.25.12258.
- Glaser S, Schaft J, Lubitz S, Vintersten K, van der Hoeven F, Tufteland KR, Aasland R, Anastassiadis K, Ang SL, Stewart AF. Multiple epigenetic maintenance factors implicated by the loss of Mll2 in mouse development. Development. 2006;133:1423–1432. doi: 10.1242/dev.02302.
- Guenther MG, Levine SS, Boyer LA, Jaenisch R, Young RA. A chromatin landmark and transcription initiation at most promoters in human cells. Cell. 2007;130:77–88. doi: 10.1016/j.cell.2007.05.042.
- Guo X, Wang L, Li J, Ding Z, Xiao J, Yin X, He S, Shi P, Dong L, Li G, et al. Structural insight into autoinhibition and histone H3-induced activation of DNMT3A. Nature. 2015;517:640–644. doi: 10.1038/nature13899.
- Hayashi K, Ohta H, Kurimoto K, Aramaki S, Saitou M. Reconstitution of the mouse germ cell specification pathway in culture by pluripotent stem cells. Cell. 2011;146:519–532. doi: 10.1016/j.cell.2011.06.052.
- Hayashi K, Saitou M. Generation of eggs from mouse embryonic stem cells and induced pluripotent stem cells. Nature protocols. 2013;8:1513–1524. doi: 10.1038/nprot.2013.090.
- Heger A, Webber C, Goodson M, Ponting CP, Lunter G. GAT: a simulation framework for testing the association of genomic intervals. Bioinformatics. 2013;29:2046–2048. doi: 10.1093/bioinformatics/btt343.
- Herz HM, Hu D, Shilatifard A. Enhancer malfunction in cancer. Mol Cell. 2014;53:859–866. doi: 10.1016/j.molcel.2014.02.033.
- Herz HM, Mohan M, Garruss AS, Liang K, Takahashi YH, Mickey K, Voets O, Verrijzer CP, Shilatifard A. Enhancer-associated H3K4 monomethylation by Trithorax-related, the Drosophila homolog of mammalian Mll3/Mll4. Genes & development. 2012;26:2604–2620. doi: 10.1101/gad.201327.112.
- Hu D, Gao X, Morgan MA, Herz HM, Smith ER, Shilatifard A. The MLL3/MLL4 branches of the COMPASS family function as major histone H3K4 monomethylases at enhancers. Molecular and cellular biology. 2013a;33:4745–4754. doi: 10.1128/MCB.01181-13.
- Hu D, Garruss AS, Gao X, Morgan MA, Cook M, Smith ER, Shilatifard A. The Mll2 branch of the COMPASS family regulates bivalent promoters in mouse embryonic stem cells. Nature structural & molecular biology. 2013b;20:1093–1097. doi: 10.1038/nsmb.2653.
- Hu D, Smith ER, Garruss AS, Mohaghegh N, Varberg JM, Lin C, Jackson J, Gao X, Saraf A, Florens L, et al. The little elongation complex functions at initiation and elongation phases of snRNA gene transcription. Mol Cell. 2013c;51:493–505. doi: 10.1016/j.molcel.2013.07.003.
- Illingworth RS, Gruenewald-Schneider U, Webb S, Kerr AR, James KD, Turner DJ, Smith C, Harrison DJ, Andrews R, Bird AP. Orphan CpG islands identify numerous conserved promoters in the mammalian genome. PLoS genetics. 2010;6:e1001134. doi: 10.1371/journal.pgen.1001134.
- Krogan NJ, Dover J, Khorrami S, Greenblatt JF, Schneider J, Johnston M, Shilatifard A. COMPASS, a histone H3 (Lysine 4) methyltransferase required for telomeric silencing of gene expression. The Journal of biological chemistry. 2002;277:10753–10755. doi: 10.1074/jbc.C200023200.
- Lan F, Collins RE, De Cegli R, Alpatov R, Horton JR, Shi X, Gozani O, Cheng X, Shi Y. Recognition of unmethylated histone H3 lysine 4 links BHC80 to LSD1-mediated gene repression. Nature. 2007;448:718–722. doi: 10.1038/nature06034.
- Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome biology. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
- Lee TI, Johnstone SE, Young RA. Chromatin immunoprecipitation and microarray-based analysis of protein location. Nature protocols. 2006;1:729–748. doi: 10.1038/nprot.2006.98.
- Lesch BJ, Page DC. Poised chromatin in the mammalian germ line. Development. 2014;141:3619–3626. doi: 10.1242/dev.113027.
- Li H, Ilin S, Wang W, Duncan EM, Wysocka J, Allis CD, Patel DJ. Molecular basis for site-specific read-out of histone H3K4me3 by the BPTF PHD finger of NURF. Nature. 2006;442:91–95. doi: 10.1038/nature04802.
- Lubitz S, Glaser S, Schaft J, Stewart AF, Anastassiadis K. Increased apoptosis and skewed differentiation in mouse embryonic stem cells lacking the histone methyltransferase Mll2. Molecular biology of the cell. 2007;18:2356–2366. doi: 10.1091/mbc.E06-11-1060.
- Matthews AG, Kuo AJ, Ramon-Maiques S, Han S, Champagne KS, Ivanov D, Gallardo M, Carney D, Cheung P, Ciccone DN, et al. RAG2 PHD finger couples histone H3 lysine 4 trimethylation with V(D)J recombination. Nature. 2007;450:1106–1110. doi: 10.1038/nature06431.
- Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim TK, Koche RP, et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature. 2007;448:553–560. doi: 10.1038/nature06008.
- Miller T, Krogan NJ, Dover J, Erdjument-Bromage H, Tempst P, Johnston M, Greenblatt JF, Shilatifard A. COMPASS: a complex of proteins associated with a trithorax-related SET domain protein. Proceedings of the National Academy of Sciences of the United States of America. 2001;98:12902–12907. doi: 10.1073/pnas.231473398.
- Mohan M, Herz HM, Smith ER, Zhang Y, Jackson J, Washburn MP, Florens L, Eissenberg JC, Shilatifard A. The COMPASS family of H3K4 methylases in Drosophila. Molecular and cellular biology. 2011;31:4310–4318. doi: 10.1128/MCB.06092-11.
- Morgan MA, Shilatifard A. Chromatin signatures of cancer. Genes & development. 2015;29:238–249. doi: 10.1101/gad.255182.114.
- Murakami K, Gunesdogan U, Zylicz JJ, Tang WW, Sengupta R, Kobayashi T, Kim S, Butler R, Dietmann S, Surani MA. NANOG alone induces germ cells in primed epiblast in vitro by activation of enhancers. Nature. 2016;529:403–407. doi: 10.1038/nature16480.
- Nady N, Lemak A, Walker JR, Avvakumov GV, Kareta MS, Achour M, Xue S, Duan S, Allali-Hassani A, Zuo X, et al. Recognition of multivalent histone states associated with heterochromatin by UHRF1 protein. The Journal of biological chemistry. 2011;286:24300–24311. doi: 10.1074/jbc.M111.234104.
- Noce T, Okamoto-Ito S, Tsunekawa N. Vasa homolog genes in mammalian germ cell development. Cell structure and function. 2001;26:131–136. doi: 10.1247/csf.26.131.
- Ohinata Y, Payer B, O’Carroll D, Ancelin K, Ono Y, Sano M, Barton SC, Obukhanych T, Nussenzweig M, Tarakhovsky A, et al. Blimp1 is a critical determinant of the germ cell lineage in mice. Nature. 2005;436:207–213. doi: 10.1038/nature03813.
- Ong CT, Corces VG. Enhancer function: new insights into the regulation of tissue-specific gene expression. Nature reviews Genetics. 2011;12:283–293. doi: 10.1038/nrg2957.
- Ooi SK, Qiu C, Bernstein E, Li K, Jia D, Yang Z, Erdjument-Bromage H, Tempst P, Lin SP, Allis CD, et al. DNMT3L connects unmethylated lysine 4 of histone H3 to de novo methylation of DNA. Nature. 2007;448:714–717. doi: 10.1038/nature05987.
- Piunti A, Shilatifard A. Epigenetic balance of gene expression by Polycomb and COMPASS families. Science. 2016;352:aad9780. doi: 10.1126/science.aad9780.
- Rada-Iglesias A, Bajpai R, Swigut T, Brugmann SA, Flynn RA, Wysocka J. A unique chromatin signature uncovers early developmental enhancers in humans. Nature. 2011;470:279–283. doi: 10.1038/nature09692.
- Reynolds N, Salmon-Divon M, Dvinge H, Hynes-Allen A, Balasooriya G, Leaford D, Behrens A, Bertone P, Hendrich B. NuRD-mediated deacetylation of H3K27 facilitates recruitment of Polycomb Repressive Complex 2 to direct gene repression. The EMBO journal. 2012;31:593–605. doi: 10.1038/emboj.2011.431.
- Rickels R, Hu D, Collings CK, Woodfin AR, Piunti A, Mohan M, Herz HM, Kvon E, Shilatifard A. An Evolutionary Conserved Epigenetic Mark of Polycomb Response Elements Implemented by Trx/MLL/COMPASS. Mol Cell. 2016;63:318–328. doi: 10.1016/j.molcel.2016.06.018.
- Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616.
- Roguev A, Schaft D, Shevchenko A, Pijnappel WW, Wilm M, Aasland R, Stewart AF. The Saccharomyces cerevisiae Set1 complex includes an Ash2 homologue and methylates histone 3 lysine 4. The EMBO journal. 2001;20:7137–7148. doi: 10.1093/emboj/20.24.7137.
- Santos-Rosa H, Schneider R, Bannister AJ, Sherriff J, Bernstein BE, Emre NC, Schreiber SL, Mellor J, Kouzarides T. Active genes are tri-methylated at K4 of histone H3. Nature. 2002;419:407–411. doi: 10.1038/nature01080.
- Schmitz SU, Albert M, Malatesta M, Morey L, Johansen JV, Bak M, Tommerup N, Abarrategui I, Helin K. Jarid1b targets genes regulating development and is involved in neural differentiation. The EMBO journal. 2011;30:4586–4600. doi: 10.1038/emboj.2011.383.
- Schneider J, Wood A, Lee JS, Schuster R, Dueker J, Maguire C, Swanson SK, Florens L, Washburn MP, Shilatifard A. Molecular regulation of histone H3 trimethylation by COMPASS and the regulation of gene expression. Mol Cell. 2005;19:849–856. doi: 10.1016/j.molcel.2005.07.024.
- Shi X, Hong T, Walter KL, Ewalt M, Michishita E, Hung T, Carney D, Pena P, Lan F, Kaadige MR, et al. ING2 PHD domain links histone H3 lysine 4 methylation to active gene repression. Nature. 2006;442:96–99. doi: 10.1038/nature04835.
- Shilatifard A. The COMPASS family of histone H3K4 methylases: mechanisms of regulation in development and disease pathogenesis. Annual review of biochemistry. 2012;81:65–95. doi: 10.1146/annurev-biochem-051710-134100.
- Sims RJ, 3rd, Chen CF, Santos-Rosa H, Kouzarides T, Patel SS, Reinberg D. Human but not yeast CHD1 binds directly and selectively to histone H3 methylated at lysine 4 via its tandem chromodomains. The Journal of biological chemistry. 2005;280:41789–41792. doi: 10.1074/jbc.C500395200.
- Sims RJ, 3rd, Millhouse S, Chen CF, Lewis BA, Erdjument-Bromage H, Tempst P, Manley JL, Reinberg D. Recognition of trimethylated histone H3 lysine 4 facilitates the recruitment of transcription postinitiation factors and pre-mRNA splicing. Mol Cell. 2007;28:665–676. doi: 10.1016/j.molcel.2007.11.010.
- Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25:1105–1111. doi: 10.1093/bioinformatics/btp120.
- Tripathi S, Pohl MO, Zhou Y, Rodriguez-Frandsen A, Wang G, Stein DA, Moulton HM, DeJesus P, Che J, Mulder LC, et al. Meta- and Orthogonal Integration of Influenza “OMICs” Data Defines a Role for UBR4 in Virus Budding. Cell host & microbe. 2015;18:723–735. doi: 10.1016/j.chom.2015.11.002.
- van de Werken HJ, de Vree PJ, Splinter E, Holwerda SJ, Klous P, de Wit E, de Laat W. 4C technology: protocols and data analysis. Methods in enzymology. 2012a;513:89–112. doi: 10.1016/B978-0-12-391938-0.00004-5.
- van de Werken HJ, Landan G, Holwerda SJ, Hoichman M, Klous P, Chachik R, Splinter E, Valdes-Quezada C, Oz Y, Bouwman BA, et al. Robust 4C–seq data analysis to screen for regulatory DNA interactions. Nature methods. 2012b;9:969–972. doi: 10.1038/nmeth.2173.
- Vermeulen M, Eberl HC, Matarese F, Marks H, Denissov S, Butter F, Lee KK, Olsen JV, Hyman AA, Stunnenberg HG, et al. Quantitative interaction proteomics and genome-wide profiling of epigenetic histone marks and their readers. Cell. 2010;142:967–980. doi: 10.1016/j.cell.2010.08.020.
- Vermeulen M, Mulder KW, Denissov S, Pijnappel WW, van Schaik FM, Varier RA, Baltissen MP, Stunnenberg HG, Mann M, Timmers HT. Selective anchoring of TFIID to nucleosomes by trimethylation of histone H3 lysine 4. Cell. 2007;131:58–69. doi: 10.1016/j.cell.2007.08.016.
- Voo KS, Carlone DL, Jacobsen BM, Flodin A, Skalnik DG. Cloning of a mammalian transcriptional activator that binds unmethylated CpG motifs and shares a CXXC domain with DNA methyltransferase, human trithorax, and methyl-CpG binding domain protein 1. Molecular and cellular biology. 2000;20:2108–2121. doi: 10.1128/mcb.20.6.2108-2121.2000.
- Wachter E, Quante T, Merusi C, Arczewska A, Stewart F, Webb S, Bird A. Synthetic CpG islands reveal DNA sequence determinants of chromatin structure. eLife. 2014;3:e03397. doi: 10.7554/eLife.03397.
- Wang P, Lin C, Smith ER, Guo H, Sanderson BW, Wu M, Gogol M, Alexander T, Seidel C, Wiedemann LM, et al. Global analysis of H3K4 methylation defines MLL family member targets and points to a role for MLL1-mediated H3K4 methylation in the regulation of transcriptional initiation by RNA polymerase II. Molecular and cellular biology. 2009;29:6074–6085. doi: 10.1128/MCB.00924-09.
- West JA, Viswanathan SR, Yabuuchi A, Cunniff K, Takeuchi A, Park IH, Sero JE, Zhu H, Perez-Atayde A, Frazier AL, et al. A role for Lin28 in primordial germ-cell development and germ-cell malignancy. Nature. 2009;460:909–913. doi: 10.1038/nature08210.
- Wu M, Wang PF, Lee JS, Martin-Brown S, Florens L, Washburn M, Shilatifard A. Molecular regulation of H3K4 trimethylation by Wdr82, a component of human Set1/COMPASS. Molecular and cellular biology. 2008;28:7337–7344. doi: 10.1128/MCB.00976-08.
- Wysocka J, Swigut T, Xiao H, Milne TA, Kwon SY, Landry J, Kauer M, Tackett AJ, Chait BT, Badenhorst P, et al. A PHD finger of NURF couples histone H3 lysine 4 trimethylation with chromatin remodelling. Nature. 2006;442:86–90. doi: 10.1038/nature04815.
- Yamaji M, Seki Y, Kurimoto K, Yabuta Y, Yuasa M, Shigeta M, Yamanaka K, Ohinata Y, Saitou M. Critical function of Prdm14 for the establishment of the germ cell lineage in mice. Nature genetics. 2008;40:1016–1022. doi: 10.1038/ng.186.
- Yokoyama A, Somervaille TC, Smith KS, Rozenblatt-Rosen O, Meyerson M, Cleary ML. The menin tumor suppressor protein is an essential oncogenic cofactor for MLL-associated leukemogenesis. Cell. 2005;123:207–218. doi: 10.1016/j.cell.2005.09.025.
- Zang C, Schones DE, Zeng C, Cui K, Zhao K, Peng W. A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics. 2009;25:1952–1958. doi: 10.1093/bioinformatics/btp340.
- Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, et al. Model-based analysis of ChIP-Seq (MACS). Genome biology. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.