Summary paragraph:
During ontogeny, proliferating cells become restricted in their fate through the combined action of cell-type specific transcription factors and ubiquitous epigenetic machinery, which recognize universally available histone residues or nucleotides but are nonetheless deployed in a highly context-dependent manner1,2. The molecular functions of these regulators are generally well understood, but assigning direct developmental roles is hampered by complex mutant phenotypes that often emerge following gastrulation3,4. Recently, single-cell RNA sequencing (scRNA-seq) and analytical approaches have explored this highly conserved process across numerous model organisms5–8, including mouse9–18. To elaborate on these strategies, we investigated a panel of ten essential regulators using a combined zygotic perturbation, scRNA-seq platform where many mutant embryos can be assayed simultaneously to recover robust transcriptional and morphological information. Deeper analysis of central Polycomb Repressive Complex (PRC) 1 and 2 members indicates substantial cooperativity, but distinguishes a PRC2-dominant role in restricting the germline that emerges from gross molecular changes within the initial conceptus. We believe our experimental framework will eventually allow for a fully quantitative view of how cellular diversity emerges using an identical genetic template and from a single totipotent cell.
Single-cell view of mouse gastrulation
Gastrulation represents a period of embryogenesis that begins with the induction of the primitive streak and proceeds through the generation of distinct germ layers and initial body axes4. To comprehensively assess this period in mouse development, we generated scRNA-seq data from pools of sibling embryos with a B6/CAST F1 father, allowing us to computationally distinguish single replicates by their randomly inherited CAST genotype, and sex according to ChrX- and Y-linked gene expression (Extended Data Fig. 1a–c, Supplementary Tables 1 and 2). We sampled 9–11 wild-type embryos per time point, beginning with the pluripotent epiblast and proceeding to early organogenesis (Embryonic day (E)6.5 to E8.5, Fig. 1a). In total, our wild-type (WT) compendium comprises 88,779 high-quality single-cell transcriptomes from 50 embryos (median of 16,898 transcripts and 3,854 genes per cell and ~2–49% of each embryo, depending on developmental stage, Extended Data Fig. 1d, Supplementary Table 1).
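The sex assignment described above can be illustrated with a minimal sketch. It assumes that transcript counts have already been summed over all cells of one embryo. The gene list (Xist plus four Y-linked genes), the read threshold, and the function name `call_sex` are illustrative assumptions and are not taken from our pipeline.

```python
# Y-linked genes commonly used as male markers in mouse.
# ASSUMPTION: this list is illustrative, not the set used in the study.
Y_GENES = ("Ddx3y", "Eif2s3y", "Kdm5d", "Uty")


def call_sex(embryo_counts, min_reads=10):
    """Call the sex of one embryo from pooled transcript counts.

    embryo_counts: dict mapping gene name -> transcript count summed
                   over all cells assigned to the embryo.
    min_reads:     hypothetical minimum evidence needed to make a call.
    """
    y_total = sum(embryo_counts.get(gene, 0) for gene in Y_GENES)
    xist = embryo_counts.get("Xist", 0)
    if y_total >= min_reads and y_total > xist:
        return "male"
    if xist >= min_reads and xist > y_total:
        return "female"
    return "undetermined"


# Toy embryos with made-up counts.
embryo_a = {"Xist": 850, "Ddx3y": 0, "Eif2s3y": 1}
embryo_b = {"Xist": 2, "Ddx3y": 310, "Eif2s3y": 275, "Uty": 40}
print(call_sex(embryo_a), call_sex(embryo_b))
```

Pooling counts per embryo before calling sex makes the decision robust to dropout in individual cells, which is why the sketch operates on embryo-level rather than cell-level counts.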
To build a reference of transcriptional states, we iteratively clustered our data across replicates and time points. We also adjusted the number of informative “marker” genes per state, leading to a set of 712 that reliably assigns individual cells to one of 42 reproducible states (Fig. 1b, Extended Data Fig. 1e–g). We then assembled a complete lineage tree using state emergence time and gene expression, with additional support from RNA velocity analysis to indicate transcriptome dynamics (Fig. 1c, Extended Data Fig. 1h, i, Supplementary Tables 3 and 4). In our tree, all embryonic lineages stem from the pluripotent epiblast, which gives rise to early ectoderm followed by neural and non-neural sub-lineages, as well as to products of the primitive streak, including extraembryonic and embryonic mesoderm, neuromesodermal progenitors, the embryonic endoderm, and primordial germ cells (PGCs). We provide a detailed and referenced explanation for our nomenclature and tree placement in the Methods and Supplementary Note.
Disrupting epigenetic regulators
With our WT reference established, we proceeded to zygotically disrupt one of several epigenetic regulators, prioritizing key enzymes with known viability issues during early and mid-gestation3 (Extended Data Fig. 2a). We included the three major DNA methyltransferases, the maintenance enzyme Dnmt1 and the de novo enzymes Dnmt3a and Dnmt3b, as well as the repressive Histone-3-Lysine-9 (H3K9) methyltransferase G9a. Dnmt1, G9a, and Dnmt3b mutations are lethal after gastrulation, while Dnmt3a mutants die postnatally with noted neuronal abnormalities19. We also selected both canonical and noncanonical Polycomb complex subunits, which repress developmental genes in a temporal and cell-type specific manner: Rnf2 (also known as Ring1b) and Eed are essential to PRC1 and PRC2, while Kdm2b and L3mbtl2 are noncanonical PRC1.1 and PRC1.6 complex subunits, respectively20. Finally, we included the Histone-3-Lysine-4 (H3K4) methyltransferases Kmt2a and Kmt2b, Trithorax group orthologs that promote developmental gene expression in opposition to Polycomb21.
Unlike many transcription factors, these genes are expressed across the majority of lineages and cell states, although the de novo Dnmts are particularly abundant prior to lineage commitment (Extended Data Fig. 2b, c). To disrupt these genes, we injected B6/CAST-fertilized zygotes with the endonuclease Cas9 and 3–4 single-guide RNAs (sgRNAs) targeted to exons common to all isoforms, transferred E3.5 embryos into pseudopregnant females and recovered scRNA-seq data for 8–12 E8.5 embryos comprising 7,548–25,408 cells (Extended Data Fig. 2a). We confirmed gene disruption by inspection of read alignments over their respective target sites (Extended Data Fig. 2d).
Detecting emerging morphological defects
Our data allow us to explore mutant embryos both anatomically (by developmental progression and gross morphology) and molecularly (by transcriptional state). To examine developmental progression, we assigned mutant (hereafter “KO”) cells by marker expression to their closest WT state and examined embryo composition across replicates (Extended Data Fig. 2e, f). Certain regulator KOs clearly influence the number and kinds of states produced, but generally do not perturb their gross transcriptional identity (Fig. 2a, Extended Data Fig. 3a). Instead, most mutants appear to occupy earlier levels of the overall lineage hierarchy, suggesting developmental delays. We therefore devised a stage-matching metric that considers and weighs the types of states and their relative proportions within an embryo compared to our WT data (Extended Data Fig. 3b–d, Supplementary Table 5).
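As an illustration of assigning a KO cell to its closest WT state, the sketch below compares a cell's marker-gene profile against per-state WT centroids and picks the best-correlated state. The use of Pearson correlation, the three-gene toy profiles, and the function names are assumptions made for this example; the actual assignment procedure is described in the Methods and may differ.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no information
    return cov / (sd_x * sd_y)


def assign_state(cell_profile, wt_centroids):
    """Return the WT state whose marker centroid best matches the cell."""
    return max(wt_centroids,
               key=lambda state: pearson(cell_profile, wt_centroids[state]))


# Toy centroids over three hypothetical marker genes (made-up values).
wt_centroids = {
    "epiblast": [5.0, 0.5, 0.2],
    "primitive_streak": [1.0, 4.0, 0.5],
    "neural_ectoderm": [0.5, 0.3, 6.0],
}
ko_cell = [0.8, 3.1, 0.4]
print(assign_state(ko_cell, wt_centroids))
```

Once every KO cell carries a WT state label, embryo composition follows from counting labels per embryo, which is the input a stage-matching comparison would need.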
Our approach confirmed many historical observations, including moderately increased severity for mutations of Dnmt1 compared to Dnmt3a or 3b, as well as of Kmt2b compared to Kmt2a, consistent with its primary role in orchestrating early differentiation22. Of our E8.5-isolated KO embryos, L3mbtl2 showed the greatest delay, arresting shortly after the onset of gastrulation at ~E7.0, followed by Eed and Rnf2, which gastrulate but largely fail to progress beyond ~E7.5 (Fig. 2b). Notably, many KOs affect development beyond progression or growth. For example, Eed and Rnf2 clearly gastrulate, but fail to produce neural ectoderm and bias the primitive streak towards posterior lineages such as the extraembryonic mesoderm and PGCs (Fig. 2a, Extended Data Fig. 3a). In contrast, L3mbtl2 KO embryos form some tissues of the early primitive streak that do not mature, but continue to produce abundant extraembryonic tissues (Extended Data Fig. 3a).
PRC1 and 2 converge to a common gene set
Assigning each cell from our mutant embryos to pre-defined WT states allowed us to measure within-state expression changes in addition to over- or underproduction of certain lineages. To compare our ten KOs, we initially identified differentially expressed genes for each mutant cell state against WT as “up” or “down” regulated (Supplementary Table 6). We then calculated the fraction of cell states where each gene is deregulated, doing so separately for the embryonic and extraembryonic lineages because of their independent origins and distinct epigenetic regulation23,24. As expected, L3mbtl2 remains a global outgroup, with the largest number of recurrently deregulated genes, though this may also be affected by lower overall embryonic complexity. We were surprised to see that the ncPRC1 subunit Kdm2b clusters with Eed and Rnf2, even though Kdm2b KO embryos generally produce more mature cell types (Fig. 2c, Extended Data Fig. 3a, e, f).
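The recurrence calculation described above can be sketched as follows. The example assumes that differential expression has already been called per state as "up" or "down"; the state names, the placeholder genes `GeneA` and `GeneB`, and the toy calls are hypothetical and serve only to show the bookkeeping.

```python
from collections import defaultdict


def deregulation_fraction(de_calls, states):
    """Fraction of the given states in which each gene is deregulated.

    de_calls: dict mapping state -> {gene: "up" or "down"}
    states:   states belonging to one lineage group
    Returns:  dict mapping gene -> {"up": fraction, "down": fraction}
    """
    counts = defaultdict(lambda: {"up": 0, "down": 0})
    for state in states:
        for gene, direction in de_calls.get(state, {}).items():
            counts[gene][direction] += 1
    n_states = len(states)
    return {gene: {d: c / n_states for d, c in dirs.items()}
            for gene, dirs in counts.items()}


# Toy calls for a single KO (made-up values).
de_calls = {
    "epiblast": {"Cdkn2a": "up"},
    "primitive_streak": {"Cdkn2a": "up", "GeneA": "down"},
    "early_ectoderm": {"Cdkn2a": "up"},
    "Xecto": {"GeneB": "up"},
    "Xendo": {},
}
embryonic = ["epiblast", "primitive_streak", "early_ectoderm"]
extraembryonic = ["Xecto", "Xendo"]

# Lineage groups are scored separately, as in the text.
emb = deregulation_fraction(de_calls, embryonic)
extra = deregulation_fraction(de_calls, extraembryonic)
print(emb["Cdkn2a"]["up"], extra["GeneB"]["up"])
```

Scoring the two lineage groups with separate denominators keeps a gene that is deregulated only in extraembryonic states from being diluted by the larger number of embryonic states.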
All three Polycomb-associated mutants also converge towards functional ontologies associated with developmental processes and cell cycle regulation (Extended Data Fig. 3f). For example, the tumor suppressor Cdkn2a, a known PRC-regulated locus25, is constitutively targeted by both PRC1 and 2. In contrast, L3mbtl2 KO embryos show limited overlap with Eed, Rnf2, or Kdm2b, supporting a predominantly PRC1-independent role for L3mbtl2 in early development that cannot be compensated for by PRC2 or other PRC1 complexes (Supplementary Table 7). In general, our PRC-associated KO data provide the most compelling results in terms of transcriptional and morphological defects to the gastrulation process itself. However, Dnmt1 and G9a KO embryos also exhibit functional ontologies associated with loss of imprinting and other previously described targets that will warrant further investigation (Supplementary Tables 6, 7).
Epigenetic deregulation at E6.5
To provide greater clarity into the onset of epigenetic disruption, we generated whole genome bisulfite sequencing (WGBS) data of E6.5 epiblast and extraembryonic ectoderm (Xecto) for each regulator KO, as these tissues represent the latest homogeneous progenitors prior to gastrulation (Extended Data Fig. 4). We see clear global DNA methylation differences for Dnmt1 KO, as well as more subtle differences for Dnmt3b and Dnmt3a (Extended Data Fig. 4a–d).
We also examined changes at CpG islands (CGIs), which represent a major focal point of Polycomb and Trithorax group-based regulation and are usually free of DNA methylation in the epiblast. We observe notable CGI methylation in response to Kdm2b and Kmt2b KO, which both proceed through gastrulation with some developmental delay, but not for our Kmt2a KO, which advances normally through E8.5 (Fig. 2d, Extended Data Fig. 4d, ref. 26). Kmt2b appears to protect a larger number of CGIs than Kdm2b and also operates within Xecto, suggesting either a broader utility or earlier preimplantation activity (Fig. 2d). Kdm2b and Kmt2b protected promoter CGIs are also ~2.5-fold enriched for H3K27me3-based regulation in the epiblast, though in general their associated genes are lowly expressed in our data, suggesting that their influence may be too subtle to pinpoint with the current scRNA-seq strategy (Extended Data Fig. 4e). In contrast, core PRC subunits Eed and Rnf2 do not appear to be required for protection against CGI methylation in the epiblast, but do influence the methylation status of surrounding regions (Extended Data Fig. 4f).
We also applied our scRNA-seq and WGBS data to explore retrotransposons within our Dnmt or G9a KOs, which otherwise exhibited limited gene expression differences (Extended Data Fig. 5a). Here, the ERVK subfamily of LTRs shows strongly coupled demethylation and transcription within Dnmt1 KOs, specifically methylation-sensitive Intracisternal A-type Particles (IAPs)27 (Fig. 2e, Extended Data Fig. 5b). This may explain Dnmt1 KO embryos’ impeded progression and death within ~1–2 days following E8.5. In contrast, continued IAP repression in Dnmt3b or G9a KOs suggests these embryos maintain a sufficient threshold to preserve epigenetic silencing (Fig. 2e).
Finally, we re-examined our L3mbtl2 KOs to better understand their severe phenotype, including near total embryonic arrest and continued extraembryonic growth (Fig. 2f). We analyzed cells from either the embryonic or Xecto lineage separately for gross expression changes as they relate to promoter DNA methylation. We identify 13 genes that are both highly over-expressed and show lower than expected promoter methylation (Fig. 2g, Extended Data Fig. 5c, d), including several previously reported ncPRC1.6 targets associated with gametogenesis28. Examined over early development, we find that L3mbtl2-sensitive genes are specifically unmethylated in both gametes and become methylated shortly following gastrulation (Extended Data Fig. 5e). Thus, L3mbtl2’s primary developmental function appears to be the active silencing of select gamete-specific genes, whose aberrant and exogenous expression may otherwise be detrimental, particularly within the epiblast.
PRC2 dominates early lineage restriction
We next compared Eed, Rnf2, and Kdm2b mutant phenotypes, which converge to a similar gene set but differ morphologically. Both Eed and Rnf2 KOs overproduce posterior-proximal structures, such as the allantois, without advancing the embryo proper29,30 (Fig. 3a). Notably, Eed KO embryos also substantially overproduce PGC state cells, a result we confirmed by generating KO embryos carrying a Prdm14 promoter-driven reporter31 (Fig. 3b, c, Extended Data Figs. 3a, 6a).
PRC complexes also contribute to imprinted X chromosome inactivation (XCI) in extraembryonic and random XCI in embryonic cells32,33. Separated by sex, we find that Eed KO females consistently fail to maintain the Xecto lineage and substantially derepress ChrX-linked genes, while Rnf2 KOs are more subtly deviated and Kdm2b KOs appear normal (Fig. 3d, e). Notably, extraembryonic endoderm (Xendo) also undergoes imprinted XCI but does not exhibit either of these phenotypes. Furthermore, we observe largely normal ChrX transcript ratios within embryonic lineages, though Eed participates in random XCI as well (Extended Data Fig. 6b).
In contrast to core PRC1 and 2 components, Kdm2b largely appears to play a secondary role on the same overall gene set. For example, Cdkn2a is upregulated in all embryonic lineages for all three KOs, but to a substantially higher degree in Eed and Rnf2 KO (Fig. 3f). Despite a canonical role as a tumor suppressor, Cdkn2a expression does not explain the overall gastrulation defect observed in Eed KO embryos, as co-injecting sgRNAs to Cdkn2a and Eed does not produce appreciable differences compared to Eed KO alone (Extended Data Fig. 6c–e).
In our three PRC KOs, differentially methylated CpGs collect within multi-kb territories, termed DNA methylation valleys (DMVs)34, that are normally maintained in a completely unmethylated state. Upon PRC disruption, a subset of DMVs become hypermethylated, though internal CGIs remain protected (Fig. 3g, h, Extended Data Fig. 7a). These PRC-sensitive DMVs are enriched for H3K27me3 and cover substantially larger genetic territories. They also preferentially harbor promoters of lineage-specific marker genes, although we see no straightforward correlation between epigenetically deregulated genes and the overall morphology of Eed KO embryos (Extended Data Fig. 7b, c, Supplementary Table 8). While we see the same overall change in methylation pattern in Eed, Rnf2, and Kdm2b KO epiblast, the net levels are lower for Kdm2b (Fig. 3g, h, Extended Data Fig. 8a, b). In all examined cases, Kdm2b shows some deregulation consistent with core PRC subunits, but does not appear to pass a critical threshold to substantially alter early embryo patterning.
Our DMV-level analysis also supports the heightened severity of the Eed KO phenotype. For example, although DMVs are preferentially unmethylated within WT epiblast, they are de novo methylated within the WT Xecto lineage, including embedded CGIs (Extended Data Fig. 7a, b). However, these CGIs do not gain methylation specifically within the Eed KO Xecto, including promoters of critical early regulators of the epiblast and germline, such as Otx2 and Prdm14, respectively (Fig. 3h, Extended Data Fig. 8b, c). Eed disruption therefore affects developmental gene promoters in both embryonic and extraembryonic compartments, producing an overall similar epigenetic pattern. Unfortunately, Prdm14 is generally lowly expressed, even within PGCs, limiting our ability to precisely monitor its differential expression as a possible explanation for Eed KO-specific overproduction. However, Dppa3, another key PGC marker gene, is more broadly expressed within Eed KO embryos and particularly abundant within the Xecto lineage (Fig. 3i). These data further support an early, PRC2-dominant role in restricting germline-relevant genes that extends into the trophectoderm.
The PRC2 phenotype precedes gastrulation
We generated additional Eed KO scRNA-seq data from E6.5 and E7.5 to see if these phenotypes otherwise obey general principles of WT development, including stepwise induction of committed progenitors by exogenous signals (Extended Data Figs. 9 and 10). Morphologically, we find minimal changes at E6.5, but confirm diminished complexity and delays by E7.5 (Fig. 4a, Extended Data Fig. 10b, c). Notably, Prdm14 reporter activity indicates that PGCs are positionally specified in Eed KOs, with signal limited to a few cells within the posterior-proximal epiblast at E6.5 (Fig. 4b). More generally, the relative proportions and transcriptional stability of early extraembryonic and embryonic products resemble WT, but subsequently become either abnormal or fail to consistently develop (Fig. 4c). By E7.5, products of the primitive streak skew posteriorly towards extraembryonic mesoderm and PGCs, while the axial mesoderm (node, notochord) is highly abnormal and more advanced embryonic mesoderm or neural ectoderm do not develop (Fig. 4c, Extended Data Fig. 10c, d).
We sought to identify when transcriptional biases first emerge that may determine the ultimate partitioning of Eed KO embryos. Many changes detected at E8.5 are already apparent prior to gastrulation: Cdkn2a is already active within the embryo proper and ChrX is aberrantly transcribed within the female Xecto lineage (Extended Data Fig. 10e, f). Moreover, PGC-associated marker genes tend to be abundant and co-expressed within the same cells of several early states, including the epiblast and pre-specified primitive streak (Fig. 4d, Extended Data Fig. 10g). These genes also function during naive pluripotency and are generally silenced during implantation35. Furthermore, we observe failed stepwise induction of homeotic (Hox) genes throughout the primitive streak, which generally matures from a Hoxb1-positive state into Hoxb1, Hoxd9 double positive caudal mesodermal tissues36. In contrast, Eed KO embryos express Hoxd9 prematurely and destabilize Hoxb1 induction, mirroring the eventual posteriorized phenotype (Fig. 4e, Supplementary Table 9).
Currently our pipeline cannot easily address the influence of non-autonomous factors. For example, deregulation of extraembryonic tissues may alter the initial morphogen gradients that set the primitive streak, which could lead to underdevelopment or promote biases. To examine how PRC2 interacts with these parameters, we generated a knockout mouse embryonic stem cell line (EedKO mESCs) to induce alternate fates in vitro (Extended Data Fig. 11a, b, Supplementary Figure 1). Specifically, we directed EedKO mESCs into a formative epiblast-like state using FGF, followed by exposure to different concentrations of signaling components for an additional 48h. Across many conditions, EedKO mESCs less reliably silenced pluripotency-associated genes, such as Dppa3 and Esrrb, and broadly expressed posterior-proximal mesodermal genes, such as Bmp4 and Bmp8b, which also support PGC production in vivo37. We were unable to induce neural ectodermal genes, even when impeding competing mesendodermal and surface ectodermal pathways with small molecule inhibitors (Fig. 4f, Extended Data Fig. 11c, d). Thus, the Eed mutant phenotype appears to reflect a failure to adequately demarcate exit from pluripotency with the independent and exogenous priming of neural ectodermal and mesendodermal lineages.
Discussion
We present a combined genetic perturbation, scRNA-seq strategy to functionally dissect mammalian embryogenesis. Our platform is designed to understand complex mutant phenotypes comprehensively, both anatomically and molecularly, and to account for natural variation, which may be fundamental to a given developmental process. We investigated a number of key epigenetic regulators that produce lethal post-gastrulation phenotypes but have been difficult to characterize in full because they are presumed to buffer differentiation across many contexts. Using this approach, we confirm that core PRCs largely function cooperatively to counteract an otherwise innate mesodermal bias and safeguard neural regulator induction within the early ectoderm. However, PRC2 mutants exhibit somewhat greater severity that includes the overproduction of PGCs, broad destabilization of a shared PGC/pluripotency subnetwork, and failure to establish several key epigenetic features within the Xecto. Additional work will be necessary to fully resolve the interactions between these and other regulators as they coordinate morphogenesis.
We believe that our approach is highly tractable and may be expanded to address these and other questions, including the simultaneous disruption of multiple genes to explore epistasis or redundancy and conditional strategies to infer temporal, lineage, maternal, or non-autonomous effects38. Integrating molecular lineage recording strategies will contribute additional layers regarding how progenitor fields become altered without epigenetic supervision16. Finally, the ability to measure multiple parameters across replicates will provide insight into the robustness of developmental encoding: how these indeterminate processes reliably reproduce an identical body plan. Cumulatively, these strategies may ultimately yield a complete description of the interactions between genetic and epigenetic mechanisms that govern ontogeny.
Materials and Methods
Embryo generation
Protocols are adapted from those described previously40. Briefly, B6D2F1 strain female mice (age 6–8 weeks, Jackson labs) were superovulated by serial injection of Pregnant Mare Serum Gonadotropin (5IU per mouse, Prospec Protein Specialists) followed by Human Chorionic Gonadotropin (5IU, Millipore) 46 hours later. 12–14 hours after priming, MII stage oocytes were isolated in M2 media supplemented with hyaluronidase (Millipore) and stored in 25 μl drops of pre-gassed KSOM with ½ amino acids (Millipore) under mineral oil (Irvine Scientific). Zygotes were generated by piezo-actuated intracytoplasmic sperm injection (ICSI) as previously described41 using thawed B6/CAST F1 strain sperm in batches of 30–50 oocytes and standard micromanipulation equipment, including a Hamilton Thorne XY Infrared laser, Eppendorf Transferman NK2 and Patchman NP2 micromanipulators, and a Nikon Ti-U inverted microscope. Alternatively, for material subjected to whole genome bisulfite sequencing (WGBS), which does not require SNP-based analysis, hormone primed females were mated overnight with B6D2F1 males (age 2–12 months, Jackson labs) and zygotes were isolated as described above for oocytes.
For zygotic disruption, pronuclear stage 3 (PN3) zygotes were recovered after ~6 hours of incubation and injected with a cocktail consisting of 200 ng/μl Cas9 mRNA and a 100 ng/μl equimolar ratio of 3–4 single guide RNAs (sgRNAs) targeting different exons of an epigenetic regulator gene locus (designed using ChopChop42 and IDT’s CRISPR-Cas9 guide RNA checker as described previously24, see Supplementary Table 10). Preferentially, targeted exons were chosen to be located towards the 5′ end and to be shared across isoforms. At ~84 hours postfertilization, cavitated blastocysts were transferred into the uterine horns of pseudopregnant CD-1 strain females (25–35g, Charles River) generated by mating with vasectomized SW strain males (Taconic), which results in a 24 hour offset in gestational time to accommodate implantation.
B6/CAST F1 mice were generated in house by breeding C57BL/6J strain female mice with CAST/EiJ strain males. Swimming sperm were isolated from the caudal epididymis of males (>2 months of age) in M2 media (Millipore), decapitated by brief pulse sonication in a Branson Sonifier with double stepped tip (Branson), and stored at –40°C in 25 μl aliquots for use within 6 months to a year of collection. Cas9 mRNA and sgRNAs were in vitro transcribed using the mMESSAGE mMACHINE® T7 Ultra or MEGAshortscript® Kits (Thermo Fisher), purified using the RNA clean and concentrator kit (Zymogen), and resuspended in injection buffer (5 mM Tris buffer, 0.1 mM EDTA, pH = 7.4).
All procedures were performed in our specialized facility, followed all relevant animal welfare guidelines and regulations, and were approved by Harvard University IACUC protocol (#28–21) and the Max Planck Institute for Molecular Genetics (G0247/13-SGr1).
Single-cell transcriptome profiling of embryos
Wild-type (WT) and knockout (KO) embryos were isolated from surrogate mice between E6.5–8.5 at 12-hour intervals for WT and at gestational day E8.5 for epigenetic regulator KO experiments. The emergence of the Eed mutant phenotype was profiled in more detail by additionally sampling E6.5 and E7.5 KO embryos. Outermost extraembryonic tissues (yolk sac, trophectoderm derived tissues) were preserved if possible. Microscope images recorded embryo number and morphology. Embryos were serially washed through several droplets of 1xPBS/0.4%BSA, pooled without any morphology-based pre-selection and subjected to tissue dissociation in 200 μl TrypLE Express (Gibco) for 40–60 minutes at 37°C, with pipetting in 5-minute intervals. The cell suspension was filtered using Scienceware Flowmi Cell Strainers, 40 μm. Cells were washed twice with 1 ml 1xPBS/0.4%BSA and centrifuged for 5 minutes at 1200 rpm. The cell concentration was determined using a hemocytometer and cells were subjected to single-cell RNA sequencing (10x Genomics, Chromium™ Single Cell 3’ v2 or v3) aiming for a target cell recovery of up to 13,000 sequenced cells per sequencing library. Single-cell libraries were generated following the manual instructions, with the exception of fewer PCR cycles than recommended during cDNA amplification or library generation/sample indexing to increase library complexity. Libraries were sequenced with a minimum of 230 million paired end reads according to parameters as described in the manual. For details see Supplementary Tables 1 and 2.
Imaging embryos for morphology and size measurement
E7.5 and E8.5 wild-type and Eed KO embryos acquired in experiments that were performed independently of the single-cell sequencing experiments were imaged using a ZEISS AxioZoom.V16 microscope and ZENBLUE imaging software at 10X and 7X objectives respectively, with z-stacks of 12–17 μm intervals. To obtain a higher resolution of morphology, individual E7.5 embryo images were acquired at 50X and E8.5 embryos at 40X objectives. The E6.5 embryos, which were subjected to single-cell sequencing, were imaged using an Olympus IX71 inverted microscope and Metamorph software. Images of E6.5 WT embryos were acquired at 4X and KO embryos at 10X. Wild-type E7.5 and 8.5 embryos shown in Extended Data Fig. 10 to provide a size and morphological comparison to Eed KO were generated by natural mating. Surface area (in μm²) of embryos was measured using the ‘region’ tool by drawing a polygon contour around each embryo in ZENBLUE.
Prdm14-mVenus reporter experiments: In vitro fertilization, electroporation and embryo imaging
In vitro fertilization of B6D2F1 oocytes was performed with reporter sperm from heterozygous males with mVenus under the control of Prdm14, as described previously43. The reporter strain was generated by the lab of Mitinori Saitou31 and the mVenus Prdm14 promoter sperm (B6.Cg-Tg(Prdm14-Venus)1Sait/SaitRbrc; BRC No. RBRC05384) were provided by the RIKEN BRC through the National Bio-Resource Project of the MEXT/AMED, Japan (Acc. No. CDB0461T; http://www.cdb.riken.jp/arg/mutant%20mice%20list.html; Reproduction. 2008 136(4): 503–14). PN3 zygotes were washed in M2 medium and prepared for electroporation. Electroporation reactions were set up according to the Alt-R CRISPR-Cas9 ribonucleoprotein (RNP) complex protocol from Integrated DNA Technologies (IDT). RNP complexes were assembled just prior to electroporation. Briefly, 2 μL of 200 μM tracrRNA and 0.67 μL of each 200 μM crRNA were mixed, heated to 95°C for 5 minutes and allowed to anneal at room temperature for 10 minutes. 3 μL of crRNA-tracrRNA mix and 1 μL of 61 μM Alt-R Hi-Fi Cas9 Nuclease 3NLS were diluted in 46 μL of Opti-MEM medium and incubated at room temperature for 20 minutes.
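The RNP assembly above implies final working concentrations that can be checked with a short calculation. The sketch below uses the stated volumes and stock concentrations; the number of crRNAs per reaction (three) is an assumption, since it is not given in this paragraph, and the variable names are illustrative.

```python
# Stock concentrations (uM) and volumes (uL) as stated in the text.
TRACR_UM, TRACR_UL = 200.0, 2.0
CRRNA_UM, CRRNA_UL_EACH = 200.0, 0.67
CAS9_UM, CAS9_UL = 61.0, 1.0
GUIDE_MIX_UL, OPTIMEM_UL = 3.0, 46.0
N_CRRNAS = 3  # ASSUMPTION: three crRNAs per reaction

# Amounts in pmol (uL x uM = pmol).
tracr_pmol = TRACR_UL * TRACR_UM
crrna_pmol = N_CRRNAS * CRRNA_UL_EACH * CRRNA_UM
anneal_ul = TRACR_UL + N_CRRNAS * CRRNA_UL_EACH

# The annealed duplex is limited by whichever RNA is scarcer.
duplex_um = min(tracr_pmol, crrna_pmol) / anneal_ul

# Dilution into the final electroporation mixture.
total_ul = GUIDE_MIX_UL + CAS9_UL + OPTIMEM_UL
final_guide_um = GUIDE_MIX_UL * duplex_um / total_ul
final_cas9_um = CAS9_UL * CAS9_UM / total_ul

print(f"total volume: {total_ul:.0f} uL")
print(f"guide duplex: {final_guide_um:.2f} uM")
print(f"Cas9:         {final_cas9_um:.2f} uM")
print(f"guide:Cas9 molar ratio: {final_guide_um / final_cas9_um:.1f}")
```

Under this assumption the crRNAs and tracrRNA are close to equimolar, and the 50 μL reaction contains roughly 6 μM guide duplex and 1.2 μM Cas9, that is, an approximately five-fold molar excess of guide over nuclease.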
The NEPA21 electroporator was used with the following settings. Impedance values were maintained between 120 and 160 kΩ. Four poring pulses of 30 V and 2.5 milliseconds were used with an interval of 50 milliseconds, voltage decay of 10% and (+) polarity. The transfer pulse was applied at 5 V for 50 milliseconds with an interval of 50 milliseconds, voltage decay of 40% and alternating polarity (+) and (−).
Zygotes that developed to blastocyst stage were screened for mVenus fluorescence in the inner cell mass as only half of the embryos are expected to carry the reporter. mVenus positive embryos were re-transferred to pseudopregnant CD-1 fosters and isolated after in vivo development to E6.5, E7.5 and E8.5. Isolated embryos were washed in cold 1x PBS with 0.4% BSA and fixed overnight in 4% Paraformaldehyde (PFA) at 4°C followed by three washes in cold 1x PBS. Nuclei were stained with 0.24 μg/mL DAPI for 40 minutes at 4°C. Images were acquired using Zeiss LSM880 at 10X magnification and z-stacks of 5 μm interval. Images were processed and maximum intensity projections of the z-stacks were generated using the 3D-project tool of the ImageJ software bundled with Java 1.8.0. Four, six and seven Eed KO embryos carrying a Prdm14-reporter were isolated at E6.5, E7.5 and E8.5, respectively, and demonstrated similar results per developmental stage.
Immunofluorescence
Embryos were dissected from deciduae at specific stages in cold 1x HBSS and then fixed overnight in 4% PFA at 4°C. Embryos were rinsed three times in 1x PBS and permeabilized with PBT0.5 (0.5% Triton X-100 in 1x PBS) for two hours followed by blocking for an hour with blocking solution (10% FBS in PBT0.5). Embryos were incubated for 72 hours at 4°C with the primary antibody anti-Histone H3 Lysine 27 tri-methylation (Abcam ab6002), diluted in blocking solution to 1:200. The following day, embryos were washed with PBT0.5 four times (30 minutes per wash), and blocked overnight at 4°C in blocking solution. The following day, embryos were incubated overnight at 4°C with donkey anti-mouse Alexa Fluor® 488 (Invitrogen A21202), diluted in blocking solution at 1:400. Embryos were subsequently washed with PBT0.5 four times (30 minutes per wash), and nuclei were counterstained with DAPI (0.24 μg/mL) for 40 minutes at 4°C. Embryos were washed with PBT0.5 four times (30 minutes per wash), and post-fixed with 4% PFA for 20 minutes. Final washes (three times, 15 minutes) were performed with 0.02 M phosphate buffer (0.025 M NaH2PO4; 0.075 M Na2HPO4, pH 7.4) followed by optical clearing at 4°C for 24–48 hours with 1.62 M RIMS clearing agent (Histodenz in 0.02 M phosphate buffer). Images were acquired using Zeiss LSM710 at 63X magnification (oil immersion) and z-stacks of 2.13 μm intervals were generated. Two independent experiments were performed with similar results. A representative z-stack is shown in Extended Data Figure 9c.
EedKO mESC line generation and fate induction experiments
WT V6.5 mouse ESCs (provided by the lab of Konrad Hochedlinger, tested negative for mycoplasma, authenticated by Nanostring for mouse pluripotency markers) were simultaneously transfected with two plasmids encoding Cas9 alongside one of two sgRNAs targeting sequences flanking the Eed gene locus (see Supplementary Table 10). Subclones were expanded and homozygous Eed disruption was confirmed by target site amplification and Sanger sequencing.
Our selected EedKO cell line was expanded for 16 passages in Serum/LIF to ensure complete depletion of H3K27 methylation before Western blotting for H3K27me3 on histone extracts using the tri-methyl-histone H3 antibody (Cell Signaling, C36B11, at 1:500 dilution). Histone H4 was detected by anti-histone H4 as a loading control (Millipore, 07–108, at 1:1,000 dilution). Tricine gels were used with tricine buffer and SeeBlue™ Plus2 Pre-stained Protein Standard (Invitrogen, LC5925). Two independent histone isolations and Western blots were performed with similar results.
WT and EedKO mESCs were maintained in Serum/LIF and cultured for at least two weeks in N2B27-containing 2i/LIF media on gelatin-coated plates to ensure their full conversion to naïve pluripotency prior to induction with exogenous factors44. For signaling experiments, 10,000 cells were plated per well into N2B27 media containing 12 ng/ml bFGF. For these experiments, we used human plasma fibronectin (purified protein, Millipore) coated 8-well chamber slides (μ-Slide 8 Well, ibidi). After 24 hours, media was exchanged with N2B27 containing 12 ng/ml bFGF and select concentrations of signaling compounds and/or small molecule inhibitors for an additional 48 hours. Final concentrations of growth factors or small molecule inhibitors were as follows: 12 ng/ml Recombinant Human bFGF (R&D Systems); 0.25 μM Retinoic acid (Sigma); 5 and 500 ng/ml Recombinant Human BMP-4 Protein (R&D Systems); 10, 100, 1000 ng/ml Recombinant Human WNT-3A Protein (R&D Systems); 10 and 1000 ng/ml Recombinant Human/Murine/Rat ACTIVIN A (Peprotech); 0.5 μM ALK2/3 inhibitor LDN-193189 (Stemgent, 10 mM solution) to inhibit the Bmp4 pathway; 3.3 μM Tankyrase1/2 inhibitor XAV939 (Tocris) to inhibit the Wnt pathway; and 10 μM TGF-β RI Kinase Inhibitor VI SB431542 (Millipore) to inhibit Activin/Nodal signaling. Total RNA was isolated by washing wells twice with PBS followed by adding 350 μl RLT buffer as part of the RNeasy Plus Micro Kit protocol (Qiagen). Additional samples for each experiment include N2B27 containing 2i/LIF at day 0 and N2B27 containing 12 ng/ml bFGF after 24 and 72 hours, respectively. mESC experiments and RNA isolation were done as three independent experiments.
Expression profiling of lineage-specific genes using NanoString
To profile the expression of 44 genes and 4 housekeeping genes (Polr1b, Hprt, Abcf1, Gusb), 400 ng total RNA were used in a NanoString nCounter PlexSet assay to profile 88 RNA samples of the mESC experiments described above (triplicates for all but one condition, duplicate for 100 ng/ml WNT-3A). Probe hybridization was set up according to manufacturer’s instructions and performed for 24 hours (MAN-10040–05). Reactions were pooled per column, generating 12 pools and run on the NanoString nCounter SPRINT Instrument. False negative probes detected up to 14 counts, which informed the magnitude of potential false negative signal. Thus, 20 counts were conservatively removed from all measurements. To provide reliable estimates on expression differences, fold changes between transcript counts in WT and EedKO were only calculated if, for a given experimental condition, the gene was detected with at least 50 counts (after background subtraction) in at least one of the two cell lines. Significance of expression differences was tested for all genes (t-test, R function t.test).
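The background subtraction and detection threshold described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the clipping of corrected counts at zero, the averaging of replicates, the pseudocount, and the KO/WT direction of the ratio are all assumptions; only the values 20 and 50 come from the text.

```python
import math

BACKGROUND = 20   # counts subtracted from every measurement (from the text)
MIN_COUNTS = 50   # detection threshold after background subtraction (from the text)


def corrected(count):
    """Subtract the background; clipping at zero is an assumption."""
    return max(count - BACKGROUND, 0)


def log2_fold_change(wt_counts, ko_counts, pseudocount=1.0):
    """Return log2(KO / WT) of the mean corrected counts, or None when
    neither cell line reaches MIN_COUNTS for this condition."""
    wt_mean = sum(corrected(c) for c in wt_counts) / len(wt_counts)
    ko_mean = sum(corrected(c) for c in ko_counts) / len(ko_counts)
    if max(wt_mean, ko_mean) < MIN_COUNTS:
        return None  # gene not reliably detected in either line
    return math.log2((ko_mean + pseudocount) / (wt_mean + pseudocount))
```

A gene measured near background in both lines returns None and is therefore left out of the fold-change comparison.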
Bioinformatics
Unless stated otherwise, all statistics and plots were generated using R version 3.5.1 “Feather Spray”. Boxplots indicate the median and quartiles, with whiskers reaching up to 1.5 times the interquartile range. The violin plot outlines illustrate kernel probability density, i.e. the width of the shaded area represents the proportion of the data located there. For violin plots, boxes indicate the median and quartiles, with whiskers reaching up to 1.5 times the interquartile range. Heatmaps were plotted using the ComplexHeatmap package45 and browser track figures using the Gviz package46.
Preprocessing
The Cell Ranger pipeline version 3 (10x Genomics Inc.) was used for each scRNA-seq data set to de-multiplex the raw base call files, generate the fastq files, perform the alignment against the mouse reference genome mm10, filter the alignment and count barcodes and UMIs. Outputs from multiple sequencing runs were also combined using Cell Ranger functions.
Genotyping - Alignment
For each experiment, the scRNA-seq data were aligned against an mm10 hybrid mouse genome assembly using STAR47 with default settings and “--outSAMattributes NH HI NM MD.” The hybrid genome was prepared using SNPsplit48 to mask SNPs between the mouse version mm10 (GRCm38) and the CAST/EiJ strain genomes with the ambiguity base (N). Subsequently, SNPsplit was used to sort reads that cover SNPs by origin (reference genome). Unambiguous and unique alignments of WT samples were used to create a list of SNPs that were covered by reads originating from both reference genomes. Finally, reads covering these SNPs were used to determine the allele composition for each cell, i.e. fraction of CAST/EiJ specific SNPs.
Genotyping - Cell to embryo assignment, doublet removal, and sex determination
Single cells were assigned to embryos according to the autosomal fraction of CAST SNPs, a 19-dimension vector that allowed us to estimate the number of embryos per experiment. A minimum number of 1,000 covered SNPs and SNP information for each autosome was required. k-means clustering for multiple k (kmeans function in R, k = 2–15, default parameters) was performed on cells that fulfilled this criterion and evaluated by calculating the AIC for each model. The k with the minimal AIC defined the number of detected embryos, and the kernel averages represent the SNP profile for each embryo in the pool. Cells were then assigned to the embryo based on minimum distance in their SNP profile.
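The final assignment step can be illustrated with a short sketch. It assumes the embryo SNP profiles (the k-means centers) have already been computed; the data layout and function names are hypothetical, and the k-means/AIC model selection itself is not reproduced here.

```python
import math

MIN_SNPS = 1000   # minimum number of covered SNPs per cell (from the text)


def distance(a, b):
    """Euclidean distance between two per-autosome CAST fraction vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_cells(cells, centers):
    """cells: {cell_id: (n_covered_snps, profile)}, where profile holds one
    CAST fraction per autosome (None where an autosome has no covered SNP).
    centers: list of embryo profiles. Returns {cell_id: embryo index}."""
    assignments = {}
    for cell_id, (n_snps, profile) in cells.items():
        if n_snps < MIN_SNPS or any(f is None for f in profile):
            continue  # cell fails the coverage criteria and is not assigned
        dists = [distance(profile, c) for c in centers]
        assignments[cell_id] = dists.index(min(dists))
    return assignments
```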
We found that unstable cell to embryo assignments were often either the result of low UMI counts or of very high counts, most likely representing cell multiplets. To eliminate these, we performed 100 iterations of our embryo assignment strategy using a randomly sampled 20% of each cell’s SNPs and discarded cells that changed their assignments (Extended Data Fig. 1b). Stably assigned cells were consistently assigned to the same embryos based on the k-means clustering (Supplementary Table 2).
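A minimal sketch of this stability test is given below. The input layout (one `(autosome, is_cast)` pair per covered SNP), the function names, the fixed random seed, and the choice to treat a subsample that leaves an autosome uncovered as unstable are assumptions made for illustration.

```python
import random


def cast_profile(snps, n_autosomes=19):
    """Per-autosome CAST fraction, or None if any autosome is uncovered."""
    total = [0] * n_autosomes
    cast = [0] * n_autosomes
    for autosome, is_cast in snps:
        total[autosome - 1] += 1
        cast[autosome - 1] += int(is_cast)
    if 0 in total:
        return None
    return [c / t for c, t in zip(cast, total)]


def nearest(profile, centers):
    """Index of the embryo center with the smallest squared distance."""
    dists = [sum((x - y) ** 2 for x, y in zip(profile, c)) for c in centers]
    return dists.index(min(dists))


def is_stable(snps, centers, iterations=100, fraction=0.2, seed=0):
    """True if every random SNP subsample yields the same embryo."""
    rng = random.Random(seed)
    k = max(1, int(len(snps) * fraction))
    seen = set()
    for _ in range(iterations):
        profile = cast_profile(rng.sample(snps, k))
        if profile is None:
            return False  # assumption: too little coverage counts as unstable
        seen.add(nearest(profile, centers))
        if len(seen) > 1:
            return False  # assignment changed between subsamples
    return True
```

A cell whose SNP profile sits between two embryo profiles, as expected for a multiplet, flips between embryos across subsamples and is discarded.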
Embryo sex was determined based on the expression of the following genes: Xist (ENSMUSG00000086503) to count XX contexts and Erdr1 (ENSMUSG00000096768), Ddx3y (ENSMUSG00000069045) and Eif2s3y (ENSMUSG00000069049) to reliably detect transcription from the Y chromosome. The Cell Ranger gene barcode matrices were used to obtain per cell expression counts for these 4 genes and determine the fraction of positive cells per embryo. Embryos with a high percentage of Xist expressing cells were determined to be female while embryos with higher fractions of Erdr1, Ddx3y or Eif2s3y were determined to be male (Supplementary Table 1).
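As an illustration, the per-embryo call can be written as a comparison of positive-cell fractions. The text does not state numeric thresholds, so the decision rule below (female when the Xist fraction exceeds the highest Y-linked fraction) is an assumption, as are the function names and the per-cell dictionary layout.

```python
Y_GENES = ("Erdr1", "Ddx3y", "Eif2s3y")  # marker genes named in the text


def fraction_positive(cells, gene):
    """cells: list of {gene: count} dicts, one per cell of an embryo."""
    return sum(1 for c in cells if c.get(gene, 0) > 0) / len(cells)


def call_sex(cells):
    """Assumed rule: compare the Xist fraction with the best Y-linked fraction."""
    xist = fraction_positive(cells, "Xist")
    y_linked = max(fraction_positive(cells, g) for g in Y_GENES)
    return "female" if xist > y_linked else "male"
```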
Genotyping - Cluster determination
Cluster determination was split into four main parts and was largely done using the R package Seurat with default settings49. The establishment of the WT reference cell states was published previously using Cell Ranger version 2 processed data16. In brief, (1) a preliminary set of clusters was generated by agnostically clustering WT embryos of the same stage as a pool without taking replicate identity into account, followed by generating per-replicate clusters according to this assignment. Then, (2) replicate embryo clusters from step 1 were used to generate median expression vectors and clustered across time points to obtain preliminary cell states. Next, (3) all WT cells were assigned to their most similar cluster by Euclidean distance according to a reduced set of 712 marker genes to determine the specific cell state kernel. Finally, (4) all WT and KO embryo cells were assigned to their most similar cluster by Euclidean distance according to a reduced set of 706 marker genes to determine their specific cell state identity after reprocessing with Cell Ranger version 3.
(1) Embryo specific centers (WT): All de-convoluted wild type single cells of the same developmental stage were processed together after discarding cells that were not confidently assigned to a genotype/embryo. Parameters were adopted from the Seurat manual. The expression data were log-normalized, scaled to 10,000 and UMI biases were removed (vars.to.regress = "nUMI"), followed by calling of variable genes (parameters: mean.function = ExpMean, dispersion.function = LogVMR, x.low.cutoff = 0.0125, x.high.cutoff = 3, y.cutoff = 0.5). Next, the variable genes were used to run the PCA and the first 20 PCs were used for cluster detection. The average expression for each embryo and cluster was calculated, which we refer to as “embryo-specific centers.” This allowed us to detect even rare cell states while preserving embryo-specific variability.
(2) Cell cluster (WT): The embryo specific centers of all WT stages were combined into one analysis to determine variable genes. A PCA was run based on the variable genes and the first 20 PCs were used to cluster the embryo specific centers (parameters adjusted for low ‘cell’ number: k.param = 8, k.scale = 50, prune.SNN = 1/10). This resulted in 42 clusters of embryo-specific centers and the median expression profile of each cluster was calculated to form preliminary cell states. Then, as a temporary step, all WT cells from all stages were simultaneously assigned to their closest preliminary cell state based on expression similarity (Euclidean distance of log-expression values of variable genes calculated above) to calculate a gene expression average (kernel).
At this stage, we observed that the number of variable genes was unevenly distributed across preliminary cell states, which created biases when comparing single cells across them (clusters defined by a greater number of variable genes have more opportunities to match sparse single-cell measurements, while those defined by fewer variable genes accumulate more noise by including them). We therefore sought to normalize the number of state-specific genes that contribute to each cluster by using the top 30 marker genes (highest difference in fraction of positive cells within the cluster versus other clusters) from each of the 42 cell states. We found that this reduced gene set provides a more stable, lower-noise assignment without biasing the information to describe each cell state (n = 712 unique genes, Extended Data Fig. 1e) and used this set of genes in (3).
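The marker selection can be sketched as ranking genes by the difference in the fraction of positive cells inside versus outside a cluster and taking the union of the top genes across states. The function names, the dictionary layout, and the alphabetical tie-breaking are assumptions for illustration.

```python
def top_markers(frac_in, frac_out, n=30):
    """frac_in / frac_out: {gene: fraction of positive cells} within the
    cluster and across all other clusters. Returns the n genes with the
    largest difference (ties broken alphabetically, an assumption)."""
    diff = {g: frac_in[g] - frac_out.get(g, 0.0) for g in frac_in}
    return sorted(diff, key=lambda g: (-diff[g], g))[:n]


def marker_union(per_state, n=30):
    """per_state: {state: (frac_in, frac_out)}. Returns the sorted union of
    the per-state marker lists; genes shared between states count once."""
    genes = set()
    for frac_in, frac_out in per_state.values():
        genes.update(top_markers(frac_in, frac_out, n))
    return sorted(genes)
```

Because markers can be shared between states, the union is smaller than 42 × 30 = 1,260 genes, consistent with the 712 unique genes reported.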
(3) Refinement of WT reference cell states: WT cells were assigned to cell state expression profiles (kernels) based on the Euclidean distance of their log-expression values for the 712 marker genes. Single-cell distances are significantly smaller to their matched cell states than to next-best matches. Cell Ranger version 3.0 was released by 10x Genomics during the preparation of this manuscript. Thus, raw data were reprocessed and the cell state kernels were adjusted by again assigning the WT cells to the kernels.
(4) Cell states of single cells: The WT and KO embryo cells were assigned to the cell states based on the Euclidean distance of their log-expression values for the now 706 marker genes (Cell Ranger v3 adjusted). Single-cell distances are significantly smaller to their matched cell states than to next-best matches (Extended Data Figs. 1f, 2e, and 6a). Cell states with an insufficient number of cells from KO embryos (≤30 cells) were discarded from further analysis.
We believe our experimental strategy should largely account for differences in embryo genotype by sampling multiple siblings: each allele will only be heterozygous for the castaneus background in 50% of embryos, our trends are generally observed across all replicates, and the processes of gastrulation are highly conserved. Nonetheless, we cross-referenced our 712 marker genes against those with reported castaneus expression biases across 23 adult and embryonic tissues, including those from all three germ layers, the extraembryonic ectoderm, and the extraembryonic endoderm50. Of the 1,530 genes that show biased expression in at least 10% of these tissues within an F1 context (with reciprocal crosses to control for potential imprinting), only 53 were also marker genes (0–7 per cell state, median 2). Furthermore, we saw that all cell states comprised cells from several embryos and never resulted from a single embryo.
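Step (4) can be sketched as follows: each cell is matched to its closest kernel, the distance to the next-best kernel is kept for comparison, and states supported by 30 or fewer KO cells are dropped. Names and data layout are hypothetical; the profiles stand in for log-expression vectors over the marker genes.

```python
import math
from collections import Counter

MIN_KO_CELLS = 30   # states with <= 30 KO cells are discarded (from the text)


def two_nearest(profile, kernels):
    """kernels: {state: expression profile}, at least two states.
    Returns (best_state, best_distance, next_best_distance)."""
    ranked = sorted(
        (math.sqrt(sum((x - y) ** 2 for x, y in zip(profile, k))), state)
        for state, k in kernels.items()
    )
    (d1, s1), (d2, _) = ranked[0], ranked[1]
    return s1, d1, d2


def retained_states(ko_profiles, kernels):
    """Cell states supported by more than MIN_KO_CELLS KO cells."""
    counts = Counter(two_nearest(p, kernels)[0] for p in ko_profiles)
    return {state for state, n in counts.items() if n > MIN_KO_CELLS}
```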
Cell states prevalence
Prevalence of cell states with respect to embryo stage (Extended Data Fig. 1h) was evaluated by normalizing each state across the developmental stages (row).
Cell state proportions
Cell state proportions per embryo were calculated as the number of cells assigned to a cell state divided by the total number of cells comprising an embryo. The stage specific median embryo was calculated as the median proportion of cell state fractions of all embryos from the same developmental stage (applied after our delay adjusted assignment, see below). Proportion changes in Fig. 2c were calculated as the log2-fold change between the mean proportions of developmentally stage-matched KO and WT embryos.
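A minimal sketch of the proportion and fold-change calculation follows. The function names and input layout (one list of cell state labels per embryo) are assumptions, as is returning None when a state is absent from one condition, since the text does not say how zero proportions were handled.

```python
import math
from collections import Counter


def proportions(cell_states):
    """Fraction of an embryo's cells assigned to each cell state."""
    total = len(cell_states)
    return {state: n / total for state, n in Counter(cell_states).items()}


def log2_proportion_change(ko_embryos, wt_embryos, state):
    """log2 of mean KO proportion over mean WT proportion for one state."""
    ko_mean = sum(proportions(e).get(state, 0.0) for e in ko_embryos) / len(ko_embryos)
    wt_mean = sum(proportions(e).get(state, 0.0) for e in wt_embryos) / len(wt_embryos)
    if ko_mean == 0 or wt_mean == 0:
        return None  # assumption: undefined when the state is missing
    return math.log2(ko_mean / wt_mean)
```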
Correlation of gene expression
Gene expression profiles were compared between either two different cell states or between the WT and Eed KO by correlating the average gene expression profiles of the marker genes (R function cor, Pearson correlation).
Differential expression
We called differentially expressed genes between WT and KO experiments for every detectable cell state. To account for changes in 10x Genomics chemistry versions and possible batch effects, we ran the removeBatchEffect function of the limma package per cell state across all samples51. For comparisons across all embryos, we normalized our percent positive cells data with the same function for each state individually. The resulting normalized read count data were used for differential gene expression of the KO vs WT cells. A gene was called differentially expressed within a cell state between WT and KO if it fulfilled the following criteria: (1) adjusted P-value of < 0.05, (2) minimum detectable fraction of 0.05 within at least one condition (WT or KO) and (3) either a minimum difference of 0.1 in percent positive cells or a minimum absolute log2 fold-change of 0.2. Sex chromosomal genes were excluded from further analysis, as well as the PGC cell state as it is not highly observed across many of the KOs that proceed to later developmental stages.
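The three criteria combine into a single predicate, sketched below. The function and argument names are hypothetical, and whether the cutoffs are inclusive is an assumption; the threshold values come from the text.

```python
def is_differential(adj_p, frac_wt, frac_ko, log2fc):
    """adj_p: adjusted P-value; frac_wt / frac_ko: fraction of positive
    cells in WT and KO for this cell state; log2fc: log2 fold-change."""
    significant = adj_p < 0.05                        # criterion (1)
    detectable = max(frac_wt, frac_ko) >= 0.05        # criterion (2)
    large_effect = (abs(frac_ko - frac_wt) >= 0.1     # criterion (3)
                    or abs(log2fc) >= 0.2)
    return significant and detectable and large_effect
```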
We assigned genes as recurrently deregulated if they were differentially expressed in at least two cell states within the extraembryonic derived lineages (Xendo, extraembryonic endoderm; Xecto, extraembryonic ectoderm) or the embryonically derived lineages (Epiblast; Xmeso, extraembryonic mesoderm; Eendo, embryonic endoderm; Eecto, embryonic ectoderm; Emeso, embryonic mesoderm) and were predominantly up- or downregulated (Supplementary Table 6).
Pathway enrichment for the recurrently differentially expressed genes was performed by a hypergeometric test using the GSEA online tool. The P-value was adjusted for multiple testing according to Benjamini and Hochberg, with 0.05 as a cutoff (Supplementary Table 7).
Stage matching metric to assign “developmental stage”
The gestational age of all KO embryos was adjusted for developmental delay by comparing cell state data to the median of the WT embryos from each time point. Because some states may be more informative about developmental stage than others, we performed two distinct principal component analyses (PCA) using the WT replicate data: (1) using the cell state proportions and (2) using the binary information on presence and absence of a cell state. For the cell state proportion assignments, only the embryonic cell states were used (Emeso, Eendo, Eecto, Epiblast, PGCs and Xmeso), since Xendo and Xecto cell state proportions are more sensitive to embryo dissection and single cell dissociation. The R function prcomp (parameters: retx = TRUE, center = TRUE, scale = TRUE) was used to calculate PCs for WT embryos and WT medians and the predict function transformed KO data according to the WT loadings (Extended Data Fig. 3b–d, Supplementary Tables 1 and 5). The first PCs of both PCAs were used to assign each KO embryo to its closest median WT by Euclidean distance.
H3K27me3 ChIP-seq data
Publicly available H3K27me3 ChIP-seq data of E6.5 epiblast and extraembryonic ectoderm52 were used to calculate the average H3K27me3 occupancy of each gene’s promoter region (calculated as the region 1500 bp upstream to 500 bp downstream of the TSS). Only the first two replicates were used for each, since these two replicates showed a similar trend when compared to WT gene expression of our epiblast or Xecto cell states, while the third replicate did not show any linear relation to gene expression. For both data sets, a cutoff of 400 (average H3K27me3 peak level) showed the strongest drop in gene expression and thus most likely represents functional repression by H3K27me3. The binary assignment of having a promoter H3K27me3 peak was therefore set at this threshold.
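For illustration, the promoter window and the binary peak call can be written as below. Strand handling, the clipping at coordinate 0, and treating the cutoff as inclusive are assumptions; the window sizes and the cutoff of 400 come from the text.

```python
def promoter_window(tss, strand, upstream=1500, downstream=500):
    """Return (start, end) of the promoter; upstream follows the strand."""
    if strand == "+":
        return max(tss - upstream, 0), tss + downstream
    return max(tss - downstream, 0), tss + upstream


def has_promoter_peak(mean_signal, cutoff=400):
    """Binary H3K27me3 call from the average promoter occupancy."""
    return mean_signal >= cutoff
```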
PGC number estimation
The total number of PGCs per WT or Eed KO embryo was estimated using the fraction of recovered state 27 (PGC) cells within that embryo multiplied by its total estimated cell number. The total estimated cell number was calculated by multiplying the fraction of the embryo within the pool by the total number of cells in the single-cell suspension prior to loading (as measured using a hemocytometer, see above). We then applied a correction to account for potential technical biasing of embryonic versus extraembryonic sampling during isolation, though this did not change our estimates substantially. The enrichment was tested using the Wilcoxon test (R function wilcox.test, two-sided). All state 27 counts are given in Supplementary Table 5.
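The estimate reduces to two multiplications, sketched here with hypothetical argument names. The correction for embryonic versus extraembryonic sampling is not specified in enough detail to reproduce and is omitted.

```python
def estimate_pgcs(pgc_cells, embryo_cells, pool_cells, suspension_cells):
    """pgc_cells: recovered state 27 cells of the embryo; embryo_cells: all
    recovered cells of the embryo; pool_cells: all recovered cells of the
    pool; suspension_cells: hemocytometer count before loading."""
    embryo_total = (embryo_cells / pool_cells) * suspension_cells
    return (pgc_cells / embryo_cells) * embryo_total
```

For example, an embryo contributing 2,000 of 10,000 recovered cells from a suspension of 50,000 cells is estimated at 10,000 cells; with 10 recovered PGCs this gives roughly 50 PGCs.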
UMAP projection
Uniform Manifold Approximation and Projection (UMAP) was used as a dimension reduced visualization of single-cell marker gene expression profiles53. Transformation of the WT data was performed using the R function umap and subsequently applied to all KO data to project it onto the same manifold as produced for WT.
RNA velocity
RNA velocity was calculated using the velocyto tool54 and visualized using scanpy55. The previously calculated UMAP was used for velocity projection.
Cut site analysis
Single reads covering the targeted genes were extracted from the initial alignment and were realigned against the intron-free DNA sequence of the respective gene using STAR47 with default settings and “--alignEndsType EndToEnd --outSAMattributes NH HI NM MD.” The aligned reads were next classified with respect to the target site of the sgRNA as (1) “Spliced/deleted” if they did not match any nucleotide but were spanning across the entire target site, (2) “Mismatched” if any of the nucleotides was aligned as a mismatch/deletion/insertion to the reference, (3) “Complete” if all nucleotides matched the target site, or (4) “Insufficient” if the reads did not span the full target site.
Retrotransposon detection
To quantify retrotransposon expression, only reads that do not overlap with gene annotations were considered. In addition, split reads as well as reads containing an extensive poly-A stretch were excluded. A read was defined as covering a poly-A region if (1) the last 70% of bases were mainly A (A stretch with maximal 10 bases C, G, or T) or (2) the first 70% were mainly T (T stretch with maximal 10 bases A, C, or G). The remaining reads were overlapped with annotated repetitive elements (repeat masker file downloaded from UCSC) and reads with a minimum overlap of 90% were considered for further analysis. Reads that mapped uniquely or multiple times to the same repeat family were counted once per family; reads that mapped to different repeat families were excluded. Subsequently, reads were counted per repeat family, embryo, and cell state and then normalized to the full number of considered reads (number of repeat reads plus number of UMIs sequenced).
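One possible reading of the poly-A rule is sketched below: a read is flagged when at most 10 bases in its last 70% are not A, or at most 10 bases in its first 70% are not T. This interpretation, the function name, and the integer rounding of the 70% span are assumptions, and the rule only makes sense for reads much longer than 10 bases.

```python
def is_polya_read(seq, percent=70, max_other=10):
    """Flag reads dominated by a 3' A stretch or a 5' T stretch."""
    if not seq:
        return False
    seq = seq.upper()
    n = len(seq) * percent // 100           # length of the inspected span
    tail, head = seq[len(seq) - n:], seq[:n]
    tail_other = sum(1 for base in tail if base != "A")
    head_other = sum(1 for base in head if base != "T")
    return tail_other <= max_other or head_other <= max_other
```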
WGBS library generation and data processing
E6.5 epiblast and Xecto were isolated from ≥7 embryos, pooled and processed into WGBS libraries using the Accel-NGS Methyl-seq® kit as previously described using ≤9 final PCR cycles24. Reads were aligned to the mouse mm10 reference genome using BSmap with flags -v 0.1 -s 16 -w 100 -S 1 -q 20 -u -R. In order to determine the methylation state of all CpGs captured and assess the bisulfite conversion rate, we used the mcall module in the MOABS software suite with standard parameter settings56. Finally, we converted the resulting CpG level files to bigwig files, filtering out all CpGs that were covered by fewer than ten reads.
For all downstream analysis, replicates were averaged after having applied the coverage cutoff and differentially methylated CpGs/genomic regions were defined by having a minimum difference of 0.1 to the respective WT tissue.
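The coverage cutoff, replicate averaging, and difference threshold can be sketched per CpG as follows. The data layout and names are hypothetical, and treating a CpG with no sufficiently covered replicate as not differential is an assumption.

```python
MIN_COVERAGE = 10      # CpGs covered by fewer than ten reads are dropped
MIN_DIFFERENCE = 0.1   # minimum methylation difference to WT (from the text)


def mean_methylation(replicates):
    """replicates: list of (methylation rate, read coverage) for one CpG.
    Averages the replicates that pass the coverage cutoff."""
    kept = [rate for rate, coverage in replicates if coverage >= MIN_COVERAGE]
    return sum(kept) / len(kept) if kept else None


def is_differentially_methylated(sample, wt):
    """Compare the averaged sample and WT methylation of one CpG."""
    s, w = mean_methylation(sample), mean_methylation(wt)
    if s is None or w is None:
        return False  # assumption: uncovered CpGs are never called
    return abs(s - w) >= MIN_DIFFERENCE
```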
CpG islands were downloaded from the UCSC genome browser, gene annotations were obtained from the GTF file built into the Cell Ranger reference, and promoter regions were defined as 2.5 kb upstream to 500 bp downstream of annotated TSS. Xecto hypermethylated CpG islands were previously defined24.
The CpG density of a genomic region was calculated as the fraction of CpG dinucleotides within a 200 bp window (sliding window with 20 bp offset).
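A sliding-window density calculation might look like the sketch below. The text does not give the denominator, so dividing the CpG count by the number of dinucleotide positions in the window (window length minus one) is an assumption, as is the function name.

```python
def cpg_density(seq, window=200, step=20):
    """CpG fraction for each window along a sequence.
    Returns one value per window start (0, step, 2 * step, ...)."""
    seq = seq.upper()
    densities = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        # "CG" cannot overlap itself, so str.count gives the exact number
        densities.append(chunk.count("CG") / (window - 1))
    return densities
```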
DNA methylation valleys (DMVs) were detected using a 2 kb sliding window (500 bp offset). Regions with an average methylation rate below 0.15 in WT (excluding CpG island methylation) were merged given a maximum distance of 1 kb.
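The window scan and merge can be sketched as below for a single region. The input layout (a list of CpG positions with WT methylation rates), the function name, and skipping windows without CpGs are assumptions; the exclusion of CpG island methylation is not reproduced here.

```python
def find_dmvs(cpgs, region_end, window=2000, step=500,
              max_rate=0.15, max_gap=1000):
    """cpgs: list of (position, methylation rate). Returns merged
    (start, end) intervals whose windows average below max_rate."""
    low = []
    for start in range(0, region_end - window + 1, step):
        rates = [m for pos, m in cpgs if start <= pos < start + window]
        if rates and sum(rates) / len(rates) < max_rate:
            low.append((start, start + window))
    merged = []
    for start, end in low:
        # overlapping windows and windows within max_gap are joined
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```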
Data availability
All datasets have been deposited in the Gene Expression Omnibus and are accessible under GSE137337. Source data behind Figures 1a, b, 2, 3a, b, d–f, h, i, 4a, c–f and Extended Data Figures 1b, c, e–i, 2b–f, 3, 4b–f, 5a–c, e, 6, 7, 8a, 9b, 10b–g, 11c, d are available at https://oc-molgen.gnz.mpg.de/owncloud/s/F8g3y5F79JZRyof. Previously published data used in this study include H3K27me3 ChIP-seq data (GSE98149), WGBS data for sperm and oocyte (GSE112320), preimplantation samples, including 8-cell stage embryos and the ICM and trophectoderm (TE) of the E3.5 blastocyst (GSE84236), and late stage samples including an average of somatic tissues and the E14.5 placenta (GSE42836).
Code availability
Code is available at https://github.com/HeleneKretzmer/EpigeneticRegulators_MouseGastrulation.
Acknowledgements
We thank Adriano Bolondi, Raha Weigert and other members of the Meissner laboratory, Michelle Chan and Denes Hnisz for discussions and advice, Sabine Otto for experimental support characterizing the Eed KO mESC line, Maria Walter for support with embryo isolations, Tobias Ahsendorf for help with initial efforts to optimize our genotyping pipeline, and Daniel Andergassen for discussions on SNP-typing. We are also grateful to Frederick Koch and the transgenic facility, including Miriam Peetz for their feedback and support. We thank Prof. Mitinori Saitou for the mVenus Prdm14 promoter sperm that were provided by the RIKEN BRC through the National Bio-Resource Project of the MEXT/AMED, Japan (Acc. No. CDB0461T). Funding: This work was funded by the NIH (1P50HG006193, P01GM099117, 1R01HD078679 and 1DP3K111898) and the Max Planck Society.
Footnotes
Competing interests
The authors declare no competing interests.
References
- 1.Hemberger M, Dean W & Reik W Epigenetic dynamics of stem cells and cell lineage commitment: digging Waddington’s canal. Nature Publishing Group 10, 526–537 (2009). [DOI] [PubMed] [Google Scholar]
- 2.Meissner A Epigenetic modifications in pluripotent and differentiated cells. Nat. Biotechnol 28, 1079–1088 (2010). [DOI] [PubMed] [Google Scholar]
- 3.Surani MA, Hayashi K & Hajkova P Genetic and epigenetic regulators of pluripotency. Cell 128, 747–762 (2007). [DOI] [PubMed] [Google Scholar]
- 4.Rivera-Pérez JA & Hadjantonakis A-K The Dynamics of Morphogenesis in the Early Mouse Embryo. Cold Spring Harb Perspect Biol 7, a015867 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Plass M et al. Cell type atlas and lineage tree of a whole complex animal by single-cell transcriptomics. Science 360, eaaq1723 (2018). [DOI] [PubMed] [Google Scholar]
- 6.Briggs JA et al. The dynamics of gene expression in vertebrate embryogenesis at single-cell resolution. Science 360, eaar5780 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Wagner DE et al. Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo. Science 360, 981–987 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Farrell JA et al. Single-cell reconstruction of developmental trajectories during zebrafish embryogenesis. Science 108, eaar3131 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Scialdone A et al. Resolving early mesoderm diversification through single-cell expression profiling. Nature 535, 289–293 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Peng G et al. Spatial Transcriptome for the Molecular Annotation of Lineage Fates and Cell Identity in Mid-gastrula Mouse Embryo. Developmental Cell 36, 681–697 (2016). [DOI] [PubMed] [Google Scholar]
- 11.Mohammed H et al. Single-Cell Landscape of Transcriptional Heterogeneity and Cell Fate Decisions during Mouse Early Gastrulation. Cell Rep 20, 1215–1228 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Ibarra-Soria X et al. Defining murine organogenesis at single-cell resolution reveals a role for the leukotriene pathway in regulating blood progenitor formation. Nat Cell Biol 58, 598 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Peng G et al. Molecular architecture of lineage allocation and tissue organization in early mouse embryo. Nature 572, 528–532 (2019). [DOI] [PubMed] [Google Scholar]
- 14.Pijuan-Sala B et al. A single-cell molecular map of mouse gastrulation and early organogenesis. Nature 566, 490 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Nowotschin S et al. The emergent landscape of the mouse gut endoderm at single-cell resolution. Nature 569, 361–367 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Chan MM et al. Molecular recording of mammalian embryogenesis. Nature 570, 77–82 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Cao J et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Argelaguet R et al. Multi-omics profiling of mouse gastrulation at single-cell resolution. Nature 576, 487–491 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Nguyen S, Meletis K, Fu D, Jhaveri S & Jaenisch R Ablation of de novo DNA methyltransferase Dnmt3a in the nervous system leads to neuromuscular defects and shortened lifespan. Dev. Dyn 236, 1663–1676 (2007). [DOI] [PubMed] [Google Scholar]
- 20.Laugesen A & Helin K Chromatin repressive complexes in stem cells, development, and cancer. Cell Stem Cell 14, 735–751 (2014). [DOI] [PubMed] [Google Scholar]
- 21.Piunti A & Shilatifard A Epigenetic balance of gene expression by Polycomb and COMPASS families. Science 352, aad9780 (2016). [DOI] [PubMed] [Google Scholar]
- 22.Glaser S et al. Multiple epigenetic maintenance factors implicated by the loss of Mll2 in mouse development. Development 133, 1423–1432 (2006). [DOI] [PubMed] [Google Scholar]
- 23.Rossant J, Chazaud C & Yamanaka Y Lineage allocation and asymmetries in the early mouse embryo. Philos. Trans. R. Soc. Lond., B, Biol. Sci 358, 1341–8– discussion 1349 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Smith ZD et al. Epigenetic restriction of extraembryonic lineages mirrors the somatic transition to cancer. 1–29 (2017). doi: 10.1038/nature23891 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Gil J & Peters G Regulation of the INK4b-ARF-INK4a tumour suppressor locus: all for one or one for all. Nat. Rev. Mol. Cell Biol 7, 667–677 (2006). [DOI] [PubMed] [Google Scholar]
- 26.Boulard M, Edwards JR & Bestor TH FBXL10 protects Polycomb-bound genes from hypermethylation. Nat Genet 47, 479–485 (2015). [DOI] [PubMed] [Google Scholar]
- 27.Walsh CP, Chaillet JR & Bestor TH Transcription of IAP endogenous retroviruses is constrained by cytosine methylation. Nat Genet 20, 116–117 (1998). [DOI] [PubMed] [Google Scholar]
- 28.Qin J et al. The polycomb group protein L3mbtl2 assembles an atypical PRC1-family complex that is essential in pluripotent stem cells and early development. Cell Stem Cell 11, 319–332 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Faust C, Schumacher A, Holdener B & Magnuson T The eed mutation disrupts anterior mesoderm production in mice. Development 121, 273–285 (1995). [DOI] [PubMed] [Google Scholar]
- 30.Voncken JW et al. Rnf2 (Ring1b) deficiency causes gastrulation arrest and cell cycle inhibition. Proc Natl Acad Sci USA 100, 2468–2473 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Yamaji M et al. Critical function of Prdm14 for the establishment of the germ cell lineage in mice. Nat Genet 40, 1016–1022 (2008). [DOI] [PubMed] [Google Scholar]
- 32.Żylicz JJ et al. The Implication of Early Chromatin Changes in X Chromosome Inactivation. Cell 176, 182–197.e23 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Wang J et al. Imprinted X inactivation maintained by a mouse Polycomb group gene. Nat Genet 28, 371–375 (2001). [DOI] [PubMed] [Google Scholar]
- 34.Li Y et al. Genome-wide analyses reveal a role of Polycomb in promoting hypomethylation of DNA methylation valleys. Genome Biology 19, 18–16 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Leitch HG & Smith A The mammalian germline as a pluripotency cycle. Development 140, 2495–2501 (2013). [DOI] [PubMed] [Google Scholar]
- 36.Forlani S, Lawson KA & Deschamps J Acquisition of Hox codes during gastrulation and axial elongation in the mouse embryo. Development 130, 3807–3819 (2003). [DOI] [PubMed] [Google Scholar]
- 37.Saitou M Specification of the germ cell lineage in mice. Front Biosci (Landmark Ed) 14, 1068–1087 (2009). [DOI] [PubMed] [Google Scholar]
- 38.Nicetto D et al. H3K9me3-heterochromatin loss at protein-coding genes enables developmental lineage specification. Science 363, 294–297 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Tzouanacou E, Wegener A, Wymeersch FJ, Wilson V & Nicolas J-F Redefining the progression of lineage segregations during mammalian embryogenesis by clonal analysis. Developmental Cell 17, 365–376 (2009).
- 40.Wang H et al. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell 153, 910–918 (2013).
- 41.Platt RJ et al. CRISPR-Cas9 knockin mice for genome editing and cancer modeling. Cell 159, 440–455 (2014).
- 42.Montague TG, Cruz JM, Gagnon JA, Church GM & Valen E CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing. Nucleic Acids Res 42, W401–7 (2014).
- 43.Nakagata N Cryopreservation of mouse spermatozoa and in vitro fertilization. Methods Mol Biol 693, 57–73 (2011).
- 44.Ying Q-L & Smith AG Defined conditions for neural commitment and differentiation. Meth Enzymol 365, 327–341 (2003).
- 45.Gu Z, Eils R & Schlesner M Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).
- 46.Hahne F & Ivanek R Visualizing Genomic Data Using Gviz and Bioconductor. Methods Mol Biol 1418, 335–351 (2016).
- 47.Dobin A et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
- 48.Krueger F & Andrews SR SNPsplit: Allele-specific splitting of alignments between genomes with known SNP genotypes. F1000Res 5, 1479 (2016).
- 49.Butler A, Hoffman P, Smibert P, Papalexi E & Satija R Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol 36, 411–420 (2018).
- 50.Andergassen D et al. Mapping the mouse Allelome reveals tissue-specific regulation of allelic expression. eLife 6, e146 (2017).
- 51.Ritchie ME et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43, e47 (2015).
- 52.Wang C et al. Reprogramming of H3K9me3-dependent heterochromatin during mammalian embryo development. Nat Cell Biol 20, 620–631 (2018).
- 53.McInnes L, Healy J & Melville J UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. (2018).
- 54.La Manno G et al. RNA velocity of single cells. Nature 560, 494–498 (2018). doi: 10.1038/s41586-018-0414-6
- 55.Wolf FA, Angerer P & Theis FJ SCANPY: large-scale single-cell gene expression data analysis. Genome Biology 19, 15 (2018).
- 56.Sun D et al. MOABS: model based analysis of bisulfite sequencing data. Genome Biology 15, R38 (2014).
- 57.Keane TM et al. Mouse genomic variation and its effect on phenotypes and gene regulation. Nature 477, 289–294 (2011).
- 58.Lei H et al. De novo DNA cytosine methyltransferase activities in mouse embryonic stem cells. Development 122, 3195–3205 (1996).
- 59.Okano M, Bell DW, Haber DA & Li E DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell 99, 247–257 (1999).
- 60.Tachibana M et al. G9a histone methyltransferase plays a dominant role in euchromatic histone H3 lysine 9 methylation and is essential for early embryogenesis. Genes & Development 16, 1779–1791 (2002).
- 61.Yu BD, Hess JL, Horning SE, Brown GA & Korsmeyer SJ Altered Hox expression and segmental identity in Mll-mutant mice. Nature 378, 505–508 (1995).
- 62.Hammoud SS et al. Chromatin and transcription transitions of mammalian adult germline stem cells and spermatogenesis. Cell Stem Cell 15, 239–253 (2014).
- 63.Smallwood SA et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat Methods 11, 817–820 (2014).
- 64.Nashun B et al. Continuous Histone Replacement by Hira Is Essential for Normal Transcriptional Regulation and De Novo DNA Methylation during Mouse Oogenesis. Mol Cell 60, 611–625 (2015).
- 65.Hon GC et al. Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat Genet 45, 1198–1206 (2013).
- 66.Auclair G, Guibert S, Bender A & Weber M Ontogeny of CpG island methylation and specificity of DNMT3 methyltransferases during embryonic development in the mouse. Genome Biology 15, 545 (2014).
- 67.Wymeersch FJ et al. Position-dependent plasticity of distinct progenitor types in the primitive streak. eLife 5, (2016).
- 68.Niswander L, Yee D, Rinchik EM, Russell LB & Magnuson T The albino deletion complex and early postimplantation survival in the mouse. Development 102, 45–53 (1988).
- 69.Faust C, Lawson KA, Schork NJ, Thiel B & Magnuson T The Polycomb-group gene eed is required for normal morphogenetic movements during gastrulation in the mouse embryo. Development 125, 4495–4506 (1998).
- 70.Kalantry S & Magnuson T The Polycomb group protein EED is dispensable for the initiation of random X-chromosome inactivation. PLoS Genet 2, e66 (2006).
- 71.Niswander L, Yee D, Rinchik EM, Russell LB & Magnuson T The albino-deletion complex in the mouse defines genes necessary for development of embryonic and extraembryonic ectoderm. Development 105, 175–182 (1989).
- 72.Han J et al. Tbx3 improves the germ-line competency of induced pluripotent stem cells. Nature 463, 1096–1100 (2010).
- 73.Magnúsdóttir E & Surani MA How to make a primordial germ cell. Development 141, 245–252 (2014).
- 74.Arnold SJ & Robertson EJ Making a commitment: cell lineage allocation and axis patterning in the early mouse embryo. Nat. Rev. Mol. Cell Biol 10, 91–103 (2009).
- 75.Semrau S et al. Dynamics of lineage commitment revealed by single-cell transcriptomics of differentiating embryonic stem cells. Nat Commun 8, 1096 (2017).
Data Availability Statement
All datasets have been deposited in the Gene Expression Omnibus and are accessible under accession GSE137337. Source data behind Figures 1a, b, 2, 3a, b, d–f, h, i, 4a, c–f and Extended Data Figures 1b, c, e–i, 2b–f, 3, 4b–f, 5a–c, e, 6, 7, 8a, 9b, 10b–g, 11c, d are available at https://oc-molgen.gnz.mpg.de/owncloud/s/F8g3y5F79JZRyof. Previously published data used in this study include H3K27me ChIP-seq data (GSE98149), WGBS data for sperm and oocyte (GSE112320), preimplantation samples, including 8-cell-stage embryos and the inner cell mass (ICM) and trophectoderm (TE) of the E3.5 blastocyst (GSE84236), and late-stage samples, including an average of somatic tissues and the E14.5 placenta (GSE42836).