Cell. 2021 May 27;184(11):2825–2842.e22. doi: 10.1016/j.cell.2021.04.004

A single-embryo, single-cell time-resolved model for mouse gastrulation

Markus Mittnenzweig 1,5, Yoav Mayshar 2,5, Saifeng Cheng 2, Raz Ben-Yair 2, Ron Hadas 2, Yoach Rais 2, Elad Chomsky 1, Netta Reines 2, Anna Uzonyi 2,3, Lior Lumerman 2,4, Aviezer Lifshitz 1, Zohar Mukamel 1, Ayelet-Hashahar Orenbuch 2, Amos Tanay 1,∗, Yonatan Stelzer 2,6,∗∗
PMCID: PMC8162424  PMID: 33932341

Summary

Mouse embryonic development is a canonical model system for studying mammalian cell fate acquisition. Recently, single-cell atlases comprehensively charted embryonic transcriptional landscapes, yet inference of the coordinated dynamics of cells over such atlases remains challenging. Here, we introduce a temporal model for mouse gastrulation, consisting of data from 153 individually sampled embryos spanning 36 h of molecular diversification. Using algorithms and precise timing, we infer differentiation flows and lineage specification dynamics over the embryonic transcriptional manifold. Rapid transcriptional bifurcations characterize the commitment of early specialized node and blood cells. However, for most lineages, we observe combinatorial multi-furcation dynamics rather than hierarchical transcriptional transitions. In the mesoderm, dozens of transcription factors combinatorially regulate multifurcations, as we exemplify using time-matched chimeric embryos of Foxc1/Foxc2 mutants. Our study rejects the notion of differentiation being governed by a series of binary choices, providing an alternative quantitative model for cell fate acquisition.

Keywords: scRNA-seq, developmental biology, mouse gastrulation, network flow model, trajectory inference, cell fate decisions, tetraploid complementation assay, chimera assay


Highlights

  • Single-embryo scRNA-seq synthesis of morphology and transcriptional staging

  • A network flow model infers differentiation of embryonic cell ensembles

  • Gastrulation is dominated by progenitor states that continuously multi-furcate

  • Single-embryo chimeras control functional studies of TFs over time and lineage


A time-resolved, high-resolution model charts differentiation dynamics during mouse gastrulation and reveals that for most lineages, cell fates arise from multi-furcation events rather than classical tree-like bifurcation.

Introduction

Early embryonic development in mammals proceeds through robust acquisition of specialized cellular properties by individual cells that specify the basic embryonic lineages, which, in turn, form the first organs. Decades of research using model organisms have provided tremendous insight into the activity of genes and pathways driving this process and ensuring its robustness (Arnold and Robertson, 2009; Bedzhov et al., 2014; Ebisuya and Briscoe, 2018; Tam and Ho, 2020). This had far-reaching implications for other fields such as cancer and regenerative medicine (Ben-Porath et al., 2008; Spagnoli and Hemmati-Brivanlou, 2006). Advances in single-cell technologies recently enabled high-resolution charting of mouse embryonic development, mapping comprehensively the transcriptional states of developing embryos during gastrulation (Argelaguet et al., 2019; Chan et al., 2019; Cheng et al., 2019; Grosswendt et al., 2020; Ibarra-Soria et al., 2018; Lescroart et al., 2018; Mohammed et al., 2017; Nowotschin et al., 2019; Pijuan-Sala et al., 2019; Scialdone et al., 2016) and organogenesis (Cao et al., 2019; Chan et al., 2019; de Soysa et al., 2019; Han et al., 2018; Nowotschin et al., 2019; Tam and Ho, 2020). Yet, moving from atlases toward the inference of quantitative dynamics during cell state transitions represents a major open challenge in the field. Pseudotemporal ordering defines the progression of cell trajectories based on computational inference of transcriptional similarities (Tanay and Regev, 2017). However, the rapid nature of development, characterized by a continuous flux of intracellular molecular changes within a short time frame, poses unique challenges for this approach (Tritschler et al., 2019) (e.g., when bifurcation points are fuzzy or when distinct differentiating populations converge by activating common gene modules in a non-hierarchical manner). 
Similarly, pseudo-temporal ordering is difficult when differentiation to a specific cell type occurs over a wide range of time during development. Alternative approaches that aim at merging single-cell analysis with lineage relationships are rapidly evolving, using the estimation of transcriptional derivatives (RNA velocity) (La Manno et al., 2018), time series experiments (Fischer et al., 2019; Schiebinger et al., 2019), or by simultaneous direct recording of cell histories, with promising but still partial coverage (Bowling et al., 2020; Chan et al., 2019; Kalhor et al., 2018; Raj et al., 2018; Spanjaard et al., 2018). In principle, atlases can be constructed from samples that are staged using traditional methods (Cao et al., 2019; Pijuan-Sala et al., 2019), thus facilitating coarse-grained temporal modeling of inferred transcriptional states. However, refining such classical staging and augmenting it with quantitative kinetic modeling of transcriptional regulation remains to be addressed.

Precise staging of developmental progression is critical in order to describe sequences of developmental events in a consistent and comparable fashion, in particular when studying multiple strains, genetic perturbations, and other interventions. In the mouse, timing of development using “days postcoitum” (dpc) provides only a nominal estimate of progression for each embryo. This is because litters are often highly heterogeneous, frequently constituting embryos from several morphologically distinct embryonic stages (Figure 1A) (Downs and Davies, 1993). Another unique challenge of studying viviparous development is that up to the point of extraction, the embryo is largely concealed. To address this, methods for staging embryos using morphological features, such as those proposed by Theiler (1989) and later elaborated on by Downs and Davies (1993), are still considered the gold-standard in the field. However, morphology-based approaches are limited by the resolution provided by the number of such distinguishable stages and the qualitative nature of such classification.

Figure 1.


Resolving time and gene expression in single cells and single embryos

(A) Natural variation among litters and littermates (left panel) is harnessed to generate a temporal transcriptional model of embryonic development. Shown is a schematic representation of the workflow. Image of a single litter isolated at 7.75 dpc. Scale bar, 200 μm. LS, late streak; OB, no bud; EB, early bud; LB, late bud.

(B) Embryo-embryo similarity matrix ranking the embryos according to their intrinsic transcriptional makeup.

(C) Comparison between intrinsic transcriptional rank and ranking based on morphological feature analysis.

(D) Comparison of intrinsic versus external ordering using projection on a reference atlas (Pijuan-Sala et al., 2019).

(E) Comparison between transcriptional rank and size estimate of embryo cross-section (log2, μm², fitted line using smooth spline interpolation with λ = 7.1 × 10⁻³).

(F) Projection of cells following binning of embryos into 13 time groups, over 2D MC projections, separately for each of the three embryonic germ layers (epiblast is marked by dashed ovals).

(G) Metacells (MCs) represent a transcriptional state shared by cells from numerous embryos, spanning a time range. Right panel: distribution of embryo rank (columns) for cells included in each MC (each represented by a single row). Rows are normalized to reflect the temporal heterogeneity of each MC. Left panel: highlighted rank distributions of individual MCs from different lineages, and the expression of selected marker genes from each MC (bar plots; log2, relative to mean expression over all MCs).

(H) Four representative embryos are shown alongside the time distribution of their constituent cells (calculated from the mean age of each MC). Scale bar, 100 μm. #, embryo rank; in brackets, calculated time (Et).

See also Figure S1 and Table S1.

In the current study, we wished to move further toward an ideal representation of mammalian embryonic development as a continuous process. We reasoned that single-cell RNA sequencing (scRNA-seq) of individual embryos, in correlation with morphological measures obtained from microscopy, will allow placing the embryos on a continuum of transcriptional transformation. Assigning a singular developmental time to cells originating from the same embryo provided a timestamp to each cell in our dataset. This enabled the construction of a flow model (Ahuja et al., 1993) for the transition of cells on the transcriptional manifold. Flow analysis distinguished early specialization of few embryonic cell types, identified their stepwise acquisition of transcriptional identity and linked it with the hierarchical activity of fate-specific transcription factors (TFs). However, for most of the cells in the embryo, the data suggest complex transcriptional dynamics in multipotent progenitor states and complex combinatorial activity of TFs that drive multi-furcation rather than the classical tree-like bifurcation. The temporal atlas and associated flow model form the foundations for elucidating intrinsic and extrinsic effects shaping cell fate decisions in vivo. This is illustrated by studying mesoderm gene regulation by Foxc1 and Foxc2, using single-cell chimera knockout experiments with tight single-embryo time control.

Results

Quantification of morphology and scRNA-seq data from 153 E6.5–E8.25 embryos

To generate a continuous fate map of mouse gastrulation up to somitogenesis, we performed scRNA-seq on dissociated cells from individual embryos representing early post-implantation up to pre-somitic stages, corresponding to approximately embryonic day (E) 6.5–8.25 (Figure 1A; Table S1). Each embryo was handled and imaged separately for subsequent assessment of morphological features and size measurements. To focus on tissues that contribute to the embryo proper, the ectoplacental cone, parietal endoderm, and much of the extraembryonic ectoderm were removed prior to dissociation. To allow sequencing of single cells separately from each embryo, we used MARS-seq (Jaitin et al., 2014), in which single cells are index-sorted with a flow cytometer directly into multi-well plates, where they are barcoded prior to pooling. In this manner, 33,700 cells from 153 embryos were sequenced at a median coverage of 4,100 unique molecular identifiers (UMIs) per cell (Figure S1A). A sufficient representation of the various developmental stages was ensured by morphological assessment. Due to the inherent sparsity of scRNA-seq data, metacell analysis (Baran et al., 2019) was applied to the entire dataset; this approach aggregates transcriptionally similar single cells into more cohesive and distinct transcriptional states termed metacells (MCs). In this manner, we robustly mapped all transcriptional states and generated their similarity graph.

Figure S1.


Sampling single embryos, related to Figures 1 and 2

(A) Distribution of unique reads (UMIs) per cell in the manifold; cells with fewer than 1,000 UMIs were filtered out of the analysis (left panel). Number of cells from each embryo represented in the manifold, sorted according to intrinsic rank (middle panel). Sampled number of cells per age group relative to the estimated number of cells (Figure 2B) (right panel).

(B) Pictures of embryos sampled in the study, ranked and sorted according to morphology. Images are uniformly scaled; scale bar, 100 μm.

(C) Classification of embryos according to parental strain used in this study (ICR and C57BL/6), demonstrating coordinated correlation between morphology and transcriptional ranks between strains.

(D) Sexing of individual embryos according to the expression of Xist and Y-chromosome genes. UMI counts are normalized by the total number of UMIs per embryo.

(E) Age group allocation of embryos: Embryos are partitioned into 13 age groups. Intrinsic ranks are translated into a developmental timescale using transcription and size measurements of embryos (Figure 1E).

(F) Density plot of single-cell cell-cycle scores calculated from combined M- and S-phase UMI counts. The cutoff line marks cells with a low cell-cycle score. These cells are highlighted in the 2D projection in Figure 2C. Black dots represent endoderm cell types (definitive endoderm, foregut, node/notochord, VE, ExEn) that are enriched for low-score cells.

(G) Classification of embryonic and extraembryonic endoderm (ExEn) in the flow model. All MCs above the cutoff line in the left panel (Ttr versus Apoe expression) were classified as ExEn. All MCs not classified as ExEn that exhibit high Foxa2 and Foxa1 expression were assigned to the embryonic endoderm group. For both groups, growth loss constants λ_t > 0 were used in the flow model (see I).

(H) Boxplot distribution of single-cell cell-cycle score according to cell type.

(I) Estimation of growth loss constants λ_t for embryonic endoderm and ExEn/VE. Shown is the fraction of these cells over time (log scale). Lines correspond to growth loss constants λ_t = 0.17 for ExEn/VE and λ_t = 0.12 for embryonic endoderm, as used in the network flow model.

Projection of single embryos over an absolute developmental timescale

To place the embryos on a sequential time axis, we first ranked them using morphology alone (Figure S1B). We then developed a strategy for ranking embryos by k-nearest-neighbor (k-NN) similarities among single-cell profiles (Figure 1B). This resulted in remarkably high agreement between the two independent ranking schemes (ρ = 0.97) (Figures 1C and S1C). For external validation, we projected cells from each embryo onto a recently published mouse single-cell gastrulation atlas (Pijuan-Sala et al., 2019) and re-timed our single cells using this reference atlas consisting of pooled embryos from 6-h time intervals. We averaged the external cell timing for each embryo, deriving again a high degree of concordance with our internal ranking strategy (ρ = 0.98) (Figure 1D). Finally, comparison of embryo ranking with an estimate of each embryo’s physical size (approximated by its cross-sectional area under the microscope) provided alignment of our ranks to an absolute size scale (ρ = 0.91) (Figure 1E). Single-embryo sampling also facilitated controlled analysis of male and female embryos (sexed based on expression of Xist and Y-chromosome genes) (Figure S1D), demonstrating comparable size and transcriptional distributions between the sexes at these stages (Figures 1C–1E, red versus blue dots). For older embryos, we observed more precise ordering, consistent with the less accurate assignment of morphology and size at early stages, and with acceleration in transcriptional diversification from mid/late streak stage onward. To synthesize all observations, each embryo was assigned a calculated timestamp based on size and transcription (hereinafter designated as Et). To reduce possible sampling noise and facilitate downstream statistical analysis, we formed thirteen temporal groups labeled Et6.5–Et8.1 corresponding to the mean approximated developmental time of embryos within them (3–36 embryos and 884–6,442 cells for each group) (Figures S1A and S1E).
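
The agreement between two embryo orderings can be illustrated with a rank correlation. The sketch below assumes that ρ denotes Spearman's rank correlation, which this section does not state explicitly; the function names and the six-embryo toy data are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: six embryos, two of which swap places between schemes.
morphology_rank = [1, 2, 3, 4, 5, 6]
transcription_rank = [1, 3, 2, 4, 5, 6]
rho = spearman(morphology_rank, transcription_rank)
print(round(rho, 3))  # 0.943
```

A single swap of neighboring embryos already lowers ρ to about 0.94 in this small example, which gives a sense of how tightly the 153 embryos must agree to reach the reported values.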

Embryo time and transcriptional states cannot be approximated by one pseudotime

For initial visualization of differentiation dynamics, we separated the dataset using canonical markers into three transcriptional manifolds describing development from the epiblast toward the three germ layers (ectoderm, mesoderm, and endoderm) (Figure 1F). Projection of the thirteen groups of timed embryos on these manifolds showed clear directional transition from the epiblast ground state and subsequent differentiation and diversification, supporting the notion that pseudotime can approximate embryonic development over the manifold. Because each MC is made up of cells from different embryos corresponding to the same transcriptional state, we could estimate a pseudotime for each MC using the calculated ages of the embryos contributing to them (Figure 1G). In a pseudotime representation of a highly synchronous process, MCs are expected to consist of cells from a narrow range of embryonic times. However, the variation in embryonic times for each MC was generally high (SD 1.1–12 h) (Figure 1G, left panel). Conversely, individual embryos consisted of cells belonging to MCs with varied pseudotimes (SD 1.4–6.5 h) (Figure 1H). This demonstrates that dynamics of transcriptional states are far from coherent and synchronous and accentuates the need for models that represent embryonic development as concurrent and unsynchronized collections of single cells that transition through the transcriptional manifold.
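
The per-MC time spread described above can be sketched as follows. This is a minimal illustration under stated assumptions: each cell carries the calculated time of its embryo (in days) and an MC label, and the spread is taken as a population standard deviation converted to hours. The labels and values are hypothetical, and the study may have used a different estimator.

```python
from collections import defaultdict
from statistics import mean, pstdev


def metacell_time_spread(cell_metacell, cell_time_days):
    """Return {metacell: (mean embryo time in days, SD of embryo time in hours)}."""
    times = defaultdict(list)
    for mc, t in zip(cell_metacell, cell_time_days):
        times[mc].append(t)
    return {mc: (mean(ts), pstdev(ts) * 24.0) for mc, ts in times.items()}


# Hypothetical toy data: five cells assigned to two metacells.
cell_metacell = ["mc_a", "mc_a", "mc_a", "mc_b", "mc_b"]
cell_time_days = [6.5, 6.5, 7.0, 7.5, 8.0]
spread = metacell_time_spread(cell_metacell, cell_time_days)
for mc, (mean_time, sd_hours) in spread.items():
    print(mc, round(mean_time, 2), round(sd_hours, 1))
```

The same grouping applied per embryo instead of per MC (with MC mean times as values) would give the complementary spread reported for individual embryos.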

The embryo transcriptional manifold and its growth dynamics

To infer developmental dynamics from the series of acquired single-embryo maps, we made two assumptions: (1) embryos that are similarly timed will include cells that are similarly distributed over the transcriptional manifold, such that progression in time will involve a gradual change in states over manifold links; and (2) progression in time involves rapid cell proliferation, with rates that may change depending on cell type or position on the manifold. To derive quantitative models supporting the first assumption, we represented the transcriptional space of embryonic development using 461 MC states. These states are linked using a logistic metric function to form a manifold structure (Figure 2A). In this manner, the likelihood of cells to exhibit transcriptional changes between any two states over time is estimated by a probabilistic (i.e., Markovian) manifold distance over the MC graph (STAR Methods). Because MC linkage is not (in practice, cannot be) fully captured by the 2D model representation, we explicitly visualize neighboring MCs using edges on top of the standard 2D embedding of the model (i.e., links on Figure 2A), even if these edges are not well accommodated by the 2D embedding.
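
To make the idea of a logistic, Markovian manifold distance concrete, the sketch below converts raw pairwise distances between MCs into logistic link weights, row-normalizes them into a transition matrix, and takes a few Markov steps. The exact functional form and parameter values are given in the STAR Methods and are not reproduced in this section, so the formula, the parameter names `loc` and `scale`, and the three-MC distance matrix are all illustrative assumptions.

```python
import math


def logistic_similarity(dist, loc, scale):
    """Map a raw distance to (0, 1): near pairs approach 1, distant pairs approach 0."""
    return 1.0 / (1.0 + math.exp((dist - loc) / scale))


def transition_matrix(dist_matrix, loc, scale):
    """Row-normalize logistic similarities into Markov transition probabilities."""
    rows = []
    for row in dist_matrix:
        weights = [logistic_similarity(d, loc, scale) for d in row]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def markov_power(p, steps):
    """Return p raised to the given number of Markov steps."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result


# Hypothetical distances between three MCs arranged along a chain: 0 - 1 - 2.
distances = [[0.0, 1.0, 4.0],
             [1.0, 0.0, 1.0],
             [4.0, 1.0, 0.0]]
one_step = transition_matrix(distances, loc=2.0, scale=0.5)
two_step = markov_power(one_step, 2)
print(round(one_step[0][2], 3), round(two_step[0][2], 3))
```

In this toy chain, MC 0 can hardly reach MC 2 in a single step but reaches it readily in two steps through MC 1, which is the behavior a manifold distance is meant to capture.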

Figure 2.


A network flow model for gastrulation

(A) Color-coded 2D projection of the metacell (MC) graph (numbered nodes connected by edges depicting the most similar neighbors for each MC), representing the entire transcriptional manifold. Single cells, indicated by small dots, are colored according to the MC to which they belong. See color annotation legend below.

(B) Growth rate during gastrulation was estimated by counting DAPI stained nuclei. Left: cell counts of the embryonic compartment out of the total counted cells per embryo are shown for four representative examples. ES, early streak; LS, late streak; OB, no bud; EHF, early head-fold. Scale bar, 100 μm. Right: comparison of embryos’ (n = 19) cell count and estimated embryonic time. For each embryo, time was assigned based on morphological similarity to sequenced embryos.

(C) Single cells with low S-phase and M-phase gene expression (black dots) are highlighted on the 2D projection (left, see Figure S1F and STAR Methods). MCs are colored according to the fraction of such cells they contain. Boxplots (right) depict the distribution of S-phase/M-phase expression for the identified slower cycling cell types (legend below) and all other cells (Rest). Boxes show the interquartile range (IQR) with the median of the data. Whisker length corresponds to 1.5 × IQR, or to the distance to the most extreme value if that value lies within 1.5 × IQR.

(D) The gastrulation network flow model consists of MCs (nodes in rows) distributed in time (x axis), and flows (edges) that link MCs between adjacent time points. The first time point represents a common source for all MCs. Annotation (color-coded) relies on both marker expression and flow-based fate mapping.

(E) Heatmap showing relative expression (log fold change) of key lineage-specific genes.

See also Figures S2 and S3 and Table S2.

To provide quantitative support to the second assumption regarding proliferation rates, we first estimated the average embryonic growth rate, using image analysis to count nuclei in DAPI-stained embryos that were timed by morphology (Figure 2B). This analysis confirmed the remarkably rapid exponential growth from 900–1,300 cells at E7.0 to 10,000–13,000 cells at E8.0 (average calculated doubling time ∼7 h), consistent with previously reported histology and dissociated cell-based counting (Palis et al., 1999; Snow, 1977). This is also in alignment with studies demonstrating that apoptosis levels in early post-implantation embryos are low and randomly distributed among the embryonic germ layers (Tam and Behringer, 1997). To estimate proliferation rate biases over the manifold, we noted that a distinct and small 10% sub-population of the single cells showed low expression of both M- and S-phase-related genes (Figure S1F), whereas the remaining 90% of the single-cell population showed a distinctive smooth transition between M- and S-stereotypic expression. Initial annotation of MCs using known markers showed that states linked with the embryonic endoderm, node/notochord, visceral endoderm (VE), and extraembryonic endoderm (ExEn) transcriptional programs were enriched for the low proliferation population, in agreement with previous reports (Bellomo et al., 1996; Snow, 1977) (Figures 2C, S1G, and S1H). Tracking of cell-type frequencies over inferred embryonic time allowed estimation of the relative (lower) growth for these populations compared with the rest of the embryo (r = 0.88, 0.83 for embryonic endoderm, VE/ExEn, respectively) (Figure S1I). In summary, the inferred manifold model and estimated cell-type-specific growth rates together set the stage for a model accommodating the two forces that shape embryo cell-state composition over time: the transition between transcriptional states (i.e., differentiation) and the massive growth and proliferation enabling it.
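
The reported doubling time can be checked with simple arithmetic under an assumption of steady exponential growth. The sketch below uses the midpoints of the cell-count ranges quoted above (1,100 cells at E7.0 and 11,500 cells at E8.0); the choice of midpoints and the function name are assumptions made for illustration.

```python
import math


def doubling_time(n_start, n_end, elapsed_hours):
    """Doubling time in hours, assuming exponential growth between two counts."""
    doublings = math.log(n_end / n_start, 2)
    return elapsed_hours / doublings


# Midpoints of the ranges quoted in the text (an illustrative choice).
estimate = doubling_time(1100, 11500, 24.0)
print(round(estimate, 1))
```

A roughly tenfold increase in 24 h corresponds to a little over three doublings, which is consistent with the average doubling time of about 7 h stated in the text.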

A network flow model infers embryonic differentiation dynamics

To progress toward a fully resolved model of embryonic differentiation dynamics, we developed a network flow model connecting the manifold distributions of the 13 groups of timed embryos, based on our model for differentiation and growth dynamics (Figure 2D). In this representation, MCs are depicted as nodes distributed horizontally over time, and edges denote their predicted transition to the next time point. The flow model uses an approximated mass-conservation strategy (or optimal transport) (Schiebinger et al., 2019) to track cellular differentiation along the manifold links in time. This is implemented via a mincost-maxflow algorithm with convex costs that accommodates uncertainty in sampling cell types at each time point, while considering the variation in cell-type growth rates over time (Figures S2A and S2B; see also STAR Methods, sensitivity and bootstrap analysis). We next used the flow model to refine marker-based cell-type annotation over the transcriptional manifold using the computationally inferred cell fate of MCs representing early differentiating states (Figures 2E and S3; Table S2). This resulted in the annotation of embryonic MCs into 29 cell types (extending annotations in Pijuan-Sala et al. [2019]). We note that annotation assisted by the flow model effectively refines and resolves weakly differentiated and transcriptionally ambiguous groups. For example, we distinguish between two phases of noncommitted nascent mesoderm (early and late) that display differences in differentiation potential and define the first ectodermal cells arising from the epiblast as definitive ectoderm due to their potential to contribute to both surface ectoderm and neural plate fates (Figure 2D) (Harvey et al., 2010).
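
The core idea of routing cell mass between consecutive time points at minimal cost can be sketched with a small min-cost flow solver. This is a simplified illustration and not the authors' implementation: it uses linear integer edge costs in place of the convex capacity costs described above, ignores growth rates, and solves the problem by successive shortest paths. The node numbering, masses, and costs are hypothetical.

```python
def min_cost_flow(n_nodes, edges, source, sink, amount):
    """Successive shortest paths with Bellman-Ford.

    edges: list of (tail, head, capacity, cost) with integer values.
    Returns (flow sent, total cost, {(tail, head): flow on that edge}).
    """
    graph = [[] for _ in range(n_nodes)]
    to, cap, cost = [], [], []
    for u, v, c, w in edges:
        # Forward edge at an even index, its residual twin at the next odd index.
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)
    sent, total_cost = 0, 0
    while sent < amount:
        dist = [float("inf")] * n_nodes
        prev_edge = [-1] * n_nodes
        dist[source] = 0
        for _ in range(n_nodes - 1):
            updated = False
            for u in range(n_nodes):
                if dist[u] == float("inf"):
                    continue
                for e in graph[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]
                        prev_edge[to[e]] = e
                        updated = True
            if not updated:
                break
        if dist[sink] == float("inf"):
            break  # no augmenting path is left
        push, v = amount - sent, sink
        while v != source:  # find the bottleneck along the path
            e = prev_edge[v]
            push = min(push, cap[e])
            v = to[e ^ 1]
        v = sink
        while v != source:  # update residual capacities
            e = prev_edge[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        sent += push
        total_cost += push * dist[sink]
    flows = {(to[e ^ 1], to[e]): cap[e ^ 1] for e in range(0, len(to), 2)}
    return sent, total_cost, flows


# Hypothetical example. Nodes: 0 source, 1-2 MCs at time t, 3-4 MCs at t+1, 5 sink.
edges = [
    (0, 1, 6, 0), (0, 2, 4, 0),      # cell mass observed at time t
    (1, 3, 10, 1), (1, 4, 10, 3),    # manifold links; cost stands in for distance
    (2, 3, 10, 4), (2, 4, 10, 1),
    (3, 5, 5, 0), (4, 5, 5, 0),      # cell mass observed at time t+1
]
sent, total_cost, flows = min_cost_flow(6, edges, source=0, sink=5, amount=10)
print(sent, total_cost, flows[(1, 3)], flows[(1, 4)], flows[(2, 4)])
```

In this toy network, most mass follows the cheap links (1 to 3 and 2 to 4), and one unit is forced along the costlier link from 1 to 4 so that both time points balance. Replacing the fixed capacities on the source and sink edges with convex penalties is what would let such a model tolerate sampling uncertainty.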

Figure S2.


Robustness and parameter sensitivity analysis of the mincost flow model, related to Figure 2

(A) Parameter sensitivity of cell type transitions. For each parameter examined, flows were re-computed and the mean transition frequency over all time points between pairs of cell types was reported (left panel for each shown parameter). Capacity variance and capacity costs 1 and 2 are the three parameters defining the convex capacity cost (cost functions are depicted in the right panel of each parameter). Logistic loc and scale as well as Markov process time t are parameters used to define manifold distances between MCs. For the parameters logistic loc and logistic scale, the different shapes of the resulting logistic distance are plotted on the right (see STAR Methods for complete description of the mincost flow model). The values used to derive the flow model in Figure 2 are marked in red in all panels.

(B) Robustness analysis of the mincost flow model: For each replica MC model (one embryo left out per iteration), manifold distances and flows are recomputed. Boxplots show, for each transition t → t+1, the 20 most frequent flow transitions between two cell types. Boxes represent values of the 153 replica flows, and the transition frequencies of the original flow model are marked as red dots.

Figure S3.


Expression of marker genes and fate map correlation matrix, related to Figure 2

(A) Annotation of MCs using flows. Shown is the second-order correlation matrix of time-averaged incoming and outgoing flows per MC. Correlation matrix was hierarchically clustered into 65 groups. See STAR Methods for additional information.

(B) Expression of canonical marker genes supporting the cluster annotation. MCs are sorted and separated according to the hierarchical cluster tree from A (log2 of UMI frequency, y axis).

Tracing back the kinetics of committed developmental trajectories

Network flows provide a framework for understanding how strongly committed cell states emerge over time. The flow model defines different possible timed trajectories leading to any strongly specified transcriptional state, which can be depicted as paths over the network diagram (Figure 3, top panels). The kinetics of genes participating in a differentiation program can be estimated as the mean expression per absolute time (Figure 3, heatmaps) and can also be decomposed into the main temporal regimes composing it (Figure 3, line graphs). These tools provide a quantitative view into the differentiation of blood, node, and cardiomyocyte cells, the first highly specialized cell types in the early embryo. Early blood progenitors are among the first functional cells to be specified. Consistent with previous studies that identified the origins of the extra-embryonic mesoderm, including primitive hematopoiesis, to the proximal primitive streak (Huber et al., 2004; Lawson and Pedersen, 1992), our model predicts that primitive erythroid cells originate from primitive streak (PS) cells at the earliest time point (Figure 3A). In fact, according to our model, PS cells from Et6.7 onward are unlikely to contribute to the blood lineage (Figure 3A). These cells, passing through a transient nascent mesoderm (NM) state (characterized by genes such as T, Mixl1, and Mesp1), rapidly lose the epiblast gene signature and begin expressing sequential waves of key regulatory genes. Notably, Tal1, Kdr, and Etv2 are observed to be expressed at appreciable levels by Et6.9, closely followed by a combination of factors restricting the cells to the hematopoietic fate, such as Runx1 and Lmo2 (Figure 3A).
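
A traceback of this kind can be sketched as a backward pass over the flow matrices. The version below assumes that the flows between consecutive time points are available as matrices of cell mass (rows are source MCs, columns are target MCs) and attributes the mass of a chosen end state to its predecessors in proportion to the incoming flow. The matrices are hypothetical and the attribution rule is an illustrative assumption.

```python
def traceback(flows, target):
    """Trace a target MC at the final time point back through all transitions.

    flows[t][i][j] is the mass moving from MC i at time t to MC j at time t+1.
    Returns one weight vector per time point; each gives the share of the
    target's mass attributed to every MC at that time.
    """
    weights = [0.0] * len(flows[-1][0])
    weights[target] = 1.0
    history = [weights]
    for flow in reversed(flows):
        n_src, n_dst = len(flow), len(flow[0])
        inflow = [sum(flow[i][j] for i in range(n_src)) for j in range(n_dst)]
        previous = [
            sum(flow[i][j] * weights[j] / inflow[j]
                for j in range(n_dst) if inflow[j] > 0)
            for i in range(n_src)
        ]
        history.append(previous)
        weights = previous
    history.reverse()
    return history


# Hypothetical flows over two transitions among three MCs.
flow_t0_t1 = [[4, 0, 0],
              [1, 3, 0],
              [0, 0, 2]]
flow_t1_t2 = [[3, 2, 0],
              [0, 1, 2],
              [0, 0, 2]]
trace = traceback([flow_t0_t1, flow_t1_t2], target=1)
for t, w in enumerate(trace):
    print(t, [round(x, 3) for x in w])
```

Averaging gene expression over MCs with these weights at each time point would give a per-time expression profile along the traced trajectory, similar in spirit to the heatmaps in Figure 3.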

Figure 3.


Traceback of differentiation fates and their expression kinetics

(A–C) Flow tracebacks were performed from (A) erythroid 2 (MC#1), (B) node/notochord (MC#367), and (C) cardiomyocyte (MC#57) states. Network plots depict traced-back flows color-coded by annotated cell types and labeled (Roman numerals) according to key temporal trajectories. Heatmaps show average expression kinetics for genes with the highest variance over the trajectories. Colored polygons represent the cell type composition of the traceback at each time point (combining the trajectories). Line graphs show the absolute expression level (log2 of UMI frequency, y axis) for selected marker genes along each of the trajectories shown in the respective network plot.

The node is a distinct structure that arises from the anterior PS at the distal tip of the late streak embryo (Figure 3B). The node constitutes a major signaling center and is critical for correct patterning of the embryo and determination of the left-right axis. This is achieved by directional flow of extracellular fluid driven by clockwise rotation of specialized cilia (Hirokawa et al., 2006). The ciliated node cells observed in our manifold are identified as cells expressing high levels of Foxj1, a TF known to be critical for ciliary development (Chen et al., 1998), and numerous genes directly relating to cilia structure and function such as Tppp3, Fam183b, and Nek1 (Figure 3B). A small number of Foxj1+ cells can be identified as early as Et6.9 and begin expressing cilia genes closely thereafter. These cells are predicted to arise from anterior PS progenitors, consistent with previous reports (Balmer et al., 2016; Tam and Beddington, 1987). Foxj1 expression correlates to the expression of additional TFs critical for node morphogenesis such as Noto (Beckers et al., 2007), Rfx3, and T. These are preceded by the expression of TFs characteristic of the anterior PS, Gsc, Mixl1, Eomes, and Foxa2 and anti-correlated with the canonical epiblast gene signature (Figure 3B). Similar analysis illustrates transcriptional kinetics in early progenitors of the embryonic heart (Figure 3C). In summary, the model allows reconstituting the rapid and precise establishment of specialized transcriptional programs toward the blood, node, and heart. Such specialized cells, however, comprise only a small fraction (∼10%) of the embryo by Et8.1, whereas for the majority of embryonic cell states, differentiation appears more complex, involving more gradual commitment and specialization.

Prototypical dynamics of cell fate transitions in the epiblast, primitive streak, and nascent mesoderm

After validating the flow model via analysis of differentiation toward highly specialized fates, we focused our analysis on complex fate decision dynamics, which may involve selection between two fates or more (bifurcation and multi-furcation, respectively). One early example of such a process is the segregation of the epiblast to the mesoderm and endoderm lineages through the PS and the restriction of this potential by the formation of definitive ectoderm. This process is shown in our model to span a prolonged period between Et6.5 and Et7.6. To represent the continuous dynamics of transitions, we introduce the “vein plot” (Figure 4A). This plot displays the relative abundance of transcriptional states over time and the differentiation flux between them, as indicated by the relative width of the diagonal connections that span the time points. In this manner, it is clear that epiblast differentiation is dynamic and largely biphasic: 88% of the flow toward the PS fate occurs by Et7.1, whereas 89% of the ectoderm restriction takes place at Et7.3 or later (Figures 4A and 4B). Throughout this extended period, epiblast cells maintain robust expression of the core pluripotency factors Utf1, Pou3f1, Dnmt3b, and Pim2 (Figures 4C, left, and S4A). Interestingly, we discovered that major expression changes occur over time even within the seemingly homogeneous epiblast as it continuously supplies cells to the PS and definitive ectoderm (Figure 4C, right). Most notably, downregulation of Nodal signaling genes (Nodal and Tdgf1) happens in parallel to PS differentiation and precedes the induction of definitive ectoderm markers Sox9, Irx3, and Irx5 (Figures 4D and S4A). Distinct rostral and caudal signatures are evident in the definitive ectoderm as early as Et7.5 (Figure S4B).
Activation of the definitive ectoderm program is largely considered to be achieved by secreted antagonists emanating from the distal and anterior VE, protecting the overlying epiblast from differentiation to PS, and further patterned by signals from the endoderm and axial mesoderm (Arkell and Tam, 2012; Balmer et al., 2016; Hemmati-Brivanlou and Melton, 1997), such as bone morphogenetic protein (BMP) and Nodal agonist-antagonist gradients (Li et al., 2013; Liu et al., 2018; McMahon et al., 1998). Using the absolute timescale of the flow model, we show that, concomitant with Nodal signaling repression in the anterior epiblast, transforming growth factor β (TGF-β) superfamily modulators Cer1 and Lefty1 are expressed in the VE, while interestingly, Lefty2 is transiently induced in the nascent and early rostral mesoderm (Figure 4E).
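The cumulative flow fractions quoted above for the epiblast (for example, the share of PS-directed flow that has occurred by a given Et) can be illustrated with a minimal sketch. It assumes the flux from one source cell type to one target is available as a per-time-point value; the function name `cumulative_flow_fraction` and the flux values are hypothetical and are not taken from the paper or from the embflow code.

```python
def cumulative_flow_fraction(flux_by_time):
    """Return [(time, cumulative fraction of total flux)] ordered by time.

    flux_by_time maps an embryonic time (Et) to the flux leaving the source
    cell type toward one target at that time (arbitrary units).
    """
    total = sum(flux_by_time.values())
    if total <= 0:
        raise ValueError("total flux must be positive")
    running = 0.0
    fractions = []
    for t in sorted(flux_by_time):
        running += flux_by_time[t]
        fractions.append((t, running / total))
    return fractions


# Hypothetical flux values, for illustration only (not measured data).
epiblast_to_ps = {6.5: 2.0, 6.8: 3.0, 7.1: 3.0, 7.3: 1.5, 7.6: 0.5}
for t, frac in cumulative_flow_fraction(epiblast_to_ps):
    print(f"Et{t}: {frac:.0%} of the flow has occurred")
```

Reading off the first time point at which the cumulative fraction exceeds a chosen level gives a statement of the form "x% of the flow occurs by EtY".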

Figure 4.

Temporal dynamics during epiblast, primitive streak, and nascent mesoderm commitment

(A) Vein plots describe the continuous transition of cell types to their direct descendants (represented by diagonal flows spanning time points), and the dynamic relative frequencies of these cell types in the embryo over time (vein width on the y axis). This panel shows transitions emanating from the epiblast to the primitive streak (PS) and definitive ectoderm (DEc). Vein connection width represents flow flux at each time point. Dashed arrow represents a link to a subsequent panel (here, F).

(B) Bars represent the composition of direct differentiation targets of epiblast cells at each time group.

(C) Row-normalized heatmaps of expression in the epiblast per time. Genes with homogeneous expression are shown on the left, and clusters of variable expression on the right.

(D) Gene-flow plots in which nodes represent MCs, and edges link each MC to the source MC with the highest contributing flow in the network. MC level expression versus MC mean time is depicted for Eomes, Tdgf1, and Irx5 genes. Only MCs from Epiblast and related cell types are shown.

(E) Expression of members of TGF-β superfamily signaling modulators over time. Line graphs represent total transcript levels for the entire endoderm or mesoderm for each age group.

(F) Vein plot (as in A) describing the flow model emanating from PS to its predicted direct descendants.

(G) Predicted fate composition of early/late NM cells at each age group. Columns represent fraction of cells committing to each of the mesoderm fates, separated into hematoendothelial and extraembryonic lineages (negative values), and embryonic mesoderm (positive values).

(H and I) Vein plots illustrating the flow emanating from early and late NM (respectively) and gene-flow plots for key NM markers.

See also Figure S4.

Figure S4.

Dynamics of cell fate transitions, related to Figure 4

(A) Similar to Figure 3, for flows traced back from MC#397 (rostral neural plate).

(B) Gene-flow plots in which nodes represent MCs, and edges link each MC to the source MC with the highest contributing flow in the network. MC level expression versus MC mean time is depicted for key genes correlated with rostral and caudal neural ectoderm.

(C) Gene-flow plots of key genes associated with PS and caudal epiblast.

(D) Mesoderm and endoderm commitment per MC. Panels show the fraction of the flow passing through each MC that contributes to mesoderm or endoderm at the final time point. The remaining fraction contributes to ectoderm (not shown).

(E) The PS is predicted to be the only bi-potential cell type which significantly contributes to both mesoderm and endoderm (other than the pluripotent epiblast, and a single APS MC) (left). MC expression of key endoderm (Foxa2) and mesoderm (Mesp1) TFs (right).

(F) Retention time of progenitor cell types: Shown is the median length of time a cell spends in each cell type according to the flows. Boxes represent distribution of retention times computed for the 153 iterated MC models (see Figure S2B).

(G) Gene-flow plots of genes characteristic for specific mesodermal fates.

Biphasic differentiation over a prolonged period of massive proliferation, as shown for the epiblast, is in marked contrast to the developmental dynamics of the PS (Figure 4F). Flows demonstrate the continuous replenishment of the PS by the epiblast to compensate for the massive egress toward the NM and anterior PS. This process is accompanied by the activation of early endoderm and mesoderm markers such as Eomes, Foxa2, Mixl1, and T (Figure S4C), representing the potential of the PS to differentiate to both lineages (Figure S4D). However, calculating a commitment score for each MC identified only a few MCs with such potential (Figure S4E). Furthermore, unlike epiblast and NM, cells quickly transition through the PS without stabilizing their state, thus forming a sharp bifurcation (Figures 4F and S4F). This lack of PS stability implies that the mesoderm/endoderm bifurcation occurs very rapidly on exit from the pluripotent state, such that no stable self-renewing mesendodermal progenitor cell state can be defined.
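The idea of a per-MC commitment score can be sketched as follows. This is a simplified illustration, assuming the flow model is collapsed into a directed acyclic graph in which each MC sends flow to downstream MCs and the terminal MCs carry a germ-layer label; the function name, graph layout, and numbers are hypothetical and do not reproduce the paper's implementation.

```python
def commitment_scores(out_flows, terminal_fate):
    """Fraction of the flow leaving each node that ends in each terminal fate.

    out_flows:     {node: {child: flow}} for non-terminal nodes (acyclic).
    terminal_fate: {node: fate label} for nodes at the final time point.
    """
    memo = {}

    def score(node):
        if node in memo:
            return memo[node]
        if node in terminal_fate:
            memo[node] = {terminal_fate[node]: 1.0}
            return memo[node]
        children = out_flows.get(node, {})
        total = sum(children.values())
        if total <= 0:
            raise ValueError(f"node {node!r} has no outgoing flow")
        result = {}
        for child, flow in children.items():
            # Weight each child's fate composition by its share of the flow.
            for fate, frac in score(child).items():
                result[fate] = result.get(fate, 0.0) + (flow / total) * frac
        memo[node] = result
        return result

    return {node: score(node) for node in set(out_flows) | set(terminal_fate)}


# Toy network with made-up flows (illustrative only).
flows = {
    "epiblast": {"PS": 3.0, "DEc": 1.0},
    "PS": {"meso": 1.0, "endo": 1.0},
    "DEc": {"ecto": 2.0},
}
fates = {"meso": "mesoderm", "endo": "endoderm", "ecto": "ectoderm"}
print(commitment_scores(flows, fates)["PS"])
```

In this toy network, a node whose score is split between mesoderm and endoderm would count as bi-potential, whereas a node with a score close to 1 for a single fate would count as committed.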

Differentiation from the NM defines a third type of prototypic developmental dynamics that differs from both the epiblast and the PS. NM differentiation occurs gradually and continuously through Et6.7–Et8.0, during which the intrinsic uncommitted state is continuously produced (Figure S4F). Using the flow model, we depict NM multi-furcation into 10 mesodermal programs by computing the relative NM fraction contributing to each of these fates over time (Figure 4G). Up to Et6.9, NM cells are heavily biased toward the hematoendothelial and extraembryonic mesoderm lineages. Cardiac progenitors and rostral mesoderm then emerge, trailed by an increased dominance for caudal mesoderm fates, starting at Et7.4. During this period (Et6.7–Et7.8), NM cells are continuously marked by high levels of Mesp1, T, Mixl1, Lefty2, and Snai1 (Figures 4H and S4G). Concomitantly, temporal activation of key mesodermal factors such as Tal1, Etv2 (hematoendothelial fate), Hand1/2 (extraembryonic mesoderm and cardiac fates), and Foxc1 and Dll3 (rostral and caudal mesoderm, respectively) marks initial commitment toward more specified mesodermal programs (Figures 4H, 4I, and S4G). In conclusion, these data support an extended notion of developmental fate acquisition that can go through phased fate determination (as in the epiblast), rapid transitional bifurcation (as in the PS), or complex and time-dependent multi-furcation (as observed for the NM).

Evidence for combinatorial regulation of mesodermal multifurcation

The classical stepwise differentiation model for development describes a series of specific gene expression signatures, each defining a direct transition between a progenitor state and a more specialized progeny state. We searched for such specific steps by identifying genes whose activation was unique to only one of the major fate transitions predicted by the flow model (Figure 5A). Interestingly, this analysis demonstrated that transitions toward the node, cardiomyocytes, and the hematoendothelial lineages are characterized by highly specific gene programs, in line with their early specification and specialized molecular functions. However, most transitions involved gene expression dominated by combinatorial signatures and lacking unique regulator genes or other specific markers. This was most evident in the complex repertoire of mesodermal fates, which, as shown above, represent temporally dynamic multi-furcation from the NM toward extraembryonic, rostral, and caudal programs. To model the regulation of this complex process, we focused on the combinatorial expression of 63 TFs with variable expression within the mesoderm (Figure 5B). TFs dominate the mesodermal transcriptional programs (59 out of the 256 most variable genes in the mesoderm are TFs); however, only some are likely to drive initial NM differentiation. Distinguishing between driver TFs and “responder” TFs can be achieved by analysis of their temporal kinetics (Figure 5C) and absolute expression level (Figure S5A).
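The two selection criteria stated in the Figure 5A legend (a gap of at least 0.5 between the largest and second-largest log fold-change along a transition, and a second-largest fold-change below 1) can be written as a short filter. The sketch below assumes both thresholds apply to log fold-changes and that a per-gene table of log fold-changes per transition is available; the function name, transition labels, and values are hypothetical.

```python
def is_transition_specific(log_fc_by_transition, min_gap=0.5, max_second=1.0):
    """Check whether a gene is activated in essentially one transition only.

    log_fc_by_transition maps a transition label to the gene's log
    fold-change (progeny versus source) along that transition.
    """
    if len(log_fc_by_transition) < 2:
        raise ValueError("need at least two transitions to compare")
    ranked = sorted(log_fc_by_transition.values(), reverse=True)
    largest, second = ranked[0], ranked[1]
    # Criterion 1: clear gap between the top two transitions.
    # Criterion 2: the runner-up transition shows only modest induction.
    return (largest - second) >= min_gap and second < max_second


# Hypothetical log fold-changes for two genes (illustrative only).
specific_gene = {"APS->node": 3.2, "NM->rostral": 0.4, "NM->caudal": 0.1}
shared_gene = {"APS->node": 1.9, "NM->rostral": 1.7, "NM->caudal": 0.2}
print(is_transition_specific(specific_gene), is_transition_specific(shared_gene))
```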

Figure 5.

Combinatorial rather than hierarchical TF expression and regulation in the mesoderm

(A) Identification of genes specifically associated with one of the major cell type transitions during gastrulation. Shown are relative expression levels of genes, comparing between the source cell population and its progeny (columns). Dashed boxes mark the enriched, early-specialized lineages. Selection criteria for genes: (1) minimal gap of 0.5 between largest and second-largest log fold-change along a transition, and (2) second-largest fold-change smaller than 1.

(B) Absolute expression of highly variable TFs in the mesoderm. Columns represent individual MCs following hierarchical clustering.

(C) Temporal expression of key TFs within the mesoderm. For each TF, we identified the MCs with maximum expression in the mesoderm (see STAR Methods). The inferred TF expression kinetics along trajectories leading to such MCs are shown. Error bars reflect SD of expression for traced-back MCs at each time point. y axis, absolute expression.

(D) R2 values for the best regression model using a pair of mesodermal TFs versus the best model using only one TF, trained on non-TFs with the highest variance in the mesoderm, and validated using permutation tests (see Figure S5).

(E and F) Prominent examples of target genes that can be accurately fitted in the mesoderm (R2 > 0.8) using a single TF (E) or a linear combination of two TFs (F). Each dot represents a MC, axes represent absolute expression of TFs and predicted gene (x and y, respectively).

Figure S5.

Supporting information, related to Figure 5

(A) Maximal log2 expression within the mesoderm for the most variable TFs.

(B) Permutation test to control for overfitting in linear regression analysis. Shown in red is the maximal gain in R2 when using two TFs instead of one TF to predict target gene expression. Boxes represent the maximal gain in R2 after fixing for each target the best-fitting single TF and using randomly shuffled TF vectors for the second variable. Random shuffling was repeated 100 times.

(C) Examples of highly variable mesodermal genes with poor predictability based on linear regression with one TF, but strongly improved R2 values when using two TFs as explanatory variables. Shown is gene expression of target versus either one of the two TFs, or the combination of both.

TF expression reflected the overall high-level organization of the mesoderm into 4 main regimes: progenitor states (marked by Eomes, Mixl1, and Mesp1), extraembryonic fates (Msx1/2, Hand1), and a spectrum of rostral (Foxc1/2, Twist1) and caudal (Cdx1, Hoxb1, Hoxa1) programs (Figure 5C). However, none of these TFs define a precise bifurcation pattern. For example, characteristic high expression of Hand1 in extraembryonic mesoderm is also observed in some rostral mesoderm MCs. Similarly, although Foxc1/2 are not expressed in most of the extraembryonic mesoderm lineages, they otherwise show only quantitative preferences to some (e.g., rostral, paraxial) mesodermal fates. The complex combinatorics of the mesoderm TFs is likely to have a major regulatory impact on target genes. To begin quantifying this effect, for each mesoderm gene we compared the attainable fit by a regression model using one or two input TFs (Figures 5D and S5B). For a few genes, we observe very tight linkage with the expression of a single TF (e.g., Tdgf1 and Eomes). However, for most genes, we gain significant (compared to shuffled controls) predictive power when using two TFs as regulatory inputs (e.g., Pcdh19, Bmp2, Dll3, and Vim) (Figures 5E, 5F, and S5C). The data suggest that although a cascade of master regulator expression can drive bifurcation and differentiation of specialized cell types, multifurcation in the mesoderm depends on complex combinations of many TFs rather than hierarchical TF expression. The correlation-based analysis presented here can use temporally resolved data to derive an initial understanding of such combinatorial schemes. Yet, such analysis is limited in its specificity as models become more complex and must therefore be combined with perturbations of candidate TFs.
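The comparison of one-TF and two-TF regression models, with a shuffled second TF as a control for overfitting, can be sketched for a single target gene and a single TF pair. This is a simplified illustration using ordinary least squares with an intercept; the paper's analysis searches over TF pairs and targets, which is not reproduced here, and all function names and data below are hypothetical.

```python
import random


def _solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, predictors):
    """R^2 of a least-squares fit of y on an intercept plus the predictors."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = _solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_tot == 0:
        raise ValueError("target has zero variance")
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1.0 - ss_res / ss_tot


def gain_vs_shuffled(y, tf1, tf2, n_shuffles=100, seed=0):
    """Gain in R^2 from adding tf2, and the gains obtained with shuffled tf2."""
    rng = random.Random(seed)
    base = r_squared(y, [tf1])
    gain = r_squared(y, [tf1, tf2]) - base
    null_gains = []
    for _ in range(n_shuffles):
        shuffled = list(tf2)
        rng.shuffle(shuffled)  # breaks any MC-wise link between tf2 and y
        null_gains.append(r_squared(y, [tf1, shuffled]) - base)
    return gain, null_gains


# Made-up per-MC expression values (illustrative only).
tf_a = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
tf_b = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
target = [a + 2.0 * b for a, b in zip(tf_a, tf_b)]
gain, null = gain_vs_shuffled(target, tf_a, tf_b)
print(f"gain = {gain:.3f}; largest shuffled gain = {max(null):.3f}")
```

A gain that exceeds the distribution of shuffled gains suggests that the second TF carries information about the target beyond what an extra free parameter would provide by chance.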

Studying the temporal effects of Foxc1 and Foxc2 on mesoderm specification

Forkhead box C1 and C2 genes (Foxc1 and Foxc2) exhibit broad and largely overlapping expression patterns in non-axial mesoderm. Mice harboring deletions of both genes die at mid-gestation with notable phenotypes in somites, intermediate mesoderm derivatives, blood vessels, heart, and neural crest (Fatima et al., 2016; Kume, 2009; Winnier et al., 1997). In our data, Foxc1/Foxc2 are expressed in rostral and caudal mesoderm, as well as weakly in subpopulations of hematoendothelial progenitors and the foregut (Figures 5B and S6A). Furthermore, Foxc1 expression peaks in early paraxial mesoderm, suggesting that it may play an instructive role in its specification. To this end, we adapted the chimeric embryo approach for use with precise single-embryo quantification (Figure 6A). We generated isogenic pairs of fluorescently tagged Foxc1/2 double knockout (DKO) and control mouse embryonic stem cells (mESCs), which were subsequently injected into host blastocysts followed by index sorting of single-embryo cells and MARS-seq (Figures 6B and S6B). In this manner, we could separately project injected and host cells from the same embryo onto the wild-type model. Generating seven such paired single-cell profiles, together with four isogenic control injected chimeras, provided a robust internally controlled platform to assess cell-autonomous effects of Foxc1/2 DKO over time. We then tested the impact of Foxc1/2 loss on the temporal dynamics within the embryo. We further performed in silico analysis to control for loss of imprinting and chromosomal abnormalities that are frequently associated with mESC cultures (Figure S6C). Overall, host and DKO cells consistently match in terms of embryonic time (as quantified by analysis of K-nn similarities between knockout or host cells and the reference temporal model) (Figures 6C and S7A). However, in 4 of 7 chimeras, DKO cells matched embryonic mesoderm of significantly younger reference embryos relative to the host cells. 
This phenotype, indicating retarded differentiation, is specific to the embryonic mesoderm, as timing based on ectoderm/endoderm cells was similar to the host (D = 0.21, p = 3.8e−11; D = 0.06, p = 0.17, Kolmogorov-Smirnov D statistic and p value for the mesoderm and combined endoderm/ectoderm, respectively) (Figures 6C and S7A). Specificity was further substantiated by directly comparing DKO cells to their isogenic controls (Figure S7B).
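The Kolmogorov-Smirnov D statistic used above to compare DKO and host cells can be illustrated with a minimal two-sample implementation. The sketch assumes each cell has already been assigned the rank (or time) of its nearest wild-type neighbors; the function name and the rank values are hypothetical, and no p value is computed here (a statistics library routine such as `scipy.stats.ks_2samp` could be used for that).

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The gap between two step functions is maximal at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical matched wild-type ranks for mesoderm cells of one chimera.
host_ranks = [120, 124, 125, 128, 130, 131, 133, 135]
dko_ranks = [112, 115, 118, 121, 122, 126, 127, 129]
print(f"D = {ks_statistic(dko_ranks, host_ranks):.2f}")
```

A large D for the mesoderm together with a small D for ectoderm and endoderm cells of the same embryo corresponds to the lineage-specific delay described in the text.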

Figure S6.

Supporting information, related to Figure 6

(A) Color coded expression for Foxc1 and Foxc2 per time and MC, plotted along inferred flows.

(B) Images of representative chimera and tetraploid embryos (GFP, phase contrast, and overlay). Scale bar = 200 μm (except DKO chimera, where 100 μm). On the right, flow cytometry side-scatter width (SSC-W) plotted against GFP fluorescence intensity for each cell (after applying the logicle transformation from the R flowCore package) (Ellis et al., 2020). Cells above the upper GFP threshold were classified as mESC derived, and cells below the lower threshold as host cells. Cells in between the thresholds (classified as unclear) were excluded from further analysis.

(C) Chimera DKO versus host differential expression karyogram. Shown is fold-change between all DKO and host cells organized by chromosome position ((105+eg(KO))/(105+eg(host))). On the right, chromosomes 1, 7, and 8 are highlighted. Injected embryonic stem cells display (i) loss of imprinting of the Igf2/H19 locus (chromosome 7), (ii) trisomy of chromosome 8, and (iii) possible monosomy of the distal part of chromosome 1.

Figure 6.

Foxc1/Foxc2 loss delays embryonic mesoderm and inhibits paraxial program

(A) Experimental scheme for the generation and analysis of Foxc1/2 double knockout (DKO) chimera embryos.

(B) Gene targeting design and validation for Foxc1/2 DKO. Both are single-exon genes, and the entire coding sequence was deleted in a biallelic manner.

(C) DKO mesoderm (meso) cells are developmentally retarded as compared to their matched host cells. The cumulative distribution of nearest neighbor wild-type intrinsic ranks was calculated separately for DKO and host cells (green and black lines, respectively) in each chimeric embryo. Similar analysis shows no significant delay in the ectoderm (ecto) and endoderm (endo) cells. Shown are two representative embryos, with Kolmogorov-Smirnov D statistics and p value highlighted for each (see also Figure S7).

(D) Fractions of cell types per embryo. Black, blue, and green symbols represent host, isogenic control, and DKO cells (respectively), connected by a vertical line for each chimera embryo. Embryos were assigned a transcriptional rank according to the most similar wild-type embryo based on host cells only. Grey dots represent individual embryos of the wild-type model, and shaded area represents the moving average for wild-type embryos and moving SD (window size = 9).

(E) Relative expression of the most significant differentially expressed genes in the embryonic mesoderm between DKO and matched wild-type embryo cells, shown either per embryo (numbered 1–7 according to transcriptional rank from “youngest” to “oldest”), or across cells per cell type (see STAR Methods for selection criteria).

(F) Absolute expression of key genes per embryo, for injected mESC-derived versus host embryonic mesoderm cells, relative to smoothed wild-type expression (black line, moving average length = 11). Shaded area corresponds to moving SD across neighboring embryos (window length = 11). Units represent UMIs per million UMIs.

See also Figure S6.

Figure S7.

Analysis of tetraploid complemented embryos and supporting information, related to Figure 6

(A) Cumulative distribution of nearest neighbor wild-type intrinsic ranks calculated separately for DKO and host cells (green and black lines, respectively) for each chimeric embryo, with Kolmogorov-Smirnov D statistic and p value.

(B) Scatterplot of Kolmogorov-Smirnov D statistics for DKO and control isogenic chimeras demonstrating mesoderm specific temporal retardation in the Foxc1/2 chimeras.

(C) Relative expression between control isogenic mESC and host in chimeric embryos of genes associated with the DKO phenotype (as in Figure 6E), shown either per embryo (numbered 1-4 according to transcriptional rank, from ‘youngest’ to ‘oldest’), or across cells per cell type.

(D) Cumulative distribution of nearest neighbor wild-type intrinsic ranks calculated for the embryonic compartment of DKO and isogenic control tetraploid embryos (4N; green and blue, respectively).

(E) Relative cell type distributions for each embryo calculated for the tetraploid (4N) complemented embryos (arranged from left to right according to intrinsic rank).

(F) Comparison of gene expression of bulk epiblast and early NM between Foxc1/2 DKO cells and host cells (or wild-type embryo cells in the case of tetraploid embryos) demonstrating only minor changes in the DKO derived cells in these populations. Highlighted are genes with high differential expression between DKO cells and host/wt cells (log2 fold change > 1.5, red) as well as differentially expressed genes in the embryonic mesoderm as shown in Figure 6E (purple).

We next asked whether the observed delay in mesoderm differentiation is driven by specific cell types as a function of time. Time-matched comparison of cell type frequencies indeed showed a reduction in paraxial mesoderm cells in the DKO, together with increased abundance of extraembryonic mesoderm cells (Figure 6D). To search for the most promising direct regulatory targets of Foxc1/2 in the embryonic mesoderm, we performed differential expression analysis comparing pooled expression of timed single cells classified as host or knockout embryonic mesoderm. This showed consistently that the expression of key paraxial and rostral mesoderm TFs (e.g., Tcf15, Prrx2, and Twist1) and regulators (e.g., Ppp1r1a and Cer1) was markedly reduced in knockout cells (Figures 6E and 6F). Differential expression between isogenic unmanipulated cells and their corresponding host cells provided strong validation for the specificity of these genes as potential targets of Foxc1/2 (Figure S7C). Yet, this analysis also stressed the necessity of including isogenic control cells because a noticeable reduction in Pkdcc in the control cells suggested that, for this gene, perturbations may originate already in the parental mESCs (Figure S7C).

To further control for non-cell-autonomous effects, we utilized a tetraploid complementation assay. We injected Foxc1/2 DKO or isogenic control cells into 4N host blastocysts and dissected the resulting embryos at E8 (dpc). In this manner, the embryonic compartment was solely contributed by the injected cells, allowing evaluation of the non-cell-autonomous effects on gastrulation (Figure S6B). Interestingly, knockout embryos generated by the 4N complementation assay were significantly younger compared to their isogenic control counterparts dissected at the same time (Figures S6, S7B, and S7D). Yet, unlike in the case of the 2N chimeric embryos, both mesoderm and non-mesoderm lineages were significantly delayed in these embryos (Figure S7E). Surprisingly, however, directly comparing 4N individual knockout or control embryos to time-matched embryos identified only minor changes in gene expression (Figure S7F). Together, these results suggest that the global delay in 4N knockout embryos is due to a failure to correctly execute mesoderm differentiation programs in the absence of Foxc1/2. This may affect key signaling required for the synchronous maturation of mesoderm and non-mesoderm lineages (e.g., Lefty2) (Figure S7F). This effect is compensated in 2N chimeric embryos, most likely due to the ability of host cells to correctly activate mesoderm programs. Taken together, the data highlight Foxc1 and Foxc2 as key early mesodermal regulators and strongly implicate them in the regulation of multiple secondary TFs that lead toward the paraxial mesoderm fate.

Discussion

Fertilization triggers the course of development, whereby individual embryos progress in time toward maturation. To achieve rapid diversification after implantation, gastrulating embryos undergo massive cell proliferation that involves a continuous flux of parallel intracellular molecular changes that break transcriptional and structural symmetry (Bedzhov et al., 2014; Rossant and Tam, 2009). The complexity of modeling this process is best manifested when aiming to assign time to embryos and cells. Unlike individual embryos that can be assigned with absolute time, cells comprising an embryo may represent transcriptional states that exist at multiple time points in development. Therefore, faithfully charting cell state transitions during embryonic development requires unified models that synthesize quantitative transcriptional kinetics of individual cells and embryos.

To address this challenge, we conducted a phenomenological characterization of mouse gastrulation at the resolution of single embryos and the single cells comprising them. Ordering embryos on a transcriptional continuum allowed us to assign a unique time-stamp to each embryo that is based on the transcriptional profile of its comprising cells (denoted Et). Comparing transcription time with classical morphology-based staging showed an overall high correlation between the two approaches. However, morphology was able to capture only part of the variation in the transcriptional landscape between embryos (Figure 7A). In part, this is due to the limited number of distinguishable morphological changes and the qualitative nature of such classification. In many cases, molecular specification precedes the appearance of an appreciated morphological structure as evident, for example, in the cases of the allantois, cardiomyocytes, and early blood specification (Figure 7B). Our data provide a synthesis between transcription and morphology, thus significantly refining classical embryo staging approaches.

Figure 7.

Acquisition of morphological and transcriptional traits during gastrulation

(A) Staged traces of representative embryos highlighting the major phenotypical alterations during gastrulation and up to somitogenesis. Boxplot indicating the distribution of calculated developmental time (Et) per embryo according to morphological classification (see Figure S1B for complete embryo compendium). Boxes show interquartile range (IQR) with median. Whisker length corresponds to 1.5 IQR or the distance to the extremal value of the distribution if within 1.5 IQR.

(B) Cell type distribution per embryo, arranged according to intrinsic rank and separated into the 13 age groups.

(C) Complete flow model showing all cell types and major predicted transitions. Map “altitude” represents the degree of specification of each cell type, from shallow “basins” (e.g., epiblast and NM) to deep “canyons” (e.g., ExEn and blood).

To infer differentiation flows and lineage specification dynamics over the embryonic transcriptional manifold, we developed algorithms that integrate single-embryo time, transcriptional identity of single cells, and estimation of the growth rates for key embryonic lineages. The ability to chart simultaneous programs in the embryo at high temporal resolution provides a holistic view across cell types, thus allowing us to suggest co-dependence between them. A notable example of such an interaction is the significant reduction in Nodal signaling genes in the otherwise largely homogeneous epiblast, which is juxtaposed to the source of high levels of the Nodal inhibitors Lefty1 and Cer1 from the VE (Costello et al., 2015; Perea-Gomez et al., 2002) and temporally correlated with the onset of Lefty2 expression by the emerging mesoderm. This transcriptional switch coincides with a shift in the differentiation potential of the epiblast: from early predominant contribution to mesoderm and endoderm fates through the PS, to en masse commitment to definitive ectoderm commencing at Et7.0. Interestingly, the watershed-like switch in epiblast commitment correlates with the previously described sharp reduction in the efficiency to isolate epiblast stem cells from late streak embryos (Brons et al., 2007; Kojima et al., 2014; Tesar et al., 2007).

Consistent with previous findings in Xenopus (Hemmati-Brivanlou and Melton, 1997), our model predicts that epiblast commitment to ectoderm initiates through a common progenitor cell population termed definitive ectoderm. These cells are predicted by the flows to differentiate to either neural plate or surface ectoderm (epidermis) fates. Indeed, such cells have been previously identified in the mouse (Cajal et al., 2012) and further isolated in vitro (Harvey et al., 2010; Li et al., 2013; Liu et al., 2018). Yet, the transient nature of definitive ectoderm (retention time 4.8 ± 0.8 h) (Figure S4F), the lack of distinctive markers, and difficulty in tracing their fate in vivo make elucidating the potential of these cells a challenging task. Our data suggest that early distinct spatial expression within this population may predict anterior/posterior patterning (Figure S4B). This potentially extends recent findings demonstrating regionalization of the spinal cord prior to neural differentiation (Metzis et al., 2018), in contrast to the long-standing initiation transformation model (for review, see Stern et al., 2006).

The flow model clearly uncovered prototypic rapidly bifurcating cell state transitions associated with the first specialized cells of the embryo: blood, node/notochord, and cardiomyocytes. Nevertheless, gastrulation appeared to be dominated by enduring progenitor populations that gradually multifurcate to give rise to distinct cell fates. Regulation of such regimes seems combinatorial and nuanced with no single major TF or clear hierarchical stepwise program driving transitions. We propose an approach that focuses on the multi-faceted regulation within such states, combining temporal and kinetic modeling with experimental systems that integrate reporters, perturbations, and single-cell readout for in-depth analysis in vivo. We show that single chimeric embryos, harboring gene-specific mutations, allow separation of the effect of key regulators on developmental timing from their effect on direct target genes. Specifically, in the case of Foxc1/Foxc2 mutants, we could reproduce the previously described loss of paraxial mesoderm, but in the same experiment also trace back this phenotype to gene expression changes in its progenitors. Our data show that knockout cells that were destined toward rostral and paraxial fates are developmentally delayed and may later compensate by elevating alternative programs (e.g., those involving extra-embryonic mesoderm fates). Multi-furcation in the mesoderm, therefore, emerges as a process involving a delicate balance between differentiation programs within a multipotent mesodermal progenitor state.

Continuous commitment over time represents a key limitation of pseudotime methods that inherently force a rigid tree-like structure on inferred cell state transitions. Similarly, convergence of distinct states into the same program is not easily modeled using pseudotemporal ordering. One such example is the contribution of the VE to the embryonic endoderm recently demonstrated during early mouse gastrulation (Kwon et al., 2008; Nowotschin et al., 2019; Peng et al., 2019). Our work provides a unified model of mouse gastrulation that considers continuous, parallel, and converged differentiation toward multiple lineages (Figure 7C). In such a “basin-like” representation of development, rapid bifurcations into highly functional cells are depicted as “canyons,” whereas progenitor states that gradually multifurcate over several time points are depicted as basins (Figure 7C). Collectively, our work introduces a quantitative temporal model of embryonic development, holding great promise for elucidating the roles of multiple regulatory layers in shaping and memorizing functional programs. Nevertheless, moving forward toward a complete realization of these processes will require integrating spatial information into the model, creating spatiotemporal models that represent development as a concurrent and interacting collection of intra-cellular processes with progressively specialized and robust transcriptional and spatial identities.

Limitations of study

Limitations of our study are related first to our model resolution and accuracy. Model accuracy is defined by the number of embryos sampled and the uniformity of their absolute (a priori unknown) temporal distribution. Resolution in our model is directly affected by cellular diversity. Indeed, current sampling depth allows more precise tracking for embryonic times showing extensive cellular diversity (at Et6.9 and thereafter) but provides lower resolution for the earlier, inherently less-diverse, time bins. In addition, dissection of extra-embryonic tissues may have variable efficiency and can affect inference of flows for these lineages. These issues can be improved on by sequencing additional embryos up to the noise limits of sampling and scRNA-seq. More fundamental limitations relate to the use and interpretation of the model. The flow model makes assumptions on proliferation rates of different cell types over time, which were extrapolated from bulk measurements and cell-cycle-related gene expression. Measurements involving specific cell type isolation or monitoring lineage dynamics using time-lapse microscopy will allow a more precise evaluation of cell proliferation rates. Finally, the precision of the estimated breakdown of embryonic cells into types per time point is based on thresholding a continuous space of differentiation, and when visualizing the flow model, we must use it as is (e.g., for color-coding).

STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Deposited data

scRNA-seq embryonic profiles | This paper | GEO: GSE169210
Gastrulation scRNA-seq atlas | Pijuan-Sala et al., 2019 | PMID: 30787436

Experimental models: Cell lines

Mouse: V6.5 mouse embryonic stem cells | Jaenisch lab, MIT | RRID:CVCL_C865
Mouse: Foxc1/2 dKO mESCs | This paper | N/A
Mouse: isogenic control mESCs | This paper | N/A

Experimental models: Organisms/strains

Mouse: C57BL/6JRccHsd | Envigo | RRID: MGI:6151402
Mouse: Hsd:ICR(CD-1) | Envigo | RRID: MGI:5649797
Mouse: B6D2F1 | Envigo | RRID: MGI:5651959

Oligonucleotides

FoxC1 5′ gRNA: gTTGATCCGAACGTTCCTCCG | This paper | N/A
FoxC1 3′ gRNA: gAGTCTCTGTACCGCACGTCG | This paper | N/A
FoxC2 5′ gRNA: GGCGCTCGGGTTCAGCCGAC | This paper | N/A
FoxC2 3′ gRNA: gAGGGACGGCGTAGCTCGATA | This paper | N/A

Recombinant DNA

Cas9 targeting plasmid: px330 | Wu et al., 2013 | Addgene plasmid: #98750
HTNC expression plasmid: pTriEx-HTNC | Peitz et al., 2002 | Addgene plasmid: #13763

Software and algorithms

Metacell | Baran et al., 2019 | PMID: 31604482
Network flow inference algorithm | This paper | https://github.com/tanaylab/embflow; https://doi.org/10.5281/zenodo.4646177
MARS-seq pipe mapping/UMI pipeline | Keren-Shaul et al., 2019 | PMID: 31101904

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Yonatan Stelzer (Yonatan.stelzer@weizmann.ac.il).

Materials availability

Plasmids and cell lines generated in this study will be made available by the Lead Contact upon request.

Data and code availability

The accession number for the raw and processed data reported in this paper is GEO: GSE169210.

Code has been deposited at https://github.com/tanaylab/embflow, https://doi.org/10.5281/zenodo.4646177.

Interactive independent analysis can be performed at https://tanaylab.weizmann.ac.il/embflow.

Experimental model and subject details

Cell culture and genetic manipulation

Mouse embryonic stem cells (mESCs, v6.5) were cultured under standard conditions on a feeder layer of X-ray irradiated mouse embryonic fibroblasts (MEF, DR4). Standard mESC medium: DMEM (High glucose, GIBCO), 20% FBS (US certified, Biological Industries), 20mg/lit recombinant leukemia inhibitory factor (LIF), 0.1mM 2-mercaptoethanol (GIBCO), Penicillin/streptomycin (Biological Industries), 1mM L-glutamine (Biological Industries), 1% non-essential amino acids (Biological Industries). For chimera assays medium was supplemented with 1mM PD0325901 (Sigma, 1 mM), and 3mM CHIR99021 (Sigma), 48hrs prior to blastocyst injection.

Embryo Collection and Documentation

All animal procedures were approved by the Institutional Animal Care and Use Committee and were performed in strict adherence to Weizmann Institute guidelines. Mice were monitored for health and activity and were given ad libitum access to water and standard mouse chow with 12-hr light/dark cycles. Embryos were collected from timed pregnant immune competent C57BL/6JRccHsd or Hsd:ICR(CD-1) females (obtained from Envigo and mated in house with males of the same strain), between E6.5-8.25 (see Figure S1F). Pregnant mice were not involved in previous studies. Embryos were removed from their implantation sites using fine forceps, in PBS, and Reichert’s membrane and the ectoplacental cone were removed. Embryos were then washed in fresh PBS and transferred to chilled DMEM (Phenol-red free, GIBCO) supplemented with 10% FBS (Biological Industries) for imaging prior to dissociation. Phase contrast images were taken with an Eclipse Ti2 inverted microscope (Nikon) and Zyla sCMOS camera (Andor). Size measurement of the embryonic component of each embryo was performed as previously described (Downs and Davies, 1993). Embryo staging was performed according to Downs and Davies (1993) and EMAP (eMouse Atlas Project; http://www.emouseatlas.org/emap/home.html) (Richardson et al., 2014).

The same single investigator performed both staging and measurements for consistency.

Generation of Foxc1/Foxc2 double knock out cells

mESCs expressing constitutive GFP were generated by integration of a Lox-DsRed-Lox-GFP cassette (based on Addgene plasmid #32702) into acceptor attP sites integrated in the H11 locus (using Addgene plasmid #52544), followed by Cre recombination using recombinant His-TAT-NLS-Cre (HTNC) protein (Addgene plasmid #13763), as previously described (Peitz et al., 2002). Cells were then co-transfected using TransIT-X2 (Mirus, according to the manufacturer’s protocol) with four SpCas9 vectors (px330, Addgene plasmid #98750) (Wu et al., 2013), each expressing a single gRNA designed to remove the entire protein coding sequence of either Foxc1 or Foxc2. gRNAs were selected for minimal off-targets using CCTop - CRISPR/Cas9 target online predictor (https://cctop.cos.uni-heidelberg.de:8043/) (Stemmer et al., 2015). For genotyping, individual clones were grown for two passages on gelatin-coated plates to eliminate residual MEF, and genomic DNA was extracted with PCR-compatible lysis buffer (10mM Tris, pH 8, 0.45% Triton X-100, 0.45% Tween-20, 0.2mg/ml Proteinase-K). Deletions were validated by genomic PCR, using a set of primers flanking the expected deletions together with internal primers, and by Sanger sequencing.

Method details

Flow cytometry

Prior to dissociation, to focus on tissues that contribute to the embryo proper, the ectoplacental cone, parietal endoderm, and much of the extraembryonic ectoderm were removed using fine forceps. For isolation of single cells for scRNA-seq, embryos were dissociated with 0.25% Trypsin-A, 0.02% EDTA (Biological Industries) solution for 5 min at 37°C, and resuspended in DMEM w/o phenol red (GIBCO) supplemented with 10% FBS (Biological Industries). Samples were run on a FACSAria-III flow cytometer (BD Biosciences) using the ‘index sort’ option to retain the spectral properties of each individual sorted cell.

Single-cell RNA-sequencing

Single-cell cDNA libraries were prepared using the MARS-Seq method, as described (Jaitin et al., 2014; Keren-Shaul et al., 2019), with the following modifications: The final concentration of the RT1 primers was 2nM, and pooling was done via centrifugation to a VBLOCK200 reservoir (Clickbio). The Klenow reaction was not followed by heat inactivation. The volumes of the first RT and Exonuclease I reaction mixes were scaled down to 1 and 0.5 μl, respectively, and dispensed by a MANTIS liquid handler (FORMULATRIX). In brief, single cells were sorted by flow cytometry directly into lysis solution containing well-identifying poly-T barcodes. mRNA from cell capture plates was then converted into cDNA and pooled. Pooled samples were amplified by T7 in vitro transcription, and the resulting RNA was fragmented and converted into a sequencing-ready library by tagging the samples with pool barcodes and Illumina sequences during ligation, reverse transcription, and PCR.

Estimation of total nuclei counts in E6.25 – E8.0 embryos

To estimate total cell counts in embryos of different developmental stages, we harvested 19 embryos from E6.25 to E8.0 and fixed them overnight using 4% PFA. The embryos were subsequently washed with PBS-0.1% tween (PBST), nuclear-stained using DAPI (Sigma-Aldrich D9564) diluted to 1.0 μg/mL in PBS and incubated overnight in 70% glycerol-PBS on a tilting platform to clarify the tissue. The embryos were mounted on a Cellvis 35mm glass-bottom dish and imaged using a Zeiss LSM-880 confocal with automatically optimized Z-interval. The raw images were next processed using Bitplane Imaris software. E6.25-E7.0 embryos were processed along the entire span of the confocal stack, while embryos from later stages were cropped to half of the span along the lateral-to-medial extent, discarding dimmer Z sections distal to the lens. We used the Imaris “surface” feature to manually segment the image stack into embryonic and extra-embryonic domains. We next used the “spots” feature with optimized parameters to assess the distribution of cell nuclei in each domain separately. For embryos older than E7.0, the nuclei counts were multiplied by a factor of two to account for the entire left-right axis of each embryo.

Blastocyst injections

Blastocyst injections were performed using (C57BL/6xDBA) B6D2F1 (Envigo) host embryos. In brief, 3-4-week old B6D2F1 females were hormone primed by an intraperitoneal (i.p.) injection of pregnant mare serum gonadotropin (PMSG, Vetmarket) followed 46 hr later by an injection of human chorionic gonadotropin (hCG, Sigma). Embryos were harvested at the zygote stage, and cultured in a CO2 incubator until blastocyst stage. On the day of the injection, groups of embryos were placed in drops of M2 medium using a 16-μm diameter injection pipet (Biomedical Instruments). Approximately ten cells were injected into the blastocoel of each embryo using a Piezo micromanipulator (Prime Tech). For tetraploid complementation, 2-cell embryos were fused to one cell using a CF150/F instrument (BLS), by 2 DC square pulses of 30V 40ms and 1-2V AC, in 0.3M mannitol solution with BSA. An average of 15 cells were injected into the blastocoel of such embryos. Approximately 20 blastocysts were transferred to each recipient female (CD1 female mice, Envigo); the day of injection was considered as 2.5dpc. Mice were handled in accordance with institutional guidelines and approved by the Institutional Animal Care and Use Committee (IACUC).

MARS-seq and other scRNA-seq processing

MARS-seq reads were processed using the MARS-seq2.0 pipeline (Keren-Shaul et al., 2019). In brief, after removing plate barcodes (4 base pairs) from Read 1, reads were mapped to the mm9 genome using bowtie2. Demultiplexing of the reads and construction of the single-cell UMI matrix was based on the well barcodes (7 base pairs) and UMI barcodes (8 base pairs) from Read 2. To demultiplex single-cell datasets into individual embryos, we recorded embryo identity per well during sorting and mapped well barcodes to embryo IDs as part of the MARS-seq post-processing stage. We removed mitochondrial transcripts from all cells in the UMI matrix. Overall we processed 40,868 wells, out of which we retained 33,900 well-covered cells for further analysis. In particular, cells with fewer than 1,000 UMIs were filtered out. To analyze chimeric embryos, we used FACS index sorting to record GFP fluorescence per cell in addition to embryo identity per well. We then distinguished knockout and host single cells by thresholding the bimodal distribution of the green channel. To allow comparison of our temporal atlas to the published gastrulation atlas, we used 116,312 QC-positive cells and mapped 10x gene names to MARS-seq gene names using naive matching of gene symbols.

Quantification and statistical analysis

Experimental design

Replication in this study is based on collecting individual embryos. All analyses were validated to be robust to subsets of the embryos acquired. Data collection involved sampling embryos at different estimated times, in several iterations given temporal analysis to ensure complete coverage (see Table S1). For scaling embryos in time bins we corroborated scRNA-seq data with estimation of embryo size, thereby preventing bias or skewing of the timescale. We used blinding for staging embryos using two independent methods (scRNA-seq and morphology-based). The study did not involve a priori determination of required sample size since the robustness of single-embryo sampling was not known in advance. However, we did determine temporal bin size (based on the total number of cells from embryos in the bin, aiming at a minimum of 1,000). There were no formal inclusion/exclusion criteria in this study.

Metacell analysis

To select feature genes for MC analysis we followed standard practice and identified 804 high variance genes in the UMI matrix downsampled to 1,653 UMIs (Baran et al., 2019). We then filtered cell-cycle and stress-related genes using manual annotation of 80 gene clusters. This resulted in a set of 665 feature genes. MCs were derived as described (Baran et al., 2019) using K = 100 and standard bootstrapping, MC splitting and outlier filtering. To avoid potential bias due to sometimes incomplete removal of extraembryonic tissues, residual MCs corresponding to extraembryonic ectoderm and parietal endoderm were removed prior to further analysis. The derived reference model included 461 MCs on 33,700 cells, leaving 200 outlier cells.

For a representation of temporal trends in Figure 1, we split the MC partition into separate endoderm, mesoderm and ectoderm graphs using known germ layer markers. This classification was not used in subsequent modeling and annotation, which are all based on flows described in Figure 2.

Embryo timing algorithms - K-nn ordering

Intrinsic temporal ordering of embryos is inferred using analysis of the cell-cell similarities and optimization of the linear ordering of embryos minimizing the temporal distance between similar cells. This analysis is done after excluding cells from the extraembryonic endoderm (defined by marker expression of Ttr and Apoe, Figure S1G) and the hematopoietic lineage (defined by expression of Cited4, Figure S3), since sampling of these tissues is variable between embryos. Given the filtered UMI matrix, we construct the adjacency matrix $E_{uv}$ encoding the balanced K-nn graph as defined by the MC pipeline for K = 50. We marginalize over embryos to compute raw embryo similarity: $N_{ij} = \sum_{u \in C_i, v \in C_j} E_{uv}$, where $C_i$ represents the cells of embryo $i$. We then normalize raw similarities in two steps. First, we normalize columns given the number of cells in each embryo: $S^{pre}_{ij} = (|C_j|)^{-1} N_{ij}$. We next normalize rows while also considering embryo sampling batch effect. To do this, assume the sampling batch of each embryo is $b(i)$ and assume some initial embryo ordering $ord(i)$ (e.g., based on morphology alone). We compute the embryo batch effect by estimating the increase in similarity only among embryos that are within the same approximated time range (to avoid computing large batch effects due to differences in scheduled embryo harvesting):

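$$\beta_i = \frac{\sum_{j:\ |ord(i)-ord(j)| \le 10,\ b(i)=b(j)} S^{pre}_{ij}}{\sum_{j:\ |ord(i)-ord(j)| \le 10,\ b(i) \ne b(j)} S^{pre}_{ij}}.$$

We correct similarities between embryos within the same batch and with $|ord(i)-ord(j)| \le 10$ by setting $S^{pre,b}_{ij} = (\beta_i)^{-1} S^{pre}_{ij}$ (keeping all other $S^{pre}$ values unchanged), and then define:

$$S_{ij} = S^{pre,b}_{ij} \Big/ \sum_k S^{pre,b}_{ik}$$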

We can now define the ordering optimization problem

$$\tau = \arg\min_{\tau()} \sum_{ij} S_{ij}\,|\tau(i) - \tau(j)|$$

among all possible orderings τ() of the embryos. We solve the problem using a sorting heuristic. We initialize embryo ordering by their annotated Theiler stage (randomly ordering within each stage). Embryos for which Theiler staging was not possible (9 embryos), were assigned to the most abundant Theiler stage of their litter. We then run a bubble-sort-like algorithm by iteratively switching the position of neighboring embryos if this lowers the goal function until a local minimum is reached. We found that this simple heuristic is sufficiently powerful given that the initial Theiler ordering already defines the correct coarse-grained solution.
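The sorting heuristic described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the toy similarity matrix `S_toy`, and the numerical tolerance are assumptions introduced here, and the real input would be the normalized embryo similarity matrix with an initial ordering taken from Theiler staging.

```python
def ordering_cost(S, order):
    """Cost sum_ij S[i][j] * |tau(i) - tau(j)| for a candidate embryo ordering."""
    pos = {embryo: rank for rank, embryo in enumerate(order)}
    n = len(S)
    return sum(S[i][j] * abs(pos[i] - pos[j]) for i in range(n) for j in range(n))


def adjacent_swap_sort(S, init_order):
    """Swap neighbouring embryos whenever this lowers the cost, until no swap helps."""
    order = list(init_order)
    best = ordering_cost(S, order)
    improved = True
    while improved:
        improved = False
        for k in range(len(order) - 1):
            order[k], order[k + 1] = order[k + 1], order[k]
            cost = ordering_cost(S, order)
            if cost < best - 1e-12:
                best, improved = cost, True
            else:
                order[k], order[k + 1] = order[k + 1], order[k]  # undo the swap
    return order, best


# Toy similarity matrix (hypothetical): embryos 0-1-2-3 form a chain of neighbours.
S_toy = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
final_order, final_cost = adjacent_swap_sort(S_toy, init_order=[0, 2, 1, 3])
```

Because only adjacent swaps are tried, the result is a local minimum that depends on the starting order, which is why a good coarse-grained initialization matters.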

Defining absolute developmental time using size measurements

To transform embryo ordering into physical developmental times (in units of hours) we had to consider non-uniform embryo sampling and correct it using some independent physical measurement. We used area measurement $A_i$ for a subset of the embryos (using only C57BL/6 data to minimize strain-effects). Given the embryos’ exponential growth dynamics, we fit $\log(A_i)$ as a function of the intrinsic rank of each embryo $i$ using a smoothing spline interpolation (R function smooth.spline with parameter spar=0.9, Figure 1E). To extend the boundaries, the interpolation function was linearly extrapolated. To derive absolute time, the log area scale $\log(A)$ was transformed into a developmental timescale $E_t$ by assuming a linear relationship $t \sim \log(A)$ and setting the median time of the first age group to 6.5 and the median time of the last age group to 8.1.
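The final linear rescaling step can be illustrated as follows. This sketch assumes the smoothed log-area values per rank-ordered embryo have already been obtained (the spline fit itself is not reproduced here); the function name and the toy values are hypothetical.

```python
from statistics import median


def calibrate_developmental_time(log_area, age_group, t_first=6.5, t_last=8.1):
    """Linearly map smoothed log-area values to developmental time Et.

    The map is fixed by sending the median log-area of the first age group
    to t_first and that of the last age group to t_last.
    """
    groups = sorted(set(age_group))
    first = [a for a, g in zip(log_area, age_group) if g == groups[0]]
    last = [a for a, g in zip(log_area, age_group) if g == groups[-1]]
    m_first, m_last = median(first), median(last)
    slope = (t_last - t_first) / (m_last - m_first)
    return [t_first + slope * (a - m_first) for a in log_area]


# Hypothetical smoothed log-area values for eight rank-ordered embryos in three age groups.
log_area_toy = [1.0, 1.2, 1.4, 2.0, 2.5, 3.0, 3.2, 3.4]
age_group_toy = [1, 1, 1, 2, 2, 3, 3, 3]
et_toy = calibrate_developmental_time(log_area_toy, age_group_toy)
```

Since the map is increasing and linear, the median time of a group equals the mapped median log-area, so the two anchor conditions determine slope and offset uniquely.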

Comparison of inferred time to reference gastrulation atlas

We constructed a reference MC object for the mouse gastrulation atlas published in Pijuan-Sala et al. (2019). Most atlas cells are labeled by collection time $t_c$ into 6 h intervals between E6.5 and E8.5. Using these labels, we computed the mean atlas age for each reference MC. We then computed for each of our embryos a projected mean age as follows:

  • i) we matched MARS-seq and 10X gene symbols and used only feature genes as defined by the MC pipeline;

  • ii) for each embryo cell, we identified the atlas MC with maximal correlation (using log scaled single-cell UMI counts correlation to log scale gene frequency in MCs);

  • iii) we computed the mean reference MC time matching all cells within each embryo. This is then defined as the projected atlas time of the embryo. Comparisons of these times to the intrinsic embryo ordering are shown in Figure 1.
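The three projection steps can be sketched as below, assuming gene symbols have already been matched and restricted to feature genes. The exact log transforms are not specified in the text, so `log2(1 + UMI)` for cells and `log2(eps + frequency)` for metacells are assumptions, as are the function names and the toy data.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def projected_atlas_time(cell_umis, atlas_mc_freq, atlas_mc_age, eps=1e-5):
    """Mean atlas age of the best-correlated atlas MC over all cells of one embryo."""
    log_mc = [[math.log2(eps + f) for f in mc] for mc in atlas_mc_freq]
    ages = []
    for umis in cell_umis:
        log_cell = [math.log2(1 + u) for u in umis]  # assumed log transform
        best = max(range(len(log_mc)), key=lambda m: pearson(log_cell, log_mc[m]))
        ages.append(atlas_mc_age[best])
    return mean(ages)


# Hypothetical toy data: two atlas MCs over three feature genes, three embryo cells.
atlas_freq_toy = [[0.5, 0.3, 0.2], [0.1, 0.2, 0.7]]
atlas_age_toy = [6.5, 8.0]
cells_toy = [[10, 5, 2], [1, 3, 12], [8, 6, 3]]
embryo_time_toy = projected_atlas_time(cells_toy, atlas_freq_toy, atlas_age_toy)
```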

Manifold construction and distance metrics

Given a MC solution over all embryos’ cells, we first define the gene expression distribution within each MC using the fraction of UMIs per gene $g$ in MC $i$, denoted $e_{gi}$ (computed by the MC package). Each MC consists of cells from different time groups, represented by vectors $n_i^t$ specifying the number of cells from time group $t$ and MC $i$. We assume analysis is done using a set of feature genes $F$ including genes that involve significant transcriptional variance and are filtered so as not to include batch-affected genes or genes linked with the cell cycle.

To compute manifold distances between MCs we perform the following steps:

  • i) Define parametric pairwise distances. We use a logistic function on the regularized $e_{gi}$ values over the set of feature genes $F$:

$$d_{ij} = \sum_{g \in F} \mathrm{plogis}\big(\log_2(\varepsilon + e_{gi}) - \log_2(\varepsilon + e_{gj});\ loc,\ scale\big)$$

where loc (default 1) and scale (default 0.2) are parameters for a standard logistic function and $\varepsilon$ (default 5e-4) is a regularization parameter.

  • ii) Define manifold neighborhoods. We only trust the above parametric pairwise distances over highly related transcriptional states (e.g., for $e_{gi}$ vectors that are generally similar). We therefore identify pairs of MCs $E^{pre}$ such that $(i,j) \in E^{pre}$ if $d_{ij} < \alpha \cdot \min(T_i, T_j)$, where $T_i$ is the distance between MC $i$ and its second most similar neighbor, and $\alpha$ is a tolerance parameter (default 3). We further filter $E^{pre}$ such that each MC includes at most 5 neighbors (those with the smallest distances), and construct a manifold graph $M = (I, E)$ over MCs, with edge weights that equal the original $d_{ij}$ values for the retained edges. This graph is visualized in Figure 2A.

  • iii) Define a rate matrix $Q = (r_{ij})$, where $r_{ij} = D/d_{ij}$ for $(i,j) \in E$ and 0 otherwise, setting the diagonal as $r_{ii} = -\sum_{j \ne i} r_{ij}$ and using $D = \mathrm{median}_i(\max_j d_{ij})$ for scaling distances. The manifold transition probabilities are now computed as

$$(p_{ij}) = e^{tQ},$$

i.e., assuming a Markov process over the manifold with the rate matrix $Q$ and timescale $t$ (we use $t = 1$ to derive the flow models as discussed below). The manifold costs are derived from the probabilities by scaling each row and taking the inverse:

$$c_{ij} = \max_{j'}(p_{ij'}) / p_{ij}$$
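The steps above can be sketched with the standard library alone. Several points are assumptions made for illustration and are not stated in the text: the absolute value inside `logistic_distance` (used so that the distance is symmetric), taking the maximum in $D$ over retained edges only, and computing the matrix exponential with a scaled Taylor series. All function names and the toy distances are hypothetical.

```python
import math
from statistics import median


def plogis(x, loc=1.0, scale=0.2):
    """Standard logistic CDF with location and scale parameters."""
    return 1.0 / (1.0 + math.exp(-(x - loc) / scale))


def logistic_distance(e_i, e_j, eps=5e-4, loc=1.0, scale=0.2):
    """Sum over feature genes of a logistic function of the log2 expression difference.

    Taking the absolute difference is an assumption made here to keep d symmetric.
    """
    return sum(plogis(abs(math.log2(eps + a) - math.log2(eps + b)), loc, scale)
               for a, b in zip(e_i, e_j))


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def expm(Q, t=1.0, terms=30, squarings=10):
    """Matrix exponential exp(t*Q): scaled Taylor series followed by repeated squaring."""
    n = len(Q)
    A = [[t * q / 2 ** squarings for q in row] for row in Q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[r + x for r, x in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def manifold_costs(d, t=1.0):
    """d[i][j] is the distance on a retained edge, or None when (i, j) is not an edge."""
    n = len(d)
    D = median(max(x for x in row if x is not None) for row in d)
    Q = [[D / d[i][j] if d[i][j] is not None else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        Q[i][i] = -sum(Q[i][j] for j in range(n) if j != i)
    P = expm(Q, t)
    C = [[max(row) / p if p > 0 else math.inf for p in row] for row in P]
    return P, C


# Hypothetical three-MC chain 0 - 1 - 2 with edge distances 1 and 2.
d_toy = [[None, 1.0, None], [1.0, None, 2.0], [None, 2.0, None]]
P_toy, C_toy = manifold_costs(d_toy)
```

In the toy chain, the cost of moving from MC 0 to MC 2 exceeds the cost of moving to its direct neighbor MC 1, which is the behavior the manifold costs are meant to encode.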

Cell cycle modules and growth loss constant estimation

For each cell, we computed a mitosis and synthesis phase score by counting the number of UMIs from mitosis (M) and replication (S) genes and normalizing this count by the total number of UMIs per cell. The group of M-phase genes includes Mki67, Cenpf, Top2a, Smc4;SMC4, Ube2c, Ccnb1, Cdk1, Arl6ip1, Ankrd11, Hmmr;IHABP, Cenpa;Cenp-a, Tpx2, Aurka, Kif4, Kif2c, Bub1b, Ccna2, Kif23, Kif20a, Sgol2, Smc2, Kif11, Cdca2, Incenp, Cenpe. The group of S-phase genes includes Pcna, Rrm2, Mcm5, Mcm6, Mcm4, Ung, Mcm7, Mcm2, Uhrf1, Orc6, Tipin. We detected reduced S phase and M phase scores for cells from extraembryonic and embryonic endoderm (Figure 2C). We also observed exponential decay of the fraction of cells per embryo as a function of inferred embryo time, fitting in log space a global trend of decrease in their relative frequency with coefficients of 0.83 and 0.88 for extraembryonic and embryonic endoderm, respectively (Figure S1I). We did not observe significant evidence for differential growth rates in other cell types. We therefore assume a growth loss constant of $\lambda^{t}_{\mathrm{exe\,endo}} = 1 - 0.83 = 0.17$ for extraembryonic endoderm at all time groups, and of $\lambda^{t}_{\mathrm{emb\,endo}} = 1 - 0.88 = 0.12$ for all embryonic endoderm types. We set the loss constant to 0 for all other cell types. We note that these estimations only approximate a much more complex and dynamically changing process, but within the time window studied here, these estimations provide sufficient compensation with minimal additional parameter fitting.
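A minimal sketch of the per-cell phase score and the loss constant follows. The gene sets shown are subsets of the lists given above, and the function names and the toy cell are hypothetical; fitting the decay trend itself is not reproduced.

```python
M_GENES = {"Mki67", "Cenpf", "Top2a", "Ube2c", "Ccnb1", "Cdk1"}  # subset of the M-phase list
S_GENES = {"Pcna", "Rrm2", "Mcm5", "Mcm6", "Mcm4", "Ung"}        # subset of the S-phase list


def phase_score(cell_umis, gene_set):
    """Fraction of a cell's UMIs that come from the given gene set."""
    total = sum(cell_umis.values())
    return sum(n for g, n in cell_umis.items() if g in gene_set) / total if total else 0.0


def loss_constant(trend_coefficient):
    """Growth loss constant, defined in the text as 1 minus the fitted coefficient."""
    return 1.0 - trend_coefficient


cell_toy = {"Mki67": 3, "Pcna": 2, "T": 5}  # hypothetical UMI counts for one cell
m_score = phase_score(cell_toy, M_GENES)
s_score = phase_score(cell_toy, S_GENES)
```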

Network flow inference and modeling

Recall that each MC consists of cells from different time groups, represented by vectors $n_i^t$. We normalize these count vectors and generate a probability distribution per time point, $p_i^t = n_i^t / \sum_i n_i^t$. We define a flow model over this time-resolved MC model using a flow matrix $f_{ij}^t$ for every time point $t < T$ ($T$ being the last time point), describing the fraction of cells from time $t$ at MC $i$ transitioning to MC $j$ at time $t+1$.

The ideal “mass conservation” constraint in the model ensures the total flow in and out of each MC at each time point equals the observed frequency of single cells within it. Specifically:

$$\sum_j f_{ij}^t = p_i^t \ \text{(forward)}, \qquad \sum_j f_{ji}^{t-1} = p_i^t \ \text{(backward)}$$

where the boundary constraints (forward at $t = T$, backward at $t = 1$) are ignored.

In practice, however, we must relax these constraints given two considerations. First, we must consider differential growth rates between MCs. Second, uncertainty in our estimations of $p_i^t$ (which are a result of a single-cell sampling process with considerable variance) must be considered.

To account for differential growth rates we use the growth loss constants $\lambda_i^t$ (as described in the previous section). We assume that at each time step, MCs with $\lambda_i^t > 0$ emit “dilution” flow proportionally to their frequency, $\lambda_i^t p_i^t$. The total lost flow at time $t$, $\lambda^t = \sum_i \lambda_i^t p_i^t$, is redistributed proportionally among all MCs toward time $t+1$, adding $\gamma_i^t = p_i^{t+1} \lambda^t$ to the flows going into each MC at time $t+1$. To define the resulting generalized conservation of mass, we introduce the flow constraints for each MC and time point:

$$\mathrm{CNSTR}_{flow}: \quad \sum_j f_{ij}^t + p_i^t \lambda_i^t = \sum_j f_{ji}^{t-1} + p_i^t \lambda^{t-1} = f_i^t$$

We can now consider the uncertainty in the estimation of $p_i^t$ by computing the relative total flow error per MC and time point, $d_i^t = (f_i^t - p_i^t)/p_i^t$, and using a cost function to penalize flows with large $d$ magnitude:

$$cost_{cap} = \sum_{i,t} \mathrm{flow\_error}(d_i^t)$$

where the flow error is a piecewise linear convex function with a minimum at 0 (shown in the right panels in Figure S2A).

Given the relaxed formulation of the mass conservation and growth constraints, we can next write down a general optimization goal function for flows, which also considers the transcriptional manifold and the MC-MC distances defined by it. In simple terms:

$$cost_{manif} = \sum_{i,j,t} c_{ij} f_{ij}^t$$

The resulting combined optimization problem is:

$$\min\big(cost_{manif}(F) + cost_{cap}(F)\big)$$

subject to the constraint $\mathrm{CNSTR}_{flow}$ and the global flow constraint $\sum_i f_i^1 = 1$. In fact, these constraints and cost function define a classical network min-cost max-flow problem with convex costs. This problem is polynomial, and we derive an optimal solution using the highly efficient simplex algorithm (Ahuja et al., 1993).
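To make the two cost terms concrete, the sketch below evaluates them for a given candidate flow; it does not solve the optimization problem. The breakpoints and slopes of `flow_error` are hypothetical (the text only states that the function is piecewise linear and convex with a minimum at 0), time indices are 0-based, and interior time points are assumed to satisfy the conservation constraint so that the outgoing side can be used for $f_i^t$.

```python
def flow_error(d, tol=0.1, inner_slope=1.0, outer_slope=10.0):
    """Hypothetical piecewise-linear convex penalty with its minimum at d = 0."""
    a = abs(d)
    if a <= tol:
        return inner_slope * a
    return inner_slope * tol + outer_slope * (a - tol)


def flow_objective(f, p, lam, c):
    """Return (cost_manif, cost_cap) for flows f[t][i][j], frequencies p[t][i],
    loss constants lam[t][i], and manifold costs c[i][j]."""
    T, n = len(p), len(p[0])
    lam_tot = [sum(lam[t][i] * p[t][i] for i in range(n)) for t in range(T)]
    cost_manif = sum(c[i][j] * f[t][i][j]
                     for t in range(T - 1) for i in range(n) for j in range(n))
    cost_cap = 0.0
    for t in range(T):
        for i in range(n):
            if p[t][i] == 0:
                continue
            if t < T - 1:   # outgoing side of the generalized conservation constraint
                total = sum(f[t][i]) + p[t][i] * lam[t][i]
            else:           # last time point: only the incoming side is defined
                total = sum(f[t - 1][j][i] for j in range(n)) + p[t][i] * lam_tot[t - 1]
            cost_cap += flow_error((total - p[t][i]) / p[t][i])
    return cost_manif, cost_cap


# Hypothetical toy model: two MCs, two time points, no growth loss.
p_toy = [[0.5, 0.5], [0.5, 0.5]]
lam_toy = [[0.0, 0.0], [0.0, 0.0]]
c_toy = [[1.0, 2.0], [2.0, 1.0]]
exact = flow_objective([[[0.5, 0.0], [0.0, 0.5]]], p_toy, lam_toy, c_toy)
skewed = flow_objective([[[0.4, 0.1], [0.0, 0.5]]], p_toy, lam_toy, c_toy)
```

The second toy flow moves mass across MCs, which raises the manifold cost and also overshoots the observed frequency of the second MC at the last time point, so both terms increase.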

Annotation of metacells using flows

To annotate MCs, we added to the analysis of marker gene expression in each MC an examination of the inferred flow from (and to) them. In some cases, these supported identification of progenitor states that are already primed toward specific fates despite having only mild marker expression specificity. Let $F_{ij} = \sum_{t=1}^{12} f_{ij}^t$ be the average MC-to-MC flow, $F^3 = F \cdot F \cdot F$ (matrix multiplication), and let $G$ be the concatenation of $F^3$ and the transpose $(F^3)^T$, containing the third-order ingoing and outgoing MC flows for each MC (rows). For $G$ we compute the first-order correlation matrix

$$C^1_{ij} = \mathrm{cor}_k(G_{ik}, G_{jk})$$

with the diagonal set to zero, and the second-order correlation matrix $C^2_{ij} = \mathrm{cor}_k(C^1_{ik}, C^1_{jk})$. The diagonal of $C^2_{ij}$ is again set to zero. MCs are then clustered into 65 groups using hierarchical clustering (Ward.D method) of Euclidean distances between columns of $C^2_{ij}$. The second-order correlation matrix is shown in Figure S3A. Clusters were manually annotated based on selected markers shown in Figure S3B. In cases where intra-cluster gene expression was not sufficiently homogeneous, single MCs were reannotated accordingly.
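The construction of the two correlation matrices can be sketched as follows; the hierarchical clustering step is left out. Function names and the toy flow matrix are hypothetical, and a constant row is assigned correlation 0 by convention here.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def zero_diag_row_correlation(M):
    """Correlation between all pairs of rows of M, with the diagonal set to zero."""
    n = len(M)
    return [[0.0 if i == j else pearson(M[i], M[j]) for j in range(n)] for i in range(n)]


def flow_similarity(F):
    """First- and second-order correlation matrices built from third-order flows."""
    F3 = matmul(matmul(F, F), F)
    F3_T = [list(col) for col in zip(*F3)]
    G = [F3[i] + F3_T[i] for i in range(len(F))]  # outgoing and ingoing flows per MC
    C1 = zero_diag_row_correlation(G)
    C2 = zero_diag_row_correlation(C1)
    return C1, C2


# Hypothetical summed flows: MCs 0 and 1 exchange flow, as do MCs 2 and 3.
F_toy = [
    [0.20, 0.05, 0.00, 0.00],
    [0.05, 0.20, 0.00, 0.00],
    [0.00, 0.00, 0.20, 0.05],
    [0.00, 0.00, 0.05, 0.20],
]
C1_toy, C2_toy = flow_similarity(F_toy)
```

In the toy example, MCs that share flow partners end up positively correlated at both orders, while MCs from the other block are negatively correlated, so a clustering of $C^2$ would separate the two blocks.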

Flow robustness and sensitivity analysis

For the analysis in Figure S2A, we used a fixed MC solution and recomputed flows while changing only one parameter at a time. To generate Figure S2B, 153 replica single-cell UMI matrices were generated, leaving out the cells of one of the 153 embryos each time. MC partitions, manifold distances and flows were recomputed for each replica using the same parameters as in the original analysis. MCs were annotated with cell types by projecting each replica MC to the best correlated original MC using log transformed mean gene UMI fractions on feature genes. Intrinsic ranks of embryos as well as the age group partitioning were kept unchanged.

Flow propagation

Using the flow matrices $f_{ij}^t$ that are the solutions of the min-cost flow problem, we define forward and backward probability transition matrices between MCs via

$$P_{ij}^{t,fw} = \frac{f_{ij}^t}{\sum_k f_{ik}^t}, \qquad P_{ij}^{t,bw} = \frac{f_{ij}^t}{\sum_k f_{kj}^t}.$$

Given a distribution $p_i^t$ over MCs at time $t$, we can iteratively forward and backward propagate it through

$$p^{t+1} = (P^{t,fw})^T p^t, \qquad p^{t-1} = P^{t-1,bw} p^t.$$
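The propagation rules translate directly into code. The sketch below is illustrative only: function names and the toy flow matrix are hypothetical, and MCs with no outgoing (or incoming) flow are given all-zero transition rows (or columns) by convention.

```python
def forward_matrix(f):
    """P_fw[i][j] = f[i][j] / sum_k f[i][k]; rows with no outgoing flow stay zero."""
    out = []
    for row in f:
        s = sum(row)
        out.append([x / s if s > 0 else 0.0 for x in row])
    return out


def backward_matrix(f):
    """P_bw[i][j] = f[i][j] / sum_k f[k][j]; columns with no incoming flow stay zero."""
    col_sums = [sum(col) for col in zip(*f)]
    return [[x / s if s > 0 else 0.0 for x, s in zip(row, col_sums)] for row in f]


def propagate_forward(p_t, f_t):
    """p^{t+1} = (P^{t,fw})^T p^t, using the flow matrix from time t to t+1."""
    P = forward_matrix(f_t)
    n = len(p_t)
    return [sum(P[i][j] * p_t[i] for i in range(n)) for j in range(n)]


def propagate_backward(p_t, f_prev):
    """p^{t-1} = P^{t-1,bw} p^t, using the flow matrix from time t-1 to t."""
    P = backward_matrix(f_prev)
    n = len(p_t)
    return [sum(P[i][j] * p_t[j] for j in range(n)) for i in range(n)]


# Hypothetical flow between two MCs over one time step.
f_toy = [[0.3, 0.1], [0.0, 0.6]]
fw = propagate_forward([1.0, 0.0], f_toy)   # where do cells of MC 0 go?
bw = propagate_backward([0.0, 1.0], f_toy)  # where do cells of MC 1 come from?
```

Starting from a distribution concentrated on one MC or cell type and iterating these steps traces its inferred descendants (forward) or ancestors (backward) through time.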

Vein plots

To summarize flows as shown in the “vein plots” of Figures 4 and 7, we group cells based on their annotated cell type and time bin. We then summarize flows between cell types at each time point (where the total flow between subsequent time points is 1). We smooth the total frequency of each cell type over time using local polynomial regression and create “veins” whose changing width is proportional to cell type frequency. Flows between cell types are simplified by eliminating low magnitude flows (using a threshold of 0.005), and we visualize each edge with a width that is proportional to flow magnitude.

Unique genes specific to only one cell type transition

To generate Figure 5A, we selected for each cell type the ancestral cell type contributing most to it according to the flows. For each such cell type transition $T_1 \to T_2$ we calculated gene expression profiles

$$e_g^1 = \frac{\sum_{i \in T_1, j \in T_2} e_{gi} F_{ij}}{\sum_{i \in T_1, j \in T_2} F_{ij}}, \qquad e_g^2 = \frac{\sum_{i \in T_1, j \in T_2} e_{gj} F_{ij}}{\sum_{i \in T_1, j \in T_2} F_{ij}}$$

corresponding to bulk expression of cells before and after the transition $T_1 \to T_2$. Denote by $lf(g; T_1 \to T_2) = \log_2(\varepsilon + e_g^2) - \log_2(\varepsilon + e_g^1)$ the fold change corresponding to that transition ($\varepsilon = 10^{-5}$). We then selected genes that are differentially expressed along a transition, i.e., that satisfy

$$|\log_2(\varepsilon + e_g^2) - \log_2(\varepsilon + e_g^1)| > 1, \qquad \log_2(10^{-5} + \max(e_g^1, e_g^2)) \ge -15$$

for at least one transition $T_1 \to T_2$. Among these genes, we retained genes that are uniquely differentially expressed along one transition, i.e., that (i) have a gap

$$\max_{(T_1 \to T_2)} lf(g; T_1 \to T_2) > 0.5 + lf(g; T_1' \to T_2')$$

for all transition pairs $(T_1' \to T_2')$ not equal to the maximizing pair and (ii) for which the second largest fold change along a transition $(T_1' \to T_2')$ is smaller than 1. Figure 5A shows $lf(g; T_1 \to T_2)$ for all included genes and included pairs $T_1, T_2$.
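The flow-weighted expression profiles and the selection criteria can be sketched for a single gene as follows. Function names and toy values are hypothetical; the thresholds are those given above, with the sign and direction of the expression threshold read as "at least −15".

```python
import math


def transition_expression(e, F, T1, T2):
    """Flow-weighted expression of one gene before (e1) and after (e2) the transition T1 -> T2.

    e[i] is the gene's UMI fraction in MC i; F[i][j] is the summed MC-to-MC flow.
    """
    pairs = [(i, j) for i in T1 for j in T2]
    total = sum(F[i][j] for i, j in pairs)
    if total == 0:
        return None
    e1 = sum(e[i] * F[i][j] for i, j in pairs) / total
    e2 = sum(e[j] * F[i][j] for i, j in pairs) / total
    return e1, e2


def log_fold(e1, e2, eps=1e-5):
    return math.log2(eps + e2) - math.log2(eps + e1)


def is_differential(e1, e2, eps=1e-5, min_lf=1.0, min_log_expr=-15.0):
    """Fold-change and minimal-expression criteria for a single transition."""
    return abs(log_fold(e1, e2, eps)) > min_lf and math.log2(eps + max(e1, e2)) >= min_log_expr


def is_unique_transition_gene(lf_by_transition, gap=0.5, second_max=1.0):
    """True when the top fold change leads every other transition by more than `gap`
    and the second-largest fold change stays below `second_max`."""
    ranked = sorted(lf_by_transition.values(), reverse=True)
    if len(ranked) < 2:
        return False
    return ranked[0] > gap + ranked[1] and ranked[1] < second_max


# Hypothetical gene over four MCs; cell type T1 = MCs {0, 1}, T2 = MCs {2, 3}.
e_toy = [0.001, 0.001, 0.004, 0.008]
F_toy = [[0, 0, 0.1, 0], [0, 0, 0, 0.3], [0, 0, 0, 0], [0, 0, 0, 0]]
e1_toy, e2_toy = transition_expression(e_toy, F_toy, T1=[0, 1], T2=[2, 3])
lf_toy = log_fold(e1_toy, e2_toy)
```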

Mesoderm combinatorial TF expression and kinetic analysis

We filtered 63 mesodermal transcription factors that display high absolute expression as well as high variance within the mesoderm, i.e. (with $\varepsilon = 3 \cdot 10^{-5}$),

$$\log_2\big(\varepsilon + \max_{i \in meso} e_{gi}\big) \ge -12, \qquad \Delta_{max}(g) > 2,$$

with $\Delta_{max}(g) = \log_2(\varepsilon + \max_{i \in meso} e_{gi}) - \log_2(\varepsilon + \min_{i \in meso} e_{gi})$. These factors were visualized on a heatmap over all mesoderm MCs.

To estimate the kinetics of a TF $g$ over time, we selected the mesodermal MCs where it is highly expressed,

$$\big(\log_2(\varepsilon + e_{gi}) - \log_2(\varepsilon + \min_{i \in meso} e_{gi})\big) > 0.8\,\Delta_{max}(g).$$

Flows were then propagated through these MCs. The kinetics of selected transcription factors along their propagated flows is shown in Figure 5C.

TF linear regression models

We identified genes that have at least an 8-fold difference between minimal and maximal gene expression within the mesoderm. Gene expression levels of highly variable genes (that are not transcription factors) were predicted in terms of transcription factor levels using standard linear regression. For each variable gene, the most predictive single transcription factor and the most predictive pair of transcription factors were reported (Figure 5D). Because $R^2$ values of linear regressions always increase when increasing the number of explanatory variables, we performed a permutation test for comparing gene expression predictions from two transcription factors with predictions from one transcription factor. To this end, we first shuffled the TF expression matrix. We then used the best fitting single TF in the original data as a fixed anchor and estimated the improvement in $R^2$ derived when adding to this anchor the best fitting shuffled TF vector. A comparison of the gain in $R^2$ in the unshuffled matrix and shuffled matrix provided control for the over-fitting we perform when extending the model.

Cell type annotation of chimera cells

We constructed a common balanced K-nn graph consisting of 33,889 cells from the wild-type atlas and 4,284 cells from seven Foxc1/2 KO chimera embryos. Using the common K-nn graph, we associate to each cell $i$ a probability vector $p_i$ counting the fraction of cells from each cell type among its 50 nearest neighbor cells, i.e.,

$$p_i(\text{cell type } A) = \frac{\#\ \text{wild-type neighbor cells from cell type } A}{\#\ \text{wild-type neighbor cells}}.$$

Note that we can calculate $p_i$ for both wild-type and chimera cells, while only wild-type cells come with a cell type tag. Using $p_i$, we then assigned each chimera cell to a cell type by projecting it on the wild-type cell $j$ whose $p_j$ is most correlated with $p_i$. A MC 2D projection of the combined single-cell graph is shown in Figure S6C. Chimera cells are colored according to their inferred cell type; wild-type cells are plotted in gray.
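A small sketch of this label transfer is given below, assuming the K-nn graph is available as a mapping from each cell to its neighbor list. All names and the toy graph are hypothetical, and ties between equally correlated wild-type cells are broken arbitrarily.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def neighbor_type_fractions(neighbors, wt_type, cell_types):
    """Fraction of each cell type among the wild-type neighbors of one cell."""
    labels = [wt_type[n] for n in neighbors if n in wt_type]
    if not labels:
        return [0.0] * len(cell_types)
    return [labels.count(ct) / len(labels) for ct in cell_types]


def annotate_chimera_cells(knn, wt_type, cell_types):
    """Assign each unlabeled cell the type of the wild-type cell with the most correlated vector."""
    frac = {c: neighbor_type_fractions(nb, wt_type, cell_types) for c, nb in knn.items()}
    wt_cells = [w for w in wt_type if w in frac]
    result = {}
    for cell in knn:
        if cell in wt_type:
            continue
        best = max(wt_cells, key=lambda w: pearson(frac[cell], frac[w]))
        result[cell] = wt_type[best]
    return result


# Hypothetical toy graph: six wild-type cells (w*) and two chimera cells (k*).
types_toy = ["meso", "ecto", "endo"]
wt_type_toy = {"w1": "meso", "w2": "meso", "w3": "ecto", "w4": "ecto", "w5": "endo", "w6": "endo"}
knn_toy = {
    "w1": ["w2"], "w2": ["w1"], "w3": ["w4"], "w4": ["w3"], "w5": ["w6"], "w6": ["w5"],
    "k1": ["w1", "w2", "w3"],
    "k2": ["w5", "w6", "w4", "k1"],
}
chimera_types_toy = annotate_chimera_cells(knn_toy, wt_type_toy, types_toy)
```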

Timing of KO and host cells within chimera embryos

We used the balanced K-nn graph of the combined chimera and wild-type embryo dataset to time host and KO cells from each chimera embryo separately. More precisely, for each cell $i$ let $NN(i)$ be the nearest neighbor wild-type cell that is not from the same embryo as cell $i$. We then attach to each cell $i$ the intrinsic rank (between 1 and 153) of the embryo of its nearest neighbor cell $NN(i)$,

$$t_i = R_{NN(i)}.$$

We noted that simply computing the mean embryo time of NN cells as the representation of chimera cell population timing would introduce biases at the boundaries of the analyzed time interval (e.g., very early embryos cannot have neighbors in even earlier embryos). We therefore refined the analysis of NN times per knockout and host cells as follows. For a selection $S$ of cell types, for instance, embryonic mesoderm or ectoderm, we define for each chimera embryo $E$ the cumulative distributions $q_E^{S,KO}(t)$ and $q_E^{S,host}(t)$ of nearest neighbor time stamps $t_i$ for all KO/host cells $i$ from the chimera embryo $E$ and selected cell types $S$. For each wild-type embryo $E'$ we can compute an analogous nearest neighbor cumulative distribution $p_{E'}^{S}(t)$. By finding for each $q_E^{S,KO}(t)$ or $q_E^{S,host}(t)$ the best correlated wild-type $p_{E'}^{S}(t)$, we can match each chimera embryo $E$ with a wild-type embryo $E'$, separately for KO and host cells. This procedure is used to independently time KO and host cells from ecto-/endoderm, embryonic mesoderm and extraembryonic mesoderm. The group of ecto-/endoderm contains foregut, definitive endoderm, primitive node, definitive ectoderm, surface ectoderm, rostral neural plate, caudal neural plate and caudal epiblast. Embryonic mesoderm contains early nascent, late nascent, caudal, lateral & intermediate, paraxial, and rostral mesoderm as well as cardiac crescent and cardiomyocyte cells. Extraembryonic mesoderm consists of allantois, amnion/chorion, and extraembryonic mesoderm cells.
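The matching of cumulative nearest-neighbor time distributions can be sketched as follows, assuming the nearest-neighbor ranks $t_i$ have already been collected per cell population. Function names, the reduced number of ranks, and the toy data are hypothetical.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def cumulative_distribution(ranks, n_ranks=153):
    """Cumulative fraction of cells whose nearest-neighbor embryo rank is <= r, for r = 1..n_ranks."""
    counts = [0] * (n_ranks + 1)
    for r in ranks:
        counts[r] += 1
    running, cdf = 0, []
    for r in range(1, n_ranks + 1):
        running += counts[r]
        cdf.append(running / len(ranks))
    return cdf


def best_matching_embryo(query_ranks, wt_ranks_by_embryo, n_ranks=153):
    """Wild-type embryo whose cumulative distribution is most correlated with the query's."""
    q = cumulative_distribution(query_ranks, n_ranks)
    return max(wt_ranks_by_embryo,
               key=lambda e: pearson(q, cumulative_distribution(wt_ranks_by_embryo[e], n_ranks)))


# Hypothetical toy data with 10 embryo ranks instead of 153.
wt_ranks_toy = {"E3": [2, 3, 3, 4], "E7": [6, 7, 7, 8]}
ko_ranks_toy = [6, 6, 7, 8]  # nearest-neighbor ranks of KO cells from one chimera embryo
matched_toy = best_matching_embryo(ko_ranks_toy, wt_ranks_toy, n_ranks=10)
```

Comparing whole distributions rather than means avoids the boundary bias noted above, because a wild-type embryo near the edge of the time window shows the same truncation in its own nearest-neighbor distribution.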

Differential expression analysis

We calculated bulk expression profiles $e_{g,+}^{\mathrm{KO}}$ and $e_{g,+}^{\mathrm{WT}}$ for all KO chimera and WT cells from cell types with high Foxc1/2 expression (rostral, paraxial, and caudal mesoderm) and defined the log fold change $\delta_g^{+} = \log_2(e_{g,+}^{\mathrm{KO}} + \varepsilon) - \log_2(e_{g,+}^{\mathrm{WT}} + \varepsilon)$ for $\varepsilon = 5 \cdot 10^{-5}$. As a negative control, we averaged bulk expression profiles $e_{g,i}^{\mathrm{KO}}$ and $e_{g,i}^{\mathrm{WT}}$ per cell type i and gene g over all cell types not expressing Foxc1/2 (Amnion/Chorion, Allantois, Caudal epiblast, Caudal neural plate, Definitive ectoderm, Definitive endoderm, Epiblast, ExE mesoderm, Node/Notochord, PS, Rostral neural plate, and Surface ectoderm) and calculated the log fold change $\delta_g^{-}$ between the two:

$$\delta_g^{-} = \log_2\left(\mathrm{mean}_i\left(e_{g,i}^{\mathrm{KO}}\right) + \varepsilon\right) - \log_2\left(\mathrm{mean}_i\left(e_{g,i}^{\mathrm{WT}}\right) + \varepsilon\right).$$

To identify candidate differentially expressed genes, we retained genes satisfying

$$|\delta_g^{+}| > 0.7 \quad \text{and} \quad |\delta_g^{+} - \delta_g^{-}| > 0.5,$$

i.e., genes with sufficiently high differential expression among Foxc1/2-expressing cell types that was absent among the non-Foxc1/2-expressing cell types. From the remaining list of 27 genes we further removed Igf2 and H19, whose opposite differential expression was attributed to loss of parental imprinting.
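A minimal sketch of this filter is given below. The function names and the array layout are hypothetical, the regularization constant is assumed to be 5·10⁻⁵, and the two conditions are assumed to be combined with a logical "and", as the description above implies. The removal of Igf2 and H19 is a separate manual step and is not included.

```python
import numpy as np

EPS = 5e-5  # regularization constant epsilon (assumed to be 5 * 10^-5)


def log_fold_change(e_ko, e_wt, eps=EPS):
    """Regularized log2 fold change between KO and WT bulk expression."""
    return np.log2(np.asarray(e_ko) + eps) - np.log2(np.asarray(e_wt) + eps)


def candidate_genes(genes, e_ko_pos, e_wt_pos, e_ko_neg, e_wt_neg,
                    min_lfc=0.7, min_gap=0.5):
    """Return genes passing both thresholds.

    e_ko_pos, e_wt_pos -- bulk expression per gene over Foxc1/2-expressing
                          cell types (length G)
    e_ko_neg, e_wt_neg -- bulk expression per gene and control cell type
                          (shape G x T), averaged over cell types here
    """
    d_pos = log_fold_change(e_ko_pos, e_wt_pos)
    d_neg = log_fold_change(np.mean(e_ko_neg, axis=1), np.mean(e_wt_neg, axis=1))
    # Keep genes that change in Foxc1/2-expressing types but not in controls.
    keep = (np.abs(d_pos) > min_lfc) & (np.abs(d_pos - d_neg) > min_gap)
    return [g for g, k in zip(genes, keep) if k]
```

The second condition removes genes that differ between KO and WT cells everywhere, for example because of line-specific effects unrelated to Foxc1/2.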

Additional resources

To make the data accessible to all users, we have developed an online interactive data exploration interface (https://tanaylab.weizmann.ac.il/embflow).

Acknowledgments

We thank the Tanay and Stelzer group members for discussion and critical reading of the manuscript, and Hernan Rubinstein for graphic work. Y.S. is the incumbent of the Louis and Ida Rich Career Development Chair and is supported by Moross Integrated Cancer Center, the Israel Cancer Research Fund (ICRF), Helen and Martin Kimmel Stem Cell Institute, Hadar Impact Fund, Lord Sieff of Brimpton Memorial Fund, Janet and Steven Anixter, JoAnne Silva and Lester and Edward Anixter Family, Yeda-Sela Center, Barry and Janet Lang, European Research Council (ERC_StG 852865), ISF (1610/18), the Minerva Foundation, and Human Frontier Science Program (CDA00023/2019-C). S.C. was supported by the EMBO long-term fellowship (ALTF 268-2018). M.M. was a postdoctoral fellow of the Minerva Stiftung and is supported by the Walter Benjamin program of the German Research Foundation (DFG). A.T. is supported by the European Research Council (ERC CoG scAssembly), the EU BRAINTIME project, the Israel Science Foundation, and the Chen-Zuckerberg Foundation. This research was further supported by the Israeli Council for Higher Education (CHE) Data Science program and by a grant from Madame Olga Klein-Astrachan.

Author contributions

M.M., Y.M., A.T., and Y.S. conceived and designed the experiment and performed data analysis and its interpretation. M.M. developed the flow algorithms with help from A.T. M.M., Y.M., R.B.-Y., A.T., and Y.S. annotated cell types with the help of A.U. Embryology and single-cell processing were performed by Y.M., S.C., R.B.-Y., and R.H. with input from Z.M. and the help of N.R., L.L., A.L., and E.C. Y.M. generated the Foxc DKO cell line with the help of Y.R. and S.C. A.-H.O. performed chimera injections and supervised animal handling. The interactive web interface was generated by A.L. with help from M.M. and A.T. Y.M., M.M., A.T., and Y.S. wrote the manuscript with input from all the authors.

Declaration of interests

The authors declare no competing interests.

Published: April 30, 2021

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.cell.2021.04.004.

Contributor Information

Amos Tanay, Email: amos.tanay@weizmann.ac.il.

Yonatan Stelzer, Email: yonatan.stelzer@weizmann.ac.il.

Supplemental information

Table S1. Summary characteristics of the 153 embryos making up the transcriptional manifold, related to Figure 1
mmc1.xlsx (21.1KB, xlsx)
Table S2. Processed metacell relative gene expression data, related to Figure 2
mmc2.xlsx (30MB, xlsx)

Data Availability Statement

The accession number for the raw and processed data reported in this paper is GEO: GSE169210.

Code has been deposited at https://github.com/tanaylab/embflow, https://doi.org/10.5281/zenodo.4646177.

Interactive independent analysis can be performed at https://tanaylab.weizmann.ac.il/embflow.
