Summary
Single-cell RNA sequencing (scRNA-seq) provides a leap forward in resolving cellular diversity and developmental trajectories but fails to comprehensively delineate the spatial organization and precise cellular makeup of individual embryos. Here, we reconstruct from scRNA-seq and light sheet imaging data a canonical digital embryo that captures the genome-wide gene expression trajectory of every single cell at every cell division in the 18 lineages up to gastrulation in the ascidian Phallusia mammillata. By using high-coverage scRNA-seq, we devise a computational framework that stratifies single cells of individual embryos into cell types without prior knowledge. Unbiased transcriptome data analysis mapped each cell’s physical position and lineage history, yielding the complete history of gene expression at the genome-wide level for every single cell in a developing embryo. A comparison of individual embryos reveals both extensive reproducibility between symmetric embryo sides and a large inter-embryonic variability due to small differences in embryogenesis timing.
Keywords: single-cell RNA sequencing, light sheet imaging, ascidian, gene expression noise, cell type classification, spatial reconstruction, lineage reconstruction, embryogenesis, cell fate specification
Graphical Abstract
Highlights
-
•
Integration of scRNA-seq and imaging data yield a canonical digital embryo
-
•
Cell type classification without prior knowledge
-
•
De novo reconstruction of the lineage history and spatial organization of the embryo
-
•
Timing differences contribute to inter-embryo variability in gene expression
The complete gene expression history and physical position of every single cell at every cell division in an embryo up to gastrulation is reconstructed using a combination of single-cell transcriptomics and light sheet imaging.
Introduction
One of the most fundamental problems of biology is how the many different cell types of a multicellular organism are generated from a single-celled fertilized egg during embryonic development. Studying the developmental mechanisms driving this cellular diversification at the single-cell, genome-wide, and whole-embryo level has remained challenging. So far, most studies have focused on either a limited number of marker genes or a few select cell lineages. Consequently, we currently lack a comprehensive understanding of the gene expression programs that instruct individual cells to acquire all the cell fates necessary to build an embryo.
Although the advent of single-cell transcriptomics was a major advance to stratify organisms into different cell populations (Cao et al., 2017, Karaiskos et al., 2017), the information of where individual cells originally came from as well as their past or future expression trajectory are typically lost. Indeed, current approaches rely on prior knowledge to reconstruct the spatial organization of embryos (Satija et al., 2015, Achim et al., 2015) or physically sectioning the embryos (Combs and Eisen, 2013, Junker et al., 2014, Peng et al., 2019). Although complex spatial gene expression patterns have been inferred from single-cell RNA sequencing (scRNA-seq) with limited prior information (Karaiskos et al., 2017), a recent seminal study demonstrated that such de novo reconstructions do not necessarily rely on prior knowledge (Nitzan et al., 2019). Assessing different developmental stages enabled the reconstruction of lineage histories (Scialdone et al., 2016, Telley et al., 2016, Tusi et al., 2018, Rosenberg et al., 2018, Briggs et al., 2018, Wagner et al., 2018, Farrell et al., 2018, Cao et al., 2019). Despite these recent advances, an in toto representation of embryonic development accounting for every single cell in space and time has not been achieved.
To unbiasedly reconstruct embryonic development from scRNA-seq data, we exploited the advantageous properties of the chordate Phallusia mammillata (Conklin, 1905, Zalokar and Sardet, 1984). This ascidian combines the genomic complexity and embryonic cell diversity of a vertebrate with a relatively small total number of cells stereotypically segregating into lineages (Figure 1A) in an optically transparent embryo (Corbo et al., 2001). Indeed, cell fates of the future nerve chord, brain, germ cells, blood precursors, and muscles are already specified at the 64-cell stage (Nishida, 1987) (Figure 1B), at which point endoderm cells begin to deform to initiate gastrulation (Sherrard et al., 2010).
Making use of the stereotypic chordate development, we combine high-resolution single-cell transcriptomics and light sheet imaging to generate a comprehensive four-dimensional (4D) atlas of embryonic gene expression in every cell for each cell division up to gastrulation in P. mammillata. By using high-coverage scRNA-seq, we devised a computational framework that identifies cell types within individual embryos without prior knowledge. Moreover, we designed methods that unbiasedly infer the spatial coordinates and mother-daughter relatedness of every embryonic cell directly from their transcriptome data. The integration and quantification of in toto cell shape reconstructions from 4D imaging and scRNA-seq data uncovered the patterned expression of specific protocadherins. Comparing high-resolution gene expression datasets from individual embryos revealed both extensive reproducibility between the bilaterally symmetric embryo sides and a large degree of inter-embryonic variability. We anticipate that the digital chordate embryo we report here will be a rich resource to mine the molecular mechanisms that instruct the patterning of entire organisms (the sequencing and imaging data are deposited in publicly available repositories and can be explored at http://digitalembryo.org). Our results demonstrate that the unbiased mapping of scRNA-seq data by the MorphoSeq framework yields a spatiotemporally resolved atlas of gene expression at the single-cell level in a developing embryo and links it to morphological features.
Results
Developmental Transcriptome of P. mammillata
Due to the lack of an annotated published P. mammillata transcriptome, we first set out to generate a high-quality de novo transcriptome assembly comprising all mRNAs expressed during embryonic development by sampling 15 developmental stages ranging from unfertilized eggs to hatching larvae. To circumvent the high degree of genomic polymorphism present in ascidians (Dehal et al., 2002), we devised a strategy of consensus building in the peptide space to consolidate 30 different assemblies into 12,945 gene models (Figures S1A and S1B; STAR Methods). We then analyzed the temporal gene expression changes at the whole-embryo level (Figures S1C and S1D). Only a handful of genes started to be expressed at the 8-cell stage (Figure S1E), increasing to tens of genes activated at the 16-cell stage (Figure S1F). As expected, the number of expressed genes increased during further development (Figures S1G and S1C).
High-Resolution scRNA-Seq Captures the Bilateral Symmetry of the Embryo
Aiming to precisely resolve the spatial and temporal complexity of gene regulation during embryonic development, we thus turned to scRNA-seq. We collected single cells from individual embryos dissociated at 2- to 64-cell stages (Table S1) and, importantly, kept track of the embryo they belonged to. Due the very high fraction of cells profiled per individual embryo (100% up to the 16-cell stage and up to 90% for 32- and 64-cell stages), we can thus assess the degree of cell-to-cell variability both within a single embryo and between individuals (Figure 1C). The bilateral symmetry of P. mammillata provided an ideal internal control: in principle, symmetric pairs of cells from the left and right sides of the embryo should have highly similar expression profiles. In total, 1,042 cells from 58 embryos were analyzed at high depth, generating 6.65 billion reads and quantifying the expression of 8,542 ± 272 genes in individual cells (Figures S1H–S1L), including orthologs for 100% of the annotated Ciona transcription factors, zinc finger proteins, and signaling ligands expressed during embryogenesis (Imai et al., 2004). Overall, the resolution of the scRNA-seq was comparable to bulk mRNA-seq results confirming its very high depth.
At the 4-cell stage, two symmetric pairs of cells could already be distinguished in all embryos (Figures 1D–1F), with a set of 27 maternal mRNAs being asymmetrically segregated between the two cell types (Figure 1G). We leveraged the ability of non-negative matrix factorization (NMF) to cluster samples in an unsupervised way (Lee and Seung, 1999) to single out in a 16-cell embryo 5 groups of cells that were either pairs or consisted of an even number of cells (Figures 1H and S1M). These results suggested that our transcriptome resolution adequately captured the bilateral symmetry of the embryo. We challenged this notion at the more complex 64-cell stage where major lineage decision events have taken place (Imai et al., 2006). Using a 64-cell embryo with transcriptome information for 90% of cells, we could identify 7 symmetric cell pairs and 6 clusters formed by 4 or more cells (Figures 1I–1K).
Cell Type Determination without Prior Knowledge
We proceeded to stratify embryos into different cell types. At the 2-cell stage, individual single-cell gene expression profiles clustered by embryo (Figure S2A), whereas two cell types were apparent in 4-cell embryos (Figure S2B). At the 8-cell stage, 3 cell types could be distinguished in all embryos with the further asymmetric segregation of maternal mRNAs (Sardet et al., 2005) in a single cell pair (Figures S2C and S2D). Five cell types were singled out either by clustering expression profiles or by NMF in individual 16-cell embryos (Figures 1H, S2E, and S2F). Aiming to generalize cell type classification to more complex embryos, we designed the framework “single-cell expression classification through iterations of NMF” (SCECTION). Cells from a given embryo were iteratively split in subsets until minimal variation in gene expression remained within the subset (STAR Methods; Figure S2G). Subsets from different embryos at the same stage were then merged to ensure complete cell coverage and consistency across embryos, allowing us to define unique cell types. SCECTION could divide the 58 cells of a 64-cell embryo into 17 categories, including 10 symmetric cell pairs and 4 groups of 4 cells (Figure 2A). Although cells of 32- and 64-cell embryos were not readily clustered (Figures S2H and S2I), SCECTION identified 5, 8, and 18 different signatures at the 16-, 32-, and 64-cell stages, respectively (Figures 2B and S2J–S2L). This indicated a striking degree of diversification of single-cell gene expression profiles from the 16-cell stage onward. For each stage, we could derive a canonical embryo by integrating scRNA-seq data from multiple individual stage-matched embryos. Among the genes exhibiting cell-type-specific expression profiles were transcription factors and signaling ligands (Figures 2C–2E). In conclusion, our dataset allowed us to unbiasedly identify 18 different cell fates in 64-cell stage embryos.
Temporal Mapping of scRNA-Seq Expression Profiles
Due to the stereotypic nature of ascidian development, we hypothesized that the global gene expression profile of every cell might contain enough information to infer its lineage history in an unbiased manner. Possessing scRNA-seq data for every round of cell division allowed us to investigate the relatedness of gene expression profiles between mother and daughter cells. Some genes, among them the transcription factors AP-2-like2, SoxB1, FoxD-a/b, Fkh, MSH-B, Tbx6b, and Snail, increased their expression in specific cell types as development progressed (Figure 3A). Because asymmetric mRNA segregation occurred only in germ cells, we hypothesized that transcriptomes of mother and daughter cells should, in general, have a high degree of similarity unless cell fate specification occurs, which should involve the repression of pluripotency or the activation of lineage-specific genes in one or both daughter cells. We could match cell types unambiguously with one or more cell types at the next division (Figures S3A and S3B). This implied that daughter cells inherited the gene expression makeup of their mothers (Figures S3C–S3G), and we could distinguish specific sister cells by the differential expression of key genes. By iteratively matching all daughter cells with their respective mothers, we were able to delineate a complete lineage tree of the cell types identified by scRNA-seq (Figure 3B). This lineage was fully compatible with the actual developmental history of the corresponding embryonic cells (Figures S3H and S3I). A key assumption in our reconstruction was the continuity of gene expression between mothers and daughters. RNA velocity analysis (La Manno et al., 2018) supported the direction of the reconstructed trajectories, i.e., an increase in the expression of marker genes as development progressed (Figure 3C). The only notable exception was the germ cell lineage, which was transcriptionally silent up to the 64-cell stage. Therefore, cell fate specification proceeded mainly through the stepwise upregulation of specific gene sets that then maintained their expression at subsequent developmental stages.
Spatial-Temporal Quantification of Early Development by 4D Imaging
To precisely capture spatial cell dynamics in the developing embryo, we performed in toto high-resolution imaging of P. mammillata embryonic development from the 2- to 64-cell stage by using multi-view light sheet microscopy (Krzic et al., 2012). Selective plane illumination microscopy (SPIM) videos of embryos expressing fluorescent nucleus- and membrane-targeted probes were segmented, and individual cells were tracked (Figures S4A and S4B; Videos S1, S2, and S3; STAR Methods). The quantification of our 4D imaging data yielded a complete digital representation of embryonic development up to gastrulation, quantifying the temporal evolution of cell positions and volumes, membrane shapes, and cell-cell neighbor contacts with a 2-min temporal resolution. Owing to the lack of cell migration in the early embryo, the spatial position of each cell was closely correlated with its lineage history, such that cells arising from sister lineages remained in spatial proximity (Figures S4C and S4D). We noticed a very strong asymmetric volume partitioning in one lineage from the 16-cell stage, yielding the future germ cells that were 8-fold smaller than the average cell at the 64-cell stage (Figure S4E). The enrichment of the 27 maternal factors relative to other genes in the germ cell lineage (Figure S4F; STAR Methods) allowed us to infer that cell divisions in that lineage gave sister cells of unequal volume, a prediction validated by the imaging data. Moreover, the total mRNA numbers of the maternal factors were constant up to the 64-cell stage (Figure S4G), which is consistent with our bulk mRNA-seq data. Strikingly, this was the sole example of asymmetric segregation of mRNAs during cell division in our dataset. Interestingly, germ cell specification was also accompanied by an enrichment of mitochondrial transcripts (Figure S4H), consistent with the known asymmetric partitioning of mitochondria into this lineage (Zalokar and Sardet, 1984).
Spatial Mapping of scRNA-Seq Expression Profiles
Next, we set out to map single-cell gene expression profiles onto their spatial position in the embryo at each developmental stage. Performing this analysis for all stages would yield a spatiotemporally resolved transcriptome atlas at the single-cell level. We reasoned that the local signaling environment of individual cells might impact their gene expression and, therefore, enable us to determine their relative positions in the embryo. We first focused on the 16-cell stage when all cells (except B5.2 cells, which will give rise to the future germ cells) were transcriptionally active. To recover the spatial coordinates of the cells, we devised a method relying on principal-component analysis to identify the axes that best discriminate the cell types of the embryo. At the 16-cell stage, this method defined two axes, namely, Axis161 and Axis162. The projection on these two axes of the expression profiles of all 16 cells of one 16-cell embryo separated the 5 cell types determined by SCECTION (Figure 4A). Axis161 mapped the anterior-posterior axis of the embryo, with the posterior position of the germ cells defining its orientation. Moreover, Axis162 split the embryo in two groups of eight cells, faithfully distinguishing the animal and vegetal poles of the embryo. Notably, the relative positions of the projections of single-cell expression profiles in the Axis16 space were strikingly similar to the corresponding cell’s physical positions within the 16-cell embryo (Figures 4A and 4B). These spatial axes were reproducibly recovered for each of the 11 analyzed individual 16-cell stage embryos (Figures S4I and S4J). Among the top genes contributing to Axis161 (antero-posterior axis), the transcription factor Fkh had graded expression from anterior to posterior (Figure S4K). Interestingly, Wnt-5 and the Wnt-degrading metalloprotease Tiki1 (Zhang et al., 2012) formed opposite gradients along the antero-posterior axis (Figure S4L). Indeed, Wnt is critical for antero-posterior axis specification (Petersen and Reddien, 2009). The posterior B5.2 cells inherit maternal factors that establish cellular asymmetry during early development (Negishi et al., 2007), among them Wnt-5 and the metalloprotease Tolloid. Overall, the combination of scRNA-seq and imaging data allowed the mapping of single-cell gene expression profiles onto their respective spatial locations at the 16-cell stage.
The same map might be used to determine the relative position of daughter cell types at the more complex 32-cell stage. The projections of single-cell transcriptomes of a 32-cell embryo onto the Axis16 space defined by 16-cell stage gene expression resulted in the separation of cells into the SCECTION-identified cell types and also faithfully recapitulated the spatial organization of the embryo (Figures 4C and 4D). Notably, Axis162, defined at 16-cell stage, separated daughters of A5.1, A5.2, and B5.1 into the respective cell clusters at the 32-cell stage, consistent with the SCECTION results (Figure S4M). A closer inspection of expression differences between sister pairs showed that A6.2, A6.4, and B6.2 cells had lower levels of vegetal-pole-specific transcription factors than their sister cells (Figures S4N and S4O). This finding is consistent with the origin of these sister cells from the orientation of the cell division axis that pushed these cells toward the animal pole (Figures S4P and S4Q).
As the expression of signaling ligands dynamically changed during embryogenesis (Figure 2E), it stands to reason that the axes determined at an earlier stage will no longer be able to discriminate sister cell types as development progresses. Indeed, projecting all cell types onto the Axis16 space did not resolve the relative positions of the vegetal pole cell types at the 64-cell stage (Figure S4R). However, new axes defined using the variance in gene expression at the 32- and 64-cell stages in the A and B lineages enabled us to map these cell types (Figures 4E and 4F). Although one axis matched the antero-posterior axis, the other axis was along the left-right axis of the vegetal pole. Despite a large number of identified cell types, the dimensionality of the data was rather small (Figures S4S and S4T). In summary, each cell type could be mapped onto its physical position in the embryo for each developmental stage (Figures 4B, 4D, and 4G).
We then validated the expression territories of key classifier genes by using whole-mount in situ hybridization (ISH) (Figures S5A–S5F). Moreover, our expression territories were fully compatible with reported expression patterns both in Phallusia (Madgwick et al., 2019) or Ciona (Imai et al., 2004). Notably, comparing our data with scRNA-seq data in Ciona by Treen et al. (2018), we found that there was conservation of the asymmetric division of maternal factors in the germ cell lineage (Figure S5G), the expression of marker genes (Figure S5H), Cyclin B3 (Figures S5I and S5J), and Fkh (Figure S5K). All these observations were in accordance with the conservation of expression patterns in ascidians (Wada et al., 1998).
Thus, the single-cell expression profiles contained intrinsic information that allowed us to objectively deduce their original spatial position within the embryo. The reconstruction of both spatial position and history of single-cell transcriptomes yielded a complete unbiased delineation of genome-wide expression trajectories for every single cell of the embryo up to gastrulation.
Integration of Imaging and scRNA-Seq Data Identifies Patterned Expression of Cell Adhesion Molecules
We next integrated the spatiotemporal mapping of averaged single-cell transcriptomes from multiple embryos with the 4D lineage obtained by SPIM imaging to create a canonical “virtual embryo” that captures the interrelationship of gene expression, cell morphology, and lineage history for every single cell throughout development (http://digitalembryo.org). To intuitively demonstrate the power of this concept, we automatically generated 4D movies of virtual whole-mount ISHs to capture dynamic gene expression patterns at a single-cell resolution (see Video S4 for an example and http://digitalembryo.org for all 12,945 genes).
Our quantification of 4D SPIM imaging data detected a decrease of the relative apical surface (i.e., the surface facing the environment) of animal cells compared to vegetal cells at the 16- and 32-cell stages (Figure 5A). Although cell volumes and cell surfaces were highly correlated (Figure S6A), the gradual decrease in the relative apical surface of animal cells occurred at a constant apical perimeter (Figures S6B and S6C). We turned to our scRNA-seq results for genes with differential expression between animal and vegetal cells that might explain these morphological changes. Interestingly, an analysis of cell adhesion molecules showed that the embryo became progressively stratified into distinct spatial territories through dynamic expression of different cadherin family proteins (Figures 5B and S6D). We defined a measure of the homophilic cadherin cell-cell contacts, the cadherin contact strength, as the product of the relative cadherin expression in neighboring cells and their apical contact length. The computed Protocadherin-11 X-linked contact strength was anti-correlated with the relative apical surface of animal cells at 16- and 32-cell stages (Figure 5C). Heterogeneity in the relative apical surface of vegetal cells occurred primarily at the 64-cell stage (Figure S6E). Cadherin-7 was upregulated in vegetal cells at the 64-cell stage (Figure S6F), and its computed contact strength was anti-correlated with the relative apical surface of vegetal cells (Figure S6G). Incubating 32-cell embryos in calcium-free seawater abolished the asymmetry in the relative apical surface between embryonic poles (Figures 5D and S6H), validating cadherin involvement in this process. Thus, both animal and vegetal poles had specific cadherin expression patterns that were correlated with changes in cellular apical surfaces. In the future, it would be interesting to connect our findings to other cell biological processes that influence cell shape. Altogether, this analysis exemplifies the strength of combining quantitative imaging data with scRNA-seq to uncover the links between morphology and gene expression.
Variability of Gene Expression
We next mined our dataset to study the extent of noise and variability in gene expression during development, both within single embryos and between individuals. We, thus, projected and overlaid the single-cell expression profiles from various 16- or 32-cell embryos on the Axis16 space (Figures S7A and S7B). Despite robust classification across all embryos, the expression profiles of individual cells exhibited large variation, which was particularly pronounced for the a5.3, a5.4, b5.3, and b5.4 cells and their progeny that give rise to the ectoderm. This spread was clearly a consequence of inter-embryo variability (Figure 6A). To explain this variation, we turned to RNA velocity analysis (La Manno et al., 2018) of individual cells of 16-cell embryos. The velocity field was aligned with the dispersion of animal pole cells in the Axis16 space (Figure 6B), indicating that the genes contributing to the axis were actively upregulated in individual cells.
Thus, this inter-embryo variability was most likely caused by minor differences in the temporal progression of embryonic development at the time of collection. There was an enrichment in unspliced read counts at low expression levels of the animal pole marker AP-2-like2, indicating an increase in expression (Figure 6C). Moreover, AP-2-like2 showed embryo-dependent states of upregulation, but its expression at the later 32-cell stage was not as variable (Figure 6D). There was a concomitant decrease in its relative intron counts, which plateaued at the 32-cell stage (Figure 6E).
Individual cells and individual embryos could upregulate genes in either a coordinated or an uncoordinated manner. Expression profiles of single a5.3-4 cells at the 16-cell stage showed that individual cells expressed progressively more genes and at higher levels (Figure 6F). Genes exhibited three trends of upregulation, namely, one class of genes in which all profiled cells expressed high levels, one class for which the expression levels of profiled cells spanned a larger spectrum, and finally a class of genes that were upregulated in only a small fraction of cells (Figure 6F). There was a general trend of ramping up expression levels across the upregulated genes once they had detectable expression in individual cells (Figures 6F and 6G). This suggested that genes were upregulated with different timing of activation in individual cells in a coordinated manner. To test whether this observation held true across cell types and embryos, we performed a similar analysis by averaging expression profiles for each cell type in individual embryos. We found that both the number of genes and their extent of upregulation increased in individual 16-cell embryos, with all these genes having an expression that plateaued in daughter cell types (Figures 6H and 6I). Moreover, genes that were upregulated in two different cell types were expressed in a highly correlated manner (Figure 6J). Similarly, genes were upregulated in a coordinated and temporally graded manner in all the somatic cell types at the 32-cell stage (Figures S7C–S7E) and in all the 18 cell types at the 64-cell stage (Figures S7F–S7I). In fact, relative changes in log-transformed gene expression levels in individual cells of 16-cell embryos were well approximated by a model of linear variation in time (Figures 6K–6N; STAR Methods), stressing that gene upregulation occurred in a coordinated, highly reproducible, and temporally ordered manner. Thus, profiling single cells from pooled and dissociated embryos (a strategy commonly used by previous studies) would introduce a major source of noise if there are minor differences in staging between individual embryos.
Precision of Gene Expression Regulation
At the genome-wide level, we found a systematically lower degree of intraembryonic gene expression variability than cells randomly sampled from different embryos of the same stage (Figure 7A). Overall, this gene expression variability increased as development progressed (Figure 7B). We then explored the precision of gene expression regulation, taking advantage of both the high sequencing depth and our ability to unambiguously attribute single-cell expression profiles to a specific cell type within the embryo. We first identified 244 marker genes that were differentially expressed between cell types. We then queried the expression level of individual marker genes in cells identified as either marker-positive or -negative according to the average expression levels of the marker in the considered cell type. As expected, the vast majority of marker-positive cells expressed the respective marker genes; however, a small fraction of marker-negative cells also expressed individual marker genes (Figure 7C). For example, the animal pole marker AP-2-like2 was robustly expressed in animal cells but also found at 50-fold lower levels in a fraction of vegetal cells (Figure 7D). Conversely, the vegetal pole marker Fgf9/16/20 was expressed at 20-fold lower levels in a fraction of animal cells (Figure 7D). When looking at all the individual marker genes our unbiased approach had identified, we found a much larger magnitude of upregulation in marker-positive cell types than in marker-negative cell types (Figure 7E). These results suggest that although cells destined for a particular fate sporadically express marker genes of other lineages, this spurious transcription occurs only at 10-fold lower levels and from isolated loci. Thus, embryonic development can accommodate transcriptional noise at low levels.
Finally, the bilateral symmetry of P. mammillata embryos implies that similar cells are present on the left and right sides of individual embryos. A total of 23 cell types encompassed only 2 cells per embryo, and therefore, these cells could be unambiguously identified as bilaterally symmetric cell pairs. Our dataset had 128 such cell pairs (each pair originating from one embryo and belonging to the same cell type). We compared the levels of 305 marker genes with variable expression during embryogenesis with the levels of genes with constitutive expression in all cells. There was less variation in the expression of marker genes in bilaterally symmetric cell pairs from one embryo than in cells belonging to the same cell type but randomly sampled across embryos or cells randomly sampled within an embryo (Figure 7F). This variation was, in fact, comparable to the variation in the expression of genes with constitutive expression across all cells. Although bilaterally symmetric cell pairs were clear outliers compared to other cells of the same embryo (Figure 7G), the vast majority of cell pairs were more similar in their expression of marker genes compared to cells of the same cell type in other embryos (Figure 7H). In fact, ∼50% of cell pairs exhibited a precision in the expression of marker genes comparable to the one of housekeeping genes.
Discussion
Interpreting single-cell transcriptome data is often hampered by the difficulty of matching individual gene expression profiles with the cell’s precise position, local environment, and function within the tissue architecture. Combining high depth scRNA-seq and high coverage of precisely staged individual embryos, we designed MorphoSeq, which consisted of computational frameworks that classify embryos into cell types without prior knowledge, reconstructed directly from its transcriptome the physical position and lineage history of each cell in an unbiased manner, and linked it to high-resolution 4D imaging data. We, thus, present the complete history of gene expression at the genome-wide level for every single cell and at every cell division in a developing embryo up to gastrulation.
The accurate reconstruction of lineage trees must account for the cell division history within particular cell types and the quantitative cell type composition of the embryo at different stages. Indeed, this consideration allowed us to map the sisters of the cells that belong to the germ cell lineage, whereas a gene-expression-based reconstruction ignoring the cell divisions occurring in the germ cell lineage would be erroneous. Precise mapping of cell division patterns can be obtained with light sheet microscopy that enables tracking cellular behaviors at the whole-embryo level (Keller et al., 2008, McDole et al., 2018). In addition, combining lineage tracing (McKenna et al., 2016) with sequencing (Raj et al., 2018, Wagner et al., 2018) will help address this aspect in vertebrate embryos.
During regulative developmental processes, such as vertebrate embryogenesis, cells will acquire their fate guided by environmental cues, and therefore, their gene expression will reflect their spatial position. Although recent advances in single-molecule fluorescent ISH techniques allow measurement of transcriptome-wide expression levels in tissues (Eng et al., 2019), a recent study demonstrated that it is possible to reconstruct de novo the spatial architecture of embryos or tissues directly from scRNA-seq data (Nitzan et al., 2019). ISH will, nevertheless, be required to position migrating cells within tissues.
In the case of a continuum of states during differentiation, the identification of individual cell types or differentiation intermediates would be problematic for any classification strategy. However, such cells usually possess a signature of the germ layer or tissue they belong to, whether in mouse (Peng et al., 2019, Sladitschek and Neveu, 2019), zebrafish (Wagner et al., 2018, Farrell et al., 2018), or Xenopus (Briggs et al., 2018). Examining genes with variable expression within that cell cluster would then reveal the existence of intermediates. Given our results on inter-embryo variability, such analyses would be greatly facilitated if performed on cells belonging to the same embryo.
The spatiotemporal atlas of single-cell gene expression linked to morphological features identified cell-adhesion molecules shaping the embryo. We found that cell-fate acquisition is driven by the rather simple mechanisms of cell-type-specific expression of transcription factors and the expression of signaling ligands in a spatially confined manner. Although the lineage reconstruction and the spatial mapping of cell types could largely be conducted using only three dimensions, classification of single-cell expression profiles was simplified using additional dimensions (fewer than 10 in total) that were specific to the subset of cells to classify. This indicated that ascidian development uses similar gene sets in a modular fashion to specify cell types along different lineages.
Previous studies have highlighted considerable noise in gene expression during development (Arias and Hayward, 2006), which might stem from (and readily be explained by) small differences in timing of embryogenesis between samples, as our analysis demonstrated. Conversely, pairs of bilateral symmetric cells that assume the same fate had remarkably similar gene expression profiles within a given embryo. This supports the notion that the coordinated gene regulatory networks underlying development yield a highly reproducible outcome (Levine and Davidson, 2005). Moreover, our high sequencing depth allowed us to capture the effect of transcriptional noise.
The canonical virtual embryo we report here (http://digitalembryo.org) is a valuable resource to mine the detailed molecular mechanisms that instruct the patterning of entire organisms or tissue architecture in a comprehensive manner. The ability to track systematically, quantitatively, and in a spatially resolved manner the genome-wide changes of gene expression of every cell at each cell division in embryos will usher in a new era in developmental biology. The MorphoSeq framework can be readily adapted to other stereotypically developing animals and might be applicable to more plastic embryonic development.
STAR★Methods
Key Resources Table
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Chemicals, Peptides, and Recombinant Proteins | ||
Proteinase K | Merck | 1.24568.0100 |
FM4-64 | ThermoFisher | T3166 |
Tn5 transposase | Protein Expression Core Facility, EMBL | N/A |
Pronase | Merck | P5147 |
Sodium thioglycolate | Merck | T0632 |
Critical Commercial Assays | ||
NEBNext Ultra II DNA library preparation kit | NEB | E7645S |
MirVana miRNA isolation kit | ThermoFisher | AM1560 |
TruSeq RNA Sample Preparation | Illumina | FC-122-1001 |
DIG RNA SP6/T7 labeling kit | Roche | 11175025910 |
Rneasy mini kit | QIAGEN | 74104 |
Superscript III First-Strand Synthesis SuperMix kit | ThermoFisher | 18080-400 |
Deposited Data | ||
scRNA-Seq of Phallusia mammillata from 2-cell to 16-cell stage | This paper | ArrayExpress E-MTAB-6506 |
scRNA-Seq of Phallusia mammillata from 32-cell to 64-cell stage | This paper | ArrayExpress E-MTAB-6508 |
Gene expression profiles during Phallusia mammillata embryogenesis | This paper | ArrayExpress E-MTAB-6528 |
Genome sequencing of Phallusia mammillata | This paper | ArrayExpress E-MTAB-6530 |
single cell RNA-Seq data of Ciona robusta embryos | Treen et al., 2018 | GSE110588 |
Experimental Models: Organisms/Strains | ||
Adult Phallusia mammillata | Roscoff Marine Station | N/A |
Oligonucleotides | ||
Primer: PH domain of human PLCD1 Forward: AATACGCGTAACTCGAGATGGACTCGGGCC GGGACTTCC |
This paper | N/A |
Primer: PH domain of human PLCD1 Reverse: CTAGTCGACGATGTTGAGCTCCTTCAGGAA GTTCTGC |
This paper | N/A |
Primers for in situ probes, see Table S3 | This paper | N/A |
Recombinant DNA | ||
Genomic AHC0AAA267YK08 clone | Brozovic et al., 2018 | N/A |
Plasmid pRN3-MX | This paper | N/A |
Plasmid pRN3-PH-Citrine | This paper | N/A |
Plasmid pRN3-H2B-mCherry | This paper | N/A |
Software and Algorithms | ||
Python 2.7.5 | Python Software Foundation | https://www.python.org |
Numpy 1.7.1 | N/A | http://numpy.org |
Scipy 0.12.0 | N/A | http://scipy.org |
Velvet | Zerbino and Birney, 2008 | https://github.com/dzerbino/velvet |
Oases | Schulz et al., 2012 | https://github.com/dzerbino/oases |
BLAST | Altschul et al., 1990 | https://blast.ncbi.nlm.nih.gov |
Bowtie 1.0 | Langmead et al., 2009 | https://bio.sourceforge.io/ |
ImageJ | Schneider et al., 2012 | https://imagej.nih.gov/ij/ |
ImageMagick | ImageMagick Studio LLC | https://imagemagick.org/index.php |
Chimera | Pettersen et al., 2004 | http://www.cgl.ucsf.edu/chimera |
ilastik | Sommer et al., 2011 | https://www.ilastik.org |
tulip | Auber et al., 2016 | https://sourceforge.net/projects/auber/files/tulip/tulip-4.9.0/ |
IMARIS v. 8.4.1 | Bitplane | https://imaris.oxinst.com |
MARS | Fernandez et al., 2010 | https://idp.nature.com/authorize?response_type=cookie&client_id=grover&redirect_uri=https%3A%2F%2Fwww.nature.com%2Farticles%2Fnmeth.1472 |
Amira v. 5.4.1 & v. 6.4.3 | Thermo Scientific | https://thermofisher.com/amira-avizo |
Blender | Blender Foundation | https://www.blender.org |
MATLAB | Mathworks | https://www.mathworks.com/products/matlab |
SCECTION | This paper | https://git.embl.de/neveu/morphoseq |
Other | ||
Resource website for the scRNA-Seq of P. mammillata embryos | This paper | http://Digitalembryo.org |
Resource Availability
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Pierre A. Neveu (neveu@embl.de).
Materials Availability
Plasmids generated in this study are available upon signature of an MTA.
Data and Code availability
The accession numbers for the sequencing results reported in this paper are ArrayExpress: E-MTAB-6506, E-MTAB-6508, E-MTAB-6528, E-MTAB-6530. They can be explored at http://digitalembryo.org. Code is available at https://git.embl.de/neveu/morphoseq.
Experimental Model and Subject Details
Embryos were obtained from wild adult Phallusia mammillata animals that were procured from the Roscoff Marine Station (France). Embryos were allowed to develop in artificial sea water at 18◦C until collection for transcriptomics analysis. During imaging, embryos developed in artificial sea water as well. Embryo stages studied by scRNA-Seq were 2-cell, 4-cell, 8-cell, 16-cell, 32-cell and 64-cell stages. Imaging was performed up to 64-cell stage. P. mammillata individuals are hermaphodrites.
Method Details
RNA-seq library construction
Adult P. mammillata animals were procured from the Roscoff Marine Station (France). RNA was extracted from P. mammillata staged embryos (staging was performed according to Hotta et al. 2007) using the MirVana kit (Ambion) following the manufacturer’s instructions. 30 barcoded stranded mRNA libraries representing 15 developmental stages were prepared using TruSeq RNA Sample Preparation (Illumina) following the manufacturer’s instructions. Libraries were sequenced in two runs on Illumina NextSeq 500 in the 75 bp single-end regime yielding 1.07 billion reads (161 Gbp total).
Genomic DNA extraction
Sperm from a single P. mammillata individual was lysed in 20 mL spooling buffer (75 mM NaCl, 25 mM EDTA pH 8.0, 1% SDS, 0.2 mg/ml Proteinase K) at 55◦C overnight with gentle shaking. 5 mL of saturated NaCl solution were added and the solution was mixed by gentle shaking till homogeneous. 25 mL of isopropanol were added and the solution was mixed by gentle shaking till homogeneous. The stringy DNA precipitate was fished out and washed in 70% ethanol. After removing the 70% ethanol, the DNA pellet was air-dried and resuspended in 10 mM Tris pH 8 at 55◦C with gentle shaking. A genomic DNA library was prepared using the NEBNext Ultra II DNA library preparation kit starting with 50 ng of sheared material following the manufacturer’s instructions. The library was run on Illumina MiSeq in the 250PE regime yielding 18.4 million reads (9.2 Gbp).
Transcriptome assembly
Our P. mammillata transcriptome assembly involves a strategy relying on consensus building in peptide space from a large number of assemblies followed by several refining iterations. For each mRNA library, transcripts were assembled using Velvet (Zerbino and Birney, 2008) and Oases (Schulz et al., 2012) using 31-mers. Further processing of the Oases output was carried out using custom Python scripts. Transcript models were translated in the three forward reading frames, keeping the longest peptide for each transcript. Peptides were sorted according to their C-termini, enabling us to discard partially assembled proteins. This yielded a collection of 350,000 different C-termini. Practically, peptides of different sizes for each C terminus were found in many assemblies. For each C terminus, we retained the longest > 99% identical peptides found in more than one assembly, reasoning that they most likely corresponded to the full-length protein. This set of consensus peptides was aligned to Ciona robusta (formerly known as Ciona intestinalis type A but mostly referred to as Ciona intestinalis) protein sequences using BLAST (Altschul et al., 1990). We retained peptide models with a C. robusta hit at most 25% longer than the query and an alignment score greater than 1.5 times the target length. A Bowtie index was built for the transcripts corresponding to these “high confidence” peptides. mRNA reads were aligned to this index using Bowtie (Langmead et al., 2009) with default parameters. For each sample, reads with a match were discarded and the remaining reads were used for a new round of assembly and peptide consensus building. The entire procedure was iterated a third time. This aforementioned strategy efficiently assembled transcripts up to 20 kb. In order to identify potentially larger genes, peptide models with matching overlap were further assembled. This led to the full unsupervised assembly of the largest gene in vertebrate genomes Titin with a transcript 77.5 kb long encoding for 25,560 amino acids, representing 189 exons spanning a genomic locus of 140 kb. Peptide models were annotated using the best C. robusta BLAST hit. Altogether, our P. mammillata transcriptome assembly comprised 14,203 transcript models representing 12,945 gene models. We validated a set of transcripts by amplifying them from cDNA, cloning in a custom plasmid (Sladitschek and Neveu, 2015a) and Sanger sequencing of the insert. Assembled transcripts are in the process of being deposited at the European Nucleotide Archive.
Genome assembly
As a quality control for our transcriptome assembly and to get the reconstruction of transcripts with introns, we independently assembled P. mammillata genome using custom Python scripts. We matched both ends from paired-end reads of short inserts in order to generate longer sequences that will be used for further assembly. This generated ∼18 million 250-450 bp long sequences. Sequences corresponding to ribosomal RNAs and genomic repeats were identified from their high coverage and assembled. Reads mapping to these regions were discarded leaving us with ∼16 million reads for assembly. To minimize assembly errors, we again adopted a strategy that relied on consensus building from independent assemblies. We started from batches of 3 million reads representing an average of 10-fold coverage. Reads were chosen randomly and extended by overlap matching. Contigs from the different assemblies were then matched to each other and validated by mapping transcripts to them. Transcripts that were not fully mapped without gaps were used as scaffolds. Seeds in exons were extended by overlap matching for the assembly of the introns.
Single-cell collection
P. mammillata embryos were dissociated in 0.2% pronase (Sigma), 1% sodium thioglycolate (Sigma), 50 mM NaOH in calcium-free artificial sea water for 20 minutes and individual dissociated embryos were transferred to calcium-free artificial sea water in imaging plates (Mat-tek) coated with a solution of 0.1% PFA and 0.1% gelatin. Dissociated embryos were kept on ice before single-cell collection and during collection. Individual cells were manually collected in 0.3 μL with pipet tips coated with a solution of 0.1% PFA and 0.1% gelatin and cells were directly lysed in 4 μL of lysis+dNTP+oligodT mix (2 μL of lysis buffer – 0.2% Triton X-100+RNase inhibitor–, 1 μL of 10 μM oligo-dT primer – sequence 5′AAGCAGTGGTATCAACGCAGAGTACT30VN –, 1 μL of 10 mM dNTP mix) according to the Smart-Seq2 protocol (Picelli et al., 2014). 1084 cells were collected as follows: eight 2-cell embryos, eight 4-cell embryos, eight 8-cell embryos, eleven 16-cell embryos, 380 cells from fourteen 32-cell embryos, 31 cells from one 44-cell embryo and 385 cells from eight 64-cell embryos. All embryos up to the 16-cell stage were complete, 32-cell embryos were 72% to 97% complete and 64-cell embryos were 48% to 94% complete.
Single-cell RNA-Sequencing
cDNA from single cells was prepared following the Smart-Seq2 protocol (Picelli et al., 2014) using 18 pre-amplification PCR cycles. Smart-Seq2 was found to be the most sensitive and accurate method with the smallest dropout rate in a recent survey of six scRNA-Seq methods (Ziegenhain et al., 2017). 500 pg of cDNA was then used for the tagmentation reaction using a home-made Tn5 transposase (Hennig et al., 2018). All samples were processed with the same batch of reagents. Barcoded libraries were pooled in 12 batches of 84 to 96 samples. Libraries were sequenced in twelve runs on Illumina NextSeq 500 in the high density 75 bp single-end regime yielding 6.65 billion reads total (498.9 Gbp total).
scRNA-seq analysis and quality control
We built a Bowtie index for the assembled transcripts. mRNA reads were aligned to this index using Bowtie (Langmead et al., 2009) allowing up to 3 mismatches. The Bowtie output was parsed to count the number of reads aligning to each transcript model. 3.35 billion reads mapped to our transcriptome assembly, with an average of 3.09 million mapped reads per cell. We removed 42 cells with < 100,000 mapped reads, leaving us with 1042 cells out of 1084. On average, there were 10,173 ± 655 detected transcripts per cells with at least 5,500 detected transcripts in the cell with the smallest sequencing depth that was retained for further analysis. In comparison, we detected for 11,179 ± 290 transcripts in bulk mRNA-Seq at the similar embryonic stages. Notably, 8,542 ± 272 genes had expression levels > 4 RPM. In comparison, we detected for 8,392 ± 131 transcripts with expression levels > 4 RPM in bulk mRNA-Seq at 32- and 64-cell stages. Thus, our scRNA-Seq depth was comparable to bulk mRNA-Seq depth.
Normalization
The very high scRNA-Seq depth circumvented the problem of many zero counts commonly encountered in scRNA-Seq. Read counts were normalized in two steps to correct for different sequencing depth. In a first step, samples were coarsely normalized by multiplying read counts by a factor to have the same total number of reads across all samples. We identified 654 genes with a minimum expression > 64 RPM across all samples (including the ones with low sequencing depth). A fine normalization factor was determined by matching the log-transformed read counts of these 654 genes to the identity line for samples pairwise. For each sample, read counts were then multiplied by this fine normalization factor. It should be noted that each sample was normalized by multiplying read counts by a single factor that was independent of the read count. Read counts were not normalized by the transcript length for individual genes as we were solely interested in relative expression changes across samples. Technical batch effects were not detectable. However, we could detect batch effects due to biological variability. Some genes had several highly polymorphic alleles that were only found in the embryos of some individuals. The sum of the read counts mapping to the different alleles was used for downstream analysis. In addition, the coefficient of variation of gene expression increased as development progressed. This was probably due to the apportioning of the mRNA pool deposited in the oocyte between sister cells upon cell division. Moreover, the coefficient of variation of gene expression in individual embryos was smaller compared to the coefficient of variation across embryos of the same stage (see Noise analysis paragraph below). This was probably due to slight differences in the quantity of mRNAs deposited in individual oocytes.
Identification of maternal factors
Before the onset of zygotic genome activation, different expression profiles can only be generated by selective RNA inheritance or degradation. Such a process should occur in an even number of cells (due to the bilateral symmetry of ascidians) and reproducibly in individual embryos. We therefore sought to identify genes that would single out a similar fraction of cells in the different embryos that were profiled at the same stage. Expression levels were clipped to a minimum of 1 RPM. We retained genes with maximal expression > 16 RPM (reads per million mapped reads) in at least one cell and at least a 4-fold difference in expression between cells. Expression levels were log transformed. For each stage and each gene, cells were rank ordered according to expression levels. Difference in expression between subsequent ranks was computed. The index of the largest difference was used to classify the cells in two categories: the cells rank ordered below that index belonged to the first category and the cells rank ordered above the index belonged to the second category. At the 4-cell stage, only 10 genes (out of 3354 candidates) split the cells in two categories, sampling individual embryos evenly. Remarkably, the categories defined independently by these 10 genes were identical (picking the same 15 or 16 cells out of 31 randomly has a probability of 6.65 10−9). At the 8-cell stage, 10 (out of 3354 candidates) genes classified cells into two categories: one category with lower expression comprising three quarter of the cells and one with higher expression comprising one quarter of the cells. Like for the 4-cell stage embryos, the same 15 cells were singled out by all these 10 genes (there are 1.22 1014 15-combinations of cells from 63 cells), picking exactly two cells in each 8-cell embryo (and one cell in an 8-cell embryo for which only seven cells passed quality control). Furthermore, these genes were identical to the ones identified at the 4-cell stage. We then determined if these genes would stratify cells in 16- and 32-cell stage embryos. Indeed, they singled out at most two cells per embryo for an average of two cells per embryo (when taking into account sampling error) for both stages. To further identify genes which mRNAs would be co-segregated in the same cells, we looked for genes with a differential enrichment. For each gene and each individual embryo, we computed its expression enrichment in the cells singled out by the 10 genes compared to all the other cells of the embryo by making the ratio of its mean expression levels in each category. We defined as maternal factors genes that had an increased expression in the cells singled out by the 10 genes for all the embryos profiled from 4- to 16-cell stage (27 embryos total). 27 genes and some mitochondrial transcripts fulfilled that criterion.
Cell type classification
We reasoned that genes that classify cell types should have markedly different expression levels in single cells of a given embryo. In addition, cell types should have an even number of cells in a complete embryo due to the bilateral symmetry of P. mammillata. We thought to identify cell types for individual embryonic stages.
8-cell stage
8-cell stage embryos have in theory at most 4 cell types. The expression of maternal factors that are asymmetrically inherited singled out 2 cells in each 8-cell stage embryo. Using our bulk mRNA-Seq data, we retained genes with maximal expression > 1 RPM in 4- and 8-cell embryos and a 2-fold difference in expression between the two stages. This list of 24 genes was narrowed down by keeping the genes with maximal expression > 1 RPM and at least a 4-fold difference in expression between cells of each embryo for all the eight 8-cell embryos profiled. This left us with 4 potential classifiers. As there are only 2 cells with high maternal factor expression per embryo, these cells cannot be further subclassified. Thus, classification among cells with high maternal factor expression had to be significantly different than the classification of all the cells of 8-cell stage embryos (χ2-test with Benjamini-Hochberg correction for multiple hypothesis testing, p < 0.05). 2 genes out of 4 passed p < 0.05, the transcription factor Fkh and an ankyrin repeat and SAM domain-containing protein. Fkh was expressed at higher levels in 4 cells per 8-cell embryo and not detected in the remaining 4 cells (including the cells with high maternal factor expression). The other gene was not expressed in the cells with high maternal factor expression.
16-cell stage
16-cell stage embryos have in theory at most 8 cell types. The expression of maternal factors that are asymmetrically inherited singled out 2 cells in each 16-cell stage embryo. Using our bulk mRNA-Seq data, we retained genes with maximal expression > 1 RPM in 4- and 16-cell embryos and a 2-fold difference in expression between the two stages. We clipped the expression data at a minimum of 1 RPM and we used the same strategy and criteria to find potential classifier genes as for 8-cell embryos. After the χ2-test, we had 48 candidate classifiers. Using these candidate classifiers and the 27 maternal factors, we performed non-negative matrix factorization (NMF) with a varying number of metaprofiles (Neveu et al., 2010) on individual embryos. Gene expression levels were log2-transformed. For each gene, the mean expression level across the cells of a given embryo was subtracted and the variance was normalized to one. We found that whereas two metaprofiles split the embryos in two clusters of 8 cells each, three metaprofiles split the embryos in 5 clusters, 3 comprising 4 cells and 2 comprising 2 cells. 4 metaprofiles did not increase the number of clusters but instead lead to a degraded consensus clustering compared to 3 metaprofiles. Clustering was determined according to Brunet et al. (2004): for each number of metaprofiles, we determined the consensus matrix as the average connectivity matrix over 100 clustering runs.
Cell type classification by SCECTION
To implement Single-Cell Expression Classification Through Iterations Of NMF (SCECTION), we used Non-negative Matrix Factorization (NMF) as an unsupervised way to cluster the single-cell expression profiles.
SCECTION framework
We clipped the expression data at a minimum of 1 RPM. We retained genes with maximal expression > 4 RPM in at least one cell and a 16-fold difference in expression in the set of cells to be classified. Gene expression levels were log2-transformed. For each gene, the mean expression level across the set of cells to be classified was subtracted and the variance was normalized to one. NMF with two metaprofiles was computed as described (Lin, 2007). Clustering was determined according to Brunet et al. (2004): we determined the consensus matrix as the average connectivity matrix over 50 clustering runs. Subsets of cells belonging to a cluster with more than 75% clustering consensus were used for a subsequent SCECTION iteration. The iterative process stops when no gene satisfies the differential expression condition within the set of cells to be classified or when SCECTION outputs no cluster with more than 75% consensus clustering.
Cell type matching across embryos
Within embryo, average expression profiles were computed for each cell type using log-transformed expression values. For each embryo, the levels of each gene were rescaled by subtracting by their mean expression level and rescaled by their standard deviation. This removed embryo-specific expression trends. The expression profile for cell type j in embryo k consisted of log-transformed expression values for the associated genes . Relatedness between cell types in different embryos k was determined using average linkage clustering using the Euclidean distance between the average expression profiles. were matched with the closest profiles. Cell types were expected to be found across several embryos according to sampling rules, that is cell types found in a single embryo were merged to their closest matching type (corresponding to the subset of cells that generated that specific cell type in the last SCECTION round) within that embryo.
Application of SCECTION
We applied SCECTION to the eleven 16-cell stage embryos. SCECTION split the 16-cell stage embryo into 5 cell types, 3 comprising 4 cells and 2 comprising 2 cells that were similar to the cell types recovered by running NMF with 3 metaprofiles on individual embryos. The first SCECTION round separated animal from vegetal cells, the second round split each pole into the A/a and B/b lineages while the third round of SCECTION distinguished B5.1 and B5.2 in the B lineage. SCECTION applied to 32-cell stage embryos identified 8 cell types: 6 cell types in the vegetal pole (2 cell types in the A lineage, 4 cell types in the B lineage) and 2 cell types in the animal pole. SCECTION applied to 64-cell stage embryos identified 18 cell types: 13 cell types in the vegetal pole (6 cell types in the A lineage, 7 cell types in the B lineage) and 5 cell types in the animal pole (2 cell types in the a lineage, 3 cell types in the b lineage). Final attribution of single cells to individual cell types was manually validated. In situ hybridization data in Ciona (Imai et al., 2004) supported both our classification and the constellation of genes expressed in a given cell type.
Lineage tree reconstruction
Mother-daughter relationships were established by assessing the expression of the genes that classified the mother cells after one cell division. Using our bulk mRNA-Seq data, we retained genes with maximal expression > 1 RPM in 4-cell embryos and the embryonic stage of the mother cells and a 2-fold difference in expression between the two stages. Average expression profiles were computed for each cell type using log-transformed expression values. For each stage, the levels of each gene were rescaled by subtracting by their mean expression level and rescaled by their standard deviation. This removed stage-specific expression trends. The expression profile for cell type j at stage k consisted of log-transformed expression values for the associated genes . Relatedness between cell types at stage k and the subsequent embryonic stage k + 1 (representing one cell division) was determined using average linkage clustering using the Euclidean distance between the expression profiles. were matched with the closest profiles until the sum of the cells contained in the matching profiles was equal to twice the number of cells present in cell type (to account for the doubling of cells through cell division). This is in effect a parsimony argument minimizing the number of cell type changes between mother and daughter cells. The lineage tree reconstructed from our scRNA-Seq data was fully compatible with the actual cell lineage tree of P. mammillata.
Dimensionality reduction
The lineage tree of different cell types provides a natural way to embed the data. One local axis was the developmental progression of individual cells from mother to daughter cell types. Position along this local axis was computed by projecting the gene expression profile of an individual cell on the direction defined by the average expression profiles of the mother cell type and the daughter cell type the cell belongs to. Arrangement on a second local dimension was done by determining nearest neighbors as measured by the Euclidean distance in the space of genes with differential expression between mother and daughter cell types.
RNA velocity analysis
RNA velocity was performed following La Manno et al. (2018) under the constant velocity assumption. Reads aligning to introns or intron-exon junctions were considered to originate from unspliced RNA molecules (reads mapping to individual exons were not counted in that category as they cannot be unambiguously attributed to an unspliced molecule). For individual genes, the normalized degradation rate γ was computed using the set of cells with the top decile of expression. The velocity v was then computed as follows v = u − γ s where u is the unspliced read counts and s the spliced read counts (La Manno et al., 2018). Values for v are smaller than the unspliced read counts (which can be 1000-fold smaller than the spliced read counts due to the compactness of ascidian genomes), we therefore scaled v by 1/γ to extrapolate the spliced read counts (indeed at steady state u ∼γs; La Manno et al., 2018). Neither genes nor cells were pooled to estimate the velocity of individual cells. To visualize RNA velocities, the extrapolated spliced read counts were projected on the relevant space (Axis161 and Axis162 or the two-dimensional space of the lineage tree) and were the tips of the velocity arrows. In the case of projection on Axis161 and Axis162, we used all genes that were used for the principal component analysis to compute these two axes. In the case of the two-dimensional space of the lineage tree, we used genes with a difference of expression greater than 4-fold between mother and daughter cell types.
Spatial mapping of single-cell transcriptomes
Embryonic axis determination at 16-cell stage
We relied on principal component analysis (PCA) to find the directions which explained most of the variance in gene expression between single cells of individual 16-cell embryos. PCA was carried out as described (Sladitschek and Neveu, 2015b). Genes with 4-fold expression changes within an embryo were kept for subsequent analysis. Expression profiles were log-transformed and the principal components were computed using gene expression profiles of cells of a single embryo. The expression profile for cell j consisted of log-transformed expression values for the associated genes . PCA determined principal components (PCs) corresponding to the set of eigenvalues of the covariance matrix. PCs were rescaled by the square root of the associated eigenvalue . The first PCs were rotated to remove the trends across PCs between cells within the embryonic poles defined by the first round of SCECTION. This determined a new coordinate system with axes Axis16β orthogonal to each other. For cell j, we plotted the projection on the αth Axis16: where was the mean expression profile averaged on all cells from the embryo. The eleven 16-cell stage embryos retrieved exactly the same directions (with occasional swaps of direction along one axis).
Mapping of single cells at 32-cell stage
Log-transformed transcriptomes of single cells of individual 32-cell stage embryos were projected on the Axis16 coordinate system.
Mapping of cell types
Log-transformed expression levels were averaged among all cells of a given cell type and were projected on the Axis16 coordinate system.
Finer mapping at 64-cell stage
The vegetal pole of 64-cell embryos consists of 13 cell types and the Axis16 coordinate system cannot resolve the spatial relationship of some of the sister cell types. We reasoned that the axes that map the vegetal pole of 32-cell embryos might discriminate cells at the 64-cell stage. We defined two axes which shared the same global direction: the one that classifies the daughters of B6.1 compared to the ones of B6.2 and the one that classifies the daughters of A6.1&3 compared to the ones of A6.2&4. The B lineage could be perfectly mapped on Axis161 and this first axis with the exception of the germ cell B7.6. The A lineage could be mapped using the second axis and the direction of the most remaining variance.
Asymmetric cell divisions
Before the onset of zygotic transcription, a partition bias of one RNA will lead to a relative increase and decrease in the levels of this particular RNA in the two sister cells. In the absence of zygotic transcription: v1/v2 = (x0 − x2)/(x1 − x0) where v1 and v2 are the volumes of the sister cells, x0, x1 and x2 are the concentrations of the given RNA in the mother cell and sister cells respectively. Using the levels of the maternal factors PEM-1, Vasa and Distal-less Dll-B, we found that B5.2 was 3 times smaller than its sister B5.1.
Comparison with Ciona expression profiles
We used the analyzed Ciona robusta single-cell RNA-Seq data from Treen et al. (2018) as provided by the authors on GEO with accession number GSE110588.
Noise analysis
In order to assess the variability of single-cell gene expression, we analyzed independently each embryo we collected. We considered only genes with a maximal expression greater than 16 RPM in the embryo and computed the coefficient of variation of their expression across the cells from the embryo. For the same gene set, a mean coefficient of variation was estimated from hundred random samplings of a similar number of cells from the embryos of the same stage. To find candidate marker genes, we retained genes with maximal expression > 4 RPM in at least one cell type and at least a 4-fold difference in mean expression between cell types with high and low expression. High and low expressing cell types were then classified in either marker-positive or marker-negative categories. The fraction of cells expressing the marker gene was then determined for each of the two categories. We then assessed the levels of the marker genes in cells with detectable expression for each of the two categories. This allows to investigate two different scenarios of gene expression regulation: (i) a Boolean case where there is a single expression ON state for marker genes and the control point would be the fraction of cells expressing the gene, (ii) cells can control the expression levels if the gene is not in an OFF state.
Analysis of gene upregulation coordination
We retained for analysis genes with maximal expression > 4 RPM and that were upregulated more than 4-fold between the mother and daughter cell types of the specific cell type considered. Log-transformed expression levels were rescaled for each gene to have a mean expression equal to zero in the mother cells and a mean expression equal to one in the daughter cells. Individual cells were then sorted according to the sum of these rescaled expression levels. To compare the coordination of gene expression between embryos, the average expression per cell type was computed within each individual embryo for each gene. Log-transformed average expression levels were rescaled for each gene to have a mean expression equal to zero in the mother cell type and a mean expression equal to one in the daughter cell types. Individual embryos were sorted according to the sum of these rescaled expression levels for each individual cell type and the consensus order was retained. Embryos were randomized to test for the significance of the correlation of expression levels between cell types (p < 10−52 for 16-cell stage cell types, p < 10−20 for 32-cell stage cell types, and p < 10−5 for 64-cell stage cell types, Kolmogorov-Smirnov test).
Temporal model of gene expression variation
Log-transformed expression levels in individual cells were normalized to have a mean expression of 0 in the mother cell type and 1 in the grand-daughter cell type. The expression of an individual gene i was considered to vary linearly in time t with rate in the form of , where is the time of the expression ramp-up. According to this model, the expression in cell j captured at time will be . Measured expression levels in individual cells were then adjusted using two parameters per gene and one per cell . To assess the quality of the fit, we used the residuals between the predicted values (computed using the parameter sets and ) and the measured values for individual genes.
Constructs for live imaging
MluI and XhoI restriction sites were inserted between the 5′ and 3′ β-globin UTRs in a pRN3 expression vector (Lemaire et al., 1995) containing PH-GFP (a gift of Alex McDougall) to yield pRN3-MX and make it compatible with the MXS-chaining technique (Sladitschek and Neveu, 2015a). Citrine and the PH domain of human PLCD1 (subcloned using primers Forward: 5′-AATACGCGTAACTCGAGATGGACTCGGGCCGGGACTTCC and Reverse: 5′-CTAGTCGACGATGTTGAGCTCCTTCAGGAAGTTCTGC) were successively chained in pRN3-MX yielding pRN3-PH-Citrine. H2B-Cherry was chained in pRN3-MX to give pRN3-H2B-Cherry.
Embryo imaging
The embryos were microinjected prior to fertilization with mRNA encoding for PH-Citrine and H2B-mCherry using phenol red as microinjection dye. Microinjected embryos were incubated for ∼3 hours at 18◦C prior to fertilization. The embryos were imaged from the 2-cell stage up to the gastrulation period using multi-view light-sheet microscopy (MuVi-SPIM) (Krzic et al., 2012). We used a MuVi-SPIM configuration with two Nikon 10X, 0.3 NA water-dipping objectives for the illumination and two Olympus 20X, 1.0 NA water-dipping objectives to detect emitted photons and a 300 mm tube lens to provide a 33.3X magnification of the sample. The embryos were imaged across four orthogonal angles –for each time point the sample was rotated 0◦ and 90◦– with simultaneous dual illumination and dual acquisition, using electronic confocal slit detection (de Medeiros et al., 2015) adjusted to the illumination beam size. Stacks of images every 1 μm were acquired every 2 minutes with Hamamatsu ORCA-Flash4.0 V2 sCMOS cameras.
Image processing
Multi-angle data was fused to produce isotropic resolution images using bead-based affine registration to yield a single high-quality 3D dataset for each time point. Fluorescence images were rendered using Chimera (Pettersen et al., 2004). Cell lineage extraction was based on nuclei tracking. Nuclei probability maps were created using ilastik pixel classification (Sommer et al., 2011). Tracking was done using the ilastik semi-automated method “Manual Tracking Workflow.” Lineage representation was created using the data visualization software tulip (Auber et al., 2016). Nuclei cell identification was done by 3D visualization of the segmented and ilastik tracked labeled nuclei in IMARIS (version 8.4.1). Digital 3D representations of the cellular morphology of each cell in the embryo were created using machine learning (pixel classification) to extract membrane probability maps with ilastik (Sommer et al., 2011). Single time point segmentations were done through membrane-based 3D watershed segmentation with the algorithm MARS (Fernandez et al., 2010) on the membrane probability maps and creating segmented embryo surface representations in AMIRA (versions 5.4.1 and 6.4.3) or the open source software Blender (https://www.blender.org). Time-course membrane-based segmentations were done using 3D-seed watershed segmentation with MATLAB image analysis libraries using the segmented and tracked nuclei as seeds. Tesselation of the segmented embryos was performed using the marching cubes algorithm (Lorensen and Cline, 1987). The marching cube algorithm was kindly provided by the Kreshuk laboratory (EMBL, Heidelberg). The tessellated surfaces generated for each individual cell were smoothed for visualization purposes using the open source software Blender. The 3D virtual in situs in http://digitalembryo.org result from a color-coded expression level projection on the surface of the tessellated embryo representations.
Creation of animated 4D renderings
The membrane signal was segmented in individual frames as described (Sladitschek and Neveu, 2015b). Cells were matched between frames using the Hungarian algorithm in order to obtain a 3D reconstruction. Individual time points were registered by translation to keep the center of mass of the entire embryo invariant. Tracking cells from one time point to the next was done by matching the position of a cell’s center of mass at time i with the corresponding cell at time i − 1. Grayscale 3D renderings of the segmented membrane signal were created using the VolumeViewer plugin of ImageJ (Schneider et al., 2012). 4D renderings were created by coloring the grayscale 3D renderings according to cell lineage, cell types or gene expression levels using a custom python script. Gene expression levels were interpolated between stages using a logistic function. Annotation of python-generated colored time frames and conversion to gif format was performed using ImageMagick for the 12,945 genes.
Apical surface and protocadherin expression
SPIM images were used to compute for each cell its volume, the area of its surface and its apical surface (i.e., the surface in contact with the environment), the length of the apical boundary between individual cells. The computed cadherin contact strength for cell i is defined as , where and are the cadherin expression levels in cells i and j, the length of the apical boundary between cells i and j.
Perturbation of cell-cell adhesion
Phallusia embryos were treated with calcium free artificial sea water for ≈12 min during late 32-cells stage prior to imaging with light sheet microscopy in calcium-free artificial sea water containing the lipophilic dye FM464 (6 μM) for membrane labeling.
Whole mount in situ hybridization
Total mRNA was extracted from 50 μL of Phallusia embryos at 112-cells stage using the QIAGEN Rneasy mini kit. A cDNA library was produced using oligodT and random hexamers and the Superscript III First-Strand Synthesis SuperMix kit (Invitrogen). The AHC0AAA267YK08 clone from the publicly available Phallusia mammillata EST clone collection (Brozovic et al., 2018) was used as a template for generating the TTF1 in situ probe. Templates for probe synthesis were generated by PCR amplification from the cDNA library introducing a T7 promoter in the reverse primers either for the primary reaction from cDNA (for MyoD and Nodal) or in a subsequent amplification (for the other genes). Primers used for AP-2-like2, Fkh, FoxD-a/b, Macho1, MyoD, Nodal, Tbx6b, Wnt3 and ZicL can be found in Table S3. The TTF1 EST clone was linearized by EcoRI digestion. All in situ probes were made using the DIG RNA SP6/T7 labeling kit (Roche) and purified with the QIAGEN RNeasy RNA purification kit. Whole mount in situ hybridizations were performed as described (Christiaen et al., 2009).
Quantification and Statistical Analysis
Statistical tests were computed using the Python SciPy module. The statistical tests used were two-sided and are indicated in the text or in the figure legends. The number of samples is indicated in the figure legends. Significance was set at p < 0.05. Normality was assessed by the Shapiro-Wilk test.
Additional Resources
Sequencing results can be explored at http://digitalembryo.org.
Acknowledgments
We thank Jan Ellenberg for comments on the manuscript. We thank Jean-Karim Hériché for setting up the http://digitalembryo.org website. We thank D. Kutra and A. Kreshuk for the tesselation algorithm; and M. Albert, B. Balazs, and L. Guignard for technical discussions and advice for image processing. This work was technically supported by the EMBL Genomics Core facility. We thank the EMBL Protein Expression Facility for purifying the Tn5 transposase. The study was funded by the European Molecular Biology Laboratory. U.-M.F. was supported by the EMBL Interdisciplinary Postdoc Programme (EIPOD) under Marie Curie Actions COFUND.
Author Contributions
L.H. and P.A.N. conceived and supervised the study. H.L.S. designed and cloned constructs for live imaging, assembled the transcriptome and genome, and analyzed sequencing data under supervision by P.A.N. H.L.S. generated the 4D virtual in situ hybridizations and virtual in situ hybridizations schemes. L.H. and U.-M.F. generated the 3D virtual in situ hybridizations and 3D embryo tesselations in http://digitalembryo.org. U.-M.F. prepared and staged embryos and developed the embryo dissociation method. U.-M.F. conducted SPIM imaging of developing embryos and U.-M.F. and L.H. analyzed SPIM imaging data. U.-M.F. performed whole-mount in situ hybridization. P.A.N. generated the animated 4D renderings. D.P. prepared mRNA-seq and DNA-seq libraries and optimized the scRNA-seq protocol under supervision by V.B. P.A.N. performed all embryo dissociations for single-cell collection and optimized and performed all single-cell collections. P.A.N. prepared scRNA-seq libraries with technical assistance from D.P. H.L.S. and P.A.N. wrote the paper, and all authors commented on the manuscript.
Declaration of Interests
The authors declare no competing financial interests.
Published: April 20, 2020
Footnotes
Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.03.055.
Contributor Information
Lars Hufnagel, Email: hufnagel@embl.de.
Pierre A. Neveu, Email: neveu@embl.de.
Supplemental Information
References
- Achim K., Pettit J.-B., Saraiva L.R., Gavriouchkina D., Larsson T., Arendt D., Marioni J.C. High-throughput spatial mapping of single-cell RNA-seq data to tissue of origin. Nat. Biotechnol. 2015;33:503–509. doi: 10.1038/nbt.3209. [DOI] [PubMed] [Google Scholar]
- Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2. [DOI] [PubMed] [Google Scholar]
- Arias A.M., Hayward P. Filtering transcriptional noise during development: concepts and mechanisms. Nat. Rev. Genet. 2006;7:34–44. doi: 10.1038/nrg1750. [DOI] [PubMed] [Google Scholar]
- Auber D., Bourqui R., Delest M., Lambert A., Mary P., Melancon G., Pinaud B., Renoust B., Vallet J. Tulip 4. Laboratoire Bordelais de Recherche en Informatique. 2016. https://hal.archives-ouvertes.fr/hal-01359308
- Briggs J.A., Weinreb C., Wagner D.E., Megason S., Peshkin L., Kirschner M.W., Klein A.M. The dynamics of gene expression in vertebrate embryogenesis at single-cell resolution. Science. 2018;360:981–987. doi: 10.1126/science.aar5780. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brozovic M., Dantec C., Dardaillon J., Dauga D., Faure E., Gineste M., Louis A., Naville M., Nitta K.R., Piette J. ANISEED 2017: extending the integrated ascidian database to the exploration and evolutionary comparison of genome-scale datasets. Nucleic Acids Res. 2018;46:D718–D725. doi: 10.1093/nar/gkx1108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Brunet J.-P., Tamayo P., Golub T.R., Mesirov J.P. Metagenes and molecular pattern discovery using matrix factorization. Proc. Natl. Acad. Sci. USA. 2004;101:4164–4169. doi: 10.1073/pnas.0308531101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cao J., Packer J.S., Ramani V., Cusanovich D.A., Huynh C., Daza R., Qiu X., Lee C., Furlan S.N., Steemers F.J. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science. 2017;357:661–667. doi: 10.1126/science.aam8940. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cao C., Lemaire L.A., Wang W., Yoon P.H., Choi Y.A., Parsons L.R., Matese J.C., Wang W., Levine M., Chen K. Comprehensive single-cell transcriptome lineages of a proto-vertebrate. Nature. 2019;571:349–354. doi: 10.1038/s41586-019-1385-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Christiaen L., Wagner E., Shi W., Levine M. Whole-mount in situ hybridization on sea squirt (ciona intestinalis) embryos. Cold Spring Harb. Protoc. 2009;2009 doi: 10.1101/pdb.prot5348. pdb.prot5348. [DOI] [PubMed] [Google Scholar]
- Combs P.A., Eisen M.B. Sequencing mRNA from cryo-sliced Drosophila embryos to determine genome-wide spatial patterns of gene expression. PLoS One. 2013;8:e71820. doi: 10.1371/journal.pone.0071820. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Conklin E.G. The organization and cell lineage of the ascidian egg. J. Acad. Nat. Sci. 1905;13:1–119. [Google Scholar]
- Corbo J.C., Di Gregorio A., Levine M. The ascidian as a model organism in developmental and evolutionary biology. Cell. 2001;106:535–538. doi: 10.1016/s0092-8674(01)00481-0. [DOI] [PubMed] [Google Scholar]
- de Medeiros G., Norlin N., Gunther S., Albert M., Panavaite L., Fiuza U.-M., Peri F., Hiiragi T., Krzic U., Hufnagel L. Confocal multiview light-sheet microscopy. Nat. Commun. 2015;6:8881. doi: 10.1038/ncomms9881. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dehal P., Satou Y., Campbell R.K., Chapman J., Degnan B., De Tomaso A., Davidson B., Di Gregorio A., Gelpke M., Goodstein D.M. The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science. 2002;298:2157–2167. doi: 10.1126/science.1080049. [DOI] [PubMed] [Google Scholar]
- Eng C.L., Lawson M., Zhu Q., Dries R., Koulena N., Takei Y., Yun J., Cronin C., Karp C., Yuan G.-C., Cai L. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature. 2019;568:235–239. doi: 10.1038/s41586-019-1049-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Farrell J.A., Wang Y., Riesenfeld S.J., Shekhar K., Regev A., Schier A.F. Single-cell reconstruction of developmental trajectories during zebrafish embryogenesis. Science. 2018;360:eaar3131. doi: 10.1126/science.aar3131. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fernandez R., Das P., Mirabet V., Moscardi E., Traas J., Verdeil J.-L., Malandain G., Godin C. Imaging plant growth in 4D: robust tissue reconstruction and lineaging at cell resolution. Nat. Methods. 2010;7:547–553. doi: 10.1038/nmeth.1472. [DOI] [PubMed] [Google Scholar]
- Hennig B.P., Velten L., Racke I., Tu C.S., Thoms M., Rybin V., Besir H., Remans K., Steinmetz L.M. Large-scale low-cost NGS library preparation using a robust Tn5 purification and tagmentation protocol. G3. 2018;8:79–89. doi: 10.1534/g3.117.300257. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hotta K., Mitsuhara K., Takahashi H., Inaba K., Oka K., Gojobori T., Ikeo K. A web-based interactive developmental table for the ascidian Ciona intestinalis, including 3D real-image embryo reconstructions: I. From fertilized egg to hatching larva. Dev. Dyn. 2007;236:1790–1805. doi: 10.1002/dvdy.21188. [DOI] [PubMed] [Google Scholar]
- Imai K.S., Hino K., Yagi K., Satoh N., Satou Y. Gene expression profiles of transcription factors and signaling molecules in the ascidian embryo: towards a comprehensive understanding of gene networks. Development. 2004;131:4047–4058. doi: 10.1242/dev.01270. [DOI] [PubMed] [Google Scholar]
- Imai K.S., Levine M., Satoh N., Satou Y. Regulatory blueprint for a chordate embryo. Science. 2006;312:1183–1187. doi: 10.1126/science.1123404. [DOI] [PubMed] [Google Scholar]
- Junker J.P., Noël E.S., Guryev V., Peterson K.A., Shah G., Huisken J., McMahon A.P., Berezikov E., Bakkers J., van Oudenaarden A. Genome-wide RNA Tomography in the zebrafish embryo. Cell. 2014;159:662–675. doi: 10.1016/j.cell.2014.09.038. [DOI] [PubMed] [Google Scholar]
- Karaiskos N., Wahle P., Alles J., Boltengagen A., Ayoub S., Kipar C., Kocks C., Rajewsky N., Zinzen R.P. The Drosophila embryo at single-cell transcriptome resolution. Science. 2017;358:194–199. doi: 10.1126/science.aan3235. [DOI] [PubMed] [Google Scholar]
- Keller P.J., Schmidt A.D., Wittbrodt J., Stelzer E.H.K. Reconstruction of zebrafish early embryonic development by scanned light sheet microscopy. Science. 2008;322:1065–1069. doi: 10.1126/science.1162493. [DOI] [PubMed] [Google Scholar]
- Krzic U., Gunther S., Saunders T.E., Streichan S.J., Hufnagel L. Multiview light-sheet microscope for rapid in toto imaging. Nat. Methods. 2012;9:730–733. doi: 10.1038/nmeth.2064. [DOI] [PubMed] [Google Scholar]
- La Manno G., Soldatov R., Zeisel A., Braun E., Hochgerner H., Petukhov V., Lidschreiber K., Kastriti M.E., Lönnerberg P., Furlan A. RNA velocity of single cells. Nature. 2018;560:494–498. doi: 10.1038/s41586-018-0414-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Langmead B., Trapnell C., Pop M., Salzberg S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lee D.D., Seung H.S. Learning the parts of objects by non-negative matrix factorization. Nature. 1999;401:788–791. doi: 10.1038/44565. [DOI] [PubMed] [Google Scholar]
- Lemaire P., Garrett N., Gurdon J.B. Expression cloning of Siamois, a Xenopus homeobox gene expressed in dorsal-vegetal cells of blastulae and able to induce a complete secondary axis. Cell. 1995;81:85–94. doi: 10.1016/0092-8674(95)90373-9. [DOI] [PubMed] [Google Scholar]
- Levine M., Davidson E.H. Gene regulatory networks for development. Proc. Natl. Acad. Sci. USA. 2005;102:4936–4942. doi: 10.1073/pnas.0408031102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lin C.-J. Projected gradient methods for nonnegative matrix factorization. Neural Comput. 2007;19:2756–2779. doi: 10.1162/neco.2007.19.10.2756. [DOI] [PubMed] [Google Scholar]
- Lorensen W., Cline H. Marching cubes: A high resolution 3d surface construction algorithm. Comput. Graph. 1987;21:163–169. [Google Scholar]
- Madgwick A., Magri M.S., Dantec C., Gailly D., Fiuza U.-M., Guignard L., Hettinger S., Gomez-Skarmeta J.L., Lemaire P. Evolution of embryonic cis-regulatory landscapes between divergent Phallusia and Ciona ascidians. Dev. Biol. 2019;448:71–87. doi: 10.1016/j.ydbio.2019.01.003. [DOI] [PubMed] [Google Scholar]
- McDole K., Guignard L., Amat F., Berger A., Malandain G., Royer L.A., Turaga S.C., Branson K., Keller P.J. In toto imaging and reconstruction of post-implantation mouse development at the single-cell level. Cell. 2018;175:859–876.e33. doi: 10.1016/j.cell.2018.09.031. [DOI] [PubMed] [Google Scholar]
- McKenna A., Findlay G.M., Gagnon J.A., Horwitz M.S., Schier A.F., Shendure J. Whole-organism lineage tracing by combinatorial and cumulative genome editing. Science. 2016;353:aaf7907. doi: 10.1126/science.aaf7907. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Negishi T., Takada T., Kawai N., Nishida H. Localized PEM mRNA and protein are involved in cleavage-plane orientation and unequal cell divisions in ascidians. Curr. Biol. 2007;17:1014–1025. doi: 10.1016/j.cub.2007.05.047. [DOI] [PubMed] [Google Scholar]
- Neveu P., Kye M.J., Qi S., Buchholz D.E., Clegg D.O., Sahin M., Park I.-H., Kim K.-S., Daley G.Q., Kornblum H.I. MicroRNA profiling reveals two distinct p53-related human pluripotent stem cell states. Cell Stem Cell. 2010;7:671–681. doi: 10.1016/j.stem.2010.11.012. [DOI] [PubMed] [Google Scholar]
- Nishida H. Cell lineage analysis in ascidian embryos by intracellular injection of a tracer enzyme. III. Up to the tissue restricted stage. Dev. Biol. 1987;121:526–541. doi: 10.1016/0012-1606(87)90188-6. [DOI] [PubMed] [Google Scholar]
- Nitzan M., Karaiskos N., Friedman N., Rajewsky N. Gene expression cartography. Nature. 2019;576:132–137. doi: 10.1038/s41586-019-1773-3. [DOI] [PubMed] [Google Scholar]
- Peng G., Suo S., Cui G., Yu F., Wang R., Chen J., Chen S., Liu Z., Chen G., Qian Y. Molecular architecture of lineage allocation and tissue organization in early mouse embryo. Nature. 2019;572:528–532. doi: 10.1038/s41586-019-1469-8. [DOI] [PubMed] [Google Scholar]
- Petersen C.P., Reddien P.W. Wnt signaling and the polarity of the primary body axis. Cell. 2009;139:1056–1068. doi: 10.1016/j.cell.2009.11.035. [DOI] [PubMed] [Google Scholar]
- Pettersen E.F., Goddard T.D., Huang C.C., Couch G.S., Greenblatt D.M., Meng E.C., Ferrin T.E. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 2004;25:1605–1612. doi: 10.1002/jcc.20084. [DOI] [PubMed] [Google Scholar]
- Picelli S., Faridani O.R., Björklund A.K., Winberg G., Sagasser S., Sandberg R. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 2014;9:171–181. doi: 10.1038/nprot.2014.006. [DOI] [PubMed] [Google Scholar]
- Raj B., Wagner D.E., McKenna A., Pandey S., Klein A.M., Shendure J., Gagnon J.A., Schier A.F. Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain. Nat. Biotechnol. 2018;36:442–450. doi: 10.1038/nbt.4103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rosenberg A.B., Roco C.M., Muscat R.A., Kuchina A., Sample P., Yao Z., Graybuck L.T., Peeler D.J., Mukherjee S., Chen W. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science. 2018;360:176–182. doi: 10.1126/science.aam8999. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sardet C., Dru P., Prodon F. Maternal determinants and mRNAs in the cortex of ascidian oocytes, zygotes and embryos. Biol. Cell. 2005;97:35–49. doi: 10.1042/BC20040126. [DOI] [PubMed] [Google Scholar]
- Satija R., Farrell J.A., Gennert D., Schier A.F., Regev A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 2015;33:495–502. doi: 10.1038/nbt.3192. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schneider C.A., Rasband W.S., Eliceiri K.W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods. 2012;9:671–675. doi: 10.1038/nmeth.2089. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schulz M.H., Zerbino D.R., Vingron M., Birney E. Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics. 2012;28:1086–1092. doi: 10.1093/bioinformatics/bts094. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Scialdone A., Tanaka Y., Jawaid W., Moignard V., Wilson N.K., Macaulay I.C., Marioni J.C., Göttgens B. Resolving early mesoderm diversification through single-cell expression profiling. Nature. 2016;535:289–293. doi: 10.1038/nature18633. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sherrard K., Robin F., Lemaire P., Munro E. Sequential activation of apical and basolateral contractility drives ascidian endoderm invagination. Curr. Biol. 2010;20:1499–1510. doi: 10.1016/j.cub.2010.06.075. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sladitschek H.L., Neveu P.A. MXS-Chaining: A Highly Efficient Cloning Platform for Imaging and Flow Cytometry Approaches in Mammalian Systems. PLoS One. 2015;10:e0124958. doi: 10.1371/journal.pone.0124958. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sladitschek H.L., Neveu P.A. The bimodally expressed microRNA miR-142 gates exit from pluripotency. Mol. Syst. Biol. 2015;11:850. doi: 10.15252/msb.20156525. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sladitschek H.L., Neveu P.A. A gene regulatory network controls the balance between mesendoderm and ectoderm at pluripotency exit. Mol. Syst. Biol. 2019;15:e9043. doi: 10.15252/msb.20199043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sommer C., Straehle C., Köthe U., Hamprecht F.A. Ilastik: Interactive learning and segmentation toolkit. 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro. 2011:230–233. [Google Scholar]
- Telley L., Govindan S., Prados J., Stevant I., Nef S., Dermitzakis E., Dayer A., Jabaudon D. Sequential transcriptional waves direct the differentiation of newborn neurons in the mouse neocortex. Science. 2016;351:1443–1446. doi: 10.1126/science.aad8361. [DOI] [PubMed] [Google Scholar]
- Treen N., Heist T., Wang W., Levine M. Depletion of maternal cyclin B3 contributes to zygotic genome activation in the Ciona embryo. Curr. Biol. 2018;28:1330–1331. doi: 10.1016/j.cub.2018.03.058. [DOI] [PubMed] [Google Scholar]
- Tusi B.K., Wolock S.L., Weinreb C., Hwang Y., Hidalgo D., Zilionis R., Waisman A., Huh J.R., Klein A.M., Socolovsky M. Population snapshots predict early haematopoietic and erythroid hierarchies. Nature. 2018;555:54–60. doi: 10.1038/nature25741. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wada H., Saiga H., Satoh N., Holland P.W. Tripartite organization of the ancestral chordate brain and the antiquity of placodes: insights from ascidian Pax-2/5/8, Hox and Otx genes. Development. 1998;125:1113–1122. doi: 10.1242/dev.125.6.1113. [DOI] [PubMed] [Google Scholar]
- Wagner D.E., Weinreb C., Collins Z.M., Briggs J.A., Megason S.G., Klein A.M. Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo. Science. 2018;360:981–987. doi: 10.1126/science.aar4362. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zalokar M., Sardet C. Tracing of cell lineage in embryonic development of Phallusia mammillata (Ascidia) by vital staining of mitochondria. Dev. Biol. 1984;102:195–205. doi: 10.1016/0012-1606(84)90184-2. [DOI] [PubMed] [Google Scholar]
- Zerbino D.R., Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–829. doi: 10.1101/gr.074492.107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang X., Abreu J.G., Yokota C., MacDonald B.T., Singh S., Coburn K.L.A., Cheong S.-M., Zhang M.M., Ye Q.-Z., Hang H.C. Tiki1 is required for head formation via Wnt cleavage-oxidation and inactivation. Cell. 2012;149:1565–1577. doi: 10.1016/j.cell.2012.04.039. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ziegenhain C., Vieth B., Parekh S., Reinius B., Guillaumet-Adkins A., Smets M., Leonhardt H., Heyn H., Hellmann I., Enard W. Comparative analysis of single-cell RNA sequencing methods. Mol. Cell. 2017;65:631–643.e4. doi: 10.1016/j.molcel.2017.01.023. [DOI] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The accession numbers for the sequencing results reported in this paper are ArrayExpress: E-MTAB-6506, E-MTAB-6508, E-MTAB-6528, E-MTAB-6530. They can be explored at http://digitalembryo.org. Code is available at https://git.embl.de/neveu/morphoseq.