Summary
We developed the Rainbow-seq technology to trace cell division history and reveal single-cell transcriptomes. With distinct fluorescent protein genes as lineage markers, Rainbow-seq enables each single-cell RNA sequencing (RNA-seq) experiment to simultaneously decode the lineage marker genes and read single-cell transcriptomes. We triggered lineage tracking in each blastomere at the 2-cell stage, observed microscopically inequivalent contributions of the progeny to the two embryonic poles at the blastocyst stage, and analyzed every single cell at either 4- or 8-cell stage with deep paired-end sequencing of full-length transcripts. Although lineage difference was not marked unequivocally at a single-gene level, it became clear when the transcriptome was analyzed as a whole. Moreover, several groups of novel transcript isoforms with embedded repeat sequences exhibited lineage difference, suggesting a possible link between DNA demethylation and cell fate decision. Rainbow-seq bridged a critical gap between division history and single-cell RNA-seq assays.
Subject Areas: Preimplantation, Cell fate, Single cell, Lineage tracing, Transposon
Highlights
• Rainbow-seq traces cell division history and reveals single-cell transcriptomes
• A single gene does not exhibit unequivocal lineage difference at 4- and 8-cell stages
• Lineage difference becomes clear when the transcriptome is analyzed as a whole
• Novel RNA isoforms with embedded repeat sequences exhibit lineage difference
Introduction
A central question in developmental biology is how cells break molecular and functional symmetry during mitotic divisions. Two models have been proposed (Altschuler and Wu, 2010). In one model, a progenitor cell first divides into a set of homogeneous cells, and then these homogeneous cells exhibit different tendencies toward different routes of differentiation (homogeneous model). In the other model, every division creates a pair of cells with slight differences in their molecular profile, and accumulation of these differences leads to cell fate commitment (non-homogeneous model).
Mammalian preimplantation development from a zygote to a blastocyst consisting of the functionally differentiated inner cell mass (ICM) and trophectoderm (TE) (Wolpert, 2015) is a perfect example of the symmetry-breaking paradigm (Gardner, 2001, Zernicka-Goetz et al., 2009). The first cell fate decision in mammals was believed to be an example of the homogeneous model (Hiiragi et al., 2006). After fertilization, a mouse zygote was thought to undergo two or more rounds of symmetric cell divisions, which create four or more homogeneous embryonic cells (blastomeres) (Leung and Zernicka-Goetz, 2015, Motosugi et al., 2005). These homogeneous cells further divide and through a stochastic process (Wennekamp and Hiiragi, 2012) eventually self-organize into ICM and TE (Wennekamp et al., 2013) (Figure 1A). However, recent work on single-cell transcriptome profiling reported reproducible inter-blastomere gene expression differences in 2- and 4-cell mouse embryos (Goolam et al., 2016, Biase et al., 2014, Burton et al., 2013, Piotrowska-Nitsche et al., 2005, Shi et al., 2015), offering support to the non-homogeneous model (Vogel, 2005, Hiiragi and Solter, 2006).
Figure 1.
Competing Models and Lineage Tracking Strategy
(A) Homogeneous and non-homogeneous hypotheses differ in whether the 2 blastomeres at 2-cell stage have identical tendencies to contribute their descendant cells to inner cell mass (brown) and trophectoderm (blue-gray).
(B) Cell lineage tracing was enabled by creation of preimplantation embryos heterozygous for Brainbow sequence and Cre and induction of Cre at 2-cell stage. At blastocyst stage, activated Brainbow genes can exhibit either relatively equal (balanced) or biased (unbalanced) distributions in inner cell mass and trophectoderm. The equal and biased distributions would lend support to the homogeneous and the non-homogeneous hypotheses respectively.
The critical missing information for using single-cell transcriptome profiling to test the competing models is the history of cell divisions (Goolam et al., 2016, Plachta et al., 2011, Torres-Padilla et al., 2007, Rodriguez-Terrones et al., 2018). We approached the challenge of connecting cell-to-cell transcriptome differences at later embryonic stages to progenitor cells by developing Rainbow-seq, which combines cell division lineage tracing (Tabansky et al., 2013) with single-cell RNA sequencing.
Advances in cell-specific genomic barcodes are revolutionizing cell lineage tracing. Leveraging genome editing tools (McKenna et al., 2016, Kalhor et al., 2017, Perli et al., 2016, Raj et al., 2018) or combinatorial DNA recombination loci (Polylox) (Pei et al., 2017), researchers were able to insert cell-specific exogenous DNA barcodes into the genomes of progenitor cells. The lineage information is propagated by DNA duplication and can be recovered in the progeny by targeted sequencing of the engineered genomic sequences. We reasoned that combining lineage-specific genomic barcodes and single-cell RNA sequencing (RNA-seq) would reveal single-cell transcriptomes together with cell division lineage. To maximize the chances of obtaining expressible genomic barcodes, we turned to the well-tested Brainbow-2.1 construct that can express one of four fluorescent protein genes (green fluorescent protein [GFP], cyan fluorescent protein [CFP], red fluorescent protein [RFP], yellow fluorescent protein [YFP]) (Livet et al., 2007). These fluorescent protein genes interspersed with recombination sites serve as both genomic barcodes and expressible cell markers. Their expression can be examined microscopically, offering an intermediate step of validation before single-cell RNA-seq analysis.
Results
Cell Lineage Tracing Starting with 2-Cell Embryos
We generated mouse embryos by crossing a strain that is homozygous for the Brainbow sequence (Cai et al., 2013) with a mouse strain that is homozygous for a tamoxifen-inducible Cre recombinase gene (Figure 1B). Tamoxifen treatment was expected to induce CRE translocation to the nucleus, which in turn could induce recombination at loxP sites, and thus lead to activation of one and only one of the four fluorescent protein genes in the Brainbow sequence (Cai et al., 2013). We tested whether these anticipated effects could be achieved by a short application of tamoxifen to early 2-cell embryos (Figure S1A). Upon collection of early 2-cell embryos (42 hours post injection) and a 2-hr 4-hydroxytamoxifen treatment, CRE protein translocated to the nuclei (Figure S1B), and within 1 hr post treatment the nuclear concentration of CRE decreased to the background level (Figure S1B). Embryos that underwent this brief tamoxifen treatment expressed fluorescent proteins at later developmental stages including 4-cell, 8-cell, and blastocyst stages (Figure 2A). Hereafter, we refer to these early 2-cell-stage hybrid embryos that undergo a 2-hr 4-hydroxytamoxifen treatment as rainbow embryos, and to the Cre-activated genes as lineage marker genes.
Figure 2.
Cell Lineage Tracking in Mouse Preimplantation Embryos
(A) Recombination of the Brainbow construct was induced in early 2-cell embryos, and images were acquired at 4-cell, 8-cell, and blastocyst stages. GFP, green fluorescent protein; RFP, red fluorescent protein; DIC, differential interference contrast. Scale bar: 15 μm. Images were acquired by a wide-field microscope containing filters for RFP and GFP.
(B) Two representative blastocyst-stage embryos imaged with DIC, DAPI, GFP, and RFP channels.
(C) Numbers of blastocyst-stage embryos with balanced (orange) and unbalanced (blue) lineage marker expression in the two embryonic poles under three counting methods (columns): requiring at least one-third of cells (one-third) or one-half of cells (half) to express either GFP or RFP, or counting all the imaged embryos (all).
See also Figure S1.
Daughter Cells of Blastomeres at 2-Cell Stage Tend to Contribute Inequivalently to Embryonic and Abembryonic Poles at the Blastocyst Stage
We leveraged the rainbow embryos to revisit the question of whether the descendant cells produced by each blastomere of a 2-cell embryo contribute equivalently to embryonic and abembryonic poles in blastocyst-stage embryos (Hiiragi and Solter, 2004, Plusa et al., 2005, Vogel, 2005). Our experiment was designed as follows. First, a total of N blastocyst-stage rainbow embryos would be imaged by a wide-field fluorescence microscope with filters for GFP, RFP, and DAPI (Figures 2A and 2B). Because we had to devote one of three channels to nuclei staining (DAPI) for cell counting, this experimental design would not be able to detect CFP- and YFP-expressing cells. Next, the imaged embryos with more than one-third of their cells in the embryonic pole expressing a detectable lineage marker (either GFP or RFP) would be used for data analysis. A contingency table would be built for each embryo documenting the numbers of GFP-expressing (GFP+) and non-GFP-expressing (GFP-) cells in the embryonic pole and in the abembryonic pole. An odds ratio (OR) would be calculated for each embryo. If 1/3 < OR < 3, the embryo would be regarded as having balanced GFP+ cells in the two poles; otherwise, the embryo would be regarded as having unbalanced GFP+ cells in the two poles. The same analysis would be carried out for RFP.
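The per-embryo classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the 0.5 continuity correction applied when a table cell is zero is an assumption, since the text does not say how empty cells were handled.

```python
def odds_ratio(pos_emb, neg_emb, pos_abemb, neg_abemb, correction=0.5):
    """Odds ratio of marker-positive cells, embryonic vs. abembryonic pole.

    The 2x2 contingency table is:
                     marker+     marker-
      embryonic      pos_emb     neg_emb
      abembryonic    pos_abemb   neg_abemb
    ASSUMPTION: if any cell is zero, add `correction` to every cell
    (Haldane-Anscombe correction); the source does not specify this.
    """
    cells = [pos_emb, neg_emb, pos_abemb, neg_abemb]
    if any(count == 0 for count in cells):
        cells = [count + correction for count in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)


def classify_embryo(pos_emb, neg_emb, pos_abemb, neg_abemb):
    """Return 'balanced' if 1/3 < OR < 3, else 'unbalanced'."""
    value = odds_ratio(pos_emb, neg_emb, pos_abemb, neg_abemb)
    return "balanced" if 1 / 3 < value < 3 else "unbalanced"


# Hypothetical cell counts, for illustration only.
print(classify_embryo(5, 5, 6, 6))   # OR = 1
print(classify_embryo(8, 2, 1, 9))   # OR = 36
```

The same function would be called once per embryo for GFP and once for RFP, mirroring the design in the text.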
We imaged 18 early blastocysts expressing RFP and/or GFP, of which 12 embryos had more than one-third of cells expressing fluorescent protein. Eight of these embryos (67%) exhibited unbalanced GFP+ or RFP+ cells in the two poles (first column, Figure 2C). For sensitivity analyses, we used two alternative counting methods to re-estimate the proportion of embryos with unbalanced distribution of cell lineage between the two poles. First, we required the embryos included for analysis to have at least 50% of their cells expressing either GFP or RFP, resulting in a total of 9 embryos, of which 6 embryos (67%) exhibited unbalanced GFP+ or RFP+ cells in the two poles (second column, Figure 2C). Second, we counted the GFP+ cells in the embryonic pole (n1) and GFP+ cells in the abembryonic pole (n2) and regarded any embryo with 1/3 < n1/n2 < 3 as having balanced GFP+ cells, and did the same counting for RFP. All 18 embryos were included in this analysis, of which 13 embryos (72%) were unbalanced (third column, Figure 2C). Taken together, approximately two-thirds of rainbow embryos exhibited unbalanced distribution of the two cell lineages in the two embryonic poles. Previous efforts obtained discrepant estimates of such a proportion (Piotrowska et al., 2001, Hiiragi and Solter, 2004, Plusa et al., 2005, Vogel, 2005). Our results obtained from rainbow embryos were better aligned with those suggesting that more embryos exhibit unequal than equal contribution of the two lineages to the embryonic and abembryonic poles. These microscopic analyses confirmed that the recombination strategy introduced lineage-specific expressible marker genes and served as a sanity check for downstream Rainbow-seq analyses.
Single-Cell RNA Sequencing Analysis of Rainbow Embryos
We induced lineage tracking from 2-cell stage and carried out Rainbow-seq on nine 4-cell embryos and four 8-cell embryos. We lost one blastomere from an 8-cell embryo during the manipulation. We combined the Smart-seq2 full-length RNA-seq protocol (Picelli et al., 2014) with deep paired-end sequencing and produced paired-end 100-nt sequencing data from a total of 67 single blastomeres, including 36 from 4-cell stage and 31 from 8-cell stage (Tables S1 and S2). One 8-cell-stage blastomere did not pass our quality control as it produced fewer than 3 million uniquely mapped sequencing read pairs (Row E13, B7, Table S2). The remaining 66 cells yielded on average 21 million uniquely mapped read pairs (mm10) per cell (Figure S2A). We used single-cell RNA-seq reads mapped to the Brainbow-2.1 sequence to determine which lineage marker gene was expressed and categorized the cells from each embryo into two groups (lineage A and B) based on their expressed marker genes (Lineage column, Tables S1 and S2). Data were insufficient to resolve three 4-cell-stage embryos (12 blastomeres) and four 8-cell-stage blastomeres due to limited sequencing reads mapped onto any lineage marker gene. We deposited all sequencing data into GEO (GSE106287) for public access; however, we limited our analyses to the 50 blastomeres whose lineage can be traced to either one of the sister blastomeres at the 2-cell stage.
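As an illustration of how blastomeres might be grouped by their expressed marker, the sketch below calls the dominant marker per cell from read counts mapped to the Brainbow-2.1 fluorescent protein genes and labels the two markers found in an embryo as lineages A and B. The function names, the `min_reads` cutoff of 10, and the read counts are all hypothetical; the actual criteria used in the study are described in the Transparent Methods and are not reproduced here.

```python
def call_marker(read_counts, min_reads=10):
    """Return the marker gene with the most mapped reads, or None.

    ASSUMPTION: a cell is left unresolved when its best marker has fewer
    than `min_reads` reads; the cutoff value is purely illustrative.
    """
    marker, reads = max(read_counts.items(), key=lambda item: item[1])
    return marker if reads >= min_reads else None


def assign_lineages(embryo_cells, min_reads=10):
    """Label each cell of one embryo as lineage 'A', 'B', or None."""
    markers_seen = []  # markers in order of first appearance
    labels = {}
    for cell_id, read_counts in embryo_cells.items():
        marker = call_marker(read_counts, min_reads)
        if marker is None:
            labels[cell_id] = None  # too few marker reads to resolve
            continue
        if marker not in markers_seen:
            markers_seen.append(marker)
        index = markers_seen.index(marker)
        if index > 1:
            raise ValueError("more than two markers detected in one embryo")
        labels[cell_id] = "AB"[index]
    return labels


# Hypothetical marker read counts for one 4-cell embryo.
embryo = {
    "cell1": {"GFP": 120, "RFP": 0},
    "cell2": {"GFP": 95, "RFP": 2},
    "cell3": {"GFP": 1, "RFP": 80},
    "cell4": {"GFP": 3, "RFP": 2},
}
print(assign_lineages(embryo))
```

Because the A and B labels are assigned in order of first appearance within an embryo, they carry no meaning across embryos, which is why downstream comparisons treat lineage as nested within embryo.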
The first two principal components explained 8% of sample variations, where the developmental stage was a major source of variation (Figure S2B). With Fragments Per Kilobase of transcript per Million mapped reads (FPKM) >1 as the threshold for calling expressed genes, the 4-cell-stage blastomeres on average expressed 6,852 coding genes per cell. A total of 15,218 coding genes were expressed in at least one 4-cell-stage blastomere. The 8-cell blastomeres on average expressed 6,474 coding genes per cell, with a total of 15,342 coding genes expressed in at least one 8-cell blastomere.
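For readers unfamiliar with the FPKM threshold, the following sketch shows the standard FPKM definition and the two kinds of counts reported above (genes expressed per cell, and genes expressed in at least one cell). The function names and the toy values are hypothetical and are not taken from the study's pipeline.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def count_expressed(fpkm_by_gene, threshold=1.0):
    """Number of genes with FPKM strictly above the threshold in one cell."""
    return sum(1 for value in fpkm_by_gene.values() if value > threshold)


def expressed_in_any(cells, threshold=1.0):
    """Set of genes with FPKM above the threshold in at least one cell."""
    return {
        gene
        for fpkm_by_gene in cells
        for gene, value in fpkm_by_gene.items()
        if value > threshold
    }


# Toy example: 100 fragments on a 2-kb transcript, 20 million mapped fragments.
print(fpkm(100, 2000, 20_000_000))

# Hypothetical FPKM tables for two cells.
cells = [
    {"geneA": 2.5, "geneB": 0.4},
    {"geneA": 0.0, "geneB": 1.2, "geneC": 1.0},
]
print([count_expressed(cell) for cell in cells])
print(sorted(expressed_in_any(cells)))
```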
Transcriptome-wide Expression Difference between Cell Division Lineages
Recognizing that data of individual genes may not provide conclusive evidence for either the homogeneous or the non-homogeneous model, we analyzed data of the entire transcriptome. We carried out two transcriptome-wide tests. First, we tested whether there is non-zero probability for any subset of genes to exhibit greater between-lineage variation than embryo-to-embryo variation. To this end, we fit a generalized linear model to every gene (g) and calculated the sums of squares that account for the between-lineage variation and the embryo-to-embryo variation. For every gene, we calculated the ratio of the lineage sum of squares to the combined sum of squares of embryo and lineage. The empirical distribution of this ratio (denoted as R) exhibited greater tail probabilities than a series of three β distributions, β(2,5), β(1,3), and β(5,5), each of which has non-zero probability in any non-singular interval near 1 ([1 − δ, 1], δ > 0) (Figures 3A–3D). The non-zero probability for a subset of genes to exhibit greater lineage variation than embryo variation suggests a transcriptome-wide difference between the two division lineages.
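To make the ratio R concrete, the sketch below computes it for one gene from a plain sum-of-squares decomposition. It is an approximation under stated assumptions rather than the authors' procedure: the study fits a generalized linear model (see Transparent Methods), whereas this example uses an ordinary nested decomposition in which lineage is nested within embryo. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def _mean(values):
    return sum(values) / len(values)


def lineage_ratio(expression, embryos, lineages):
    """R = SS_lineage / (SS_embryo + SS_lineage) for one gene.

    expression: expression value of the gene in each blastomere
    embryos:    embryo identifier of each blastomere
    lineages:   lineage label ('A' or 'B') of each blastomere
    ASSUMPTION: lineage is treated as nested within embryo, because the
    A/B labels are only meaningful within a single embryo.
    """
    grand_mean = _mean(expression)
    by_embryo = defaultdict(list)
    by_group = defaultdict(list)
    for value, embryo, lineage in zip(expression, embryos, lineages):
        by_embryo[embryo].append(value)
        by_group[(embryo, lineage)].append(value)

    # Embryo-to-embryo variation: embryo means around the grand mean.
    ss_embryo = sum(
        len(vals) * (_mean(vals) - grand_mean) ** 2
        for vals in by_embryo.values()
    )
    # Between-lineage variation: lineage means around their embryo mean.
    ss_lineage = sum(
        len(vals) * (_mean(vals) - _mean(by_embryo[embryo])) ** 2
        for (embryo, _), vals in by_group.items()
    )
    total = ss_embryo + ss_lineage
    return ss_lineage / total if total > 0 else float("nan")


# Toy data: two 4-cell embryos, two blastomeres per lineage.
embryos = ["e1"] * 4 + ["e2"] * 4
lineages = ["A", "A", "B", "B"] * 2
print(lineage_ratio([1, 1, 3, 3, 1, 1, 3, 3], embryos, lineages))  # lineage-driven
print(lineage_ratio([1, 1, 1, 1, 3, 3, 3, 3], embryos, lineages))  # embryo-driven
```

Values of R near 1 indicate genes whose variation is dominated by lineage, which is the tail of the distribution that the comparison with β distributions examines.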
Figure 3.
Transcriptome-wide Differences between Two Lineages
(A and B) Empirical distributions of the random variable R (ratio of between-lineage variation to combined variation) in 4-cell (A) and 8-cell-stage (B) embryos, superimposed with a series of β distributions [red: β(1,3), blue: β(2,2), green: β(5,5)].
(C and D) Q-Q plots of R versus a series of β distributions in 4-cell (C) and 8-cell (D) stage embryos, where red, blue, and green represent β(1,3), β(2,2), and β(5,5), respectively.
(E and F) Distributions of q-values from lineage equivalence tests based on real data (red) and shuffled data (blue) in 4-cell (E) and 8-cell-stage (F) blastomeres.
Second, after a gene-by-gene test of the null hypothesis that a gene does not exhibit between-lineage expression differences (ANOVA, Transparent Methods), we obtained the empirical distribution of the q-values for all the genes from such a test (red bars, Figures 3E and 3F). To derive a background distribution, we shuffled the lineage labels on every blastomere and carried out the same test (blue bars, Figures 3E and 3F). The distribution of q-values from real data was skewed toward lower values when compared with the q-values from shuffled data (p value < 2.2 × 10⁻¹⁶, Kolmogorov test), reflecting a transcriptome-wide difference between the two cell division lineages.
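The shuffling control and the distribution comparison can be sketched as follows. Two points are assumptions: labels are permuted within each embryo (the text says only that lineage labels were shuffled on every blastomere), and the two-sample Kolmogorov-Smirnov statistic is written out by hand so that the example needs no third-party library. The function names and q-values are hypothetical.

```python
import random


def shuffle_within_embryo(embryos, lineages, rng):
    """Permute lineage labels among the blastomeres of each embryo.

    ASSUMPTION: shuffling is restricted to cells of the same embryo, so
    each embryo keeps its original number of A and B labels.
    """
    shuffled = list(lineages)
    index_by_embryo = {}
    for i, embryo in enumerate(embryos):
        index_by_embryo.setdefault(embryo, []).append(i)
    for indices in index_by_embryo.values():
        labels = [lineages[i] for i in indices]
        rng.shuffle(labels)
        for i, label in zip(indices, labels):
            shuffled[i] = label
    return shuffled


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_a(x) - ECDF_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    for x in sorted(set(a + b)):
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


rng = random.Random(0)  # fixed seed so the shuffle is reproducible
embryos = ["e1"] * 4 + ["e2"] * 4
lineages = ["A", "A", "B", "B"] * 2
print(shuffle_within_embryo(embryos, lineages, rng))

# Hypothetical q-values from real and shuffled labels.
real_q = [0.01, 0.02, 0.03]
shuffled_q = [0.5, 0.6, 0.7]
print(ks_statistic(real_q, shuffled_q))
```

In a full analysis the gene-by-gene test would be rerun on the shuffled labels to produce the background q-values; only the two generic building blocks are shown here.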
Specific Genes with Expression Differences between Cell Division Lineages
The genes with the largest between-lineage differences in transcript abundance at 4-cell stage were involved in the basic transcription machinery including Gtf2f1 (General Transcription Factor IIF Subunit 1, TFIIF) and Eloa (RNA Polymerase II Transcription Factor SIII Subunit A1, SIII), negative regulation of P53 activity (Aurkb) and WNT signaling (Ctnnbip1, Inhibitor of Beta-Catenin and Tcf4), nuclear pore complex (Nup35), protein synthesis in cytoplasm (Rbm3) and mitochondrion (Mrpl10), endoplasmic reticulum (ER) biogenesis (Atl2), and glycogen synthesis (Fbp2) (Figure 4A). The genes with the largest between-lineage variations at 8-cell stage were transcriptional suppressors Bclaf1 and Smad7, the chromatin remodeler Atrx involved in deposition of H3K9me3, histone demethylase Kdm2b related to removal of active histone marks including H3K4me3, genes in vesicle trafficking and secretory pathways including Golgi membrane proteins Glg1 and B4galnt2, trans-Golgi network protein Snx15, and Stx8 (vesicle fusion, Golgi-to-ER transport), cell shape modulator Fez2, and the lincRNA genes Gm12514 and Gm14617 (Figure 4B). Interestingly, suppression of H3K4 demethylase Kdm2b in mouse oocytes resulted in impaired blastocyst formation (see extended Figure 6E in Liu et al. (2016)). Even though the above-mentioned genes appeared to support the non-homogeneous model, the absolute expression differences are small (Figure 4) and statistical significance is unclear due to limited sample size and multiple comparisons. At this point, it became clear that the statistical tests based on the entire transcriptome (Figure 3) served to better evaluate the two competing models than any possible test based on any individual gene.
Figure 4.
Genes with differential expression between two lineages.
Genes with the between-lineage differences at 4- (A) and 8-cell (B) stages. FPKM (y axis) of every blastomere (column) in each embryo (marked by embryo number in columns). Shaded columns delineate different embryos. The blastomeres of the two lineages are marked with A (red) and B (blue).
Figure 6.
A Model of Repeat Sequence-Related Lineage Difference
Decrease of modified cytosine (filled circles) allows repeat sequence (yellow) to be expressed (A). Atrx is expressed in one lineage where ATRX preferentially attaches to hypomethylated (empty circles) repeats and suppresses their expression (B).
Transposon-Related Novel Transcript Isoforms
We tested whether any novel transcript isoforms were specifically expressed in early-stage preimplantation embryos. To this end, we combined all the Rainbow-seq datasets to obtain over 2.1 billion paired-end reads, of which over 1.6 billion read pairs could be uniquely mapped to the genome (mm10). Using Cufflinks (v2.2.1 with the -g parameter), we recorded a total of 20,371 novel transcript isoforms, corresponding to 7,594 annotated genes. Among these novel isoforms, 3,632 (17.8% of all novel isoforms) contained transposon sequences according to the RepeatMasker database (February 2015 release) (Figure 5A), which will be referred to as Transposon Related Novel Isoforms (TRENIs). These 3,632 TRENIs corresponded to 1,701 RefSeq genes, in which “poly(A) RNA binding,” “RNA binding,” “nucleotide binding,” and “nucleic acid binding” were enriched (Benjamini adjusted p value < 1.1 × 10⁻⁷; Gene Ontology functions [DAVID v6.8]) (Figure 5B), suggesting a recurring theme in RNA binding and processing.
Figure 5.
Transposon Related Novel Isoforms (TRENIs)
(A) Pie chart of novel transcript isoforms, including 16,739 that do not contain any repeat sequence (gray) and 3,064 and 673 isoforms with transposons located downstream to start codons (blue) and in 5′ UTRs (yellow).
(B) Enriched GO molecular functions in TRENIs, ranked by multiple hypothesis testing adjusted p values.
(C) Histogram of junction ratios for all TRENI-containing transposons.
(D and E) Histogram of TRENIs' q-values derived from tests of equivalent expression between lineages at 4- (D) and 8-cell (E) stages, based on real data (red) and shuffled data (blue).
(F) Copy number of repeats contained in TRENIs, categorized by transposon family (columns).
(G) Odds ratios between TRENIs and each repeat family (columns). Vertical bars: 95% confidence intervals. Odds ratio was not calculated for those repeat families when the TRENIs associated with a repeat family were fewer than 20.
(H) Copy number of repeats contained in TRENIs' 5′ UTRs, categorized by transposon family (columns).
(I) Odds ratios between 5′UTR-TRENIs and each repeat family (columns). Vertical bars: 95% confidence intervals. Odds ratio was not calculated for those repeat families when the 5′UTR-TRENIs associated with a repeat family were fewer than 20.
See also Figures S3–S6.
We asked whether the incorporation of these transposons in TRENIs was supported by junction reads spanning the transposon and the nearest known RefSeq exons. To this end, we computed the ratio of junction reads to the total number of reads aligned to each transposon, hereafter denoted as junction ratio (JR, Transparent Methods). We identified 3,902 transposons contained in TRENIs, of which 2,817 (72.19%) exhibited a JR of 0.9 or above (Figure 5C).
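The junction ratio and the fraction of well-supported transposons can be illustrated as below. This is a sketch of the arithmetic only: the function names and per-transposon read counts are hypothetical, and how junction reads are identified from alignments is left to the Transparent Methods.

```python
def junction_ratio(junction_reads, total_reads):
    """JR = junction reads / all reads aligned to the transposon."""
    if total_reads == 0:
        return None  # no reads aligned, ratio undefined
    return junction_reads / total_reads


def fraction_supported(read_counts, threshold=0.9):
    """Fraction of transposons whose JR is at or above the threshold.

    read_counts: list of (junction_reads, total_reads), one per transposon.
    Transposons with an undefined JR are counted as unsupported here
    (an illustrative choice, not taken from the source).
    """
    ratios = [junction_ratio(j, t) for j, t in read_counts]
    supported = sum(1 for r in ratios if r is not None and r >= threshold)
    return supported / len(ratios)


# Hypothetical (junction_reads, total_reads) for four transposons.
counts = [(45, 50), (10, 40), (30, 30), (0, 0)]
print(fraction_supported(counts))

# The proportion reported in the text: 2,817 of 3,902 transposons.
print(round(100 * 2817 / 3902, 2))
```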
To test whether TRENIs exhibit between-lineage expression difference, we derived a q-value for each of the 3,632 TRENIs by ANOVA (Transparent Methods). For comparison, we shuffled the lineage labels on all blastomeres and derived q-values again (background q-values). The distribution of real data q-values differed from that of background q-values (p value < 2.2 × 10⁻¹⁶, Kolmogorov test) (Figures 5D, 5E, and S3). More importantly, a total of 37 TRENIs exhibited q-values less than 0.01, whereas none of the background q-values were less than the same threshold (Figure S3). Finally, all the repeats expressed in 4- and 8-cell embryos, regardless of whether they were contained within TRENIs, did not exhibit statistically discernible between-lineage differences (the distribution of q-values derived from real data was not different from that of shuffled data) (Figure S4).
Murine Alu Makes the Largest Contribution to Novel Transcript Isoforms
We asked whether different repeat families contributed equally to TRENIs. Of the total of 58 repeat families (RepeatMasker), 18 families had 20 or more genomic copies included in TRENIs (Figure 5F). To account for genome-wide copy numbers of each repeat family, we calculated the OR for each repeat family to be contained in TRENIs. When repeat families were ranked by this OR, murine Alu, B4, ID, and B2 were most enriched in TRENIs (Figure 5G). We asked whether the association between any repeat family and TRENIs would become more pronounced when only high-confidence transposon-containing TRENIs were included in the analysis. To this end, we repeated the above analysis using only the 2,817 transposons with JR ≥ 0.9, which resulted in similar ORs with Alu being the most enriched repeat family (Figure S5).
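A family-level enrichment OR with a 95% confidence interval could be computed along the following lines. The 2x2 layout and the Woolf (log-normal) interval are assumptions chosen for illustration; the text does not state how the confidence intervals in Figure 5G were obtained. The cutoff of 20 copies follows the figure legend, and all counts in the example are hypothetical.

```python
import math


def family_enrichment(in_treni, family_total, all_in_treni, genome_total,
                      min_copies=20, z=1.96):
    """Odds ratio (with 95% CI) for a repeat family to be contained in TRENIs.

    2x2 table built from copy numbers:
                       in TRENIs   not in TRENIs
      this family         a             c
      other families      b             d
    Returns None when fewer than `min_copies` copies of the family are in
    TRENIs, mirroring the cutoff stated in the figure legend.
    ASSUMPTION: the interval is the Woolf interval exp(ln OR +/- z * SE).
    """
    if in_treni < min_copies:
        return None
    a = in_treni
    b = all_in_treni - in_treni
    c = family_total - in_treni
    d = genome_total - family_total - b
    odds = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds) - z * standard_error)
    upper = math.exp(math.log(odds) + z * standard_error)
    return odds, lower, upper


# Hypothetical copy numbers, for illustration only:
# 30 of 130 family copies lie in TRENIs; 100 of 1,000 repeats overall do.
print(family_enrichment(30, 130, 100, 1000))
print(family_enrichment(5, 60, 100, 1000))  # below the 20-copy cutoff
```

An interval that excludes 1 would indicate enrichment (above 1) or depletion (below 1) of that family in TRENIs relative to its genome-wide copy number.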
Next, we asked whether different repeat families contributed equally to TRENIs' 5′ UTRs. There were a total of 673 (18.53%) TRENIs in which the transposons were located upstream of previously annotated start codons (5′ UTR-TRENIs) (Figure 5H), among which 418 contained murine Alu, simple repeat, low complexity, and B4 and B2 repeats in 5′ UTRs, where murine Alu alone accounted for more than three hundred 5′ UTRs (Figure 5H). We calculated the OR to quantify the association of transposons in each transposon family with 5′ UTR-TRENIs. Murine Alu exhibited the largest enrichment in the 5′ UTRs of these novel transcript isoforms (OR = 3.03, p value = 0, chi-square test) (Figure 5I). In contrast, the other repeat families exhibited either moderate enrichment (1 < OR < 1.5) or depletion (OR < 1) in 5′ UTR-TRENIs (Figure 5I). Finally, 9 and 26 5′ UTR-TRENIs exhibited expression differences between the two division lineages at 4- and 8-cell stages, respectively, including novel transcript isoforms of Gata4 (Figure S6A) and Khdc1a (Figure S6B).
Discussion
Lack-of-Continuity Dilemma
Not a single gene has been reported to exhibit continuous between-lineage expression difference starting from the two blastomeres at 2-cell stage until the formation of ICM and TE at blastocyst stage. This is not completely unexpected considering that comprehensive transcriptome analyses revealed highly dynamic gene transcription in early-stage mouse preimplantation embryos (Kidder and Pedersen, 1982, Xie et al., 2010), likely attributable to combined effects of cleavage-related split of RNA pool, RNA degradation, and zygotic genome activation (Biase et al., 2014). The lack of identified single gene(s) with continuous expression divergence posed a dilemma to the non-homogeneous hypothesis: if the two blastomeres at 2-cell stage are poised for cell fate decisions, what could be the molecular means to record and later exhibit such a difference? Hereafter we call this dilemma the lack-of-continuity dilemma.
Reconciliations to Lack-of-Continuity Dilemma
The dilemma was rooted in the non-homogeneous model and may be resolved in several ways. First, the cell fate decision is initiated after the first cleavage. This idea essentially refutes the non-homogeneous model for the first cleavage. Second, the lineage difference is passed from one gene activated at an earlier stage, for example, 2-cell stage, onto another gene activated at a later stage. Given that the transcription rate and mRNA half-life are both an order of magnitude smaller than the time of a cleavage, this model is within the boundaries of biophysics principles. Third, the lineage difference is recorded in the order of expression levels of a pair or list of genes, and this order is preserved throughout preimplantation development. For example, at 2-cell stage the expression level order is A > B in one blastomere and B > A in the other blastomere, and such an order is preserved in the ICM and TE. The third explanation was supported by single-cell RNA-seq data (Biase et al., 2014). Taken together, there are plausible means to explain the dilemma while retaining the non-homogeneous model.
The Rainbow-seq Approach
Rainbow-seq addresses the lack-of-continuity dilemma from another perspective. The main idea is to combine cell division tracing with single-cell RNA-seq. A genetic cell-lineage tracer was activated at 2-cell stage and analyzed at later stages. This method has two advantages. First, activation of distinct tracer genes in each blastomere's genome was achieved by a relatively simple procedure. Second, through a genetic tracer embedded in the genome, DNA duplication ensured inheritance of the tracer during cell divisions. We used Rainbow-seq to test the consequential events of two cell division lineages created by the first cleavage.
The Lineage Difference Lies in the Collective Behavior of Many Genes
When each gene is individually analyzed, even the most significant genes only exhibited small expression differences between the lineages (Figure 4). However, the difference of the two lineages became clear when the entire transcriptome was analyzed as a whole. In other words, the collective distribution of transcriptome exhibited a significant difference in the two cell division lineages (Figure 3). This observation has important implications. On the one hand, it repeated once again the negative results of previous studies that not a single gene exhibited clear difference between the two division lineages created by the first cleavage. On the other hand, the two division lineages in fact exhibit pronounced difference, which lies in the integral expression of many genes. To translate this difficult concept, we would like to make an analogy that “the individual seed beads all look similar, but two bead-woven crafts can exhibit very different patterns.”
Recent studies have reported inter-blastomere expression difference of several specific genes at the 4-cell stage or later stages (Biase et al., 2014, Burton et al., 2013, Piotrowska-Nitsche et al., 2005). This study pushed the initiation time point of between-blastomere difference to the 2-cell stage. The top-ranked genes with between-lineage differences appeared to emphasize a theme of negative regulation, including inhibitors of the WNT pathway Ctnnbip1 (Inhibitor of Beta-Catenin and Tcf4) (Tago et al., 2000) and Smad7 (Han et al., 2006), aurora kinase B that negatively regulates P53's transcriptional activity by promoting its degradation (Gully et al., 2012), chromatin remodeler Atrx that promotes repressive histone mark H3K9me3 and formation of heterochromatin (Sadic et al., 2015), histone demethylase Kdm2b that removes active histone mark H3K4me3 (Tsukada et al., 2006), and glycosyltransferase B4galnt2 involved in negative regulation of cell-cell adhesion (Kawamura et al., 2005) (Figure 4). Considering that negative feedback as exemplified in Delta-Notch signaling is perhaps the best known molecular mechanism for neighboring cells to take on divergent functions (Artavanis-Tsakonas et al., 1999), the emergence of negative regulators from Rainbow-seq analysis may allude to cell-cell communication between the first two cell lineages. The other emerged theme of vesicle trafficking and secretion as exemplified by Atl2, Glg1, B4galnt2, Snx15, and Stx8 is consistent with the idea of cell-cell communication at very early embryonic stages (Calarco-Gillam, 1986, Sozen et al., 2014). Finally, Fbp2's divergent presence in the two lineages is reminiscent of a sperm-related embryo patterning hypothesis (Zernicka-Goetz, 2002, Takaoka and Hamada, 2014), considering the presence of Fbp2 mRNA in mouse sperm (Schuster et al., 2016) and interactions of FBP2 and ALDOA, a sperm-head protein essential for sperm-egg interaction (Petit et al., 2013).
Contribution of Transposon Related Novel RNA Isoforms (TRENI) and DNA Demethylation to Lineage Difference
A group of novel transcript isoforms of known genes that contained repeat sequences, called TRENIs, exhibited lineage difference, especially at the 8-cell stage (Figure 5E). The lineage difference is more pronounced in each group of TRENIs (Figures 5D and 5E) than in an individual TRENI (Figure S6), reinforcing the idea that lineage difference at such early stages of development is marked by the cumulative behavior of many genes. At 8-cell stage, chromatin remodeler Atrx that responds to DNA demethylation and suppresses the expression of hypomethylated repeat sequence in preimplantation embryos (He et al., 2015) exhibited between-lineage expression difference (Figure 4B). These data suggest a model where DNA demethylation on repeat sequence is asymmetrically compensated by ATRX in the two cell lineages, leading to differential expression of transposon-containing transcripts (Figure 6). Murine Alu repeats were most enriched in TRENIs, and the murine Alu family was the only repeat family with significant enrichment at the 5′ UTR of TRENIs. These expression data may offer an explanation for the selective retention of Alu repeats in upstream and intronic regions of known genes (Tsirigos and Rigoutsos, 2009), where the selectively retained Alu repeats are part of rare transcript isoforms, which are specifically expressed in preimplantation embryos. Furthermore, asymmetric suppression of these repeats leads to asymmetric expression of genes required for the first cell fate decision, which is the functional force for selective retention. Consistent with this idea, the same group of genes involved in “RNA binding” were enriched in TRENIs and were reported by an independent analysis as the neighboring genes to selectively retained Alu repeats (Tsirigos and Rigoutsos, 2009).
However, the other transcripts of repeat sequences that were expressed in preimplantation embryos (Rodriguez-Terrones and Torres-Padilla, 2018) but not contained within TRENIs did not exhibit statistically discernible lineage differences (Figure S4).
Limitations of the Study
Tradeoff between the Number of Embryos and Sequencing Depth
Our experimental design was determined by our analysis goals. Identifying potentially very small inter-blastomere differences and novel transcript isoforms required deep paired-end sequencing of the full-length transcriptome in single cells. Most single cells in this study produced 20 to 30 million uniquely mapped 100-nt read pairs, that is, 40–60 million uniquely mapped 100-nt reads per cell. Sequencing cost remains a limitation at this depth. Future applications that do not need to resolve transposon-derived transcripts or novel RNA isoforms could substantially reduce the sequencing depth, use single-end sequencing, or even switch to protocols that do not sequence full-length transcripts. In exchange, more single cells could be sequenced.
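The arithmetic behind this tradeoff can be illustrated with a minimal Python sketch. Only the 20 to 30 million read pairs per cell come from this study; the total sequencing budget and the shallower depths below are hypothetical values chosen for illustration, and the function names are our own.

```python
def reads_from_pairs(read_pairs: int) -> int:
    """Each paired-end read pair contributes two reads."""
    return 2 * read_pairs


def cells_within_budget(total_read_pairs: int, pairs_per_cell: int) -> int:
    """Number of whole cells that fit in a fixed budget of read pairs."""
    return total_read_pairs // pairs_per_cell


# Depth reported in this study: 20 to 30 million uniquely mapped read pairs per cell.
for pairs in (20_000_000, 30_000_000):
    print(f"{pairs:,} read pairs -> {reads_from_pairs(pairs):,} reads")

# Hypothetical fixed budget of 3 billion uniquely mapped read pairs (assumption).
budget = 3_000_000_000
# 30 million matches the upper depth in this study; the two shallower
# depths are illustrative assumptions, not recommendations.
for pairs_per_cell in (30_000_000, 5_000_000, 1_000_000):
    n_cells = cells_within_budget(budget, pairs_per_cell)
    print(f"{pairs_per_cell:,} pairs per cell -> {n_cells:,} cells")
```

Under these assumed numbers, lowering the per-cell depth by a given factor raises the number of cells that can be sequenced by the same factor, at the cost of sensitivity to small expression differences and rare isoforms.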
Methods
All methods can be found in the accompanying Transparent Methods supplemental file.
Acknowledgments
The authors thank Drs. Bo Huang and Xiaoyi Cao for useful discussions and help with graphical presentation. This work was supported by NIH grants R00HL122368 (Z.C.) and DP1HD087990 (S.Zhong) and by the National Key Research and Development Program of China grant 2016YFC0901704 (S.Zhou).
Author Contributions
Conceptualization, S.Zhong; Methodology, F.H.B., Q.W., R.C., M.R.-A., and S.Zhong; Investigation, F.H.B., Q.W., R.C., M.R.-A., S.Zhou, Z.C., and S.Zhong; Writing – Original Draft, F.H.B., Q.W., R.C., and S.Zhong; Writing – Review & Editing, Z.C. and S.Zhong; Funding Acquisition, S.Zhou, Z.C., and S.Zhong; Resources, F.H.B. and S.Zhong; Supervision, S.Zhong.
Declaration of Interests
S.Zhong is a co-founder and a board member of Genemo Inc.
Published: September 28, 2018
Footnotes
Supplemental Information includes Transparent Methods, six figures, and two tables and can be found with this article online at https://doi.org/10.1016/j.isci.2018.08.009.
Data and Software Availability
The GEO accession number for the data reported in this paper is GEO: GSE106287.
References
- Altschuler S.J., Wu L.F. Cellular heterogeneity: do differences make a difference? Cell. 2010;141:559–563. doi: 10.1016/j.cell.2010.04.033.
- Artavanis-Tsakonas S., Rand M.D., Lake R.J. Notch signaling: cell fate control and signal integration in development. Science. 1999;284:770–776. doi: 10.1126/science.284.5415.770.
- Biase F.H., Cao X., Zhong S. Cell fate inclination within 2-cell and 4-cell mouse embryos revealed by single-cell RNA sequencing. Genome Res. 2014;24:1787–1796. doi: 10.1101/gr.177725.114.
- Burton A., Muller J., Tu S., Padilla-Longoria P., Guccione E., Torres-Padilla M.E. Single-cell profiling of epigenetic modifiers identifies PRDM14 as an inducer of cell fate in the mammalian embryo. Cell Rep. 2013;5:687–701. doi: 10.1016/j.celrep.2013.09.044.
- Cai D., Cohen K.B., Luo T., Lichtman J.W., Sanes J.R. Improved tools for the Brainbow toolbox. Nat. Methods. 2013;10:540–547. doi: 10.1038/nmeth.2450.
- Calarco-Gillam P. Cell-cell interactions in mammalian preimplantation development. Dev. Biol. (N. Y. 1985) 1986;2:329–371. doi: 10.1007/978-1-4613-2141-5_8.
- Gardner R.L. Specification of embryonic axes begins before cleavage in normal mouse development. Development. 2001;128:839–847. doi: 10.1242/dev.128.6.839.
- Goolam M., Scialdone A., Graham S.J.L., Macaulay I.C., Jedrusik A., Hupalowska A., Voet T., Marioni J.C., Zernicka-Goetz M. Heterogeneity in Oct4 and Sox2 targets biases cell fate in 4-cell mouse embryos. Cell. 2016;165:61–74. doi: 10.1016/j.cell.2016.01.047.
- Gully C.P., Velazquez-Torres G., Shin J.H., Fuentes-Mattei E., Wang E., Carlock C., Chen J., Rothenberg D., Adams H.P., Choi H.H. Aurora B kinase phosphorylates and instigates degradation of p53. Proc. Natl. Acad. Sci. USA. 2012;109:E1513–E1522. doi: 10.1073/pnas.1110287109.
- Han G., Li A.G., Liang Y.Y., Owens P., He W., Lu S., Yoshimatsu Y., Wang D., Ten Dijke P., Lin X., Wang X.J. Smad7-induced beta-catenin degradation alters epidermal appendage development. Dev. Cell. 2006;11:301–312. doi: 10.1016/j.devcel.2006.06.014.
- He Q., Kim H., Huang R., Lu W., Tang M., Shi F., Yang D., Zhang X., Huang J., Liu D., Songyang Z. The Daxx/Atrx complex protects tandem repetitive elements during DNA hypomethylation by promoting H3K9 trimethylation. Cell Stem Cell. 2015;17:273–286. doi: 10.1016/j.stem.2015.07.022.
- Hiiragi T., Louvet-Vallee S., Solter D., Maro B. Embryology: does prepatterning occur in the mouse egg? Nature. 2006;442:E3–E4; discussion E4. doi: 10.1038/nature04907.
- Hiiragi T., Solter D. First cleavage plane of the mouse egg is not predetermined but defined by the topology of the two apposing pronuclei. Nature. 2004;430:360–364. doi: 10.1038/nature02595.
- Hiiragi T., Solter D. Fatal flaws in the case for prepatterning in the mouse egg. Reprod. Biomed. Online. 2006;12:150–152. doi: 10.1016/s1472-6483(10)60854-1.
- Kalhor R., Mali P., Church G.M. Rapidly evolving homing CRISPR barcodes. Nat. Methods. 2017;14:195–200. doi: 10.1038/nmeth.4108.
- Kawamura Y.I., Kawashima R., Fukunaga R., Hirai K., Toyama-Sorimachi N., Tokuhara M., Shimizu T., Dohi T. Introduction of Sd(a) carbohydrate antigen in gastrointestinal cancer cells eliminates selectin ligands and inhibits metastasis. Cancer Res. 2005;65:6220–6227. doi: 10.1158/0008-5472.CAN-05-0639.
- Kidder G.M., Pedersen R.A. Turnover of embryonic messenger RNA in preimplantation mouse embryos. J. Embryol. Exp. Morphol. 1982;67:37–49.
- Leung C.Y., Zernicka-Goetz M. Mapping the journey from totipotency to lineage specification in the mouse embryo. Curr. Opin. Genet. Dev. 2015;34:71–76. doi: 10.1016/j.gde.2015.08.002.
- Liu X., Wang C., Liu W., Li J., Li C., Kou X., Chen J., Zhao Y., Gao H., Wang H. Distinct features of H3K4me3 and H3K27me3 chromatin domains in pre-implantation embryos. Nature. 2016;537:558–562. doi: 10.1038/nature19362.
- Livet J., Weissman T.A., Kang H., Draft R.W., Lu J., Bennis R.A., Sanes J.R., Lichtman J.W. Transgenic strategies for combinatorial expression of fluorescent proteins in the nervous system. Nature. 2007;450:56–62. doi: 10.1038/nature06293.
- McKenna A., Findlay G.M., Gagnon J.A., Horwitz M.S., Schier A.F., Shendure J. Whole-organism lineage tracing by combinatorial and cumulative genome editing. Science. 2016;353:aaf7907. doi: 10.1126/science.aaf7907.
- Motosugi N., Bauer T., Polanski Z., Solter D., Hiiragi T. Polarity of the mouse embryo is established at blastocyst and is not prepatterned. Genes Dev. 2005;19:1081–1092. doi: 10.1101/gad.1304805.
- Pei W., Feyerabend T.B., Rossler J., Wang X., Postrach D., Busch K., Rode I., Klapproth K., Dietlein N., Quedenau C. Polylox barcoding reveals haematopoietic stem cell fates realized in vivo. Nature. 2017;548:456–460. doi: 10.1038/nature23653.
- Perli S.D., Cui C.H., Lu T.K. Continuous genetic recording with self-targeting CRISPR-Cas in human cells. Science. 2016;353. doi: 10.1126/science.aag0511.
- Petit F.M., Serres C., Bourgeon F., Pineau C., Auer J. Identification of sperm head proteins involved in zona pellucida binding. Hum. Reprod. 2013;28:852–865. doi: 10.1093/humrep/des452.
- Picelli S., Faridani O.R., Bjorklund A.K., Winberg G., Sagasser S., Sandberg R. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 2014;9:171–181. doi: 10.1038/nprot.2014.006.
- Piotrowska-Nitsche K., Perea-Gomez A., Haraguchi S., Zernicka-Goetz M. Four-cell stage mouse blastomeres have different developmental properties. Development. 2005;132:479–490. doi: 10.1242/dev.01602.
- Piotrowska K., Wianny F., Pedersen R.A., Zernicka-Goetz M. Blastomeres arising from the first cleavage division have distinguishable fates in normal mouse development. Development. 2001;128:3739–3748. doi: 10.1242/dev.128.19.3739.
- Plachta N., Bollenbach T., Pease S., Fraser S.E., Pantazis P. Oct4 kinetics predict cell lineage patterning in the early mammalian embryo. Nat. Cell Biol. 2011;13:117–123. doi: 10.1038/ncb2154.
- Plusa B., Hadjantonakis A.K., Gray D., Piotrowska-Nitsche K., Jedrusik A., Papaioannou V.E., Glover D.M., Zernicka-Goetz M. The first cleavage of the mouse zygote predicts the blastocyst axis. Nature. 2005;434:391–395. doi: 10.1038/nature03388.
- Raj B., Wagner D.E., McKenna A., Pandey S., Klein A.M., Shendure J., Gagnon J.A., Schier A.F. Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain. Nat. Biotechnol. 2018;36:442–450. doi: 10.1038/nbt.4103.
- Rodriguez-Terrones D., Gaume X., Ishiuchi T., Weiss A., Kopp A., Kruse K., Penning A., Vaquerizas J.M., Brino L., Torres-Padilla M.E. A molecular roadmap for the emergence of early-embryonic-like cells in culture. Nat. Genet. 2018;50:106–119. doi: 10.1038/s41588-017-0016-5.
- Rodriguez-Terrones D., Torres-Padilla M.E. Nimble and ready to mingle: transposon outbursts of early development. Trends Genet. 2018. doi: 10.1016/j.tig.2018.06.006.
- Sadic D., Schmidt K., Groh S., Kondofersky I., Ellwart J., Fuchs C., Theis F.J., Schotta G. Atrx promotes heterochromatin formation at retrotransposons. EMBO Rep. 2015;16:836–850. doi: 10.15252/embr.201439937.
- Schuster A., Tang C., Xie Y., Ortogero N., Yuan S., Yan W. SpermBase: a database for sperm-borne RNA contents. Biol. Reprod. 2016;95:99. doi: 10.1095/biolreprod.116.142190.
- Shi J., Chen Q., Li X., Zheng X., Zhang Y., Qiao J., Tang F., Tao Y., Zhou Q., Duan E. Dynamic transcriptional symmetry-breaking in pre-implantation mammalian embryo development revealed by single-cell RNA-seq. Development. 2015;142:3468–3477. doi: 10.1242/dev.123950.
- Sozen B., Can A., Demir N. Cell fate regulation during preimplantation development: a view of adhesion-linked molecular interactions. Dev. Biol. 2014;395:73–83. doi: 10.1016/j.ydbio.2014.08.028.
- Tabansky I., Lenarcic A., Draft R.W., Loulier K., Keskin D.B., Rosains J., Rivera-Feliciano J., Lichtman J.W., Livet J., Stern J.N. Developmental bias in cleavage-stage mouse blastomeres. Curr. Biol. 2013;23:21–31. doi: 10.1016/j.cub.2012.10.054.
- Tago K., Nakamura T., Nishita M., Hyodo J., Nagai S., Murata Y., Adachi S., Ohwada S., Morishita Y., Shibuya H., Akiyama T. Inhibition of Wnt signaling by ICAT, a novel beta-catenin-interacting protein. Genes Dev. 2000;14:1741–1749.
- Takaoka K., Hamada H. Origin of cellular asymmetries in the pre-implantation mouse embryo: a hypothesis. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2014;369. doi: 10.1098/rstb.2013.0536.
- Torres-Padilla M.E., Parfitt D.E., Kouzarides T., Zernicka-Goetz M. Histone arginine methylation regulates pluripotency in the early mouse embryo. Nature. 2007;445:214–218. doi: 10.1038/nature05458.
- Tsirigos A., Rigoutsos I. Alu and B1 repeats have been selectively retained in the upstream and intronic regions of genes of specific functional classes. PLoS Comput. Biol. 2009;5:e1000610. doi: 10.1371/journal.pcbi.1000610.
- Tsukada Y., Fang J., Erdjument-Bromage H., Warren M.E., Borchers C.H., Tempst P., Zhang Y. Histone demethylation by a family of JmjC domain-containing proteins. Nature. 2006;439:811–816. doi: 10.1038/nature04433.
- Vogel G. Embryology. Embryologists polarized over early cell fate determination. Science. 2005;308:782–783. doi: 10.1126/science.308.5723.782.
- Wennekamp S., Hiiragi T. Stochastic processes in the development of pluripotency in vivo. Biotechnol. J. 2012;7:737–744. doi: 10.1002/biot.201100357.
- Wennekamp S., Mesecke S., Nedelec F., Hiiragi T. A self-organization framework for symmetry breaking in the mammalian embryo. Nat. Rev. Mol. Cell Biol. 2013;14:452–459. doi: 10.1038/nrm3602.
- Wolpert L. Principles of Development. Oxford University Press; Oxford, United Kingdom: 2015.
- Xie D., Chen C.C., Ptaszek L.M., Xiao S., Cao X., Fang F., Ng H.H., Lewin H.A., Cowan C., Zhong S. Rewirable gene regulatory networks in the preimplantation embryonic development of three mammalian species. Genome Res. 2010;20:804–815. doi: 10.1101/gr.100594.109.
- Zernicka-Goetz M. Patterning of the embryo: the first spatial decisions in the life of a mouse. Development. 2002;129:815–829. doi: 10.1242/dev.129.4.815.
- Zernicka-Goetz M., Morris S.A., Bruce A.W. Making a firm decision: multifaceted regulation of cell fate in the early mouse embryo. Nat. Rev. Genet. 2009;10:467–477. doi: 10.1038/nrg2564.