Proceedings of the National Academy of Sciences of the United States of America. 2017 Oct 25;114(45):E9579–E9588. doi: 10.1073/pnas.1708341114

GATA2/3-TFAP2A/C transcription factor network couples human pluripotent stem cell differentiation to trophectoderm with repression of pluripotency

Christian Krendl a,1, Dmitry Shaposhnikov a,1, Valentyna Rishko a, Chaido Ori a, Christoph Ziegenhain b, Steffen Sass c, Lukas Simon c, Nikola S Müller c, Tobias Straub d, Kelsey E Brooks e, Shawn L Chavez e, Wolfgang Enard b, Fabian J Theis c,f, Micha Drukker a,2
PMCID: PMC5692555  PMID: 29078328

Significance

This study provides a mechanistic explanation for the differentiation of trophoblasts from human pluripotent stem cells, a process relying on BMP morphogens. We found that a network of the transcription factors GATA2, GATA3, TFAP2A, and TFAP2C regulates early trophoblast progenitor specification by activating placental genes and inhibiting the pluripotency gene OCT4, thus acting to couple trophoblast specification with exit from pluripotency. To demonstrate the relevance of our findings in vivo, we show that down-regulating GATA3 in primate embryos prevents trophectoderm specification. In addition, we present a genome-wide analysis of active and inactive chromatin during trophoblast progenitor specification. These results provide a basis to guide investigations of human trophectoderm development.

Keywords: trophectoderm, trophoblast, BMP4, hESC, differentiation

Abstract

To elucidate the molecular basis of BMP4-induced differentiation of human pluripotent stem cells (PSCs) toward progeny with trophectoderm characteristics, we produced transcriptome and epigenome (H3K4me3, H3K27me3, and CpG methylation) maps of trophoblast progenitors purified using the surface marker APA. We combined them with the temporally resolved transcriptome of the preprogenitor phase and of single APA+ cells. This revealed a circuit of bivalent TFAP2A, TFAP2C, GATA2, and GATA3 transcription factors, coined collectively the “trophectoderm four” (TEtra), which are also present in human trophectoderm in vivo. At the onset of differentiation, the TEtra factors occupy multiple sites in epigenetically inactive placental genes and in OCT4. Functional manipulation of GATA3 and TFAP2A indicated that they directly couple trophoblast-specific gene induction with suppression of pluripotency. In accordance, knocking down GATA3 in primate embryos resulted in a failure to form trophectoderm. The discovery of the TEtra circuit indicates how trophectoderm commitment is regulated in human embryogenesis.


The earliest cell fate commitment event that takes place during eutherian embryogenesis is the bifurcation of totipotent cells into the inner cell mass that generates the fetus, and trophectoderm (TE) precursors that give rise to the chorion and subsequently the fetal portion of the placenta (1). Studies of TE specification in the mouse revealed the importance of the transcription factors (TFs) Tead4 (2, 3), Cdx2 (4, 5), Eomes (4, 6), Gata3 (7), and AP-2γ (Tfap2c) (8, 9). Further differentiation of the precursors involves TFs such as the placenta morphogenesis master regulator Gcm1, Elf5, which promotes self-renewal of mouse trophoblast stem cells (TSCs), and Hand1 and Mash2 that regulate giant cell and spongiotrophoblast development, respectively (10–13). The expression of Cdx2 in the outer layer cells of the embryo, which are destined to become trophoblasts, is thought to antagonize pluripotency by interfering with Oct4 autoregulation (5). In accordance with these key roles, overexpression of Tead4, Cdx2, Eomes, or Gata3 in mouse embryonic stem cells (ESCs) is sufficient to drive them toward the TE fate (5, 7, 14). Recently, it has also been shown that ectopic expression of Tfap2c, Gata3, Eomes, and either Myc or Ets2 converts mouse fibroblasts to functional trophoblast stem-like cells (15–17).

The molecular mechanism of TE specification in humans has not been elucidated, but expression studies have shown that orthologs of some of the key TFs implicated in mouse TE development, including CDX2, TFAP2C, GCM1, and GATA3, are found in human TE progenitors at the blastocyst stage (18–21). The early expression of TFAP2 TFs could have been inferred from deregulation of their target genes in cases of placental dysfunction (22). Other mouse TFs, however, like Eomes and Elf5, were not unequivocally assigned to the TE lineage in human embryos (23).

Human ESCs (hESCs) and induced pluripotent stem cells (iPSCs) can differentiate into trophoblast-like cells by treatment with BMP4, BMP5, BMP10, or BMP13 (24–29). Although these morphogens were not initially implicated in trophoblast development in the mouse (30, 31), it was recently found that components of the BMP signaling pathway are indeed differentially expressed in TE-fated cells immediately following the first wave of asymmetric divisions in mouse embryos, and that BMP signaling is required for development of mouse TE in vivo (32). The treatment of human ESCs by BMPs also gives rise to mesoderm lineages, but this process, unlike derivation of trophoblasts, is known to be Wnt dependent (33).

Exposure of human ESCs and iPSCs (collectively PSCs) to BMPs triggers induction of a broad cohort of TFs including CDX2, GATA2, GATA3, TFAP2A, TFAP2C, MSX2, SSI3, HEY1, GCM1, and others, the majority of which by analogy to mouse knowledge could be considered candidate factors involved in human TE specification (29, 34). However, it has yet to be determined which of these TFs actually participate in the initial specification of human trophoblast progenitors, how the TFs are configured in a circuit that drives further trophoblast development by transcription of placenta-specific genes, and to what extent this network governs primate TE specification in vivo.

The colocalization of the transcription promoting and inhibiting trimethylation modifications on lysine 4 and 27 of the histone H3 tail, namely H3K4me3 and H3K27me3, respectively, contributes to transcriptional poising of developmental genes in ESCs (35, 36). Upon differentiation and depending on the lineage, the H3K4me3 mark dominates the expressed genes that lose the H3K27me3 mark, and conversely, the H3K27me3 mark is enhanced and the H3K4me3 mark is reduced in nonexpressed genes. These bivalent genes harbor cytosine-guanine dinucleotide (CpG)-dense promoter regions [high-CpG promoters (HCPs)] that are silenced by CpG methylation (37). Interestingly, a distinct class of developmental genes does not exhibit bivalency in ESCs and is characterized by low presence of CpGs in promoters [low-CpG promoters (LCPs)] (36). The mode of activation and repression of this class, which is thought to include tissue-specific regulators and structural genes, is considered distinct from that of HCP genes (38).

We have previously identified a panel of cell surface markers, including aminopeptidase A (APA) (or CD249/Ly-51), LIFR, EGFR, and CD117 (c-kit) that are expressed by a trophoblast progenitor population that emerges as early as 48 h after treatment of human PSCs by BMP4. We showed that sorted APA+, but not APA− cells, display cytotrophoblast characteristics and the capacity to further differentiate into multinucleated fused syncytiotrophoblast-like cells, and to express placental hormones in vivo (26). These surface markers therefore could be instrumental for purifying trophoblast progenitors and discovering mechanisms that regulate initial specification and differentiation along this lineage.

Here, we addressed these questions by transcriptome and epigenome analyses of purified BMP4-treated human ESC-derived trophoblast-fated progenitors as bulk population and single cells. We coupled these data to temporally resolved transcriptome changes that occur in the cells before the progenitors emerge. Our results support the identification of the human PSC-derived trophoblast progeny as TE, rather than extraembryonic mesoderm (33). Moreover, we discovered a TF circuit that could explain the coupling of TE specification with suppression of pluripotency, a finding that was supported by mapping the genome-wide binding sites of the TFs, and by the results of functional manipulation of these TFs in human ESCs in vitro and in primate embryos in vivo. Our results also revealed the modes of epigenetic regulation that govern gene induction and suppression along the differentiation axis of human PSCs to trophoblasts.

Results

The Transcriptome of Human Trophoblast Progenitors.

To identify cell-intrinsic mechanisms that govern human TE specification in vitro, we used the previously characterized cell surface marker APA (ENPEP), which marks trophoblast progenitors that differentiate from human ESCs upon constant exposure to BMP4 (26). The APA+ cell population emerged, peaked, and leveled out when 70–90% of the cells became positive, at differentiation days 2, 3, and 4, respectively (Fig. 1 A and B). Further culturing using this condition led to production of secreted CG (hCG), a pregnancy hormone expressed by trophoblasts in utero (39) (Fig. 1C). This protocol performed equally well in KnockOut serum replacement (KSR)-based and B27-based media (Fig. S1).

Fig. 1.

Purification and characterization of human APA+ progenitors. (A) Representative time-course flow cytometry analysis of APA in BMP4-treated hESCs. (B) Quantification of the temporal dynamics of the APA+ cells over the course of 6 d (n = 2; mean ± SEM). (C) Time-course measurements of the secreted CG (hCG) in hESCs with and without BMP4 in differentiation medium (n = 2; mean ± CI, 95%). (D and E) Venn diagrams showing overlapping microarray probe sets that were down-regulated (D) and up-regulated (E) in sorted APA+, APA− cells after 2.5 d of BMP4 treatment [n = 3; false-discovery rate (FDR): adjusted P value < 0.05; fold change ≥ 2]. (F) Significant tissue and cell type associations of the up-regulated (FDR: adjusted P value < 0.05; fold change ≥ 2) microarray probe sets in APA+ cells compared with SSEA5+ hESCs, based on the literature mining algorithm of the Genomatix GeneRanker tool. The lowest log10 P values out of three replicates are displayed. (G) t-SNE plot of the single-cell RNA-seq data from FACS-sorted APA+ cells (72 h of BMP4 treatment) and undifferentiated hESCs. Gene expression count estimates of the ENPEP (APA) gene are shown for APA+ cells in orange, and for undifferentiated hESCs in blue. Estimated log2 fold change of ENPEP is shown at the Bottom. (H) Significant tissue and cell type associations of the up-regulated genes [SCDE, conservative fold change estimate (log2) > 1; 276 genes] in the APA+ cells compared with undifferentiated hESCs from single cell RNA-seq data, analyzed as in F.

Fig. S1.

(Related to Fig. 1.) (A) Temporal dynamics of the APA+ cell population in DMEM/F12-plus-KSR–based and RPMI1640-plus-B27–based differentiation media over 5 d of BMP4 treatment analyzed by flow cytometry; n = 2; t test for the difference in APA+ population size between media is shown for days 3, 4, and 5. (B) Representative flow cytometry histograms of the data in A. (C) Real-time PCR comparison of TEtra and APA expression levels in samples from A. Mean ± SEM; n = 2.

To characterize key genes involved in the differentiation of human trophoblast progenitors, we sorted the top 20% brightest APA+ and dimmest APA− cell populations after 60 h (2.5 d) of differentiation, around the time when the size of the APA+ population grows exponentially. To set a baseline for gene expression levels, we sorted the SSEA-5+ cell population from undifferentiated cultures (which includes ∼95% of the cells). This removes spontaneously differentiated cells that can obstruct analysis of cell-intrinsic properties (40). Lineage assessment of these cell populations by qPCR before global transcriptomics analysis indicated a transition from pluripotency to TE fate in the APA+ population evident by OCT4 down-regulation and a reciprocal CDX2, GCM1, and ENPEP (APA) up-regulation (Fig. S2A). Moreover, in the APA− population, we noticed an up-regulation of key mesoderm genes and surface markers, including T, GSC, ROR2, and CD13, as well as lower enrichment of trophoblast genes, for example, GCM1, indicating that this population consisted of primitive streak-like progenitors and possibly cells in pre-APA phase (Fig. S2A).

Fig. S2.

(Related to Fig. 1.) (A) Real-time PCR validation of samples for microarray analysis, comparing trophoblast, meso/endoderm, and pluripotency genes in sorted APA+, APA−, and SSEA-5+ cell populations. Expression is calculated as linear fold change over SSEA-5+, with the exception of CDX2 and ELF5 that were plotted by relative quantity to GAPDH. Mean ± SEM; n = 2. (B) Hierarchical clustering of microarray data from three replicates of sorted APA+, APA−, and SSEA-5+ cell populations. Dendrogram is based on Pearson’s correlation. (C) Significant tissue and cell type associations of the probe sets that were significantly up-regulated (adjusted P value < 0.05, fold change > 2) in APA+ compared with APA− cell populations. Lowest P value out of three biological replicates is shown. (D) A Venn diagram showing overlapping TFs (and the ENPEP gene) that were up-regulated (adjusted P value < 0.05, fold change > 2) in the APA+ compared with the SSEA-5+ cell population, with significantly up-regulated TFs in isolated mural TE compared with undifferentiated hESCs (fold change > 5; ref. 19). The right panel shows significant tissue and cell type associations for gene sets from each area of the diagram.

Next, we globally analyzed differentially expressed (DE) genes in the APA+, APA−, and SSEA-5+ cell populations using Affymetrix oligonucleotide microarrays (Fig. S2B and Dataset S1). Comparing APA+, APA−, and SSEA-5+ profiles, we noted ∼700 down- and ∼1,000 up-regulated transcripts (Fig. 1 D and E, respectively). The cohort of the down-regulated genes included the pluripotency circuitry members SOX2, OCT4, and NANOG (Fig. 1D). Importantly, tissue association analysis of the genes that were up-regulated in the APA+ population identified trophoblast and placental tissues as the most overrepresented (Fig. 1F; a similar signature was observed when comparing the APA+ and APA− cell populations; Fig. S2C). This shows the relevance of the APA+ cell population for identifying key human TE genes. To substantiate this claim, we compared the up-regulated gene set of the APA+ population with the genes that are enriched in human embryonic mural trophoblasts (19), and demonstrated an overlapping set of TFs including GCM1, TP63, VGLL1, GATA2, GATA3, and TFAP2C, as well as the surface marker ENPEP (APA), which was collectively annotated as trophoblast/placenta specific with high confidence (Fig. S2D). Finally, the enrichment of CDX2 and ELF5 in the APA+ cell population (Fig. S2A) is also consistent with commitment to TE fate (5, 13).
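The selection criteria used for these comparisons (adjusted P value < 0.05 and fold change ≥ 2) can be illustrated with a minimal sketch. The Benjamini-Hochberg procedure is assumed here as the FDR adjustment, since the text does not name the method, and the gene labels and values are hypothetical placeholders rather than data from this study.

```python
import math

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values (assumed FDR method)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def select_de(genes, log2_fc, pvalues, alpha=0.05, min_fold=2.0):
    """Split genes into up- and down-regulated sets by FDR and fold change."""
    adjusted = bh_adjust(pvalues)
    threshold = math.log2(min_fold)
    up, down = [], []
    for gene, fc, q in zip(genes, log2_fc, adjusted):
        if q >= alpha:
            continue
        if fc >= threshold:
            up.append(gene)
        elif fc <= -threshold:
            down.append(gene)
    return up, down

# Hypothetical toy input (not values from the study).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2_fc = [3.0, -2.5, 0.4, 1.5]
pvalues = [0.001, 0.002, 0.003, 0.6]
up_genes, down_genes = select_de(genes, log2_fc, pvalues)
```

In this toy input, GENE_C passes the FDR cutoff but not the fold-change cutoff, and GENE_D passes the fold-change cutoff but not the FDR cutoff, so neither is reported.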

To assess the degree of heterogeneity in the APA+ population, we performed global RNA sequencing (RNA-seq) of ∼350 individually sorted APA+ and undifferentiated cells. Analysis of this dataset confirmed that the APA+ population is essentially homogenous, except for fewer than 10% of the cells that clustered with undifferentiated cells (Fig. 1G and Fig. S3). Importantly, significantly up-regulated genes in single APA+ cells exhibited associations with trophoblast and placental tissues (Fig. 1H), similarly to the bulk APA+ cell population (Fig. 1F). Taken together, we concluded that BMP4 treatment of human ESCs leads to the specification of APA+ progenitors that, on the gene expression level, resemble human TE progenitors in vivo and lack pluripotency features, and that the transcriptional network of these early progenitors is enriched for factors which were previously found to be important in mouse TE development.

Fig. S3.

(Related to Figs. 1G and 3F.) (A) Flow cytometry analysis of 72-h BMP4-treated and undifferentiated hESCs used for single-cell RNA sequencing. The gates applied for sorting single cells from the APA+ population and undifferentiated cultures into 384-well plates are shown. (B) Plots showing total UMI count per cell and total number of identified genes per cell in SCRB-seq libraries (both BMP4-treated and undifferentiated hESC samples). (C) A scatterplot of the correspondence between the gene expression changes detected by single-cell RNA-seq (APA+ cell population versus undifferentiated hESCs) and gene expression changes detected by total RNA-seq of bulk BMP4-treated hESCs (72 h of treatment versus undifferentiated hESC samples). Plotted are mean log2 fold changes (680 cells for scRNA-seq; n = 2 for bulk RNA-seq).

Histone Modification Redistribution During TE Progenitor Specification.

To analyze posttranslational histone modifications that underlie human trophoblast progenitor specification and pluripotency shutdown, we used sorted APA+, APA−, and SSEA-5+ cell populations (matched samples of Fig. 1) to perform chromatin immunoprecipitation and deep sequencing (ChIP-seq) of H3K4me3- and H3K27me3-bound DNA fragments (using validated antibodies; Fig. S4). Comparing APA+ and SSEA-5+ populations, we found that the redistribution of H3K4me3 or H3K27me3 monovalent, bivalent (both marks present), and H3K4me3/K27me3 double-negative genes was markedly different between up- and down-regulated genes (Fig. 2 A and B, representative maps in Fig. 2E). While close to 65% of the genes that were bivalent in undifferentiated SSEA-5+ cells became predominantly H3K4me3 monovalent (by losing the H3K27me3 mark) when they were up-regulated, there was little change in bivalency for genes that were down-regulated. Also, only around 25% of the genes from the H3K4me3 class became bivalent when they were down-regulated. Taken together, this indicates that transcriptional changes often precede changes in histone modifications, both in the direction of gene induction and repression. This conclusion is supported by a group of up-regulated genes that maintained the H3K4me3/K27me3 double-negative phenotype (Fig. 2A).

Fig. S4.

(Related to Fig. 2.) (A and B) Real-time PCR-based validation of the H3K4me3 (A) and H3K27me3 (B) antibodies used for ChIP-seq performed using sorted SSEA5+ cells. Loci of expected enrichment and depletion for H3K4me3 are ACTB, GAPDH #1, B2M, and HOXD12, ESR, K4neg, SPERT, respectively. For H3K27me3, expected enrichment is seen in HOXD12, EVX2, NEUROG1 loci, while depletion is in GAPDH #3, CCND1, and B2M. Error bars indicate SEM; for H3K4me3, n = 4, except K4neg and SPERT (n = 2), and for H3K27me3, n = 3.

Fig. 2.

Changes in the histone modification patterns during APA+ cell differentiation. (A and B) Histone modification pattern turnover in the sorted APA+ cells compared with SSEA-5+ cells. Plotted are DE genes between the APA+ and the SSEA-5+ cells (870 up- and 592 down-regulated genes, no fold-change cut-off). Histone modification classes are based on the enrichment of H3K4me3 and H3K27me3 in the 4-kb region around known TSS (n = 3). (C and D) Heatmaps of histone mark enrichment in promoters of up-regulated (C) and down-regulated (D) genes in the APA+ cells compared with SSEA-5+ cells (from Fig. 1 D and E, respectively). TFs are listed on the side. An Inset in C shows double-negative non-TF genes that were significantly associated with placental tissues (Genomatix). (E) Representative histone mark coverage profiles that correspond to bivalent (GATA2, GATA3, TFAP2C), monovalent H3K4me3 (ARID3A, OCT4) and H3K27me3 (TFAP2B, CDX2), and double-negative (GCM1) loci in the SSEA-5+ cells. The y axis was normalized for each condition/histone modification combination across all genes.

To gain specific insights into the relationship between chromatin states and transcriptional regulation in TE progenitors, we next focused on transcription factors/cofactors (TFs collectively) that were DE between the APA+ and the SSEA-5+ cell population. With respect to the up-regulated TFs, we found that orthologs of key mouse trophoblast-related TFs belong to the bivalent or double-negative class in undifferentiated cells (Fig. 2 C and E and Dataset S2). These included CDX2, GATA3, and TFAP2C (bivalent) and GCM1, VGLL1, TP63, and ELF5 (double negative). Moreover, a cohort of genes that is specific for trophoblast and placental tissues, including steroid sulfatase (STS) and solute carriers SLC13A4 and SLC8A1, was found in the double-negative class (Fig. 2C). TFs that were down-regulated in the APA+ cell population, including SOX2, NANOG, and OCT4, for the most part belonged to the H3K4me3-monovalent class in undifferentiated cells (Fig. 2D). Intriguingly, NANOG and OCT4 were among the rare genes in this class that lost the H3K4me3 mark without gaining the H3K27me3 mark in the APA+ cells (Fig. 2 B and E).

Taken together, this argues that distinct cohorts of regulators which participate in the process of human trophoblast specification are held in divergent chromatin states in undifferentiated human ESCs: one cohort includes bivalent TFs that become predominantly H3K4me3 monovalent in trophoblast progenitors, and a second cohort comprises trophoblast-specific TFs and other genes that are mostly double negative. Interestingly, the initial transcriptional up-regulation of the latter group took place without H3K4me3 histone mark changes.
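The four chromatin classes used in this section, and the tally of class changes between undifferentiated and differentiated cells, can be sketched as follows. This is an illustrative sketch only: it assumes that each gene already carries a yes/no enrichment call for each mark, and the gene labels and calls are hypothetical placeholders, not results from this study.

```python
from collections import Counter

def histone_class(k4me3, k27me3):
    """Assign a promoter to a class from two boolean enrichment calls."""
    if k4me3 and k27me3:
        return "bivalent"
    if k4me3:
        return "H3K4me3"
    if k27me3:
        return "H3K27me3"
    return "double-negative"

def class_transitions(before, after):
    """Count (class before, class after) pairs over all genes in `before`."""
    counts = Counter()
    for gene, marks in before.items():
        counts[(histone_class(*marks), histone_class(*after[gene]))] += 1
    return counts

# Hypothetical calls as (H3K4me3 enriched, H3K27me3 enriched).
undifferentiated = {
    "GENE_1": (True, True),
    "GENE_2": (True, False),
    "GENE_3": (False, False),
}
differentiated = {
    "GENE_1": (True, False),   # bivalent gene that loses H3K27me3
    "GENE_2": (True, False),   # stays H3K4me3 monovalent
    "GENE_3": (False, False),  # stays double negative
}
transitions = class_transitions(undifferentiated, differentiated)
```

Dividing each count by the number of genes in the starting class gives the kind of percentage quoted for class turnover.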

Temporal Activation of TFs During TE Progenitor Specification.

To analyze the order of TF activation of the DE genes in the APA+ progenitors (Fig. 1E), we conducted time-series analysis of global transcriptome changes in bulk human ESCs during the first 72 h of BMP4 treatment. The gradual up-regulation and high percentage of overlap between the datasets of DE genes at 48 and 72 h of differentiation, relative to the APA+ progenitor cells, substantiated the use of the time-series information to deduce transcriptional trajectories of genes that are pertinent for TE progenitor specification (Fig. 3 A and B and Dataset S3). Analysis of the up-regulated TFs according to their temporal trends identified several distinctly clustered groups (Fig. S5A), of which three exhibited early (8 h), intermediate (24 h), and late (48 h) induction phases (Fig. 3C). Read coverage plots of representative genes from these clusters, including GATA3, CDX2, and GCM1, respectively, as well as of OCT4 are depicted in Fig. 3B. Next, we analyzed the TFs from each group with respect to overrepresented classes of histone modifications in undifferentiated SSEA-5+ cells. This showed that early genes were predominantly bivalent, while the late genes tended to be (although not reaching statistical significance) H3K4me3/K27me3 double negative and bivalent (Fig. 3D). The cluster of the intermediate genes did not exhibit a specific overrepresentation of any histone class. While inspecting the TFs from the three groups, we noted that many of the genes that had the highest induction amplitudes (>32-fold at the given time points) are human orthologs of genes that had been previously implicated in mouse TE development (Fig. 3E). This included bivalent (in SSEA-5+ cells) early TFs, namely GATA2, GATA3, TFAP2A, and TFAP2C (the latter exhibited only 10-fold up-regulation), which have known TE-specific functions (7, 9, 41–44); intermediate-group bivalent TFs, HAND1 and CDX2, which are essential for placental development (4, 11); and late H3K4me3/K27me3 double-negative placental TFs, GCM1, VGLL1, and TP63 (45). This further strengthens the notion that the APA+ cell population represents bona fide early TE progenitors. The expression of MSX2 from the early group was not analyzed further in this context because its early induction is common to other early progenitors (Fig. S5B). Analysis of GATA2, GATA3, TFAP2A, and TFAP2C in single cells, as well as immunocytochemistry of the respective proteins, indicated that this TE phenotype represents the intrinsic properties of individual cells (Fig. 3F and Fig. S5C).
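The overrepresentation of a histone class within a temporal cluster is a question about the overlap of two gene sets, which a Fisher's exact test addresses. The following is a minimal sketch, assuming a one-sided test for enrichment (the sidedness is an assumption, not stated in the text); the function name and the counts in the usage line are hypothetical.

```python
from math import comb

def enrichment_p(overlap, cluster_size, class_size, total):
    """One-sided Fisher's exact P value for overrepresentation.

    Returns P(X >= overlap), where X is the number of genes from a
    histone class of size `class_size` found in a cluster of size
    `cluster_size`, drawn from `total` genes (hypergeometric tail).
    """
    max_overlap = min(cluster_size, class_size)
    tail = sum(
        comb(class_size, k) * comb(total - class_size, cluster_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / comb(total, cluster_size)

# Hypothetical counts: 3 of 4 cluster genes belong to a class of 4 out of 8.
p_value = enrichment_p(overlap=3, cluster_size=4, class_size=4, total=8)
```

With these toy counts the tail probability is 17/70, so an overlap of this size would not be called significant at the 0.05 level.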

Fig. 3.

Expression kinetics of TE regulators and their epigenetic characteristics. (A) Correspondence of the microarray dataset to the time-course RNA-seq analysis of bulk BMP4-treated hESCs. Bars represent the total number of APA+ versus SSEA-5+ DE genes (n = 3; FDR: adjusted P value < 0.05). Green and yellow represent the number of DE genes that are respectively up- or down-regulated in the indicated time point (8, 16, 24, 48, and 72 h) of BMP4 treatment (every time point compared with undifferentiated hESCs; n = 2; FDR: adjusted P value < 0.05). Gray represent genes not detected as DE by RNA-seq. (B) Representative RNA-seq read coverage plots (bulk BMP4-treated hESCs) of the pluripotency gene OCT4, early-response cluster gene GATA3, intermediate cluster gene CDX2, and late cluster gene GCM1. Per gene, the y-axis scale is equal for all time points. (C) Clusters of TFs produced by k-means analysis that exhibit early (8 h), intermediate (24 h), and late (48 h) up-regulation during 72-h treatment of hESCs with BMP4. Plotted are significantly up-regulated TFs in APA+ cells compared with SSEA-5+ cells (Fig. 1E). Representative clusters shown here were selected by visual inspection (all clusters in Fig. S5A). (D) Overrepresentation analysis of the histone modification classes in genes from C (P values, Fisher’s exact test for the observed number of overlaps between the genes with the respective histone modifications and the genes in all three clusters). (E) TFs from the early, intermediate, and late clusters [log2 fold change (FC) of ≥5 at 8, 24, and 48 h, with the exception of TFAP2C with log2 FC = 3.42 at 8 h]. A log2 FC = 4.1 of ELF5 was detected at 72 h. Font color corresponds to the histone modification classes from D. (F) t-SNE plot of single-cell RNA-seq dataset (as in Fig. 1G) colored according to trophoblast gene expression score (Materials and Methods). Estimated log2 fold changes between APA+ cells and undifferentiated hESCs for TFAP2A, TFAP2C, GATA2, and GATA3 are shown on the Right. (G) DNA methylation status of the CpG sites around TSS of GATA2 and GCM1 genes in SSEA-5+, APA+, and APA− cells (n = 3; unmethylated: 0–20%; intermediate: 20–60%; highly methylated: 61–100%).

Fig. S5.

(Related to Fig. 3.) (A) Six clustered gene groups produced by the k-means clustering of time-course expression data of TFs that were up-regulated in the APA+ compared with the SSEA-5+ cell population. Clusters 1, 3, and 6 were analyzed in detail in Fig. 3C. (B) Time course RNA-seq expression levels of MSX2 gene analyzed in hESCs undergoing trophoblast (BMP4) and mesoderm (CHIR) differentiation. Strong up-regulation in both conditions at 8 h is highlighted. Two biological replicates are displayed. (C) Representative immunofluorescent staining microphotographs of GATA3, TFAP2A, and TFAP2C proteins in undifferentiated and 60-h BMP4-treated hESCs.

High- and low-CpG island promoters (HCPs, LCPs) have been correlated with early embryonic or tissue-specific gene expression, respectively (38). To determine whether the methylation status of the HCP and LCP changes during APA+ progenitor specification, we performed a genome-wide analysis of CpG methylation. Surprisingly, we could not detect significant changes of the methylation states in CpGs of bivalent and H3K4me3/K27me3 double-negative TF genes, which are respectively low and high, comparing the APA+, APA−, and undifferentiated SSEA-5+ cell population (Fig. 3G).
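The comparison of methylation states between populations can be sketched by binning per-site methylation percentages into the three categories given in the Fig. 3G legend (unmethylated, intermediate, highly methylated) and asking whether a site changes category. How the shared boundary at 20% and the gap between 60% and 61% are handled below is an assumption, and the function names are hypothetical.

```python
def methylation_state(percent):
    """Bin a CpG methylation percentage into one of three categories.

    Boundary handling is an assumption: values up to and including 20
    count as unmethylated, values up to and including 60 as intermediate.
    """
    if not 0 <= percent <= 100:
        raise ValueError("methylation percentage must be between 0 and 100")
    if percent <= 20:
        return "unmethylated"
    if percent <= 60:
        return "intermediate"
    return "highly methylated"

def state_changed(percent_before, percent_after):
    """True if a CpG site moves to a different category between samples."""
    return methylation_state(percent_before) != methylation_state(percent_after)
```

For example, a site at 5% in one population and 12% in another stays in the unmethylated category, so `state_changed(5, 12)` is false even though the raw percentage differs.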

Based on these data, we hypothesized that early TE progenitors are specified rather quickly and exit the state of pluripotency by an input from a group of early-response TFs (induced by 8 h) that are bivalent with unmethylated CpG islands in undifferentiated cells, and include GATA2, GATA3, TFAP2A, and TFAP2C. Furthermore, these TFs are very likely to govern the induction of late (by 48 h) placenta-specific LCP genes, which are H3K4me3/K27me3 double negative with methylated CpG in undifferentiated state. We postulated that the intermediate group is more heterogeneous with respect to the transcriptional activation mechanisms.

Global Mapping of TFAP2A/C and GATA2/3 TF-Bound Loci.

To delineate the binding landscape of the early TFs in TE progenitors, we used human ESCs that were differentiated for 72 h to perform ChIP-seq of GATA2, GATA3, TFAP2A, and TFAP2C. De novo search produced motifs that closely resemble those previously published for these factors (Fig. 4A). Next, listing of putative target genes, according to TF binding in −3.5/+5 kb around the transcription start site (TSS), revealed that when more TFs were bound, the correlation to transcriptional up-regulation was higher. Conversely, the transcriptional down-regulation was negatively correlated with the number of bound TFs (Fig. 4B). Moreover, GATA3 exhibited the broadest potential of synergy with the other three TFs because it coincided with the other TFs more frequently (Fig. 4C). Taken together, this suggests that binding of GATA2/3 and TFAP2A/C is predominantly gene activating, and that GATA3 is the chief member of the four TFs, which collectively promote trophoblast differentiation.
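Assigning putative target genes by TF binding near the TSS amounts to an interval-overlap test in a strand-aware promoter window. The sketch below is illustrative only: the window extents are left as parameters, the peak and TSS coordinates are hypothetical, and the function names are not taken from the study's pipeline.

```python
def promoter_window(tss, strand, upstream, downstream):
    """Genomic interval around a TSS, oriented by gene strand."""
    if strand == "+":
        return tss - upstream, tss + downstream
    return tss - downstream, tss + upstream

def bound_tfs(gene, peaks, upstream, downstream):
    """List the TFs with at least one peak overlapping the promoter window.

    gene:  (chromosome, TSS position, strand)
    peaks: {TF name: [(chromosome, start, end), ...]}
    """
    chrom, tss, strand = gene
    lo, hi = promoter_window(tss, strand, upstream, downstream)
    return sorted(
        tf
        for tf, intervals in peaks.items()
        if any(c == chrom and s <= hi and e >= lo for c, s, e in intervals)
    )

# Hypothetical coordinates for illustration only.
peaks = {
    "GATA3": [("chr1", 6000, 6200)],
    "TFAP2A": [("chr1", 20000, 20200)],
    "GATA2": [("chr2", 9000, 9500)],
}
plus_gene = ("chr1", 10000, "+")
minus_gene = ("chr1", 10000, "-")
plus_bound = bound_tfs(plus_gene, peaks, upstream=5000, downstream=3500)
minus_bound = bound_tfs(minus_gene, peaks, upstream=5000, downstream=3500)
```

The length of the returned list is the number of bound TFs per gene, which is the quantity compared against expression changes. The same peak falls inside the window for the plus-strand gene but outside it for the minus-strand gene, which is why the window must follow gene orientation.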

Fig. 4.

Identification of GATA2/3 and TFAP2A/C genomic binding sites. (A) De novo binding motif analysis of GATA2, GATA3, TFAP2A, and TFAP2C from ChIP-seq of hESCs treated by BMP4 for 72 h. (B) Correlation plot of all expressed genes in 72 h of BMP4 treatment (RNA-seq) and the number of TFs (GATA2, GATA3, TFAP2A, and TFAP2C) bound in their promoter regions (−5…+3.5 kb around TSS). Color denotes DE status compared with undifferentiated hESCs: up-regulated (green), down-regulated (red), and not DE (blue) (n = 2; FDR: adjusted P value < 0.05). (C) The total number of gene promoters (–5…+3.5 kb around known TSS) that are bound by either all four (GATA2, GATA3, TFAP2A, and TFAP2C) or different combinations of three TFs as determined by ChIP-seq (n = 3 for GATA2/3 and TFAP2A; n = 1 for TFAP2C). (D) An overview of the genes that were bound by all four TFs. Only TFs and genes significantly associated with placental tissues are displayed. Asterisks denote genes with promoter CpG demethylation in the APA+ cells. Binding within the four TEtra TFs is depicted in Fig. S6A. (E) Representative ChIP-seq coverage profiles of TEtra TFs in the CDX2, ANKRD1, GCM1, and OCT4 genes. (F) An example of ChIP-seq coverage profiles of TEtra TFs in the promoter of the LCP trophoblast-associated gene VTCN1. DNA methylation in CpG sites is shown below.

Indeed, of the 204 genes that were bound by the four TFs, 122 (60%) were up-regulated in the APA+ cell population (Dataset S4). This includes 11 TFs, among them CDX2 and ANKRD1, which have been implicated in TE specification, and GCM1 that is bound by three TFs (Fig. 4 D and E). Importantly, CDX2 displayed potential binding of all four TFs in the first intron (Fig. 4E), which is in line with the previous reports showing that the binding of GATA3 and TFAP2C to the first intron of Cdx2 in mouse TSCs activates its transcription (46, 47). Interestingly, this intronic site is occupied by OCT4 and NANOG in undifferentiated human ESCs (Fig. S6B). The up-regulated non-TF genes that exhibited binding of the four TFs included ENPEP, STS, VTCN1, and other placenta-specific genes (Fig. 4D). In addition, we noted autofactor or cross-factor interactions between the TFs GATA2/3 and TFAP2A/C in ∼50% of the possible cases (Fig. S6A). Strikingly, the very few genes in which promoter CpG demethylation did take place during the differentiation from SSEA-5+ to APA− and APA+ populations were bound by the four TFs (asterisk-labeled genes in Fig. 4D).

Fig. S6.

(Related to Fig. 4D.) (A) Representative ChIP-seq coverage profiles of GATA2, GATA3, TFAP2A, and TFAP2C over GATA2, GATA3, TFAP2A, and TFAP2C genes (72-h BMP4-treated hESCs). (B) NANOG and OCT4 binding sites in first intron of the CDX2 gene based on the ENCODE Project dataset.

Finally, although only two down-regulated TFs were bound by GATA2/3 and TFAP2A/C, this binding is likely important for down-regulation of pluripotency, as it took place in the first intron of OCT4, and in JADE1, which promotes histone acetylation and was implicated in embryogenesis (48) (Fig. 4 D and E). Taken together, these data indicate that GATA2/3 and TFAP2A/C TFs form a feedforward network that regulates pluripotency and TE genes.

Functional Validation of GATA3 and TFAP2A in TE Specification.

To functionally analyze the roles of GATA2/3 and TFAP2A/C in human TE differentiation, we manipulated the expression of the factor from each pair that is more likely to exert broader regulation. We chose GATA3 because it exhibited the highest co-occupancy potential among the four TFs (Fig. 4C), and TFAP2A because it was induced to a higher extent than TFAP2C (Fig. 3E). We used the HUES9 human ESC line that carries an inducible Cas9 cassette (iCRISPR) (49) to delete the boundary of the first intron and the second exon in both genes (325 and 149 bp, respectively). Analysis of differentiated clones (72-h BMP4) harboring homozygous deletions demonstrated complete absence of GATA3 protein in two clones and of TFAP2A in one clone, and faint bands of TFAP2A in two additional clones (likely due to residual heterozygous or wild-type cells) (Fig. 5A). We observed that, while the knockout (KO) of GATA3 led to a drastic reduction in the number of APA+ progenitors (Fig. 5 B and C) and in hCG levels (Fig. 5E), both of which were comparable to those of undifferentiated cells, TFAP2A KO had a much milder effect, leading to a ∼30% reduction in the number of APA+ progenitors (Fig. 5 B and C).

Fig. 5.


Functional validation of the human TE specification circuit. (A) Western blot analysis of GATA3 and TFAP2A KO clones treated with BMP4 for 72 h. TFAP2A clones 3 and 8 demonstrate residual expression (likely due to clonal impurities). (B) Representative flow cytometry analysis of APA-expressing cells in GATA3 and TFAP2A KO clones treated with BMP4 for 72 h. The parental iCRISPR cell line was used as the control. (C) Quantification of B. For GATA3 KO clones: n = 2, mean ± SEM; for wild type (wt) iCRISPR and TFAP2A KO clones: n = 5, mean ± SEM. (D) qPCR analysis of a set of TE genes and OCT4 in GATA3 and TFAP2A KO clones treated with BMP4 for 72 h. Results are plotted as log2 fold changes (ΔΔCt) compared with identically treated parental cells. [For GATA3 KO, n = 2, mean ± SEM; for TFAP2A KO, n = 6 (except where indicated), mean ± SEM.] (E) Time-course analysis of secreted hCG in GATA3 KO clones and wild-type iCRISPR cells treated with BMP4 (n = 2; mean ± SEM). (F) Western blot analysis of an inducible GATA3 hESC line. Cells were either left in differentiation medium alone (KSR only) or treated with BMP4 or doxycycline (Dox) for 72 h. (G) Quantification of the APA+ cells following inducible overexpression of GATA3 (as in B). BMP4-treated hESCs served as the positive control (n = 4; mean ± SEM). (H) qPCR analysis of a set of TE genes and OCT4 following overexpression of GATA3 in differentiation medium for 72 and 96 h, and in 72-h BMP4-treated hESCs. Inset shows GCM1 gene that is first detected after 96 h of Dox treatment (n = 3, except 96 h of Dox, where n = 1, mean ± SEM; n.d., not detected). (I) Representative microinjected rhesus macaque embryos with control 3′-COF–labeled morpholino antisense oligonucleotides (MAO). The intensity of the green signal corresponds to the amount of MAO delivered to each embryo (brightfield on Left). (J) Rhesus macaque embryos microinjected with GATA3 MAO and immunostained with GATA3 (green), NANOG (red), and DAPI (blue), shown at various arrested stages. (K) Noninjected controls of J reached the blastocyst stage and expressed GATA3 in the TE layer, and NANOG in the inner cell mass. (L) A summary of the results of MAO microinjection into rhesus macaque embryos according to the developmental stages. Noninjected embryos (n = 15), and GATA3 MAO (n = 19) from two independent in vitro fertilization experiments (error bars indicate SEM; n = 2).

Next, we analyzed the expression of OCT4 and early, intermediate, and late group up-regulated genes that overlapped between the time course induction and enrichment in the APA+ population (Fig. 3D). In accordance with the direction of regulation during trophoblast differentiation, the up-regulation of GCM1 and VGLL1 significantly diminished and OCT4 increased in the GATA3−/− and TFAP2A−/− clones that were treated by BMP4 compared with isogenic cells, but TP63 did not show a clear pattern of deregulation, at least not at the 72-h time point (Fig. 5D). Unexpectedly, CDX2 expression increased more in the KO clones compared with isogenic control cells. Interestingly, the effects of the KOs on the other GATA/TFAP2 TFs did not follow an obvious pattern; GATA2 up-regulation decreased, TFAP2A increased, and TFAP2C did not show significant change in the GATA3−/− clones, and in the TFAP2A−/− clones the expression of the other three TFs was not significantly altered (Fig. 5D). We postulate that this is due to the complex web of interactions between the GATA and TFAP2 TFs (Fig. S6A).

To substantiate the central function of GATA3 in TE specification, we analyzed the outcome of GATA3 overexpression in human ESCs (Fig. 5F). This led to a phenotype that closely mimicked 72 h of BMP4 treatment, including the appearance of APA+ progenitors (albeit to a lesser extent), and the up-regulation of GATA2/3, TFAP2A/C, CDX2, and GCM1, as well as the down-regulation of OCT4 (Fig. 5 G and H). Finally, to validate the role of GATA3 in primate development in vivo, we knocked down its expression in rhesus macaque embryos. We injected zygotes with morpholino antisense oligonucleotides and monitored the development of the embryos (Fig. 5 I–L). While control noninjected embryos reached the blastocyst stage at a frequency of ∼30%, which is typical for macaque embryos in vitro, GATA3 knockdown led to a failure of blastocyst formation and an embryonic arrest at the 32-cell/morula stage (Fig. 5 J–L). These results suggest that GATA3 is vital for TE specification and embryonic development in primates.

Discussion

In this study, we provide multiple lines of evidence that BMP4 treatment of human ESCs promotes differentiation of trophoblast-like progeny that highly resembles human TE progenitors in vivo. We find that, when the cells exit from the state of pluripotency under this condition, mesendodermal commitment can be detected in the fraction of cells that do not express the cell surface marker APA (Fig. S1). APA had been previously characterized as a trophoblast-specific marker, and sorted APA+ cells were shown to give rise to placental-like tissues upon engraftment in vivo (26). More recently, APA (ENPEP) was classified as TE-specific based on its expression in preimplantation human embryos (23, 50). We establish here that gene expression analysis of the APA+ cells reliably classifies them as rather homogenous based on single-cell RNA-seq data, and combined -omics datasets that we produced show that the APA+ population has a pronounced trophoblast gene signature. For example, recently published criteria for defining human mononuclear trophoblasts (39) are fully met by the APA+ cells with respect to expression of marker genes (e.g., TFAP2C, GATA3, KRT7, ELF5). We do not, however, detect hypomethylated CpGs in the ELF5 promoter at day 3 APA+ progenitor state, arguing that the epigenetic remodeling at this locus begins at a later stage. Indeed, Lee et al. have assessed methylation changes at day 6 of BMP4 treatment. Additionally, we find that APA+ cells express the majority of the trophoblast-specific TFs from human mural trophoblasts (19), as well as TE-enriched genes characterized in human preimplantation embryos, including GATA3, CDX2, KRT18, DAB2, EFNA1, PPARG, and TEAD3 (23, 50). Interestingly, three putative TE-specific markers proposed by Blakeley et al.—PLAC8, CLDN10, and TRIML1—were not detected as up-regulated in the APA+ cells. 
PLAC8 and CLDN10 were transiently up-regulated in bulk BMP4-treated cells from 8 to 24 h and from 24 to 48 h, respectively, arguing that they are not relevant for classifying trophoblasts in vitro, at least using the BMP4 protocol. Overall, based on the gene expression similarities, we conclude that APA marks progenitors closely resembling human embryonic TE, making these cells a suitable model for studying TE progenitor development.

Our combined analysis highlighted three groups of genes with early, intermediate, and late up-regulation phases. We hypothesized that TFs displaying strong induction at the earliest time point of 8 h and persistent expression at least until the progenitors emerge on day 2.5 of differentiation, could be crucial for governing the transition from pluripotency to the trophoblast fate. Here, we propose that the identified set of four genes—GATA2, GATA3, TFAP2A, and TFAP2C—are likely to be the earliest drivers of the trophoblast specification in human ESCs. We named them the trophectoderm four (“TEtra”). Although all four factors have been implicated in the trophoblast development in mouse, and the GATA2–GATA3 redundancy was demonstrated (17, 51), it is important to point out that the earliest known TFs induced during murine TE development—Tead4 and Klf5—were not found to be induced in our analysis (for review of mouse TE commitment, see ref. 52).

Our investigation of the TEtra’s chromatin occupancy revealed with high confidence gene-activating binding to the promoters of trophoblast-specific genes from all temporal groups: early (Fig. 6A), intermediate (CDX2), and late (GCM1, VGLL1, TP63, STS, and ENPEP) (3, 12, 45, 53–55). In line with the reports of refs. 56 and 57, we found that one member of TEtra, GATA3, exhibited certain characteristics of a “pioneer factor.” Its promoter was the only one not bound by any other TEtra member, while GATA3 itself occupied the promoters of all of the other TEtra members but not its own (Fig. S6A). In addition, it showed the highest degree of chromatin co-occupancy with the other TEtra members (Fig. 4C). On the functional level, ablation of GATA3 in human ESCs and in primate embryos in vivo completely abolished the formation of TE progenitors, while overexpression of GATA3 in human ESCs produced a phenotype that closely mimics that of BMP4 treatment (Fig. 5 G and H). Overall, this leads us to postulate that GATA3 occupies the very top of the TEtra hierarchy.

Fig. 6.


Proposed models of human TE specification. (A) A summary depicting up- and down-regulated TE-specific and pluripotency genes during the time course of BMP4 treatment of hESCs. Reciprocal interactions (binding in the promoter regions) between the TEtra TFs are indicated by arrows, and genes bound by three or more TEtra TFs are labeled with asterisks. (B) A model outlining the coupling of OCT4 and CDX2 regulation by the TEtra factors during the transition from pluripotency to TE progenitor fate. The unique binding site of GATA2, GATA3, TFAP2A, and TFAP2C is highlighted in the first intron of OCT4. CDX2 is bound by NANOG and OCT4 in hESCs (Fig. S6B) in the same region regulated by GATA2, GATA3, TFAP2A, and TFAP2C binding in hESCs that are treated by BMP4. The model includes a hypothetical position of a suppressor of CDX2 that is induced by the TEtra. (C) A model outlining gene expression, histone modifications, and DNA-methylation turnover during the transition from the pluripotency state of undifferentiated hESCs toward APA+ TE progenitor fate. The key pluripotency genes, OCT4 and NANOG, lose the active transcription mark H3K4me3, while the bivalency of early- and intermediate-group genes, which harbor unmethylated HCP, is resolved to the H3K4me3 state. Conversely, late TE regulators, structural and hormone genes, are generally LCP and do not exhibit H3K4me3 and H3K27me3 in undifferentiated cells or following specification to TE progenitors.

The chromatin occupancy analysis showed that GATA2 is bound by the three other TEtra factors and by itself, pointing to its place one step lower in the TEtra hierarchy compared with GATA3 (Fig. 6A). Interestingly, while we observed a significant decrease of GATA2 up-regulation in the GATA3 KO BMP4-treated cells, others have reported an opposite effect of Gata3 knockdown in mouse and rat TSCs (58). This highlights possible interspecies differences in GATA factor connectivity, which are expected according to studies showing the evolutionary plasticity of developmental gene regulatory network architectures (59). TFAP2C was bound by all TEtra factors except GATA2, whereas TFAP2A was bound only by GATA3 and TFAP2C. Although TFAP2A KO did not lead to a drastic reduction in the number of APA+ progenitors or in the expression of the other TEtra TFs, it led to perturbations of GCM1, VGLL1, and OCT4 similar to those noted for the KO of GATA3. Further combinatorial TEtra gene KO and overexpression studies should reveal the finer details of the architecture of this gene-regulatory network and the connectivity to placental genes.

Rather unexpectedly, we noted up-regulation of CDX2 in BMP4-treated GATA3 and TFAP2A KO cells, even though we observed binding of all TEtra members in the same intronic region of the gene that was previously described to be activated by GATA3 and TFAP2C in the mouse (46, 48). It thus appears that varying the stoichiometric ratios of the TEtra affects the output of CDX2, or, alternatively, that the TEtra regulate an unknown factor that suppresses CDX2, forming a so-called “incoherent feedforward loop” (Fig. 6B), which is often seen in developmental gene networks (60). Importantly, because up-regulation of CDX2 in GATA3 KO did not rescue the ablation of the APA+ progenitors, this argues against its critical role in human TE specification.

Importantly, we provide evidence that the down-regulation of OCT4 and pluripotency is mechanistically linked to the up-regulation of the TEtra (Fig. 6B). This is based on the observations that these processes take place during the relatively short phase of APA+ progenitor specification, that the TEtra bind in the first intron of OCT4, and that GATA3 and TFAP2A KO leads to weaker OCT4 down-regulation during differentiation (Figs. 4E and 5D). We speculate that TEtra promote the formation of a repressive complex on this unique site in OCT4, which has not been reported in the mouse, and that the inhibition of OCT4 could be connected to the activation of the placental hormone hCG during BMP4-induced differentiation (61). Additional experiments are warranted to reveal the precise mechanism.

In line with previous findings that early developmental and later tissue-specific genes exhibit distinct chromatin configurations (38), we further observed that the majority of the earliest BMP4-induced genes (including TEtra) are bivalent with HCPs and hypomethylated CpGs in undifferentiated cells, while the genes from the late group (e.g., hCG) are respectively H3K27me3/H3K4me3 double negative with LCPs and hypermethylated CpGs. Somewhat unexpectedly, we did not detect significant changes in DNA methylation and acquisition of H3K4me3 mark in late up-regulated genes with LCP, arguing that the transcriptional induction can precede epigenetic changes during human ESC differentiation (Fig. 6C).

Taken together, we predict that our validation of the APA+ progenitor cells as a model for studying the biology of human TE in vitro and our discoveries of the TEtra gene-regulatory network lay a foundation for understanding the mechanisms of human placental development and pathologies related to placental dysfunction. Furthermore, we point out important features of regulation of pluripotency dissolution and TE specification that could be unique to human and primates, thus opening a path to understand the evolution of the placenta from a developmental perspective.

Materials and Methods

The H9 (WA09) hESC line was cultured on irradiated mouse embryonic fibroblasts (MEFs) in hESC medium (DMEM/F12; 11320074; Life Technologies), supplemented with 20% KSR (10828028; Thermo Fisher Scientific), Glutamax (35050038; Life Technologies), nonessential amino acids (1140050; Thermo Fisher Scientific), β-mercaptoethanol (31350-010; Thermo Fisher Scientific), 10 ng/mL FGF2 (100-18B; Peprotech), and 1% penicillin–streptomycin (15140122; Thermo Fisher Scientific), or on Geltrex-coated (A1413202; Life Technologies) plates with mTeSR1 medium (05850; STEMCELL Technologies). Colonies were passaged by treatment with 2 mg/mL collagenase IV (17104019; Thermo Fisher Scientific) in DMEM/F12 medium for 30–45 min. HUES9 iCRISPR and H9-GATA3 lines were cultured in mTeSR1 medium in plates coated with Matrigel (356234; BD Biosciences).

Multiple preovulatory follicles from rhesus macaque females of average maternal age (∼8 y old) were obtained via controlled ovarian stimulation from the Oregon National Primate Research Center (ONPRC) Assisted Reproductive Technologies (ART) Core according to the Institutional Animal Care and Use Committee (IACUC) approved protocol #0095 entitled “Assisted Reproduction in Macaques.” The IACUC is fully accredited by the Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC) and Oregon Health & Science University (OHSU)/ONPRC has an Animal Welfare Assurance on file with the NIH Office for Laboratory Animal Welfare (OLAW; #A3304-01).

SI Materials and Methods

Trophoblast Differentiation.

Cells were dissociated using 0.25% trypsin-EDTA (25200056; Thermo Fisher) and seeded as single-cell monolayers (105,000 cells per cm2) on Matrigel-coated plates with KSR-based differentiation medium (DMEM/F12 medium, supplemented with 20% KSR, Glutamax, NEAA, β-ME, and 1% pen-strep) supplemented with 50 ng/mL BMP4 (314-BP; R&D Systems). Fresh medium with BMP4 was applied every 24 h. For RNA-seq of bulk cultures, cells were dissociated with Accutase (A6964; Sigma) and seeded as single-cell monolayers into Matrigel-coated plates with mTeSR1 medium supplemented with 10 µM ROCK inhibitor (Y-27632; 1254/10; R&D Systems). Sixteen to 18 h later, medium was changed to differentiation medium (RPMI1640; 21875034; Thermo Fisher Scientific) supplemented with Glutamax and B27 without insulin (A1895601; Thermo Fisher Scientific) and 50 ng/mL BMP4. Undifferentiated cells from mTeSR1 medium were collected at time point 0 h. Fresh medium with BMP4 was applied every 24 h. At the time points 8, 16, 24, 48, and 72 h, cells were lysed in the plates using PLB lysis buffer from RNeasy mini kit (74104; Qiagen) and frozen at −80 °C until RNA isolation.

Mesoderm Differentiation (Fig. S5B).

To induce mesodermal commitment, H9 cells were treated with 10 µM CHIR99021 (4953/50; R&D Systems) following the same experimental procedure as for the trophoblast differentiation time course (see above).

Flow Cytometry.

Cells were dissociated using 0.05% trypsin–EDTA solution (25200054; Thermo Fisher), resuspended in FACS staining media (2–4% FBS/1 mM EDTA in PBS), and incubated with mouse anti-human CD249 (APA; 564532; BD Biosciences) and secondary goat anti-mouse Alexa Fluor 488 or 647 (A11001 or A21235; Life Technologies), or with mouse anti-human SSEA-5-PacificBlue-conjugated antibody (40). Live/dead cell discrimination was performed using either propidium iodide or SYTOX Blue (S34857; Thermo Fisher). For ChIP-seq experiments, cells were fixed with 1% final concentration of methanol-free formaldehyde solution (10321714; Fisher Scientific) for 10 min at room temperature (RT), quenched with 0.125 M glycine (23391.02; Serva) for 5 min, and washed three times with cold PBS plus 100 µM PMSF (6367.1; Carl Roth) before antibody staining and sorting. All flow cytometry analyses and sorting were performed on a FACSAria III instrument (BD Biosciences). Flow cytometry data were analyzed using FlowJo software. The gating strategy for determining the percentage of APA+ cells and for sorting them excluded dead cells and set the gates such that >98% of the isotype IgG control-incubated cells fell within the APA-negative coordinates.
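The isotype-based gating rule can be sketched in one dimension as follows. This is a simplified illustration, not the FlowJo workflow used in the study: it assumes a single fluorescence channel, events already filtered for live cells, and hypothetical function names and intensity values. The threshold is the smallest isotype-control intensity at or below which at least 98% of control events fall, and sample events above it are counted as APA+.

```python
from typing import Sequence


def negative_gate_threshold(isotype: Sequence[float], percent_negative: int = 98) -> float:
    """Smallest isotype intensity with at least `percent_negative`% of events at or below it."""
    ordered = sorted(isotype)
    n = len(ordered)
    # Integer ceiling of percent_negative * n / 100 avoids floating-point rounding issues.
    k = (percent_negative * n + 99) // 100
    return ordered[k - 1]


def percent_positive(sample: Sequence[float], threshold: float) -> float:
    """Percentage of sample events with intensity strictly above the gate."""
    above = sum(1 for value in sample if value > threshold)
    return 100.0 * above / len(sample)


# Toy data (arbitrary units, for illustration only).
isotype_control = list(range(1, 101))          # 100 control events with intensities 1..100
stained_sample = [50, 99, 120, 300, 10, 98, 150, 400]

gate = negative_gate_threshold(isotype_control)
apa_positive = percent_positive(stained_sample, gate)
```

In practice, gates are usually drawn on two-parameter plots with transformed axes, so this sketch only conveys the logic of anchoring the positive gate to the isotype control.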

hCG Measurement.

Medium was collected at 2, 4, 6, and 8 d of differentiation, and hCG levels were quantified using the AccuLite CLIA kit (8575-300; Monobind) according to manufacturer’s instructions.

RNA Isolation.

Total RNA was isolated using RNeasy mini kit according to the manufacturer’s instructions.

Microarray.

GeneChip Human Gene 2.0 ST Arrays (902113; Affymetrix) were hybridized and scanned according to the manufacturer’s instructions.

RNA-Seq.

Three micrograms of total RNA were treated with TURBO DNase (am2238; Life Technologies) and purified using RNeasy Minelute RNA cleanup kit (74204; Qiagen). RNA quality (RNA integrity number values > 8) was assessed using microcapillary electrophoresis on Agilent 2100 Bioanalyzer with RNA Pico 6000 kit (5067-1513; Agilent). Per RNA-seq library, 1 µg of DNase-treated RNA was treated with RiboZero Gold (Human/Mouse/Rat) kit (RS-122-2301; Illumina) to remove rRNAs, followed by RNA cleanup using the RNeasy Minelute RNA cleanup kit. Sequencing libraries were prepared from equal quantities of rRNA-depleted RNA using TruSeq Stranded total RNA LT kit (RS-122-2301; Illumina) according to the manufacturer’s instructions using 11 cycles of enrichment PCR. The quality of the libraries was assessed using Agilent 2100 Bioanalyzer with the DNA 1000 kit (5067–1504; Agilent). Library concentration was measured using Qubit dsDNA HS Assay kit (Q32854; Life Technologies). Multiplexed libraries were sequenced using a NextSeq 500 (Illumina) to generate 75-nt single-end reads; sequencing depth was 20–40 Mio reads per library. Multiplexing of libraries was performed according to the manufacturer’s instructions.

Histone ChIP-Seq.

Per population, 2.5 × 10⁵ fixed sorted cells were lysed and sheared using the E220 focused ultrasonicator (Covaris). The average fragment size was determined using the High Sensitivity DNA Analysis kit (5067–1504; Agilent) and 2100 Bioanalyzer. The specificity of different antibodies was tested using the AbSurance Histone H3 Antibody Specificity Array (16–667; Merck Millipore) and further by ChIP-qPCR (Fig. S4). Antibodies used for ChIP-Seq include the following: H3K4me3 (Premium; C15410003-50; LotNr: A.5051–001P; Diagenode), H3K27me3 (Classic; C15410069; LotNr: A1821D; Diagenode).

Immunoprecipitation.

Sonicated chromatin was precleared for 2 h at 4 °C by incubation with Dynabeads Protein A (10001D; Life Technologies) that were blocked overnight with denatured tRNA (1 mg/mL) from baker’s yeast (R5636-1ML; Sigma) and (20 µg/mL) rabbit IgG isotype control antibody (GTX35035; Biozol Diagnostica). The precleared supernatant was then incubated overnight at 4 °C with the respective ChIP antibodies (1 µg/sample). Next, chromatin/antibody mix was incubated with 20 µL of BSA (A9647; Sigma)-blocked Dynabeads for 3 h at 4 °C. Following five washing steps, the chromatin was eluted from the beads twice for 15 min at 65 °C, and RNA was digested using RNase A (200 µg/mL; Life Technologies) for 45 min at 37 °C. De–cross-linking and proteinase K treatment (200 µg/mL; 12091021; Life Technologies) was performed overnight at 65 °C. DNA was then purified using the QiaQuick PCR purification kit (28104; Qiagen) and eluted in 30 µL.

Library preparation.

Purified DNA fragments were prepared for sequencing using the NEBNext ChIP-Seq Library Prep Reagent Set for Illumina (E6200L; New England Biolabs) and the NEBNext Multiplex Oligos for Illumina (Index Primer Set 1) (E7335S; New England Biolabs) according to manufacturer’s instructions using 15 cycles of enrichment PCR. Library size was determined using the High Sensitivity DNA Analysis kit on a 2100 Bioanalyzer instrument, and concentration was measured with Quant-iT PicoGreen dsDNA Assay kit (P7589; Life Technologies). Libraries were multiplexed and sequenced on an Illumina HiSeq 2500 to generate 50-bp single-end reads, with sequencing depth of 30–60 Mio reads per library.

TF ChIP-Seq.

For TF ChIP-Seq, 1 × 10⁷ cells were cross-linked using 1% methanol-free formaldehyde in culture dishes for 10 min at RT, followed by quenching using glycine for 5 min. IP was performed using the ChIP-IT high sensitivity kit (53040; Active Motif) according to manufacturer’s instructions with modifications. Briefly, chromatin was sheared in total volume of 300 µL to an average fragment size of 250–500 bp using Bioruptor Pico (Diagenode). One-half of the sheared chromatin (diluted to 500 µL with the lysis buffer) was used for the IP with 5 µg of the target antibodies, and the other half, with 5 µg of the IgG control antibody overnight at 4 °C. Next, complexes were captured by incubation with 50 µL of Dynabeads protein G (10004D; Life Technologies) for 2 h at RT. Following bead washing, elution was performed at 65 °C for 15 min. Chromatin was de–cross-linked at 65 °C overnight in presence of proteinase K and purified using provided columns in 36 µL. Antibodies used for ChIP were as follows: rabbit anti-GATA2 (sc-9008X; Santa Cruz), mouse anti-GATA3 (sc-268; Santa Cruz), rabbit anti-TFAP2A (sc-184X; Santa Cruz), mouse anti-TFAP2C (sc-12762X; Santa Cruz), mouse IgG isotype control (16-4714-81; Life Technologies), and rabbit IgG isotype control antibody (GTX35035; Biozol Diagnostica). Libraries were prepared from the complete eluate using the NEBNext ChIP-Seq Library Prep Reagent Set for Illumina, multiplexed, and sequenced on a NextSeq 500 instrument to generate 75-nt single-end reads. Sequencing depth was 15–30 Mio reads per library.

DNA Methylation Analysis.

Genomic DNA of the sorted populations was isolated using the Wizard DNA isolation kit (A1120; Promega). Following bisulfite conversion using the EZ DNA Methylation kit (D5030; Zymo Research), genome-wide DNA methylation levels were determined using the Infinium HumanMethylation450 BeadChip kit (WG-314-1003; Illumina) according to the manufacturer’s instructions. Data were analyzed using the RnBeads R package (rnbeads.mpi-inf.mpg.de).

Immunofluorescence.

Cells were seeded for immunofluorescent staining as confluent monolayers into Matrigel-coated eight-well µ-slides (80826; Ibidi) and were either differentiated using BMP4 or maintained as pluripotent in MEF-conditioned hESC medium. Cells were fixed with 4% formaldehyde in PBS for 15 min at RT and permeabilized with 0.3% Triton X-100 (X100; Sigma-Aldrich) and 5% BSA (A9647; Sigma) in PBS for 30 min at RT. Cells were then incubated with primary antibodies in 1% BSA/0.1% Triton X-100 in PBS (PBS-TB) overnight at 4 °C, washed with PBS, and incubated with secondary antibodies in PBS-TB for 1 h at RT. After washing with PBS containing DAPI (1:2,000; D1306; Thermo Fisher), wells were filled with several drops of mounting medium (50001; Ibidi) and imaged. Primary antibodies used were as follows: rabbit anti-GATA2 (sc-9008X; Santa Cruz), mouse anti-GATA3 (CM405A; Biocare Medical), rabbit anti-TFAP2A (sc-184X; Santa Cruz), and mouse anti-TFAP2C (sc-12762X; Santa Cruz). Secondary antibodies were as follows: goat anti-mouse IgG (H+L) Alexa Fluor 488 (A11001; Life Technologies) and goat anti-rabbit IgG (H+L) Alexa Fluor 488 (A11008; Life Technologies). All primary antibodies were used at 1:200 dilutions, and all secondary antibodies at 1:1,000 dilutions. Images were obtained using a Zeiss Axiovert 200M epifluorescent microscope equipped with 10× Fluar 0.5 objective and AxioVision software (Carl Zeiss).

Western Blotting.

Cells were trypsinized and washed with PBS, and cell pellets were lysed using RIPA buffer, containing phosphatase (4906837001; Sigma-Aldrich) and protease (539134; Merck) inhibitors. After addition of 2× Laemmli sample buffer (161-0737; Bio-Rad) with 0.5% 2-mercaptoethanol (M3148; Sigma-Aldrich), samples were heated at 95 °C for 5 min. Mini-PROTEAN TGX Stain Free Gels, 4–15% (456–8086; Bio-Rad Laboratories), were used for all experiments. Wet blotting was performed using the Mini Trans-Blot Cell (1703930; Bio-Rad Laboratories) for 1 h at 100 V. Following blocking of the membranes with 5% milk powder (T145.1; Carl Roth) in TBS with 0.05% Tween 20 (P9416; Sigma) (TBS-T), primary antibodies in blocking buffer were added and membranes were incubated overnight at 4 °C. Secondary antibodies in blocking buffer were incubated with the membranes for 1 h at RT. Following 3× washes with TBS-T, membranes were incubated with Clarity Western ECL Substrate (170-5060; Bio-Rad Laboratories) and imaged on a ChemiDoc MP System (Bio-Rad Laboratories). Primary antibodies were as follows: mouse anti-GATA3 (558686; 1:700; BD Biosciences), rabbit anti-TFAP2A (sc-184X; 1:2,000; Santa Cruz), mouse anti–β-Actin (3700; 1:5,000; Cell Signaling), mouse anti-FLAG-HRP–coupled IgG (A8592; Sigma). Secondary antibodies were as follows: goat anti-mouse IgG-HRP (sc-2064 at 1:1,000; Santa Cruz) and goat anti-rabbit IgG-HRP (111-035-045 at 1:2,000; Jackson Laboratories).

Gene Editing.

The HUES9 iCRISPR cell line, which inducibly expresses human codon-optimized Streptococcus pyogenes Cas9 protein from the AAVS1 locus, was used (50). Nucleofection with guide RNA (gRNA) plasmids was performed using the P3 Primary Cell 4D-Nucleofector X Kit (V4XP-3024; Lonza) on a 4D Nucleofector System (Lonza). Briefly, cells were dissociated using Accutase, and 1 × 10⁶ cells were mixed with 3 µg of two gRNA encoding plasmids per gene in complete P3 solution. Mixtures were transferred to cuvettes and nucleofected using the CB-156 program. Cells were replated in prewarmed mTeSR1 containing 1 µg/mL doxycycline hydrochloride (Dox) (631311; Clontech) and 10 µM Y-27632. After 24 h, medium was changed to mTeSR1 plus Dox for 2 more days. Afterward, cells were cultured in mTeSR1 as described above, and single clones were picked manually and expanded in 96-well plates. gDNA from single clones was isolated using QuickExtract DNA extraction solution (QE0905T; Epicentre) and was used for deletion detection by PCR screening with primers amplifying the genomic region around the cuts introduced by Cas9.

Inducible GATA3 Overexpression.

H9 hESC cell line was nucleofected with PB-GFP-P2A-GATA3 plasmid and helper plasmid coding for Piggybac transposase (gift from Y. Mayshar, Harvard University, Cambridge, MA; map in Fig. S7) at 1:1 ratio using P3 Primary Cell 4D-Nucleofector X kit as noted above. Twenty-four hours after nucleofection, Hygromycin B (10687010; Life Technologies) was added to the medium at 50 µg/mL final concentration, and the resistant polyclonal line was selected for 2 wk. The line was maintained in presence of 25 µg/mL Hygromycin B.

Fig. S7.


(Related to experimental procedures.) Maps of plasmids used in this study.

Quantitative PCR.

cDNA was synthesized from 0.5 to 1 µg of total RNA using SuperScript III reverse transcriptase (18080093; Thermo Fisher), according to manufacturer’s protocol in 20 µL. cDNA was either used undiluted or was diluted 1:2 before using 1 µL as template in 10-µL qPCRs. qPCRs were prepared using TaqMan Gene Expression Master Mix (4369016; Thermo Fisher) and TaqMan Gene Expression Assays (4331182; Thermo Scientific) according to manufacturer’s protocol. qPCR was performed on QuantStudio 12K Flex Real-Time PCR System (Thermo Fisher) using predefined cycling protocol for TaqMan assays. Two technical replicates were run for each reaction. TaqMan assays used in this study are listed in Table S1. GAPDH served as a housekeeping gene in all experiments. Relative fold changes (FCs) in gene expression were calculated using ΔΔCt method (62). Results are presented as mean ΔΔCt ± SEM between biological replicates, unless indicated otherwise. Plots were produced using ggplot2 package (63) in RStudio.
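The ΔΔCt calculation can be written out as a short sketch. It assumes GAPDH as the reference gene, as stated above, and uses hypothetical function names and made-up Ct values. The sign convention shown here (log2 fold change equals minus ΔΔCt) is the standard one for this method and is an assumption about how the values were plotted.

```python
import math
import statistics
from typing import Sequence


def delta_delta_ct(ct_target_sample: float, ct_ref_sample: float,
                   ct_target_control: float, ct_ref_control: float) -> float:
    """ΔΔCt = (Ct_target - Ct_ref) in the sample minus the same difference in the control."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return delta_sample - delta_control


def log2_fold_change(ddct: float) -> float:
    """Assuming 100% amplification efficiency, fold change = 2^(-ΔΔCt)."""
    return -ddct


def mean_and_sem(values: Sequence[float]) -> tuple:
    """Mean and standard error of the mean across biological replicates."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Made-up Ct values: target gene vs. GAPDH in a KO clone and in parental cells.
ddct = delta_delta_ct(ct_target_sample=27.0, ct_ref_sample=18.0,
                      ct_target_control=24.0, ct_ref_control=18.0)
log2fc = log2_fold_change(ddct)
fold_change = 2 ** log2fc

# Two hypothetical biological replicates of ΔΔCt.
replicate_mean, replicate_sem = mean_and_sem([2.0, 4.0])
```

In this toy case the target appears three cycles later in the KO clone relative to GAPDH, which corresponds to an eightfold lower expression than in the parental cells.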

Table S1.

Predesigned TaqMan gene expression assays

Gene Assay no.
CD13 Hs00174265_m1
CDX2 Hs01078080_m1
GSC Hs00418279_m1
ENPEP Hs00157366_m1
GAPDH Hs02758991_g1
GATA2 Hs00231119_m1
GATA3 Hs00231122_m1
GCM1 Hs00961601_m1
MESP1 Hs00251489_m1
POU5F1 Hs01895061_u1
ROR2 Hs00171695_m1
T Hs00610080_m1
TFAP2A Hs00231461_m1
TFAP2C Hs00231476_m1
TP63 Hs00978343_m1
VGLL1 Hs00212387_m1

Construction of Vectors.

For gene-editing experiments using CRISPR-Cas9, gRNAs were designed using the MIT CRISPR design webpage (crispr.mit.edu). Two gRNAs were designed per gene (GATA3 and TFAP2A) (Table S2). gRNAs were chemically synthesized (Sigma) as a pair of cDNA oligonucleotides carrying BbsI overlaps. gRNA oligos were annealed to create double-stranded fragments with BbsI overhangs. pBS-U6 vector (64) was digested with BbsI (R0539; New England Biolabs) and ligated to gRNA fragments using T4 DNA ligase (M0202; New England Biolabs) overnight at 16 °C. Ligation products were transformed into 5-α competent Escherichia coli (C2987 I; New England Biolabs). Positive clone selection was done by Sanger sequencing of miniprep DNA. To create PB-GFP-P2A-GATA3 plasmid for inducible GATA3 overexpression, protein-coding sequence of human GATA3 mRNA (transcript variant 1, Gencode transcript ENST00000379328.7) was amplified from H9 hESC cDNA using the following primers: forward, cccatggactacaaagacgatgacgacaagGAGGTGACGGCGGACCAG; reverse, gaactccagcatgagatccccgcgctgcagCTAACCCATGGCGGTGACCA. Primers included 30-bp overlaps with the 5′ and 3′ ends of the recipient vector PB-GFP-2A. PB-GFP-2A vector was derived from PB-Lin28B-2A-GFP plasmid (gift from Y. Mayshar, Harvard University, Cambridge, MA) by recloning eGFP directly downstream from the promoter and replacing T2A self-cleaving peptide bridge with GSG linker-P2A bridge. Maps of the plasmids are presented in Fig. S7. PB-GFP-2A backbone was amplified as two fragments using the following primers: fragment 1, forward, ccctccagcatggtcaccgccatgggttagCTGCAGCGCGGGGATCTC; reverse, atttaggacatctcagtcgccgctTGGAGCTCCCGTGAGGCG; fragment 2, forward, AAGCGGCGACTGAGATGT; reverse, CTTGTCGTCATCGTCTTTGTAG. All PCRs were done using Q5 high-fidelity master mix (M0492; New England Biolabs) according to the manufacturer’s recommendations.
PCR fragments were purified using Minelute Reaction Cleanup kit (28204; Qiagen) and assembled using Gibson assembly master mix (E2611; New England Biolabs) in 20-µL total volume according to the manufacturer’s protocol. Reactions were digested with DpnI (R0176; New England Biolabs) for 15 min at 37 °C to remove template plasmid used in PCR amplification before transformation into 5-α competent E. coli cells. Screening for positive colonies was done by Sanger sequencing of miniprep DNA.

Table S2.

Primers for cloning of gRNAs

Gene Strand Sequence
GATA3 gRNA 1 Forward caccGTACTGCGCCGCGTCCATGT
Reverse aaacACATGGACGCGGCGCAGTAC
GATA3 gRNA 2 Forward caccGACACTCTCGCGACGAGCCAG
Reverse aaacCTGGCTCGTCGCGAGAGTGTC
TFAP2A gRNA 1 Forward caccGCAACCGTGCCGTCCCGTTGC
Reverse aaacGCAACGGGACGGCACGGTTGC
TFAP2A gRNA 2 Forward caccGGGAAATCGCCCGTTCCCGT
Reverse aaacACGGGAACGGGCGATTTCCC
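
The oligo pairs in Table S2 follow a simple pattern: the forward oligo is the guide sequence with a 5′ cacc overhang, and the reverse oligo is the reverse complement of the guide with a 5′ aaac overhang. The Python sketch below reproduces that pattern as a consistency check. The function names are illustrative and are not part of the original methods.

```python
# Complement table for uppercase DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def bbsi_oligo_pair(guide):
    """Build (forward, reverse) oligos with the lowercase 4-nt overhangs of Table S2."""
    return "cacc" + guide, "aaac" + reverse_complement(guide)

# Guide sequences taken from the forward oligos in Table S2 (overhang removed).
GUIDES = {
    "GATA3 gRNA 1": "GTACTGCGCCGCGTCCATGT",
    "GATA3 gRNA 2": "GACACTCTCGCGACGAGCCAG",
    "TFAP2A gRNA 1": "GCAACCGTGCCGTCCCGTTGC",
    "TFAP2A gRNA 2": "GGGAAATCGCCCGTTCCCGT",
}

for name, guide in GUIDES.items():
    forward, reverse = bbsi_oligo_pair(guide)
    print(name, forward, reverse)
```

After annealing, the two 4-nt overhangs remain single-stranded, which is what allows ligation into the BbsI-digested vector.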

Table S3.

Real-time PCR primers

Gene Forward Reverse
ChIP-seq validation
 ACTB AACGGCAGAAGAGAGAACCA AAGATGACCCAGGTGAGTGG
 B2M GAGGCTATCCAGCGTGAGTC GAAGTCACGGAGCGAGAGAG
 CCND1 TGAAGAATCCCTGGATGGAG GCCTGGGGTGAGATACAAGA
 ESR1 AGAAAGGCGGGCATTAACTT GGCCTTGACTTTCATGGTGT
 EVX2 CTGAGTCTTCGGGGTTTCAA GTCAGCGGGAGAAAGAGTTG
 GAPDH #1 AGTCCCCAGAAACAGGAGGT AGAGCGCGAAAGGAAAGAA
 GAPDH #2 CTCTCTCCCATCCCTTCTCC GGGAAGAGGGGAAGCTGTAT
 GAPDH #3 AGGCAACTAGGATGGTGTGG TGGACTCCACGACGTACTCA
 HOXD12 GGAAACCCTACACGAAGCAG TCGCTGAGGTTCAGCCTATT
 K4neg CCAGGCAGATGAATGAGGAT CCCTTCCAAGGCTCTCTTCT
 NEUROG1 CTGCAGGTACCCCTGATCTC AACTGCCCTTTCCTGAGTGA
 SPERT GCATTAGAAGCTGGGGTGAA CCTTCTCTCTTGCCCATCTG
Real-time PCR
 GAPDH GAGTCAACGGATTTGGTCGT ATGACAAGCTTCCCGTTCTC
 ELF5 CCTGATGTCGTGGACTGATCTG CTTAGTCCAGTATTCAGGGTGG

SCRB-Seq.

H9 hESC grown from feeder-free mTeSR1 cultures were differentiated for 3 d using BMP4 and KSR-based differentiation medium as described above. Differentiated cells and undifferentiated control cells from mTeSR1 medium were sorted into 384-well plates (0030128540; Eppendorf) at one cell per well, containing 5 µL of Phusion polymerase HF buffer (B0518; New England Biolabs) at 1:500 dilution. Plates were centrifuged and immediately frozen on dry ice. SCRB-seq libraries were prepared as described previously (65, 66). Briefly, plates were thawed, and cell lysis was completed by Proteinase K (AM2548; Thermo Fisher) digest. RNA was desiccated and subsequently reverse transcribed using custom barcoded oligo-dT primers (IDT). RT products were pooled and unincorporated primers digested using Exonuclease I (M0293; New England Biolabs). Amplification of cDNA was performed using the KAPA HiFi HotStart polymerase (KK2502; KAPA Biosystems). Libraries were prepared using Nextera XT library prep kit (FC-131-1024; Illumina) from 1 ng of preamplified cDNA with a custom P5 primer (IDT) and sequenced on HiSeq 2500 instrument (Illumina) using a custom procedure.

Rhesus Macaque Oocyte Collection.

Multiple preovulatory follicles from rhesus macaque females of average maternal age (∼8 y old) were obtained via controlled ovarian stimulation from the Oregon National Primate Research Center (ONPRC) Assisted Reproductive Technologies (ART) Core according to the Institutional Animal Care and Use Committee (IACUC) approved protocol #0095 entitled “Assisted Reproduction in Macaques.” The IACUC is fully accredited by the Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC) and Oregon Health & Science University (OHSU)/ONPRC has an Animal Welfare Assurance on file with the NIH Office for Laboratory Animal Welfare (OLAW; #A3304-01). Controlled ovarian stimulation consisted of a daily intramuscular (IM) injection of 60 IU recombinant human (rh) follicle-stimulating hormone (FSH) (30 IU at 0800 hours and 1600 hours) for 6 d followed by IM injection of 60 IU luteinizing hormone (LH) (30 IU at 0800 hours and 1600 hours) on days 6 and 7. The gonadotropin-releasing hormone (GnRH) antagonist, Acyline, was given IM on day 6 at 1600 hours (60 μg/kg; Eunice Kennedy Shriver National Institute of Child Health and Human Development) to prevent an endogenous LH surge. The number and size of growing follicles were monitored by ultrasonography on day 7 of treatment, and when six or more follicles of 4-mm diameter or greater were present (typically on day 8), 1,000 IU of hCG (Merck) was injected IM to initiate oocyte maturation. A laparoscopy was aseptically performed 30 h later with the follicular contents aspirated by suction using a stainless-steel needle attached to a 5-cc syringe.

Fertilization of Rhesus Macaque Embryo and Culture.

Conventional in vitro fertilization (IVF) was performed overnight with fresh sperm at a concentration of 20 million per mL collected from the semen of rhesus male monkeys the same day as oocyte retrieval. Fertilized oocytes were cultured in alphanumeric-labeled microwell Petri dishes (Progyny; formerly Auxogyn) containing commercially available media with 10% protein supplement (LGPS; LifeGlobal) under mineral oil at 37 °C with 6% CO2, 5% O2, and 89% N2, as previously described and in accordance with clinical IVF practice (67).

GATA-3 Silencing.

Knockdown of GATA-3 was accomplished by microinjecting a morpholino antisense oligonucleotide (MAO; Gene Tools) with the sequence 5′-TGGTCTGCCGTCACCTCCATG-3′ designed to target the ATG start site. The MAO was labeled with 3′-carboxyfluorescein (3′-COF) for visualization and microinjected into the cytoplasm of zygotes using a manual microinjector and Transferman NK 2 Micromanipulators (#5193000012; Eppendorf) at a concentration of 0.3 mM as previously described (68). Two to three independent experiments with 10 or more embryos per injection group were performed. Noninjected embryos and embryos injected with a 3′-COF–labeled standard MAO served as negative controls.
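
Because the oligonucleotide is antisense, its reverse complement gives the mRNA-sense sequence it is expected to bind. The short Python sketch below performs that conversion and locates the ATG within the target. It only illustrates the design logic; the helper name is an assumption and the sketch makes no claim about how the sequence was actually designed.

```python
# Complement table for uppercase DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Antisense morpholino sequence as given in the text (5' to 3').
MORPHOLINO = "TGGTCTGCCGTCACCTCCATG"

# Sense-strand sequence that the morpholino is complementary to.
target_sense = reverse_complement(MORPHOLINO)

# Zero-based position of the first ATG in the sense-strand target.
atg_position = target_sense.find("ATG")

print(target_sense, atg_position)
```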

Time-Lapse Imaging.

Each zygote, whether microinjected or not, was monitored up to ∼5–7 d until the blastocyst stage using an Eeva time-lapse dark-field microscope system (Progyny). Embryos that arrested before this stage of development were removed from the microwell Petri dishes once they ceased dividing. Images were taken every 5 min and compiled into a time-lapse movie with well identification labels and time stamps for the measurement of imaging parameters using Fiji (69).

Confocal Analysis.

The zona pellucida (ZP) was removed with Tyrode’s solution (MR-004-D; Millipore), and each embryo was washed in PBS containing 0.1% BSA (A2058; Sigma) and 0.1% Tween 20 (P1379; Sigma-Aldrich) (PBS-T). ZP-free embryos were fixed in 4% paraformaldehyde in PBS (AA433689M; Fisher Scientific) for 20 min at RT. Once fixed, the embryos were washed three times in PBS-T to remove any residual fixative and permeabilized in 1% Triton X-100 (catalog #648466; EMD Millipore) for 1–2 h at RT. Following permeabilization, the embryos were washed three times in PBS-T and then blocked in 7% normal donkey serum (017-000-121; Jackson ImmunoResearch Laboratories) in PBS-T overnight at 4 °C. The embryos were incubated with primary antibodies in PBS-T with 1% donkey serum sequentially for 1 h each at RT at the following dilutions: 1:100 mouse anti-GATA-3 (clone #L50-823; CM 405 A; Biocare Medical) and 1:200 goat anti-NANOG (catalog #AF1997-SP; R&D Systems). Primary signals were detected using the appropriate 488-, 568-, 594-, or 647-conjugated donkey Alexa Fluor secondary antibody (A-21202; Invitrogen) at a 1:250 dilution at RT for 1 h in the dark. Immunofluorescence was visualized by sequential imaging, whereby the channel track was switched each frame to avoid cross-contamination between channels, using a Leica SP5 AOBS spectral inverted laser-scanning confocal system in the Imaging and Morphology Core at ONPRC.

Data Analysis

Microarray.

CEL files were processed and normalized using the affy package (70). Normalization was performed using robust multiarray average. Genes with low variation or low signal were removed by applying a filtering approach that is implemented in the genefilter package (71). Differential gene expression was determined using limma package (72). False-discovery rate (FDR) was calculated from P values using the Benjamini and Hochberg procedure. A gene was considered to be differentially expressed if its adjusted P value was <0.05.
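
For readers unfamiliar with the Benjamini and Hochberg procedure, a minimal pure-Python sketch follows. It is not the limma implementation used in the analysis, and the P values are invented for illustration. Each P value is multiplied by m/rank, and monotonicity is then enforced from the largest rank downward.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical P values for four genes.
pvals = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(pvals)
# Apply the 0.05 cutoff on adjusted P values.
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```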

Genomatix Software Suite Analysis (www.genomatix.de).

Tissue and cell type associations were analyzed using the GeneRanker tool by providing the corresponding list of gene names as input. This tool employs a proprietary literature data-mining algorithm (LitInspector) based on all PubMed abstracts. Individual tissue-to-gene associations found at the sentence level are filtered for significance. Names and synonyms are based on Medical Subject Headings terms and the National Cancer Institute Thesaurus. Subcellular localization analysis was performed using the Genomatix Pathway System tool. Transcription factors and coregulators of transcription were isolated from gene lists using the Genomatix TF database (MatBase).

ChIP-Seq Read Processing.

Sequenced reads were aligned to version hg19 of the human genome using bowtie (73), version 1.1.1, allowing only for single matches to the reference (parameter -m 1). We extended the matched reads to a total of 200 bp and calculated for each sample a per-base genomic coverage vector by cumulating the total spans of all sequenced fragments.
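
The read extension and coverage step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the pipeline actually used: reads are represented as hypothetical (start, length, strand) tuples with 0-based starts, extension proceeds in the 3′ direction of the read, and the toy chromosome length is arbitrary.

```python
def extend_read(start, length, strand, fragment=200):
    """Extend an aligned read to `fragment` bp; returns a half-open interval [s, e)."""
    if strand == "+":
        return start, start + fragment
    # Minus-strand reads are extended leftward from their rightmost base.
    end = start + length
    return max(0, end - fragment), end

def coverage(reads, chrom_length, fragment=200):
    """Per-base coverage from extended reads, using a difference array."""
    diff = [0] * (chrom_length + 1)
    for start, length, strand in reads:
        s, e = extend_read(start, length, strand, fragment)
        # Clip the fragment to the chromosome boundaries.
        s, e = min(s, chrom_length), min(e, chrom_length)
        diff[s] += 1
        diff[e] -= 1
    cov, running = [], 0
    for d in diff[:chrom_length]:
        running += d
        cov.append(running)
    return cov

# Three hypothetical 50-bp reads on a 500-bp toy chromosome.
toy_reads = [(0, 50, "+"), (100, 50, "+"), (350, 50, "-")]
toy_cov = coverage(toy_reads, 500)
print(toy_cov[0], toy_cov[150], toy_cov[250], toy_cov[450])
```

The difference array keeps the cost linear in the number of reads plus the chromosome length, instead of incrementing every covered base for every read.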

Histone Modification Enrichment at Promoters.

For each gene, the number of matching reads in each 4-kb promoter window was calculated for each replicate sample. Raw read counts were transformed as $\mathrm{reads.transformed}_i = \arcsin(\mathrm{reads}_i / \mathrm{total\ reads})$, where $i$ is the promoter window and $\mathrm{reads}_i$ is the number of reads covering $i$. Signal enrichment was analyzed by subtracting the normalized reads of the corresponding input samples from the IP read values. Enrichments across replicates of each histone modification were adjusted by quantile normalization, and the values were averaged. Where several promoters mapped to one gene, we selected the promoter with the highest interquartile range of signals across all experimental conditions.
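
The transformation, input subtraction, quantile normalization, and averaging can be sketched in pure Python as below. All function names and example numbers are assumptions made for illustration. In particular, "total reads" is passed in explicitly as the library size, and ties in quantile normalization are broken by input order, which may differ from the original implementation.

```python
import math

def arcsin_transform(counts, total_reads):
    """Apply arcsin(reads_i / total reads) to each promoter window."""
    return [math.asin(c / total_reads) for c in counts]

def enrichment(ip_counts, ip_total, input_counts, input_total):
    """Subtract the transformed input signal from the transformed IP signal."""
    ip = arcsin_transform(ip_counts, ip_total)
    ctrl = arcsin_transform(input_counts, input_total)
    return [a - b for a, b in zip(ip, ctrl)]

def quantile_normalize(replicates):
    """Give every replicate (a list of window values) the same distribution."""
    n = len(replicates[0])
    sorted_reps = [sorted(rep) for rep in replicates]
    # Mean of the r-th smallest value across replicates.
    rank_means = [sum(rep[r] for rep in sorted_reps) / len(replicates)
                  for r in range(n)]
    normalized = []
    for rep in replicates:
        order = sorted(range(n), key=lambda i: rep[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = rank_means[rank]
        normalized.append(out)
    return normalized

def average_replicates(replicates):
    """Average window-wise across replicates."""
    return [sum(vals) / len(vals) for vals in zip(*replicates)]

# Toy example: two replicates, three promoter windows.
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
print(normalized, average_replicates(normalized))
```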

Peak Calling and Definition of Robust Peak Sets.

TF peaks were called using Homer (74) findPeaks (version 4.7.2) with parameters style = factor, size = 200, fragLength = 200, inputFragLength = 200, and C = 0. All peaks were called using the corresponding input samples as control. We defined peaks as robust if the region was called in at least two biologically replicated samples except for TFAP2C, for which only one high-quality dataset was obtained.
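
One way to read the robustness criterion is sketched below in Python: a peak is kept when it overlaps a peak in at least one other replicate. The text does not specify how regions were matched between replicates, so the overlap rule, the (chrom, start, end) tuple layout, and the function names are all assumptions.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def robust_peaks(replicates, min_support=2):
    """Keep peaks supported by at least `min_support` replicates.

    `replicates` is a list of peak lists, one list per biological replicate.
    """
    robust = set()
    for r, peaks in enumerate(replicates):
        for peak in peaks:
            # Count the peak's own replicate plus every other replicate
            # that contains at least one overlapping peak.
            support = 1 + sum(
                any(overlaps(peak, other) for other in other_peaks)
                for j, other_peaks in enumerate(replicates) if j != r
            )
            if support >= min_support:
                robust.add(peak)
    return sorted(robust)

# Hypothetical peak calls from two replicates.
rep1 = [("chr1", 100, 300), ("chr1", 1000, 1200)]
rep2 = [("chr1", 250, 450), ("chr2", 1000, 1200)]
print(robust_peaks([rep1, rep2]))
```

This sketch returns the supported peaks from every replicate without merging them. A production pipeline would typically merge overlapping regions into a single interval.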

De Novo Motif Discovery and Genome-Wide Motif Searches.

We searched for enriched motifs in peak regions using MEME (75) using zero or one occurrence per sequence (“zoops”) model.

RNA-Seq.

RNA-seq reads were mapped to the hg38 genome using TopHat (76). The resulting alignments were overlaid with the University of California, Santa Cruz, “known gene” track (77) to obtain exon read coverage for every annotated gene. Genes with zero read counts across all samples were considered to be nonexpressed and were removed from further analysis. The count data were then preprocessed using voom (78) including quantile normalization between the individual samples. Differential gene expression between time points was determined similarly to the microarray data. For illustration purposes, read coverage data were assembled from BAM files using Integrative Genomics Viewer (IGV) (79) using “normalize coverage data” option. Time series of selected TFs were clustered according to the pairwise Pearson correlation coefficients using the k-means method with k = 6 clusters. Fisher’s exact test was employed to test whether a set of genes with a certain histone modification was significantly overrepresented in one of the resulting clusters.
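
The overrepresentation test can be illustrated with a pure-Python sketch of the one-sided Fisher's exact test, computed as the upper tail of the hypergeometric distribution. The text does not state whether a one- or two-sided test was used, so the one-sided form is an assumption, and the counts below are invented.

```python
from math import comb

def fisher_overrepresentation(marked_in_cluster, cluster_size,
                              total_marked, total_genes):
    """One-sided Fisher's exact P value for overrepresentation.

    Probability of observing at least `marked_in_cluster` marked genes in a
    cluster of `cluster_size` genes drawn from `total_genes` genes, of which
    `total_marked` carry the histone modification.
    """
    upper = min(cluster_size, total_marked)
    denominator = comb(total_genes, cluster_size)
    tail = sum(
        comb(total_marked, k) * comb(total_genes - total_marked, cluster_size - k)
        for k in range(marked_in_cluster, upper + 1)
    )
    return tail / denominator

# Hypothetical example: 10 TFs, 4 marked, a cluster of 3 that are all marked.
p_value = fisher_overrepresentation(3, 3, 4, 10)
print(p_value)
```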

SCRB-Seq.

SCRB-seq plate pools (i7) were demultiplexed from the Illumina barcode reads using deML (80). Reads were mapped to the human genome (build hg38) using STAR 2.5 (81). Cell- and gene-wise unique molecular identifier (UMI) tables were generated using the published Drop-seq pipeline (version 1.1) (82). Ensembl GRCh38.84 gene models were used. The median total UMI counts per cell were 4,956, and the median number of genes detected per cell was 2,643 (Fig. S3).

Single-cell differential expression analysis.

Analysis was restricted to 680 cells with a minimum of 900 genes detected and 12,736 genes with a minimum of 50 UMI counts and detected in at least 5% of cells. Single-cell differential gene expression analysis between APA+ population and undifferentiated hESC was performed using the R package SCDE (83). Differential expression plots of exemplary genes showing the estimated fold changes were generated using the scde.test.gene.expression.difference function.
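
The cell and gene filters can be sketched in Python as below. The order of operations (cells first, then genes evaluated on the retained cells) and the nested-dictionary layout of the UMI table are assumptions; the thresholds default to the values stated above.

```python
def filter_counts(counts, min_genes_per_cell=900,
                  min_umis_per_gene=50, min_cell_fraction=0.05):
    """Filter a UMI table given as {cell: {gene: UMI count}}.

    Returns (sorted kept cells, sorted kept genes).
    """
    # Keep cells with enough detected genes (UMI count > 0).
    kept_cells = {
        cell: genes for cell, genes in counts.items()
        if sum(1 for v in genes.values() if v > 0) >= min_genes_per_cell
    }
    totals, detected = {}, {}
    for genes in kept_cells.values():
        for gene, v in genes.items():
            if v > 0:
                totals[gene] = totals.get(gene, 0) + v
                detected[gene] = detected.get(gene, 0) + 1
    n_cells = len(kept_cells)
    # Keep genes with enough total UMIs that are detected in enough cells.
    kept_genes = {
        gene for gene in totals
        if totals[gene] >= min_umis_per_gene
        and detected[gene] >= min_cell_fraction * n_cells
    }
    return sorted(kept_cells), sorted(kept_genes)

# Toy table with relaxed thresholds so the example stays small.
toy = {"c1": {"A": 5, "B": 1}, "c2": {"A": 3, "B": 0}, "c3": {"A": 0, "B": 0}}
print(filter_counts(toy, min_genes_per_cell=1,
                    min_umis_per_gene=5, min_cell_fraction=0.5))
```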

Single-cell and bulk RNA-seq comparison.

Bulk RNA-seq samples from undifferentiated hESC (T0) and 72 h of BMP4 differentiation (T72) were compared against the single-cell gene expression data. To avoid confounding due to differences in sequencing technologies and library sizes, the relative expression changes within bulk and single-cell gene expression were compared. Both the single-cell and bulk data were restricted to genes present in both datasets. Both datasets were normalized for library size using the calcNormFactors function using the “RLE” method of the edgeR R package (84). For each gene, the mean expression log2 fold changes were calculated for T72 relative to T0 and APA+ relative to undifferentiated hESC in the bulk and single-cell datasets, respectively. Next, Pearson correlation was used to assess correspondence between bulk and single-cell fold changes.

Single-cell tSNE analysis.

Single-cell UMI count matrix was normalized for library size using the computeSumFactors function of the scran R package (85). To decrease confounding due to very large library sizes, cells with calculated size factors bigger than 2.5 were removed. Next, 10,000 genes with the highest variance were subjected to principal-component analysis. Subsequently, the first five principal components, which explain over 97% of the variation in the data, were provided as input for the tSNE projection using the tsne function of the tsne R package (86).

Single-cell trophoblast gene expression score.

To calculate trophoblast gene expression score, the following nine marker genes were used: DLX3, GCM1, GATA2, GATA3, TFAP2A, TFAP2C, HAND1, VTCN1, and IGF2. To account for noise due to dropout, the gene expression score for each cell was calculated as the number of marker genes with UMI count greater than 0.
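
The score defined above amounts to counting detected marker genes per cell, as in the Python sketch below. The marker list is taken from the text; the dictionary layout of a cell's UMI counts and the example values are assumptions.

```python
# The nine trophoblast marker genes named in the text.
MARKERS = ("DLX3", "GCM1", "GATA2", "GATA3", "TFAP2A",
           "TFAP2C", "HAND1", "VTCN1", "IGF2")

def trophoblast_score(umi_counts):
    """Number of marker genes with a UMI count greater than 0 in one cell.

    `umi_counts` maps gene symbols to UMI counts; missing genes count as 0.
    """
    return sum(1 for gene in MARKERS if umi_counts.get(gene, 0) > 0)

# Hypothetical cell: two markers detected, one marker with zero counts.
example_cell = {"GATA3": 4, "TFAP2A": 1, "HAND1": 0, "POU5F1": 10}
print(trophoblast_score(example_cell))
```

Counting detection rather than summing expression makes the score range from 0 to 9 and keeps a single highly expressed marker from dominating it.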

Supplementary Material

Supplementary File
pnas.1708341114.sd01.xlsx (440.1KB, xlsx)
Supplementary File
pnas.1708341114.sd02.xlsx (100.6KB, xlsx)

Acknowledgments

For prolific scientific discussions and critical reading of the manuscript, we thank Tal Raveh. We also acknowledge Deutsche Forschungsgemeinschaft Grant DR1008/1-1 (to M.D.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The raw sequencing data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database (accession nos. GSE104818, GSE104969, GSE105022, GSE105081, and GSE105258).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1708341114/-/DCSupplemental.

References

  • 1.Rossant J, Cross JC. Placental development: Lessons from mouse mutants. Nat Rev Genet. 2001;2:538–548. doi: 10.1038/35080570. [DOI] [PubMed] [Google Scholar]
  • 2.Yagi R, et al. Transcription factor TEAD4 specifies the trophectoderm lineage at the beginning of mammalian development. Development. 2007;134:3827–3836. doi: 10.1242/dev.010223. [DOI] [PubMed] [Google Scholar]
  • 3.Nishioka N, et al. Tead4 is required for specification of trophectoderm in pre-implantation mouse embryos. Mech Dev. 2008;125:270–283. doi: 10.1016/j.mod.2007.11.002. [DOI] [PubMed] [Google Scholar]
  • 4.Strumpf D, et al. Cdx2 is required for correct cell fate specification and differentiation of trophectoderm in the mouse blastocyst. Development. 2005;132:2093–2102. doi: 10.1242/dev.01801. [DOI] [PubMed] [Google Scholar]
  • 5.Niwa H, et al. Interaction between Oct3/4 and Cdx2 determines trophectoderm differentiation. Cell. 2005;123:917–929. doi: 10.1016/j.cell.2005.08.040. [DOI] [PubMed] [Google Scholar]
  • 6.Russ AP, et al. Eomesodermin is required for mouse trophoblast development and mesoderm formation. Nature. 2000;404:95–99. doi: 10.1038/35003601. [DOI] [PubMed] [Google Scholar]
  • 7.Ralston A, et al. Gata3 regulates trophoblast development downstream of Tead4 and in parallel to Cdx2. Development. 2010;137:395–403. doi: 10.1242/dev.038828. [DOI] [PubMed] [Google Scholar]
  • 8.Werling U, Schorle H. Transcription factor gene AP-2 gamma essential for early murine development. Mol Cell Biol. 2002;22:3149–3156. doi: 10.1128/MCB.22.9.3149-3156.2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Auman HJ, et al. Transcription factor AP-2gamma is essential in the extra-embryonic lineages for early postimplantation development. Development. 2002;129:2733–2747. doi: 10.1242/dev.129.11.2733. [DOI] [PubMed] [Google Scholar]
  • 10.Guillemot F, Nagy A, Auerbach A, Rossant J, Joyner AL. Essential role of Mash-2 in extraembryonic development. Nature. 1994;371:333–336. doi: 10.1038/371333a0. [DOI] [PubMed] [Google Scholar]
  • 11.Riley P, Anson-Cartwright L, Cross JC. The Hand1 bHLH transcription factor is essential for placentation and cardiac morphogenesis. Nat Genet. 1998;18:271–275. doi: 10.1038/ng0398-271. [DOI] [PubMed] [Google Scholar]
  • 12.Anson-Cartwright L, et al. The glial cells missing-1 protein is essential for branching morphogenesis in the chorioallantoic placenta. Nat Genet. 2000;25:311–314. doi: 10.1038/77076. [DOI] [PubMed] [Google Scholar]
  • 13.Donnison M, et al. Loss of the extraembryonic ectoderm in Elf5 mutants leads to defects in embryonic patterning. Development. 2005;132:2299–2308. doi: 10.1242/dev.01819. [DOI] [PubMed] [Google Scholar]
  • 14.Nishioka N, et al. The Hippo signaling pathway components Lats and Yap pattern Tead4 activity to distinguish mouse trophectoderm from inner cell mass. Dev Cell. 2009;16:398–410. doi: 10.1016/j.devcel.2009.02.003. [DOI] [PubMed] [Google Scholar]
  • 15.Kubaczka C, et al. Direct induction of trophoblast stem cells from murine fibroblasts. Cell Stem Cell. 2015;17:557–568. doi: 10.1016/j.stem.2015.08.005. [DOI] [PubMed] [Google Scholar]
  • 16.Benchetrit H, et al. Extensive nuclear reprogramming underlies lineage conversion into functional trophoblast stem-like cells. Cell Stem Cell. 2015;17:543–556. doi: 10.1016/j.stem.2015.08.006. [DOI] [PubMed] [Google Scholar]
  • 17.Kuckenberg P, et al. The transcription factor TCFAP2C/AP-2gamma cooperates with CDX2 to maintain trophectoderm formation. Mol Cell Biol. 2010;30:3310–3320. doi: 10.1128/MCB.01215-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Chen AE, et al. Optimal timing of inner cell mass isolation increases the efficiency of human embryonic stem cell derivation and allows generation of sibling cell lines. Cell Stem Cell. 2009;4:103–106. doi: 10.1016/j.stem.2008.12.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Bai Q, et al. Dissecting the first transcriptional divergence during human embryonic development. Stem Cell Rev. 2012;8:150–162. doi: 10.1007/s12015-011-9301-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Niakan KK, Eggan K. Analysis of human embryos from zygote to blastocyst reveals distinct gene expression patterns relative to the mouse. Dev Biol. 2013;375:54–64. doi: 10.1016/j.ydbio.2012.12.008. [DOI] [PubMed] [Google Scholar]
  • 21.Deglincerti A, et al. Self-organization of the in vitro attached human embryo. Nature. 2016;533:251–254. doi: 10.1038/nature17948. [DOI] [PubMed] [Google Scholar]
  • 22.Sõber S, et al. Extensive shift in placental transcriptome profile in preeclampsia and placental origin of adverse pregnancy outcomes. Sci Rep. 2015;5:13336. doi: 10.1038/srep13336. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Blakeley P, et al. Defining the three cell lineages of the human blastocyst by single-cell RNA-seq. Development. 2015;142:3613. doi: 10.1242/dev.131235. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Amita M, et al. Complete and unidirectional conversion of human embryonic stem cells to trophoblast by BMP4. Proc Natl Acad Sci USA. 2013;110:E1212–E1221. doi: 10.1073/pnas.1303094110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Xu RH. In vitro induction of trophoblast from human embryonic stem cells. Methods Mol Med. 2006;121:189–202. doi: 10.1385/1-59259-983-4:187. [DOI] [PubMed] [Google Scholar]
  • 26.Drukker M, et al. Isolation of primitive endoderm, mesoderm, vascular endothelial and trophoblast progenitors from human pluripotent stem cells. Nat Biotechnol. 2012;30:531–542. doi: 10.1038/nbt.2239. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Horii M, et al. Human pluripotent stem cells as a model of trophoblast differentiation in both normal development and disease. Proc Natl Acad Sci USA. 2016;113:E3882–E3891. doi: 10.1073/pnas.1604747113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Telugu BP, et al. Comparison of extravillous trophoblast cells derived from human embryonic stem cells and from first trimester human placentas. Placenta. 2013;34:536–543. doi: 10.1016/j.placenta.2013.03.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Xu RH, et al. BMP4 initiates human embryonic stem cell differentiation to trophoblast. Nat Biotechnol. 2002;20:1261–1264. doi: 10.1038/nbt761. [DOI] [PubMed] [Google Scholar]
  • 30.Winnier G, Blessing M, Labosky PA, Hogan BL. Bone morphogenetic protein-4 is required for mesoderm formation and patterning in the mouse. Genes Dev. 1995;9:2105–2116. doi: 10.1101/gad.9.17.2105. [DOI] [PubMed] [Google Scholar]
  • 31.Fujiwara T, Dehart DB, Sulik KK, Hogan BL. Distinct requirements for extra-embryonic and embryonic bone morphogenetic protein 4 in the formation of the node and primitive streak and coordination of left-right asymmetry in the mouse. Development. 2002;129:4685–4696. doi: 10.1242/dev.129.20.4685. [DOI] [PubMed] [Google Scholar]
  • 32.Graham SJ, et al. BMP signalling regulates the pre-implantation development of extra-embryonic cell lineages in the mouse embryo. Nat Commun. 2014;5:5667. doi: 10.1038/ncomms6667. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Kurek D, et al. Endogenous WNT signals mediate BMP-induced and spontaneous differentiation of epiblast stem cells and human embryonic stem cells. Stem Cell Rep. 2015;4:114–128. doi: 10.1016/j.stemcr.2014.11.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Lichtner B, Knaus P, Lehrach H, Adjaye J. BMP10 as a potent inducer of trophoblast differentiation in human embryonic and induced pluripotent stem cells. Biomaterials. 2013;34:9789–9802. doi: 10.1016/j.biomaterials.2013.08.084. [DOI] [PubMed] [Google Scholar]
  • 35.Bernstein BE, et al. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell. 2006;125:315–326. doi: 10.1016/j.cell.2006.02.041. [DOI] [PubMed] [Google Scholar]
  • 36.Mikkelsen TS, et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature. 2007;448:553–560. doi: 10.1038/nature06008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Gifford CA, et al. Transcriptional and epigenetic dynamics during specification of human embryonic stem cells. Cell. 2013;153:1149–1163. doi: 10.1016/j.cell.2013.04.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Xie W, et al. Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell. 2013;153:1134–1148. doi: 10.1016/j.cell.2013.04.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Lee CQ, et al. What is trophoblast? A combination of criteria define human first-trimester trophoblast. Stem Cell Rep. 2016;6:257–272. doi: 10.1016/j.stemcr.2016.01.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Tang C, et al. An antibody against SSEA-5 glycan on human pluripotent stem cells enables removal of teratoma-forming cells. Nat Biotechnol. 2011;29:829–834. doi: 10.1038/nbt.1947. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Ma GT, et al. GATA-2 and GATA-3 regulate trophoblast-specific gene expression in vivo. Development. 1997;124:907–914. doi: 10.1242/dev.124.4.907. [DOI] [PubMed] [Google Scholar]
  • 42.Johnson W, et al. Regulation of the human chorionic gonadotropin alpha- and beta-subunit promoters by AP-2. J Biol Chem. 1997;272:15405–15412. doi: 10.1074/jbc.272.24.15405. [DOI] [PubMed] [Google Scholar]
  • 43.Biadasiewicz K, et al. Transcription factor AP-2α promotes EGF-dependent invasion of human trophoblast. Endocrinology. 2011;152:1458–1469. doi: 10.1210/en.2010-0936. [DOI] [PubMed] [Google Scholar]
  • 44.Knöfler M, et al. Transcriptional regulation of the human chorionic gonadotropin beta gene during villous trophoblast differentiation. Endocrinology. 2004;145:1685–1694. doi: 10.1210/en.2003-0954. [DOI] [PubMed] [Google Scholar]
  • 45.Li Y, et al. BMP4-directed trophoblast differentiation of human embryonic stem cells is mediated through a ΔNp63+ cytotrophoblast stem cell state. Development. 2013;140:3965–3976. doi: 10.1242/dev.092155. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Home P, et al. GATA3 is selectively expressed in the trophectoderm of peri-implantation embryo and directly regulates Cdx2 gene expression. J Biol Chem. 2009;284:28729–28737. doi: 10.1074/jbc.M109.016840. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Cao Z, et al. Transcription factor AP-2γ induces early Cdx2 expression and represses HIPPO signaling to specify the trophectoderm lineage. Development. 2015;142:1606–1615. doi: 10.1242/dev.120238. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Tzouanacou E, Tweedie S, Wilson V. Identification of Jade1, a gene encoding a PHD zinc finger protein, in a gene trap mutagenesis screen for genes involved in anteroposterior axis development. Mol Cell Biol. 2003;23:8553–8562. doi: 10.1128/MCB.23.23.8553-8562.2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.González F, et al. An iCRISPR platform for rapid, multiplexable, and inducible genome editing in human pluripotent stem cells. Cell Stem Cell. 2014;15:215–226. doi: 10.1016/j.stem.2014.05.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Petropoulos S, et al. Single-cell RNA-seq reveals lineage and X chromosome dynamics in human preimplantation embryos. Cell. 2016;167:285. doi: 10.1016/j.cell.2016.08.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Home P, et al. Genetic redundancy of GATA factors in the extraembryonic trophoblast lineage ensures the progression of preimplantation and postimplantation mammalian development. Development. 2017;144:876–888. doi: 10.1242/dev.145318. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Pfeffer PL, Pearton DJ. Trophoblast development. Reproduction. 2012;143:231–246. doi: 10.1530/REP-11-0374. [DOI] [PubMed] [Google Scholar]
  • 53.Baczyk D, et al. Glial cell missing-1 transcription factor is required for the differentiation of the human trophoblast. Cell Death Differ. 2009;16:719–727. doi: 10.1038/cdd.2009.1. [DOI] [PubMed] [Google Scholar]
  • 54.Lee Y, et al. A unifying concept of trophoblastic differentiation and malignancy defined by biomarker expression. Hum Pathol. 2007;38:1003–1013. doi: 10.1016/j.humpath.2006.12.012. [DOI] [PubMed] [Google Scholar]
  • 55.Ugele B, Regemann K. Differential increase of steroid sulfatase activity in XX and XY trophoblast cells from human term placenta with syncytia formation in vitro. Cytogenet Cell Genet. 2000;90:40–46. doi: 10.1159/000015657. [DOI] [PubMed] [Google Scholar]
  • 56.Takaku M, et al. GATA3-dependent cellular reprogramming requires activation-domain dependent recruitment of a chromatin remodeler. Genome Biol. 2016;17:36. doi: 10.1186/s13059-016-0897-0.
  • 57.Wei G, et al. Genome-wide analyses of transcription factor GATA3-mediated gene regulation in distinct T cell types. Immunity. 2011;35:299–311. doi: 10.1016/j.immuni.2011.08.007.
  • 58.Ray S, et al. Context-dependent function of regulatory elements and a switch in chromatin occupancy between GATA3 and GATA2 regulate Gata2 transcription during trophoblast differentiation. J Biol Chem. 2009;284:4978–4988. doi: 10.1074/jbc.M807329200.
  • 59.Hinman VF, Davidson EH. Evolutionary plasticity of developmental gene regulatory network architecture. Proc Natl Acad Sci USA. 2007;104:19404–19409. doi: 10.1073/pnas.0709994104.
  • 60.Goentoro L, Shoval O, Kirschner MW, Alon U. The incoherent feedforward loop can provide fold-change detection in gene regulation. Mol Cell. 2009;36:894–899. doi: 10.1016/j.molcel.2009.11.018.
  • 61.Gupta R, Ezashi T, Roberts RM. Squelching of ETS2 transactivation by POU5F1 silences the human chorionic gonadotropin CGA subunit gene in human choriocarcinoma and embryonic stem cells. Mol Endocrinol. 2012;26:859–872. doi: 10.1210/me.2011-1146.
  • 62.Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method. Methods. 2001;25:402–408. doi: 10.1006/meth.2001.1262.
  • 63.Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer; New York: 2009.
  • 64.Brandl C, et al. Creation of targeted genomic deletions using TALEN or CRISPR/Cas nuclease pairs in one-cell mouse embryos. FEBS Open Bio. 2014;5:26–35. doi: 10.1016/j.fob.2014.11.009.
  • 65.Soumillon M, Cacchiarelli D, Semrau S, van Oudenaarden A, Mikkelsen TS. Characterization of directed differentiation by high-throughput single-cell RNA-Seq. bioRxiv. March 5, 2014. doi: 10.1101/003236.
  • 66.Ziegenhain C, et al. Comparative analysis of single-cell RNA sequencing methods. Mol Cell. 2017;65:631–643.e4. doi: 10.1016/j.molcel.2017.01.023.
  • 67.Chavez SL, et al. Dynamic blastomere behaviour reflects human embryo ploidy by the four-cell stage. Nat Commun. 2012;3:1251. doi: 10.1038/ncomms2249.
  • 68.Chavez SL, et al. Comparison of epigenetic mediator expression and function in mouse and human embryonic blastomeres. Hum Mol Genet. 2014;23:4970–4984. doi: 10.1093/hmg/ddu212.
  • 69.Schindelin J, et al. Fiji: An open-source platform for biological-image analysis. Nat Methods. 2012;9:676–682. doi: 10.1038/nmeth.2019.
  • 70.Gautier L, Cope L, Bolstad BM, Irizarry RA. affy–Analysis of Affymetrix GeneChip data at the probe level. Bioinformatics. 2004;20:307–315. doi: 10.1093/bioinformatics/btg405.
  • 71.Bourgon R, Gentleman R, Huber W. Independent filtering increases detection power for high-throughput experiments. Proc Natl Acad Sci USA. 2010;107:9546–9551. doi: 10.1073/pnas.0914005107.
  • 72.Smyth GK. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004;3:1–25. doi: 10.2202/1544-6115.1027.
  • 73.Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
  • 74.Heinz S, et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010;38:576–589. doi: 10.1016/j.molcel.2010.05.004.
  • 75.Bailey TL, et al. MEME SUITE: Tools for motif discovery and searching. Nucleic Acids Res. 2009;37:W202–W208. doi: 10.1093/nar/gkp335.
  • 76.Trapnell C, Pachter L, Salzberg SL. TopHat: Discovering splice junctions with RNA-seq. Bioinformatics. 2009;25:1105–1111. doi: 10.1093/bioinformatics/btp120.
  • 77.Kent WJ, et al. The human genome browser at UCSC. Genome Res. 2002;12:996–1006. doi: 10.1101/gr.229102.
  • 78.Law CW, Chen Y, Shi W, Smyth GK. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014;15:R29. doi: 10.1186/gb-2014-15-2-r29.
  • 79.Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative genomics viewer (IGV): High-performance genomics data visualization and exploration. Brief Bioinform. 2013;14:178–192. doi: 10.1093/bib/bbs017.
  • 80.Renaud G, Stenzel U, Maricic T, Wiebe V, Kelso J. deML: Robust demultiplexing of Illumina sequences using a likelihood-based approach. Bioinformatics. 2015;31:770–772. doi: 10.1093/bioinformatics/btu719.
  • 81.Dobin A, et al. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635.
  • 82.Macosko EZ, et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell. 2015;161:1202–1214. doi: 10.1016/j.cell.2015.05.002.
  • 83.Kharchenko P, Fan J. 2016 scde: Single Cell Differential Expression, Version 2.0.1. Available at pklab.med.harvard.edu/scde. Accessed January 20, 2017.
  • 84.Robinson MD, McCarthy DJ, Smyth GK. edgeR: A bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616.
  • 85.Lun AT, Bach K, Marioni JC. Pooling across cells to normalize single-cell RNA sequencing data with many zero counts. Genome Biol. 2016;17:75. doi: 10.1186/s13059-016-0947-7.
  • 86.Donaldson J. 2016 tsne: T-Distributed Stochastic Neighbor Embedding for R (t-SNE), Version 0.1-3. Available at https://github.com/jdonaldson/rtsne/. Accessed January 20, 2017.

Associated Data

Supplementary Materials

Supplementary File
pnas.1708341114.sd01.xlsx (440.1KB, xlsx)
Supplementary File
pnas.1708341114.sd02.xlsx (100.6KB, xlsx)

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences