Summary
A complete knockout of a single key pluripotency gene may drastically affect embryonic stem cell function and epigenetic reprogramming. In contrast, elimination of only one allele of a single pluripotency gene is mostly considered harmless to the cell. To understand whether complex haploinsufficiency exists in pluripotent cells, we simultaneously eliminated a single allele in different combinations of two pluripotency genes (i.e., Nanog+/−;Sall4+/−, Nanog+/−;Utf1+/−, Nanog+/−;Esrrb+/− and Sox2+/−;Sall4+/−). Although these double heterozygous mutant lines contribute to chimeras similarly to controls, fibroblasts derived from these systems show a significant decrease in their ability to induce pluripotency. Tracing the stochastic expression of Sall4 and Nanog at early phases of reprogramming could not explain the observed delay or blockage. Further exploration identified abnormal methylation around pluripotent and developmental genes in the double heterozygous mutant fibroblasts, which could be rescued by a hypomethylating agent or high OSKM levels. This study emphasizes the importance of maintaining two intact alleles for pluripotency induction.
Keywords: haploinsufficiency, pluripotent stem cells, reprogramming, methylation, stochastic expression, knockin/knockout targeting approach, nuclear transfer, tracing system, reporter genes
Highlights
• PSCs with complex haploinsufficiency maintain their developmental potential
• Fibroblasts with one allele elimination in two PSC genes show reprogramming delay
• Fibroblasts with one allele elimination in two PSC genes display methylation defect
• The KI/KO approach to introduce a reporter gene should be used carefully
In this article, Buganim and colleagues investigate the effect of complex haploinsufficiency on PSCs. By eliminating single alleles in different combinations of two key pluripotency genes, they show that, although mutant PSCs can contribute equally to chimeras, their derived fibroblasts exhibit reprogramming blockage. Abnormal methylation around vital genes underscores the necessity of maintaining two intact alleles for effective pluripotency induction.
Introduction
Embryonic development and cell fate induction require appropriate gene dosage for the activation of the regulatory circuits that control cellular identity.
While a complete knockout (KO) of an important gene may be detrimental to the cell, as seen for Oct4 and Sox2 (Masui et al., 2007; Nichols et al., 1998), a complete KO of other genes such as Nanog partially maintains the pluripotent state of the cells and permits contribution to chimeras, yet dramatically reduces the efficiency with which their fibroblast derivatives reprogram to induced pluripotent stem cells (iPSCs), a defect that can only be partially overcome by high levels of exogenous OCT4, SOX2, KLF4, and MYC (OSKM) factors (Carter et al., 2014; Schwarz et al., 2014). In contrast, elimination of only one allele in one gene is considered harmless to the cell.
Given this assumption, many fluorescent reporter cell lines have been generated over the years using the knockin/KO (KI/KO) approach, leaving only one intact allele of the targeted gene. Such reporter lines (e.g., Sox2 [Arnold et al., 2011], Nanog [Wernig et al., 2008], and Utf1 [Morshedi et al., 2013]) are useful to study pluripotency acquisition following reprogramming and nuclear transfer (Buganim et al., 2012, 2014; Boiani et al., 2002). Although one allele elimination is considered safe, there are rare cases when a reduction in expression of approximately 50% is detrimental to the cell, a phenomenon termed haploinsufficiency. Moreover, even when one allele elimination is not detrimental to the cells, our previous study suggests that reduced expression levels of genes such as Nanog may result in suboptimal reprogramming, producing low-quality iPSCs (Buganim et al., 2014).
During the maturation phase of the reprogramming process, epigenetic changes happen stochastically to eventually allow expression of the first pluripotent-related genes (Buganim et al., 2013; David and Polo, 2014). Using single-cell analyses, it has been shown that stochastic low expression of pluripotent genes such as Utf1, Esrrb, Sall4 (Buganim et al., 2012), and Nanog (Polo et al., 2012) can be observed early on in the process in a small fraction of induced cells which is correlated with the low efficiency of reprogramming. The stochastic behavior of the maturation phase ends with the activation of late pluripotent genes such as Sox2, Dppa4, Prdm14, and Gdf3 (Buganim et al., 2012; Soufi et al., 2012) which unleashes the final deterministic phase, leading to iPSC stabilization (Buganim et al., 2013).
While efforts to understand the link between exogenous pluripotent reprogramming factors, iPSC quality, and efficiency have been substantial (Benchetrit et al., 2019; Buganim et al., 2014; Carey et al., 2011; Sebban and Buganim, 2015), studies focusing on the effect of reduced levels of endogenous pluripotency genes are lacking and mostly rely on single-gene KO or haploid embryonic stem cell (ESC) systems (Elling et al., 2019; Leeb and Wutz, 2011). Given this, we sought to examine whether a complex haploinsufficiency (i.e., insufficiency induced by the elimination of one allele in combinations of genes) exists in pluripotent cells and whether and how it may affect their developmental potential and their cell derivatives.
To address that, we engineered three secondary systems, NGFP2 (Nanog-GFP#2 [Wernig et al., 2008]), NGFP1 (Nanog-GFP#1 [Wernig et al., 2008]), and SGFP1 (Sox2-GFP#1) to incorporate KO of one allele in two different pluripotent genes. These double heterozygous mutant lines include NGFP2 (Nanog+/−;Sall4+/−, Nanog+/−;Esrrb+/− and Nanog+/−;Utf1+/−), NGFP1 (Nanog+/−;Sall4+/−), and SGFP1 (Sox2+/−;Sall4+/−). Interestingly, while all double heterozygous mutant lines contributed to chimeras similarly to their parental iPSC controls (i.e., NGFP2 [Nanog+/−], NGFP1 [Nanog+/−], and SGFP1 [Sox2+/−]), multiple derivations of fibroblasts from these lines resulted in poor reprogramming efficiency. This reduced reprogramming efficiency was evident in the nuclear transfer (NT) technique as well.
Tracing the stochastic expression of Sall4 or Nanog along the reprogramming process revealed that only a very small fraction of cells activated these loci, a result that cannot explain the global reprogramming blockage seen in the double heterozygous mutant lines. We then profiled the CpG-rich methylation landscape of fibroblasts derived from SGFP1S2+/−;S4+/− and SGFP1S2+/− control, and noted a clear difference in the methylation levels of multiple developmental and pluripotent loci in the double heterozygous mutant fibroblasts. Accordingly, treating all double heterozygous mutant fibroblasts for 2 days before factor induction with 5-azacytidine rescued the reprogramming blockage and allowed the induction of pluripotency. This study emphasizes the importance of having two intact alleles for proper pluripotency induction and normal embryonic development, and raises a concern regarding the often used KI/KO technique for the purpose of introducing reporters.
Results
Double heterozygous mutant pluripotent cells contribute to chimeras and exhibit modest transcriptional changes
Considering the vital role of functioning core ESC circuitry to pluripotency, we hypothesized that even a slight decrease in the expression of key pluripotency genes could significantly impact the developmental potential of the cells or the ability of their somatic cell derivatives to undergo reprogramming. We focused our research on secondary iPSC systems (i.e., iPSC clones that harbor functional doxycycline (dox)-inducible OSKM factor integrations in their genome), as these systems contribute to chimeras and exhibit stable and reproducible reprogramming efficiency by minimizing cell heterogeneity (Wernig et al., 2008).
We targeted the NGFP2 secondary system, as it already contains a single KI/KO allele of Nanog (Wernig et al., 2008). We chose to eliminate a single allele of Esrrb, Utf1, or Sall4 as they have all been shown to be important for pluripotency and reprogramming (Buganim et al., 2012; Feng et al., 2009; Tsubooka et al., 2009). To produce a single allele KO and to be able to monitor the activity of the targeted allele, we designed donor vectors in which a tdTomato reporter is fused, in frame, to the first or second exon (Figures 1A and 1B). To avoid exon skipping and to destabilize the targeted mRNA, polyA was omitted from the targeting vectors. Electroporated colonies were examined for correct targeting by Southern blots using external or internal probes (Figure 1C). Overall, we isolated two correctly targeted clones for each combination of manipulated genes: Nanog+/−; Esrrb+/− (NGFP2N+/−;E+/−, clones# 1 and 5), Nanog+/−; Utf1+/− (NGFP2N+/−;U+/−, clones# 3 and 5) and Nanog+/−; Sall4+/− (NGFP2N+/−;S+/−, clones# 3 and 5). To validate the reduced levels of the targeted genes, we cultured the cells in 2i/L medium (GSK3β and MEK inhibitors and Lif) that recapitulates the ground pluripotent state and facilitates gene expression from both alleles (Miyanari and Torres-Padilla, 2012). qPCR and western blot analyses demonstrated a reduction of approximately 50% in the total mRNA or protein levels of all targeted genes (Figures 1D and S1A), but not in other key pluripotency genes such as Oct4, Sox2, Lin28, Fbxo15, and Fgf4 (Figure S1B). Some further reduction in the protein level of NANOG and ESRRB was seen in NGFP2N+/−;U+/− and NGFP2N+/−;S+/− iPSC lines (Figure 1D) and in the mRNA of the Dppa3 gene in the NGFP2N+/−;S+/− line (Figure S1A). These results suggest that Nanog and Esrrb are either direct or indirect targets of SALL4 and UTF1 and that Dppa3 is regulated by SALL4.
To test the stability of the targeted alleles, cells grown in either serum/Lif (S/L) or 2i/L conditions were analyzed for GFP and tdTomato activity using flow cytometry. In agreement with the western blot analysis, cells grown under S/L conditions exhibited 68% GFP reporter activity (reporter that was introduced in frame and contains polyA) in NGFP2N+/− control and NGFP2N+/−;E+/− iPSC lines, and 55% and 58% in NGFP2N+/−;S+/− and NGFP2N+/−;U+/− iPSC lines, respectively (Figure 1E). In accordance with our strategy, tdTomato activity for all targeted genes was minor (Figure 1E). Nanog-GFP and tdTomato reporters showed improved activation under 2i/L conditions in all clones, but a reduced percentage remained in the double heterozygous mutant iPSC lines (Figure S1C).
To investigate the impact of eliminating a single allele in two different pluripotent genes on the developmental potential of the cells, we injected the cells into blastocysts and measured their potential to form chimeric mice. A comparable grade of chimerism was noted between all double heterozygous mutant and control iPSC lines, suggesting that elimination of a single allele in these combinations of two genes does not exert a significant developmental barrier (Figure S2A).
Gene expression can distinguish between iPSCs with poor, low, and high quality as assessed by grade of chimerism and 4n complementation assay (Buganim et al., 2014). Thus, we profiled the transcriptome of the three double heterozygous mutant lines, as well as the parental NGFP2N+/− cells and wild-type (WT) ESCs (V6.5), grown in either S/L or 2i/L conditions. A Pearson correlation heatmap clustered the cells into two main groups based on the culture conditions. Nevertheless, within the S/L group some changes in gene expression were noted in NGFP2N+/−;S+/− and NGFP2N+/−;U+/− compared with NGFP2N+/−;E+/−, parental NGFP2N+/−, and control WT ESCs (Figure S2B). Given that Esrrb has been identified as a downstream target gene of NANOG (Festuccia et al., 2012), it is unsurprising that minimal transcriptional changes were observed between the parental NGFP2N+/− and NGFP2N+/−;E+/− lines. Principal component analysis (PCA) validated the Pearson correlation heatmap, separating S/L conditions from 2i/L conditions by PC1 and NGFP2N+/−;S+/− and NGFP2N+/−;U+/− that were grown under S/L conditions from the rest of the samples by PC2 (Figure S2C). Interestingly, NGFP2N+/−;U+/− grown under S/L conditions clustered closer to samples that grew under 2i/L conditions as indicated by PC1 (Figure S2C). In contrast, cells grown under 2i/L conditions clustered together with minimal transcriptional changes between them (Figure S2C). Considering the expression differences among the lines grown under S/L conditions, we performed a differential expression analysis (p < 0.05, 2-fold change) comparing the control cells with all the double heterozygous mutant iPSC lines. This analysis revealed 1,604 genes with differential expression between the control groups and at least one double heterozygous mutant line (Table S1).
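The selection rule described above (p < 0.05, at least a 2-fold change, differential in at least one double heterozygous mutant line) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the record layout, the pseudocount, the function names, and the toy gene names and values are all assumptions made for illustration, and the p values are taken as given from whatever upstream test produced them.

```python
import math
from typing import Dict, List, NamedTuple, Set


class GeneResult(NamedTuple):
    """One gene compared between the control group and one mutant line (hypothetical layout)."""
    gene: str
    mean_control: float  # mean normalized expression in the control group
    mean_mutant: float   # mean normalized expression in the mutant line
    p_value: float       # p value supplied by an upstream statistical test


def is_differential(r: GeneResult, p_cutoff: float = 0.05,
                    fold_cutoff: float = 2.0, pseudocount: float = 1.0) -> bool:
    """True if the gene passes both the p value and the fold-change threshold.

    The pseudocount (an assumption) avoids division by zero for unexpressed genes.
    """
    log2_fc = math.log2((r.mean_mutant + pseudocount) / (r.mean_control + pseudocount))
    return r.p_value < p_cutoff and abs(log2_fc) >= math.log2(fold_cutoff)


def differential_in_any_line(results_by_line: Dict[str, List[GeneResult]]) -> Set[str]:
    """Union of genes that are differential in at least one mutant line."""
    hits: Set[str] = set()
    for results in results_by_line.values():
        hits.update(r.gene for r in results if is_differential(r))
    return hits


# Toy input with placeholder gene names and made-up values.
toy = {
    "N+/-;S+/-": [GeneResult("geneA", 10.0, 50.0, 0.001),
                  GeneResult("geneB", 10.0, 12.0, 0.001)],
    "N+/-;U+/-": [GeneResult("geneB", 40.0, 9.0, 0.01),
                  GeneResult("geneC", 10.0, 100.0, 0.2)],
}
print(sorted(differential_in_any_line(toy)))  # ['geneA', 'geneB']
```

Taking the union across lines mirrors the "at least one double heterozygous mutant line" criterion; requiring a gene to pass in every line would instead use a set intersection.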
Gene Ontology (GO) term analysis for this gene list, using EnrichR (Xie et al., 2021), includes “loss of function of Oct4 in ESCs,” “TGFβ regulation,” “abnormal heart position,” and “abnormal mesendoderm development” (Figure S2D). A gene regulatory network (GRN) constructed using iRegulon identified key pluripotent, mesodermal and neuronal developmental genes, such as Pou5f1, Pqbp1, Pax2, Bcl11a, and Zfp110 (Casademunt et al., 1999; Fotaki et al., 2008; Iwasaki and Thomsen, 2014; Simon et al., 2020), as major regulators of these aberrantly expressed 1,604 genes (Figure S2E). These results suggest that the elimination of one allele of two distinct pluripotent genes, while exhibiting some transcriptional changes under S/L conditions, still maintains a functional pluripotent state with minimal variations in gene expression in the ground pluripotent state.
Fibroblasts derived from NGFP2 double heterozygous mutant iPSC lines fail to induce pluripotency
Given that the reprogramming process involves a stochastic phase of activation of pluripotency genes (Buganim et al., 2012), we hypothesized that mouse embryonic fibroblasts (MEFs) harboring double heterozygous mutant alleles might exhibit reprogramming delay because of difficulties in the activation of the core pluripotency circuitry.
To that end, secondary MEFs were established from all three NGFP2 double heterozygous mutant lines and the control. To initiate reprogramming, MEFs were exposed to dox for 13 days followed by dox withdrawal for 3 more days to stabilize any iPSC colony, and the percentage of Nanog-GFP-positive cells was scored by flow cytometry.
NGFP2N+/− control induced MEFs exhibited the expected approximately 2% of Nanog-GFP-positive cells by the end of the reprogramming, while two independent clones from each double heterozygous mutant line showed a complete blockage (Figure 2A). Cell death and proliferation arrest were ruled out, as all double heterozygous mutant and control plates stained equally with crystal violet (Figure 2B), and with alkaline phosphatase, albeit to a lesser extent, indicating reprogramming initiation (Figure 2C). By extending dox exposure to 20 days, a small percentage of Nanog-GFP-positive cells did emerge in all double heterozygous mutant lines, suggesting that some cells can overcome this blockage when prolonged exposure to OSKM is triggered (Figure 2D).
We then asked whether the reprogramming defect can be rescued by exogenously expressing the targeted genes. Double heterozygous mutant MEFs were transduced with either Nanog or with its corresponding targeted gene (i.e., Sall4, Utf1 or Esrrb) or with additional viruses encoding for OSK and reprogramming was scored. Either Nanog or each of the corresponding factors showed partial or complete rescue of the reprogramming blockage, while additional OSK further boosted the reprogramming process (Figures 2E, S2F, and S2G). Given that reduced levels of ESRRB were noted in all the double heterozygous mutant iPSC lines (Figure 1D), we asked whether ectopic expression of Esrrb can rescue all the mutant MEF lines. While additional expression of Esrrb could rescue NGFP2N+/−;E+/− and NGFP2N+/−;U+/−, it had only a mild, although significant, effect on NGFP2N+/−;S+/− (Figure S2H). Similarly, ectopic expression of Sall4 rescued only some of the lines, but not others (Figure S2I). These data suggest that the observed blockage is not specific to a unique allele elimination, but rather is associated with a broader effect that can be overcome only by high levels of pluripotent factors, such as OSK.
We then explored whether the observed reprogramming blockage is specific to the reprogramming by defined factors or if it would persist in other reprogramming techniques, such as NT. Enucleated eggs were injected with MEF nuclei from each of the three double heterozygous mutant MEF lines and control. Blastocyst formation and establishment of ESC lines were scored. Notably, while all lines exhibited a comparable and expected efficiency in producing blastocysts, the efficiency of ESC line derivation was significantly lower in the double heterozygous mutant lines compared with controls (i.e., 0%–4% vs. 11% in control lines) (Figures 2F and 2G). These results suggest that eliminating one allele from each of two distinct key pluripotency genes impacts the somatic nucleus in a manner that hinders its ability to undergo reprogramming to pluripotency.
NGFP2N+/− double heterozygous mutant lines show an early defect in the activation of epithelial markers
We next profiled the transcriptome of the three double heterozygous mutant lines and control lines (i.e., NGFP2N+/− cells, and NGFP2N+/− cells that were infected with empty vector) after 6 days of reprogramming. We chose this time point as it showed a clear reprogramming delay in the double heterozygous mutant plates compared with control plates. NGFP2N+/− MEFs and the parental NGFP2N+/− iPSCs were profiled as well. Hierarchical clustering analysis showed that all the double heterozygous mutant lines clustered together and were different from the control lines (Figure 3A). PCA and scatterplots demonstrate significant transcriptional changes by day 6 of reprogramming between the double heterozygous mutant lines and controls (Figures 3B–3D). Notably, all the double heterozygous mutant lines exhibited minimal transcriptional changes both among themselves and when compared with NGFP2N+/− MEFs, indicating the presence of an early reprogramming defect.
Differential expression analysis between the control groups and all the double heterozygous mutant lines identified 294 genes (p < 0.05, 2-fold change) that are upregulated solely in the control groups and 18 genes that are upregulated exclusively in the double heterozygous mutant lines (Figure S3A; Table S1). GO term analysis for the 294 genes of the control groups identified “epithelial cells,” “EMT,” “tight junction,” and “intermediate filament” as the most enriched terms (Figure S3B), suggesting the acquisition of an epithelial identity via mesenchymal to epithelial transition (MET). Accordingly, GRN analysis using iRegulon identified key reprogramming and MET factors such as GLIS1 (Scoville et al., 2017) and GATA2 (Shu et al., 2015) as key regulators for these 294 genes (Figure S3C). GO term analysis of the 18 genes of the double heterozygous mutant lines identified “JUND” as one of the most significant regulators of this gene list and “serotonin receptor signaling” as the most enriched pathway (Figure S3D). Of note, the AP1 family of proteins was previously suggested to act as the safeguard of the fibroblast identity (Jaber et al., 2020; Liu et al., 2015).
Given these analyses, we examined the expression levels of well-known fibroblastic markers (Thy1, Col5a1, Postn, and Des) and EMT regulators (Twist1, Zeb1, Snai2, and Foxc2), and noticed a comparable downregulation between the control and the double heterozygous mutant lines (Figures 3E and S3E). In contrast, the double heterozygous mutant lines failed to express epithelial genes such as Cdh1, Dsp, Epcam, Cldn4, and Cldn7 (Figures 3F and S3F), suggesting late MET blockage.
Reprogramming impairment caused by double heterozygous allele elimination is not restricted to a system or to the identity of the modified alleles
To exclude the possibility that the observed effect is system specific, we used an additional secondary iPSC system, NGFP1N+/−, which differs in its reprogramming efficiency, dynamics, and factor stoichiometry (Wernig et al., 2008).
As NGFP2N+/−;S+/− demonstrated the strongest delay in pluripotency induction, we sought to eliminate one allele of Sall4 in NGFP1N+/− as well. Initially, we confirmed by single molecule mRNA-fluorescence in situ hybridization (sm-mRNA-FISH) that the strong effect seen in NGFP2N+/−;S+/− is a result of an approximately 50% decrease in the transcript levels of Sall4 (Figure 4A).
Then, we targeted a tdTomato reporter gene into the Sall4 locus of NGFP1N+/− as described above (Figure 4B). Correctly targeted NGFP1N+/−;S+/− iPSC colonies were validated by PCR and western blot (Figures 4C and 4D). We also produced a Nanog KO NGFP1N−/– line as a single KO gene control (Figures 4E, 4F, and S4A). Secondary MEFs were produced from NGFP1N+/−, NGFP1N+/−;S+/−, and NGFP1N−/–, which were then exposed to dox for 13 days followed by 3 days of dox removal. Flow cytometry analysis of the various reprogramming plates showed a clear and comparable reduction in the percentage of Nanog-GFP-positive cells in NGFP1N+/−;S+/− and NGFP1N−/–-induced cells compared with control NGFP1N+/− cells (Figure 4G). As in the NGFP2N+/− system, exogenous expression of Nanog rescued NGFP1N+/−;S+/− double heterozygous mutant cells (Figures 4G and 4H).
We then asked whether the observed impairment in pluripotency induction is restricted to combinations that harbor allele elimination of Nanog. To that end, we eliminated one allele of Sall4 in the SGFP1S2+/− line, a secondary iPSC system that was generated in our laboratory and contains a GFP reporter instead of one allele of Sox2. Correctly targeted SGFP1S2+/−;S4+/− iPSC colonies were validated by PCR, western blot, and immunostaining (Figures 4I, 4J, and S4B). As expected, and unlike the NGFP2/1 double heterozygous mutant lines (Figures 1D and S4C), SGFP1S2+/−;S4+/− did not show reduction of ESRRB levels (Figure S4C). Nevertheless, a significant reduction in reprogramming efficiency was noted in SGFP1S2+/−;S4+/− cells compared with SGFP1S2+/− controls (Figures 4K–4M). It is interesting, however, to note that while all the double heterozygous NGFPN+/− lines produced a negligible number of iPSCs following 13 days of reprogramming (i.e., 0.0%–0.2%), the SGFP1S2+/−;S4+/− double heterozygous mutant cells produced approximately 2%–2.5% of iPSCs. This difference can be explained by the levels of the Oct4 transgene, which are much higher in SGFP1S2+/− cells compared with the NGFPN+/− cells (Figure S4D). Taken together, these results suggest that the reprogramming blockage seen in the double heterozygous mutant lines is specific neither to a system nor to a particular combination of eliminated alleles.
Reduced early stochastic expression of the targeted genes cannot explain the reprogramming blockage seen in the double heterozygous mutant lines
Stochastic expression of pluripotency genes during early stages of reprogramming has been shown by multiple single-cell studies (Buganim et al., 2012; Guo et al., 2019). Thus, we hypothesized that the lack of two key pluripotency alleles in the double heterozygous mutant cells might impair their ability to pass the early stochastic phase. To explore this, we generated tracing systems for Nanog and Sall4, as they both exhibit high stochastic activity at early stages of reprogramming (Buganim et al., 2012).
We targeted a 2A-EGFP-ERT-CRE-ERT cassette into the 3′ UTR of Sall4 or Nanog using an ESC line that contains a lox-STOP-lox (L-S-L) cassette upstream of a tdTomato reporter gene and an M2rtTA transactivator at the Rosa26 locus (Figures 5A and 5B). Transfected colonies were sorted based on EGFP expression and correct targeting was validated by PCR (Figures 5C and 5D). Correctly targeted ESC clones (i.e., RL8 for Sall4 and RL9 for Nanog) were exposed to tamoxifen (Tam) and the percentage of tdTomato-positive cells was scored by flow cytometry (Figures 5E, 5F, and S5A–S5D), demonstrating high L-S-L cassette removal efficiency.
For the stochastic expression of the targeted alleles to account for the observed delay, most induced cells should show some activation of the targeted alleles at an early time point of reprogramming.
MEFs produced from Sall4 and Nanog tracing ESC systems were transduced with dox-inducible OSKM cassette and tdTomato activation was assessed in the induced cells after 6 days and after 13 days of reprogramming followed by 3 days of dox removal. Only up to 0.24% of the Sall4 tracing cells and up to 0.62% of Nanog tracing cells were tdTomato-positive at day 6 of reprogramming, ruling out the possibility that Sall4 or Nanog stochastic expression early in reprogramming is responsible for the observed blockage (Figures 5G–5I and 5J–5L). In addition, 7.42% of SALL4-2A-EGFP in conjunction with 7.96% of tdTomato-positive cells for the Sall4 tracing system and 2.8% of NANOG-2A-EGFP together with 6.7% of tdTomato-positive cells for the Nanog tracing system at the end of the reprogramming process confirmed successful reprogramming (Figures 5M and 5N). We also explored the ability of NANOG or SALL4-positive cells (i.e., tdTomato cells) to mark reprogrammed cells. On day 6 of reprogramming, tdTomato-positive cells were sorted and reseeded on a feeder layer for continuous reprogramming with dox and Tam. Indeed, both NANOG and SALL4 demonstrated significant enrichment for reprogrammed cells (Figures S5E–S5H). In conclusion, this set of experiments challenges the notion that reduced stochastic expression of the targeted pluripotent alleles is responsible for the early reprogramming blockage.
Methylation abnormalities in the double heterozygous mutant fibroblasts are correlated with reprogramming impairment
The fact that additional exogenous expression of OSK factors rescued the phenotype of the double heterozygous mutant cells (Figure S2G) suggests that epigenetic abnormalities, rather than the elimination of the targeted alleles themselves, are responsible for the observed blockage. Given the crucial role of DNA methylation in reprogramming, we hypothesized that the double heterozygous mutant MEFs might harbor abnormal DNA methylation that hinders their ability to undergo reprogramming. To test this hypothesis, SGFP1S2+/−;S4+/− MEFs and control SGFP1S2+/− MEFs were subjected to reduced representation bisulfite sequencing (RRBS).
Methylation analysis revealed that the two MEF lines are very similar with regard to their CpG-enriched methylation landscape, suggesting that overall the double heterozygous mutant cells harbor a correct fibroblastic methylation landscape, comprising approximately 1,900,000 sites that are shared with the control MEFs. However, read counts did vary between samples and so did reads per site, clustering them as two different groups (Figure 6A). Differentially methylated regions (DMRs) were defined as consecutive 100-bp tiles of CpG sites that include at least 15 reads and show at least a 20% methylation difference between the two MEF lines. All DMRs had an adjusted p value of 1e-3 or lower. This analysis yielded two groups of DMRs: (i) 1,263 tiles that are more methylated and (ii) 1,384 tiles that are less methylated in the double heterozygous mutant MEFs compared with controls (Figures 6B and 6C). We then associated each DMR with its neighboring gene and ran GO term analysis. Interestingly, many of the DMRs were found to be associated with “loss of function of Oct4” and with “Hippo signaling” (Figures 6D and 6E), suggesting that the loss of the indicated two pluripotency alleles in the pluripotent state might result in abnormal differentiation and DNA methylation later on in their somatic cell derivatives.
To confirm that DNA methylation abnormalities are responsible for the reprogramming delay, double heterozygous mutant MEFs from all systems were treated for two days with 5-Aza-2′-deoxycytidine (5′azaDC) and reprogramming experiments were carried out. In agreement with the RRBS results, treatment with 5′azaDC rescued the reprogramming defect (Figure 6F).
We then correlated the 1,604 differentially expressed genes identified through the comparison between NGFP2N+/− control iPSCs and the double heterozygous mutant iPSC lines (Figure S2D) with the genes affected by methylation in SGFP1S2+/−;S4+/− MEFs. A significant overlap was observed, with 53 genes displaying hypermethylation and 69 genes showing hypomethylation in SGFP1S2+/−;S4+/− MEFs (p < 0.00001) (Figures S6A and S6B; Table S1). This overlap was particularly enriched in pathways governing fibroblastic identity, such as “MEFs,” “FGF signaling,” and “fibrosis,” and was further associated with regulation by pluripotency factors such as “OCT4,” “TCF3,” “SOX2,” and “NANOG” (Figures S6C and S6D). GRN analysis conducted for both gene lists identified the pluripotency factor OCT4 as a major regulator of the 53 hypermethylated genes, along with the TGFβ pathway protein SMAD1 and the homeobox protein member NKX2-1. Furthermore, the analysis pinpointed critical early developmental factors such as PAX2, FOXA1, E2F1, and the homeobox protein CDX4 as major regulators of the 69 hypomethylated genes (Figures S6E and S6F). These findings collectively suggest that reduced pluripotency gene levels during the pluripotent state may lead to methylation abnormalities in regions critical for the function of somatic cell derivatives, and this process is mediated by both pluripotent and key developmental regulators.
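The DMR criteria stated above (100-bp tiles, at least 15 reads, at least a 20% methylation difference, adjusted p value of 1e-3 or lower) amount to a per-tile filter followed by a split into hyper- and hypomethylated groups. The following Python sketch illustrates that logic only. The `Tile` layout, the function names, and the toy values are hypothetical, the coverage threshold is assumed to apply to each sample separately, and the adjusted p values are assumed to come from an upstream differential methylation test that is not reproduced here.

```python
from typing import List, NamedTuple, Optional, Tuple


class Tile(NamedTuple):
    """One 100-bp tile compared between control and mutant MEFs (hypothetical layout)."""
    chrom: str
    start: int
    cov_control: int     # reads covering the tile in control MEFs
    meth_control: float  # percent methylation (0-100) in control MEFs
    cov_mutant: int      # reads covering the tile in mutant MEFs
    meth_mutant: float   # percent methylation (0-100) in mutant MEFs
    p_adj: float         # adjusted p value from an upstream test


def classify_tile(t: Tile, min_reads: int = 15, min_diff: float = 20.0,
                  max_p: float = 1e-3) -> Optional[str]:
    """Return 'hyper', 'hypo', or None according to the thresholds in the text."""
    if t.cov_control < min_reads or t.cov_mutant < min_reads:
        return None  # insufficient coverage (assumed to be required in both samples)
    diff = t.meth_mutant - t.meth_control
    if abs(diff) < min_diff or t.p_adj > max_p:
        return None
    return "hyper" if diff > 0 else "hypo"


def split_dmrs(tiles: List[Tile]) -> Tuple[List[Tile], List[Tile]]:
    """Split tiles into (more methylated in mutant, less methylated in mutant)."""
    hyper = [t for t in tiles if classify_tile(t) == "hyper"]
    hypo = [t for t in tiles if classify_tile(t) == "hypo"]
    return hyper, hypo


# Toy tiles with made-up coordinates and values.
toy_tiles = [
    Tile("chr1", 1000, 30, 10.0, 25, 45.0, 1e-5),  # passes, more methylated in mutant
    Tile("chr1", 1100, 30, 80.0, 40, 50.0, 1e-4),  # passes, less methylated in mutant
    Tile("chr2", 500, 10, 10.0, 40, 90.0, 1e-6),   # fails coverage
    Tile("chr2", 600, 30, 50.0, 30, 60.0, 1e-6),   # fails the 20% difference
]
hyper, hypo = split_dmrs(toy_tiles)
print(len(hyper), len(hypo))  # 1 1
```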
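The text reports a significant overlap between the two gene lists but does not name the statistical test. One common choice for this kind of question is a one-sided hypergeometric test, sketched below in Python as an assumption rather than as the method used here. The function name and the toy numbers are illustrative; the background gene count and the DMR-associated list sizes needed to reproduce the reported p value are not given in the text and are not guessed.

```python
from math import comb


def overlap_p_value(population: int, list_a: int, list_b: int, overlap: int) -> float:
    """One-sided hypergeometric p value, P(X >= overlap).

    population: number of background genes considered
    list_a:     size of the first gene list (e.g., differentially expressed genes)
    list_b:     size of the second gene list (e.g., DMR-associated genes)
    overlap:    number of genes present in both lists
    """
    total = comb(population, list_b)
    tail = sum(
        comb(list_a, k) * comb(population - list_a, list_b - k)
        for k in range(overlap, min(list_a, list_b) + 1)
    )
    return tail / total


# Toy example: 20 background genes, two lists of 5 genes sharing 3 genes.
p = overlap_p_value(population=20, list_a=5, list_b=5, overlap=3)
print(f"p = {p:.4f}")
```

Exact integer binomials keep the sketch dependency-free; for genome-scale lists a statistics library's hypergeometric survival function would be the more practical choice.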
Discussion
PSCs in 2i/L culture are less affected by differentiation cues due to robust inhibitor-based protection. Conversely, those in S/L conditions are more prone to differentiation signals, resulting in greater transcriptome heterogeneity. In this scenario, any pluripotency gene expression dysregulation can disrupt pluripotency maintenance, potentially affecting somatic cell derivative development.
Here, using PSCs as a test model, we aimed to understand how reduced levels of pluripotency genes affect cell function. We deleted a single allele from various combinations of two pluripotency genes (i.e., Nanog+/−;Sall4+/−, Nanog+/−;Esrrb+/−, Nanog+/−;Utf1+/−, and Sox2+/−;Sall4+/−) and used different PSC systems to exclude any system-specific effect.
Interestingly, while examination of the developmental potential of the cells did not reveal a significant difference between the double heterozygous mutant cells and their parental controls, fibroblasts derived from the double heterozygous mutant pluripotent cells demonstrated a strong delay in their capability to induce pluripotency either by transcription factors or by NT. The poor reprogramming efficiency observed between the various pluripotent stem cell systems ranged from a complete blockage at the MET transition (NGFP2 line) to a later blockage at the stabilization step just before the acquisition of pluripotency (NGFP1 and SGFP1 lines).
Given that the affected genes were shown to play a major role during the stochastic phase of the reprogramming process, we examined the possibility that reduced stochastic expression of the targeted genes hinders the capability of the cells to pass the stochastic phase and to induce pluripotency. To support this hypothesis, one should show that the activation of the Sall4 or Nanog allele is a frequent event and occurred in most induced cells at early stages of reprogramming. Using tracing systems for Nanog and Sall4 we show that, only a small number of induced cells could activate the targeted alleles following 6 days of factor induction, suggesting that reduced stochastic expression of these genes is not responsible for the global reprogramming delay seen in the double heterozygous mutant cells.
Additional expression of multiple pluripotency genes (e.g., Sall4, Nanog, Utf1, Esrrb, and OSK) can either partially or fully rescue the observed blockage; we therefore hypothesized that an epigenetic barrier in the double heterozygous mutant fibroblasts may cause the observed delay. Indeed, CpG-enriched DNA methylation analysis demonstrated a clear difference in DNA methylation levels in regions within pluripotency and developmental genes between the two fibroblast lines, suggesting that even a 50% reduction in the levels of two pluripotency genes is sufficient to induce aberrant DNA methylation during development. In fact, although Oct4 expression was unaffected in the iPSCs, GO enrichment analysis of the derived MEFs revealed the loss of Oct4’s core pluripotency function. This discrepancy can be attributed to the reduced levels of key pluripotency genes in the iPSCs, including Nanog, Sox2, Sall4, and Esrrb, which are known to regulate the core DNA methylation machinery (Adachi et al., 2018; Shanak and Helms, 2020; Tan et al., 2013).
These findings may have implications beyond pluripotency and reprogramming. Our data indicate that even a 50% reduction in the levels of two pluripotency genes can have significant consequences during embryonic development. This mechanism may offer a potential explanation for unresolved cases of spontaneous abortion or improper development.
Fluorescent reporter genes are widely used to monitor the activity of a gene, regulatory element, or other element in the genome. One of the most common ways to introduce a reporter gene in a locus-specific manner is the KI/KO approach. In this technique, the genomic sequence of the element of interest is replaced by the coding sequence of the reporter gene, leaving only one intact allele of the targeted element. Our research highlights the potentially harmful impact of eliminating even a single allele within targeted cells. Consequently, alternative techniques that introduce a reporter into the gene of interest without disrupting its coding sequence, such as self-cleaving 2A peptides or an internal ribosome entry site, offer notable benefits. Collectively, our findings underscore the importance of maintaining two intact alleles for optimal cellular function.
Experimental procedures
Resource availability
Corresponding author
Further information and requests for resources and reagents should be directed to and will be fulfilled by the corresponding author, Yosef Buganim (yossib@ekmd.huji.ac.il).
Materials availability
All unique/stable reagents generated in this study are available from the lead contact with a completed Materials Transfer Agreement.
Experimental model and subject details
This research was performed in compliance with the joint ethics committee (IACUC) of the Hebrew University and Hadassah Medical Center. The Hebrew University is an AAALAC International-accredited institute.
Quantification and statistical analyses
Statistical analysis was performed by two-tailed unpaired t test using GraphPad Prism (version 8.3.0). All data are presented as mean ± SD. p < 0.05 was considered statistically significant. Sufficient sample size was estimated without the use of a power calculation. Data analysis was not blinded.
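For readers who wish to check a comparison outside GraphPad Prism, the test described above can be approximated with a short script. The following is a minimal sketch, not the authors' analysis pipeline: it assumes the equal-variance (Student's) form of the unpaired t test, which the text does not specify, and the function names `t_pdf`, `unpaired_t_test`, and `mean_sd` as well as the example values are hypothetical illustrations rather than data from this study.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def unpaired_t_test(a, b, steps=10_000):
    """Two-tailed unpaired t test, assuming equal variances in both groups.

    Returns (t statistic, degrees of freedom, two-tailed p value).
    The p value is obtained by numerically integrating the t density,
    so it is an approximation suitable for small-sample sanity checks.
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / se

    # Simpson's rule on [0, |t|]; steps must be even.
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # probability of 0 < T < |t|

    p = max(0.0, 1.0 - 2.0 * central_mass)
    return t, df, p


def mean_sd(values):
    """Mean and sample standard deviation, as used for 'mean ± SD' reporting."""
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Made-up colony counts, for illustration only.
    control = [1.0, 2.0, 3.0, 4.0, 5.0]
    mutant = [2.0, 3.0, 4.0, 5.0, 6.0]
    t, df, p = unpaired_t_test(control, mutant)
    m, sd = mean_sd(control)
    print(f"control: {m:.2f} ± {sd:.2f}; t = {t:.3f}, df = {df}, p = {p:.4f}")
    print("significant" if p < 0.05 else "not significant")
```

If the groups are expected to have unequal variances, Welch's correction would be the more appropriate choice; it changes both the standard error and the degrees of freedom and is not covered by this sketch.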
Acknowledgments
Y.B. is supported by research grants from the EMBO Young Investigator Programme (YIP), the Howard Hughes Medical Institute International Research Scholar program (HHMI, #55008727), and the Israel Science Foundation (ISF, 161/23), and by a generous gift from Ms. Nadia Guth Biasini. We thank Yuval Nevo and the HUJI core bioinformatics unit for analyzing part of the RNA-seq data.
Author contributions
Y.B. and R.J. conceived the study; Y.B. and R.L. designed the experiments, prepared the figures, and wrote the manuscript; Y.B. together with E.K., C.O., and D.F. generated the NGFP2N+/− double heterozygous mutant lines and ran the various reprogramming experiments on NGFP2N+/− lines; R.L. generated the tracing systems for Nanog and Sall4 and the NGFP1N+/−;S+/−, NGFP1N−/−, and SGFP1S2+/−;S4+/− lines, and performed reprogramming experiments on these lines, immunostaining, flow cytometry, and 5′azaDC experiments; N.M. prepared the samples for the RNA-seq at day 6 of reprogramming and performed qPCR for the MET genes; N.M. together with N.YT. ran rescue reprogramming experiments; N.M. performed sm-mRNA-FISH; A.W.C. analyzed the RNA-seq data from the various NGFP2N+/− iPSC lines; H.Y. performed NT experiments; S.M. and K.M. injected iPSC lines to produce secondary MEFs and chimeric mice; M.A. helped R.L. run reprogramming experiments and analyze the flow cytometry results; D.O. helped run the Esrrb rescue experiments and performed the iRegulon analysis.
Declaration of interests
The authors declare no competing interests.
Declaration of generative AI and AI-assisted technologies in the writing process
During the preparation of this work the author(s) used ChatGPT to improve language and readability. After using this tool/service, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.
Published: October 12, 2023
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.stemcr.2023.09.009.
Data and code availability
The accession number for the RNA-seq data for the various NGFP2N+/− double heterozygous mutant and control iPSC lines is "GEO: GSE182009". The accession number for the RNA-seq for NGFP2N+/− double heterozygous mutant and control MEF lines after 6 days of reprogramming and RRBS for the SGFP1S2+/− and SGFP1S2+/−;S4+/− primary MEFs is "GEO: GSE192655".
References
- Adachi K., Kopp W., Wu G., Heising S., Greber B., Stehling M., Araúzo-Bravo M.J., Boerno S.T., Timmermann B., Vingron M., Schöler H.R. Esrrb Unlocks Silenced Enhancers for Reprogramming to Naive Pluripotency. Cell Stem Cell. 2018;23:900–904. doi: 10.1016/j.stem.2018.11.009.
- Arnold K., Sarkar A., Yram M.A., Polo J.M., Bronson R., Sengupta S., Seandel M., Geijsen N., Hochedlinger K. Sox2(+) adult stem and progenitor cells are important for tissue regeneration and survival of mice. Cell Stem Cell. 2011;9:317–329. doi: 10.1016/j.stem.2011.09.001.
- Benchetrit H., Jaber M., Zayat V., Sebban S., Pushett A., Makedonski K., Zakheim Z., Radwan A., Maoz N., Lasry R., et al. Direct Induction of the Three Pre-implantation Blastocyst Cell Types from Fibroblasts. Cell Stem Cell. 2019;24:983–994.e7. doi: 10.1016/j.stem.2019.03.018.
- Boiani M., Eckardt S., Schöler H.R., McLaughlin K.J. Oct4 distribution and level in mouse clones: consequences for pluripotency. Genes Dev. 2002;16:1209–1219. doi: 10.1101/gad.966002.
- Buganim Y., Faddah D.A., Cheng A.W., Itskovich E., Markoulaki S., Ganz K., Klemm S.L., van Oudenaarden A., Jaenisch R. Single-cell expression analyses during cellular reprogramming reveal an early stochastic and a late hierarchic phase. Cell. 2012;150:1209–1222. doi: 10.1016/j.cell.2012.08.023.
- Buganim Y., Faddah D.A., Jaenisch R. Mechanisms and models of somatic cell reprogramming. Nat. Rev. Genet. 2013;14:427–439. doi: 10.1038/nrg3473.
- Buganim Y., Markoulaki S., van Wietmarschen N., Hoke H., Wu T., Ganz K., Akhtar-Zaidi B., He Y., Abraham B.J., Porubsky D., et al. The developmental potential of iPSCs is greatly influenced by reprogramming factor selection. Cell Stem Cell. 2014;15:295–309. doi: 10.1016/j.stem.2014.07.003.
- Carey B.W., Markoulaki S., Hanna J.H., Faddah D.A., Buganim Y., Kim J., Ganz K., Steine E.J., Cassady J.P., Creyghton M.P., et al. Reprogramming factor stoichiometry influences the epigenetic state and biological properties of induced pluripotent stem cells. Cell Stem Cell. 2011;9:588–598. doi: 10.1016/j.stem.2011.11.003.
- Carter A.C., Davis-Dusenbery B.N., Koszka K., Ichida J.K., Eggan K. Nanog-independent reprogramming to iPSCs with canonical factors. Stem Cell Rep. 2014;2:119–126. doi: 10.1016/j.stemcr.2013.12.010.
- Casademunt E., Carter B.D., Benzel I., Frade J.M., Dechant G., Barde Y.A. The zinc finger protein NRIF interacts with the neurotrophin receptor p75(NTR) and participates in programmed cell death. EMBO J. 1999;18:6050–6061. doi: 10.1093/emboj/18.21.6050.
- David L., Polo J.M. Phases of reprogramming. Stem Cell Res. 2014;12:754–761. doi: 10.1016/j.scr.2014.03.007.
- Elling U., Woods M., Forment J.V., Fu B., Yang F., Ng B.L., Vicente J.R., Adams D.J., Doe B., Jackson S.P., et al. Derivation and maintenance of mouse haploid embryonic stem cells. Nat. Protoc. 2019;14:1991–2014. doi: 10.1038/s41596-019-0169-z.
- Feng B., Jiang J., Kraus P., Ng J.H., Heng J.C.D., Chan Y.S., Yaw L.P., Zhang W., Loh Y.H., Han J., et al. Reprogramming of fibroblasts into induced pluripotent stem cells with orphan nuclear receptor Esrrb. Nat. Cell Biol. 2009;11:197–203. doi: 10.1038/ncb1827.
- Festuccia N., Osorno R., Halbritter F., Karwacki-Neisius V., Navarro P., Colby D., Wong F., Yates A., Tomlinson S.R., Chambers I. Esrrb is a direct Nanog target gene that can substitute for Nanog function in pluripotent cells. Cell Stem Cell. 2012;11:477–490. doi: 10.1016/j.stem.2012.08.002.
- Fotaki V., Price D.J., Mason J.O. Newly identified patterns of Pax2 expression in the developing mouse forebrain. BMC Dev. Biol. 2008;8:79. doi: 10.1186/1471-213X-8-79.
- Guo L., Lin L., Wang X., Gao M., Cao S., Mai Y., Wu F., Kuang J., Liu H., Yang J., et al. Resolving Cell Fate Decisions during Somatic Cell Reprogramming by Single-Cell RNA-Seq. Mol. Cell. 2019;73:815–829.e7. doi: 10.1016/j.molcel.2019.01.042.
- Iwasaki Y., Thomsen G.H. The splicing factor PQBP1 regulates mesodermal and neural development through FGF signaling. Development. 2014;141:3740–3751. doi: 10.1242/dev.106658.
- Jaber M., Radwan A., Loyfer N., Abdeen M., Sebban S., Kolb T., Zapatka M., Makedonski K., Ernst A., Kaplan T., et al. Comparative Parallel Multi-Omics Analysis During the Induction of Pluripotent and Trophectoderm States. Preprint at bioRxiv. 2020. doi: 10.1038/s41467-022-31131-8.
- Leeb M., Wutz A. Derivation of haploid embryonic stem cells from mouse embryos. Nature. 2011;479:131–134. doi: 10.1038/nature10448.
- Liu J., Han Q., Peng T., Peng M., Wei B., Li D., Wang X., Yu S., Yang J., Cao S., et al. The oncogene c-Jun impedes somatic cell reprogramming. Nat. Cell Biol. 2015;17:856–867. doi: 10.1038/ncb3193.
- Masui S., Nakatake Y., Toyooka Y., Shimosato D., Yagi R., Takahashi K., Okochi H., Okuda A., Matoba R., Sharov A.A., et al. Pluripotency governed by Sox2 via regulation of Oct3/4 expression in mouse embryonic stem cells. Nat. Cell Biol. 2007;9:625–635. doi: 10.1038/ncb1589.
- Miyanari Y., Torres-Padilla M.E. Control of ground-state pluripotency by allelic regulation of Nanog. Nature. 2012;483:470–473. doi: 10.1038/nature10807.
- Morshedi A., Soroush Noghabi M., Dröge P. Use of UTF1 genetic control elements as iPSC reporter. Stem Cell Rev. Rep. 2013;9:523–530. doi: 10.1007/s12015-011-9342-7.
- Nichols J., Zevnik B., Anastassiadis K., Niwa H., Klewe-Nebenius D., Chambers I., Schöler H., Smith A. Formation of pluripotent stem cells in the mammalian embryo depends on the POU transcription factor Oct4. Cell. 1998;95:379–391. doi: 10.1016/s0092-8674(00)81769-9.
- Polo J.M., Anderssen E., Walsh R.M., Schwarz B.A., Nefzger C.M., Lim S.M., Borkent M., Apostolou E., Alaei S., Cloutier J., et al. A molecular roadmap of reprogramming somatic cells into iPS cells. Cell. 2012;151:1617–1632. doi: 10.1016/j.cell.2012.11.039.
- Schwarz B.A., Bar-Nur O., Silva J.C.R., Hochedlinger K. Nanog is dispensable for the generation of induced pluripotent stem cells. Curr. Biol. 2014;24:347–350. doi: 10.1016/j.cub.2013.12.050.
- Scoville D.W., Kang H.S., Jetten A.M. GLIS1-3: emerging roles in reprogramming, stem and progenitor cell differentiation and maintenance. Stem Cell Investig. 2017;4:80. doi: 10.21037/sci.2017.09.01.
- Sebban S., Buganim Y. Nuclear Reprogramming by Defined Factors: Quantity Versus Quality. Trends Cell Biol. 2015;26:65–75. doi: 10.1016/j.tcb.2015.08.006.
- Shanak S., Helms V. DNA methylation and the core pluripotency network. Dev. Biol. 2020;464:145–160. doi: 10.1016/j.ydbio.2020.06.001.
- Shu J., Zhang K., Zhang M., Yao A., Shao S., Du F., Yang C., Chen W., Wu C., Yang W., et al. GATA family members as inducers for cellular reprogramming to pluripotency. Cell Res. 2015;25:169–180. doi: 10.1038/cr.2015.6.
- Simon R., Wiegreffe C., Britsch S. Bcl11 Transcription Factors Regulate Cortical Development and Function. Front. Mol. Neurosci. 2020;13:51. doi: 10.3389/fnmol.2020.00051.
- Soufi A., Donahue G., Zaret K.S. Facilitators and impediments of the pluripotency reprogramming factors' initial engagement with the genome. Cell. 2012;151:994–1004. doi: 10.1016/j.cell.2012.09.045.
- Tan M.H., Au K.F., Leong D.E., Foygel K., Wong W.H., Yao M.W. An Oct4-Sall4-Nanog network controls developmental progression in the pre-implantation mouse embryo. Mol. Syst. Biol. 2013;9:632. doi: 10.1038/msb.2012.65.
- Tsubooka N., Ichisaka T., Okita K., Takahashi K., Nakagawa M., Yamanaka S. Roles of Sall4 in the generation of pluripotent stem cells from blastocysts and fibroblasts. Genes Cells. 2009;14:683–694. doi: 10.1111/j.1365-2443.2009.01301.x.
- Wernig M., Lengner C.J., Hanna J., Lodato M.A., Steine E., Foreman R., Staerk J., Markoulaki S., Jaenisch R. A drug-inducible transgenic system for direct reprogramming of multiple somatic cell types. Nat. Biotechnol. 2008;26:916–924. doi: 10.1038/nbt1483.
- Xie Z., Bailey A., Kuleshov M.V., Clarke D.J.B., Evangelista J.E., Jenkins S.L., Lachmann A., Wojciechowicz M.L., Kropiwnicki E., Jagodnik K.M., et al. Gene Set Knowledge Discovery with Enrichr. Curr. Protoc. 2021;1:e90. doi: 10.1002/cpz1.90.