Fig. 4. G4 maintenance during differentiation stabilises transcription of genes essential for key cellular functions.
a Scatter plot of the log2 median transcripts per million (TPM) for hESCs vs CNCCs with a fitted weighted linear regression model (black line). G4E, G4 promoter status in hESCs; G4D, G4 promoter status in the daughter cells (CNCCs). b Box plots of the calculated residuals for the G4E+ G4D+, G4E– G4D–, G4E+ G4D– and G4E– G4D+ promoter classes. Data are presented as median (centre) and interquartile range (box; the lower and upper bounds of the box represent the 25th and 75th percentiles, respectively). Whiskers represent ±1.5× interquartile range. N = 5 and 3 biologically independent samples for hESCs and CNCCs, respectively. Number of genes per group: G4E+ G4D+ (7038), G4E– G4D– (16818), G4E+ G4D– (3515) and G4E– G4D+ (913). One-tailed F-test for variances. c Schematic summarising the gene expression dynamics associated with each G4 promoter class. d Heatmap showing the top 20 significant (FDR < 0.05) biological processes (BP) obtained from Gene Ontology (GO) enrichment analysis performed with g:Profiler for genes that are not differentially expressed (FDR > 0.05, −1 < Log2FC < 1; see Methods section) between hESCs and CNCCs and either maintain a G4 (G4E+ G4D+; n = 4675 genes) or never had a G4 (G4E– G4D–; n = 3425 genes) in their promoter. The number of genes intersecting each GO term is shown in the adjacent bar plot. e Genome browser view of representative promoters showing G4E+ G4D+ or G4E– G4D– promoter status after hESC differentiation into CNCCs or NSCs. Yellow box highlights the overlap of G4s, open chromatin sites (ATAC) and OQs13. A heatmap showing median gene expression (Log10(TPM + 0.01)) is also shown for the indicated genes.
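The analysis behind panels a and b (weighted linear regression, per-gene residuals, one-tailed F-test on residual variances) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the choice of weights and the direction of the alternative hypothesis (variance of the first group greater than that of the second) are all assumptions, and only the Python standard library is used.

```python
from math import exp, lgamma, log
from statistics import variance


def weighted_linear_fit(x, y, w):
    """Closed-form weighted least squares for y = intercept + slope * x.
    The weights w are an assumption; the caption does not specify them."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y, intercept, slope):
    """Observed minus fitted value for every gene."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def _betacf(a, b, x, max_iter=300, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd steps of the continued fraction.
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    bt = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
             + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def f_sf(f, d1, d2):
    """Upper-tail probability P(F > f) of an F(d1, d2) distribution."""
    if f <= 0.0:
        return 1.0
    return betainc(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))


def f_test_greater(res_a, res_b):
    """One-tailed F-test; alternative: var(res_a) > var(res_b)."""
    f_stat = variance(res_a) / variance(res_b)
    return f_stat, f_sf(f_stat, len(res_a) - 1, len(res_b) - 1)
```

In use, the regression would be fitted once across all genes, the residuals split by promoter class, and `f_test_greater` called with the class expected to be more variable as the first argument.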