Abstract
Tissue damage elicits cell fate switching through a process called metaplasia, but how the starting cell fate is silenced and the new cell fate is activated has not been investigated in animals. In cell culture, pioneer transcription factors mediate “reprogramming” by opening new chromatin sites for expression that can attract transcription factors from the starting cell’s enhancers. Here we report that Sox4 is sufficient to initiate hepatobiliary metaplasia in the adult liver. In lineage-traced cells, we assessed the timing of Sox4-mediated opening of enhancer chromatin versus enhancer decommissioning. Initially, Sox4 directly binds to and closes hepatocyte regulatory sequences via a motif that overlaps with that of Hnf4a, a hepatocyte master regulator. Subsequently, Sox4 exerts pioneer factor activity to open biliary regulatory sequences. The results delineate a hierarchy by which gene networks become reprogrammed under physiological conditions, providing deeper insight into the basis for cell fate transitions in animals.
Metaplasia is an adaptive cellular response to tissue injury and involves a variety of organs, including the lung (squamous metaplasia), esophagus (intestinal metaplasia) and pancreas (acinar-ductal metaplasia)1,2. Metaplasia involves the induction of multiple transcription factors3–6, and loss-of-function studies suggest that such factors function in concert with epigenetic remodeling3–8. Whether the induced transcription factors are sufficient to elicit metaplasia, and how they might initiate cell fate changes in vivo, are not known. Furthermore, how the genetic network of the starting cell fate is suppressed while the new cell fate is activated during metaplasia remains unclear. Pioneer transcription factors, by virtue of their inherent ability to target DNA on a nucleosome, elicit cell fate changes by targeting silent genes in chromatin9. Pioneer factors can also enable further chromatin compaction and repression10 and are involved in cancer progression and the expression of circadian rhythm genes11,12. Pioneer factors can trigger reprogramming of one cell type to another in cultured cells – notable examples include OCT4, SOX2 and KLF4 in induced pluripotent stem cells (iPSCs)13–16, FOXA1/2/3 in induced hepatocytes17, and ASCL1 and NEUROD1 in induced neurons18,19. In cultured cells, repression of the starting fate genes is thought to occur by the reprogramming factors, bound at new sites, drawing away the starting cell transcription factors from active enhancers13,20. Transcription factor-mediated enhancer decommissioning has also been reported in the context of cell fate decisions21,22.
We and others have reported that the liver undergoes a hepatobiliary metaplasia (“biliary reprogramming”), wherein hepatocytes are reprogrammed to become biliary epithelial cells as a conserved in vivo response to liver injury23–26. Here, we use a hepatobiliary metaplasia model to characterize the genetic cascades involved in this physiological cell fate change to understand the gene activation and repression programs responsible for the reprogramming process in animals and how they may differ from those present in cell culture. Notably, our experimental design allows us to trace and isolate the individual cells undergoing metaplasia at different time points of the process.
Results
Sox4 induces biliary reprogramming
We recently reported that in response to cholestatic injury induced with a 0.1% 3,5-diethoxycarbonyl-1,4-dihydrocollidine (DDC) diet, adult hepatocytes alter their chromatin landscapes to resemble those of biliary epithelial cells; moreover, newly opened chromatin regions were highly enriched for Sox binding motifs27. Given that Sox transcription factors are known to possess pioneer factor activity in cell culture15,28, we hypothesized that one or more Sox factors facilitate biliary reprogramming by directly eliciting chromatin accessibility. To profile the expression of all Sox genes during biliary reprogramming, we first performed single cell RNA sequencing (scRNA-Seq) during DDC-induced hepatobiliary metaplasia, including pseudotime analysis29 based on known marker genes expressed in hepatocytes and reprogrammed cells at different stages (Extended Data Fig. 1a–b). This allowed us to identify Cd24 as a surface marker of cells at the early-to-intermediate stages of reprogramming and Epcam as a surface marker of cells at the intermediate-to-late stages of reprogramming (Extended Data Fig. 1c–d, Supplementary Fig. 1). RNA-sequencing of cells isolated at various stages of reprogramming (Extended Data Fig. 1e–g, Supplementary Fig. 2–3) revealed Sox4 and Sox9 to be the only Sox factors to be expressed (Extended Data Fig. 2a). Sox9 was weakly expressed in normal hepatocytes, as previously reported in a subpopulation of periportal hepatocytes (Extended Data Fig. 2b)30, while Sox4 expression was virtually undetectable in hepatocytes at baseline but rapidly induced during reprogramming (Extended Data Fig. 2b).
We then asked whether ectopic expression of Sox4 and Sox9 – delivered via an adeno-associated virus (AAV) gene transfer system – could initiate biliary reprogramming of hepatocytes under homeostatic (i.e. non-injury) conditions. To this end, we produced AAV8-TBG-HA-Sox4-P2A-Cre and AAV8-TBG-HA-Sox9-P2A-Cre and injected the viral preps individually or concurrently into LSL-Rosa26-Cas9-EGFP mice (Fig. 1a). The system enables robust infection of >95% of hepatocytes and, significantly, allows infected cells at different stages of reprogramming to be recognized and isolated by virtue of an EGFP lineage tracer31. As a control, mice were injected with AAV8 virus carrying Cre but no additional payload (empty vector; EV).
We harvested hepatocytes at 7 days post injection (dpi) and confirmed increased expression of both Sox4 (~1000–1500-fold) and Sox9 (~10–30-fold) by qRT-PCR (Extended Data Fig. 2c). Importantly, RNA-Seq demonstrated comparable increases in Sox4 and Sox9 during DDC-induced reprogramming (Extended Data Fig. 2d); therefore, we concluded that our expression system reasonably recapitulated the upregulation of Sox4 and Sox9 observed under physiological conditions. Strikingly, flow cytometry demonstrated that ectopic expression of Sox4 or Sox4 and Sox9 together, but not Sox9 alone, induced robust expression of Cd24 and modest expression of Epcam in EGFP+ hepatocyte-derived cells (Fig. 1b, Supplementary Fig. 4). A broader analysis of gene expression by qRT-PCR confirmed the induction of multiple biliary genes and the repression of multiple hepatocyte genes following ectopic expression of Sox4 and Sox4/9, while Sox9 alone had no effect (Extended Data Fig. 2e). Therefore, we focused our attention on Sox4.
Sox4 mice lost weight following viral induction, reaching a nadir at 7–8 dpi before recovering most of the weight by 14 dpi (Extended Data Fig. 3a). During this period, the liver became smaller and paler than in empty vector-injected controls (Extended Data Fig. 3b–c), with atypical ductal cells (Extended Data Fig. 3d), a hallmark of chronic and fulminant liver diseases32. Sox4 mRNA expression reached its maximum level at 1–4 dpi, began to diminish slightly by 7 dpi, and diminished greatly by 10 dpi, as confirmed at the protein level by immunofluorescence (Extended Data Fig. 3e–f).
Consistent with the Cd24 to Epcam reprogramming sequence identified at the mRNA level in DDC-treated mice (Extended Data Fig. 1), we found that Cd24 was robustly expressed at 4 dpi in Sox4 hepatocytes at the protein level, whereas Epcam became strongly expressed only at 7 dpi (Fig. 1c). Other intermediate-to-late biliary reprogramming markers, such as Prom1 and Itga6, became detectable between 4 dpi and 7 dpi (Fig. 1c). We never observed EGFP+ cells that co-expressed Krt19, a marker of fully reprogrammed biliary cells (Extended Data Fig. 1a–c), suggesting that Sox4 is unable to induce the final stages of biliary reprogramming (Fig. 1c). Consistent with these data, flow cytometry demonstrated that Cd24 expression peaked at 4 dpi, becoming detectable in over 60% of EGFP+ hepatocytes before dropping back to baseline, while Epcam expression in EGFP+ cells continued to increase after 4 dpi (Extended Data Fig. 3g).
PCA mapping of RNA-Seq data demonstrated that Sox4-expressing hepatocytes exhibited a degree of reprogramming similar to that of DDC-induced early reprogrammed cells (Extended Data Fig. 3h). Using gene sets built from genes differentially expressed (fold changes > 2 or < 0.5, p.adj < 0.05, DESeq233) between hepatocytes and Rep_early cells (2,355 upregulated and 1,189 downregulated genes in Rep_early vs. hepatocytes) (Table S1), GSEA confirmed that Sox4-expressing hepatocytes were significantly enriched for a reprogrammed cell signature and under-represented for the hepatocyte signature (Extended Data Fig. 3i). Taken together, the data show that Sox4 is sufficient to induce the initial stages of biliary reprogramming, including the repression of hepatocyte gene expression.
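The gene-set thresholds quoted above (fold change > 2 or < 0.5, adjusted p < 0.05) can be illustrated with a minimal Python sketch. It assumes that a DESeq2-style results table has already been exported as rows of gene, linear fold change, and adjusted p-value. The `DEResult` type, the `classify` function, and the gene names are hypothetical and are not taken from the study's scripts.

```python
from typing import NamedTuple


class DEResult(NamedTuple):
    gene: str           # hypothetical gene identifier
    fold_change: float  # linear fold change, Rep_early / hepatocyte
    padj: float         # multiple-testing-adjusted p-value


def classify(row: DEResult, fc_up: float = 2.0, fc_down: float = 0.5,
             alpha: float = 0.05) -> str:
    """Return 'up', 'down', or 'ns' using the thresholds quoted in the text."""
    if row.padj >= alpha:
        return "ns"
    if row.fold_change > fc_up:
        return "up"
    if row.fold_change < fc_down:
        return "down"
    return "ns"


# Toy rows for illustration only; these are not data from the study.
rows = [
    DEResult("geneA", 8.0, 1e-6),
    DEResult("geneB", 0.1, 1e-4),
    DEResult("geneC", 3.0, 0.20),   # large change but not significant
    DEResult("geneD", 1.2, 1e-3),   # significant but small change
]

gene_sets = {"up": [], "down": []}
for row in rows:
    label = classify(row)
    if label in gene_sets:
        gene_sets[label].append(row.gene)

print(gene_sets)
```

Both inequalities are strict here, so a gene with a fold change of exactly 2 is not included; whether the original analysis used strict or inclusive cutoffs is not stated.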
Sox4 remodels chromatin landscapes
Since Sox4 dominantly initiated a biliary fate conversion in vivo, we hypothesized that Sox4 mediates reprogramming by changing chromatin conformation. We performed Assay for Transposase-Accessible Chromatin using sequencing (ATAC-Seq)34 with hepatocytes isolated at 4 dpi from AAV-EV or AAV-HA-Sox4-injected mice. Differential peak analysis comparing Sox4-expressing hepatocytes to control hepatocytes identified 20,329 regions with increased accessibility (i.e. newly opened), 14,564 regions with decreased accessibility (i.e. newly closed), and 92,894 regions exhibiting no change (Fig. 2a). Analysis of transposase cleavage patterns in deeply sequenced samples (we had >100 million uniquely mapped tags) enables visualization of transcription factor footprints within ATAC-Seq tags35. As predicted, motif analysis of the merged replicates indicated that Sox binding is enriched in newly opened regions, while no difference was observed in the unchanged regions (Fig. 2b). Unexpectedly, however, in the control EV hepatocytes, where Sox4 is not expressed, we observed a Sox4 footprint in regions that become closed at 4 dpi (Extended Data Fig. 4a). Our analysis of published ChIP-Seq data36,37 indicated that these unexpected footprints colocalized with binding of the hepatocyte transcription factors Hnf4a38,39 and Rxra40 (Extended Data Fig. 4b). When compared with previously published ATAC-Seq studies27,41, Sox4-expressing hepatocytes exhibited open chromatin profiles resembling early reprogrammed cells (Fig. 2c), consistent with our findings with RNA-Seq (Extended Data Fig. 3h). Thus, ectopic expression of Sox4 results in a chromatin landscape mirroring that which is present under physiological conditions in the early stages of biliary reprogramming.
Sox4 binding precedes major changes in gene expression during reprogramming
To explore the association between Sox4 binding and chromatin opening, we performed a genome-wide examination of Sox4 binding patterns at the early stage of reprogramming in vivo. To this end, we performed Cleavage Under Targets & Release Using Nuclease sequencing (CUT&RUN-Seq)42,43 using hepatocytes isolated at 18 hours post injection (18 hpi) and 4 dpi. Hepatocytes at 18 hpi showed weak HA-Sox4 expression at the protein level (Extended Data Fig. 4c) and minimal changes in gene expression (Extended Data Fig. 4d), thereby providing a profile of Sox4 binding during the early stages of Sox4-induced reprogramming. We obtained 9,463 and 19,362 Sox4 CUT&RUN peaks in the 18 hpi and 4 dpi samples, respectively. As predicted, the peaks were highly enriched for Sox binding motifs (Extended Data Fig. 4e).
Using an E. coli DNA spike-in approach, we quantitated Sox4 binding at each timepoint. As expected, Sox4 binding increased from 18 hpi to 4 dpi in both newly closed and newly opened regions (Fig. 2d) in proportions that were comparable to the relative amounts of newly closed and opened peaks, as seen by ATAC-Seq. The increase in binding was greatest in the newly opened regions, with the averaged peak summit (4 dpi/18 hpi) increasing by 160-fold in the newly opened regions and by 32-fold in the newly closed regions (Fig. 2e), a trend that was also seen when the data were normalized across the genome (Fig. 2f, Extended Data Fig. 4f). Our data indicate that extensive Sox4 binding (18 hpi) precedes major changes in gene expression. Consequently, we set out to understand modulators and consequences of Sox4 binding.
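The exact spike-in calculation is not given in the text. As a minimal sketch, one common convention for CUT&RUN scales each sample by a constant divided by the number of reads mapped to the spike-in genome, and then compares the scaled signals between timepoints. The function names, the constant, and the read counts below are hypothetical and serve only to show the arithmetic.

```python
def spike_in_scale(spike_in_reads: int, constant: float = 10_000.0) -> float:
    """Scale factor that is inversely proportional to spike-in read depth.

    `constant` is an arbitrary value chosen for convenience; it cancels out
    when two samples are compared as a ratio.
    """
    if spike_in_reads <= 0:
        raise ValueError("spike-in read count must be positive")
    return constant / spike_in_reads


def normalized_fold_change(signal_late: float, spike_late: int,
                           signal_early: float, spike_early: int) -> float:
    """Ratio of spike-in-normalized signals (late timepoint / early timepoint)."""
    late = signal_late * spike_in_scale(spike_late)
    early = signal_early * spike_in_scale(spike_early)
    return late / early


# Toy numbers for illustration only; these are not measurements from the study.
fold = normalized_fold_change(signal_late=200.0, spike_late=1000,
                              signal_early=10.0, spike_early=2000)
print(fold)
```

In this toy case the raw signal ratio is 20, but the late sample carries half as many spike-in reads, so the normalized ratio doubles to 40.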
Sox4 targets genes for silencing or activation during reprogramming
To assess the association of the ATAC peaks with gene expression, we annotated each peak with its nearest gene, associating 19,985 genes in total to all ATAC peaks (n = 127,787) identified at any of the newly opened, unchanged, or newly closed regions. The annotated genes were then ranked based on the fold-change between empty vector and Sox4-expressing hepatocytes, and the ranked gene list was used as input for GSEA44. Consistent with our RNA-Seq results, regions exhibiting decreased chromatin accessibility upon Sox4 expression were found in proximity to hepatocyte genes (Fig. 3a, upper), while regions exhibiting increased chromatin accessibility upon Sox4 expression were found in proximity to reprogramming-associated genes (Fig. 3a, lower). Approximately half of the genes associated with newly closed regions (2,685/5,272; 50.9%) were downregulated during DDC-induced reprogramming, including many known hepatocyte genes (Fig. 3b, Table S2). By contrast, more than half of the genes associated with newly opened regions (3,663/5,823; 62.9%) were upregulated during DDC-induced reprogramming, including many known biliary genes (Fig. 3c, Table S2). Gene ontology analysis of ATAC-Seq (Table S3) and RNA-Seq (Table S3) data revealed that terms enriched for newly closed region-associated genes were shared with the terms under-represented in DDC-induced reprogrammed cells, while terms enriched for newly opened region-associated genes were shared with the terms over-represented in DDC-induced reprogrammed cells (Extended Data Fig. 5a–d). When we annotated Sox4 peaks at the 18 hpi and 4 dpi timepoints with the nearest gene and performed GSEA, there was enrichment of a hepatocyte signature for genes associated with Sox4 peaks at 18 hpi (Fig. 3d, upper) and enrichment of a reprogramming signature for genes associated with Sox4 peaks at 4 dpi (Fig. 3d, lower).
We conclude that Sox4-induced changes in chromatin accessibility are associated with corresponding decreased expression of hepatocyte genes and increased expression of biliary genes, as occurs early in DDC-induced reprogramming.
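The nearest-gene annotation and ranking steps used in this section can be sketched as follows. This is an illustrative simplification, not the authors' pipeline: it assumes a single chromosome, represents each peak by its midpoint and each gene by one TSS coordinate, and uses hypothetical names (`nearest_gene`, `ranked_genes`) and toy coordinates.

```python
import bisect
from typing import Dict, List, Sequence


def nearest_gene(peak_mid: int, tss_positions: Sequence[int],
                 tss_genes: Sequence[str]) -> str:
    """Return the gene whose TSS is closest to the peak midpoint.

    `tss_positions` must be sorted in ascending order and aligned with
    `tss_genes`. On a tie, the upstream (lower-coordinate) TSS wins.
    """
    i = bisect.bisect_left(tss_positions, peak_mid)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(tss_positions)]
    best = min(candidates, key=lambda j: abs(tss_positions[j] - peak_mid))
    return tss_genes[best]


def ranked_genes(genes: Sequence[str], log2fc: Dict[str, float]) -> List[str]:
    """Unique genes ordered from most upregulated to most downregulated."""
    return sorted(dict.fromkeys(genes), key=lambda g: log2fc[g], reverse=True)


# Toy annotation for illustration only; these are not data from the study.
tss_positions = [1000, 5000, 9000]
tss_genes = ["g1", "g2", "g3"]
peak_mids = [1200, 4000, 10000, 4200]

annotated = [nearest_gene(m, tss_positions, tss_genes) for m in peak_mids]
ranking = ranked_genes(annotated, {"g1": -1.5, "g2": 2.0, "g3": 0.3})
print(annotated, ranking)
```

A ranked list of this kind is the form of input that a preranked enrichment analysis expects; the real annotation would also need to handle multiple chromosomes and strand.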
We then performed DNA footprint analysis to identify transcription factor binding sites that were enriched or depleted following Sox4 expression. This confirmed that most of the transcription factor binding sites previously identified as enriched in newly opened regions during DDC-induced reprogramming (e.g. AP1, E2F, and Tead)27 were also enriched in the Sox4 newly-opened regions (Fig. 3e). Interestingly, we also found that transcription factors associated with hepatocyte identity and function (e.g. Hnf4a/g, Rara, and Cebpa/b/d/e/g) lost their footprints following Sox4 expression (Fig. 3e). Hnf4a motifs were also evident within many Sox4-CUT&RUN peaks (Extended Data Fig. 4e).
Collectively, these results indicate that changes in chromatin and transcription factor binding landscapes following Sox4 ectopic expression resemble those of cells undergoing physiological biliary reprogramming.
Sox4 initially closes active hepatocyte enhancers and evicts Hnf4a
We next assessed whether early Sox4 binding (18 hpi) in regions destined to undergo chromatin closing (Fig. 2f, lower) was correlated with a loss of gene expression at 4 dpi (Fig. 4a, left). Strikingly, genes associated with newly closed regions that were bound by Sox4 (n=825 genes) exhibited a marked reduction in gene expression during DDC-induced reprogramming (Fig. 4a, right bottom). By contrast, genes associated with newly closed regions that were not bound by Sox4 (n=5,090 genes) exhibited much smaller reductions in gene expression (Fig. 4a, right top). We observed that newly closed regions were highly enriched in areas distal to the TSS (Extended Data Fig. 6a–b), raising the possibility that Sox4 binding modulates the activity of hepatocyte enhancers.
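Splitting newly closed regions into Sox4-bound and unbound groups, as done above, amounts to an interval-overlap test between ATAC-Seq regions and CUT&RUN peaks. The sketch below shows that test under simple assumptions: regions are half-open `(chromosome, start, end)` tuples, any overlap of at least one base counts as bound, and all names and coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def split_by_binding(regions: List[Interval],
                     peaks: List[Interval]) -> Tuple[List[Interval], List[Interval]]:
    """Partition regions into those overlapping any peak and those overlapping none."""
    bound, unbound = [], []
    for region in regions:
        if any(overlaps(region, peak) for peak in peaks):
            bound.append(region)
        else:
            unbound.append(region)
    return bound, unbound


# Toy coordinates for illustration only; these are not data from the study.
closed_regions = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
sox4_peaks = [("chr1", 150, 250), ("chr2", 300, 400)]

bound, unbound = split_by_binding(closed_regions, sox4_peaks)
print(bound, unbound)
```

The nested loop is quadratic in the number of intervals, which is fine for a sketch; genome-scale peak sets are normally intersected with sorted or indexed interval structures.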
To assess the initial chromatin targeting by Sox4 more deeply, we compared sites of Sox4 binding with regions previously characterized based on their sensitivity to MNase and histone modifications. Upon high-level MNase digestion, labile nucleosomes and free DNA are destroyed and stable nucleosomes are resistant, whereas low-level MNase digestion preserves labile nucleosomes at enhancers45 (Extended Data Fig. 6c). Open regions initially targeted by Sox4 that later became closed exhibited MNase profiles resembling those of active liver enhancers (Fig. 4b–c) and had patterns of H2B and H3 binding (ChIP-Seq) similar to those associated with active liver enhancers (Fig. 4d–e). Moreover, Sox4 expression was associated with reduced accessibility of these enhancers at 4 dpi compared to empty vector (Fig. 4f–g, Extended Data Fig. 6d) and resulted in a decrease of the active enhancer marks H3K27ac and H3K4me1 at 4 dpi compared to 18 hpi (Fig. 4f–g, Extended Data Fig. 6e). Collectively, these results indicate that shortly after its induction, Sox4 binds to and inactivates active liver enhancers.
We next sought to understand how Sox4 targets active liver enhancers. Given the unexpected finding that the Hnf4a binding motif was enriched in both 18 hpi and 4 dpi Sox4-CUT&RUN peaks (Extended Data Fig. 4e), and more so at 18 hpi than 4 dpi (Fig. 5a), we considered the possibility that targeted rather than promiscuous binding was responsible for Sox4’s localization to Hnf4a sites. Remarkably, we found that the Sox binding motif (CTTTGT/ACAAAG) overlaps the binding motifs for Hnf4a (CAAAG/CTTTG) and other hepatocyte-enriched transcription factors (Fig. 5b). Considering that pioneer factors can bind partial motifs16, Sox4 could recognize such partial motifs for Hnf4a. Indeed, using published ChIP-Seq data37, we found substantial overlap of Sox4 binding sites with Hnf4a binding sites in adult hepatocytes (Fig. 4g bottom, Extended Data Fig. 6f left, g left), but not in colon epithelial cells (Extended Data Fig. 6f right, g right)46. These results indicate that Sox4 initially binds to regions that are open and occupied by Hnf4a in hepatocytes.
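The motif overlap noted above can be checked directly from the sequences quoted in the text. The short Python sketch below uses only those four strings: it confirms that each pair is a forward/reverse-complement pair and that the Hnf4a core sequence is contained within the Sox motif on the same strand. The function and variable names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


SOX_MOTIF = "CTTTGT"   # Sox motif as quoted in the text
HNF4A_CORE = "CAAAG"   # Hnf4a motif sequence as quoted in the text

sox_rc = reverse_complement(SOX_MOTIF)      # other strand of the Sox motif
hnf4a_rc = reverse_complement(HNF4A_CORE)   # other strand of the Hnf4a core

# The Hnf4a core is a substring of the Sox motif on each strand.
core_in_sox_rc = HNF4A_CORE in sox_rc
core_rc_in_sox = hnf4a_rc in SOX_MOTIF

print(sox_rc, hnf4a_rc, core_in_sox_rc, core_rc_in_sox)
# prints: ACAAAG CTTTG True True
```

The check covers only the short sequences given in the text; the full Hnf4a position weight matrix is longer, and how well Sox4 tolerates the flanking bases is not addressed by this string comparison.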
To explore the consequence of Sox4 binding, we used ATAC-Seq data to compare the footprints of Hnf4a and another hepatocyte transcription factor, Rxra, with those that were bound by Sox4 more efficiently at 18 hpi than at 4 dpi. Strikingly, Sox4-expressing hepatocytes exhibited decreased Hnf4a and Rxra footprints at 4 dpi compared to hepatocytes that received empty vector (Fig. 5c). We also confirmed the reduction of footprints of the factors in active liver enhancers (Extended Data Fig. 6h). Collectively, the data indicate that Sox4 suppresses hepatocyte identity in part by evicting resident hepatocyte transcription factors and reducing the accessibility and activity of hepatocyte enhancers.
Sox4 opens chromatin in biliary regions
Next, we returned to our earlier hypothesis that Sox4 primes the biliary phenotype in hepatocytes by acting as a pioneer factor. Based on the strong correlation between Sox4 binding and changes in the chromatin landscape (Fig. 2f), we first assessed the ability of Sox4 to bind nucleosomal DNA, as this is a defining feature of pioneer factor activity. To this end, we incubated recombinant Sox4 protein with nucleosome particles assembled on a human LIN28B DNA fragment, which was previously demonstrated to bind Sox216. Electrophoretic mobility shift assay (EMSA) confirmed that recombinant Sox4 binds specifically to its target site (Extended Data Fig. 7a) and to assembled LIN28B nucleosomes (Fig. 6a). Given that pioneer factor activity is associated with opening of previously closed chromatin, we divided the regions of newly opened chromatin (based on ATAC-Seq peaks) into two sub-regions: those whose accessibility increased over baseline upon Sox4 expression (“more opened regions,” or MORs) (Fig. 6b, upper left) and those regions that only became accessible when Sox4 was expressed (“de novo opened regions,” or DORs) (Fig. 6b, lower left). We visualized Sox4-CUT&RUN signals at these regions and confirmed that both MORs and DORs were bound by Sox4 at 4 dpi (Fig. 6b, right heatmaps). DORs, and to a lesser extent MORs, fell into broad domains of chromatin enriched for high-MNase signals, compared to active promoters and enhancers (Figs. 6c–d). Sox4-targeted DORs and MORs were also found in broad domains enriched for core histones (Figs. 6e–f). We conclude that Sox4 acts as a pioneer factor by targeting nucleosomal DORs, and to a lesser extent MORs, leading to increased chromatin accessibility.
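The distinction between MORs and DORs drawn above can be expressed as a simple decision rule on accessibility in control versus Sox4-expressing hepatocytes. The sketch below is one possible formalization, not the authors' criteria: the normalized signal values, the `accessible_cutoff` used to call a region accessible, and the `min_fold` used to call an increase are all hypothetical.

```python
def classify_opened_region(control_signal: float, sox4_signal: float,
                           accessible_cutoff: float = 1.0,
                           min_fold: float = 2.0) -> str:
    """Label a region as 'DOR', 'MOR', 'unchanged', or 'closed'.

    DOR: not accessible in control, accessible only with Sox4.
    MOR: already accessible in control, increased at least `min_fold` with Sox4.
    Both thresholds are illustrative assumptions.
    """
    if sox4_signal < accessible_cutoff:
        return "closed"
    if control_signal < accessible_cutoff:
        return "DOR"
    if sox4_signal / control_signal >= min_fold:
        return "MOR"
    return "unchanged"


# Toy (control, Sox4) signal pairs for illustration only; not data from the study.
examples = [(0.2, 5.0), (2.0, 6.0), (2.0, 2.5), (0.1, 0.3)]
labels = [classify_opened_region(c, s) for c, s in examples]
print(labels)
```

In practice the newly opened set would come from a differential peak analysis with replicates, and the DOR/MOR split would then depend on whether a peak was called at baseline.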
At DORs, active enhancer and promoter marks increased from faint or residual background levels, whereas at MORs the active marks were initially more elevated and increased further by 4 dpi; the H3K27me3 repressive mark was marginal to begin with and was reduced at 4 dpi (Extended Data Fig. 7b–c). In accordance with our earlier finding that newly opened regions were enriched in proximity to reprogramming-related genes (Fig. 3a,c), comparable chromatin/histone mark changes were observed in the regions near biliary genes; concordantly, Sox4 binding was observed either at putative TSS-distal enhancers (Fig. 6g, left) or at promoters/gene bodies (Fig. 6g, right). Moreover, DOR peaks were relatively enriched in TSS-distal regions compared to MOR peaks and unchanged regions (Extended Data Fig. 6a–b), indicating that DORs contain de novo primed enhancers. Collectively, these results indicate that Sox4 binding by 4 dpi leads to chromatin opening and the acquisition of active or primed characteristics of cis-regulatory elements associated with biliary phenotypes.
Sox4 and Sox9 are YAP targets
Finally, we sought to identify a direct link between the signaling pathways known to regulate biliary reprogramming and the expression of Sox4. The transcriptional co-activator Yap, which mediates output from the Hippo signaling pathway, is among the most well-characterized factors driving biliary reprogramming26,41. Thus, we hypothesized that Sox4 is a direct target of Yap. Using previously published ChIP-Seq data from hepatocytes isolated from control or DDC-treated livers41,47, we observed that Yap binds to the Sox4 gene in mice fed DDC (Extended Data Fig. 7e). Similar observations were made for Sox9 (Extended Data Fig. 7e, right). Concordantly, analysis of a previously published microarray dataset26 revealed that expression of YAPS127A induced expression of Sox4 and Sox9 in hepatocytes (Extended Data Fig. 7f). Taken together, the findings indicate that liver injury induces Yap-mediated expression of Sox4, resulting in subsequent changes in chromatin configuration to facilitate biliary reprogramming.
Discussion
The epigenetic mechanisms underlying cell fate switching, or reprogramming, have been studied in detail in the context of induced pluripotency, where pioneer factors enable a change in cell identity by reconfiguring chromatin. Our study provides evidence that Sox4 acts as a pioneer factor in vivo in a well-defined system of physiological reprogramming: hepatobiliary metaplasia, a hallmark of chronic and fulminant liver disease32. Moreover, we found that Sox4 exerts this activity through a sequential process in which enhancers associated with the starting (hepatocyte) cell type are decommissioned prior to the activation of enhancers associated with the acquired (biliary) cell type. Such enhancer reorganization by pioneer factors, including silencing of the starting cell’s enhancers, has been proposed for iPSCs, where early and preferential binding to active somatic enhancers causes the redistribution of associated somatic transcription factors like P300 and the recruitment of enhancer silencing factors like Hdac113. In addition, as we observed for Hnf4a and Rxra, pioneer factors can compete for binding sites in enhancers to influence lineage trajectories48,49.
Our data support and reconcile these models in an in vivo setting and offer a possible molecular mechanism to account for the specificity of binding in a physiologically relevant context. Specifically, we observed that key hepatocyte transcription factors – including Hnf4a, a master regulator of hepatocyte identity38,39 – share binding motifs with Sox4, suggesting that Sox4 may hijack these sites when it is expressed. Consistent with this idea, Sox4 preferentially binds to open chromatin regions occupied by Hnf4a and Rxra early in biliary reprogramming; subsequently, these factors are evicted from their native binding sites. While our results do not prove that Sox4 is directly responsible for this eviction, the data suggest that Sox4 competes with hepatocyte transcription factors like Hnf4a and Rxra for binding to hepatocyte enhancers early in the reprogramming process. Thus, our study provides evidence that pioneer factors can coordinately disrupt a starting cell fate while engaging closed chromatin to initiate a new cell fate.
Our findings also provide a molecular link between the signaling pathways that have been identified as mediators of biliary reprogramming and the subsequent epigenetic and transcriptional changes that stabilize the reprogrammed state. Cell-cell signaling following liver injury results in activation of the Notch, Yap, and TGFβ signaling pathways, all of which are required for biliary reprogramming23,25,26. The pioneer factor Sox4, whose expression is induced directly by Yap, explains at least in part how such exogenous upstream signals converge on the epigenome to alter cell state. Other epigenetic and transcriptional regulatory factors are also likely to participate in the conversion of hepatocytes to a fully reprogrammed state41.
The results presented here are thus consistent with a multi-step model in which pioneer factor activity, through both positive and negative effects on chromatin accessibility, is essential for the cell fate changes accompanying tissue metaplasia in vivo (summarized in Extended Data Fig. 8). First, signals from the injured liver microenvironment result in the induction of Sox4 through the Hippo/YAP pathway. Sox4 then binds to regions of open chromatin occupied by hepatocyte-specific transcription factors (e.g. Hnf4a and Rxra), causing their displacement. Finally, Sox4 acts as a traditional pioneer factor by binding to canonical Sox binding sites in biliary enhancers in closed chromatin, causing the regions to become open and accessible to other biliary transcription factors. We suspect that this cascade may be employed in other cases of metaplasia, including those leading to human cancers4.
Methods
Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Ben Z. Stanger, email: bstanger@upenn.edu
Materials availability
All unique and stable reagents generated in this study, of which sufficient quantities exist, are available from the Lead Contact with a completed Materials Transfer Agreement.
Data and code availability
Data generated during the study have been deposited in the Gene Expression Omnibus (GEO) under the accession number GSE221225 (SuperSeries), with SubSeries accession numbers GSE218945 and GSE218947 (RNA-Seq); GSE219052 (ATAC-Seq); and GSE221223 and GSE221224 (CUT&RUN-Seq). Detailed scripts and parameters used for each step of the analysis are available from the authors upon reasonable request.
Mice
Rosa-LSL-Cas9-EGFP mice50 on a C57BL/6J background were purchased from the Jackson Laboratory (strain #026175) and maintained as homozygotes. All mouse experiment procedures used in this study were performed following the NIH guidelines. All mouse procedure protocols used in this study were in accordance with, and with the approval of, the Institutional Animal Care and Use Committee of the University of Pennsylvania.
For characterization of the reprogramming stage, 4- to 5-week-old mice were retro-orbitally injected with AAV:ITR-U6-sgRNA(backbone)-TBG-Cre-WPRE-hGHpA-ITR31 (empty vector; EV) at 5 × 10^11 genome copies/mouse/gene. One week later, induction of biliary reprogramming was started by initiating a 0.1% 3,5-diethoxycarbonyl-1,4-dihydrocollidine (DDC) diet (Envigo). Eight weeks after the DDC challenge, reprogrammed cells and biliary cells were harvested as described below.
For the exogenous expression of Sox4, Sox9 and Sox4/Sox9, 4- to 8-week-old mice were retro-orbitally injected with 5 × 10^11 genome copies/mouse/gene. For the Sox4 expression experiments analyzed by RNA-Seq, ATAC-Seq, and CUT&RUN-Seq, 1 × 10^12 genome copies/mouse of AAV8-TBG-HA-Sox4-P2A-Cre were retro-orbitally injected.
Plasmid cloning
All PCR reactions were performed using the Phusion Flash High-Fidelity PCR Master Mix (Thermo) following the manufacturer’s instructions. For all AAV plasmids, endotoxin was eliminated by treating the plasmids with Endozero columns (Zymo Research) before proceeding to AAV production. Transformation was performed using Stbl3 bacteria (Thermo) following the manufacturer’s instructions.
AAV-HA-Sox4-P2A-Cre plasmid
Mouse HA-Sox4-P2A and P2A-Cre blocks were PCR-amplified from pLVXT-Sox4 (Addgene, #101121) and AAV:ITR-U6-sgRNA(backbone)-TBG-Cre-WPRE-hGHpA-ITR31 respectively, using the primers listed in Table S4. The AAV-TBG backbone was prepared by removing EGFP from pAAV.TBG.PI.eGFP.WPRE.bGH (Addgene, #105535) using NotI-HF (NEB) and BamHI-HF (NEB). The two PCR-amplified DNA blocks were inserted into the linearized AAV-TBG vector by Gibson Assembly using an NEBuilder HiFi DNA assembly kit (NEB) following the manufacturer’s instruction.
AAV-HA-Sox9-P2A-Cre plasmid
Mouse HA-Sox9-P2A and P2A-Cre blocks were PCR-amplified from pWPXL-Sox9 (Addgene, #36979) and AAV:ITR-U6-sgRNA(backbone)-TBG-Cre-WPRE-hGHpA-ITR(Katsuda et al., 2022) respectively, using the primers listed in Table S4. The two blocks were inserted into the linearized AAV-TBG vector prepared as described above using the NEBuilder assembly kit.
AAV-FLAG-Sox4-P2A-Cre plasmid
Mouse FLAG-Sox4-P2A and P2A-Cre blocks were PCR-amplified from pLVXT-Sox4 and AAV:ITR-U6-sgRNA(backbone)-TBG-Cre-WPRE-hGHpA-ITR respectively, using the primers listed in Table S4. The two blocks were inserted into the linearized AAV-TBG vector prepared as above using the NEBuilder assembly kit.
CMV-FLAG-Sox4 plasmid
The FLAG-Sox4 block was amplified from the AAV-FLAG-Sox4-P2A-Cre plasmid using a forward primer: 5’- AGTGCTAGCGCCACCATGGACTACAAAGACG, and a reverse primer: 5’- TCGTGTACATCAGTAGGTGAAGACCAGGTTAGAGATGC. The DNA block was then digested with NheI-HF (NEB) and BsrGI-HF (NEB) and cloned into the CMV backbone vector prepared by linearizing the mEGFP-N1-YAPS127A-L318E plasmid (Addgene, #166465) using NheI-HF and BsrGI-HF.
AAV preparation
90–100% confluent 293T cells in 15 cm dishes were replenished with 15 ml fresh DMEM (Thermo) supplemented with 2% FBS (Thermo) without antibiotics. For a 15 cm plate, 16 μg AAV8-Rep/Cap plasmid (Grompe Lab), 16 μg Ad5-Helper plasmid (Grompe Lab), 16 μg AAV transfer vector, and 144 μl of 1 mg/ml polyethylenimine (PEI) (Polysciences) were mixed in 9 ml OptiMEM (Thermo). After incubation at room temperature (RT) for 15 minutes, the plasmid/PEI complex was added to the 293T cells in a dropwise manner, and the plates were gently rocked to mix. After incubation in a CO2 incubator for 6 days, the cells and culture supernatant were harvested into 50 ml tubes and centrifuged at 1,900 ×g for 15 minutes. The supernatant was transferred to new tubes, and 1/40,000 volume of Benzonase (Sigma-Aldrich) was added and mixed thoroughly by inversion. After digestion of non-viral DNA by incubating at 37°C for 30 minutes, the virus medium was centrifuged at 1,900 ×g for 15 minutes, and the supernatant was filtered with a 0.22 μm filter unit containing a PES membrane (Thermo). Then, 1/4 volume of 40% polyethylene glycol 8000 (PEG8000) in 2.5 M NaCl was added and mixed thoroughly by inversion. Following overnight incubation at 4°C, precipitated AAV was collected by centrifugation at 3,000 ×g for 15 minutes. After removal of the supernatant, the precipitate was homogenized in 100 μl PBS per 15 cm dish by thorough pipetting. Non-AAV precipitate was eliminated by centrifugation at 2,200 ×g for 5 minutes. Smaller debris was further removed by filtering the eluted AAV through a 0.45 μm filter column (Corning). This crude AAV was titrated by qPCR using AAV8-TBG-Cre (Penn Vector Core) as a standard with a forward primer, 5’- GGAACCCCTAGTGATGGAGTT, and a reverse primer, 5’- CGGCCTCAGTGAGCGA, and was used directly for KO experiments without further purification.
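The reagent quantities in the AAV preparation imply a PEI:DNA mass ratio and final PEG/NaCl concentrations that are not stated explicitly. The Python sketch below works them out from the amounts given in the protocol. It assumes that volumes are additive and that "1/4 volume" is relative to the volume of filtered supernatant; the variable and function names are illustrative.

```python
# Plasmid amounts per 15 cm plate, in micrograms (from the protocol text).
PLASMID_UG = {"AAV8-Rep/Cap": 16.0, "Ad5-Helper": 16.0, "AAV transfer vector": 16.0}
PEI_VOLUME_UL = 144.0
PEI_STOCK_MG_PER_ML = 1.0  # 1 mg/ml is equivalent to 1 ug/ul

total_dna_ug = sum(PLASMID_UG.values())        # 48.0 ug DNA
pei_ug = PEI_VOLUME_UL * PEI_STOCK_MG_PER_ML   # 144.0 ug PEI
pei_to_dna_ratio = pei_ug / total_dna_ug       # 3.0 (mass:mass)


def final_concentration(stock: float, added_fraction: float) -> float:
    """Concentration after adding `added_fraction` volumes of stock to
    1 volume of sample, assuming additive volumes."""
    return stock * added_fraction / (1.0 + added_fraction)


peg_final_percent = final_concentration(40.0, 0.25)  # 8.0 % PEG8000
nacl_final_molar = final_concentration(2.5, 0.25)    # 0.5 M NaCl

print(total_dna_ug, pei_ug, pei_to_dna_ratio, peg_final_percent, nacl_final_molar)
```

Under these assumptions the transfection uses a 3:1 PEI:DNA mass ratio, and the precipitation step brings the medium to 8% PEG8000 and 0.5 M NaCl.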
Immunofluorescence
Frozen sections were used for immunofluorescence. Tissue was fixed in zinc-formalin overnight, equilibrated in 30% sucrose/PBS, embedded in Tissue-Tek® O.C.T. compound (Sakura), and 8 μm sections were prepared. Remnant O.C.T. compound was removed by submerging the slides in PBS for 5 minutes. The specimens were permeabilized with 0.1% Triton X-100 (Fisher) in PBS at RT for 15 minutes. After treatment with Blocking One Histo (Nacalai) at RT for 10 minutes, the specimens were incubated with primary antibodies (Table S5) diluted in 1/20× Blocking One Histo at RT for 1 hour or at 4°C overnight. The sections were then stained using donkey anti-rabbit, rat, or goat antibodies conjugated with AlexaFluor 488, AlexaFluor 594 or AlexaFluor 647 (Invitrogen) (Table S5) at 1/300 dilution and DAPI (Thermo) at 1/1000 dilution. After incubation at RT for 1 hour, the specimens were mounted in Aqua-Poly/Mount (Polysciences) and imaged using an Olympus IX71 inverted fluorescence microscope.
Hematoxylin and Eosin (H&E) staining
H&E staining was performed by Penn Molecular Pathology and Imaging Core (MPIC).
Hepatocyte isolation
Livers were perfused with 40 ml of HBSS (Thermo), followed by 40 ml HBSS with 1 mM EGTA (Sigma), then 40 ml HBSS with 5 mM CaCl2 (Sigma) and 40 μg/ml liberase (Sigma). Following perfusion, livers were mechanically dispersed with tweezers, resuspended in 10 ml wash medium (DMEM supplemented with 5% FBS), and filtered through a 70 μm cell strainer. The cells were centrifuged at 50 ×g at 4°C for 5 minutes. Then, the cells were resuspended in complete percoll solution (10.8 ml percoll (Cytiva), 12.5 ml wash medium, and 1.2 ml 10× HBSS per liver) and centrifuged at 50 ×g at 4°C for 10 minutes. After a single wash with 10 ml medium, cells were spun at 50 ×g at 4°C for 5 minutes and then used for downstream experiments.
Whole liver cell isolation from normal mice
Livers were digested by the two-step liberase perfusion as described above. Then, the undigested remaining tissue was transferred to a 1.5 ml tube, minced with surgical scissors, and further digested with 10× concentrated liberase (~ 430 μl/tube of 400 μg/ml in HBSS with 5 mM CaCl2) at 37°C for 30 minutes while vortexing the sample several times intermittently. The digested tissue was filtered with a 70 μm cell strainer and combined with the cell suspension digested previously. The cells were then centrifuged at 300 ×g at 4°C for 5 minutes. Then, the cells were suspended in 10 ml ACK lysis buffer (Quality Biological) and incubated on ice for 10 minutes to remove red blood cells. The cells were then collected by centrifugation at 300 ×g at 4°C for 5 minutes and used for downstream analyses.
Whole liver cell isolation from DDC-treated mice
Livers were digested by the two-step liberase perfusion as described above. Following perfusion, livers were submerged in 10 ml fresh HBSS with 5 mM CaCl2, 40 μg/ml liberase and 40 μg/ml DNaseI (Millipore) in a C-tube (Miltenyi) and further digested using a gentleMACS Octo dissociator (Miltenyi) with a heating unit using the “37C_m_LIDK_1” protocol. Dissociated tissue was diluted in flow buffer (HBSS, pH 7.4, supplemented with 25 mM HEPES (Thermo), 5 mM MgCl2 (MedSupply Partners), 1× Pen/Strep (Thermo), 1× Fungizone (Thermo), 1× NEAA (Thermo), 1× Glutamax (Thermo), 0.3% glucose (Sigma), and 1× sodium pyruvate (Thermo)) containing 40 μg/ml DNaseI (hereafter flow buffer(+)). Undigested tissue was removed by passing the suspension through a 70 μm cell strainer, and the cells were centrifuged at 300 ×g at 4°C for 5 minutes. Then, the cells were suspended in 10 ml ACK lysis buffer and incubated on ice for 10 minutes. The cells were then collected by centrifugation at 300 ×g at 4°C for 5 minutes and used for downstream analyses.
Flow cytometry
Cells were resuspended in 2–3 ml flow buffer(+), and filtered with a 35 μm cell strainer equipped with a FACS tube (BD). The cell suspension was then transferred to a round-bottom 96 well plate at 100–150 μl/well and centrifuged at ~800 ×g at 4°C for 1 minute with a slow brake. The cells were then resuspended in 100 μl/well of flow buffer(+) containing fluorophore-conjugated antibodies (Table S5) and incubated on ice for 20 minutes. After two washes in flow buffer(+) (150–200 μl/well, ~800 ×g at 4°C for 1 minute, slow brake), the cells were resuspended in flow buffer(+) containing 1/1,000× TO-PRO-3 (Thermo) and analyzed using an LSR II flow cytometer (BD).
Fluorescence-activated cell sorting (FACS) of DDC-treated whole liver cells
Cells were resuspended in 5 ml flow buffer(+) and centrifuged at 300 ×g at 4°C for 5 minutes. After removal of the supernatant, the volume was increased to 1–1.5 ml with flow buffer(+), and 1/100 volume of rat anti-Cd45, rat anti-Cd11b and rat anti-Cd31 antibodies (Table S5) were added and incubated on ice for 10 minutes. After washing with 2 ml flow buffer(+) (300 ×g at 4°C for 5 minutes, slow brake), the cells were resuspended in 5 ml flow buffer(+), 600 μl Dynabeads-anti-rat IgG (Thermo) were added, and the cell/bead mixture was incubated at 4°C for 30 minutes with gentle tilting and rotation. The suspension was transferred to 5 ml FACS tubes and placed on a DynaMag™-5 Magnet (Thermo) for 2 minutes. The supernatant was transferred to a new tube, and the cells were collected by centrifugation at 2,000 rpm (~800 ×g) at 4°C for 2 minutes with a slow brake. The cells were then resuspended in MACS buffer (PBS, 0.5% BSA, 2 mM EDTA) to a final volume of approximately 1.5 ml, and 150 μl CD326 (EpCAM) MicroBeads (Miltenyi) were added. After incubation at 4°C for 15 minutes, the cells were washed with an equal volume of MACS buffer and then centrifuged at 2,000 rpm (~800 ×g) at 4°C for 2 minutes with a slow brake. Cells were resuspended in 2 ml MACS buffer, and Epcam+ cells and Epcam- cells were separated using LS columns (Miltenyi) following the manufacturer’s instructions (4 columns were used per animal; 0.5 ml suspension/column). The cells were then collected by centrifugation at 2,000 rpm (~800 ×g) at 4°C for 2 minutes with a slow brake. The cells were then resuspended in 0.5–1 ml flow buffer(+), and approximately 10–15 μl of each were set aside for fluorescence-minus-one (FMO) controls (FMO-Brilliant Violet 421 (BV421): all stained except BV421-Cd24; and FMO-PE/Dazzle 594: all stained except PE/Dazzle594-Epcam), and stained in a 96 well round bottom plate as described earlier.
The cells to be used for FACS were stained with BV421-Cd24 (Biolegend), PE/Dazzle594-Epcam (Biolegend), PE/Cy7-Cd11b/Cd31/Cd45 (Table S5) at 1:100 dilution in 15 ml tubes on ice for 20 minutes. After washing in 2 ml flow buffer(+) once by centrifugation at 2,000 rpm (~ 800×g) at 4°C for 2 minutes with a slow brake, the cells were resuspended in 1–3 ml flow buffer(+) with 1/1,000 TO-PRO-3, and the cells were sorted on an Aria II sorter (BD).
Total RNA isolation and reverse transcription
Total RNA was extracted using the NucleoSpin RNA Kit (Takara) following the manufacturer’s instructions. Approximately 500 ng of RNA was reverse transcribed in 20 μl volume using High Capacity cDNA Reverse Transcription Kit (Thermo). cDNA was diluted at 1:20 ratio in water and used for qPCR.
qPCR
qPCR was performed at 10 μl/well using the Bio-Rad CFX 384 qPCR machine (Bio-Rad). Each well contained: 3 μl diluted cDNA, 0.25 μl each of 10 μM forward and reverse primers (Table S6), 1.5 μl H2O and 5 μl SsoAdvanced SYBR reagent (Bio-Rad).
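The per-well recipe can be checked and scaled to a master mix with a short calculation. The following is a minimal sketch rather than part of the published protocol: the function name `master_mix`, the 10% pipetting overage, and the choice to leave the cDNA out of the pooled mix are assumptions made for illustration.

```python
# Per-well volumes (in microliters) taken from the qPCR recipe in the text.
PER_WELL_UL = {
    "diluted cDNA": 3.0,
    "forward primer (10 uM)": 0.25,
    "reverse primer (10 uM)": 0.25,
    "H2O": 1.5,
    "SsoAdvanced SYBR reagent": 5.0,
}


def master_mix(n_wells, overage=0.10, exclude=("diluted cDNA",)):
    """Scale per-well volumes to n_wells plus a pipetting overage.

    The overage (default 10%) and the excluded components are assumptions,
    not values stated in the protocol.
    """
    factor = n_wells * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in PER_WELL_UL.items()
        if name not in exclude
    }


# Total reaction volume per well; should equal the stated 10 ul.
total_ul = sum(PER_WELL_UL.values())

# Final concentration of each primer in the well (stock is 10 uM).
final_primer_uM = 10.0 * PER_WELL_UL["forward primer (10 uM)"] / total_ul

if __name__ == "__main__":
    print(f"total volume per well: {total_ul} ul")
    print(f"final primer concentration: {final_primer_uM} uM each")
    print(master_mix(24))
```

The component volumes sum to the stated 10 μl, and each primer ends up at 0.25 μM in the reaction.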
RNA-Seq
Library preparation and sequencing were performed by Novogene (Sacramento, CA) using a NovaSeq 6000 (Illumina).
Bioinformatics for RNA-Seq
Reads were aligned to the mouse genome (GRCm39) using STAR aligner with default parameters51. Gene-count matrices were produced by featureCounts52. To compare gene expression between samples, expression levels were normalized based on the “median of ratios” method using DESeq2 (Love et al., 2014). To compare expression levels between different genes, “Transcripts per million (TPM)” normalization was performed using Salmon53. To build the gene sets for gene set enrichment analysis (GSEA), differential gene expression analysis was performed between hepatocytes and Rep_early cells using the median normalized data with the cut-off values of p.adj < 0.05 and |log2(fold-change)| ≥ 1. The generated gene sets are listed in Table S1.
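The gene-set cut-offs (p.adj < 0.05 and |log2(fold-change)| ≥ 1) amount to a simple filter over a differential expression table. The sketch below illustrates that filter only; it is not the authors' code. The record type `DEResult`, the function `build_gene_sets`, and the "up"/"down" labels are hypothetical, and which cell population serves as the numerator of the fold change is an assumption left to the caller.

```python
import math
from typing import NamedTuple, Optional


class DEResult(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log2fc: float
    padj: Optional[float]  # may be missing (None or NaN) for filtered genes


def build_gene_sets(results, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Split significant genes into 'up' and 'down' sets.

    A gene is kept when padj < padj_cutoff and |log2fc| >= lfc_cutoff.
    Genes with a missing padj are skipped.
    """
    up, down = [], []
    for r in results:
        if r.padj is None or math.isnan(r.padj) or r.padj >= padj_cutoff:
            continue
        if r.log2fc >= lfc_cutoff:
            up.append(r.gene)
        elif r.log2fc <= -lfc_cutoff:
            down.append(r.gene)
    return {"up": up, "down": down}


if __name__ == "__main__":
    # Toy values for illustration only; not data from this study.
    toy = [
        DEResult("geneA", 2.0, 0.001),
        DEResult("geneB", -1.0, 0.04),
        DEResult("geneC", 3.0, 0.2),
        DEResult("geneD", 0.5, 0.001),
        DEResult("geneE", 1.5, float("nan")),
    ]
    print(build_gene_sets(toy))
```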
Bioinformatics for single cell RNA-Seq
Data were obtained in our earlier study27 and deposited with the accession number GSE157698. Using the R Seurat package, Seurat objects were created with the arguments “min.cells = 3, min.genes = 200.” The cells were computationally filtered for YFP+ cells using “subset = nFeature_RNA > 200 & nFeature_RNA < 4000 & percent.mt < 0.25 & percent.yfp > 0.” Pseudotime analysis was performed using the monocle3 R package29. Briefly, the data were visualized with UMAP, and outlier cells were manually removed using the choose_cells function in monocle3. Then, a trajectory was generated using the cluster_cells and learn_graph functions, which calculated the pseudotime along the DDC-induced biliary reprogramming of hepatocytes. Each cell was assigned a pseudotime using the order_cells function. Gene expression changes along the pseudotime were visualized using the plot_genes_in_pseudotime function.
ATAC-Seq
50,000 cells were isolated from three AAV8-EV- or AAV8-HA-Sox4-P2A-Cre-injected (1 × 10^12 gc/mouse) liver samples at 4 dpi and used as input for ATAC-Seq library preparation. Libraries were prepared as described34 with minor modifications. Briefly, nuclei were isolated from the cells using a solution of 10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% IGEPAL CA-630. Immediately following isolation, the transposition reaction was conducted using Tn5 transposase (Diagenode) and TD buffer (Illumina) for 30 minutes at 37°C. Transposed DNA fragments were purified using a Qiagen MinElute Kit, barcoded, and PCR amplified for 7–9 cycles depending on the samples using NEBNext High Fidelity 2× PCR master mix (New England Biolabs). The optimal cycle number was determined empirically each time by qPCR. The libraries were then purified with AMPure XP beads. Paired-end 150 × 2 sequencing was performed by Novogene (Sacramento, CA) using a NovaSeq 6000 (Illumina).
Bioinformatics for ATAC-Seq
Reads were aligned to the mouse genome (mm10) using Bowtie2 (Langmead and Salzberg, 2012) with options “--very-sensitive -X 1000 --dovetail -1”, and duplicates were removed using Picard (http://broadinstitute.github.io/picard/). Peak calling was performed using MACS254 with an FDR of 0.01 (default setting). Motif analysis was performed using HOMER (http://homer.ucsd.edu/homer/motif/) with the option “-size 300 -mask”. Differential peak analysis was performed on triplicate samples (BAM files and peaks called independently for each replicate) using DiffBind55 and DESeq2. Differential peaks were then annotated to the nearest genes using the annotatePeak function of the ChIPSeeker R package56, and the expression of these genes during DDC-induced reprogramming was analyzed using the RNA-Seq data as described above (GSE218945). Using the genes enriched in the newly closed/opened regions, gene ontology analysis was performed using the “enrichGO” function in the clusterProfiler R package57. Analysis of the genomic distribution of the ATAC peaks was performed using the plotAnnoBar and plotDistToTSS functions in ChIPSeeker. For GSEA, the annotated genes were ranked based on the log-fold change between EV and Sox4-expressing hepatocytes calculated by DESeq2, and the ranked gene list was used as input for the fgsea R package (fast preranked GSEA)44. For visualization using the Integrative Genomics Viewer (IGV) track browser58 and deepTools (for generation of heatmaps)59, BAM files were converted to bigwig files using bamCoverage with the “reads per genome coverage (RPGC)” normalization method.
Footprinting analysis was performed on replicate-merged BAM files using TOBIAS35 with the default settings. For global footprinting analysis, all the footprints that were assigned “bound” in either EV_Hep or Sox4_Hep were used as the input for the analysis (default setting of the TOBIAS pipeline). The output “bindetect_results” file was imported to R for visualization. For visualization of the aggregate footprints, corrected bigwig signals were retrieved using the ScoreBed function and plotted using the ggplot2 R package. TOBIAS scores were retrieved using the “PlotAggregate” function with the option “--output-txt” and visualized using ggplot2.
CUT&RUN-Seq
CUT&RUN DNA was prepared following EpiCypher® CUTANA™ CUT&RUN Protocol v2.0 with minor modifications. Briefly, 500,000 cells were isolated from AAV8-HA-Sox4-P2A-Cre-injected (1 × 10^12 gc/mouse) livers at 18 hpi and 4 dpi (n = 3, each timepoint). The cells were washed in wash buffer (20 mM HEPES, pH 7.5, 150 mM NaCl (Sigma), 0.5 mM spermidine (Sigma) and EDTA-free protease inhibitors (Roche)) twice followed by centrifugation at 600 ×g at 4°C for 3 minutes. The cells were resuspended in 150 μl wash buffer, 15 μl Concanavalin A-coated magnetic beads (EpiCypher) were added, and the cell/bead conjugate was bound to a magnet. After removal of the supernatant, the cells were resuspended in 50 μl antibody reaction buffer (wash buffer with 5% digitonin and 0.5 M EDTA), 0.5 μl antibody was added (Table S5) and incubated at 4°C overnight. Following two washes in 200 μl permeabilization buffer (wash buffer with 5% digitonin), beads were resuspended in 50 μl permeabilization buffer. Then, 2.5 μl pAG-MNase (EpiCypher) was added, and the samples were incubated at RT for 10 minutes. While on the magnet, the supernatant was removed, and the samples were washed twice in 200 μl cold permeabilization buffer. Following resuspension in 50 μl permeabilization buffer, 1 μl 100 mM CaCl2 was added to activate MNase, and MNase digestion was performed at 4°C for 2 hours. The reaction was stopped by adding 33 μl STOP buffer (340 mM NaCl, 20 mM EDTA, 4 mM EGTA, 10 μg/ml RNase A, 50 μg/ml glycogen) and 1 μl E. coli spike-in DNA (0.5 ng/μl) (EpiCypher) to each tube. After incubation at 37°C for 10 minutes, beads were bound to the magnet, and the supernatant was transferred to a new tube. The DNA was cleaned up with the NEB Monarch kit (NEB), eluted in 12 μl elution buffer, and used as CUT&RUN DNA. Library preparation was performed using the NEBNext Ultra II End Prep kit (NEB) with a slightly modified protocol.
Briefly, after adaptor ligation, HA-Sox4 CUT&RUN DNA samples were purified with 1.75× AMPure XP beads (Beckman), while histone and IgG CUT&RUN samples were purified with 1.1× AMPure XP beads. 14-cycle PCR with index primers (NEB) was performed (initial denaturation: 1 cycle of 98 °C for 45 seconds; annealing/extension: 14 cycles of 98°C for 15 seconds and 60°C for 10 seconds; final extension: 72°C for 1 minute). Finally, the libraries were purified with AMPure XP beads (for HA-Sox4 0.8× followed by 1.2×; for histone and isotype control 0.9×). For samples with adaptor contamination, further AMPure bead cleanup or gel extraction was performed. Sequencing was performed using an Illumina NextSeq500 with a 150 cycle mid-output reagent kit (75-bp paired end) and an Illumina NextSeq2000 with 100 cycle S2 reagent kit (65-bp paired end).
Bioinformatics for Sox4 CUT&RUN-Seq
Sox4 and isotype control CUT&RUN-Seq data were obtained in two separate sequencing experiments. The 75-bp data obtained from the NextSeq500 experiment were first trimmed with Cutadapt (ver. 4.1) to shorten the reads to 65 bp so that they could be merged into single 65-bp fastq files. Reads were aligned to the mouse genome (mm10) using Bowtie2 with the option “-X 1000”, and duplicates were removed using Picard. For Sox4 differential peak analysis, peak calling was performed for each of the three biological replicates using MACS2 with each of the Sox4 BAM files and the combined isotype control BAM file for each timepoint. The FDR was set to 0.1 for peak calling of individual samples, and the called peaks showed robust enrichment of Sox binding motifs as confirmed by HOMER. This analysis generated 13,081 ± 5,151 and 28,160 ± 13,294 peaks for 18 hpi and 4 dpi samples, respectively. Using these peaks and BAM files for each replicate, differential peak analysis was performed using DiffBind and DESeq2 with the FDR cut-off set to 0.05 (default setting). By this analysis, we obtained 2,327 peaks enriched for 18 hpi and 3,136 peaks enriched for 4 dpi. For GSEA, all the Sox4 peaks were annotated to the nearest genes using the annotatePeak function of the ChIPSeeker R package (Yu et al., 2015), the annotated genes were ranked based on the log-fold change between 18 hpi and 4 dpi samples calculated by DESeq2, and the ranked gene list was used as input for the fgsea R package (fast preranked GSEA)44.
After confirmation of reproducibility across the three biological replicates by PCA mapping (Supplementary Fig. 5), we re-performed peak calling using MACS2 with replicate-merged Sox4 samples and the corresponding replicate-merged isotype control without FDR filtering, which generated 9,463 and 19,362 Sox4 peaks for the 18 hpi and 4 dpi samples, respectively. Motif analysis was performed using HOMER with the option “-size 300 -mask.” For visualization with IGV and deepTools, Sox4 BAM files were normalized by subtracting the isotype control signals using the bamCompare function.
Quantification of total Sox4 binding was performed by calculating scale factors for each Sox4 sample using the E. coli spike-in controls. Briefly, the reads were aligned to the E. coli genome (K12_MG1655) using Bowtie2, and the scale factors were calculated as the ratios of “(Number of mouse-aligned reads)/(Number of E. coli-aligned reads)”.
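The spike-in scale factor is a simple ratio of aligned read counts, which can be sketched as follows. This is an illustrative sketch only: the function name `spike_in_scale_factor`, the sample labels, and all read counts are hypothetical and are not values from this study.

```python
def spike_in_scale_factor(mouse_reads, ecoli_reads):
    """Return (mouse-aligned reads) / (E. coli-aligned reads) for one sample."""
    if ecoli_reads <= 0:
        raise ValueError("E. coli-aligned read count must be positive")
    return mouse_reads / ecoli_reads


if __name__ == "__main__":
    # Hypothetical aligned read counts per sample: (mouse, E. coli).
    samples = {
        "sample_1": (12_000_000, 60_000),
        "sample_2": (15_000_000, 30_000),
    }
    for name, (mouse, ecoli) in samples.items():
        print(name, spike_in_scale_factor(mouse, ecoli))
```

Because a fixed amount of spike-in DNA is added to every tube, a larger ratio indicates more mouse DNA recovered per unit of spike-in.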
Bioinformatics for histone post-translational modification CUT&RUN-Seq
CUT&RUN-Seq of histone post-translational modification was performed with either NextSeq500 (75-bp) or NextSeq2000 (65-bp) using the same samples as used for Sox4 CUT&RUN-Seq. Reads were aligned to the mouse genome (mm10) using Bowtie2 with the option “-X 1000,” and duplicates were removed using Picard. For visualization using IGV and deepTools, BAM files were converted to bigwig files using bamCoverage with the “read counts per million (CPM)” normalization method. Once we confirmed the reproducibility across the three biological replicates by PCA mapping (Supplementary Fig. 5), we combined the fastq files (75-bp data were shortened to 65-bp using Cutadapt) and obtained single bigwig files for each group. Unless otherwise mentioned, all the heatmaps and aggregate plots in this manuscript are shown for the replicate-merged data.
Immunocytochemistry of HA-Sox4 prior to CUT&RUN-Seq
After the overnight primary antibody reaction as described above, the cells were incubated with 1/300 AlexaFluor 594-conjugated anti-rabbit IgG (Invitrogen) diluted in permeabilization buffer at RT for 1 hour. Then, the cells were spread onto a 24 well plate for imaging.
Protein Expression and Purification
CMV-FLAG-Sox4 was produced using 10× 15 cm plates of 293T cells. Approximately 70% confluent plates were replenished with 15 ml/plate fresh DMEM (Thermo) supplemented with 2% FBS (Thermo) without antibiotics. For a 15 cm plate, 48 μg CMV-FLAG-Sox4 plasmid and 144 μl of 1 mg/ml PEI were mixed in 9 ml OptiMEM. After incubation at RT for 15 minutes, the plasmid/PEI complex was added to 293T cells in a dropwise manner, and the plates were gently shaken back and forth to mix the medium evenly. After incubation in a CO2 incubator for 2 days, the cells were harvested by standard trypsinization. After washing twice in 10 ml PBS by centrifugation at 800 ×g for 2 minutes, the cells were resuspended in lysis buffer (50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1 mM EDTA, 1% Triton X-100, 1× protease/phosphatase inhibitor (Pierce)) to a final volume of approximately 12 ml. After incubation at RT on a rotator for 30 minutes, the suspension was split into 300 μl aliquots and sonicated using the Bioruptor Plus Sonicator (Diagenode) with High Power for 5 cycles (30 seconds ON and 30 seconds OFF for 5 minutes per cycle). Sonicated samples were centrifuged at 4°C at 14,000 ×g for 15 minutes, and the supernatant was collected and combined into one tube. Small debris was further removed by passing the samples through a 0.45 μm PVDF filter (Millipore). An affinity chromatography column was prepared with ~0.6 ml anti-FLAG M2 beads (Sigma) following the manufacturer’s instructions. The lysate was loaded onto the column under gravity flow. The column was washed with 12 ml (~20× volume) TBS (50 mM Tris-HCl, 150 mM NaCl, pH 7.4), and FLAG-Sox4 protein was eluted by 5 rounds of competitive elution with 1× column volume (0.6 ml) of 100 μg/ml FLAG peptide (Sigma) in TBS. Eluted FLAG-Sox4 was concentrated to ~50 μl using 30K MWCO columns (Thermo). Finally, the buffer was exchanged with 20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mM DTT using 7K MWCO Pierce Zeba™ Desalt Spin Columns (Thermo).
FLAG-Sox4 concentration was estimated by quantifying the band densities corresponding to ~65 kDa by SDS-PAGE using BSA standards (Supplementary Fig. 6).
Nucleosome Preparation
The 162 bp LIN28B DNA fragment corresponds to the genomic location hg18-chr6: 105,638,004–105,638,165 and has the following sequence: AGTGGTATTAACATATCCTCAGTGGTGAGTATTAACATGGAACTTACTCCAACAATACAGATGCTGAATAAATGTAGTCTAAGTGAAGGAAGAAGGAAAGGTGGGAGCTGCCATCACTCAGAATTGTCCAGCAGGGATTGTGCAAGCTTGTGAATAAAGACA. The DNA fragment was generated by PCR with end-labeled primers: Cy5-LIN28B-Fw: 5Cy5/AGTGGTATTAACATATCCTCAGTGGTG; LIN28B-Rv: TGTCTTTATTCACAAGCTTGCACAA. The 162 bp fluorescently tagged DNA fragments were gel extracted. The nucleosomes were reconstituted by salt dilution. Briefly, a reaction mixture was prepared with 10–20 pM labeled DNA fragment and octamer at a 1:1 DNA:histone ratio, diluted to 10 μl in 10 mM Tris-HCl, pH 8.0, 2 M NaCl, and then incubated at RT for 30 minutes. Then, 3.5, 6.5, 13.5, and 46.5 μl of 10 mM Tris-HCl pH 8.0 were added at 30-minute intervals, which brought the reactions to 1.48, 1.0, 0.6 and 0.25 M NaCl, respectively (80 μl total volume). The mononucleosomes were further purified by a 10–30% glycerol gradient followed by dialysis with 10 mM Tris-HCl, pH 8.0, and 1 mM BME. The reconstituted nucleosomes were heat-shifted by incubating at 37°C for 30 minutes.
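The stepwise NaCl concentrations in the salt dilution follow from conservation of the starting amount of NaCl, since the added buffer contains no NaCl. The sketch below reproduces that arithmetic from the stated volumes; the function name `salt_dilution_steps` is hypothetical and the code is offered only as a check of the reported values.

```python
def salt_dilution_steps(start_volume_ul=10.0, start_nacl_molar=2.0,
                        additions_ul=(3.5, 6.5, 13.5, 46.5)):
    """Return (total volume in ul, NaCl concentration in M) after each addition.

    Assumes the diluent (10 mM Tris-HCl, pH 8.0) contributes no NaCl, so the
    amount of NaCl stays constant while the volume grows.
    """
    nacl_amount = start_volume_ul * start_nacl_molar  # in ul x M (constant)
    volume = start_volume_ul
    steps = []
    for added in additions_ul:
        volume += added
        steps.append((volume, nacl_amount / volume))
    return steps


if __name__ == "__main__":
    for volume, conc in salt_dilution_steps():
        print(f"{volume:5.1f} ul  ->  {conc:.2f} M NaCl")
```

With the stated additions, the calculation gives approximately 1.48, 1.0, 0.6 and 0.25 M NaCl at total volumes of 13.5, 20, 33.5 and 80 μl, matching the values reported in the text.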
Binding Reactions
The end-labeled oligonucleotides containing specific or non-specific sites, LIN28B-DNA, and LIN28B-nucleosomes were incubated with recombinant proteins in DNA-binding buffer (10 mM Tris-HCl (pH 7.5), 1 mM MgCl2, 1 mM DTT, 50 mM KCl, 0.3 mg/ml BSA, 5% glycerol) at RT for 30 minutes. Free and bound DNA were separated on 5% non-denaturing polyacrylamide gels run in 0.5× Tris–borate–EDTA and visualized on a PhosphorImager using the Cy5 fluorescence setting (excitation at 633 nm and emission filter 670 BP 30) and the high sensitivity setting.
Bioinformatics for ChIP-Seq
ChIP-Seq data were downloaded from the GEO database with the following accession numbers: GSE57559 for H2B and H345; GSE137066 for adult liver Hnf4a37; GSE167287 for colon epithelial cell Hnf4a46; GSE53736 for adult liver Rxra36; GSE29184 for adult mouse liver H3K4me360; and GSE111502 for Yap in DDC-treated hepatocytes41.
When biological replicates were available, the downloaded fastq files were first combined. Reads were aligned to the mouse genome (mm10) using Bowtie2 with the option “-X 1000,” and duplicates were removed using Picard. For visualization using the IGV and deepTools, BAM files were converted to bigwig files using the bamCoverage with the CPM normalization method. Peak calling was performed using MACS254 with an FDR of 0.01 (default setting) without control inputs for H2B, H3 and H3K4me3 and with control inputs for liver/colon Hnf4a and Rxra.
The bigwig file for the Yap-ChIP-Seq data was directly downloaded from GEO and used for IGV visualization.
Bioinformatics for MNase-Seq
Low- and high-level MNase-Seq data were downloaded from GEO with the accession number GSE5755945. Reads were aligned to the mouse genome (mm10) using Bowtie2 with the option “-X 1000,” and duplicates were removed using Picard. BAM files of biological replicates were combined using “samtools merge,” and inputted into DANPOS361 to calculate the nucleosome occupancy. For visualization using the IGV and deepTools, the DANPOS3-generated BAM files were converted to bigwig files using the bamCoverage with the RPGC normalization method.
Generation of the list of active liver enhancer loci
Liver-specific active enhancers in this study are defined as genomic regions that are p300+ H3K4me1+ H3K4me3- H3K27ac+ and DNase hypersensitive (DHS). To obtain the list of these regions, we used a previously published adult mouse liver-specific enhancer list that was identified by Shen and colleagues based on ChIP-Seq data of p300, H3K4me1, and H3K4me3 (ENCODE, GSE29184) (Shen et al., 2012). Since H3K27ac predominantly marks active enhancers, we filtered these enhancers so that their central 1 kb regions have 300 or more base pairs of overlap with adult mouse liver H3K27ac peaks that were identified in the same study (ENCODE, GSE29184)60. To retain DHS-positive adult liver-specific enhancers, we further filtered them so that their central 1 kb regions have 300 or more base pairs of overlap with adult mouse liver DHS peaks (ENCODE, GSM1014195).
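The two overlap filters can be expressed as an interval computation on the central 1 kb of each enhancer. The sketch below is a minimal illustration under stated assumptions, not the pipeline used in this study: coordinates are taken to be 0-based and half-open (BED-style), the 300 bp requirement is evaluated against individual peaks rather than summed across peaks, and all function names and toy coordinates are hypothetical.

```python
def central_window(start, end, width=1000):
    """Return the central `width` bp of a region as (start, end)."""
    mid = (start + end) // 2
    half = width // 2
    return max(0, mid - half), mid + half


def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of base pairs shared by two half-open intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def has_overlap(enhancer, peaks, min_overlap=300):
    """True if the enhancer's central 1 kb overlaps any single peak by >= min_overlap bp."""
    chrom, start, end = enhancer
    w_start, w_end = central_window(start, end)
    return any(
        p_chrom == chrom and overlap_bp(w_start, w_end, p_start, p_end) >= min_overlap
        for p_chrom, p_start, p_end in peaks
    )


def filter_active_enhancers(enhancers, h3k27ac_peaks, dhs_peaks):
    """Keep enhancers that pass both the H3K27ac and the DHS overlap filters."""
    return [
        e for e in enhancers
        if has_overlap(e, h3k27ac_peaks) and has_overlap(e, dhs_peaks)
    ]


if __name__ == "__main__":
    # Toy coordinates for illustration only.
    enhancers = [("chr1", 1000, 3000), ("chr1", 10000, 12000)]
    h3k27ac = [("chr1", 2000, 2400), ("chr1", 10600, 11200)]
    dhs = [("chr1", 1400, 1800), ("chr1", 11300, 11700)]
    print(filter_active_enhancers(enhancers, h3k27ac, dhs))
```

A linear scan over peaks is used here for clarity; for genome-scale peak lists, an interval tree or a tool such as bedtools would be the more practical choice.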
Generating a list of active liver promoter loci
Active promoters in this study are defined as H3K4me3+ transcription start sites (TSSs) in the adult liver in the proximity of highly expressed genes. H3K4me3+ regions were defined as H3K4me3 ChIP-Seq peaks (GSE29184) with 0.5 kb extension bilaterally. Highly expressed genes in hepatocytes were defined as the genes ranked within the top 25% in normal hepatocytes using the TPM-normalized RNA-Seq data obtained above (GSE218947). The H3K4me3+ regions were annotated with the nearest genes (output of MACS2), and then filtered with the list of hepatocyte-highly expressed genes. TSSs included in these regions were regarded as active liver promoter loci (n = 13,458).
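Selecting the "highly expressed" genes amounts to ranking genes by TPM and keeping the top quarter. A minimal sketch of that step is shown below; the function name `top_fraction_genes`, the handling of ties (resolved by sort order), and the toy TPM values are assumptions made for illustration, and whether genes with zero expression were included in the ranking is not specified in the text.

```python
def top_fraction_genes(tpm_by_gene, fraction=0.25):
    """Return the set of genes ranked within the top `fraction` by TPM."""
    ranked = sorted(tpm_by_gene, key=tpm_by_gene.get, reverse=True)
    n_keep = int(len(ranked) * fraction)
    return set(ranked[:n_keep])


if __name__ == "__main__":
    # Toy TPM values for illustration only.
    tpm = {f"gene{i}": float(i) for i in range(1, 9)}
    print(sorted(top_fraction_genes(tpm)))
```

The resulting gene set would then be intersected with the genes annotated to the H3K4me3+ regions to obtain the active promoter loci.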
Bioinformatics for microarray of YAPS127A-expressing hepatocytes
Microarray data of YAPS127A-expressing mouse hepatocytes were downloaded from GEO with the accession number GSE5556026. We used a pipeline provided by Klaus and Reisenauer (https://bioconductor.org/packages/release/workflows/vignettes/maEndToEnd/inst/doc/MA-Workflow.html). Briefly, the “robust multichip average (RMA)” algorithm was used for background correction, quantile normalization and data summarization using the oligo R package. The probes were filtered with a cut-off median signal intensity > 4.
Acknowledgements
This work was supported by NIH grants R01DK083355 and R01GM36477, the Fred and Suzanne Biesecker Pediatric Liver Center, the Abramson Family Cancer Research Institute, The International Medical Research Foundation, The Daiichi Sankyo Foundation of Life Science, The Mochida Memorial Foundation for Medical and Pharmaceutical Research, The Mitsukoshi Health and Welfare Foundation, The Uehara Memorial Foundation, The Kanae Foundation, The Japanese Biochemical Society, and the Osamu Hayaishi Memorial Scholarship for study abroad. We thank the Penn Center for Molecular Studies in Digestive and Liver Disease (P30DK050306) for assistance with tissue processing, Véronique Lefebvre and Rajan Jain for helpful discussions and comments on the manuscript, and members of the Stanger and Zaret labs for useful suggestions.
Footnotes
Competing interests: The authors declare no competing interests.
References
- 1.Giroux V. & Rustgi A. K. Metaplasia: tissue injury adaptation and a precursor to the dysplasia-cancer sequence. Nat Rev Cancer 17, 594–604, doi: 10.1038/nrc.2017.68 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Merrell A. J. & Stanger B. Z. Adult cell plasticity in vivo: de-differentiation and transdifferentiation are back in style. Nat Rev Mol Cell Biol 17, 413–425, doi: 10.1038/nrm.2016.24 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Kopp J. L. et al. Identification of Sox9-dependent acinar-to-ductal reprogramming as the principal mechanism for initiation of pancreatic ductal adenocarcinoma. Cancer Cell 22, 737–750, doi: 10.1016/j.ccr.2012.10.025 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Prevot P. P. et al. Role of the ductal transcription factors HNF6 and Sox9 in pancreatic acinar-to-ductal metaplasia. Gut 61, 1723–1732, doi: 10.1136/gutjnl-2011-300266 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Reichert M. et al. The Prrx1 homeodomain transcription factor plays a central role in pancreatic regeneration and carcinogenesis. Genes Dev 27, 288–300, doi: 10.1101/gad.204453.112 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Vercauteren Drubbel A. et al. Reactivation of the Hedgehog pathway in esophageal progenitors turns on an embryonic-like program to initiate columnar metaplasia. Cell Stem Cell 28, 1411–1427 e1417, doi: 10.1016/j.stem.2021.03.019 (2021). [DOI] [PubMed] [Google Scholar]
- 7.Jiang M. et al. Transitional basal cells at the squamous-columnar junction generate Barrett’s oesophagus. Nature 550, 529–533, doi: 10.1038/nature24269 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Regalo G. & Leutz A. Hacking cell differentiation: transcriptional rerouting in reprogramming, lineage infidelity and metaplasia. EMBO Mol Med 5, 1154–1164, doi: 10.1002/emmm.201302834 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Zaret K. S. Pioneer Transcription Factors Initiating Gene Network Changes. Annu Rev Genet 54, 367–385, doi: 10.1146/annurev-genet-030220-015007 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Sekiya T. & Zaret K. S. Repression by Groucho/TLE/Grg proteins: genomic site recruitment generates compacted chromatin in vitro and impairs activator binding in vivo. Mol Cell 28, 291–303, doi: 10.1016/j.molcel.2007.10.002 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Balsalobre A. & Drouin J. Pioneer factors as master regulators of the epigenome and cell fate. Nat Rev Mol Cell Biol 23, 449–464, doi: 10.1038/s41580-022-00464-z (2022). [DOI] [PubMed] [Google Scholar]
- 12.Sunkel B. D. & Stanton B. Z. Pioneer factors in development and cancer. iScience 24, 103132, doi: 10.1016/j.isci.2021.103132 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Chronis C. et al. Cooperative Binding of Transcription Factors Orchestrates Reprogramming. Cell 168, 442–459 e420, doi: 10.1016/j.cell.2016.12.016 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Sardina J. L. et al. Transcription Factors Drive Tet2-Mediated Enhancer Demethylation to Reprogram Cell Fate. Cell Stem Cell 23, 727–741 e729, doi: 10.1016/j.stem.2018.08.016 (2018). [DOI] [PubMed] [Google Scholar]
- 15.Soufi A., Donahue G. & Zaret K. S. Facilitators and impediments of the pluripotency reprogramming factors’ initial engagement with the genome. Cell 151, 994–1004, doi: 10.1016/j.cell.2012.09.045 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Soufi A. et al. Pioneer transcription factors target partial DNA motifs on nucleosomes to initiate reprogramming. Cell 161, 555–568, doi: 10.1016/j.cell.2015.03.017 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Horisawa K. et al. The Dynamics of Transcriptional Activation by Hepatic Reprogramming Factors. Mol Cell 79, 660–676 e668, doi: 10.1016/j.molcel.2020.07.012 (2020). [DOI] [PubMed] [Google Scholar]
- 18.Matsuda T. et al. Pioneer Factor NeuroD1 Rearranges Transcriptional and Epigenetic Profiles to Execute Microglia-Neuron Conversion. Neuron 101, 472–485 e477, doi: 10.1016/j.neuron.2018.12.010 (2019). [DOI] [PubMed] [Google Scholar]
- 19.Wapinski O. L. et al. Hierarchical mechanisms for direct reprogramming of fibroblasts to neurons. Cell 155, 621–635, doi: 10.1016/j.cell.2013.09.028 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Chen J. et al. Hierarchical Oct4 Binding in Concert with Primed Epigenetic Rearrangements during Somatic Cell Reprogramming. Cell Rep 14, 1540–1554, doi: 10.1016/j.celrep.2016.01.013 (2016).
- 21.Respuela P. et al. Foxd3 Promotes Exit from Naive Pluripotency through Enhancer Decommissioning and Inhibits Germline Specification. Cell Stem Cell 18, 118–133, doi: 10.1016/j.stem.2015.09.010 (2016).
- 22.Schnappauf O. et al. Enhancer decommissioning by Snail1-induced competitive displacement of TCF7L2 and down-regulation of transcriptional activators results in EPHB2 silencing. Biochim Biophys Acta 1859, 1353–1367, doi: 10.1016/j.bbagrm.2016.08.002 (2016).
- 23.Schaub J. R. et al. De novo formation of the biliary system by TGFbeta-mediated hepatocyte transdifferentiation. Nature 557, 247–251, doi: 10.1038/s41586-018-0075-5 (2018).
- 24.Tarlow B. D. et al. Bipotential adult liver progenitors are derived from chronically injured mature hepatocytes. Cell Stem Cell 15, 605–618, doi: 10.1016/j.stem.2014.09.008 (2014).
- 25.Yanger K. et al. Robust cellular reprogramming occurs spontaneously during liver regeneration. Genes Dev 27, 719–724, doi: 10.1101/gad.207803.112 (2013).
- 26.Yimlamai D. et al. Hippo pathway activity influences liver cell fate. Cell 157, 1324–1338, doi: 10.1016/j.cell.2014.03.060 (2014).
- 27.Merrell A. J. et al. Dynamic Transcriptional and Epigenetic Changes Drive Cellular Plasticity in the Liver. Hepatology 74, 444–457, doi: 10.1002/hep.31704 (2021).
- 28.Fuglerud B. M. et al. SOX9 reprograms endothelial cells by altering the chromatin landscape. Nucleic Acids Res 50, 8547–8565, doi: 10.1093/nar/gkac652 (2022).
- 29.Cao J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502, doi: 10.1038/s41586-019-0969-x (2019).
- 30.Font-Burgada J. et al. Hybrid Periportal Hepatocytes Regenerate the Injured Liver without Giving Rise to Cancer. Cell 162, 766–779, doi: 10.1016/j.cell.2015.07.026 (2015).
- 31.Katsuda T. et al. Rapid in vivo multiplexed editing (RIME) of the adult mouse liver. Hepatology, doi: 10.1002/hep.32759 (2022).
- 32.Sato K. et al. Ductular Reaction in Liver Diseases: Pathological Mechanisms and Translational Significances. Hepatology 69, 420–430, doi: 10.1002/hep.30150 (2019).
- 33.Love M. I., Huber W. & Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550, doi: 10.1186/s13059-014-0550-8 (2014).
- 34.Buenrostro J. D., Giresi P. G., Zaba L. C., Chang H. Y. & Greenleaf W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10, 1213–1218, doi: 10.1038/nmeth.2688 (2013).
- 35.Bentsen M. et al. ATAC-seq footprinting unravels kinetics of transcription factor binding during zygotic genome activation. Nat Commun 11, 4267, doi: 10.1038/s41467-020-18035-1 (2020).
- 36.He Y. et al. The role of retinoic acid in hepatic lipid homeostasis defined by genomic binding and transcriptome profiling. BMC Genomics 14, 575, doi: 10.1186/1471-2164-14-575 (2013).
- 37.Karagianni P., Moulos P., Schmidt D., Odom D. T. & Talianidis I. Bookmarking by Non-pioneer Transcription Factors during Liver Development Establishes Competence for Future Gene Activation. Cell Rep 30, 1319–1328 e1316, doi: 10.1016/j.celrep.2020.01.006 (2020).
- 38.Hayhurst G. P., Lee Y. H., Lambert G., Ward J. M. & Gonzalez F. J. Hepatocyte nuclear factor 4alpha (nuclear receptor 2A1) is essential for maintenance of hepatic gene expression and lipid homeostasis. Mol Cell Biol 21, 1393–1403, doi: 10.1128/MCB.21.4.1393-1403.2001 (2001).
- 39.Li J., Ning G. & Duncan S. A. Mammalian hepatocyte differentiation requires the transcription factor HNF-4alpha. Genes Dev 14, 464–474 (2000).
- 40.Cai Y. et al. The role of hepatocyte RXR alpha in xenobiotic-sensing nuclear receptor-mediated pathways. Eur J Pharm Sci 15, 89–96, doi: 10.1016/s0928-0987(01)00211-1 (2002).
- 41.Li W. et al. A Homeostatic Arid1a-Dependent Permissive Chromatin State Licenses Hepatocyte Responsiveness to Liver-Injury-Associated YAP Signaling. Cell Stem Cell 25, 54–68 e55, doi: 10.1016/j.stem.2019.06.008 (2019).
- 42.Meers M. P., Bryson T. D., Henikoff J. G. & Henikoff S. Improved CUT&RUN chromatin profiling tools. Elife 8, doi: 10.7554/eLife.46314 (2019).
- 43.Skene P. J. & Henikoff S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. Elife 6, doi: 10.7554/eLife.21856 (2017).
- 44.Korotkevich G. et al. Fast gene set enrichment analysis. bioRxiv, doi: 10.1101/060012 (2021).
- 45.Iwafuchi-Doi M. et al. The Pioneer Transcription Factor FoxA Maintains an Accessible Nucleosome Configuration at Enhancers for Tissue-Specific Gene Activation. Mol Cell 62, 79–91, doi: 10.1016/j.molcel.2016.03.001 (2016).
- 46.Gu W. et al. SATB2 preserves colon stem cell identity and mediates ileum-colon conversion via enhancer remodeling. Cell Stem Cell 29, 101–115 e110, doi: 10.1016/j.stem.2021.09.004 (2022).
- 47.Biagioni F. et al. Decoding YAP dependent transcription in the liver. Nucleic Acids Res 50, 7959–7971, doi: 10.1093/nar/gkac624 (2022).
- 48.Hu S. et al. Transcription factor antagonism regulates heterogeneity in embryonic stem cell states. Mol Cell 82, 4410–4427 e4412, doi: 10.1016/j.molcel.2022.10.022 (2022).
- 49.Thompson J. J. et al. Extensive co-binding and rapid redistribution of NANOG and GATA6 during emergence of divergent lineages. Nat Commun 13, 4257, doi: 10.1038/s41467-022-31938-5 (2022).
- 50.Platt R. J. et al. CRISPR-Cas9 knockin mice for genome editing and cancer modeling. Cell 159, 440–455, doi: 10.1016/j.cell.2014.09.014 (2014).
- 51.Dobin A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21, doi: 10.1093/bioinformatics/bts635 (2013).
- 52.Liao Y., Smyth G. K. & Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930, doi: 10.1093/bioinformatics/btt656 (2014).
- 53.Patro R., Duggal G., Love M. I., Irizarry R. A. & Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 14, 417–419, doi: 10.1038/nmeth.4197 (2017).
- 54.Feng J., Liu T., Qin B., Zhang Y. & Liu X. S. Identifying ChIP-seq enrichment using MACS. Nat Protoc 7, 1728–1740, doi: 10.1038/nprot.2012.101 (2012).
- 55.Ross-Innes C. S. et al. Differential oestrogen receptor binding is associated with clinical outcome in breast cancer. Nature 481, 389–393, doi: 10.1038/nature10730 (2012).
- 56.Yu G., Wang L. G. & He Q. Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics 31, 2382–2383, doi: 10.1093/bioinformatics/btv145 (2015).
- 57.Yu G., Wang L. G., Han Y. & He Q. Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287, doi: 10.1089/omi.2011.0118 (2012).
- 58.Robinson J. T. et al. Integrative genomics viewer. Nat Biotechnol 29, 24–26, doi: 10.1038/nbt.1754 (2011).
- 59.Ramirez F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44, W160–165, doi: 10.1093/nar/gkw257 (2016).
- 60.Shen Y. et al. A map of the cis-regulatory sequences in the mouse genome. Nature 488, 116–120, doi: 10.1038/nature11243 (2012).
- 61.Chen K. et al. DANPOS: dynamic analysis of nucleosome position and occupancy by sequencing. Genome Res 23, 341–351, doi: 10.1101/gr.142067.112 (2013).
Data Availability Statement
Data generated during the study have been deposited in the Gene Expression Omnibus (GEO) under accession number GSE221225 (SuperSeries), with SubSeries accession numbers GSE218945 and GSE218947 (RNA-Seq); GSE219052 (ATAC-Seq); and GSE221223 and GSE221224 (CUT&RUN-Seq). Detailed scripts and parameters used for each step of the analysis are available from the authors upon reasonable request.