In this study, Sussman et al. identify that the histone methyltransferase NSD1 and demethylase KDM2A reciprocally regulate H3K36 methylation during distinct stages of hepatobiliary reprogramming. The work highlights how epigenetic regulators and histone PTMs function concertedly to control cell fate switching during metaplasia and injury response in the liver.
Keywords: epigenetics, CRISPR screening, histone modifier, metaplasia, reprogramming, hepatocyte, cellular plasticity
Abstract
Following prolonged liver injury, a small fraction of hepatocytes undergoes reprogramming to become cholangiocytes or biliary epithelial cells (BECs). This physiological process involves chromatin and transcriptional remodeling, but the epigenetic mediators are largely unknown. Here, we exploited a lineage-traced model of liver injury to investigate the role of histone post-translational modification in biliary reprogramming. Using mass spectrometry, we defined the repertoire of histone marks that are globally altered in quantity during reprogramming. Next, applying an in vivo CRISPR screening approach, we identified seven histone-modifying enzymes that alter the efficiency of hepatobiliary reprogramming. Among these, the histone methyltransferase and demethylase Nsd1 and Kdm2a were found to have reciprocal effects on H3K36 methylation that regulated the early and late stages of reprogramming, respectively. Although loss of Nsd1 and Kdm2a affected reprogramming efficiency, cells ultimately acquired the same transcriptomic states. These findings reveal that multiple chromatin regulators exert dynamic and complementary activities to achieve robust cell fate switching, serving as a model for the cell identity changes that occur in various forms of physiological metaplasia or reprogramming.
Metaplasia is a physiological process by which one differentiated cell type transforms into another, typically as an adaptive response to chronic injury, inflammation, or environmental stress. It has been demonstrated in a variety of organs including the pancreas (acinar–ductal metaplasia), esophagus (intestinal metaplasia), and lungs (squamous metaplasia) (Merrell and Stanger 2016; Giroux and Rustgi 2017). In the liver, an organ with a remarkable regenerative capacity, hepatobiliary metaplasia is a conserved in vivo response to liver injury whereby hepatocytes are reprogrammed to cholangiocytes or biliary epithelial cells (BECs) following exposure to a variety of toxins (Michalopoulos et al. 2005; Yanger et al. 2013; Tarlow et al. 2014; Merrell et al. 2021). Liver injury activates multiple signaling pathways such as YAP, Notch, and TGFβ, which also activate biliary differentiation in the embryo (Zhang et al. 2010; Yimlamai et al. 2014; Wu et al. 2017; Schaub et al. 2018). Affected hepatocytes acquire transcriptional and epigenetic features of BECs and become incorporated into nascent bile ducts. In mice with fully developed biliary systems, this process is transitory and reversible upon removal of the injury stimulus (Tarlow et al. 2014; Kamimoto et al. 2016). However, in a mouse model of Alagille syndrome, in which bile ducts fail to form during development, hepatocytes are reprogrammed into mature cholangiocytes and form functional bile ducts (Schaub et al. 2018).
We have recently shown that hepatobiliary metaplasia is initiated by the pioneer transcription factor SOX4 (Katsuda et al. 2024a,b). Early in the reprogramming process, SOX4 binding alters chromatin accessibility and enhancer landscapes to repress hepatocyte genes and induce biliary genes. Moreover, SOX4 binding was also associated with the de novo deposition of both active and repressive chromatin marks, suggesting that additional histone modifications contribute to hepatobiliary metaplasia (Katsuda et al. 2024a). However, subsequent events in the reprogramming process and their molecular mediators are poorly understood.
Epigenetic modifiers, acting in conjunction with transcription factors, can enact broad changes in cell phenotype that occur during normal development and disease states such as cancer (Nieto et al. 2016; Edwards et al. 2017; Schuettengruber et al. 2017). Here, we hypothesized that histone post-translational modifications (PTMs) contribute to hepatobiliary reprogramming. To identify the relevant epigenetic machinery, we used adeno-associated viruses (AAVs) to conduct a hepatocyte-specific in vivo CRISPR screen of selected histone-modifying enzymes. We found that a complex network of epigenetic modifiers regulates discrete steps during hepatobiliary metaplasia, with different factors necessary for early and late reprogramming stages.
Results
Cholestatic injury induces differential deposition of multiple histone marks
Following exposure to the toxin 3,5-diethoxycarbonyl-1,4-dihydrocollidine (DDC), adult mouse hepatocytes undergo hepatobiliary metaplasia (reprogramming), a series of stepwise phenotypic transitions that can be followed using the lineage-traced Rosa26-LSL-EYFP mouse model (Srinivas et al. 2001). To identify histone marks altered during the reprogramming process, we used histone mass spectrometry to quantitate the abundance of selected histone H3 and H4 PTMs associated with different cell states. To this end, we performed fluorescence-activated cell sorting (FACS) from dissociated livers to isolate cells at different stages of the reprogramming continuum. Using markers that we previously validated for this purpose (Katsuda et al. 2024a), we isolated the following populations (Fig. 1A): (1) YFP+ healthy hepatocytes from mice fed a normal diet; (2) YFP+EpCAM− hepatocyte-derived cells from mice fed a DDC diet, representing a mixture of cells at early and intermediate stages of reprogramming; and (3) EpCAM+ cells from DDC-fed mice, which include both hepatocyte-derived cells at nearly terminal stages of biliary reprogramming and DDC-injured BECs. Early and intermediate reprogrammed cells were combined, as was the whole EpCAM+ fraction, to obtain sufficient cells (>1 million) for analysis. We then quantified the global levels of 43 histone H3 and H4 marks in each of those populations (Fig. 1A,B; Supplemental Tables S1, S2).
Figure 1.
Histone mass spectrometry nominates hepatobiliary reprogramming-associated epigenetic changes. (A) Schematic representation of histone mass spectrometry experiment to identify H3 and H4 histone marks differentially modified between healthy hepatocytes, injured hepatocytes undergoing the early stages of reprogramming, and EpCAM+ biliary epithelial cells (BECs)/late reprogrammed cells. (B) Representative flow cytometry gating strategy demonstrating selection of normal hepatocytes (left) and different stages of reprogrammed cells upon DDC-induced injury (right). (C) Summary of histone mass spectrometry results. Histone marks with a P-value < 0.05 between any two groups are highlighted (using a two-sided Student's t-test). (D) Individual plots of significant candidate histone marks. Related marks are grouped by dashed boxes. Marks selected for further study are indicated in red. (E) Individual plots of H3K27 histone marks and H3K36 histone marks including canonical and noncanonical H3.1 and H3.3 variants.
A total of 15 histone marks met our criteria for statistical significance (P < 0.05 by t-test between any of the three pairs) (Fig. 1C,D). H3.3K27me2, H3.3K36me2, and H4K20me2 all increased during the early stages of reprogramming and remained elevated in EpCAM+ cells. Although two of these marks involved the noncanonical histone variant H3.3, we did not observe significant differences in the canonical H3.1 histone variant for those two marks (Fig. 1E). The opposite trend was found in H3.3K27ac, H3K56me2, and H4K20me3, which decreased during the early stages of reprogramming and remained reduced in EpCAM+ cells. Finally, changes in the abundance of H3K9me3, H3K56me3, and H4K16ac exhibited more complex patterns (Fig. 1D).
In vivo AAV CRISPR screening with molecular inversion probe sequencing
Having nominated a series of histone marks with potential roles in hepatobiliary reprogramming, we next sought to understand their functional contributions. To this end, we used an AAV-based in vivo CRISPR screening strategy (Wang et al. 2018; Ye et al. 2019; Santinha et al. 2023) to probe epigenetic modifiers governing the candidate histone marks identified by mass spectrometry. We constructed an AAV plasmid library composed of single-guide RNAs (sgRNAs) targeting candidate epigenetic modifiers, along with Cre recombinase, that could then be delivered via retro-orbital injection into Rosa-LSL-Cas9-EGFP knock-in (Cas9-EGFP) mice (Platt et al. 2014; Katsuda et al. 2023). Hepatocyte-specific activity was ensured by the use of AAV serotype 8 (AAV8), which has a strong tropism for hepatocytes (Yanger et al. 2013), and the use of the hepatocyte-specific thyroxine binding globulin (TBG) promoter to drive Cre expression. Cre recombination activates the expression of Cas9 and EGFP, enabling simultaneous Cas9-mediated gene editing and lineage marking of transduced hepatocytes (analogous to the role of YFP in the cell isolations described above) (Fig. 2A). Because the recombinant AAV genome rarely integrates into the host cell genome (Kiourtis et al. 2021), AAV sequences do not persist. Thus, in contrast to standard (i.e., retroviral) CRISPR screening approaches, sgRNA enrichment or depletion cannot be used as a readout for gene-editing events. To overcome this limitation, we used molecular inversion probe sequencing (MIP-seq) to measure variant frequencies in the immediate region surrounding the expected cut site of the sgRNA as a readout (Supplemental Fig. S1A–C; Cantsilieris et al. 2017). Using this targeted sequencing approach, the frequency of insertion and deletion mutations (indels) can be calculated, which can indicate the likelihood of a gene being necessary for, or inhibitory to, the reprogramming process.
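The indel readout described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes that reads have already been aligned to the reference amplicon, that each alignment is supplied as a pair of equal-length gapped strings (dashes marking gaps, as in Fig. 2C), and that the expected cut site lies 3 bp upstream of the PAM, as is typical for SpCas9. The function names, the window size, and the toy sequences are hypothetical.

```python
def has_indel_near_cut(ref_aln, read_aln, cut_pos, window=5):
    """Return True if the alignment has a gap within `window` reference
    bases of the expected cut site.

    ref_aln / read_aln: equal-length gapped alignment strings.
    A '-' in read_aln is a deletion; a '-' in ref_aln is an insertion.
    cut_pos: 0-based reference coordinate of the base just 3' of the cut.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("alignment strings must have equal length")
    ref_pos = 0  # number of reference bases consumed so far
    for ref_base, read_base in zip(ref_aln, read_aln):
        is_gap = ref_base == "-" or read_base == "-"
        if is_gap and abs(ref_pos - cut_pos) <= window:
            return True
        if ref_base != "-":
            ref_pos += 1
    return False


def indel_frequency(alignments, cut_pos, window=5):
    """Fraction of aligned reads carrying an indel near the cut site."""
    if not alignments:
        return 0.0
    edited = sum(has_indel_near_cut(r, q, cut_pos, window) for r, q in alignments)
    return edited / len(alignments)


# Hypothetical amplicon: 2 bp flank, 20 bp protospacer, 3 bp PAM, 2 bp flank.
FLANK5, PROTOSPACER, PAM, FLANK3 = "TT", "ACGTACGTACGTACGTACGT", "AGG", "CC"
REF = FLANK5 + PROTOSPACER + PAM + FLANK3
CUT_POS = len(FLANK5) + len(PROTOSPACER) - 3  # cut 3 bp upstream of the PAM

toy_alignments = [
    (REF, REF),                                              # unedited read
    (REF, REF[:18] + "--" + REF[20:]),                       # 2 bp deletion at the cut
    (REF[:19] + "-" + REF[19:], REF[:19] + "A" + REF[19:]),  # 1 bp insertion at the cut
    (REF, "-" + REF[1:]),                                    # gap far from the cut, ignored
]
toy_frequency = indel_frequency(toy_alignments, CUT_POS)
print(f"indel frequency near cut site: {toy_frequency:.2f}")
```

Restricting the count to a window around the cut site is one way to keep sequencing or alignment artifacts elsewhere in the amplicon from inflating the apparent editing rate. Substitutions are deliberately not counted in this sketch.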
Figure 2.
In vivo CRISPR screen identifying epigenetic mediators regulating the kinetics of biliary reprogramming. (A) Schematic representation of in vivo CRISPR screening strategy. A library of 151 hepatocyte-specific AAV8–sgRNAs targeting 38 genes was injected into mice challenged with DDC-induced cholestatic injury. Healthy hepatocytes, early reprogrammed cells, late reprogrammed cells, and DDC-injured biliary epithelial cells were isolated, and mutations in the genomic regions targeted by the sgRNAs were quantified using molecular inversion probes followed by next-generation sequencing. (B) List of genes evaluated in the CRISPR screen, including writers and erasers of 11 histone marks and eight control genes. (C) Representative sequences of molecular inversion probe sequencing readouts, with the 20 bp sgRNA sequence and protospacer-adjacent motif (PAM) sequence highlighted. Deletions are indicated by dashes. (D) Summary of in vivo CRISPR screening results showing the log2 fold change and P-value between the variant rates of late reprogrammed and early reprogrammed cells at the secondary time point. Significance was assessed using Welch's two-sided t-test, and significantly enriched or depleted sgRNAs (P < 0.05, |log2 FC| > 0.667) are highlighted in red and blue, respectively. (E) Representative sgRNA result from each of the eight histone modifiers identified as significantly enriched or depleted (Ash1l, Crebbp, Kat8, Kdm2a, Kmt5b, Nsd1, Phf2, and Rsbn1) in each experimental condition. Significance was assessed using Welch's two-sided t-test.
Following a literature search, we selected 30 regulatory enzymes (including 16 writers and 14 erasers) that have been reported to regulate 11 of the significantly altered histone marks identified via mass spectrometry. We excluded H3K9me3 due to its minimal fold change, as well as H3K79me1/2/3 given that the biological functions of these marks remain poorly characterized. We also excluded the Hdac1/2/3 genes due to their broad substrate specificity, and we could not include writers of H3K56me2/3 because they have not yet been identified (Fig. 2B). In addition, we selected eight genes as controls for the screen: Etf1 and Hspa9 (essential genes), Cdkn1b and Cdkn2a (cell cycle inhibitors), Myod1 (not expressed in the liver), Epcam (the marker used to identify late reprogrammed cells), Sox9 (required for reprogramming) (Katsuda et al. 2024a), and Smad4 (a suppressor of reprogramming) (Fig. 2B; Merrell et al. 2021). We designed four sgRNAs for each of these 38 genes (except three for Cdkn2a), yielding a final library of 151 sgRNAs (Supplemental Table S3).
Next, we optimized the delivery of the library into Cas9–EGFP mice. A low multiplicity of infection (MOI; ∼0.4), as is typically used for in vitro lentiviral CRISPR screening, yielded a low purity (∼60%) of isolated EGFP+EpCAM+ reprogrammed cells due to the sparsity of fully reprogrammed cells (<10% of GFP+ hepatocytes) (data not shown). However, a high viral dose, as described previously (Wang et al. 2018), resulted in many cells being infected with more than one AAV, complicating the interpretation of the results. Therefore, we used an MOI of 1.4, which resulted in ∼75% of the starting hepatocytes being infected with AAVs (Supplemental Fig. S1D). This enabled us to achieve >85% purity of all the isolated cells, in which approximately half of the infected cells were expected to have editing of a single gene (Supplemental Fig. S1E). Next-generation sequencing of the library confirmed similar representation of all sgRNAs (Supplemental Fig. S1F; Supplemental Table S4).
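The trade-off between infection rate and multiply infected cells can be illustrated with a simple Poisson model. This is a sketch under an assumption not stated in the text, namely that the number of AAV genomes per hepatocyte follows a Poisson distribution with mean equal to the MOI and that each genome carries one sgRNA. The function names are hypothetical.

```python
import math


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k viral genomes."""
    return math.exp(-moi) * moi**k / math.factorial(k)


def infection_summary(moi):
    """Return (fraction of cells infected,
    fraction of infected cells carrying exactly one genome)."""
    infected = 1.0 - poisson_pmf(0, moi)
    single_among_infected = poisson_pmf(1, moi) / infected
    return infected, single_among_infected


for moi in (0.4, 1.4):
    infected, single = infection_summary(moi)
    print(f"MOI {moi}: {infected:.1%} infected, "
          f"{single:.1%} of infected cells carry one genome")
```

Under this model, an MOI of 1.4 gives about 75% infected cells, in line with the value reported above, and roughly 46% of infected cells carry a single genome, consistent with the estimate that approximately half of infected cells are edited at a single gene. At an MOI of 0.4, about 81% of infected cells carry a single genome, but only about one-third of cells are infected.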
One week after library injection, mice were divided into two groups, with five mice per group. In the first group, the mice were maintained on a normal diet for another week, and then GFP+ lineage-traced hepatocytes were isolated by FACS (referred to here as T0 hepatocytes). The second group was challenged with a DDC diet for 8 weeks, and then hepatocytes and reprogrammed cells were isolated from two points along the reprogramming continuum: DDC-injured hepatocytes/early reprogrammed cells (GFP+CD24−EpCAM−; T1 early reprogrammed) and late reprogrammed cells (GFP+CD24+EpCAM+; T1 late reprogrammed). We also isolated injured BECs (GFP−EpCAM+; T1 BEC) as a negative control (Fig. 2A; Supplemental Fig. S2). MIP-seq probes were designed for all 151 sgRNAs (Supplemental Table S5), and targeted sequencing was performed on genomic DNA from each of the purified cell populations (T0 hepatocytes, T1 BEC, T1 early reprogrammed, and T1 late reprogrammed) (Fig. 2C; Supplemental Table S6). BECs exhibited a very low mutation rate, confirming the hepatocyte-specificity of the AAV system (Supplemental Fig. S3A).
First, we examined the frequency of variants in the control genes (Supplemental Fig. S3B–D). Indel variants in survival-associated genes (Etf1 and Hspa9) were depleted at the 8 week time point compared with T0 (Supplemental Fig. S3B). In contrast, the variant frequencies associated with all four Cdkn1b sgRNAs (but not Cdkn2a sgRNAs) were enriched at the 8 week time point compared with T0 (Supplemental Fig. S3C). This finding suggests that hepatocyte proliferation in vivo is more dependent on Cdkn1b than on Cdkn2a. We then looked at the relative frequency of variants between T1 early reprogrammed and T1 late reprogrammed cells in genes that are functionally implicated in reprogramming. Variants in the critical reprogramming transcription factor Sox9 and reprogramming marker Epcam were enriched in T1 early reprogrammed cells (Supplemental Fig. S3D). Conversely, variants in Smad4 were enriched in T1 late reprogrammed cells, confirming its role as a suppressor of reprogramming (Supplemental Fig. S3D). Last, for our negative control gene Myod1, we observed increased rates of editing for one of four sgRNAs in two reprogramming stages (Supplemental Fig. S3B), indicating a low but detectable rate of nonspecific enrichment. Overall, the data suggest that this screening strategy can identify genes that either positively or negatively mediate reprogramming.
Next, we examined the variant frequencies of the 30 epigenetic regulators in our CRISPR library. We identified candidate genes by the fold change (FC) magnitude (|log2 FC| > 0.667) and statistical significance (P < 0.05) in variant rates between T1 late reprogrammed and T1 early reprogrammed cells. Overall, we found that the results favored depletion in the T1 late reprogrammed cells, indicating that most factors play a positive role in facilitating reprogramming (Fig. 2D,E). From these, we prioritized those for which there was a significant depletion of variants from at least two sgRNAs. Seven genes met these criteria: Crebbp, Kat8, Ash1l, Nsd1, Kdm2a, Phf2, and Rsbn1. Of these, Kdm2a was nominated as the top reprogramming promoter, as it was the only gene in which there was significant depletion of variants from all four sgRNAs between T1 late reprogrammed and early reprogrammed cells (Fig. 2D; Supplemental Fig. S3E). In contrast, only one histone modifier gene, Kmt5b, showed enrichment of variants in late reprogrammed cells for any of its sgRNAs, so we nominated this gene for further study as well (Fig. 2D,E; Supplemental Fig. S3E). Taken together, these data suggest that multiple histone modifiers facilitate biliary reprogramming, while a small fraction appears to suppress this process.
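The gene-level prioritization described above can be sketched as a simple filter over per-sgRNA statistics. The sketch assumes that a log2 fold change (late versus early variant rate) and a P-value have already been computed for each sgRNA. The cutoffs are those quoted in the text, whereas the function names, gene labels, and toy values are hypothetical and do not reproduce the screen data.

```python
from collections import defaultdict

LOG2FC_CUTOFF = 0.667  # threshold on the magnitude of the log2 fold change
P_CUTOFF = 0.05


def count_significant_sgrnas(sgrna_results):
    """Count significantly depleted and enriched sgRNAs per gene.

    sgrna_results: iterable of (gene, log2_fc, p_value), where log2_fc is
    log2(late variant rate / early variant rate).
    """
    counts = defaultdict(lambda: {"depleted": 0, "enriched": 0})
    for gene, log2_fc, p_value in sgrna_results:
        entry = counts[gene]  # register the gene even if nothing passes
        if p_value >= P_CUTOFF or abs(log2_fc) <= LOG2FC_CUTOFF:
            continue
        entry["depleted" if log2_fc < 0 else "enriched"] += 1
    return dict(counts)


def genes_with_at_least(counts, direction, n):
    """Genes with at least n significant sgRNAs in the given direction."""
    return sorted(g for g, c in counts.items() if c[direction] >= n)


# Hypothetical per-sgRNA results (gene, log2 FC, P-value).
toy_results = [
    ("GeneA", -1.2, 0.001), ("GeneA", -0.9, 0.010),
    ("GeneA", -0.8, 0.030), ("GeneA", -1.0, 0.004),
    ("GeneB", -0.9, 0.020), ("GeneB", -0.3, 0.400),
    ("GeneC", 0.8, 0.030), ("GeneC", 0.1, 0.700),
    ("GeneD", -1.5, 0.200),
]
toy_counts = count_significant_sgrnas(toy_results)
candidate_promoters = genes_with_at_least(toy_counts, "depleted", 2)
candidate_suppressors = genes_with_at_least(toy_counts, "enriched", 1)
print(candidate_promoters, candidate_suppressors)
```

Requiring two or more concordant sgRNAs for depletion, while accepting a single sgRNA for enrichment, mirrors the asymmetric criteria described in the text. The looser enrichment criterion would be expected to carry a higher false-positive risk.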
Validation of the screening results by individual knockouts
To validate these results, we conducted individual hepatocyte-specific gene knockout (KO) experiments by injecting mice with an AAV8 virus packaged with gene-specific sgRNAs or an empty vector (EV) control (Supplemental Table S7). Mice were fed a normal diet for 1 week and then challenged with DDC for 8 weeks followed by flow cytometry (Fig. 3A; Supplemental Fig. S4). In addition to the continued use of EpCAM as a marker of late reprogrammed cells, we used CD24 as a validated marker of intermediate reprogrammed cells, as described previously (Katsuda et al. 2024a). These cell surface markers, in combination with the GFP lineage marker, allowed us to distinguish the effects of each gene deletion on the progression to intermediate reprogrammed (GFP+CD24+EpCAM−) or late reprogrammed (GFP+CD24+EpCAM+) cells compared with injured hepatocytes that have begun to undergo reprogramming (GFP+CD24−EpCAM−) (Fig. 3A).
Figure 3.
Validation of epigenetic mediators in reprogramming kinetics. (A) Schematic representation of experimental strategy. An AAV8–sgRNA containing a single gene knockout was delivered to mice prior to DDC challenge. Whole-liver cells were isolated, and reprogramming kinetics were quantified via flow cytometry using CD24 and EpCAM as markers of intermediate reprogrammed and late reprogrammed cells, respectively. (B) Flow cytometry data showing early reprogramming efficiency as the percentage of hepatocytes (GFP+ cells) that are CD24+. (C) Flow cytometry data showing late reprogramming efficiency as percentage of CD24+ (intermediate + late) reprogrammed cells that are EpCAM+. This is numerically equivalent to the ratio of all GFP+ hepatocytes reaching late reprogramming/all GFP+ hepatocytes reaching intermediate reprogramming. (D) Flow cytometry data showing total reprogramming efficiency as percentage of hepatocytes (GFP+ cells) that are EpCAM+. Data in B–D are combined from at least two repeated experiments with total n > 7 per group. Empty vector (EV) controls were shared between experiments. Statistics were computed using a Welch's two-sided t-test from each sample with respect to the EV controls. (E, top) Western blots of H3K36me1, H3K36me2, and H3K36me3 histone marks compared with total H3 protein in Nsd1-KO, Kdm2a-KO, and EV control hepatocytes. (Bottom) Quantification of Western blots normalized to total H3. (F) Early reprogramming efficiency (CD24+) (top) and total reprogramming efficiency (bottom) of Kdm2a/Nsd1 double-knockout hepatocytes. (G) Early reprogramming efficiency (CD24+) (top) and total reprogramming efficiency (bottom) in YFP-H3K36M and H3K36 wild-type (WT) mice. One outlier (indicated in green) was not included in the statistical analysis. Significance was assessed using a Welch's two-sided t-test.
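The three efficiency metrics defined in panels B–D are related by simple arithmetic, which the following sketch makes explicit. It assumes that EpCAM+ late reprogrammed cells are a subset of CD24+ cells (GFP+CD24+EpCAM+), as in the gating scheme described in the text. The function name and cell counts are hypothetical.

```python
import math


def reprogramming_efficiencies(n_gfp, n_cd24, n_epcam):
    """Compute early, late, and total reprogramming efficiency.

    n_gfp:   all GFP+ lineage-traced cells
    n_cd24:  GFP+CD24+ cells (intermediate + late reprogrammed)
    n_epcam: GFP+CD24+EpCAM+ cells (late reprogrammed)
    """
    if n_gfp <= 0 or not (0 <= n_epcam <= n_cd24 <= n_gfp):
        raise ValueError("expected 0 <= n_epcam <= n_cd24 <= n_gfp and n_gfp > 0")
    early = n_cd24 / n_gfp                       # panel B: CD24+ among GFP+
    late = n_epcam / n_cd24 if n_cd24 else 0.0   # panel C: EpCAM+ among CD24+
    total = n_epcam / n_gfp                      # panel D: EpCAM+ among GFP+
    return early, late, total


# Hypothetical event counts from one sample.
early, late, total = reprogramming_efficiencies(n_gfp=100_000, n_cd24=8_000, n_epcam=2_000)
print(f"early {early:.1%}, late {late:.1%}, total {total:.1%}")
assert math.isclose(total, early * late)  # total efficiency factors into the two steps
```

Because total efficiency is the product of the early and late efficiencies, a knockout can leave the total unchanged while shifting the two steps in opposite directions.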
Using this strategy, we confirmed that seven out of the eight genes identified in our screen alter reprogramming efficiency (Fig. 3B–D). Knockout of Crebbp, Kat8, Phf2, or Rsbn1 reduced early reprogramming and overall reprogramming efficiency, resulting in a decrease in both intermediate reprogrammed and late reprogrammed cells. In contrast, Kmt5b-KO had no effect on the earlier stage but enhanced the late stage, resulting in an overall increase in late reprogrammed cells (Fig. 3B–D). Interestingly, Kdm2a-KO promoted the earlier stage but strongly hindered the later stage, resulting in a net decrease in late reprogrammed cells, whereas Nsd1-KO showed the opposite effect: It interfered with the earlier stage but promoted the later stage, resulting in no overall change in the total reprogramming efficiency. Finally, Ash1l-KO had no significant impact on either stage (Fig. 3B–D). The nomination of Nsd1 and Ash1l as total reprogramming promoters is likely attributable to the relatively high MOI (∼1.4), which may have introduced noise favoring promoters over suppressors under these conditions. Although Nsd1 did not reduce total reprogramming efficiency as predicted by the screen, its reciprocal behavior with Kdm2a suggested that H3K36me2 may be a key mediator of hepatobiliary reprogramming—Nsd1 is a writer and Kdm2a is an eraser of this histone mark (Fig. 3B,C; Kooistra and Helin 2012). Moreover, regulation of H3K36me2 has previously been linked to epithelial-to-mesenchymal transition (EMT) and mesenchymal-to-epithelial transition during oncogenesis (Lu et al. 2019; Yuan et al. 2020; Ko et al. 2024). Given that EMT signatures are upregulated during biliary reprogramming (Tarlow et al. 2014; Merrell et al. 2021), this suggests potentially shared mechanisms of epigenetic plasticity between these disparate processes.
Identification of Kdm2a and Nsd1 as critical modulators for reprogramming
To further investigate the roles of Nsd1 and Kdm2a, we first confirmed the efficiency of their knockouts by inspecting RNA sequencing (RNA-seq) reads at the expected cut sites, which revealed efficient indel formation (Supplemental Fig. S5A), confirming that the observed effects were due to on-target gene editing. Western blots 2 weeks after AAV injection confirmed that the global H3K36me2 level was substantially decreased in Nsd1-KO hepatocytes and modestly increased in Kdm2a-KO hepatocytes (Fig. 3E). Similarly, after DDC-induced injury, immunofluorescence staining of H3K36me2 revealed a near absence of this mark in Nsd1-KO GFP+ cells, with no observable difference in Kdm2a-KO GFP+ cells or in Nsd2-KO, Nsd3-KO, or Ash1l-KO GFP+ cells (Supplemental Fig. S5B,C). These results suggest that Nsd1 is the major H3K36me2 writer in hepatocytes. To assess the combined impact of Kdm2a and Nsd1, we next conducted a double-knockout (DKO) experiment. Kdm2a/Nsd1-DKO resulted in a decrease in the fraction of reprogrammed hepatocytes at both the earlier and later stages (Fig. 3F). Taken together, these results suggest that multiple histone-modifying enzymes are required for efficient hepatobiliary reprogramming and that H3K36 methylation is a critical factor across multiple reprogramming stages.
To further test the role of H3K36me2 in reprogramming, we used Rosa26-LSL-EYFP mice with a lysine-36-to-methionine (K36M) mutation of histone H3.3 (LSL-H3K36M). Expression of this mutant histone, following Cre-mediated deletion of transcriptional stop sequences, prevents the formation of H3K36me2 (Zhuang et al. 2018a). After confirming the absence of H3K36me2 in the hepatocytes of Cre-infected H3K36M mice (Supplemental Fig. S5D), we subjected these mice to an 8 week DDC injury regimen and assessed reprogramming efficiency. Interestingly, the fraction of hepatocytes reprogrammed to both earlier (YFP+CD24+EpCAM−) and later (YFP+EpCAM+CD24+) stages tended to increase in H3K36M mice compared with K36 wild-type controls (Fig. 3G). These results provide additional evidence that H3K36me2 has an overall suppressive effect on reprogramming. Given that triple inhibition of NSD1, NSD2, and SETD2 is required to phenocopy the H3K36M mutation in human cells (Lu et al. 2016), this transgenic model likely reflects the roles of additional H3K36 modifiers. Nevertheless, although the inability to create this histone mark increases overall reprogramming efficiency, our findings with Nsd1 and Kdm2a indicate that dysregulation of H3K36me2 imposes different effects on the early versus late stages of reprogramming. Taken together, these experiments nominate the H3K36me2-modifying genes Kdm2a and Nsd1 as central mediators of reprogramming.
Nsd1 and Kdm2a have reciprocal activities during recovery from liver injury
Interestingly, Kdm2a-KO mice subjected to DDC-induced injury displayed a significant reduction in weight with minimal recovery compared with EV control (Supplemental Fig. S6A,B). In contrast, Nsd1-KO mice displayed a significantly faster body weight recovery during the DDC injury period compared with controls (Supplemental Fig. S6A). Kdm2a/Nsd1-DKO phenocopied Nsd1-KO, indicating that the delayed recovery from liver injury by Kdm2a-KO is rescued by Nsd1-KO (Supplemental Fig. S6A). Blood tests in Kdm2a-KO mice revealed an increase in blood ALP as well as both conjugated and total bilirubin compared with controls, along with increased histologic evidence of ductular reaction (Supplemental Fig. S6C,D). Additionally, there was increased accumulation of bile pigments in Kdm2a-KO compared with wild-type livers, while no clear difference was observed in Nsd1-KO or Kdm2a/Nsd1-DKO livers (Supplemental Fig. S6D), further indicating that knockout of Kdm2a exacerbates cholestatic liver damage. These observations are consistent with a recent report that Kdm2a deficiency causes abnormal liver function and potential liver damage (Martin et al. 2023). Overall, the acceleration of incomplete biliary reprogramming by Kdm2a-KO worsens the short-term response to cholestatic injury and delays recovery. In contrast, Nsd1-KO blocks this transition, protecting the animals from prolonged injury, a result that is also observed in the Nsd1/Kdm2a-DKO.
Nsd1-KO and Kdm2a-KO alter the epigenomes but not the transcriptomes associated with reprogrammed cell states
Given their effects on reprogramming efficiency, we sought to determine the extent to which Nsd1-KO and Kdm2a-KO also altered the transcriptional and epigenetic trajectories of hepatobiliary reprogramming. To this end, we performed bulk RNA-seq and cleavage under targets and tagmentation sequencing (CUT&Tag-seq) (Kaya-Okur et al. 2019) on cells isolated from each reprogramming stage (Fig. 4A). For RNA-seq, we isolated DDC-injured GFP+CD24− cells (early reprogrammed hepatocytes) and GFP+CD24+ cells (intermediate reprogrammed and late reprogrammed hepatocytes) from Kdm2a-KO, Nsd1-KO, or EV control mice. These populations were combined, as there were insufficient Nsd1-KO CD24+EpCAM− intermediate reprogrammed cells and insufficient Kdm2a-KO CD24+EpCAM+ late reprogrammed cells. We then integrated these data with our recently reported RNA-seq data across the reprogramming spectrum, including healthy (untreated) hepatocytes; early reprogrammed, intermediate reprogrammed, and late reprogrammed cells; and DDC-injured BECs, which represent the closest comparator to DDC-injured, terminally reprogrammed cells (Katsuda et al. 2024a). We used principal component analysis (PCA) to map the trajectory of reprogramming in two dimensions. In the RNA-seq data, we found that the first principal component (PC1) reflected the hepatobiliary reprogramming axis, accounting for 78% of the variance in the data (Fig. 4B; Supplemental Fig. S7; Supplemental Tables S8–S11). Notably, all genotypes (Nsd1-KO, Kdm2a-KO, and wild type) closely clustered along the reprogramming trajectory, indicating that the net reprogramming-related transcriptional changes are comparable between genotypes. These findings suggest that although these epigenetic modifiers affect the efficiency of reprogramming, they do not significantly affect the ultimate hepatobiliary transcriptomic phenotypes of cells that reach a given reprogrammed stage.
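The PCA step can be sketched in a few lines: select the most variable features, center them, and extract the first principal component together with its share of the variance. The example below is a dependency-free illustration using power iteration on the sample-by-sample Gram matrix. In practice a numerical library would be used, and the normalization applied before PCA in this study is not restated here. The function names and the toy expression matrix are hypothetical.

```python
import math
from statistics import mean, pvariance


def top_variable_features(matrix, n_top):
    """Indices of the n_top highest-variance columns (rows are samples)."""
    n_features = len(matrix[0])
    variances = [pvariance([row[j] for row in matrix]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return ranked[:n_top]


def first_principal_component(matrix, n_top, n_iter=200):
    """Return (PC1 score per sample, fraction of variance explained by PC1)."""
    keep = top_variable_features(matrix, n_top)
    sub = [[row[j] for j in keep] for row in matrix]
    col_means = [mean(col) for col in zip(*sub)]
    centered = [[x - m for x, m in zip(row, col_means)] for row in sub]
    n = len(centered)
    # The Gram matrix shares its nonzero eigenvalues with the covariance matrix
    # (up to a constant factor), and it is small when samples are few.
    gram = [[sum(a * b for a, b in zip(centered[i], centered[k]))
             for k in range(n)] for i in range(n)]
    total = sum(gram[i][i] for i in range(n))
    if total == 0:
        raise ValueError("selected features have zero variance")
    vec = [float(i + 1) for i in range(n)]  # arbitrary starting vector
    for _ in range(n_iter):  # power iteration for the leading eigenvector
        nxt = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigval = sum(v * sum(g * w for g, w in zip(row, vec)) for v, row in zip(vec, gram))
    scores = [v * math.sqrt(eigval) for v in vec]  # the sign of PC1 is arbitrary
    return scores, eigval / total


# Hypothetical expression values: rows are samples ordered along the
# trajectory (hepatocyte, early, intermediate, late); columns are genes.
toy_expression = [
    [10, 0, 5.0, 3, 1],
    [8, 2, 5.5, 3, 2],
    [4, 6, 4.5, 3, 8],
    [1, 9, 5.0, 3, 10],
]
pc1_scores, pc1_fraction = first_principal_component(toy_expression, n_top=3)
print([round(s, 2) for s in pc1_scores], round(pc1_fraction, 3))
```

In this toy example, the PC1 scores are ordered along the sample trajectory and PC1 captures nearly all of the variance, which is the qualitative behavior expected when a single reprogramming axis dominates the data.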
Figure 4.
Transcriptional profiles of Nsd1-KO and Kdm2a-KO reprogrammed cells. (A) Summary of phenotypic states used for RNA-seq and CUT&Tag-seq profiling, including flow cytometric markers and stages of reprogramming represented. (B) PCA of bulk RNA-seq data across reprogramming stages including wild-type reprogramming (Katsuda et al. 2024a) and from DDC-injured hepatocytes/early reprogrammed cells (GFP+CD24−) and GFP+CD24+ intermediate reprogrammed and late reprogrammed cells after knockout (KO) of Nsd1 or Kdm2a or empty vector (EV) control. PCA was computed using the top 2000 most variable genes. The percent of variance accounted for by each principal component is indicated. (C) CUT&Tag-seq signal tracks around a representative reprogramming-related gene (Ctgf) for five representative histone marks (H3K27ac, H3K4me1, H3K4me3, H3K27me3, and H3K36me3). (D–I) PCA representation of global histone mark profiles of H3K27ac (D), H3K4me1 (E), H3K36me1 (F), H3K36me3 (G), H3K4me3 (H), and H3K27me3 (I) across reprogramming stages including Nsd1-KO and Kdm2a-KO cells. PCA was computed using the top 5000 most variable peaks, and the reprogramming trajectories are highlighted with dashed arrows.
We then conducted a similar analysis of global histone profiles using CUT&Tag-seq, which allowed us to obtain reliable data with as few as 70,000 input cells. After incorporating previously generated data (Katsuda et al. 2024a), we compiled high-quality data for 10 histone marks across wild-type reprogramming stages, six of which we also profiled in Nsd1-KO and Kdm2a-KO cells (Fig. 4C; Supplemental Figs. S8–S10). Despite extensive protocol optimization, we were unable to obtain reliable H3K36me2 CUT&Tag-seq data, and Kdm2a-KO yielded insufficient EpCAM+ late reprogrammed cells to profile. Consequently, we sought to determine how closely the global profiles of other histone marks in wild-type reprogrammed cells resembled true BECs under the same injury conditions. As with RNA-seq, PC1 captured the reprogramming axis for each mark. For most marks, the late reprogrammed cells clustered closely to DDC-injured BECs (Supplemental Fig. S10), a pattern similar to that of the transcriptomic profile (Fig. 4B). However, the repressive mark H3K27me3 in late reprogrammed cells remained relatively far from BECs in the PCA projection (Supplemental Fig. S10). H3K27me3 may repress genes related to hepatocyte or biliary cell identity, thereby delaying the rate of cell state transitions during reprogramming (Schaub et al. 2018).
Next, we sought to connect these epigenetic profiles to our previous findings that the SOX4 pioneer transcription factor can initiate reprogramming (Katsuda et al. 2024a). We used ChromHMM (Ernst and Kellis 2017) to model chromatin states across the five wild-type hepatobiliary phenotypes. This method identifies recurring patterns of histone mark combinations across the genome, and we identified 15 distinct chromatin states using the 10 histone mark profiles (Supplemental Fig. S11A). We examined the enrichment of ectopic SOX4 binding peaks (from CUT&RUN-seq) across these states. We found that the 18 h postinjection peaks—the genomic regions bound by SOX4 at the earlier time point—were enriched in active enhancer (state 8) and promoter (states 9 and 10) regions early in reprogramming. Interestingly, we found that the 4 day postinjection peaks—the genomic regions bound by SOX4 when hepatocytes start to acquire the biliary phenotype—were enriched in poised chromatin regions (state 11) in hepatocytes followed by active regions during reprogramming (state 9) (Supplemental Fig. S11B). These findings provide a model of chromatin state dynamics throughout wild-type reprogramming and nominate poised or bivalent chromatin states as potentially important mediators of hepatobiliary metaplasia initiation.
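The enrichment of binding peaks across chromatin states can be illustrated with a generic fold-enrichment calculation: the fraction of a state's bases covered by peaks, divided by the fraction of the genome covered by peaks. This is a sketch on a single toy chromosome. Whether it matches the exact statistic used in this analysis is an assumption, and the function names, state labels, and coordinates are hypothetical.

```python
import math


def overlap_bp(a, b):
    """Number of bases shared by two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def state_enrichment(state_segments, peaks, genome_size):
    """Fold enrichment of peaks in each chromatin state.

    state_segments: dict mapping state name -> list of (start, end) segments
    peaks: list of non-overlapping (start, end) peak intervals
    Enrichment = (peak bases in state / state bases) / (peak bases / genome size)
    """
    peak_bp = sum(end - start for start, end in peaks)
    background = peak_bp / genome_size
    result = {}
    for state, segments in state_segments.items():
        state_bp = sum(end - start for start, end in segments)
        shared = sum(overlap_bp(seg, pk) for seg in segments for pk in peaks)
        result[state] = (shared / state_bp) / background
    return result


# Hypothetical 1 kb "genome" partitioned into three states.
toy_states = {
    "enhancer": [(100, 200)],
    "promoter": [(300, 350)],
    "quiescent": [(0, 100), (200, 300), (350, 1000)],
}
toy_peaks = [(120, 170), (320, 340), (600, 630)]
toy_enrichment = state_enrichment(toy_states, toy_peaks, genome_size=1000)
for state, value in toy_enrichment.items():
    print(f"{state}: {value:.2f}")
```

Values above 1 indicate that peaks fall in a state more often than expected from the state's share of the genome, whereas values below 1 indicate depletion.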
Finally, we applied PCA to construct a differentiation trajectory including wild-type, Nsd1-KO, and Kdm2a-KO cells for the six histone marks profiled across these conditions. For active marks H3K27ac and H3K4me1, PC1 captured the majority of the variance with minimal deviation in the trajectory between the three genotypes, which followed a pattern similar to those of the transcriptomic profiles (Fig. 4D,E,H). However, Kdm2a-KO cells exhibited clear deviation of H3K36me1 and H3K36me3 along PC2, and Nsd1-KO cells exhibited moderate deviation of H3K36me1 in the opposite direction (Fig. 4F,G). These results suggest that the perturbation of H3K36me2 profiles led to global alteration in the distribution of H3K36me1 and H3K36me3. Interestingly, H3K27me3 also exhibited distinct trajectories across the genotypes, with PC1 and PC2 capturing a similar degree of variance (Fig. 4I). Given that previous studies have demonstrated that H3K36 methylation antagonizes PRC2-mediated H3K27 methylation (Wang et al. 2007; Schmitges et al. 2011; Yuan et al. 2011, 2020; Lu et al. 2016; Streubel et al. 2018; Yano et al. 2022; Chen et al. 2023), H3K27me3 may play a key role in reprogramming. Overall, these results suggest that the inability to regulate H3K36me2 results in defective remodeling of the H3K27me3 epigenomic landscape, thereby reducing the dynamics of reprogramming while still allowing reprogrammed cells to acquire appropriate state-specific transcriptomes.
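The idea that PC1 "captures the reprogramming axis" can be made concrete with a small example. The following is a minimal sketch, not the analysis code of this study: it assumes a samples-by-features matrix of toy signal values and a hypothetical function `first_principal_component`, and it finds the leading axis by power iteration on the covariance matrix. A real analysis would normally use an established routine (for example, `prcomp` in R).

```python
import math


def first_principal_component(rows, n_iter=500):
    """Return (axis, variance_fraction, scores) for PC1 of samples-by-features data."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (features x features)
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_var = sum(cov[i][i] for i in range(p))
    # Asymmetric start vector; power iteration fails only if the start is
    # exactly orthogonal to the leading axis, which is ignored in this sketch.
    v = [float(i + 1) for i in range(p)]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance along the axis
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, eigenvalue / total_var, scores


# Toy signal values (invented): three samples ordered along a trajectory,
# measured at three genomic regions.
toy_profiles = [[0.1, 0.0, 1.0], [1.0, 2.1, 0.9], [2.0, 3.9, 1.1]]
axis, pc1_fraction, pc1_scores = first_principal_component(toy_profiles)
```

When samples from different genotypes have similar PC1 scores at matched stages, their trajectories coincide along the main axis. A mark for which PC1 explains only about half of the variance, as described for H3K27me3, needs a second axis to describe the trajectory.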
Discussion
In this study, we explored the epigenetic mechanisms driving hepatobiliary metaplasia, in which hepatocytes are reprogrammed to BECs in vivo as a physiological response to cholestatic liver injury. We used a hepatocyte-specific in vivo CRISPR screen to nominate histone modifiers that are functionally involved in reprogramming and confirmed the role of seven through individual knockout experiments. Importantly, this approach did not require in vitro manipulation and implantation of cells, thus maintaining the native in vivo microenvironment. Through validation of the epigenetic modifiers, we found that the earlier and later stages of reprogramming are differentially regulated, and that Nsd1 and Kdm2a have reciprocal effects on the efficiency of the process. Knocking out Nsd1 decreased earlier reprogramming efficiency but promoted later reprogramming efficiency, whereas knocking out Kdm2a promoted earlier reprogramming efficiency but suppressed the later stage of reprogramming. These results complement our previous single-cell transcriptomic findings that biliary reprogramming is not a monotonic process and involves intermediate stages with transient gene expression (Merrell et al. 2021). Thus, we conclude that reprogramming similarly entails a multistep rewiring of the epigenetic landscape, including altered histone marks.
Surprisingly, none of the genes that we identified as playing a role in reprogramming were single-handedly essential for the phenomenon, with Kdm2a-KO exhibiting the strongest effect. Importantly, Kdm2a-KO and Nsd1-KO hepatocytes that completed the early and intermediate steps of reprogramming transcriptionally approximated wild-type hepatocytes that achieved the same degree of reprogramming. These observations raise the possibility that there are multiple redundant epigenetic mechanisms that achieve a similar phenotypic end point but in which the relative efficiencies of different steps in the process differ. Our recent work has implicated epigenetic factors such as DNA methylation as critical machinery for the stabilization of cell phenotypes, and we reported that reprogrammed cells maintain a methylation signature that resembles healthy (uninjured) hepatocytes (Radwan et al. 2024). Consequently, epigenetic factors, including histone modifiers, may act as barriers to shifts in transcriptomic states or affect the robustness of the reprogramming process. Further work is needed to define the interactions between DNA methylation, histone PTMs, and lineage-specific transcription factors that mediate cell state transitions during transdifferentiation. Overall, we present an unbiased investigation of histone modifications associated with metaplasia in vivo. Given the prevalence of metaplasia across tissues and its appearance as a harbinger of cancer, the molecular insights reported in this study provide a foundation for understanding this pathophysiological process.
Finally, this study has several limitations. First, knockout of most genes yielded decreased reprogramming efficiency, raising the possibility that the in vivo CRISPR screen may have lacked the power to detect genes that suppress reprogramming. Additionally, given that the screen compared early and late reprogramming stages, some histone modifiers with specific functions in intermediate stages may not have been captured. Moreover, our validated reprogramming markers may not capture the full spectrum of transition states including novel intermediate populations. Recent advancements in single-cell profiling such as single-cell CUT&Tag-seq (Bartosovic et al. 2021) and single-cell CRISPR screens (Replogle et al. 2020; Santinha et al. 2023) will enable tracing of the continuous reprogramming process, potentially expanding the three discrete reprogramming stages that we describe, as well as elucidating mechanisms of multidirectional plasticity originating from BECs. Second, Kdm2a-KO yields of late reprogrammed cells were insufficient for bulk profiling; therefore, much of our analysis was focused on the earlier stages of reprogramming. Third, H3K36me2 CUT&Tag-seq could not be achieved despite extensive protocol optimization. Although this mark could not be profiled given the lack of compatible antibodies, we speculate that the role of Kdm2a may be limited to regulating a small number of genes, causing the significant changes in reprogramming efficiency despite a modest increase in the global H3K36me2 level. Finally, subsequent investigation is needed to identify species-dependent differences in epigenetic regulation and translate specific findings from our murine model to actionable targets for human disease.
Materials and methods
Mice
Rosa-LSL-Cas9-EGFP mice (Platt et al. 2014) on a C57BL/6J background (strain 026175) and Rosa26-LSL-EYFP mice on mixed backgrounds (Srinivas et al. 2001) were purchased from the Jackson Laboratory and maintained as homozygotes. Rosa26-LSL-H3K36M transgenic mice were generously provided by the Kai Ge laboratory (National Institute of Diabetes and Digestive and Kidney Diseases) (Zhuang et al. 2018b). Homozygous Rosa26-LSL-EYFP mice were crossed with homozygous Rosa26-LSL-H3K36M mice to obtain Rosa26-EYFP/H3K36M mice, which were heterozygous for each allele. All mouse procedures used in this study were performed following National Institutes of Health guidelines and were approved by the Institutional Animal Care and Use Committee of the University of Pennsylvania.
For characterization of the reprogramming stage, 4–5 week old mice were retro-orbitally injected with AAV:ITR-U6-sgRNA(backbone)-TBG-Cre-WPRE-hGHpA-ITR (empty vector [EV]) (Katsuda et al. 2023) with 5 × 10¹¹ genome copies/mouse/gene. One week later, induction of biliary reprogramming was started by initiating a 0.1% 3,5-diethoxycarbonyl-1,4-dihydrocollidine (DDC) diet (Envigo). Eight weeks after the DDC challenge, reprogrammed cells and biliary cells were harvested as described below.
sgRNA cloning into the AAV vector
Oligo DNAs were designed using the CRISPick tool (Broad Institute) as 5′-ACCGNNNNNNNNNNNNNNNNNNNN-3′ (sense strand) and 5′-AACNNNNNNNNNNNNNNNNNNNNC-3′ (antisense strand), so that the 20 nucleotides (represented as “N”) were complementary to each other. These two oligo DNAs were phosphorylated with T4 PNK (NEB) for 30 min at 37°C and annealed by ramping down the temperature from 95°C to 25°C at 5°C/min. The annealed sgRNA fragment was then diluted to 1:400 with water and cloned into SapI (NEB)-linearized AAV:ITR-U6-sgRNA(backbone)-TBG-Cre-WPRE-hGHpA-ITR plasmid using T4 DNA ligase (NEB) for 30–60 min at room temperature. Transformation was performed using Stbl3 bacteria (Thermo) following the manufacturer's instructions. For all AAV plasmids, endotoxin was eliminated by treating the plasmids with Endozero columns (Zymo Research) before proceeding to AAV production.
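The oligo layout described above can be sketched in a few lines. This is an illustrative helper, not the authors' script: the function name `design_sgrna_oligos` is hypothetical, and it assumes that the 20 “N” positions of the antisense oligo are the reverse complement of the 20 nt spacer, which is our reading of "complementary to each other."

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_sgrna_oligos(spacer):
    """Return (sense, antisense) oligos, both written 5' to 3'.

    sense     = ACCG + spacer
    antisense = AAC + reverse complement of spacer + C
    """
    spacer = spacer.upper()
    if len(spacer) != 20 or set(spacer) - set("ACGT"):
        raise ValueError("spacer must be exactly 20 nt of A/C/G/T")
    sense = "ACCG" + spacer
    antisense = "AAC" + reverse_complement(spacer) + "C"
    return sense, antisense


# Arbitrary example spacer (not a guide used in the study)
example_sense, example_antisense = design_sgrna_oligos("GATTACAGATTACAGATTAC")
```

After annealing, the 21 nt following the first three bases of each oligo pair with each other, leaving 3 nt 5′ overhangs (ACC and AAC) for ligation into the linearized backbone.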
AAV preparation
Ninety percent to 100% confluent 293T cells in 15 cm dishes were replenished with 15 mL of fresh DMEM (Thermo) supplemented with 2% FBS (Thermo) without antibiotics. For a 15 cm plate, 16 μg of AAV8-Rep/Cap plasmid (Grompe Laboratory), 16 μg of Ad5-Helper plasmid (Grompe Laboratory), 16 μg of AAV transfer vector, and 144 μL of 1 mg/mL polyethylenimine (PEI; Polysciences) were mixed in 9 mL of OptiMEM (Thermo). After incubation for 15 min at room temperature, the plasmid/PEI complex was added to 293T cells in a dropwise manner, and the plates were gently rocked to mix. After incubation in a CO2 incubator for 6 days, the cells and culture supernatant were harvested into 50 mL tubes and centrifuged at 1900g for 15 min. The supernatant was transferred to new tubes, and 1/40,000 vol of benzonase (Sigma-Aldrich) was added and mixed thoroughly by inversion. After digestion of nonviral DNA by incubation for 30 min at 37°C, virus medium was centrifuged at 1900g for 15 min, and the supernatant was filtered with a 0.22 μm filter unit containing a PES membrane (Thermo). Next, 1/4 vol of 40% polyethylene glycol 8000 (PEG8000) in 2.5 M NaCl was added and mixed thoroughly by inversion. Following overnight incubation at 4°C, precipitated AAV was collected by centrifugation at 3000g for 15 min. After removal of the supernatant, the precipitate was homogenized in 100 μL of PBS/15 cm dish by thorough pipetting. Non-AAV precipitate was eliminated by centrifugation at 2200g for 5 min. Smaller debris was further removed by filtering the eluted AAV with a 0.45 μm filter column (Corning). This crude AAV was titrated by qPCR using AAV8-TBG-Cre (University of Pennsylvania Vector Core) as a standard with a forward primer (5′-GGAACCCCTAGTGATGGAGTT-3′) and a reverse primer (5′-CGGCCTCAGTGAGCGA-3′) and was used directly for KO experiments without further purification.
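Titration against a standard of known titer usually means fitting a standard curve of Ct against log10 genome copies and interpolating the unknown. The sketch below shows that calculation under stated assumptions: the standard quantities, Ct values, and stock titer are invented for illustration, the function names are hypothetical, and dilution factors between the qPCR well and the stock are left to the user.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Interpolate genome copies per reaction from a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


def injection_volume_ul(dose_gc, titer_gc_per_ul):
    """Volume of stock needed to deliver a target dose of genome copies."""
    return dose_gc / titer_gc_per_ul


# Invented standard dilutions (genome copies per reaction) and Ct values
toy_slope, toy_intercept = fit_standard_curve([1e5, 1e6, 1e7], [23.5, 20.2, 16.9])
toy_unknown_copies = copies_from_ct(18.55, toy_slope, toy_intercept)
# With a hypothetical stock titer of 5e9 genome copies/uL, the 5e11 dose
# used for individual knockouts would correspond to this volume:
toy_volume_ul = injection_volume_ul(5e11, 5e9)
```

A slope near −3.3 corresponds to roughly twofold amplification per cycle, which is a quick sanity check on the standard curve before any titer is trusted.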
Hepatocyte isolation
Livers were perfused with 40 mL of HBSS (Thermo), followed by 40 mL of HBSS with 1 mM EGTA (Sigma), and then 40 mL of HBSS with 5 mM CaCl2 (Sigma) and 40 µg/mL liberase TM (Sigma). Following perfusion, livers were mechanically dispersed with tweezers, resuspended in 10 mL of wash medium (DMEM supplemented with 5% FBS), and filtered with a 70 μm cell strainer. The cells were centrifuged at 50g for 5 min at 4°C. Next, the cells were resuspended in complete percoll solution (10.8 mL of percoll [Cytiva], 12.5 mL of wash medium, 1.2 mL of 10× HBSS per liver) and centrifuged at 50g for 10 min at 4°C. After a single wash with 10 mL of medium, cells were spun at 50g for 5 min at 4°C and then used for downstream experiments.
Whole-liver cell isolation from normal mice
Livers were digested by the two-step liberase perfusion as described above. Next, the undigested remaining tissue was transferred to a 1.5 mL tube, minced with surgical scissors, and further digested with 10× concentrated liberase (∼430 μL/tube of 400 μg/mL HBSS with 5 mM CaCl2) for 30 min at 37°C while vortexing the sample several times intermittently. The digested tissue was filtered with a 70 μm cell strainer and combined with the cell suspension digested previously. The cells were then centrifuged at 300g for 5 min at 4°C. Next, the cells were suspended in 10 mL of ACK lysis buffer (Quality Biological) and incubated for 10 min on ice to remove red blood cells. The cells were then collected by centrifugation at 300g for 5 min at 4°C and used for downstream analyses.
Whole-liver cell isolation from DDC-treated mice
Livers were digested by the two-step liberase perfusion as described above. Following perfusion, livers were submerged in 10 mL of fresh HBSS with 5 mM CaCl2, 40 µg/mL liberase, and 40 µg/mL DNaseI (Millipore) in a C-tube (Miltenyi) and further digested using a gentleMACS Octo dissociator (Miltenyi) with a heating unit using the “37C_m_LIDK_1” protocol. Dissociated tissue was diluted in flow buffer, HBSS (pH 7.4; Thermo) supplemented with 25 mM HEPES (Thermo), 5 mM MgCl2 (MedSupply Partners), 1× Pen/Strep (Thermo), 1× fungizone (Thermo), 1× NEAA (Thermo), 1× glutamax (Thermo), 0.3% glucose (Sigma), and 1× sodium pyruvate (Thermo) supplemented with 40 µg/mL DNaseI [referred to here as flow buffer(+)]. Undigested tissue was removed by passing it through a 70 μm cell strainer, and the cells were centrifuged at 300g for 5 min at 4°C. Next, the cells were suspended in 10 mL of ACK lysis buffer and incubated for 10 min on ice. The cells were then collected by centrifugation at 300g for 5 min at 4°C and used for downstream analyses.
Flow cytometry
Cells were resuspended in 2–3 mL of flow buffer(+) and filtered with a 35 μm cell strainer equipped with a FACS tube (BD). The cell suspension was then transferred to a round-bottom 96 well plate at 100–150 μL/well and centrifuged at ∼800g for 1 min at 4°C with a slow brake. The cells were then resuspended in 100 μL of flow buffer(+)/well containing fluorophore-conjugated antibodies (Supplemental Table S1) and incubated for 20 min on ice. After two washes in 150–200 μL of flow buffer(+)/well at ∼800g for 1 min at 4°C with a slow brake, the cells were resuspended in flow buffer(+) containing 1/1000× TO-PRO-3 (Thermo) and analyzed using an LSR II flow cytometer (BD).
Fluorescence-activated cell sorting (FACS) of DDC-treated whole-liver cells
Cells were resuspended in 5 mL of flow buffer(+) and centrifuged at 300g for 5 min at 4°C. After removal of the supernatant, the volume was increased to 1–1.5 mL with flow buffer(+), and 1/100 vol of rat anti-CD45, rat anti-CD11b, and rat anti-CD31 antibodies (Supplemental Table S1) was added and incubated for 10 min on ice. After washing with 2 mL of flow buffer(+) at 300g for 5 min at 4°C with a slow brake, the cells were resuspended in 5 mL of flow buffer(+), 600 μL of Dynabeads antirat IgG (Thermo) was added, and the cell/bead mixture was incubated for 30 min at 4°C with gentle tilting and rotation. The suspension was transferred to 5 mL FACS tubes and placed on a DynaMag-5 magnet (Thermo) for 2 min. The supernatant was transferred to a new tube, and the cells were collected by centrifugation at 2000 rpm (∼800g) for 2 min at 4°C with a slow brake. The cells were then resuspended in MACS buffer (PBS, 0.5% BSA, 2 mM EDTA) to a final volume of ∼1.5 mL, and 150 μL of CD326 (EpCAM) MicroBeads (Miltenyi) was added. After incubation for 15 min at 4°C, the cells were washed with an equal volume of MACS buffer and then centrifuged at 2000 rpm (∼800g) for 2 min at 4°C with a slow brake. Cells were resuspended in 2 mL of MACS buffer, and EpCAM+ cells and EpCAM− cells were separated using LS columns (Miltenyi) following the manufacturer's instructions (four columns were used per animal; 0.5 mL of suspension/column). The cells were then collected by centrifugation at 2000 rpm (∼800g) for 2 min at 4°C with a slow brake. The cells were then resuspended in 0.5–1 mL of flow buffer(+), and ∼10–15 μL of each was set aside for fluorescence minus one (FMO) controls (FMO-Brilliant Violet 421 [BV421]: all stained except BV421-CD24; FMO-PE/Dazzle 594: all stained except PE/Dazzle594-EpCAM) and stained in a 96 well round-bottom plate as described earlier.
The cells to be used for FACS were stained with BV421-CD24 (Biolegend), PE/Dazzle594-EpCAM (Biolegend), and PE/Cy7-CD11b/CD31/CD45 (Supplemental Table S1) at 1:100 dilution in 15 mL tubes for 20 min on ice. After being washed once in 2 mL of flow buffer(+) by centrifugation at 2000 rpm (∼800g) for 2 min at 4°C with a slow brake, the cells were resuspended in 1–3 mL of flow buffer(+) with 1/1000 TO-PRO-3 and then sorted on an Aria II sorter (BD).
Molecular inversion probe (MIP) design and sgRNA selection
MIPs were designed using the MIPgen software (https://github.com/shendurelab/MIPGEN) (Boyle et al. 2014). MIPs with logistic scores >0.8 and sgRNAs were manually selected for the 38 genes as follows (Supplemental Table S3): Each sgRNA was selected to ensure capture by the corresponding probe with the Cas9 cut site positioned within ∼100 bp from the end of the forward sequencing primer, allowing single-end 150 cycle sequencing to capture the indels introduced by Cas9. This approach yielded 151 sgRNA–MIP pairs, with each gene assigned four sgRNAs, except Cdkn2a, which was assigned three sgRNA–MIP pairs.
sgRNA library preparation
All sgRNAs (151 in total) were designed using the CRISPick tool (Doench et al. 2016) (https://portals.broadinstitute.org/gppx/crispick/public) and ordered in a 96 well mix plate format (IDT), where each well contained a pair of sense and antisense oligo DNAs. After individual phosphorylation and annealing reactions, all the sgRNA fragments were pooled into a single tube and transformed into Stbl3 bacteria. The number of transformed colonies was estimated to be 67,650, which covered the library scale by 440-fold. Library uniformity was confirmed by next-generation sequencing (NGS) using an Illumina NextSeq500 following the manufacturer's instructions (Supplemental Table S4).
In vivo knockout screening
AAV viruses were prepared as described above using 16 µg of a 151× pooled sgRNA plasmid. Cas9-EGFP mice (4–5 weeks old) were injected with 5 × 10¹⁰ genome copies per mouse, which resulted in ∼75% infection after 2 weeks. To harvest T0 hepatocytes, the animals were fed a normal diet for 2 weeks, and hepatocytes were isolated using the Percoll density gradient method as described previously. For T1 sample preparation, the animals were fed a normal diet for 1 week, followed by an 8 week challenge with a DDC diet. T1 DDC-injured hepatocytes/early reprogrammed cells (GFP+CD24−EpCAM−), late reprogrammed cells (GFP+CD24+EpCAM+), and BECs (GFP−EpCAM+) were harvested and isolated by FACS as described above and used for MIP-seq.
MIP-seq
MIP-seq was conducted according to a previously published protocol (Cantsilieris et al. 2017). Briefly, the MIP pool (IDT, purchased as an oligo pool [oPool]) was phosphorylated using T4 PNK (NEB). Genomic DNA was isolated from T0 and T1 cells using the Quick DNA Plus micropreparation kit (Zymo Research) following the manufacturer's instructions. MIP capture and ligation reactions were carried out with Hemo Klentaq (NEB) and Ampligase (Epicentre) enzymes by incubating the genomic DNA and phosphorylated MIPs for 10 min at 95°C, followed by 22 h at 60°C. Noncircular uncaptured genomic DNA was cleaned up by treating with exonucleases I and III (NEB) for 10 min at 37°C, followed by 2 min at 95°C. MIP-captured DNAs were then amplified using 2× iProof high-fidelity master mix (Bio-Rad) with NGS index primers (Supplemental Table S5). The amplification conditions were one cycle for 30 sec at 98°C; 15–21 cycles of 10 sec at 98°C, 30 sec at 60°C, and 30 sec at 72°C; and then 2 min at 72°C. PCR products were purified using 0.9× Agencourt AMPure beads and used for NGS. NGS was performed on a NextSeq500 (Illumina) with a NextSeq500/550 high-output kit v2 (150 cycles) in single-end sequencing mode. Each step was conducted to ensure that at least 100-fold coverage was maintained for each sample (i.e., at least 151 × 100 = 15,100 cells were used as input per sample).
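The library bookkeeping stated in the Materials and methods can be checked with simple arithmetic. The snippet below is only a worked example using numbers given in the text (38 genes, four sgRNAs per gene except three for Cdkn2a, 67,650 estimated colonies, and 100-fold minimum coverage); the function names are hypothetical.

```python
def library_size(n_genes, sgrnas_per_gene, exceptions):
    """Total sgRNA-MIP pairs; `exceptions` maps gene name -> sgRNA count."""
    regular_genes = n_genes - len(exceptions)
    return regular_genes * sgrnas_per_gene + sum(exceptions.values())


def fold_coverage(n_units, n_sgrnas):
    """Average number of colonies (or cells) per sgRNA."""
    return n_units / n_sgrnas


def min_input_cells(n_sgrnas, fold):
    """Minimum cells per sample needed to keep the requested fold coverage."""
    return n_sgrnas * fold


n_pairs = library_size(38, 4, {"Cdkn2a": 3})          # 37 * 4 + 3
colony_coverage = fold_coverage(67650, n_pairs)        # colonies per sgRNA
cells_needed = min_input_cells(n_pairs, 100)           # 100-fold coverage
```

With these inputs the library size is 151 pairs and the minimum input is 15,100 cells, matching the text. The colony estimate works out to about 448 colonies per sgRNA, in line with the roughly 440-fold coverage reported.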
Analysis of MIP-seq data
AAV CRISPR screen data were processed as described previously (Wang et al. 2018). Briefly, FASTQ reads were mapped to the mm10 genome using the bwa mem function in BWA v0.7.10 with default parameters (Li and Durbin 2009). BAM files were sorted and indexed using SAMtools v1.11 (Li et al. 2009). For each sample, indel variants were called using SAMtools and VarScan 2 (Koboldt et al. 2012). The output of the SAMtools mpileup command (parameters “-d 10000000000 -B -q 10”) was piped to VarScan pileup2indel (parameters “--min-coverage 1 --min-reads2 1 --min-var-freq 0 --p-value 0”). Next, the center position of each indel was mapped to the closest sgRNA cut site. Indels were filtered by requiring that each indel overlap the ±3 bp flank of the closest sgRNA cut site, and then the number of indel reads was summed for each sgRNA in each sample. The number of times each sgRNA sequence (and its reverse complement) appeared in each FASTQ file was also tabulated to determine the total sgRNA reads, and the fraction of indel reads was then calculated. The variant frequency was compared between samples using Welch's t-test to determine variants that were significantly enriched or depleted.
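The indel assignment and filtering steps described above can be sketched as follows. This is a simplified illustration rather than the published pipeline: it assumes a single chromosome, a nonempty list of cut sites, inclusive indel coordinates, and hypothetical function names, and the toy numbers are invented.

```python
from bisect import bisect_left
from collections import Counter


def closest_cut_site(position, sorted_sites):
    """Return the (cut_position, sgrna_id) nearest to `position`."""
    positions = [pos for pos, _ in sorted_sites]
    i = bisect_left(positions, position)
    # Only the neighbors on either side of the insertion point can be closest
    candidates = sorted_sites[max(0, i - 1): i + 1]
    return min(candidates, key=lambda site: abs(site[0] - position))


def indel_reads_per_sgrna(indels, cut_sites, flank=3):
    """Sum indel-supporting reads per sgRNA.

    indels:    list of (start, end, supporting_reads), end inclusive
    cut_sites: list of (cut_position, sgrna_id)
    """
    sites = sorted(cut_sites)
    counts = Counter()
    for start, end, reads in indels:
        center = (start + end) / 2
        cut_pos, sgrna = closest_cut_site(center, sites)
        # Keep the indel only if it overlaps [cut - flank, cut + flank]
        if start <= cut_pos + flank and end >= cut_pos - flank:
            counts[sgrna] += reads
    return counts


def indel_fraction(indel_reads, total_reads):
    """Fraction of reads carrying an indel for each sgRNA with coverage."""
    return {sg: indel_reads.get(sg, 0) / n for sg, n in total_reads.items() if n > 0}


# Toy data (invented): two cut sites and three called indels
toy_sites = [(100, "sgA"), (500, "sgB")]
toy_indels = [(98, 101, 40), (120, 125, 10), (499, 499, 25)]
toy_counts = indel_reads_per_sgrna(toy_indels, toy_sites)
toy_fractions = indel_fraction(toy_counts, {"sgA": 400, "sgB": 100})
```

In the toy data, the second indel is assigned to sgA by proximity but is discarded because it lies outside the ±3 bp window, so it does not contribute to the indel fraction.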
Individual knockout experiments
To target a specific gene, we cotransfected three sgRNAs to maximize knockout efficiency. For these experiments, 5.3 µg of each sgRNA plasmid was combined, resulting in a total of 16 µg being cotransfected into 293T cells. The sgRNAs used are listed in Supplemental Table S7. Except for the sgRNAs targeting Kdm2a and Nsd1, three of the four sgRNAs used in the CRISPR screening were selected. The sgRNAs for Kdm2a knockout were previously confirmed to have high knockout efficiency in our earlier study (Yuan et al. 2020). For Nsd1 and Kmt5b knockouts, we used two sgRNAs that ranked higher than those used in the screening, as determined by the CRISPick sgRNA design tool (Supplemental Table S7). AAVs were injected into 4–5 week old Cas9-EGFP mice at a dose of 5 × 10¹¹ genome copies per mouse. The animals were fed a normal diet for 1 week, followed by an 8 week challenge with a DDC diet to induce biliary reprogramming.
Data and material availability
Sequencing data have been deposited to NCBI GEO under accession numbers GSE270508 (CUT&Tag-seq) and GSE270509 (RNA-seq). Other data used in this study are available under accession numbers GSE218945 (RNA-seq) (Katsuda et al. 2024a) and GSE156894 (RNA-seq) (Merrell et al. 2021). Detailed scripts and parameters used for each step of the analysis will be provided on reasonable request to the authors.
Statistics and reproducibility
All statistical analyses were conducted in R versions 4.4.0–4.4.2 (R Foundation for Statistical Computing). P < 0.05 was considered statistically significant. Continuous variables were compared using the two-sided Student's or Welch's t-test, as indicated in the respective figure legends, after confirming normality of the data. Body weight metrics were assessed using ANOVA followed by Tukey's multiple comparisons test. In all box plots, the center line marks the median, hinges mark the 25th and 75th percentiles, and whiskers extend to 1.5 times the interquartile range. At least two biological replicates were used for each experiment. Animals were randomly allocated to each group.
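For readers unfamiliar with Welch's t-test, the statistic and its degrees of freedom can be computed directly from two samples. The sketch below is illustrative only: the analyses in this study were run in R, the function name `welch_t` is hypothetical, and the input values are invented. It returns the t statistic and the Welch–Satterthwaite degrees of freedom; the P-value would then be read from the t distribution.

```python
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two groups are not assumed to share
    a common variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / (se2_a + se2_b) ** 0.5
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


# Invented values, e.g., indel fractions for one sgRNA in two groups of samples
toy_t, toy_dof = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The degrees of freedom are generally noninteger and never exceed those of the pooled-variance Student's test, which makes the test more conservative when group variances differ.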
Acknowledgments
We acknowledge the Molecular Pathology and Imaging Core at the University of Pennsylvania for histological services. We are thankful to members of the Stanger and Zaret laboratories for comments and suggestions, Steven Henikoff (Fred Hutchinson Cancer Research Center) for advice on CUT&Tag experiments, and Kai Ge (National Institute of Diabetes and Digestive and Kidney Diseases) for generously providing the LSL-H3K36M mice. This work was supported by National Institutes of Health grants R01DK083355, R01DK125387, HD106051, and CA196539; the Fred and Suzanne Biesecker Pediatric Liver Center; the Abramson Family Cancer Research Institute; the International Medical Research Foundation; the Daiichi Sankyo Foundation of Life Science; the Mochida Memorial Foundation for Medical and Pharmaceutical Research; the Mitsukoshi Health and Welfare Foundation; the Uehara Memorial Foundation; the Kanae Foundation; the Japanese Biochemical Society; and the Osamu Hayaishi Memorial Scholarship for study abroad.
Author contributions: T.K. and B.Z.S. conceived the study. T.K., J.H.S., S.Y., and B.Z.S. performed the methodology. T.K., J.H.S., and B.Z.S. designed the experiments. T.K., H.W.C., and J.H.S. conducted the experiments. B.A.G. and I.A.A. acquired the resources. J.H.S., T.K., and K.I. analyzed the bioinformatics. J.H.S., T.K., and B.Z.S. wrote the manuscript. B.Z.S. acquired the funding. T.K. and B.Z.S. supervised the study.
Footnotes
Supplemental material is available for this article.
Article published online ahead of print. Article and publication date are online at http://www.genesdev.org/cgi/doi/10.1101/gad.352420.124.
Competing interest statement
The authors declare no competing interests.
References
- Bartosovic M, Kabbe M, Castelo-Branco G. 2021. Single-cell CUT&Tag profiles histone modifications and transcription factors in complex tissues. Nat Biotechnol 39: 825–835. 10.1038/s41587-021-00869-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Boyle EA, O'Roak BJ, Martin BK, Kumar A, Shendure J. 2014. MIPgen: optimized modeling and design of molecular inversion probes for targeted resequencing. Bioinformatics 30: 2670–2672. 10.1093/bioinformatics/btu353 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cantsilieris S, Stessman HA, Shendure J, Eichler EE. 2017. Targeted capture and high-throughput sequencing using molecular inversion probes (MIPs). Methods Mol Biol 1492: 95–106. 10.1007/978-1-4939-6442-0_6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chen C, Shin JH, Fang Z, Brennan K, Horowitz NB, Pfaff KL, Welsh EL, Rodig SJ, Gevaert O, Gozani O, et al. 2023. Targeting KDM2A enhances T-cell infiltration in NSD1-deficient head and neck squamous cell carcinoma. Cancer Res 83: 2645–2655. 10.1158/0008-5472.CAN-22-3114 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C, Orchard R, et al. 2016. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR–Cas9. Nat Biotechnol 34: 184–191. 10.1038/nbt.3437 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Edwards JR, Yarychkivska O, Boulard M, Bestor TH. 2017. DNA methylation and DNA methyltransferases. Epigenetics Chromatin 10: 23. 10.1186/s13072-017-0130-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ernst J, Kellis M. 2017. Chromatin-state discovery and genome annotation with ChromHMM. Nat Protoc 12: 2478–2492. 10.1038/nprot.2017.124 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Giroux V, Rustgi AK. 2017. Metaplasia: tissue injury adaptation and a precursor to the dysplasia–cancer sequence. Nat Rev Cancer 17: 594–604. 10.1038/nrc.2017.68 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kamimoto K, Kaneko K, Kok CY-Y, Okada H, Miyajima A, Itoh T. 2016. Heterogeneity and stochastic growth regulation of biliary epithelial cells dictate dynamic epithelial tissue remodeling. eLife 5: e15034. 10.7554/eLife.15034 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Katsuda T, Cure H, Sussman J, Simeonov KP, Krapp C, Arany Z, Grompe M, Stanger BZ. 2023. Rapid in vivo multiplexed editing (RIME) of the adult mouse liver. Hepatology 78: 486–502. 10.1002/hep.32759 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Katsuda T, Sussman JH, Ito K, Katznelson A, Yuan S, Takenaka N, Li J, Merrell AJ, Cure H, Li Q, et al. 2024a. Cellular reprogramming in vivo initiated by SOX4 pioneer factor activity. Nat Commun 15: 1761. 10.1038/s41467-024-45939-z [DOI] [PMC free article] [PubMed] [Google Scholar]
- Katsuda T, Sussman JH, Zaret KS, Stanger BZ. 2024b. The yin and yang of pioneer transcription factors: dual roles in repression and activation. Bioessays 46: 2400138. 10.1002/bies.202400138 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kaya-Okur HS, Wu SJ, Codomo CA, Pledger ES, Bryson TD, Henikoff JG, Ahmad K, Henikoff S. 2019. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat Commun 10: 1930. 10.1038/s41467-019-09982-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kiourtis C, Wilczynska A, Nixon C, Clark W, May S, Bird TG. 2021. Specificity and off-target effects of AAV8-TBG viral vectors for the manipulation of hepatocellular gene expression in mice. Biol Open 10: bio058678. 10.1242/bio.058678 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ko EK, Anderson A, D'souza C, Zou J, Huang S, Cho S, Alawi F, Prouty S, Lee V, Yoon S, et al. 2024. Disruption of H3K36 methylation provokes cellular plasticity to drive aberrant glandular formation and squamous carcinogenesis. Dev Cell 59: 187–198.e7. 10.1016/j.devcel.2023.12.007 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, Miller CA, Mardis ER, Ding L, Wilson RK. 2012. Varscan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res 22: 568–576. 10.1101/gr.129684.111 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kooistra SM, Helin K. 2012. Molecular mechanisms and potential functions of histone demethylases. Nat Rev Mol Cell Biol 13: 297–311. 10.1038/nrm3327 [DOI] [PubMed] [Google Scholar]
- Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25: 1754–1760. 10.1093/bioinformatics/btp324 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25: 2078–2079. 10.1093/bioinformatics/btp352 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lu C, Jain SU, Hoelper D, Bechet D, Molden RC, Ran L, Murphy D, Venneti S, Hameed M, Pawel BR, et al. 2016. Histone H3K36 mutations promote sarcomagenesis through altered histone methylation landscape. Science 352: 844–849. 10.1126/science.aac7272 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lu D-H, Yang J, Gao L-K, Min J, Tang J-M, Hu M, Li Y, Li S-T, Chen J, Hong L. 2019. Lysine demethylase 2A promotes the progression of ovarian cancer by regulating the PI3K pathway and reversing epithelial–mesenchymal transition. Oncol Rep 41: 917–927. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Martin M, Motolani A, Kim H-G, Collins AM, Alipourgivi F, Jin J, Wei H, Wood BA, Ma Y-Y, Dong XC, et al. 2023. KDM2A deficiency in the liver promotes abnormal liver function and potential liver damage. Biomolecules 13: 1457. 10.3390/biom13101457 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Merrell AJ, Stanger BZ. 2016. Adult cell plasticity in vivo: de-differentiation and transdifferentiation are back in style. Nat Rev Mol Cell Biol 17: 413–425. 10.1038/nrm.2016.24 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Merrell AJ, Peng T, Li J, Sun K, Li B, Katsuda T, Grompe M, Tan K, Stanger BZ. 2021. Dynamic transcriptional and epigenetic changes drive cellular plasticity in the liver. Hepatology 74: 444–457. 10.1002/hep.31704 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Michalopoulos GK, Barua L, Bowen WC. 2005. Transdifferentiation of rat hepatocytes into biliary cells after bile duct ligation and toxic biliary injury. Hepatology 41: 535–544. 10.1002/hep.20600 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nieto MA, Huang RY-J, Jackson RA, Thiery JP. 2016. EMT: 2016. Cell 166: 21–45.
- Platt RJ, Chen S, Zhou Y, Yim MJ, Swiech L, Kempton HR, Dahlman JE, Parnas O, Eisenhaure TM, Jovanovic M, et al. 2014. CRISPR–Cas9 knockin mice for genome editing and cancer modeling. Cell 159: 440–455.
- Radwan A, Eccleston J, Sabag O, Marcus H, Sussman J, Ouro A, Rahamim M, Azagury M, Azria B, Stanger BZ, et al. 2024. Transdifferentiation occurs without resetting development-specific DNA methylation, a key determinant of full-function cell identity. Proc Natl Acad Sci 121: e2411352121. 10.1073/pnas.2411352121
- Replogle JM, Norman TM, Xu A, Hussmann JA, Chen J, Cogan JZ, Meer EJ, Terry JM, Riordan DP, Srinivas N, et al. 2020. Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nat Biotechnol 38: 954–961. 10.1038/s41587-020-0470-y
- Santinha AJ, Klingler E, Kuhn M, Farouni R, Lagler S, Kalamakis G, Lischetti U, Jabaudon D, Platt RJ. 2023. Transcriptional linkage analysis with in vivo AAV–Perturb-seq. Nature 622: 367–375. 10.1038/s41586-023-06570-y
- Schaub JR, Huppert KA, Kurial SNT, Hsu BY, Cast AE, Donnelly B, Karns RA, Chen F, Rezvani M, Luu HY, et al. 2018. De novo formation of the biliary system by TGFβ-mediated hepatocyte transdifferentiation. Nature 557: 247–251. 10.1038/s41586-018-0075-5
- Schmitges FW, Prusty AB, Faty M, Stützer A, Lingaraju GM, Aiwazian J, Sack R, Hess D, Li L, Zhou S, et al. 2011. Histone methylation by PRC2 is inhibited by active chromatin marks. Mol Cell 42: 330–341. 10.1016/j.molcel.2011.03.025
- Schuettengruber B, Bourbon H-M, Croce LD, Cavalli G. 2017. Genome regulation by Polycomb and Trithorax: 70 years and counting. Cell 171: 34–57. 10.1016/j.cell.2017.08.002
- Srinivas S, Watanabe T, Lin C-S, William CM, Tanabe Y, Jessell TM, Costantini F. 2001. Cre reporter strains produced by targeted insertion of EYFP and ECFP into the ROSA26 locus. BMC Dev Biol 1: 4. 10.1186/1471-213X-1-4
- Streubel G, Watson A, Jammula SG, Scelfo A, Fitzpatrick DJ, Oliviero G, McCole R, Conway E, Glancy E, Negri GL, et al. 2018. The H3K36me2 methyltransferase Nsd1 demarcates PRC2-mediated H3K27me2 and H3K27me3 domains in embryonic stem cells. Mol Cell 70: 371–379.e5. 10.1016/j.molcel.2018.02.027
- Tarlow BD, Pelz C, Naugler WE, Wakefield L, Wilson EM, Finegold MJ, Grompe M. 2014. Bipotential adult liver progenitors are derived from chronically injured mature hepatocytes. Cell Stem Cell 15: 605–618. 10.1016/j.stem.2014.09.008
- Wang GG, Cai L, Pasillas MP, Kamps MP. 2007. NUP98–NSD1 links H3K36 methylation to Hox-A gene activation and leukaemogenesis. Nat Cell Biol 9: 804–812. 10.1038/ncb1608
- Wang G, Chow RD, Ye L, Guzman CD, Dai X, Dong MB, Zhang F, Sharp PA, Platt RJ, Chen S. 2018. Mapping a functional cancer genome atlas of tumor suppressors in mouse liver using AAV–CRISPR–mediated direct in vivo screening. Sci Adv 4: eaao5508. 10.1126/sciadv.aao5508
- Wu N, Nguyen Q, Wan Y, Zhou T, Venter J, Frampton GA, DeMorrow S, Pan D, Meng F, Glaser S, et al. 2017. The Hippo signaling functions through the Notch signaling to regulate intrahepatic bile duct development in mammals. Lab Invest 97: 843–853. 10.1038/labinvest.2017.29
- Yanger K, Zong Y, Maggs LR, Shapira SN, Maddipati R, Aiello NM, Thung SN, Wells RG, Greenbaum LE, Stanger BZ. 2013. Robust cellular reprogramming occurs spontaneously during liver regeneration. Genes Dev 27: 719–724. 10.1101/gad.207803.112
- Yano S, Ishiuchi T, Abe S, Namekawa SH, Huang G, Ogawa Y, Sasaki H. 2022. Histone H3K36me2 and H3K36me3 form a chromatin platform essential for DNMT3A-dependent DNA methylation in mouse oocytes. Nat Commun 13: 4440. 10.1038/s41467-022-32141-2
- Ye L, Park JJ, Dong MB, Yang Q, Chow RD, Peng L, Du Y, Guo J, Dai X, Wang G, et al. 2019. In vivo CRISPR screening in CD8 T cells with AAV–Sleeping Beauty hybrid vectors identifies membrane targets for improving immunotherapy for glioblastoma. Nat Biotechnol 37: 1302–1313. 10.1038/s41587-019-0246-4
- Yimlamai D, Christodoulou C, Galli GG, Yanger K, Pepe-Mooney B, Gurung B, Shrestha K, Cahan P, Stanger BZ, Camargo FD. 2014. Hippo pathway activity influences liver cell fate. Cell 157: 1324–1338. 10.1016/j.cell.2014.03.060
- Yuan W, Xu M, Huang C, Liu N, Chen S, Zhu B. 2011. H3K36 methylation antagonizes PRC2-mediated H3K27 methylation. J Biol Chem 286: 7983–7989. 10.1074/jbc.M110.194027
- Yuan S, Natesan R, Sanchez-Rivera FJ, Li J, Bhanu NV, Yamazoe T, Lin JH, Merrell AJ, Sela Y, Thomas SK, et al. 2020. Global regulation of the histone mark H3K36me2 underlies epithelial plasticity and metastatic progression. Cancer Discov 10: 854–871. 10.1158/2159-8290.CD-19-1299
- Zhang N, Bai H, David KK, Dong J, Zheng Y, Cai J, Giovannini M, Liu P, Anders RA, Pan D. 2010. The Merlin/NF2 tumor suppressor functions through the YAP oncoprotein to regulate tissue homeostasis in mammals. Dev Cell 19: 27–38. 10.1016/j.devcel.2010.06.015
- Zhuang L, Jang Y, Park Y-K, Lee J-E, Jain S, Froimchuk E, Broun A, Liu C, Gavrilova O, Ge K. 2018a. Depletion of Nsd2-mediated histone H3K36 methylation impairs adipose tissue development and function. Nat Commun 9: 1796. 10.1038/s41467-018-04127-6
- Zhuang L, Jang Y, Park Y-K, Lee J-E, Jain S, Froimchuk E, Broun A, Liu C, Gavrilova O, Ge K. 2018b. Depletion of Nsd2-mediated histone H3K36 methylation impairs adipose tissue development and function. Nat Commun 9: 1796. 10.1038/s41467-018-04127-6