Abstract
Tissue damage increases cancer risk through poorly understood mechanisms1. In the pancreas, pancreatitis associated with tissue injury collaborates with activating mutations in the Kras oncogene to dramatically accelerate the formation of early neoplastic lesions and ultimately pancreatic cancer2,3. By integrating genomics, single-cell chromatin assays and spatiotemporally-controlled functional perturbations in autochthonous mouse models, we show that the combination of Kras mutation and tissue damage promotes a unique chromatin state in the pancreatic epithelium that distinguishes neoplastic transformation from normal regeneration and is selected for throughout malignant evolution. This cancer-associated epigenetic state emerges within 48 hours of pancreatic injury, and involves an acinar-to-neoplasia ‘chromatin switch’ that contributes to the early dysregulation of genes defining human pancreatic cancer. Among the genes most rapidly activated upon tissue damage in the pre-malignant pancreatic epithelium is the alarmin cytokine IL-33, which cooperates with mutant Kras in unleashing the epigenetic remodeling program of early neoplasia and neoplastic transformation in the absence of injury. Collectively, our study demonstrates how gene-environment interactions can rapidly produce gene regulatory programs that dictate early neoplastic commitment and provides a molecular framework for understanding the interplay between genetics and environmental cues in cancer initiation.
Understanding the mechanisms by which tissue damage promotes cancer initiation may expose rational strategies to prevent, detect, and intercept tumors before they evolve to an intractable stage. A paradigm of damage-associated carcinogenesis is pancreatic ductal adenocarcinoma (PDAC), an invariably lethal cancer that lacks effective therapies. Its major oncogene, mutant Kras, is found altered in virtually all patients and is necessary for disease initiation and maintenance4,5. Intriguingly, Kras gene mutations are only weakly oncogenic but potently cooperate with signals emanating from tissue damage and the resulting inflammation (pancreatitis) to initiate the disease2,3,6. Normally, pancreatic injury triggers a rapid cell fate transition characterized by loss of acinar differentiation and concomitant acquisition of a ‘duct-like’ state, a process termed acinar-to-ductal metaplasia (ADM) that resolves via acinar re-differentiation as the tissue regenerates7. However, in the presence of a Kras mutation, metaplasia aberrantly persists and progresses into pancreatic intraepithelial neoplasia (PanIN)8,9. These observations suggest that oncogenic KRAS co-opts otherwise reparative regenerative responses to drive PDAC initiation10. Given that neoplastic lesions emerge rapidly after tissue damage in the setting of mutant Kras but in the apparent absence of additional mutations2,3,6, we hypothesize that uncharacterized epigenetic mechanisms underlie the interplay between cancer-predisposing mutations and environmental insults in cancer pathogenesis.
Injury-induced chromatin states
As a first step towards dissecting how gene–environment interactions reprogram the pancreatic epithelium during tumor development, we generated chromatin accessibility maps of pancreatic epithelial cells freshly isolated from healthy, damaged, early neoplastic or malignant tissues from genetically and pathologically accurate mouse models engineered to enable selective isolation of exocrine pancreatic epithelial cells using the fluorescent reporter mKate2 (Extended Data Fig. 1a–c; Supplementary Fig. 1a, see Methods). Specifically, mKate2+ cells from the following tissue conditions were subjected to ATAC-seq11: (i) normal healthy pancreas (Normal), (ii) normal pancreas undergoing regenerative ADM driven by tissue damage (Injury); (iii) Kras-mutant pancreata undergoing stochastic neoplastic transformation (Kras*), (iv) Kras-mutated pancreata undergoing synchronous neoplastic reprogramming accelerated by tissue damage (Kras*+Injury) and, as reference for advanced disease, (v) PDAC (PDAC) arising in KPflC mice (Ptf1a-Cre;RIK;LSL-KrasG12D;p53fl/+) or upon syngeneic orthotopic transplantation of KrasG12D;p53-null engineered pancreatic organoids (Fig. 1a–b, Extended Data Fig. 1b–d; Supplementary Table 1).
Differential accessibility analysis was used to identify open chromatin regions (peaks) that were significantly gained or lost in each condition compared to Normal, which we refer to as accessibility-GAIN and accessibility-LOSS regions, respectively (Supplementary Table 2). As illustrated in Fig. 1b, these analyses uncovered large-scale chromatin accessibility changes across conditions, with the majority of changes contributed by cooperative effects of tissue damage and mutant Kras early on in tumorigenesis (PC1: 56%), rather than the later transition from early neoplasia to PDAC (PC2: 16%). Strikingly, cells undergoing synchronous neoplastic reprogramming by the combined effects of tissue damage and oncogenic Kras displayed more than half of the chromatin aberrations that distinguish advanced PDAC from normal pancreas (Fig. 1c), suggesting that PDAC co-opts chromatin regulatory mechanisms from its onset.
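As an illustration of how peaks can be sorted into accessibility-GAIN and accessibility-LOSS sets relative to Normal, the following minimal sketch labels each peak from a fold change and an adjusted P value. The cutoffs, the `PeakResult` type and the example values are hypothetical and chosen only for illustration; the thresholds actually used in the analysis are not stated in this section.

```python
from dataclasses import dataclass

# Hypothetical cutoffs for illustration; the thresholds used in the study
# are not given in this section.
LOG2FC_CUTOFF = 1.0
FDR_CUTOFF = 0.05


@dataclass
class PeakResult:
    peak_id: str
    log2_fold_change: float  # condition versus Normal
    adjusted_p: float        # multiple-testing-corrected P value


def classify_peak(result, log2fc_cutoff=LOG2FC_CUTOFF, fdr_cutoff=FDR_CUTOFF):
    """Label a peak as GAIN, LOSS or unchanged relative to Normal."""
    if result.adjusted_p >= fdr_cutoff:
        return "unchanged"
    if result.log2_fold_change >= log2fc_cutoff:
        return "GAIN"
    if result.log2_fold_change <= -log2fc_cutoff:
        return "LOSS"
    return "unchanged"


# Made-up example values, not data from the study.
example = [
    PeakResult("peak_1", 2.3, 0.001),   # significant and more open
    PeakResult("peak_2", -1.8, 0.01),   # significant and less open
    PeakResult("peak_3", 0.4, 0.002),   # significant but small effect
    PeakResult("peak_4", 3.0, 0.2),     # large effect but not significant
]
labels = {r.peak_id: classify_peak(r) for r in example}
print(labels)
```

Requiring both an effect-size and a significance cutoff keeps small but statistically significant shifts out of the GAIN and LOSS sets.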
Comparison of the dynamic peaks associated with the reversible metaplasia that accompanies physiological regeneration (Injury) to those occurring in persistent, pre-neoplastic metaplasia (Kras* or Kras*+Injury) revealed shared and unique traits. Accessibility-LOSS changes were largely shared and, consistent with the reduction in acinar differentiation that defines both processes10, preferentially affected loci linked to acinar cell functions (Fig. 1d, Extended Data Fig. 1e, Supplementary Table 2). While there was also overlap between accessibility-GAIN regions, the combination of mutant Kras and injury produced a large number (>8,500) of additional chromatin accessibility changes that were not observed in pancreatic epithelial cells expressing mutant Kras or subject to injury alone (Fig. 1d; Extended Data Fig. 1e). Unsupervised hierarchical clustering of all dynamic peaks sensitive to mutant Kras and/or tissue damage identified clusters of open chromatin regions specific to normal healthy (A), regenerative (R) and early neoplastic (N) tissue states as well as shared (S), with a large set of peaks uniquely gained upon co-occurrence of both stimuli (N2) (Extended Data Fig. 2a).
Notably, 67% of the accessibility-GAINS unique to cells undergoing synchronous neoplastic transformation by the combined effects of mutant Kras and injury (cluster N2) were retained in advanced PDAC, whereas those specific to each insult alone (e.g. cluster R) were not (Fig. 1e). These early cancer-associated chromatin configurations arose within 48 hours of tissue damage and were associated with genes linked to PDAC-related pathways, including many cancer-relevant factors (Supplementary Tables 2–3; Extended Data Fig. 2b). The majority of open chromatin regions distinguishing normal (A), regenerating (R), and early neoplastic (N) tissues mapped to non-coding intergenic and intronic regions containing motifs for master transcription factors (TFs) controlling pancreatic cell lineage commitment (e.g. NR5A2, PTF1A) and carcinogenesis (e.g. AP-1, SOX, KLF)12,13 (Extended Data Fig. 2c–d), with TF enrichment and co-occurrence patterns differing across conditions (Extended Data Fig. 2f,g). Of note, Kras mutation and tissue damage cooperatively promoted accessibility-LOSS at loci containing active enhancers and bound by acinar specification TFs in the normal exocrine pancreas14–16 and accessibility-GAIN at loci containing experimentally-validated enhancers of advanced PDAC17 (Extended Data Fig. 3a–c).
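The retention figure quoted above is, in essence, the fraction of peaks from an early cluster that are still open in a later state. A minimal sketch of that calculation follows, assuming peaks are represented as half-open `(chrom, start, end)` intervals and that any shared base counts as retention. Both choices, and the toy coordinates, are assumptions made for illustration rather than the criteria used in the study.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_retained(early_peaks, late_peaks):
    """Fraction of early peaks overlapping at least one late peak."""
    if not early_peaks:
        return 0.0
    retained = sum(
        1 for p in early_peaks if any(overlaps(p, q) for q in late_peaks)
    )
    return retained / len(early_peaks)


# Toy coordinates, not data from the study.
n2_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
pdac_peaks = [("chr1", 150, 250), ("chr2", 100, 120), ("chr3", 0, 10)]

fraction = fraction_retained(n2_peaks, pdac_peaks)
print(f"{fraction:.0%} of early peaks retained")
```

The quadratic scan is adequate for a toy example; for genome-scale peak sets a sorted sweep or an interval tree would be the usual choice.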
Perturbing chromatin output
To functionally relate chromatin changes to cell fate transitions in vivo, we adapted the mouse models described above to incorporate a pancreas-specific and doxycycline (dox)-regulatable GFP-linked short hairpin RNA (shRNA) enabling perturbation of the transcriptional output of active regulatory elements preferentially associated with fate-specifying genes. Specifically, we exploited the activity of a well-characterized chromatin reader, the Bromodomain and Extraterminal (BET) family member Brd4, which binds acetylated (active) chromatin and is particularly important for enhancer-mediated transcription of cell-identity genes18,19. We reasoned that inducible targeting of Brd4 function in Kras-mutant or -wild type pancreatic epithelial cells would perturb and expose, in a TF/context-agnostic manner, transcriptionally-active gene programs defining their states and reveal functional links between chromatin state and phenotypic output in vivo. Moreover, this genetic approach overcomes confounding effects of pharmacological BET inhibition that would simultaneously disrupt epigenetic programs of surrounding stromal cells known to influence pancreatic epithelial cell state/fate20.
To compare pro-neoplastic and regenerative programs, we produced models differing in Kras mutation status and harboring well-validated shRNAs targeting Brd4 (2 independent strains) or Renilla Luciferase (control) (Fig. 2a; Extended Data Fig. 3d; see Methods for details). As expected, dox-treated KCsh (Kras mutant) and Csh (Kras WT) mice expressing Brd4 shRNA (shBrd4), but not the Renilla shRNA (shRen) controls, showed potent suppression of Brd4 protein restricted to mKate2/GFP double-positive cells (Fig. 2b; Extended Data Fig. 3e), and RNA-seq and ATAC-seq data obtained from sorted mKate2/GFP+ cells showed that Brd4 suppression impaired the expression of established enhancer-associated cell-identity genes15,17,21 without decreasing chromatin accessibility at these loci or globally affecting transcription (Fig. 2c, Extended Data Fig. 3f,g; Supplementary Fig. 1b).
We next analyzed the phenotypic impact of perturbing Brd4 function on cells undergoing the regenerative and pro-neoplastic cell fate transitions profiled above (Extended Data Fig. 4a). Brd4-suppressed cells effectively lost acinar morphology and expression of acinar markers (CPA1, Amylase) while acquiring ductal (SOX9, KRT19) and dedifferentiation (Clusterin) markers in response to tissue damage, mutant Kras, or their combination in both regenerating and neoplastic conditions (Fig. 3a; Extended Data Fig. 4b–g). Therefore, despite the marked chromatin changes detected in damaged tissues (Fig. 1), Brd4 is not required for ADM and, in fact, restrains this transition. In contrast, Brd4 suppression impaired both the resolution of ADM during normal regeneration and the initiation of neoplasia in the context of mutant Kras. In the regenerative context, Brd4-suppressed cells retained metaplastic features at a time when metaplasia in controls had resolved (Fig. 3a; Extended Data Fig. 4b,e–g). In the neoplastic context, Brd4 suppression prevented the appearance of PanIN lesions (Fig. 3c–d). In both settings, metaplastic cells expressing shBrd4 disappeared over time, leading to tissue atrophy (Fig. 3a–c; Extended Data Fig. 5a–c). Of note, epithelial-specific Brd4 suppression neither reduced Myc protein nor recapitulated the effects of Myc suppression (Extended Data Fig. 5d–f). These results uncover distinct epigenetic requirements for the induction versus resolution of metaplasia and link Brd4 function to Myc-independent expression programs required for regenerative and neoplastic outcomes of epithelial cell plasticity.
To identify these programs, the chromatin accessibility landscapes from Figure 1 were compared to RNA-seq and ATAC-seq profiles of pancreatic epithelial cells (mKate2/GFP+) triggered to undergo pro-neoplastic or regenerative fate transitions in the presence and absence of Brd4 (Extended Data Fig. 4a, 6a). Consistent with the differential chromatin states observed during regenerative versus pro-neoplastic metaplasia, the Brd4-sensitive gene expression programs in each condition were also distinct (Fig. 3f; Extended Data Fig. 6b). In the regenerative context (Csh: Injury), Brd4 suppression blunted the expression of genes linked to acinar-specific (A) clusters identified through ATAC-seq (Fig. 3f; Extended Data Fig. 6b–d; Supplementary Table 4). Among these genes were master TFs (Ptf1a22) known to stabilize acinar cell identity (Extended Data Fig. 6b,d), providing an explanation for the exacerbated ADM and impairment of regeneration of Brd4-suppressed pancreata after injury.
In the neoplastic setting (KCsh: Kras*+Injury), Brd4 suppression also reduced acinar gene expression (Extended Data Fig. 6b–d, bottom) but, additionally, impaired the activation of a larger set of genes that otherwise are selectively opened and induced in this context (clusters N1/N2) (Fig. 3f; Extended Data Fig. 6c). These neoplasia-specific Brd4 targets included effectors of oncogenic Kras, targets of cancer-associated transcriptional networks, and genes characteristic of advanced human PDAC23 (Extended Data Fig. 6e–k). Consistent with a direct effect on chromatin transcriptional activity, Brd4 perturbation did not prevent the acquisition of injury-driven chromatin states observed in controls (Extended Data Fig. 6l,m). These results functionally connect the different injury-induced chromatin states of normal and Kras-mutant cells with distinct Brd4-dependent transcriptional programs required for regeneration or neoplastic transformation, respectively.
Epigenetic reprogramming instructs neoplastic commitment
To reveal potential mechanisms whereby chromatin dysregulation redirects regenerative injury responses to neoplasia, we performed RNA-seq analyses across the full spectrum of epithelial states described above (Fig. 1a). Integration of transcriptional, chromatin accessibility, and shBrd4 sensitivity profiles identified two major classes of differentially expressed genes (DEGs) distinguishing pancreatic epithelial (mKate2+) cells from regenerating (Injury), early neoplastic (Kras*; Kras*+Injury) and cancer (PDAC) tissues from healthy normal (Normal): DEGs with ubiquitously opened regulatory elements across all conditions (chromatin-stable DEGs) and DEGs displaying parallel accessibility-GAINS or LOSSES at one or more associated regulatory element (chromatin-dynamic DEGs) (Fig. 4a; Extended Data Fig. 7a,b; Supplementary Table 5). Interestingly, chromatin-stable DEGs were linked to housekeeping cellular processes, whereas ‘chromatin-dynamic DEGs’ encoded factors regulating traits altered in PDAC, represented an enriched fraction in mutant-Kras pancreata subject to injury (Kras*+Injury) and advanced cancer (PDAC), and were particularly sensitive to Brd4 perturbation (Extended Data Fig. 7c,d; Supplementary Table 6).
Comparison of dynamic cis-regulatory elements of DEGs altered during regeneration, early neoplasia and cancer revealed a common redistribution of transcriptionally-active open chromatin from loci containing binding sites for acinar lineage-specifying TFs (Fig. 4b, blue/green-colored peaks) to newly accessible regions enriched in motifs for wound healing and Ras/cancer-associated TFs including the AP-1 TF family (Fig. 4b, red-colored peaks; Supplementary Table 7). However, unsupervised clustering analysis identified a gene regulatory program that is uniquely induced by cooperative effects of mutant Kras and tissue damage (Fig. 4b). Accordingly, the relative expression of the TFs predicted to bind differentially-active chromatin domains differed between conditions, even among members of the same TF family (Extended Data Fig. 7e–g). Consistent with a role for this neoplasia-specific epigenetic program in carcinogenesis, the gene regulatory activities and expression outputs altered in Kras-mutant pancreata shortly after injury were largely shared with (and further exacerbated in) advanced disease (Extended Data Fig. 7e–g), strongly correlated with signatures defining human PDAC (Fig. 4c), and were blunted in shBrd4 metaplastic cells unable to progress to neoplasia (Fig. 4d). Thus, while loss of acinar differentiation is sufficient to activate certain AP-1 TFs and other cancer-associated networks during physiological metaplasia24, a distinct epigenetic program facilitated by injury-driven chromatin accessibility changes is required for neoplastic commitment.
To discriminate whether this neoplasia-specific chromatin state reflects bona-fide chromatin remodeling (versus a shift in the proportion of pre-existing diverse cell types comprising the epithelium25), we applied single-cell ATAC-seq (scATAC-seq) on over 6,000 Kras-mutant cells isolated from Kras* or Kras*+Injury conditions. These analyses revealed rapid chromatin accessibility shifts induced by injury within and across epigenetically heterogeneous subpopulations of pre-malignant Kras-mutant cells (Extended Data Fig. 8a–e; Supplementary Table 8) through remodeling of specific loci consistent with those detected in bulk populations (r =0.765–0.882) (Extended Data Fig. 8f–i). Also consistent with chromatin remodeling, mapping of activity scores for acinar differentiation (e.g. NR5A2) and Kras/injury-activated (e.g. AP-1) TFs per individual cell showed a shift in TF activity of Kras-mutant cells upon tissue injury (Fig. 4e) and an anticorrelation of their activity scores across single-cell epigenetic profiles (Fig. 4f; Extended Data Fig. 8j). scATAC-seq analyses also captured depletion of cells with specific chromatin states (e.g. acinar-state) coinciding with the emergence of less differentiated subpopulations defined by widespread chromatin opening at neoplasia-associated loci (Extended Data Fig. 8h,i). In contrast, metaplasia-associated genes were found to pre-exist in an open state across most epithelial subpopulations (including in cells with open chromatin at acinar genes) and, in agreement with bulk analyses, did not experience further accessibility gain upon injury (Extended Data Fig. 8i-right). Thus, while the activity of specific TFs that underlie such early epigenetic heterogeneity and injury-driven plasticity currently relies on correlative observations, these results demonstrate that oncogenic mutations and tissue injury cooperatively remodel chromatin to produce neoplasia-specific transcriptional programs. 
We refer to this process as the ‘acinar-to-neoplasia’ chromatin switch.
IL-33 is a chromatin-activated effector of early neoplasia
Many of the neoplasia-specific chromatin-activated genes encoded membrane-bound and secreted proteins (Fig. 4g; Fig. 5a). Among the most robustly activated genes was the alarmin cytokine IL-33, an injury-associated factor that coordinates wound healing and tissue repair responses26. Indeed, analysis of both bulk and scATAC-seq chromatin accessibility datasets consistently identified numerous peaks at the Il33 locus that were rapidly and selectively gained in Kras-mutant pancreata undergoing the injury-facilitated chromatin switch (Kras*+Injury; Extended Data Fig. 9a–d) and retained in established PDAC (Extended Data Fig. 9e). These changes correlated with a Brd4-dependent increase in Il33 expression within the pancreatic epithelium (Fig. 5a,b, Extended Data Fig. 9f–h) that could be cooperatively induced in cultured cells upon transduction with Ras/injury-sensitive TFs previously validated to bind these dynamic loci27 (Extended Data Fig. 9e,i). Accordingly, multiplexed immunoassay for 40 different cytokines identified IL-33 as the most abundant cytokine in Kras-mutant pancreata after injury (Extended Data Fig. 9j).
We next examined the extent to which exogenous IL-33 could recapitulate the effects of tissue damage by intraperitoneal administration of recombinant mouse IL-33 (rIL-33) to Kras-mutant- or Kras-wt mice (Extended Data Fig. 10a). Remarkably, rIL-33 mimicked injury in cooperating with mutant Kras to activate the neoplasia-specific, Brd4-dependent gene expression program induced upon tissue damage (Fig. 5c,d; Extended Data Fig. 10b–e), including genes upregulated in human PDAC (Extended Data Fig. 10f). These transcriptional outputs were preceded by accessibility-GAIN and gene expression at cancer-associated loci sensitive to injury in pre-neoplastic Kras-mutant tissues (Extended Data Fig. 10g–i), and associated with an accelerated appearance of PanIN lesions (see ‘KC-GEMM’ panels in Fig. 5e,f; Extended Data Fig. 10j). Notably, rIL-33 had no detectable effects on normal pancreata (Fig. 5e,f; Extended Data Fig. 10j). Of note, rIL-33 did not significantly induce its own mRNA or nuclear protein staining in Kras-mutant epithelial cells (Extended Data Fig. 10k,l), suggesting that its ability to phenocopy injury in the presence of oncogenic Kras is predominantly due to its soluble form. These results identify IL-33 as a target and effector of gene-environment interactions driving early-stage neoplasia and suggest a chromatin-mediated amplification mechanism whereby tissue damage mediators unleash and enforce oncogene-dependent gene expression.
Discussion
Here we document large-scale chromatin accessibility remodeling in damaged tissues that, in the presence of an oncogenic Kras mutation, leads to an epigenetic program – not accessible during physiological regeneration – that contributes to tumor initiation and is selected for during malignant progression. By combining bulk and single-cell profiling with spatiotemporally-controlled in vivo perturbations of mechanisms that regulate cell identity, we show that pancreatic metaplasia involves epigenetic silencing of acinar identity loci that is exacerbated by Brd4 suppression, suggesting that somatic enhancers and/or super-enhancers linked to cell type specification28,29 actively prevent excessive dedifferentiation upon injury and facilitate its resolution. In contrast, progression to neoplasia couples dedifferentiation to a distinctive chromatin remodeling program that diverts DNA accessibility and Brd4-mediated transcription from normal lineage-specifying to cancer-defining loci. Thus, while enhancer remodeling facilitates metastasis in advanced PDAC17,30, these data imply a role for chromatin dysregulation at an early disease stage (see Supplementary Discussion for additional commentary).
Cancer initiation is facilitated by interactions between genetic and environmental insults. Our studies identify an epigenomic mechanism that contributes to this effect, involving an ‘acinar-to-neoplasia’ chromatin switch that can arise in Kras-mutant cells within 48 hours of tissue damage. One key target of this program is IL-33, whose cytokine activity can replace the requirement for injury in accelerating the formation of early neoplastic (PanIN) lesions. While IL-33 can both restrain31 and amplify anti-tumor immunity32 in advanced cancers, it connects tissue damage responses with oncogene-dependent epithelial plasticity during early neoplasia. Further study of these and other epigenetically-dysregulated programs may provide new opportunities for the rational design of early detection and treatment strategies to intercept inflammation- and RAS-driven malignancies such as PDAC at an earlier stage.
Methods
Generation and authentication of KCshBrd4 ESC clones
KC-shBrd4 ESCs (Ptf1a-Cre;LSL-KrasG12D;RIK;CHC33) were targeted with 2 independent GFP-linked Brd4-shRNAs (shBrd4.552 and shBrd4.1448)34,35 cloned into mir30-based targeting constructs36, as previously described33,36. Targeted ESCs were selected and functionally tested for single integration of the GFP-linked shRNA element into the CHC locus as previously described37. The KC-shRen ESC control clone used in this study has been previously described33,37. Before injection, ESCs were cultured briefly for expansion in KOSR+2i medium38. The identity and genotype of the ESCs, the resulting chimeric mice and their progeny were authenticated by genomic PCR using a common Col1a1 primer CACCCTGAAAACTTTGCCCC paired with a transgene-specific primer: shRen.713: GTATAGATAAGCATTATAATTCCTA; shBrd4.552: TATTGTTCCCATATCCAT; shBrd4.1448: CTAGTTTAGACTTGATTGTG, yielding an ∼250-bp product. ESCs were confirmed to be negative for mycoplasma and other microorganisms before injection.
Animal models
All animal experiments in this study were performed in accordance with a protocol approved by the Memorial Sloan-Kettering Institutional Animal Care and Use Committee. Mice were maintained under specific pathogen-free conditions, and food and water were provided ad libitum. All mouse strains have been previously described. Ptf1a-Cre39, LSL-KrasG12D40, CHC41, CAGs-LSL-RIK42, and TRE-GFP-shRen43 strains were interbred and maintained on mixed Bl6/129J backgrounds.
To enable selective isolation of epithelial cells from pancreatic tissues we employed the pancreas-specific Cre driver Ptf1a-Cre39,44 and the lineage-tracing allele LSL-rtTA3-IRES-mKate2 (RIK)41,42 that, by themselves or in combination with a Cre-activatable KrasG12D allele40, enable tagging of pancreatic epithelial cells harboring wild-type (WT) or mutant Kras by the fluorescent reporter mKate2. To compare the effects of tissue injury on the transcriptional and chromatin accessibility landscapes of mutant Kras-expressing or Kras wild-type pancreatic epithelial cells, KC-GEMM (Ptf1a-Cre;RIK;LSL-KrasG12D) or C-GEMM (Ptf1a-Cre;RIK) 5-week-old male mice were treated with 8 hourly intraperitoneal injections of 80 μg/kg caerulein (Bachem) or PBS for 2 consecutive days, using littermates when possible. To characterize invasive disease, pancreatic ductal adenocarcinoma (PDAC) cells were isolated from cancer lesions arising in autochthonous transgenic models (KPflC-GEMM; Ptf1a-Cre;RIK;LSL-KrasG12D;p53fl/+)5,45 that were macro-dissected away from pre-malignant tissue. As an orthogonal approach, 8–10-week-old C57Bl/6 female mice (Harlan) were subjected to orthotopic transplantation with syngeneic ductal organoids harbouring mutant Kras and an inactivated Trp53 gene (see below). Prior to transplantation, organoid cultures were dissociated with TrypLE (Gibco) after mechanical dissociation by pipetting. Then, 1–2 × 10^5 cells in serum-free advanced DMEM/F12 (Life Technologies) supplemented with 2 mM glutamine and pen-strep were mixed 1:1 with growth factor reduced matrigel (Corning) and injected into the exposed pancreas using a Hamilton syringe fitted with a 26 gauge needle. In all experiments with orthotopic or transgenic PDAC models, tumors did not exceed a maximum volume corresponding to 10% of the animal’s body weight (typically 12 mm diameter). Mice were evaluated daily for signs of distress or endpoint criteria.
Specifically, mice were immediately euthanized if they presented signs of cachexia, weight loss >20% of initial weight, breathing difficulties, or developed tumours 12 mm in diameter. No tumours exceeded this limit.
For studying the effects of epithelial-specific suppression of Brd4 or Myc in a mutant Kras background, chimeric cohorts of male KCshBrd4 mice derived from the ESCs described above were generated by the Center for Pancreatic Cancer Research (CPCR) at MSKCC or the Rodent Genetic Engineering Core at NYU as previously described33. ESC-derived KCshMyc mice have been previously described33. Only KC-shRNA mice with a coat color chimerism of >95% were included for experiments. Kras wild-type Csh counterparts were generated by strain intercrossing (to remove the LSL-KrasG12D allele). In the case of C-shRNA mice, both female and male mice were used, allocated at random to sex-matched treatment groups. To induce shRNA expression, mice were switched to a doxycycline diet (625 mg/kg, Harlan Teklad) that was changed twice weekly, at either 4 weeks of age (for injury-driven regeneration or accelerated neoplasia experiments) or postnatal day 10 (stochastic Kras-driven neoplasia).
For treatment with recombinant IL-33, 5-week-old C or KC mice were injected intraperitoneally once daily with 1 µg of murine recombinant IL-33 (#580504, R&D Systems) or vehicle (PBS) for 5 consecutive days, using sex-matched experimental groups.
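The dosing regimens described in this section reduce to simple arithmetic, sketched below. The 20 g body weight is a hypothetical example value, and the sketch reads "8 hourly injections ... for 2 consecutive days" as eight injections on each of two days, which is an interpretation rather than something this section states explicitly.

```python
# Values taken from the text.
CAERULEIN_UG_PER_KG = 80.0   # caerulein dose per injection
RIL33_UG_PER_DOSE = 1.0      # flat rIL-33 dose, once daily
RIL33_DAYS = 5

# Interpretation (assumption): eight injections on each of two days.
INJECTIONS_PER_DAY = 8
INJURY_DAYS = 2


def caerulein_dose_ug(body_weight_g):
    """Caerulein per injection (micrograms) for a mouse of the given weight."""
    return CAERULEIN_UG_PER_KG * body_weight_g / 1000.0


def caerulein_total_ug(body_weight_g):
    """Cumulative caerulein over the whole injury protocol (micrograms)."""
    return caerulein_dose_ug(body_weight_g) * INJECTIONS_PER_DAY * INJURY_DAYS


def ril33_total_ug():
    """Cumulative rIL-33 over the treatment course (micrograms)."""
    return RIL33_UG_PER_DOSE * RIL33_DAYS


weight_g = 20.0  # hypothetical body weight, for illustration only
per_injection = caerulein_dose_ug(weight_g)
cumulative = caerulein_total_ug(weight_g)
il33_cumulative = ril33_total_ug()
print(per_injection, cumulative, il33_cumulative)
```

Note that caerulein is dosed per kilogram of body weight, whereas rIL-33 is given as a fixed amount per animal.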
Pancreatic epithelial cell isolation
For RNA-seq, ATAC-seq and scATAC-seq analyses, lineage-traced epithelial cells were freshly isolated from pancreatic tissues from KC, KPflC, or KC-shRNA mice by FACS-sorting. Specifically, pancreata were finely chopped with scissors and incubated with digestion buffer containing 1 mg/ml Collagenase V (C9263, Sigma-Aldrich), 2 U/ml Dispase (17105041, Life Technologies) dissolved in HBSS with Mg2+ and Ca2+ (14025076, Thermo Fisher Scientific) supplemented with 0.1 mg/ml DNase I (Sigma, DN25–100MG) and 0.1 mg/ml Soybean Trypsin Inhibitor (STI) (T9003, Sigma), in gentleMACS C Tubes (Miltenyi Biotec) for 42 min at 37°C using the gentleMACS Octo Dissociator. Normal (non-fibrotic) pancreas samples were dissociated as above, except that the digestion buffer contained 1 mg/ml Collagenase D (11088858001, Sigma-Aldrich). After enzymatic dissociation, samples were washed with PBS and further digested with a 0.05% solution of Trypsin-EDTA (15400054, Thermo Fisher Scientific) diluted in PBS for 5 min at 37°C. Trypsin digestion was neutralized with FACS buffer (10 mM EGTA and 2% FBS in PBS) containing STI. Samples were then washed in FACS buffer containing DNase I and STI, and filtered through a 100 μm strainer. Cell suspensions were blocked for 5 min at room temperature with rat anti-mouse CD16/CD32 with Fcblock (Clone 2.4G2, BD Biosciences) in FACS buffer containing DNase I and STI, and an APC-conjugated CD45 antibody (Clone 30-F11, Biolegend, 1:200) or APC-Cy7 CD45 antibody (Clone 30-F11, Biolegend, 1:200) was then added and incubated for 10 min at 4°C. Cells were then washed once in FACS buffer containing DNase I and STI, filtered through a 40 μm strainer, and resuspended in FACS buffer containing DNase I and STI and 300 nM DAPI as live-cell marker. Sorts were performed on a BD FACSAria III cell sorter (Becton Dickinson) for mKate2+ cells (co-expressing GFP in on-dox shRNA mice), excluding CD45+ cells.
Cells were sorted directly into Trizol LS (Thermo Fisher Scientific) for RNA-seq or collected in 2% FBS in PBS for ATAC-seq.
Immunofluorescence, immunohistochemistry and histological analyses
Tissues were fixed overnight in 10% neutral buffered formalin (Richard-Allan Scientific), embedded in paraffin and cut into 5 µm sections. Slides were heated for 30 min at 55°C, deparaffinized, rehydrated with an alcohol series and subjected to antigen retrieval with citrate buffer (Vector Laboratories Unmasking Solution, H-3300) for 25 min in a pressure cooker set on high. Sections were treated with 3% H2O2 for 10 min followed by a wash in deionized water (for immunohistochemistry only), washed in PBS, then blocked in PBS/0.1% Triton X-100 containing 1% BSA. Primary antibodies were incubated overnight at 4°C in blocking buffer. The following primary antibodies were used: mKate2 (Evrogen, AB233, 1:1000), GFP (ab13970, Abcam, 1:500; and 2956S, Cell Signaling Technology, 1:200), Brd4 (HPA015055, Sigma-Aldrich, 1:100), Myc (ab32072, Abcam, 1:100), CPA1 (AF2765, R&D, 1:400), Clusterin (sc-6419, SCBT, 1:200), SOX9 (AB5535, Millipore, 1:1000), Amylase (sc-31869, SCBT, 1:1000), KRT19 (Troma III, Developmental Studies Hybridoma Bank, 1:500), FOSL1 (sc-376148, SCBT, 1:100), JUNB (sc-8051, SCBT, 1:100), AGR2 (NBP2–27393, Novus Biologicals, 1:200), DCLK1 (ab109029, Abcam, 1:200), Ki67 (BD Biosciences 550609, 1:200) and IL-33 (AF3626, R&D, 1:150). For mKate2, GFP and cMyc immunohistochemistry, Vector ImmPress HRP kits and ImmPact DAB (Vector Laboratories) were used for secondary detection. Tissues were then counterstained with Haematoxylin or, when indicated, Alcian blue (pH 2.5) and 0.1% Nuclear Fast Red Solution, dehydrated and mounted with Permount (Fisher). The immunohistochemistry detection of Brd4 was performed at the Molecular Cytology Core Facility of Memorial Sloan Kettering Cancer Center using a Discovery XT processor (Ventana Medical Systems-Roche). The tissue sections were blocked for 30 min in 10% normal goat serum, 2% BSA in PBS. A rabbit polyclonal anti-Brd4 antibody (HPA015055, Sigma-Aldrich) was used at a concentration of 1 µg/ml (1:100).
The incubation with the primary antibody was done for 6 hours, followed by a 60-minute incubation with biotinylated goat anti-rabbit IgG (PK610, Vector labs) at a concentration of 5.75 µg/mL. Blocker D, Streptavidin-HRP and DAB detection kit (760–124, Ventana Medical Systems-Roche) were used according to the manufacturer’s instructions. Slides were counterstained with Hematoxylin (760–2021, Ventana), Bluing Reagent (760–2037, Ventana) and coverslipped with Permount (Fisher Scientific).
For immunofluorescence, the following secondary antibodies were used: goat anti-chicken AF488 (A11039, Invitrogen, 1:500), donkey anti-chicken IgY H&L (FITC) (ab63507, Abcam, 1:500), donkey anti-rabbit AF594 (A21207, Invitrogen, 1:500), goat anti-rabbit AF594 (A11037, Thermo Fisher Scientific, 1:500), donkey anti-goat AF488 (A11055, Invitrogen, 1:500) and donkey anti-goat AF594 (A11058, Thermo Fisher Scientific, 1:500). Slides were counterstained with DAPI and mounted in ProLong Gold (Life Technologies). Hematoxylin and eosin (H&E) staining was performed using standard protocols. Images were acquired on a Zeiss AxioImager microscope using a 10 × (Zeiss NA 0.3) or 20 × (Zeiss NA 0.17) objective, an ORCA/ER CCD camera (Hamamatsu Photonics, Hamamatsu, Japan), and Axiovision or Zeiss (ZEN 2.3) software. Bright-field and fluorescence images of pancreata gross morphology were acquired using a Nikon SMZ1500 microscope and NIS-Element imaging software.
Histological classification and grading of pancreatic lesions into ADM or PanIN was performed by a veterinary pathologist blinded to genotype and treatment condition in H&E-stained slides using established criteria46. When applicable, the GFP+ area marking shRNA-expressing epithelial cells from KCsh mice was quantified using “SpotR software”. All lesions in at least 3 representative 20X fields per section were measured and counted. The results were averaged and normalized to total tissue area analyzed. Statistical analyses were performed using unpaired two-tailed Student’s t-test in GraphPad Prism (v7 and v8). Graphs display means ± s.e.m. of independent biological replicates (mice).
Multiplexed immunoassays in tissue lysates
0.1% SDS; 1mM EDTA; 1% NP-40, supplemented with fresh protease inhibitors, Complete™ Mini Protease Inhibitor Cocktail, Sigma) using PowerBead Tubes, Ceramic 2.8mm and PowerLyzer 24 Homogenizer (110/220 V, 2 × 30-sec cycles; S 3500) (Qiagen). After homogenization, RIPA lysates were incubated for 30 min at 4°C with continuous vortexing and clarified by centrifugation. Equal amounts of total tissue protein (25 µg) diluted in RIPA buffer were subjected to quantitative multiplexed ELISA (Discovery Assays) performed by Eve Technologies (Canada).
Cell culture and retroviral infection
Cells were maintained in a humidified incubator at 37 °C with 5% CO2. The 266–6 pancreatic acinar cell tumor cell line was purchased from ATCC (CRL-2151) and was grown in complete DMEM (DMEM, 10% FBS (Gibco), pen-strep) on non-coated, tissue-culture-treated plates. KPflC cells were derived from PDAC arising in a Ptf1a-Cre;LSL-KrasG12D;p53flox/+;RIK mouse generated by blastocyst injection, as previously described47, and propagated in complete DMEM on collagen-coated plates (PurCol, Advanced Biomatrix, 0.1 mg/ml). Frozen stocks were generated within 2 to 3 passages from the date of purchase (266–6) or generation (KPflC) and were used for the in vitro experiments. Cell lines were not externally authenticated. All cell lines used were negative for mycoplasma.
For stable transduction of 266–6 and KPflC cells with FOSL1, cJUN or both, VSV-G pseudotyped retroviral supernatants were generated from transduced Phoenix-gp packaging cells and infections were performed as described elsewhere34. The following plasmids were used: p6599 MSCV-IP N-HAonly FOSL1 (Addgene, #34897), p6600 MSCV-IP N-HAonly JUN (Addgene, #34898), pMIEG3-cJun (Addgene, #40348), and their corresponding empty vectors (MSCV-IP N-HAonly-EV or pMIEG3-EV). Empty vectors were generated by removing FOSL1 (from the MSCV-IP N-HAonly FOSL1 plasmid) or cJun (from the pMIEG3-cJun plasmid) cDNAs by digestion with XhoI/EcoRI-HF or XhoI/BamHI-HF restriction enzymes (New England Biolabs), respectively, followed by a Klenow step (M0210M, New England Biolabs), gel extraction purification of the digested backbone fragment (QIAquick Gel Extraction Kit, Qiagen), and ligation with T4 DNA Ligase (New England Biolabs), following the manufacturer’s instructions. All plasmids were authenticated by test digestion and Sanger sequencing. Infected cells were selected with 2 μg/mL (for 266–6 cells) or 8 μg/mL (for KPflC;RIK cells) puromycin (Sigma) and/or sorted based on GFP positivity using a Sony MA900 Cell Sorter (Sony) (GFP-vectors), depending on whether they were transduced with MSCV-IP-N-HA and/or pMIEG3 vectors, respectively, and were harvested for expression analyses at day 12 (KPflC;RIK) or day 28 (266–6) post-infection.
Quantitative Real-Time Polymerase Chain Reaction (qRT-PCR) analysis
Total RNA was isolated from mKate2+;CD45-;DAPI- sorted primary pancreatic epithelial cells using TRIzol LS (Thermo Fisher Scientific), and cDNA was obtained from 500 ng of RNA using the Transcriptor First Strand cDNA Synthesis Kit (Roche) after treatment with DNAse I (Invitrogen) following the manufacturer’s instructions. The following primer sets for mouse sequences were used: Il33_F GCTGCGTCTGTTGACACATT, Il33_R GACTTGCAGGACAGGGAGAC, Agr2_F ACAACTGACAAGCACCTTTCTC, Agr2_R GTTTGAGTATCGTCCAGTGATGT, Muc6_F AGCCCACATTCCCTATCAGC, Muc6_R CACAGTGGAAGATTGCGAGAG, Cpa1_F CAGTCTTCGGCAATGAGAACT, Cpa1_R GGGAAGGGCACTCGAACATC, Sox9_F CGTGCAGCACAAGAAAGACCA, Sox9_R GCAGCGCCTTGAAGATAGCAT, Hprt_F TCAGTCAACGGGGGACATAAA, Hprt_R GGGGCTGTACTGCTTAACCAG, Rplp0_F GCTCCAAGCAGATGCAGCA, Rplp0_R CCGGATGTGAGGCAGCAG, Actb_F GGCTGTATTCCCCTCCATCG and Actb_R CCAGTTGGTAACAATGCCATGT. qRT-PCR was carried out in triplicate (5 ng cDNA per reaction) using SYBR Green PCR Master Mix (Applied Biosystems) on the ViiA 7 Real-Time PCR System (Life Technologies). Hprt, Rplp0 (aka 36b4) or Actb served as endogenous normalization controls.
Isolation, culture and genetic manipulation of pancreatic organoids
To isolate untransformed ductal organoids for the transplantable PDAC tumorigenesis model, normal pancreas from LSL-KrasG12D mice (pure Bl/6N background) was minced and digested with 0.012% collagenase XI (C9407, Sigma-Aldrich) and 0.012% Dispase (17105041, Life Technologies) in HBSS with Mg2+ and Ca2+ (14025076, Thermo Fisher Scientific) at 37 °C for a maximum of 30 min. The material was further digested with TrypLE (GIBCO) for 5 min at 37°C, washed twice with DMEM/F12 (Life Technologies) supplemented with 2 mM glutamine and pen-strep, embedded in growth factor reduced matrigel (Corning), and cultured in complete medium, as described in Boj et al 2014. For activation of mutant Kras, organoids harbouring the LSL-KrasG12D allele were transduced with Ad-mCherry-Cre (Vector Biolabs), and Cherry+ cells were sorted from a single-cell organoid suspension by flow cytometry 36 h thereafter. Resulting clones were assessed for LSL-KrasG12D recombination by genotyping PCR on genomic DNA using the following primers: GTCTTTCCCCAGCACAGTGC, CTCTTGCCTACGCCACCAGCTC, and AGCTAGCCACCATGGCTTGAGTAAGTCTGCA. Validated Cre-recombined clones were then subjected to CRISPR-based inactivation of Trp53 using the PX458 vector (Addgene #48138) and the gRNA sequence AGTGAAGCCCTCCGAGTGTC. PX458-sgTrp53 was introduced into organoids by transient transfection using the spinoculation method previously described48, with the modification of using the Effectene transfection reagent (Qiagen). Cells that received PX458-sgTrp53 were sorted by GFP positivity with flow cytometry 36 h post-transfection. The p53-null status of targeted clones was validated by western blot, using an anti-p53 antibody (CM5, Leica Microsystem) and an anti-β-actin−peroxidase antibody (Sigma-Aldrich) as normalization control.
Bulk ATAC-seq analysis
Cell preparation, transposition reaction, ATAC-seq library construction and sequencing:
65,000 mKate2+ cells were isolated by FACS, washed once with 50 µL of cold PBS, and resuspended in 50 µL of cold lysis buffer11. Cells were then centrifuged immediately for 10 min at 500 g, 4°C, and the nuclei pellet was subjected to transposition with Nextera Tn5 transposase (FC-121–1030, Illumina) for 30 min at 37°C, according to the manufacturer’s instructions. DNA was eluted using the MinElute PCR Purification Kit (Qiagen) in 11.5 µl elution buffer. ATAC-seq libraries were prepared using the NEBNext High-Fidelity 2x PCR Master Mix (NEB M0541) as previously described37. Purified libraries were assessed using a Bioanalyzer High-Sensitivity DNA Analysis kit (Agilent). Approximately 200 million paired-end 50 bp reads were sequenced per replicate on a HiSeq 2500 (High Output) at the New York Genome Center.
Mapping, peak calling and dynamic peak calling:
Fastq files were trimmed with trimGalore and cutadapt49, and the filtered, paired-end reads were aligned to mm9 with bowtie250. Peaks were called over input using MACS2 51, and only peaks with a p-value of <=0.001 and outside the ENCODE blacklist region were kept. All peaks from all samples were merged by combining peaks within 500bp of each other. featureCounts52 was used to count the mapped reads for each sample. The resulting peak atlas was normalized using DESeq253. For comparison to DepthNorm, samples were normalized to 10 million mapped reads. Normalized bigwig files were created using the normalization factors from DESeq2 as previously described54 and bedtools genomeCoverageBed55. Dynamic ATAC-peaks were called if they had an absolute log2FC >= 0.58 and an FDR <=0.1.
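The atlas-building and dynamic-peak rules above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `merge_peaks` and `is_dynamic` and the toy coordinates are hypothetical, and it assumes that "within 500bp" refers to the gap between the end of one peak and the start of the next.

```python
def merge_peaks(peaks, max_gap=500):
    """Merge (chrom, start, end) peaks on the same chromosome whose
    edge-to-edge gap is at most max_gap bp (assumed meaning of 'within 500bp')."""
    merged = []
    for chrom, start, end in sorted(peaks):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start - last[2] <= max_gap:
            # Extend the previous interval instead of opening a new one.
            last[2] = max(last[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(p) for p in merged]


def is_dynamic(log2fc, fdr, min_abs_log2fc=0.58, max_fdr=0.1):
    """Thresholds stated in the text: |log2FC| >= 0.58 and FDR <= 0.1."""
    return abs(log2fc) >= min_abs_log2fc and fdr <= max_fdr


# Toy peaks pooled from several samples (hypothetical coordinates).
peaks = [("chr1", 100, 300), ("chr1", 700, 900),
         ("chr1", 2000, 2200), ("chr2", 100, 200)]
atlas = merge_peaks(peaks)
print(atlas)
```

A log2FC cut-off of 0.58 corresponds to roughly a 1.5-fold change, so `is_dynamic` keeps peaks that change by about 1.5-fold in either direction at the stated FDR.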
ATAC-seq heatmap clustering:
The dynamic peaks of regeneration and early neoplasia determined by comparing Normal, Injury, Kras*, and Kras*+Injury conditions (as defined in Fig. 1a and Extended Data Fig. 1a) were z-scored, clustered by k-means (k = 6) and plotted using ComplexHeatmap56.
ATAC-seq peak annotation and pathway enrichment analysis.
Peaks were associated with genes based on UCSC.mm9.knownGene57 using the ChIPseeker package58. The peaks were further analyzed for genic location (annotatePeaks). The distance of each peak to the nearest TSS was determined, and the peak was annotated to that gene. For pathway enrichment analysis of genes associated with regeneration and early neoplasia ATAC-seq clusters, genes uniquely associated with peaks belonging to each cluster were subjected to pathway analyses using enrichr59. Genes associated with > 1 ATAC cluster were excluded such that each gene is uniquely associated with one module.
TF motif enrichment and co-occurrence analyses:
Motif enrichment analysis was performed individually on each of the 6 ATAC-peak clusters using the HOMER de novo motif discovery tool60, applying the findMotifsGenome command with size = given and length = 8 parameters. Motif enrichment scores of the de novo predicted motifs identified from the above analyses were calculated for all 6 ATAC-clusters, or for accessibility-GAIN or -LOSS regions between Injury, Kras*, Kras*+Injury and PDAC vs Normal conditions, by applying the findMotifsGenome command with size = given and length = 8 in each peak-set, and results were visualized as heatmaps (Fig. 1g) or bubble plots (Extended Data Fig. 2f) plotted with the R package ggplot. For each of the 6 ATAC-cluster peaks, TF motif occurrence in a given peak was defined as previously described14. In brief, the Pearson correlation coefficient was used to determine the similarity between TF pairs based on whether a specific motif was present (1) or absent (0), and the results were visualized using the R package ggcorrplot with hierarchical clustering.
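The co-occurrence step can be sketched as a Pearson correlation between binary presence/absence vectors, one per motif, across the peaks of a cluster. The example below is illustrative only: the presence vectors are invented, `TF_X` is a hypothetical third motif, and the authors' analysis was carried out in R rather than Python.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors.
    Returns 0.0 if either vector is constant (correlation undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


# 1 = motif present in the peak, 0 = absent (toy data for six peaks).
motif_presence = {
    "AP-1":  [1, 1, 0, 0, 1, 0],
    "TF_X":  [1, 1, 0, 0, 1, 0],   # hypothetical motif co-occurring with AP-1
    "NR5A2": [0, 0, 1, 1, 0, 1],   # mutually exclusive with AP-1 in this toy set
}

names = list(motif_presence)
cooccurrence = {
    (a, b): pearson(motif_presence[a], motif_presence[b])
    for a in names for b in names
}
for pair, r in cooccurrence.items():
    print(pair, round(r, 2))
```

For binary vectors the Pearson coefficient equals the phi coefficient, so values near +1 indicate motifs found in the same peaks and values near -1 indicate motifs found in disjoint peaks.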
Metagene plots:
Metagene plots were created with the deepTools package using ATAC-seq peak centers and regions extended to +/− 3,000 bp with 10 bp bins. To evaluate differential enrichment from ATAC-seq meta-profiles between experimental conditions, average signals for each experimental condition were calculated (200 bins) for +/−1 Kb around the peak center, and compared. p-value was determined by Kolmogorov–Smirnov test.
Intersection of ATAC-seq and publicly available ChIP-seq data:
To analyze accessibility dynamics at H3K27ac-enriched regions defining PDAC metastasis, we first extracted 849 peaks significantly enriched in PDAC metastasis vs normal pancreas-derived organoids17 (based on the previously used17 enrichment cut-off of a more than 10-fold increase of H3K27ac signal in any of the metastasis-derived organoids compared to the average of normal pancreas). We then matched the mm9 coordinates of H3K27Ac-GAIN ChIP-seq regions with positive ATAC-seq peak calls from the MACS2 output for PDAC samples over input and used the genome_left_join function from the R package fuzzyjoin to identify coordinates that were overlapping between PDAC positive peaks and H3K27Ac ChIP-seq of organoid-cultured vs freshly-isolated PDAC cells, respectively. This identified a positive ATAC-seq match for 53% of metastasis-associated H3K27Ac regions. In vivo accessibility dynamics at these PDAC-opened loci were evaluated by calculating the proportion of metastasis-associated ATAC-seq loci overlapping with ATAC-GAIN regions between Injury, Kras*, Kras*+Injury and PDAC conditions vs Normal, as well as by measuring differential enrichment of ATAC-seq signals centered on the middle of this metastasis-associated peakset across these same conditions using the metagene plot analyses described above. In addition, the Gene Transcription Regulation Database (GTRD, v20.06) was used to extract publicly available ChIP-seq experiment information on TFs and mine for those validated to bind regulatory regions of the Il33 locus displaying ATAC-GAIN between Injury, Kras* or Kras*+Injury vs Normal conditions. Specifically, dynamic ATAC-seq peaks associated with Il33 were converted from mm9 to mm10 coordinates using the UCSC liftover tool61 and were then intersected with GTRD’s ChIP-Seq datasets.
Selected TFs whose binding sites are enriched in these differentially accessible Il33 ATAC-peaks identified in our study and experimentally validated to bind these regions in other contexts are indicated in Extended Data Fig. 9e.
Integration of TF-associated RNA-seq and ATAC-seq data:
To assign motifs enriched at differentially-accessible loci (identified by HOMER de novo analyses) to specific TFs most likely to bind these sequences in each experimental condition, we applied a workflow similar to that previously described62, which integrates the motif enrichment and mRNA fold change data between 2 conditions. In brief, we first identified motifs significantly enriched in the 6 ATAC-seq clusters defined from dynamic peaks of regeneration and early neoplasia as described above, to expose potential regulatory nodes linked to accessibility-GAINS or -LOSSES driven by Injury (Injury), mutant Kras (Kras*) or their combination (Kras*+Injury). For each of these enriched motifs, we extracted HOMER’s best matches to known TF motifs to generate a list of putative TFs. For each of these TFs, we calculated ATAC-seq and RNA-seq absolute scores reflecting the degree of enrichment of their assigned motif (- log10 p-value, defined by HOMER) or the magnitude of their mRNA expression fold change (log2FC) between Injury, Kras*, Kras*+Injury and PDAC conditions vs Normal, respectively. To avoid having results from one of the comparisons dominate the entire analysis, a weight γ was calculated for each TF and comparison by dividing the TF’s absolute score in that specific comparison by the sum of the absolute scores across all comparisons:
The combined ATAC-RNA scores for each TF and comparison was calculated by multiplying these weighted RNA- and ATAC- scores:
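The typeset equations for the weight and the combined score are not reproduced here. A possible form, reconstructed from the prose alone, is given below. It rests on two assumptions: that the sum in the denominator runs over the comparisons $c'$ (Injury, Kras*, Kras*+Injury and PDAC vs Normal), and that the "weighted" score of a TF is the weight γ itself. The symbols $s^{\mathrm{ATAC}}_{t,c}$ and $s^{\mathrm{RNA}}_{t,c}$ are notation introduced for this sketch and denote the absolute ATAC-seq score ($-\log_{10}$ p-value of motif enrichment) and the absolute RNA-seq score ($|\log_2 \mathrm{FC}|$) of TF $t$ in comparison $c$.

```latex
\gamma^{X}_{t,c} \;=\; \frac{s^{X}_{t,c}}{\sum_{c'} s^{X}_{t,c'}},
\qquad X \in \{\mathrm{ATAC},\ \mathrm{RNA}\}

\mathrm{Score}_{t,c} \;=\; \gamma^{\mathrm{ATAC}}_{t,c} \times \gamma^{\mathrm{RNA}}_{t,c}
```

Under this reading each weight lies between 0 and 1 and the weights of a given TF sum to 1 across comparisons, which is what prevents a single comparison with large raw scores from dominating.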
To reflect directionality of the TF expression change in each tissue state compared to Normal, resulting scores were multiplied by −1 or +1 depending on whether the TF was downregulated or upregulated in each comparison, and motif enrichment values for either ATAC-LOSS or -GAIN peaks of the same comparison were used, respectively. The 12 TFs with the highest combined ATAC-RNA scores in the Kras*+Injury and PDAC (vs Normal) comparisons are displayed as heatmaps in Extended Data Fig. 7e.
Circos visualization (Fig. 4b):
Dynamic peaks linked to genes displaying consistent gene expression changes (FC>=2, p-adj < 0.05, see below) across Injury, Kras*, Kras*+Injury and PDAC conditions (vs Normal) were clustered across the indicated conditions using log2FC values with single clustering method. Resulting clusters were plotted using ‘circos’ (v0.69–8) and annotated for AP1-motifs (annotatePeaks with AP-1.GSE21512.motif) or NR5A2- or PTF1A-bound loci in normal pancreas (extracted from GSE3429516 or GSE8626215, respectively). In addition, this same class of dynamic peaks were investigated for motif enrichment using HOMER findMotifsGenome.
Single-cell ATAC-seq analysis
Cell preparation, transposition reaction, scATAC-seq library construction and sequencing:
Approximately 50,000 mKate2+ cells (mKate2+;CD45-;DAPI-) were isolated by FACS and subjected to the scATAC-Seq protocol (10X Genomics, CG000168 RevA)63. Briefly, FACS-sorted cells were lysed in cold lysis buffer (0.1% NP-40, 0.1% Tween 20, 0.01% Digitonin, 10 mM NaCl, 3 mM MgCl2 and 10 mM Tris-HCl [pH 7.4]), washed and processed according to the ‘Nuclei Isolation for Single-Cell ATAC Sequencing’ protocol (CG000169 RevD). The resulting nuclei suspension was subjected to a transposition reaction for 60 min at 37°C and then encapsulated in microfluidic droplets using a 10X Chromium instrument following the manufacturer’s instructions, with a targeted nuclei recovery of ~ 5,000. Barcoded DNA material was cleaned and prepared for sequencing according to the Chromium Single Cell ATAC Reagent Kits User Guide (10x Genomics; CG000168 RevA). Purified libraries were assessed using a Bioanalyzer High-Sensitivity DNA Analysis kit (Agilent) and sequenced on an Illumina HiSeq 2500 (High Output) platform at approx. 150M reads (R1 50bp, R2 50 bp, i7 8bp, i5 16bp) per sample (~ 5000 nuclei) at MSKCC’s Integrated Genomics Operation Core.
Pre-processing of scATAC-seq data:
Fastq files for each sample were pre-processed to a cell-by-peak count matrix through the CellRanger ATAC pipeline63 with several modifications as follows: Reads were aligned to the mm10 reference, barcodes were counted, and peaks were called using CellRanger’s default peak-caller. However, since CellRanger frequently calls unusually large peaks, a custom modification of the CellRanger pipeline allowed all peaks within 10 bps of one another to be merged (as opposed to the 500 bps window in the default pipeline), creating an initial peak atlas with sufficiently narrow peaks. This modification increased the resolution of called peaks, and thus allowed us to distinguish nearby peaks that are variable across sub-populations of cells.
To reduce the large number of peaks in this atlas, we filtered out low-coverage peaks, unless these characterized a subpopulation of cells using the following strategy: First, an unbiased clustering of cells was performed using Phenograph64 on these initial peak features to define major cellular compartments. The peaks in the initial atlas were then retained in the final count matrix only if either (1) their coverage (reads per peak) normalized by peak width was above a certain threshold, hence they are confident peak calls, and/or (2) the peak was determined to be enriched in any cluster using a Fisher’s exact test (adjusted p-value<.01), hence they are differentially accessible across clusters of cells. For this analysis, we automatically determined the coverage threshold for filter (1) based on the distribution of per peak coverage for each sample. Specifically, peaks with coverage less than the sample median coverage (across all peaks) failed this filter (1) and were then passed to filter (2) to test for differential accessibility. All downstream CellRanger steps for cell filtering were applied after the above steps as usual. CellRanger’s aggregation function was then used to combine cells across samples to a unified peak atlas, using the depth normalization function to normalize cells across the two samples. A window of 10bp was again used to merge nearby peaks, as opposed to the default 500 bp. The above procedure resulted in a dataset of 11712 putative cells and 152991 peaks from both samples.
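The two-filter retention rule can be sketched as follows. This is a simplified illustration, not the modified CellRanger code: `retain_peaks` and its inputs are hypothetical, the per-peak adjusted p-values are assumed to have been computed beforehand (minimum across clusters), and a peak exactly at the median is treated as passing filter (1).

```python
from statistics import median


def retain_peaks(coverage_per_bp, min_adj_p, alpha=0.01):
    """Keep a peak if (1) its width-normalized coverage is at least the sample
    median, or (2) it is enriched in some cluster (adjusted p-value < alpha).

    coverage_per_bp: dict peak_id -> reads per peak divided by peak width
    min_adj_p:       dict peak_id -> smallest adjusted Fisher p-value over clusters
    """
    threshold = median(coverage_per_bp.values())
    kept = []
    for peak, cov in coverage_per_bp.items():
        passes_coverage = cov >= threshold                    # filter (1)
        passes_enrichment = min_adj_p.get(peak, 1.0) < alpha  # filter (2)
        if passes_coverage or passes_enrichment:
            kept.append(peak)
    return kept


# Toy example: p3 has low coverage but is cluster-enriched, so it is retained.
coverage = {"p1": 0.9, "p2": 0.5, "p3": 0.1, "p4": 0.05}
adj_p = {"p1": 0.5, "p2": 0.2, "p3": 0.001, "p4": 0.5}
kept = retain_peaks(coverage, adj_p)
print(kept)
```

The second filter is what protects rare subpopulations: a peak open in only a small cluster has low overall coverage but can still be strongly enriched in that cluster.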
scATAC-seq filtering, normalization and visualization:
Low coverage cells were filtered by inspecting the per cell coverage (sum of peak counts per cell) and removing all cells in the lower mode of this distribution, resulting in a final count of 6369 cells. A binary matrix was then produced from the filtered count matrix by setting all nonzero values to 1. This allowed each peak feature to be represented as either open or closed, avoiding biases from wide peaks where counts may be extraordinarily high.
The data was then normalized by regressing out the sum of total counts per cell from this binarized matrix using ordinary linear regression. This normalization step is performed to partially correct for sampling biases across cells. A similar approach has been used in previous scATAC-seq studies65. Specifically, a linear model was fit that explains the counts $y_{ij}$ for each peak $j$ across cells $i$, for peaks $j = 1, \dots, P$, with coverage $c_i$ as the sole regressor:
$$y_{ij} = \beta_{0j} + \beta_{1j} c_i + \varepsilon_{ij}$$
After obtaining estimates for model parameters $\hat{\beta}_{0j}$ and $\hat{\beta}_{1j}$, corrected values for each peak were given as follows:
$$\tilde{y}_{ij} = y_{ij} - \left(\hat{\beta}_{0j} + \hat{\beta}_{1j} c_i\right)$$
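A minimal sketch of binarization followed by coverage regression is shown below. It is an assumption-laden illustration rather than the authors' code: the corrected value is taken to be the ordinary least-squares residual (intercept and slope both removed), coverage is computed here from the raw count matrix, and the function names and toy matrix are hypothetical.

```python
def binarize(counts):
    """Set every nonzero count to 1 (peak open) and leave zeros as 0 (closed)."""
    return [[1 if v > 0 else 0 for v in row] for row in counts]


def regress_out_coverage(binary, coverage):
    """For each peak (column), fit y = b0 + b1 * coverage by ordinary least
    squares across cells (rows) and return the residuals as corrected values."""
    n_cells, n_peaks = len(binary), len(binary[0])
    mean_c = sum(coverage) / n_cells
    var_c = sum((c - mean_c) ** 2 for c in coverage)
    corrected = [[0.0] * n_peaks for _ in range(n_cells)]
    for j in range(n_peaks):
        y = [binary[i][j] for i in range(n_cells)]
        mean_y = sum(y) / n_cells
        # Closed-form simple-regression estimates; slope is 0 if coverage is constant.
        b1 = (sum((c - mean_c) * (v - mean_y) for c, v in zip(coverage, y)) / var_c
              if var_c else 0.0)
        b0 = mean_y - b1 * mean_c
        for i in range(n_cells):
            corrected[i][j] = y[i] - (b0 + b1 * coverage[i])
    return corrected


# Toy cell-by-peak count matrix (4 cells x 4 peaks).
counts = [[3, 0, 1, 0],
          [0, 0, 2, 0],
          [5, 1, 1, 4],
          [0, 2, 0, 0]]
binary = binarize(counts)
coverage = [sum(row) for row in counts]   # per-cell total counts (assumed definition)
corrected = regress_out_coverage(binary, coverage)
```

By construction, the corrected values of each peak have zero mean across cells and are uncorrelated with coverage, which is the sense in which coverage has been "regressed out".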
Extraordinarily wide (>2000bp) or narrow (<2bp) peaks were removed from the count matrix following normalization, so as to filter non-specific peaks containing multiple transcription factor bindings and/or nucleosome-occupied regions within one called element. Principal Components Analysis (PCA) was then performed for dimensionality reduction, and the top 20 components--which were chosen by inspecting the cumulative variance explained across PCs using the knee point method--were then used as features for a Uniform Manifold Approximation and Projection (UMAP)66 visualization of the cells in two dimensions (Figure 5A). Global structure of the visualization was robust to choice of thresholds on peak width, principal components, and number of UMAP neighbors.
scATAC-seq clustering and differential peak analysis:
Phenograph clustering of the top 20 principal components was used with number of nearest neighbors K=25 to determine highly granular subsets of cells. Clusters were then merged into larger, coherent subsets based on accessibility patterns of peaks near known cell type markers. In particular, a Fisher’s exact test was performed on the binarized data for each peak, testing in each case for enrichment of peak accessibility in a Phenograph cluster versus all other cells. Significant peaks were then mapped to nearest target genes based on distance to the transcription start site (using a maximum allowable distance of 10kb), and enriched accessibility patterns were compared across clusters. Clusters were merged based on the degree of overlap among cluster-defining peaks and shared open chromatin at known pancreas cell state markers. To then identify differential peaks across these large compartments, each major subpopulation was compared to the rest by Fisher’s exact test (adjusted p-value < 0.05) to obtain a final set of significant, compartment-specific enrichments (listed in Supplementary Table 8).
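The per-peak enrichment test can be illustrated with a one-sided Fisher's exact test on a 2 × 2 table of open/closed cells inside versus outside a cluster. The sketch below computes the hypergeometric tail directly with the standard library. The function name is hypothetical, the one-sided alternative is an assumption (the text says only that enrichment was tested), and the multiple-testing adjustment, whose method is not specified in the text, is omitted.

```python
from math import comb


def fisher_enrichment_p(open_in, closed_in, open_out, closed_out):
    """One-sided Fisher's exact test p-value for enrichment of open cells
    inside a cluster: P(X >= open_in) under the hypergeometric null."""
    n_cluster = open_in + closed_in          # cells in the cluster
    n_open = open_in + open_out              # cells with the peak open
    n_total = n_cluster + open_out + closed_out
    denom = comb(n_total, n_cluster)
    k_max = min(n_cluster, n_open)
    tail = sum(
        comb(n_open, k) * comb(n_total - n_open, n_cluster - k)
        for k in range(open_in, k_max + 1)
    )
    return tail / denom


# Toy example: peak open in 3 of 4 cluster cells and 1 of 4 remaining cells.
p_value = fisher_enrichment_p(open_in=3, closed_in=1, open_out=1, closed_out=3)
print(p_value)
```

In practice this test would be run for every peak and cluster, and the resulting p-values adjusted before applying the thresholds given in the text.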
Visualization of per-peak accessibility:
To inspect accessibility dynamics across populations, we developed a visualization strategy relying on the binarized matrix to help overcome its sparsity. For each peak, a Gaussian kernel density estimate was fitted on UMAP embedding coordinates for cells in which the peak was open. The density was then estimated at each cell’s location (regardless of accessibility status) in the embedding, producing a continuous-valued metric corresponding to the density of cells harboring the open peak in a particular region of the visualization. These estimates are valid for visualization of cells with open chromatin at a specific locus on the UMAP itself, but do not provide a general estimate of peak-accessibility density within a region of high-dimensional phenotypic space.
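The density-based visualization can be sketched with a simple two-dimensional Gaussian kernel density estimate. The version below is a stand-in written with the standard library: it uses an isotropic kernel with a fixed, user-chosen `bandwidth`, which is an assumption, since the text does not state how the bandwidth was selected or which implementation was used. The coordinates are toy values.

```python
from math import exp, pi


def kde_open_density(embedding, is_open, bandwidth=0.5):
    """Fit a Gaussian KDE on the 2D embedding coordinates of cells where the
    peak is open, then evaluate it at every cell (open or closed)."""
    open_pts = [pt for pt, o in zip(embedding, is_open) if o]
    if not open_pts:
        return [0.0] * len(embedding)
    norm = 1.0 / (2 * pi * bandwidth ** 2 * len(open_pts))
    densities = []
    for x, y in embedding:
        total = sum(
            exp(-((x - ox) ** 2 + (y - oy) ** 2) / (2 * bandwidth ** 2))
            for ox, oy in open_pts
        )
        densities.append(norm * total)
    return densities


# Toy UMAP coordinates: two cells near the origin (peak open), two far away (closed).
embedding = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
is_open = [1, 1, 0, 0]
density = kde_open_density(embedding, is_open)
print([round(d, 4) for d in density])
```

Every cell receives a value, including cells in which the peak is closed, so sparse binary calls are turned into a smooth signal that can be used to colour the embedding.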
Comparison to bulk ATAC-seq dynamics:
To first study reproducibility of bulk accessibility patterns in single cells, signals from corresponding bulk samples were compared to each single cell library where cells were aggregated to produce a ‘pseudo-bulk’ sample. First, bulk ATAC-seq peaks were converted from mm9 to mm10 coordinates using the UCSC liftover tool61. scATAC-seq BAM files (combining reads from all cells) were then converted to normalized bigwig files using the bamCoverage function from the deeptools package67, and accessibility signals were collected per lifted-over peak. Global reproducibility was assessed by correlation between bulk signals per peak (derived from DESeq-corrected signal tracks) and the single cell pseudo-bulk signals. To then assess whether dynamics in accessibility were consistent across bulk and single cell datasets, volcano plots were generated using results from DESeq differential accessibility analysis of bulk data. Fold changes in each peak region based on single cell pseudo-bulk signals were directly compared to inferred dynamics in lifted-over peaks from DESeq results of bulk datasets.
Comparison to bulk ATAC-seq peak modules:
A major goal of the scATAC-seq analysis is to deconvolve the accessibility differences identified at the bulk level, and therefore we sought to investigate the accessibility patterns of bulk peak modules across single cells. To achieve this, coordinates of peaks identified in bulk were converted from mm9 to mm10 coordinates using the UCSC liftover tool. To directly compare these to de novo identified peaks in scATAC-seq, the bedtools intersect tool55 was used to find overlapping regions between bulk peak sets and those included in the single cell count matrix. All peaks with any non-zero overlap were retained for downstream analysis. These subsets of peaks were visualized by computing the proportion of cells in each compartment and each biological condition (independent animal) harboring an open peak, which were then z-scored across clusters and conditions for visualization in heatmaps. To ensure coverage differences between conditions did not impact these values systematically, the analysis was performed on down-sampled data, where each cell’s counts were randomly sampled to a total of 5,000 peak counts per cell. Global shifts in peak accessibility across clusters and between conditions were robust to various thresholds on total peak count, and were globally consistent in the original dataset without downsampling.
Compartment-specific signal tracks:
To further evaluate accessibility differences across major subpopulations and between conditions, signal tracks were generated per sample per cellular compartment for visualization in the Integrative Genomics Viewer68. First, BAM files for each sample were separated into compartment-specific bins using information about the CellRanger-corrected cell barcode for each read, thus creating a new BAM file per compartment per condition. A signal track bigwig file was then produced from each BAM file using two normalization strategies to ensure biases due to cluster size and sample-specific sequencing depths did not impact visualizations. The first strategy utilized the bamCoverage function from the deeptools package for normalization accounting for total read counts of the sample-specific, cluster-specific BAM file. The second strategy corrected for these differences by randomly sub-sampling each BAM file to the total read count of the smallest, leaving each file with approximately 5,529,294 total reads. In the latter case, the smallest two clusters were excluded from the analysis, as their coverage was too poor due to low cell count. To correct for cell count and sequencing depth disparities among the different conditions compared, accessibility signals displayed across conditions were downsampled to the same coverage across all tracks shown.
Transcription factor activity scores:
To evaluate the activity of acinar- (NR5A2) and injury/neoplasia-activated (e.g. AP-1) transcription factors in individual cells, we first identified genomic regions with differential chromatin accessibility and linked to genes displaying consistent changes in gene expression between early neoplasia (Kras*+Injury) and normal pancreas (Normal) through integrative analyses of bulk-ATAC and bulk-RNA-seq data (mm9, see below). These regions were annotated for binding sites for NR5A216 or AP-1 (AP-1.GSE21512.motif), respectively, and were lifted over to mm10 coordinates for comparison with scATAC-seq profiles, as above. In this case, lifted over coordinates were then associated with scATAC-seq peaks only when coordinates were fully overlapping using bedtools ‘intersect minimum overlap’ function, to ensure significant overlap between TF binding sites and scATAC-seq peaks. To then quantify binding activity per cell, the proportion of all open peaks per cell overlapping with a binding site was computed. These values were then visualized on UMAP and in heatmaps (values in heatmaps are logged with a .0001 pseudocount). We quantified the relative accessibility of AP1 and NR5A2 binding sites per cell by computing a ratio of logged binding activity scores, which scales with increasing AP1 activity and decreasing NR5A2 activity.
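The per-cell activity scores can be illustrated with the sketch below. It is one possible reading of the description, not the authors' implementation: the "ratio of logged binding activity scores" is interpreted as the difference of the two logged scores (the log of the AP-1/NR5A2 ratio), which rises with AP-1 activity and falls with NR5A2 activity as the text requires. Peak identifiers, binding-site sets and function names are hypothetical.

```python
from math import log


def tf_activity(open_peaks, site_peaks):
    """Proportion of a cell's open peaks that overlap a TF binding site."""
    if not open_peaks:
        return 0.0
    return len(open_peaks & site_peaks) / len(open_peaks)


def ap1_nr5a2_score(open_peaks, ap1_peaks, nr5a2_peaks, pseudocount=1e-4):
    """Assumed form of the relative score: log(AP-1 + eps) - log(NR5A2 + eps),
    using the 0.0001 pseudocount mentioned for the heatmaps."""
    ap1 = tf_activity(open_peaks, ap1_peaks)
    nr5a2 = tf_activity(open_peaks, nr5a2_peaks)
    return log(ap1 + pseudocount) - log(nr5a2 + pseudocount)


# Hypothetical peak IDs overlapping AP-1 or NR5A2 binding sites.
ap1_peaks = {1, 2, 3}
nr5a2_peaks = {4, 5, 6}

acinar_like = {4, 5, 6, 7}       # mostly NR5A2-site peaks open
neoplastic_like = {1, 2, 3, 4}   # mostly AP-1-site peaks open

acinar_score = ap1_nr5a2_score(acinar_like, ap1_peaks, nr5a2_peaks)
neoplastic_score = ap1_nr5a2_score(neoplastic_like, ap1_peaks, nr5a2_peaks)
print(acinar_score, neoplastic_score)
```

Normalizing by the number of open peaks per cell makes the score a proportion, so cells with deeper coverage do not automatically score higher.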
Identification of peaks correlating with AP-1/NR5A2 ratio across individual cells.
To identify individual peaks whose dynamics are associated with increasing AP-1/NR5A2 TF activity ratios across individual Kras-mutant cells, we computed the Pearson correlation of each normalized peak with the AP-1/NR5A2 ratio across cells. Peaks with a relatively strong correlation magnitude (|r|>.1) were selected for further analysis, and visualized with normalized accessibility trends over cells ranked by increasing AP-1/NR5A2 ratio. Gene annotations per peak were derived by mapping each peak coordinate to its nearest target gene within a 50 kb window. To confirm the association of positively or negatively correlated peaks with the AP-1/NR5A2 TF activity switch (e.g. Il33 or Cpa1-associated peaks), we performed an unpaired two-tailed Student’s t-test comparing AP-1/NR5A2 TF activity score ratios of all cells where the identified correlated peak is open (i.e. at least one scATAC-seq count in that peak) versus those where the peak is closed. GREAT tools69 was used to compare the ontology of genes associated with positively or negatively correlated peaks, using ‘single nearest gene’ within 1000 kb as input parameters, and with comparable results obtained using the ‘two nearest genes’ option. Top pathways from ‘GO Biological Process’ and ‘GO Molecular Function’ categories are displayed in Fig. 4g.
RNA-seq analysis
RNA extraction, RNA-seq library preparation and sequencing:
Total RNA was isolated from primary mKate2+;CD45-;DAPI- pancreatic epithelial cells isolated from normal, regenerating (Injury), early neoplastic (Kras*, Kras*+Injury) and cancer (PDAC) tissues into TRIzol LS and assessed using an Agilent 2100 Bioanalyzer. Library preparation and sequencing were performed at the Integrated Genomics Operation (IGO) at MSKCC. RNA-seq libraries were prepared from total RNA. After RiboGreen quantification and quality control by Agilent BioAnalyzer, 100–500ng of total RNA underwent polyA selection and TruSeq library preparation according to instructions provided by Illumina (TruSeq Stranded mRNA LT Kit, RS-122–2102), with 8 cycles of PCR. Samples were barcoded and run on a HiSeq 4000 or HiSeq 2500 in a 50bp/50bp paired-end run, using the HiSeq 3000/4000 SBS Kit or TruSeq SBS Kit v4 (Illumina). An average of 41 million paired-end reads was generated per sample. Ribosomal reads represented at most 0.01% of the total reads generated, and the percentage of mRNA bases averaged 53%.
RNA-seq read mapping, differential expression analysis and heatmap visualization:
RNA-seq data were analyzed by first removing adaptor sequences using Trimmomatic70. Reads were then aligned to GRCm38.91 (mm10) with STAR71, and transcript counts were quantified using featureCounts52 to generate a raw count matrix. Differential gene expression analysis between experimental conditions was performed using the DESeq2 package (Love et al., 2014), implemented in R (http://cran.r-project.org/), using 3–5 independent biological replicates (individual mice) per condition. Principal component analysis (PCA) was performed using the DESeq2 package in R. Differentially expressed genes (DEGs) were defined by a > 2-fold change in gene expression with an adjusted P-value < 0.05. For heatmap visualization of DEGs, samples were z-score normalized and plotted using the ‘pheatmap’ package in R.
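As an illustration of the DEG cut-off and the z-score normalization used for heatmaps, the following Python sketch applies both to toy values. It is a hedged example rather than the R pipeline itself: the function names are hypothetical, the fold-change threshold is assumed to apply in either direction, and the z-score is assumed to use the sample standard deviation (n − 1).

```python
import math

def is_deg(log2_fold_change, padj, min_fold=2.0, alpha=0.05):
    """Return True if a gene passes the > 2-fold, adjusted P < 0.05 cut-off.

    A missing adjusted P-value (None) is treated as not significant;
    this handling is an assumption of the sketch.
    """
    if padj is None:
        return False
    return abs(log2_fold_change) > math.log2(min_fold) and padj < alpha

def zscore(values):
    """Z-score one gene's expression values across samples (sample s.d.)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        return [0.0] * n  # constant gene: nothing to scale
    return [(v - mean) / sd for v in values]

# Toy results table: gene -> (log2 fold change, adjusted P-value).
results = {
    "GeneA": (1.5, 0.01),
    "GeneB": (-1.2, 0.001),
    "GeneC": (0.8, 0.001),
    "GeneD": (3.0, 0.2),
}
degs = sorted(g for g, (lfc, padj) in results.items() if is_deg(lfc, padj))
scaled = zscore([1.0, 2.0, 3.0])
```

Here a 2-fold change corresponds to an absolute log2 fold change above 1, so only GeneA and GeneB pass both criteria.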
Intersection of gene expression (RNA-seq) and chromatin accessibility (bulk ATAC-seq) data:
To define ‘chromatin-dynamic’ vs ‘chromatin-stable’ DEGs, upregulated and downregulated DEGs for the indicated comparisons (Injury, Kras*, Kras*+Injury or PDAC vs Normal; FC>2, padj<0.05) were classified into the following chromatin categories based on the accessibility change at associated peaks in the same tissue state vs Normal: DEG with accessibility-GAIN (at least one associated peak showing significant accessibility-GAIN), DEG with accessibility-LOSS (at least one associated peak showing significant accessibility-LOSS) or ‘chromatin-stable’ DEG (none of the associated peaks showing significant changes in chromatin accessibility). In cases where different peaks associated with the same gene showed opposing dynamic accessibility patterns (which was noted for < 4% of DEGs on average), the gene was only classified into the accessibility-GAIN or -LOSS category if GAIN or LOSS peaks, respectively, were at least 3 times more represented among the dynamic peaks. DEGs with no associated ATAC peaks detected in that same tissue state were classified as ND. The chromatin accessibility status of the DEGs in regenerating, early-stage neoplasia and malignant PDAC tissue states in GEMMs is summarized in Supplementary Table 5. Expression dynamics of gene sets associated with bulk ATAC-seq clusters across Normal, Injury, Kras*, Kras*+Injury and PDAC samples were visualized as a heatmap of normalized median expression values plotted with seaborn in Python. To define these gene sets, peaks from each of the 6 ATAC-seq clusters were associated with genes based on UCSC.mm9.knownGene57 using the ChIPseeker package58, as above, and ATAC-cluster-defining genes were defined as those uniquely associated with one cluster. For each RNA-seq sample, the median expression across genes associated with each ATAC cluster was computed and z-scored for visualization.
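The classification rules in this paragraph can be written out as a short decision function. The sketch below is one possible reading, not the authors' implementation: the function name and category strings are hypothetical, and the label returned for genes with opposing peaks that do not reach the 3-fold rule is an assumption, since the text does not state how such genes were handled.

```python
def classify_deg(peak_calls, min_ratio=3):
    """Assign a chromatin category to one DEG from its associated peaks.

    peak_calls: list with one entry per associated ATAC peak, each being
                "GAIN", "LOSS" or "STABLE" relative to Normal.
    """
    if not peak_calls:
        return "ND"  # no associated ATAC peak detected in this tissue state
    gain = peak_calls.count("GAIN")
    loss = peak_calls.count("LOSS")
    if gain == 0 and loss == 0:
        return "chromatin-stable"
    if loss == 0:
        return "accessibility-GAIN"
    if gain == 0:
        return "accessibility-LOSS"
    # Opposing patterns: require one direction to be at least 3x more frequent.
    if gain >= min_ratio * loss:
        return "accessibility-GAIN"
    if loss >= min_ratio * gain:
        return "accessibility-LOSS"
    return "mixed"  # assumption: left unassigned when neither side dominates

examples = {
    "no_peaks": classify_deg([]),
    "stable": classify_deg(["STABLE", "STABLE"]),
    "gain": classify_deg(["GAIN", "STABLE"]),
    "gain_3_to_1": classify_deg(["GAIN", "GAIN", "GAIN", "LOSS"]),
    "gain_2_to_1": classify_deg(["GAIN", "GAIN", "LOSS"]),
}
```

Counting peaks is only one way to interpret "3 times more represented"; a weighting by peak signal would need a different input.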
Definition of neoplasia-specific epigenetic programs:
ATAC-seq and RNA-seq datasets were overlapped to identify genes exhibiting significant chromatin accessibility and expression changes in cells undergoing injury-accelerated neoplasia (Kras*+Injury) and advanced PDAC (PDAC) but not in regenerative metaplasia (Injury alone) when compared to normal healthy pancreas (Normal).
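The overlap described above amounts to a set operation, sketched here under simplifying assumptions: each tissue state is represented by the set of genes showing both significant accessibility and expression changes versus Normal, the direction of change is ignored, and the gene names are placeholders.

```python
def neoplasia_specific(changed_by_state):
    """Genes changed in Kras*+Injury and PDAC but not in Injury alone.

    changed_by_state: dict mapping a tissue state to the set of genes with
    significant chromatin accessibility AND expression changes vs Normal.
    """
    shared = changed_by_state["Kras*+Injury"] & changed_by_state["PDAC"]
    return shared - changed_by_state["Injury"]

# Toy input with placeholder gene names.
changed = {
    "Injury": {"GeneA", "GeneB"},
    "Kras*+Injury": {"GeneA", "GeneC", "GeneD"},
    "PDAC": {"GeneC", "GeneD", "GeneE"},
}
program = neoplasia_specific(changed)
```

GeneA is excluded because it also changes during regeneration, and GeneE because it changes only in PDAC.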
Functional annotations of gene sets:
Pathway enrichment analysis was performed on the indicated gene sets with the Reactome and KEGG databases using enrichR59. Significance of the tests was assessed using the combined score defined by enrichR, described as c = log(p) * z, where c is the combined score, p is the Fisher exact test p-value, and z is the z-score for deviation from expected rank.
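For a worked example of the combined score, the sketch below evaluates c = log(p) * z for one hypothetical term. The use of the natural logarithm is an assumption of this example, and the input values are invented for illustration.

```python
import math

def combined_score(p_value, z_score):
    """Combined score c = log(p) * z (natural log assumed)."""
    return math.log(p_value) * z_score

# Hypothetical term: Fisher exact test p = 0.01, rank-deviation z = -2.0.
c = combined_score(0.01, -2.0)
```

Because log(p) is negative for p < 1, a negative z-score gives a positive combined score, here about 9.21.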
Gene set enrichment analysis (GSEA):
GSEA72 was performed using the GSEA-Preranked tool for conducting gene set enrichment analysis of data derived from RNA-seq experiments (version 2.07) against signatures in the MSigDB database (http://software.broadinstitute.org/gsea/msigdb), signatures derived herein, and published expression signatures derived from human23,73 or organoid samples17.
Overlap with human gene expression datasets:
Two independent public datasets of microarray data from human PDAC and normal pancreas samples (GSE71729, ref. 23, and GSE62452, ref. 73) were used. Differential expression analysis was then applied using the limma package74 to define differentially expressed genes (DEGs) between PDAC and normal samples, using > 2-fold change and adjusted p-value < 0.05 as cut-offs.
Statistics and Reproducibility
Statistical analyses were performed with GraphPad Prism (v7/8), R (v3.5.1 and v1.26.0) and Python (version 3.6.4). Pooled data are presented as mean values ± s.e.m. Sample size, error bars and statistical methods are reported in the figure legends. P-values are shown in figures or associated legends. Statistical significance of differences between two experimental groups was assessed by unpaired two-tailed Student’s t-test. In RNA-seq data, significance for differential gene expression between groups was based on adjusted p-value < 0.05. For pathway enrichment analysis of RNA-seq gene clusters, the significance of gene lists was assessed by adjusted p-value and z-score59. Significance of gene sets from GSEA was based on the normalized enrichment score (NES) and the false discovery rate q-value (FDR q-val). In bulk ATAC-seq data, dynamic peaks were called if they had an absolute log2FC >= 0.58 and an FDR <= 0.1. Motif enrichment was evaluated by the p-values defined by HOMER60. The Kolmogorov–Smirnov test was used to obtain p-values to assess differential enrichment from bulk ATAC-seq meta-profiles. Correlation between scATAC-seq data and bulk ATAC-seq data was evaluated using Pearson correlation analysis. The correlation of individual peaks with increasing TF activity ratios across individual cells was evaluated using Pearson correlation analyses, and the significance of biological functions linked to identified positively or negatively correlated peak sets was evaluated by GREAT69 FDR q-value.
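The bulk ATAC-seq criterion for dynamic peaks can be expressed as a small helper, shown here as a sketch with hypothetical names. An absolute log2FC of 0.58 corresponds to roughly a 1.5-fold change; assigning GAIN or LOSS from the sign of the fold change is an assumption of this example.

```python
def call_peak(log2_fc, fdr, min_abs_log2fc=0.58, max_fdr=0.1):
    """Label a peak as GAIN, LOSS or STABLE relative to the reference state."""
    if fdr is None or fdr > max_fdr or abs(log2_fc) < min_abs_log2fc:
        return "STABLE"  # fails the |log2FC| >= 0.58 and FDR <= 0.1 criteria
    return "GAIN" if log2_fc > 0 else "LOSS"

calls = [
    call_peak(0.58, 0.1),   # exactly at both thresholds
    call_peak(-1.0, 0.05),
    call_peak(0.5, 0.01),   # fold change too small
    call_peak(2.0, 0.2),    # FDR too high
]
```

Both thresholds are inclusive, matching the ">=" and "<=" in the text.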
No statistical methods were used to pre-determine sample size in the mouse studies, and mice with matched sex and age were randomized into different treatment groups (e.g. PBS control, caerulein). The investigators were not blinded to allocation during experiments and outcome assessment, except for the histological assessment of pancreatic lesions, which were classified and graded by a veterinary pathologist blinded to genotype and treatment conditions. All experiments were reliably reproduced. Specifically, all in vivo experiments, except for omics data (i.e. RNA-seq, ATAC-seq and scATAC-seq), were performed independently at least two times, with the total number of biological replicates (independent animals) indicated in the corresponding figure legends. Caerulein and rIL-33 treatments (and their respective phenotypic/molecular readouts) yielded similar results irrespective of the experimental cohort (e.g. mouse litter). The in vitro TF overexpression experiment was repeated twice (with datapoints representing two independent wells each) with similar results. ATAC/RNA-seq data from freshly isolated cells harvested at different dates were also reliably reproduced, with biological replicates (independent mice) from the same experimental groups clustering together in PCA and hierarchical clustering methods irrespective of experimental cohort and sample processing dates. Mouse illustrations were created with ©BioRender - biorender.com. Figures were prepared using Illustrator CC 2019/2020 (Adobe).
Extended Data
Supplementary Material
Acknowledgements
We thank Mayerlin Chalarca, Sarah Ackermann, Janelle Simon, Alex Wuest and the MSKCC animal facility for technical support with animal colonies; Sang Yang, So Young, Zhen Zhao and Ambereen Kahn for assistance with the generation of ESC-derived GEMM; Julián Valdés, Surajit Dhara, Timour Baslan, Sha Tian and Alexa Osterhoudt for support with profiling experiments; Ignas Masilionis and Ojavsi Chaduhary for support with scATAC-seq experiments; Vincent Lavallée for his input and discussion on scATAC-seq data analysis; Rui Garner and the MSKCC Core Facility Staff for assistance with cell sorting; and José Reyes and other members of the Lowe laboratory for helpful advice and discussions. We also acknowledge the use of the Integrated Genomics Operation Core, funded by the NCI Cancer Center Support Grant (CCSG, P30 CA08748), Cycle for Survival, and the Marie-Josée and Henry R. Kravis Center for Molecular Oncology. D.A-C. was supported by the Spanish Fundación Ramón Areces Postdoctoral Fellowship; C.B. is supported by an NIH F31 grant (F31CA246901); J.P.M. IV was supported by an American Cancer Society Postdoctoral Fellowship (126337-PF-14–066-01-TBE); R.C. is supported by the Pancreatic Cancer Action Network-AACR Pathway to Leadership Award; H-A.C is supported by an NIH F99 Grant (F99CA245797); K.M.T is supported by the Jane Coffin Childs Memorial Fund for Medical Research; F.M.B is supported by MSKCC’s Translational Research Oncology Training Fellowship (5T32CA160001–08); N.T. is supported by an NIH K99 grant (K99CA237736); G.L. was supported by an NIH F32 grant (1F32CA177072–01) and American Cancer Society Fellowship (PF-13–037-01-DMC); E.A. is supported by an NIH K99 Grant (K99CA23019). D.P. 
This work was additionally supported by MSKCC’s David Rubenstein Center for Pancreatic Research Pilot Project (to S.W.L) and The Alan and Sandra Gerry Metastasis and Tumor Ecosystems Center (to S.W.L./D.P.); the Lustgarten Foundation Research Investigator Award (to S.W.L.); the Agilent Thought Leader Program (to S.W.L.); and the NIH’s Grants P01CA13106 (to S.W.L.), R01CA204228 and P30CA023108 (to S.D.L.), and U54CA209975 (to D.P.). S.W.L. is an investigator in the Howard Hughes Medical Institute and the Geoffrey Beene Chair for Cancer Biology.
Footnotes
Code availability
Custom code for processing, filtering, and visualization of single-cell ATAC-seq data was written in Python and is demonstrated in a Jupyter notebook available for download at https://github.com/dpeerlab/pdac-tumorigenesis-scATAC/.
Competing interest
A patent application (PTC/US2019/041670, international filing date 12 July 2019) has been submitted based in part on results presented in this manuscript covering methods for preventing or treating KRAS mutant pancreas cancer with inhibitors of Type 2 cytokine signaling. D.A.C and S.W.L are listed as the inventors. S.W.L. is a founder and scientific advisory board member of Blueprint Medicines, Mirimus Inc., ORIC Pharmaceuticals, and Faeth Therapeutics, and is on the scientific advisory board of Constellation Pharmaceuticals and PMV Pharmaceuticals. S.D.L. is on the scientific advisory board of Nybo Therapeutics and Episteme Prognostics.
Additional information
Reprints and permissions information is available at www.nature.com/reprints.
Correspondence and requests for materials should be addressed to S.W.L (lowes@mskcc.org).
Data Availability
All ATAC-seq, RNA-seq, and scATAC-seq data generated in this study have been deposited in the Gene Expression Omnibus (GEO) database under the super-series GSE132330. Publicly available RNA-seq, gene expression microarray and ChIP-seq data reanalyzed for this study are available under the accession codes GSE86262 (PTF1A ChIP-seq), GSE34295 (NR5A2 ChIP-seq), GSE99311 (H3K27Ac ChIP-seq), GSE62452 (human specimen gene expression microarray), and GSE71729 (human specimen gene expression microarray) and in the Gene Transcription Regulation Database (GTRD, https://gtrd.biouml.org). All other data supporting the findings of this study are available from the corresponding authors upon reasonable request.
Main References
- 1. Giroux V & Rustgi AK Metaplasia: tissue injury adaptation and a precursor to the dysplasia-cancer sequence. Nat Rev Cancer 17, 594–604, (2017).
- 2. Guerra C et al. Chronic pancreatitis is essential for induction of pancreatic ductal adenocarcinoma by K-Ras oncogenes in adult mice. Cancer Cell 11, 291–302, (2007).
- 3. Habbe N et al. Spontaneous induction of murine pancreatic intraepithelial neoplasia (mPanIN) by acinar cell targeting of oncogenic Kras in adult mice. Proc Natl Acad Sci U S A 105, 18913–18918, (2008).
- 4. Collins MA et al. Oncogenic Kras is required for both the initiation and maintenance of pancreatic cancer in mice. J Clin Invest 122, 639–653, (2012).
- 5. Hingorani SR et al. Preinvasive and invasive ductal pancreatic cancer and its early detection in the mouse. Cancer Cell 4, 437–450, (2003).
- 6. Carriere C, Young AL, Gunn JR, Longnecker DS & Korc M Acute pancreatitis markedly accelerates pancreatic cancer progression in mice expressing oncogenic Kras. Biochem Biophys Res Commun 382, 561–565, (2009).
- 7. Strobel O et al. In vivo lineage tracing defines the role of acinar-to-ductal transdifferentiation in inflammatory ductal metaplasia. Gastroenterology 133, 1999–2009, (2007).
- 8. Morris JP, Cano DA, Selkine S, Wang SC & Hebrok M beta-catenin blocks Kras-dependent reprogramming of acini into pancreatic cancer precursor lesions in mice. Journal of Clinical Investigation 120, 508–520, (2010).
- 9. Kopp JL et al. Identification of Sox9-dependent acinar-to-ductal reprogramming as the principal mechanism for initiation of pancreatic ductal adenocarcinoma. Cancer Cell 22, 737–750, (2012).
- 10. Storz P Acinar cell plasticity and development of pancreatic ductal adenocarcinoma. Nat Rev Gastroenterol Hepatol 14, 296–304, (2017).
- 11. Buenrostro JD, Wu B, Chang HY & Greenleaf WJ ATAC-seq: A Method for Assaying Chromatin Accessibility Genome-Wide. Curr Protoc Mol Biol 109, 21.29.1–21.29.9, (2015).
- 12. Stanger BZ & Hebrok M Control of cell identity in pancreas development and regeneration. Gastroenterology 144, 1170–1179, (2013).
- 13. Vallejo A et al. An integrative approach unveils FOSL1 as an oncogene vulnerability in KRAS-driven lung and pancreatic cancer. Nat Commun 8, 14294, (2017).
- 14. Arda HE et al. A Chromatin Basis for Cell Lineage and Disease Risk in the Human Pancreas. Cell Syst 7, 310–322e314, (2018).
- 15. Hoang CQ et al. Transcriptional Maintenance of Pancreatic Acinar Identity, Differentiation, and Homeostasis by PTF1A. Mol Cell Biol 36, 3033–3047, (2016).
- 16. Holmstrom SR et al. LRH-1 and PTF1-L coregulate an exocrine pancreas-specific transcriptional network for digestive function. Genes Dev 25, 1674–1679, (2011).
- 17. Roe JS et al. Enhancer Reprogramming Promotes Pancreatic Cancer Metastasis. Cell 170, 875–888e820, (2017).
- 18. Loven J et al. Selective inhibition of tumor oncogenes by disruption of super-enhancers. Cell 153, 320–334, (2013).
- 19. Shi J & Vakoc CR The mechanisms behind the therapeutic activity of BET bromodomain inhibition. Mol Cell 54, 728–736, (2014).
- 20. Sherman MH Stellate Cells in Tissue Repair, Inflammation, and Cancer. Annu Rev Cell Dev Biol 34, 333–355, (2018).
- 21. Cebola I et al. TEAD and YAP regulate the enhancer network of human embryonic pancreatic progenitors. Nat Cell Biol 17, 615–626, (2015).
- 22. Krah NM et al. The acinar differentiation determinant PTF1A inhibits initiation of pancreatic ductal adenocarcinoma. Elife 4, (2015).
- 23. Moffitt RA et al. Virtual microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nat Genet 47, 1168–1178, (2015).
- 24. Cobo I et al. Transcriptional regulation by NR5A2 links differentiation and inflammation in the pancreas. Nature 554, 533–537, (2018).
- 25. Wollny D et al. Single-Cell Analysis Uncovers Clonal Acinar Cell Heterogeneity in the Adult Pancreas. Dev Cell 39, 289–301, (2016).
- 26. Liew FY, Girard JP & Turnquist HR Interleukin-33 in health and disease. Nat Rev Immunol 16, 676–689, (2016).
- 27. Yevshin I, Sharipov R, Kolmykov S, Kondrakhin Y & Kolpakov F GTRD: a database on gene transcription regulation-2019 update. Nucleic Acids Res 47, D100–D105, (2019).
- 28. Hnisz D et al. Super-enhancers in the control of cell identity and disease. Cell 155, 934–947, (2013).
- 29. Whyte WA et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319, (2013).
- 30. McDonald OG et al. Epigenomic reprogramming during pancreatic cancer progression links anabolic glucose metabolism to distant metastasis. Nat Genet 49, 367–376, (2017).
- 31. Li A et al. IL-33 Signaling Alters Regulatory T Cell Diversity in Support of Tumor Development. Cell Rep 29, 2998–3008e2998, (2019).
- 32. Moral JA et al. ILC2s amplify PD-1 blockade by activating tissue-specific cancer immunity. Nature 579, 130–135, (2020).
Methods References
- 33. Saborowski M et al. A modular and flexible ESC-based mouse model of pancreatic cancer. Genes Dev 28, 85–97, (2014).
- 34. Tasdemir N et al. BRD4 Connects Enhancer Remodeling to Senescence Immune Surveillance. Cancer Discov 6, 612–629, (2016).
- 35. Zuber J et al. RNAi screen identifies Brd4 as a therapeutic target in acute myeloid leukaemia. Nature 478, 524–528, (2011).
- 36. Dow LE et al. A pipeline for the generation of shRNA transgenic mice. Nat Protoc 7, 374–393, (2012).
- 37. Livshits G et al. Arid1a restrains Kras-dependent changes in acinar cell identity. Elife 7, (2018).
- 38. Gertsenstein M et al. Efficient generation of germ line transmitting chimeras from C57BL/6N ES cells by aggregation with outbred host embryos. PLoS One 5, e11260, (2010).
- 39. Kawaguchi Y et al. The role of the transcriptional regulator Ptf1a in converting intestinal to pancreatic progenitors. Nat Genet 32, 128–134, (2002).
- 40. Jackson EL et al. Analysis of lung tumor initiation and progression using conditional expression of oncogenic K-ras. Genes Dev 15, 3243–3248, (2001).
- 41. Beard C, Hochedlinger K, Plath K, Wutz A & Jaenisch R Efficient method to generate single-copy transgenic mice by site-specific integration in embryonic stem cells. Genesis 44, 23–28, (2006).
- 42. Dow LE et al. Conditional reverse tet-transactivator mouse strains for the efficient induction of TRE-regulated transgenes in mice. PLoS One 9, e95236, (2014).
- 43. Premsrirut PK et al. A rapid and scalable system for studying gene function in mice using conditional RNA interference. Cell 145, 145–158, (2011).
- 44. Heiser PW et al. Stabilization of beta-catenin induces pancreas tumor formation. Gastroenterology 135, 1288–1300, (2008).
- 45. Rhim AD et al. EMT and dissemination precede pancreatic tumor formation. Cell 148, 349–361, (2012).
- 46. Gopinathan A, Morton JP, Jodrell DI & Sansom OJ GEMMs as preclinical models for testing pancreatic cancer therapies. Dis Model Mech 8, 1185–1200, (2015).
- 47. Morris JP 4th et al. alpha-Ketoglutarate links p53 to cell fate during tumour suppression. Nature 573, 595–599, (2019).
- 48. O’Rourke KP et al. Transplantation of engineered organoids enables rapid generation of metastatic mouse models of colorectal cancer. Nat Biotechnol 35, 577–582, (2017).
- 49. Martin M Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12, (2011).
- 50. Langmead B & Salzberg SL Fast gapped-read alignment with Bowtie 2. Nat Methods 9, 357–359, (2012).
- 51. Zhang Y et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol 9, R137, (2008).
- 52. Liao Y, Smyth GK & Shi W featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930, (2014).
- 53. Love MI, Huber W & Anders S Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550, (2014).
- 54. Pronier E et al. Targeting the CALR interactome in myeloproliferative neoplasms. JCI Insight 3, (2018).
- 55. Quinlan AR & Hall IM BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842, (2010).
- 56. Gu Z, Eils R & Schlesner M Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849, (2016).
- 57. Carlson M & Maintainer BP TxDb.Dmelanogaster.UCSC.dm3.ensGene: Annotation package for TxDb object(s) (2015).
- 58. Yu G, Wang LG & He QY ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics 31, 2382–2383, (2015).
- 59. Chen EY et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics 14, 128, (2013).
- 60. Heinz S et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell 38, 576–589, (2010).
- 61. Kent WJ et al. The human genome browser at UCSC. Genome Res 12, 996–1006, (2002).
- 62. Zhang Z et al. Loss of CHD1 Promotes Heterogeneous Mechanisms of Resistance to AR-Targeted Therapy via Chromatin Dysregulation. Cancer Cell 37, 584–598e511, (2020).
- 63. Satpathy AT et al. Massively parallel single-cell chromatin landscapes of human immune cell development and intratumoral T cell exhaustion. Nat Biotechnol 37, 925–936, (2019).
- 64. Levine JH et al. Data-Driven Phenotypic Dissection of AML Reveals Progenitor-like Cells that Correlate with Prognosis. Cell 162, 184–197, (2015).
- 65. Danese A, Richter ML, Fischer DS, Theis FJ & Colomé-Tatché M EpiScanpy: integrated single-cell epigenomic analysis (2019).
- 66. McInnes L, Healy J, Saul N & Grossberger L UMAP: Uniform Manifold Approximation and Projection (2018).
- 67. Ramirez F et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44, W160–165, (2016).
- 68. Robinson JT et al. Integrative genomics viewer. Nat Biotechnol 29, 24–26, (2011).
- 69. McLean CY et al. GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol 28, 495–501, (2010).
- 70. Bolger AM, Lohse M & Usadel B Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120, (2014).
- 71. Dobin A et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21, (2013).
- 72. Subramanian A et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545–15550, (2005).
- 73. Yang S et al. A Novel MIF Signaling Pathway Drives the Malignant Character of Pancreatic Cancer by Targeting NR3C2. Cancer Res 76, 3838–3850, (2016).
- 74. Ritchie ME et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43, e47, (2015).
- 75. Whittle MC et al. RUNX3 Controls a Metastatic Switch in Pancreatic Ductal Adenocarcinoma. Cell 161, 1345–1360, (2015).
- 76. Wei D et al. KLF4 Is Essential for Induction of Cellular Identity Change and Acinar-to-Ductal Reprogramming during Early Pancreatic Carcinogenesis. Cancer Cell 29, 324–338, (2016).
- 77. Diaferia GR et al. Dissection of transcriptional and cis-regulatory control of differentiation in human pancreatic cancer. EMBO J 35, 595–617, (2016).
- 78. Delgiorno KE et al. Identification and manipulation of biliary metaplasia in pancreatic tumors. Gastroenterology 146, 233–244e235, (2014).
- 79. Kalisz M et al. HNF1A recruits KDM6A to activate differentiated acinar cell programs that suppress pancreatic cancer. EMBO J 39, e102808, (2020).
- 80. Truty MJ, Lomberk G, Fernandez-Zapico ME & Urrutia R Silencing of the transforming growth factor-beta (TGFbeta) receptor II by Kruppel-like factor 14 underscores the importance of a negative feedback mechanism in TGFbeta signaling. J Biol Chem 284, 6291–6300, (2009).
- 81. von Figura G, Morris JP 4th, Wright CV & Hebrok M Nr5a2 maintains acinar cell differentiation and constrains oncogenic Kras-mediated pancreatic neoplastic initiation. Gut 63, 656–664, (2014).
- 82. Jiang M et al. MIST1 and PTF1 Collaborate in Feed-forward Regulatory Loops that Maintain the Pancreatic Acinar Phenotype in Adult Mice. Mol Cell Biol, (2016).
- 83. Dawson MA et al. Inhibition of BET recruitment to chromatin as an effective treatment for MLL-fusion leukaemia. Nature 478, 529–533, (2011).
- 84. Delmore JE et al. BET bromodomain inhibition as a therapeutic strategy to target c-Myc. Cell 146, 904–917, (2011).
- 85. Mazur PK et al. Combined inhibition of BET family proteins and histone deacetylases as a potential epigenetics-based therapy for pancreatic ductal adenocarcinoma. Nat Med 21, 1163–1171, (2015).
- 86. Kim J, Lee JH & Iyer VR Global identification of Myc target genes reveals its direct role in mitochondrial biogenesis and its E-box usage in vivo. PLoS One 3, e1798, (2008).
- 87. Bian B et al. Gene expression profiling of patient-derived pancreatic cancer xenografts predicts sensitivity to the BET bromodomain inhibitor JQ1: implications for individualized medicine efforts. EMBO Mol Med 9, 482–497, (2017).
- 88. Dumartin L et al. ER stress protein AGR2 precedes and is involved in the regulation of pancreatic cancer initiation. Oncogene 36, 3094–3103, (2017).