Science Advances. 2025 Feb 7;11(6):eads4985. doi: 10.1126/sciadv.ads4985

Pioneer factor GATA6 promotes colorectal cancer through 3D genome regulation

Huijue Lyu 1, Xintong Chen 1, Yang Cheng 1, Te Zhang 1, Ping Wang 1, Josiah Hiu-yuen Wong 1, Juan Wang 1, Lena Stasiak 1, Leyu Sun 2, Guangyu Yang 2, Lu Wang 1, Feng Yue 1,3,*
PMCID: PMC11804904  PMID: 39919174

Abstract

Colorectal cancer (CRC) is one of the most lethal and prevalent malignancies. While the overexpression of pioneer factor GATA6 in CRC has been linked with metastasis, its role in genome-wide gene expression dysregulation remains unclear. Through studies of primary human CRC tissues and analysis of the TCGA data, we found that GATA6 preferentially binds at CRC-specific active enhancers, with enrichment at enhancer-promoter loop anchors. GATA6 protein also physically interacts with CTCF, suggesting its critical role in 3D genome organization. The ablation of GATA6 through AID and CRISPR systems severely impaired cancer cell clonogenicity and proliferation. Mechanistically, GATA6 knockout induced global loss of CRC-specific open chromatins and extensive alterations of critical enhancer-promoter interactions for CRC oncogenes. Last, we showed that GATA6 knockout greatly reduced tumor growth and improved survival in mice. Together, we revealed a previously unidentified mechanism by which GATA6 contributes to the pathogenesis of colorectal cancer.


GATA6 regulates target oncogene expression and tumor progression by mediating oncogene enhancer-promoter interactions in CRC.

INTRODUCTION

Colorectal cancer (CRC) is one of the most common malignancies worldwide with highly heterogeneous prognosis and drug responses (1). While the canonical mechanism of CRC onset involves the accumulation of genetic mutations due to genome instability and a deficient mismatch repair system, more recent efforts have expanded the understanding of CRC toward the role of transcription factor (TF) dysregulation (2, 3). The aberrant activity of TFs such as c-Jun (4), FOXO3a (5), and RUNX3 (6) can lead to abnormal gene expression patterns, disrupting the finely tuned balance of cellular processes and leading to uncontrolled proliferation and apoptosis evasion. While TFs were previously considered undruggable, emerging efforts have emphasized the substantial potential of direct TF modulators in cancer therapies, further highlighting the value of identifying novel regulators and uncovering the intricate molecular mechanisms (7–10).

TFs canonically function through the direct binding to specific DNA sequences at cis-regulatory elements such as promoters and enhancers. Recently, it has also been shown that some TFs such as Krüppel-like factor 4 (KLF4) and MyoD may also function through orchestrating three-dimensional (3D) chromatin architecture to shape cell identity (11–16). Growing evidence, including ours, has suggested TF-associated 3D genome reprogramming in various cancers as well (17–20). In CRC, while the reorganization of these higher-order structures has been reported to play a role in the pathogenesis (21, 22), the interplay between TF activities and 3D genome organization remains unclear.

GATA6, a pioneer factor during endoderm differentiation, has been suggested as a key regulator for CRC tumorigenesis and progression and is associated with worse prognoses (23–28). For example, Tsuji et al. (29) suggested a potential link between GATA6 and the expression of CRC stem cell marker LGR5. In another work, Whissell et al. (23) showed that GATA6 binds to a distal enhancer region of BMP4, and this enhancer is critical to the self-renewal of CRC stem cells. Moreover, GATA6 has been shown to be a critical open chromatin regulator in definitive endoderm differentiation (24, 28, 30). However, the understanding of GATA6 molecular function in CRC has been primarily limited to individual loci, and its role in global gene regulation, chromatin remodeling, and higher-order chromatin architecture remains to be explored.

In this study, we first generated extensive genomic datasets in paired formalin-fixed, paraffin-embedded (FFPE) CRC patient samples and in two CRC cell lines, DLD-1 and CACO-2. Combining with The Cancer Genome Atlas (TCGA) colorectal patient data, we found the preferential binding enrichment of GATA6 at CRC tumor-specific enhancers and loop anchors. Next, we established both the inducible degradation system and CRISPR knockout (KO) cell lines of GATA6 and demonstrated that the ablation of GATA6 impairs CRC cell proliferation and clonogenicity. Furthermore, as GATA6 binding sites colocalize with colon-specific CTCF, we performed chromatin immunoprecipitation–mass spectrometry (ChIP-MS) and coimmunoprecipitation experiments and confirmed the direct physical interaction between the GATA6 and CTCF proteins. Last, by integrating Hi-C and HiChIP experiments, we unveiled a previously unidentified role of GATA6 as a chromatin looping factor, particularly in the regulation of CRC-specific enhancer-promoter loops. Together, our findings indicated a distinct molecular mechanism by which GATA6 promotes CRC progression and further suggested a potential role of pioneer factors in chromatin topology regulation in the cancer context.

RESULTS

GATA6 is highly expressed in CRC samples and enriched in CRC-specific open chromatin regions

To study the role of GATA6 in colorectal cancer, we performed comprehensive genomic assays in primary colorectal tumors and two commonly used colorectal cancer cell lines, DLD-1 and CACO-2, followed by functional assays, including gene KO and xenograft models. The overall design of the study is outlined in Fig. 1A (table S1). We first analyzed primary patient RNA sequencing (RNA-seq) data from TCGA and the Genotype-Tissue Expression (GTEx) project. Among the 367 CRC tumors from TCGA and the 359 normal tissues from both TCGA and GTEx with mRNA expression data (51 matched peritumor colorectum tissues in TCGA and 308 normal colorectum tissues from GTEx), GATA6 has significantly higher expression in CRC compared to normal colorectum tissue (Fig. 1B). Furthermore, we examined the data from the Cancer Dependency Map Project (Achilles Project), and the results showed that the loss of GATA6 expression uniquely impaired the growth of CRC cell lines (Fig. 1C), further suggesting that GATA6 is a critical gene in CRC.

Fig. 1. GATA6 is a CRC-specific regulator.

(A) Study design. Created in BioRender.com. (B) GATA6 gene expression (normalized counts) from TCGA and GTEx patient mRNA data for CRC. P value was calculated using two-sided Student’s t test. (C) GATA6 gene dependency score from CRISPR essentiality screen (DepMap 21Q2 Public) for cancer cell lines across various lineages. Colorectal cancer cell lines are labeled as Bowel. (D) GATA6 motif enrichment at TCGA tumor distal ATAC-seq peaks. (E) Annotation of GATA6 binding sites in DLD-1 WT cells. (F) Example regions of GATA6 binding sites. Patient H3K27ac ChIP-seq data are obtained from a previous publication (32). (G) Venn diagram showing the overlap of GATA6 peaks at active enhancers in DLD-1 cells and CRC tumor-specific enhancers defined in a previous publication (32). (H) Heatmap for CUT&Tag showing the enrichment of H3K27ac, GATA6, and ATAC-seq signal in DLD-1 cells at CRC tumor-specific enhancers defined in a previous publication (32).

To examine whether GATA6 might play a role in CRC gene regulation, we downloaded and analyzed pan-cancer assay for transposase-accessible chromatin with sequencing (ATAC-seq) data in primary cancer tissues from TCGA. We observed strong enrichment of the GATA6 motif in CRC chromatin accessible regions. As shown in Fig. 1D, the GATA6 binding motif is uniquely enriched in the colon adenocarcinoma (COAD) distal ATAC-seq peaks. We also performed ATAC-seq in DLD-1 cells and observed that DLD-1 ATAC-seq profiles are highly similar to TCGA samples (fig. S1A). Clustering of all 410 TCGA ATAC-seq samples (31) showed that all CRC samples were grouped together and that DLD-1 clusters in the same group, confirming that DLD-1 is representative of primary CRC (fig. S1B). We also observed GATA6 motif enrichment in our in-house DLD-1 ATAC-seq peaks (fig. S1C). To summarize, GATA6 is overexpressed in the CRC tissues, and its motif is also highly enriched in CRC-specific open chromatin regions, suggesting that it might play a regulatory role in CRC.

GATA6 binding is enriched in CRC enhancers

Given the potential regulatory role of GATA6 in CRC, we profiled the genome-wide GATA6 binding sites through CUT&Tag in CRC cell line DLD-1. To investigate how it contributes to gene regulation, we also performed CUT&Tag for H3K27ac, a mark frequently used for predicting active enhancers, and CTCF, a key factor for loop extrusion and chromatin organization, as well as Hi-C assays to profile the genome-wide chromatin interaction.

Overall, we identified ~38,000 GATA6 binding sites, of which 51.3% overlap with distal enhancers and 30.7% overlap with gene promoters (Fig. 1E). We observed GATA6 binding at promoters and nearby putative enhancers of genes implicated in CRC tumorigenesis, such as BCL10 and PDCD4 as well as stem cell undifferentiation marker EPCAM (fig. S1A). Gene Ontology analysis of GATA6 binding sites revealed its involvement in CRC-associated terms such as regulation of apoptotic signaling pathway, negative regulation of mitogen-activated protein kinase cascade, epithelial cell proliferation, and β-catenin–T cell factor (TCF) complex assembly (fig. S1D). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of GATA6-bound oncogenes also showed CRC as the top hit (fig. S1E).

Next, we compared GATA6 binding sites with H3K27ac ChIP-seq data from CRC patient samples (32). In Fig. 1F, we showed several examples of GATA6 binding at promoters and potential enhancers of known CRC oncogenes such as KLF5 and TSPAN8. In total, 45% of the GATA6 binding sites overlap with recurrent enhancers that appeared in at least three CRC patient samples (fig. S1F and table S10). Across a set of previously defined CRC-specific enhancers (32), 55% of these regions overlap with GATA6 bound enhancers in DLD-1 (Fig. 1, G and H). The enrichment of GATA6 binding co-occurring with H3K27ac and ATAC-seq signal in both CRC patient samples and cell lines indicated the potential involvement of GATA6 in tumor-specific enhancer regions.

GATA6 binding sites overlap with CTCF and are enriched at loop anchors

In recent years, various studies have shown that tissue-specific TFs play key roles in regulatory loops between enhancers and promoters in both normal physiological conditions and disease pathogenesis (14, 15, 33). Considering its enrichment at enhancers, we next investigated whether GATA6 can execute its regulatory role through chromatin looping. As CTCF and the cohesin complex are well-characterized in higher-order genome organization (34–36), we performed CUT&Tag for CTCF and RAD21, a key component of the cohesin complex, in DLD-1 cells (table S6). We observed the co-occurrence of GATA6 binding signal with CTCF and RAD21 (Fig. 2A). Globally, 67% (25,774 of 38,514) of GATA6 peaks overlap with CTCF peaks (fig. S2, A and C). Across all 25,774 coenriched regions, 75% have a genomic distance of less than 250 base pairs (bp) between CTCF and GATA6 peak centers (fig. S2D). To further confirm the cobinding patterns of CTCF and GATA6, we performed CUT&Tag for CTCF and GATA6 in CACO-2 cells, another commonly used CRC cell line, and observed similar cobinding patterns (fig. S2, B and C). Next, we performed immunostaining with antibodies targeting GATA6 and CTCF and observed a global colocalization of the two proteins in the cell nucleus but not in the cytoplasm (Fig. 2B). Last, we performed ChIP-MS targeting GATA6, and the result showed physical interaction of GATA6 with the master chromatin architectural protein CTCF and the core cohesin complex subunits RAD21 and SMC1A (Fig. 2C). The interaction between CTCF and GATA6 was confirmed through GATA6 immunoprecipitation in both DLD-1 and CACO-2 cell lines (Fig. 2D).
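The two cobinding statistics above (the share of GATA6 peaks that overlap a CTCF peak, and the share of those overlapping peaks whose centers lie within 250 bp) can be illustrated with a minimal sketch. The function names, the half-open coordinate convention, the 1-bp overlap criterion, and the toy peaks are assumptions made for illustration; the source does not describe the overlap tool or criterion that was used.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def center(peak):
    """Midpoint of a (chrom, start, end) peak."""
    return (peak[1] + peak[2]) // 2


def cobinding_summary(gata6_peaks, ctcf_peaks, max_center_dist=250):
    """Return (fraction of GATA6 peaks overlapping CTCF,
    fraction of those with a CTCF peak center closer than max_center_dist).

    Brute-force comparison for clarity; real peak sets would need
    an interval index or a dedicated tool.
    """
    nearest_dists = []
    for g in gata6_peaks:
        partners = [c for c in ctcf_peaks if overlaps(g, c)]
        if partners:
            nearest_dists.append(min(abs(center(g) - center(c)) for c in partners))
    frac_overlap = len(nearest_dists) / len(gata6_peaks)
    if nearest_dists:
        frac_close = sum(d < max_center_dist for d in nearest_dists) / len(nearest_dists)
    else:
        frac_close = 0.0
    return frac_overlap, frac_close


# Toy peaks (hypothetical coordinates, not data from the study).
gata6 = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 250)]
ctcf = [("chr1", 250, 450), ("chr2", 0, 1000)]
frac_overlap, frac_close = cobinding_summary(gata6, ctcf)

# Arithmetic behind the reported percentage: 25,774 of 38,514 peaks.
reported_pct = round(25774 / 38514 * 100)
```

On the toy input, two of the three GATA6 peaks overlap a CTCF peak and one of those two has centers within 250 bp; `reported_pct` reproduces the 67% figure stated in the text.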

Fig. 2. GATA6 binds at enhancers and is involved in regulatory loops.

(A) Heatmap for GATA6, CTCF, and RAD21 CUT&Tag centered at GATA6 binding peaks at the overlapping regions (±3 kb). (B) Confocal microscopy image showing the colocalization of GATA6 (red) and CTCF (green) in cell nucleus from immunofluorescent staining of DLD-1 cells. Nuclei denoted with 4′,6-diamidino-2-phenylindole (DAPI) (blue). (C) GATA6 ChIP-MS protein hits identified CTCF and cohesin components. Abundance value (IBAQ) shown. (D) Immunoprecipitation with GATA6 antibody validated the protein-protein interaction between GATA6 and CTCF. IgG is used as negative control. (E) Left: Heatmap of CTCF and GATA6 CUT&Tag signal enrichment in DLD-1 cells at cell type–specific CTCF binding sites and universal CTCF binding sites. Right: Percentage of GATA6 binding at cell type–specific CTCF binding sites and universal CTCF binding sites. (F) APA analysis of DLD-1 GATA6-associated Hi-C loops in CRC patient sample Hi-C (A001, A003, B001, B003, and C003). (G) Top: Virtual 4-C (50-kb resolution) analysis showed consistent presence of GATA6-associated loops in DLD-1 and CRC patient samples at Axin2 locus. The y-axis value is the normalized Hi-C frequency, and the data range is between 0 and 1, with the anchor peak height equal to 1. Bottom: CUT&Tag track for GATA6, CTCF, and H3K27ac in DLD-1 WT. IBAQ, intensity-based absolute quantification; IB, immunoblot.

It was previously reported that conserved CTCF binding sites are more likely to be involved in architectural loops, while tissue-specific CTCF occupancies are more associated with tissue-type regulatory loops (37–39). To characterize what type of CTCF peaks are more likely to colocalize with GATA6, we divided the CTCF CUT&Tag peaks in DLD-1 cells into different subgroups by comparing them with the 457 CTCF ChIP-seq datasets across 54 tissue types and cell lines from the ENCODE project (40, 41). We identified 5669 CTCF binding sites shared in DLD-1 and all ENCODE tissue/cell types (universal) and 7097 DLD-1-specific binding sites that are only observed in DLD-1 but not in the 423 CTCF ChIP-seq datasets from non-colon ENCODE cell/tissue types. We observed that CTCF binding intensities are higher in DLD-1 for shared peaks (fig. S2F), consistent with previous reports (39, 42, 43). Then, we compared the overlap between CTCF and GATA6 binding profiles and observed that 16.3% of the conserved CTCF sites overlap with GATA6, and 79.4% of the DLD-1-specific binding sites overlap with GATA6 (Fig. 2E). We also plotted the heatmap for these groups in Fig. 2E, and the results confirmed that GATA6 preferentially binds at cell type–specific CTCF sites. This observation is consistent with a previous report that oncogenic TF binding is associated with cell type–specific CTCF binding, rather than constitutive CTCF binding (39). A similar pattern was also observed in CACO-2 (fig. S2E). Together, our result suggested that GATA6 might be involved in transcriptional loops.
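The universal versus cell type–specific split described above amounts to a set-membership rule, sketched below. This is only an illustration: the function name, the dataset labels, and the treatment of each ENCODE dataset as one unit (rather than grouping datasets by tissue type) are assumptions, and the "intermediate" label for sites that fit neither definition is introduced here for completeness.

```python
from collections import Counter


def classify_site(found_in, all_datasets, non_colon):
    """Label one DLD-1 CTCF site given the reference datasets that also contain it.

    universal:    present in every reference dataset
    specific:     absent from every non-colon reference dataset
    intermediate: anything else (label assumed for this sketch)
    """
    if all_datasets and all_datasets <= found_in:
        return "universal"
    if not (found_in & non_colon):
        return "specific"
    return "intermediate"


# Hypothetical reference panel and sites, not the ENCODE data used in the study.
all_datasets = {"liver", "lung", "colon"}
non_colon = {"liver", "lung"}
sites = {
    "siteA": {"liver", "lung", "colon"},
    "siteB": {"colon"},
    "siteC": set(),
    "siteD": {"liver"},
}

labels = {name: classify_site(found, all_datasets, non_colon)
          for name, found in sites.items()}
counts = Counter(labels.values())
```

Once sites are labeled this way, the GATA6 overlap percentage for each class (16.3% and 79.4% in the text) is the fraction of sites in that class that intersect a GATA6 peak.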

To further understand the involvement of GATA6 in chromatin looping, we performed in situ Hi-C in DLD-1 and five paired FFPE CRC tissues (fig. S3A). On average, each Hi-C library was sequenced to ~600 million reads, and the high ratio of long-range interactions confirmed the high quality of our sequenced libraries (tables S7 and S8). We performed loop calling at 10-kb resolution using the Peakachu software (44) and identified ~15,000 loops in each sample. A total of 35% of the chromatin loops in DLD-1 cells overlapped with GATA6 binding sites; 27% of the loops contained GATA6 binding sites in one anchor, while 8% contained GATA6 binding sites in both anchors (fig. S3B). Aggregate peak analysis (APA) showed that GATA6-associated loops identified in DLD-1 also occurred in the FFPE Hi-C data from our five patients (Fig. 2F). For example, we observed the highly consistent occurrence of GATA6-enriched loops linking CRC oncogene Axin2 and distal enhancers marked by H3K27ac signals across all samples (Fig. 2G). Together, our results revealed a potential role of GATA6 in CRC loops.
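As a rough illustration of what an aggregate peak analysis computes, the sketch below sums fixed-size windows of a contact matrix centered on each loop pixel and then compares the central pixel with the lower-left corner of the aggregate (a peak-to-lower-left ratio). The source does not state which APA implementation or score was used, so the function names, the choice of score, the window sizes, and the toy matrix are all assumptions.

```python
def aggregate_windows(matrix, loops, flank):
    """Sum (2*flank+1)-sized square windows centered on each loop pixel (i, j).

    Loops whose window would run off the matrix are skipped.
    Returns the aggregate matrix and the number of loops used.
    """
    size = 2 * flank + 1
    n = len(matrix)
    agg = [[0.0] * size for _ in range(size)]
    used = 0
    for i, j in loops:
        if i - flank < 0 or j - flank < 0 or i + flank >= n or j + flank >= n:
            continue
        for di in range(size):
            for dj in range(size):
                agg[di][dj] += matrix[i - flank + di][j - flank + dj]
        used += 1
    return agg, used


def peak_to_lower_left(agg, corner):
    """Ratio of the central pixel to the mean of the lower-left corner block."""
    size = len(agg)
    mid = size // 2
    block = [agg[r][c] for r in range(size - corner, size) for c in range(corner)]
    return agg[mid][mid] / (sum(block) / len(block))


# Toy 20 x 20 contact matrix: uniform background of 1 with two loop pixels of 5.
contacts = [[1.0] * 20 for _ in range(20)]
toy_loops = [(5, 12), (3, 15)]
for i, j in toy_loops:
    contacts[i][j] = 5.0

agg, used = aggregate_windows(contacts, toy_loops, flank=2)
score = peak_to_lower_left(agg, corner=1)
```

A ratio well above 1 indicates that the loop pixels are, on average, enriched over their local background; on the toy matrix the ratio equals the fivefold enrichment that was planted.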

GATA6 depletion through both AID and CRISPR KO leads to impaired cell growth and transcriptional dysregulation

To further study the role of GATA6, we first established an acute GATA6 depletion system based on the auxin-inducible degron (AID) system to examine the acute effect of GATA6 depletion (45). Specifically, the mini-IAA7 tag and the F-box protein AtAFB2 were introduced simultaneously to the endogenous GATA6 locus in a homozygous manner (fig. S4A). In the presence of auxin (IAA), the F-box protein AtAFB2 recruits the ubiquitin ligase components, leading to the rapid degradation of the GATA6 protein (Fig. 3A). The introduction of mini-IAA7 did not affect the basal GATA6 protein level (Fig. 3B). No GATA6 protein degradation was observed in the wild-type (WT) DLD-1 upon auxin treatment (fig. S4B). As validated by Western blot, GATA6 protein was depleted as early as 15 min after auxin treatment (Fig. 3B). For further experiments and analysis, we chose the 12-hour treatment time point, at which the protein was completely depleted, while the cells had not gone through a complete cell cycle. After the withdrawal of auxin from culture media, the GATA6 protein level was restored after 24 hours (Fig. 3B and fig. S4C). For the rescue experiment, we chose the 24-hour wash time point. The auxin-inducible cells are referred to as GATA6IAA7 cells throughout this paper.

Fig. 3. Depletion of GATA6 through both AID system and CRISPR KO leads to dysregulation and impaired cell growth.

(A) Schematics for the auxin-inducible degradation system. Created in BioRender.com. (B) Western blot for auxin treatment time points in DLD-1 GATA6IAA7 cells. (C) Sanger sequencing validation for GATA6 KO in DLD-1 cells. Created in BioRender.com. (D) Western blot for GATA6 KO in DLD-1 cells. (E) CCK8 proliferation curve for DLD-1 control and two GATA6 KO mutants. (F) Colony formation assay for DLD-1 control and two GATA6 KO clones. Left: Quantification of colony number with ImageJ. Right: Example images. Three replicates were performed for each condition. (G) Differential expression analysis of DLD-1 GATA6 KO compared to control. P value was calculated using two-tailed t test. (H) GSEA for DEGs after DLD-1 GATA6 KO in TCGA CRC tumor and normal.

To assess the long-term effect of complete GATA6 removal, we also knocked out GATA6 using the CRISPR-Cas9 system in DLD-1 cells. We designed single-guide RNA (sgRNA) specifically targeting the second exon of the gene. We obtained two homozygous GATA6 mutant lines, confirmed by Sanger sequencing (Fig. 3C). Western blot confirmed the successful depletion of WT GATA6 protein (Fig. 3D). To further confirm the observations we made in the DLD-1 cell line, we also performed GATA6 KO by CRISPR and established the auxin-inducible depletion system in CACO-2 cells, another commonly used colon cancer cell line (fig. S4, E and F).

After establishing the GATA6 KO cell lines, we first investigated whether the loss of GATA6 would affect the CRC cell behavior. After GATA6 KO, we observed a substantial reduction in the cell proliferation rate (Fig. 3E). We also observed impaired ability in colony formation, where both the size and number of colonies markedly decreased after GATA6 KO (Fig. 3F). The DLD-1 GATA6IAA7 cells also showed impaired clonogenicity (fig. S4D). Similar phenotypic changes were also observed in CACO-2 GATA6 KO cells (fig. S4, G and H). This result suggested that GATA6 is essential for CRC cell proliferation.

To study the impact of GATA6 KO on transcription, we performed RNA-seq in DLD-1 WT and KO cells. Globally, we observed 809 up-regulated genes and 405 down-regulated genes after GATA6 KO (Fig. 3G), with the significant down-regulation of many known CRC oncogenes such as c-MYC. To further assess whether the differentially expressed genes (DEGs) induced by GATA6 KO are enriched in primary CRC samples, we examined their expression in 624 (275 colon tumors and 349 normal colon/peritumor tissues) primary tumors and normal tissues from TCGA and GTEx. Our gene set enrichment analysis (GSEA) results showed significant enrichment of the down-regulated gene set in TCGA primary tumors compared with normal tissues (Fig. 3H). Furthermore, GATA6 KO greatly reduced the expression of epithelial stem cell marker LGR5 while increasing the expression of epithelial differentiation marker KRT20 (Fig. 3G). We investigated whether the altered gene expression in the GATA6 KO cells could also be observed in the AID systems. For the DLD-1 GATA6IAA7 system, we observed relatively moderate changes at 12 hours after GATA6 depletion (100 up-regulated and 63 down-regulated genes) and an increasingly differential transcriptome at 72 hours (483 up-regulated and 310 down-regulated genes) (fig. S5, A and B). Overall, a similar set of genes, such as c-MYC, TCF4, and CLDN2, was affected in KO cells and after 12 or 72 hours of auxin treatment. Furthermore, when we withdrew the auxin treatment and performed RNA-seq experiments, the changed expression of this group of genes, including MYC and CLDN2, was reversed (fig. S5C), suggesting that GATA6 controls a set of critical genes in colon cancer.

GATA6 depletion reshaped CRC specific open chromatin

As GATA6 has been shown to be a pioneer factor that functions in chromatin remodeling in definitive endoderm fate determination and cardiac development (24, 27), we next investigated whether it exerted a similar function in the CRC context. Therefore, we performed ATAC-seq in DLD-1 GATA6IAA7 untreated control, 12-hour auxin treatment, and 24-hour washout samples in triplicates. We found that upon the loss of GATA6, there were 1738 ATAC-seq peaks whose signals were severely weakened. After we withdrew auxin and restored the GATA6 protein, the ATAC-seq peaks were restored as well (fig. S6, A and B). GATA6 CUT&Tag data showed that there was indeed GATA6 binding in this set of dynamic ATAC-seq peaks (right panel in fig. S6A). To further confirm whether GATA6 controls this set of open chromatin regions, we also performed ATAC-seq in the GATA6 KO cells and observed that their signals were reduced as well (fig. S6C). To investigate the role of this group of ATAC-seq peaks, we performed pan-cancer ATAC-seq analysis using TCGA data from 23 distinct cancer types (31), and these GATA6-mediated ATAC-seq regions were specifically enriched in TCGA CRC ATAC-seq datasets (fig. S6D), further suggesting the role of GATA6 as a colon cancer regulator.

GATA6 depletion changed 3D genome organization leading to reduced gene expression of CRC oncogenes

Previous work through Hi-C data analysis has shown that the human genome is organized into different compartments: active “A” compartments that are associated with open chromatin and repressive “B” compartments that are correlated with closed and inactive chromatin. On the basis of the first principal component (PC1) of the Hi-C matrices, we observed that 6.12% of the genome underwent a compartment B to A switch and 1.65% underwent an A to B switch in the GATA6 KO cells (fig. S7A). Among these switched regions, 51.35% of the B to A switched regions are also A compartments in the normal colon tissue, while 58% of the A to B switched regions are B compartments in normal colon (fig. S7B). Further, we defined 1666 tumor-specific loops that recurrently appeared in all FFPE colorectal tumors while absent in paired normal samples (fig. S7C). When we examined these recurrent loops in WT versus GATA6 KO DLD-1 cells, we found that the loss of GATA6 also caused a reduction in the intensity of these loops. We observed GATA6 binding at 45.4% (757 of 1666) of the tumor-specific loop anchors (fig. S7D). Further analysis showed that these regions were also enriched for CTCF and RAD21 signals, suggesting a possible collaboration of GATA6 with the two architectural factors at loop anchors (fig. S7E).
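The compartment-switch percentages above follow from comparing the sign of PC1 bin by bin between WT and KO. A minimal sketch is given below, assuming that positive PC1 denotes compartment A and negative PC1 denotes compartment B, that all bins have equal size (so the fraction of bins equals the fraction of the genome), and that bins with a PC1 of exactly zero are unassigned. The function name and the toy values are hypothetical; the PC1 sign orientation in practice has to be fixed using an external signal such as gene density.

```python
def compartment_switches(pc1_wt, pc1_ko):
    """Return (fraction B->A, fraction A->B) over bins assigned in both samples.

    Assumes positive PC1 = A compartment and negative PC1 = B compartment.
    """
    assigned = b_to_a = a_to_b = 0
    for wt, ko in zip(pc1_wt, pc1_ko):
        if wt == 0 or ko == 0:
            continue  # skip bins without a compartment call
        assigned += 1
        if wt < 0 < ko:
            b_to_a += 1
        elif ko < 0 < wt:
            a_to_b += 1
    if assigned == 0:
        return 0.0, 0.0
    return b_to_a / assigned, a_to_b / assigned


# Toy PC1 tracks for six bins (hypothetical values).
pc1_wt = [0.5, -0.3, -0.2, 0.4, 0.0, -0.1]
pc1_ko = [0.6, 0.2, -0.1, -0.3, 0.1, 0.3]
frac_b_to_a, frac_a_to_b = compartment_switches(pc1_wt, pc1_ko)
```

In the toy tracks, five bins carry a call in both samples; two of them switch from B to A and one from A to B.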

To study the impact of GATA6 on 3D genome organization at a higher resolution, we performed H3K27ac HiChIP in DLD-1 control and GATA6 CRISPR KO cells (fig. S8A). Loop calling was performed at 5-kb resolution using the Model-based Analysis of PLAC-seq and HiChIP (MAPS) pipeline (46), yielding, on average, 90,000 loops with H3K27ac peak support on at least one loop anchor. After GATA6 KO, we observed 8937 H3K27ac HiChIP loops with reduced contact frequency, and only 528 loops were newly gained (Fig. 4A). To understand the involvement of GATA6 in these differential interactions, we examined GATA6 binding status at the loop anchors. We found that GATA6 binding was highly enriched on the anchors of H3K27ac HiChIP loops with reduced interaction (86.54%), with 68.82% of these loops having GATA6 binding at one anchor and 17.72% having GATA6 binding at both anchors (Fig. 4B).
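The one-anchor versus both-anchor breakdown above can be sketched as a simple tally over loops. The data layout (each loop as a pair of anchor intervals), the 1-bp overlap criterion, and all names and coordinates below are assumptions for illustration, not the pipeline used in the study.

```python
from collections import Counter


def anchor_bound(anchor, peaks):
    """True if a (chrom, start, end) anchor overlaps any peak by at least 1 bp."""
    return any(anchor[0] == p[0] and anchor[1] < p[2] and p[1] < anchor[2]
               for p in peaks)


def classify_loops(loops, peaks):
    """Count loops with GATA6 binding at both anchors, one anchor, or none."""
    names = {0: "none", 1: "one", 2: "both"}
    counts = Counter({"none": 0, "one": 0, "both": 0})
    for left, right in loops:
        n_bound = int(anchor_bound(left, peaks)) + int(anchor_bound(right, peaks))
        counts[names[n_bound]] += 1
    return counts


# Toy GATA6 peaks and loops (hypothetical coordinates).
gata6_peaks = [("chr1", 120, 180), ("chr1", 5050, 5100)]
toy_loops = [
    (("chr1", 100, 200), ("chr1", 5000, 5200)),  # both anchors bound
    (("chr1", 100, 200), ("chr1", 9000, 9200)),  # one anchor bound
    (("chr1", 300, 400), ("chr1", 9000, 9200)),  # no anchor bound
    (("chr1", 5000, 5200), ("chr1", 7000, 7200)),  # one anchor bound
]
loop_counts = classify_loops(toy_loops, gata6_peaks)
frac_any = (loop_counts["one"] + loop_counts["both"]) / len(toy_loops)
```

The share of loops with any GATA6-bound anchor is the sum of the one-anchor and both-anchor shares, which is how the reported 68.82% and 17.72% add up to 86.54%.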

Fig. 4. GATA6 controls transcriptional regulatory chromatin interactions involving CRC oncogenes.

(A) APA of differential H3K27ac HiChIP loops after GATA6 depletion in DLD-1 cells. (B) Percentage of reduced H3K27ac HiChIP loops with GATA6 binding at one anchor, GATA6 binding at both anchors, and no GATA6 binding. (C) GATA6 binding signal intensity at the enhancer side and promoter side for the reduced H3K27ac HiChIP loops. (D) Loop annotation for reduced H3K27ac HiChIP loops. (E) Gene expression for DEG associated with reduced E-P loops (n = 309) and gained E-P loops (n = 74). P value was calculated using Wilcoxon test. (F) GSEA for genes involved in reduced E-P loops. (G) H3K27ac HiChIP showed the reduction of interaction strength between MYC promoters with its enhancers. (H) H3K27ac HiChIP showed the reduction of interaction strength between both TSPAN8 and LGR5 promoters with their enhancers. NES, normalized enrichment score; FDR, false discovery rate.

We observed higher GATA6 binding enrichment on enhancers compared with promoters involved in the reduced H3K27ac loops, indicating that GATA6-mediated transcriptional regulation is at least partly through distal enhancers, in addition to previously reported direct promoter binding (Fig. 4C and fig. S8B). Consistently, loop annotation revealed the extensive involvement of GATA6 in enhancer-associated chromatin interactions, with 24.5% of the reduced H3K27ac loops between enhancers and 21.3% between enhancers and promoters (Fig. 4D).

We next examined the relationship between differentially expressed genes and differential enhancer-promoter loops. In total, we identified 309 differentially expressed genes that were down-regulated in GATA6 KO cells with reduced H3K27ac HiChIP signals and 74 differentially expressed genes that were up-regulated with increased H3K27ac HiChIP signals. We observed that enhancer-promoter (E-P) loops reduced by the loss of GATA6 were more associated with down-regulated genes, while gained E-P loops had higher enrichment of up-regulated genes (Fig. 4E). We then evaluated whether GATA6-mediated loops might play a role in CRC development. We performed KEGG pathway analysis for genes involved in the reduced E-P loops, and the result showed the enrichment of CRC (P value = 0.005, fig. S8C). We also observed several functional pathways that are known to be important in oncogenesis in the KEGG pathway analysis, including Wnt signaling and PI3K-Akt (fig. S8C). By examining the GSEA of genes associated with all the lost E-P loops, we identified the enriched down-regulation of cancer-associated pathways such as E2F targets and mammalian target of rapamycin (mTOR) signaling (Fig. 4F). Three example genes from each pathway are shown in fig. S8D.

The oncogenic protein MYC is overexpressed in CRC and functions as a key driver of the disease initiation and progression (47–49). It has been shown that distal enhancers regulate MYC gene expression through chromatin looping in various cancers, including CRC (50–52). In DLD-1, we observed the reduction of GATA6-involved enhancer-promoter loops at the MYC locus after GATA6 depletion, alongside expression reduction (Fig. 4G). Further, quantitative polymerase chain reaction (qPCR) for MYC in both DLD-1 and CACO-2 confirmed its down-regulation after GATA6 KO, consistent with RNA-seq results (fig. S9A). These loops also have higher interaction frequencies in multiple CRC samples (tumor versus peritumor) from our patient cohort and publicly available resources (fig. S9C and table S10) (21). We also observed the negative enrichment of MYC target gene signatures in the AID system after GATA6 depletion (fig. S9B), further suggesting that MYC is a downstream target gene regulated by GATA6. Furthermore, TCGA data showed the up-regulation of both GATA6 and MYC in CRC compared with peritumor tissue (fig. S9D). We also found that the looping strengths between the promoters of known CRC oncogenes LGR5 and TSPAN8 and their distal enhancers were greatly reduced after GATA6 KO (Fig. 4H, top) (53, 54). The disappearance of these E-P loops was accompanied by gene down-regulation and much weakened H3K27ac signals (Fig. 4H, bottom). Together, these results showed that the CRC-promoting role of GATA6 is, at least partially, executed through E-P loop regulation of key CRC genes.

Loss of GATA6 impaired CRC tumor progression in vivo

To determine the oncogenic role of GATA6 in colorectal tumorigenesis in vivo, we generated mouse xenograft models. GATA6-depleted and control DLD-1 cells were inoculated subcutaneously into the right flank of nude mice. The tumor volume was measured daily starting at day 4 after the xenograft. Consistent with our in vitro result, GATA6 loss significantly impaired the tumorigenesis ability compared with control (Fig. 5A). The median tumor weight was reduced from 1.05 to 0.34 g (Fig. 5B), and the tumor volume was reduced from 1100.9 to 378.5 mm3 at day 25 (Fig. 5C). Mice injected with GATA6-depleted cells also showed significantly prolonged survival (around 50 days) compared with mice injected with control cells (around 30 days; P < 0.001) (Fig. 5D). RNA-seq for the control and GATA6-depleted tumors showed a consistent change in gene expression between the mouse tumors and DLD-1 cells, suggesting that the same gene set is responsible for the phenotype (Fig. 5E). For example, CRC oncogenes such as CXCL5 and PRSS2 were down-regulated after GATA6 depletion (Fig. 5F). Overall, our study highlighted the oncogenic role of GATA6 both in vitro and in vivo.

Fig. 5. GATA6 loss impaired tumor progression in vivo.

(A) Image of DLD-1 control and GATA6 KO cell xenografts in mice at 25 days after injection. Top: Control. Bottom: GATA6 KO. (B and C) Quantification of tumor weight (B) and tumor volume (C) of DLD-1 control and GATA6 KO cell xenografts at 25 days after injection. P value was calculated by two-sided Student’s t test. (D) Kaplan-Meier analysis of mice with DLD-1 control and GATA6 KO cell xenografts. P value was calculated by two-sided Student’s t test. (E) Change in gene expression in GATA6 KO xenograft tumor compared with control. (F) RNA-seq track for mice control, mice GATA6 KO, DLD-1 control, and DLD-1 GATA6 KO at CXCL5 and PRSS2 gene loci. (G) Model of GATA6-mediated E-P loops. Created in BioRender.com.

DISCUSSION

In this study, we systematically studied the essential role of GATA6 in CRC gene regulation and tumor progression. First, through large-scale genomic profiling and reanalysis of published datasets, we showed that GATA6 binds at CRC-specific enhancers and loop anchors. We systematically manipulated GATA6 expression through both acutely inducible AID system for immediate changes and CRISPR KO for long-term functional impact. Both systems evidently confirmed GATA6’s essentiality in maintaining CRC cell proliferation and colony formation. GATA6 depletion also caused the loss of a set of CRC-specific chromatin accessible regions, which can be rescued by auxin removal, suggesting a critical function of GATA6 in maintaining open chromatin landscape. Further, our work unveiled a previously unidentified role of GATA6 in long-range transcriptional regulation. We experimentally confirmed the colocalization and physical interaction between GATA6 and tissue type–specific CTCF. We observed a high enrichment of GATA6 binding at loop anchors and GATA6-associated loops can also be found in CRC patient samples. GATA6 loss led to weakened enhancer-promoter interactions involving a panel of CRC oncogenes and a reduction in the target gene expression. Loss of GATA6 in mice resulted in reduced tumor burden and prolonged survival.

Emerging works have proposed various models for the formation of TF-mediated chromatin conformation (11); however, the precise mechanism remains to be further explored. In addition to the canonical loop extrusion model, the current chromatin “kissing” model suggests that mediator clusters transiently facilitate enhancer-promoter communication (55). Pluripotency TFs such as NANOG and SOX2 form droplet condensates serving as hubs for target genes (56). In leukemia, NUP98 mediates CTCF-independent interaction through phase separation (57). After GATA6 depletion, we observed the loss of CTCF binding sites genome-wide, but only minimal change occurred at loop anchors. While the decrease in CTCF and cohesin binding coinciding with the loss of GATA6-directed ATAC-seq peaks can explain the reduced loop strength at these regions, these loops only constitute a small percentage of the total reduced loops. Our evidence implies that the loss of GATA6-involved loops is unlikely to be attributable to the destabilization of CTCF binding but rather reflects a distinct mechanism. Intriguingly, a recent work suggested extensive cobinding of GATA6 with NANOG (58), opening up the possibility that GATA6-mediated chromatin loops are associated with phase separation. On the other hand, we also observed cohesin enrichment at the majority of GATA6-bound anchors, suggesting the possibility that these loops form through collaboration with cohesin and loop extrusion. Therefore, it would also be interesting to further explore the intrinsic relationship between GATA6 and cohesin as a future direction.

TFs have been compelling targets in cancer mechanistic studies given the intricate balance between normal development and tumorigenesis onset. While TFs are hard to target, several previous works validated the efficacy of pharmaceutical inhibition of GATA family members. We previously showed that pyrrothiogatain can inhibit GATA3 binding in GM12878 (59, 60). Oral administration of K11706 inhibited GATA1-3 and enhanced HIF-1 in anemic mice through competition with GATA for DNA binding (61). However, it is unclear whether these inhibitors work with GATA6. Therefore, we are also interested in testing whether these drugs work in CRC through the inhibition of GATA6, which will further highlight the therapeutic value of our study.

MATERIALS AND METHODS

Experimental design

Cell culture

Human CRC cell line DLD-1 [American Type Culture Collection (ATCC), CCL-221] was gifted by A. Shilatifard’s laboratory. CACO-2 (ATCC, HTB-37) was directly purchased from ATCC. All cell lines used in this study were subjected to short tandem repeat (STR) profiling to confirm the cellular identity. All cells tested negative for Mycoplasma using a Mycoplasma detection kit (SouthernBiotech, 13100-01).

DLD-1 cells were cultured in RPMI 1640 medium (Gibco, 11875093) supplemented with 10% fetal bovine serum (Thermo Fisher Scientific, 12440053) and 1% penicillin-streptomycin. CACO-2 cells were cultured in Eagle’s minimal essential medium (ATCC, 30-2003) with 20% fetal bovine serum (Thermo Fisher Scientific, 12440053) and 1% penicillin-streptomycin. All cells were cultured in a 37°C incubator with 5% CO2.

Patient sample collection

The paraffin-embedded pathology specimens used in this study as well as related clinical characteristics and demographics from the electronic medical record were obtained in accordance with the approval by the Institutional Review Board at our institution (STU00213678). Informed consent was obtained from all participants.

Establishment of auxin inducible GATA6 degradation cell system via CRISPR knock-in

GATA6-IAA7 DLD-1 and CACO-2 cells were generated following the procedure described previously (62). Briefly, sgRNA targeting the end of GATA6 gene (#1: CCTGAGCCCACGCCGCCAGG or #2: GAGTGGAGTGAGGCCCGCGG) was inserted into a Cas9-expressing plasmid PX330 (Addgene, 42230). The resulting plasmid was cotransfected with the IAA7-AtAFB2 donor plasmids with hygromycin resistance using Lipofectamine 3000 (Invitrogen, L3000001). The IAA7-AtAFB2 cassette was integrated after GATA6 stop codon via homologous recombination. The cells were then subjected to hygromycin selection for 2 weeks, and single clones were selected and verified by Western blot and PCR.

GATA6 protein degradation was induced by treating positive GATA6-IAA7 clones with DLD-1 complete medium supplemented with 500 μM IAA (Sigma-Aldrich, I5148) for various timepoints. For IAA washout, we removed IAA 12 hours after IAA treatment. Cells were first washed with 1× Dulbecco’s phosphate-buffered saline (DPBS) three times and then maintained in DLD-1 complete medium for 24 or 48 hours.

GATA6 stable KO through CRISPR-Cas9

CRISPR-Cas9 system was used for GATA6 KO. GATA6 targeting sgRNAs were designed on the basis of previous publication (24) and cloned into BsmBI-V2 (NEB, R0739) digested LentiCRISPRV2GFP (Addgene, #82416) backbone. The plasmid was then transformed in Stbl3-competent cells for expansion (Invitrogen, C737303), and positive clones were ampicillin selected. The plasmid was purified using endotoxin-free mini-prep kit (101Bio, W2106), and the proper insertion was confirmed by Sanger sequencing.

Lentiviruses were packaged using pMD2.G, psPAX2, and the GATA6-LentiCRISPRV2GFP plasmids. The plasmid mixture was transfected into human embryonic kidney–293T cells using Lipofectamine 3000 (Invitrogen, L3000001). The medium was changed the next day, and the virus was collected 48 hours after transfection. DLD-1 and CACO-2 cells were then immediately infected with filtered virus and polybrene (MilliporeSigma, TR1003G) and incubated overnight. Additional virus was added the next day to improve the infection efficiency. The green fluorescent protein (GFP)–positive cells were then selected and sorted into single cells by fluorescence-activated cell sorting a week after the infection.

To select the cells with proper cut at the desired sites, the single clones were subjected to genomic DNA extraction and Sanger sequencing. The positive clones were then expanded, and the successful protein depletion was then confirmed by Western blot with GATA6 antibody [Cell Signaling Technology (CST), 5851].

In situ Hi-C

In situ Hi-C was performed following the previously described protocol with modifications (34). Specifically, 1 million viable cells in cell culture were collected by centrifugation at 500g for 5 min. The cells were then crosslinked with 2% formaldehyde (MilliporeSigma, 252549) for 15 min, and the reaction was quenched with 0.2 M glycine for 10 min. The cells were then pelleted by centrifuging at 800g for 15 min, and the pellet was washed with 1× DPBS and subjected to lysis with Hi-C lysis buffer [10 mM tris-HCl (pH 8.0), 10 mM NaCl, and 0.2% Igepal CA630] supplemented with protease inhibitor (Sigma-Aldrich, P8340) for 15 min on ice. The lysed cells were then pelleted and washed with the same buffer. The cells were further lysed by incubation in 50 μl of 0.5% SDS at 62°C for 10 min, and the reaction was quenched with 145 μl of water and 25 μl of 10% Triton X-100 (Sigma-Aldrich, 93443) at 37°C for 15 min. The chromatin was then digested with 100 units of Mbo I [New England Biolabs (NEB), R0147] with 25 μl of NEBuffer 2 (NEB, B7207) at 37°C overnight. Mbo I was then deactivated by incubation at 62°C for 20 min. Digested DNA was end-repaired and biotin-labeled with a mix of 37.5 μl of 0.4 mM biotin-14-deoxyadenosine triphosphate (dATP), 1.5 μl of 10 mM deoxycytidine triphosphate (dCTP), 1.5 μl of 10 mM deoxyguanosine triphosphate (dGTP), 1.5 μl of 10 mM dTTP, and 8 μl of DNA polymerase I (5 U/μl), large (Klenow) fragment (NEB, M0210) at 37°C for 1.5 hours. DNA was then ligated by adding ligation master mix [669 μl of water and 120 μl of 10X NEB T4 DNA ligase buffer (NEB, B0202), 100 μl of 10% Triton X-100, 12 μl of bovine serum albumin (BSA) (10 mg/ml), and 5 μl of T4 DNA ligase (400 U/μl; NEB, M0202)] at room temperature for 4 hours. The chromatin was then reverse crosslinked with 50 μl of proteinase K (20 mg/ml) and 120 μl of 10% SDS at 55°C for 30 min, followed by the addition of 130 μl of 5 M NaCl and 68°C overnight incubation. 
The chromatin was then ethanol precipitated and sheared to an average size of 300 to 500 bp using a Covaris sonicator. The biotin-labeled DNA fragment was pulled down using 150 μl of streptavidin T1 beads (10 mg/ml; Invitrogen, 65602). Before usage, the beads were washed with 400 μl of 1× tween washing buffer [TWB; 5 mM tris-HCl (pH 7.5), 0.5 mM EDTA, 1 M NaCl, and 0.05% Tween 20]. The beads were then resuspended in 300 μl of 2× binding buffer [10 mM tris-HCl (pH 7.5), 1 mM EDTA, and 2 M NaCl]. The beads were then added to the chromatin and incubated at room temperature for 1 hour. The DNA-bound beads were then washed twice with 600 μl of 1X TWB at 55°C for 2 min. The sheared DNA was then resuspended and repaired using repair master mix [88 μl of 1X NEB T4 ligase buffer with 10 mM ATP, 2 μl of 25 mM deoxyribonucleotide triphosphate (dNTP) mix, 5 μl of NEB T4 PNK (10 U/μl; NEB, M0201), 4 μl of NEB T4 DNA polymerase I (3 U/μl; NEB, M0203), and 1 μl of NEB DNA polymerase I (5 U/μl), large (Klenow) fragment (NEB, M0210)] and incubated at room temperature for 30 min. Beads were then washed with TWB as described above and also washed with 100 μl of 1X NEBuffer2. Beads were then resuspended in 100 μl of dATP attachment master mix [90 μl of 1X NEBuffer2, 5 μl of 10 mM dATP, and 5 μl of NEB Klenow exo minus (5 U/μl; NEB, M0212)] and incubated at 37°C for 30 min. After TWB washing, beads were resuspended in 50 μl of 1X NEB Quick ligation buffer. A total of 2 μl of NEB DNA Quick ligase and 3 μl of Illumina indexed adapter were added, and the reaction was incubated at room temperature for 15 min. Beads were then washed twice with TWB and once with 100 μl of 10 mM tris-HCl (pH 8) and resuspended in 50 μl of 10 mM tris-HCl (pH 8). The Hi-C library was dissociated from beads by heating at 98°C for 10 min, followed by PCR amplification with 1.5 μl of PCR primer mix, 25 μl of KAPA 2x library mix, and 23.5 μl of Hi-C library. 
The PCR was conducted under following setting: 45 s of 98°C followed by 4 to 12 cycles of 15 s of 98°C, 30 s of 65°C, and 45 s of 72°C. The reaction was then held at 72°C for 5 min and 4°C until next step. The amplified library was then subjected to size selection with AMPure XP beads and quantification. The final library was sequenced as 150 paired-end (PE) reads on Illumina NovaSeq 6000 platform.

Hi-C for FFPE patient samples was performed according to the same protocol with modifications specifically for FFPE samples. Specifically, FFPE samples were deparaffinized with 1 ml of xylene for 10 min at room temperature with rotation. Samples were recollected by centrifugation at 17,900g for 5 min at room temperature. Supernatant was discarded in the fume hood. Samples were then rehydrated first in 1 ml of 100% ethanol for 10 min at room temperature with rotation. Samples were recollected by centrifugation at 17,900g for 5 min at room temperature. After discarding the supernatant, samples were then resuspended in 1 ml of water and incubated for 10 min at room temperature with rotation. The deparaffinized and rehydrated tissues were then resuspended in 200 μl of 1X PBS. Tissue was then homogenized with a Powermasher II instrument, followed by the Hi-C protocol. The conditions for the cell lysis step and the reverse crosslinking step were optimized for FFPE samples.

HiChIP experiment

HiChIP experiments were performed using Arima HiC+ kit following the manufacturer’s protocol with modifications. Briefly, 10 million cells were crosslinked with 2% formaldehyde (MilliporeSigma, 252549), followed by cell lysis, restriction enzyme digestion, end repair and biotin labeling, ligation, and shearing following Arima Hi-C procedure. Chromatin was then subjected to immunoprecipitation with H3K27ac antibody (Active Motif, 39133). The final library was prepared using the KAPA HyperPrep Kit (KAPA, 7962363001). The final library was sequenced as 150 PE reads on Illumina NovaSeq 6000 platform.

CUT&Tag

CUT&Tag libraries were prepared per a previously established protocol (63) (www.protocols.io/view/bench-top-cut-amp-tag-kqdg34qdpl25/v2?version_warning=no). Briefly, 0.25 million cells were used for each reaction. Two replicates were performed for each experiment. The following primary antibodies were used: GATA6 (CST, 5851), CTCF (ABclonal, A18627), H3K27ac (Active Motif, 39133), and RAD21 (ABclonal, A18850). The final library was sequenced as 150 PE reads on Illumina NovaSeq 6000 platform.

ATAC-seq

ATAC-seq was performed following the exact Omni-ATAC protocol (64). Viable cells (50,000) were used, and the lysis condition was optimized for DLD-1. The final library was sequenced as 150 PE reads on Illumina NovaSeq 6000 platform.

RNA-seq

Total RNA was extracted from 1 million cells using the QIAGEN RNeasy Plus Mini Kit (QIAGEN, 74136). RNA quality was examined using TapeStation with RNA ScreenTape. Library preparation was performed using the NEBNext Ultra II RNA Library Prep Kit (NEB #E7775) with polyadenylate selection and directional module. The final library was sequenced as 150 PE reads on Illumina NovaSeq 6000 platform.

Colony formation assay

The colony formation assay was performed following online protocol (www.protocols.io/view/colony-formation-wiefcbe?step=5) with modifications. Cells (500 to 1000) were plated per well in 2-ml media in each well of a six-well plate and three technical replicates were performed for each condition. The cells were incubated at 37°C with 5% CO2 for 12 days. The plates were imaged with Bio-Rad ChemiDoc imaging system and quantified using ImageJ.

Cell proliferation assay

Cell proliferation was measured by Cell Counting Kit-8 (CCK8) assay, according to the manufacturer’s instruction. Briefly, 1000 cells in 100-μl media were plated in each well of 96-well plates and allowed for attachment overnight. Cell proliferation was measured every 24 hours with CCK8 reagent (ApexBio, K1018) for 12 days. Three technical replicates were performed for each condition.

Immunofluorescence

One million cells were plated onto poly-l-lysine–coated coverslips and fixed in 4% paraformaldehyde for 20 min at room temperature. Slides were then washed three times with PBS. Cells were then permeabilized with 0.1% Triton X-100 in PBS for 10 min at room temperature, followed by blocking with 1% BSA in PBS for 30 min at room temperature to minimize nonspecific antibody binding. Slides were then incubated at 4°C overnight with 200 μl of primary antibodies: CTCF (1:10000; ABclonal, A18627) and GATA6 (1:200; R&D Systems, MAB1700). Slides were then washed with PBST (0.1% Tween 20) five times. Secondary antibodies Alexa Fluor 594 (Invitrogen, A-11005) and Alexa Fluor 488 (Invitrogen, A-21206) were used with optimal dilution suggested by the manufacturer. Slides were incubated at room temperature for 1 hour avoiding light, followed by five PBST washes. Coverslips were then mounted using mountant with 4′,6-diamidino-2-phenylindole (ProLong, P36966). Slides were imaged with Zeiss LSM800 microscope with 100× objective, and images were analyzed using ImageJ.

Co-immunoprecipitation

Twenty million cells per condition were crosslinked with 1.5 mM ethylene glycol bis(succinimidyl succinate) (EGS) (Life Technology, 21565) at room temperature for 45 min, and the reaction was quenched with 0.2 M glycine. Cells were washed with DPBS twice and subjected to lysis with 1 ml of lysis buffer [50 mM tris-HCl (pH 8), 150 mM NaCl, 0.5% Triton X-100, 10% glycerol, and 1 mM dithiothreitol] supplemented with protease inhibitor (Sigma-Aldrich, P8340) for 1 hour at 4°C. The cell debris was pelleted by centrifugation at 12,000 rpm for 10 min at 4°C. The 2% total input was collected and stored at −80°C until further use. The rest of the supernatant was incubated with 20 μg of primary antibodies [GATA6 (CST, 5851) and immunoglobulin G (IgG) (CST, 2729)] overnight at 4°C with rotation. Protein A beads (100 μl; Invitrogen, 10002D) were washed three times with PBST (0.1% Triton X-100) and added to the sample, followed by 6-hour incubation at 4°C with rotation. Beads were then washed three times with cold PBST at 4°C and eluted in 60 μl of Laemmli sample buffer (Bio-Rad, # 1610737EDU) supplemented with 2-mercaptoethanol. The result was examined using Western blot.

ChIP-MS

One hundred twenty million cells were crosslinked once with 2% formaldehyde (MilliporeSigma, 252549) for 15 min. After quenching with 0.2 M glycine and washing with DPBS, the cells were further crosslinked twice with 1.5 mM EGS (Life Technology, 21565). Cells were then lysed with lysis buffer [50 mM Hepes-KOH (pH 7.5), 150 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% sodium deoxycholate, and 0.1% SDS] supplemented with protease inhibitor (PI) for 1 hour at 4°C. The cells were further lysed in 100 μl of 0.1% SDS at room temperature for 5 hours with rotation. The cell pellet was collected by centrifuging at 5000 rpm for 5 min at 4°C and resuspended in 1 ml of lysis buffer, followed by sonication. Sheared chromatin was then precleared and incubated with primary antibody-protein A beads complex at 4°C overnight with rotation. Beads were then washed three times with lysis buffer, twice with high salt buffer (50 mM tris-HCl, 350 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% sodium deoxycholate, and 0.1% SDS), twice with LiCl buffer (10 mM tris-HCl, 250 mM LiCl, 1 mM EDTA, 0.5% NP-40, and 0.5% sodium deoxycholate), and two to three times with TE buffer. After each wash, the sample was rotated for 2 min at 4°C. The final protein was eluted with 100 μl of TE buffer supplemented with 1% SDS by incubating at room temperature for 30 min with rotation. The elution was stored at −80°C and submitted to UMAS proteomics core for sample preparation and MS analysis.

Mouse xenograft

Mouse experiments were conducted in the Institutional Animal Care and Use Committee (IACUC) and Association for Assessment and Accreditation of Laboratory Animal Care approved Center for Comparative Medicine facilities. All procedures on live animals were approved by Northwestern University IACUC (animal protocol no. IS00013610) and were conducted in compliance with the ethical guidelines. For each condition, ten 5- to 6-week-old female athymic nude mice purchased from Envigo were used. One million human DLD-1 control or GATA6 KO cells were injected into the right flank of each nude mouse. Daily measurement of tumor size started 4 days after inoculation using a calibrated caliper. Statistical significance was determined with two-sided Student’s t test.

Statistical analysis

Sequencing and quality control

Publicly available data used in this study were prefetched and downloaded using the SRA Toolkit (v.2.9.6). All raw FASTQ files generated in this study were downloaded using the Illumina BaseSpace CLI tools (v.1.0.0). All raw FASTQ reads, including RNA-seq, CUT&Tag, ATAC-seq, Hi-C, and HiChIP, were trimmed and filtered using TrimGalore (v.0.6.10). All trimmed FASTQ files went through quality control using FastQC (v.0.12.1).

RNA-seq data processing

The trimmed FASTQ files of colon cancer cell lines were aligned against the hg38 human reference genome using STAR (65) (v.2.7.10b) with parameters “--outSAMunmapped Within --outFilterType BySJout --outSAMattributes NH HI AS NM MD --outFilterMultimapNmax 20 --outFilterMismatchNmax 999 --outFilterMismatchNoverReadLmax 0.04 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --sjdbScore 1”. RSEM (66) (v1.3.3) was used for quantification and expression calculation of known genes with GENCODE v33. Genes with TPM value < 1 in all samples were excluded from downstream analyses.

DEGs analysis

Normalized expression counts were imported into R (v4.1.3) for differential expression analysis using DESeq2 (v1.34.0), applying a threshold of P value < 0.05 and log2 fold change > 2. Pearson correlations were calculated between all the replicates to confirm their reproducibility. The Gene Ontology term analysis of differentially expressed genes was performed with DAVID (v.2023q4). GSEA was performed using GSEA (67) (v.4.3.2) with all Hallmark gene sets (h.all.v2023.2.Hs.symbols.gmt).
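The thresholding step can be sketched as follows. This is a minimal Python illustration rather than the R/DESeq2 code used in the study; the gene names, the values, and the `classify_deg` helper are hypothetical, and the sketch assumes that the log2 fold change cutoff is applied to the absolute value so that both up- and down-regulated genes are retained.

```python
def classify_deg(log2_fold_change, p_value, lfc_cutoff=2.0, p_cutoff=0.05):
    """Label one gene as 'up', 'down', or 'ns' (not significant)."""
    # Assumption: the fold-change cutoff is applied to the absolute value.
    if p_value >= p_cutoff or abs(log2_fold_change) <= lfc_cutoff:
        return "ns"
    return "up" if log2_fold_change > 0 else "down"


# Hypothetical results table: gene -> (log2 fold change, P value).
example_results = {
    "GENE_A": (-3.1, 0.001),
    "GENE_B": (2.4, 0.20),
    "GENE_C": (4.0, 0.010),
    "GENE_D": (-1.2, 0.0001),
}
deg_labels = {
    gene: classify_deg(lfc, p) for gene, (lfc, p) in example_results.items()
}
print(deg_labels)
```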

Enrichment analysis for down-regulated and up-regulated genes following GATA6 depletion was performed using bulk RNA-seq data from all COAD patients (275 tumor and 349 normal) from TCGA and GTEx. DEGs between COAD tumors and normal samples were identified with DESeq2 (v1.34.0). Median gene expression values for tumor and normal patients were calculated, and all genes were ranked by test statistics for input into GSEA (v4.3.2) preranked analysis. DEGs from our in-house DLD1 GATA6 KO data were used as the observed gene set, and a permutation test with 1000 iterations was conducted, sampling gene sets of the same size as the observed set.
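The size-matched permutation idea can be illustrated with a simplified sketch. It is written under stated assumptions: the per-gene ranking statistic is summarized by its mean over the gene set (a stand-in for the GSEA running-sum enrichment score, which is not reimplemented here), and the function name, gene identifiers, and scores are hypothetical.

```python
import random


def permutation_p_value(scores, gene_set, n_iter=1000, seed=0):
    """Empirical P value for a gene set against random sets of equal size.

    scores: dict gene -> ranking statistic (for example, tumor vs. normal).
    The set-level statistic here is the mean score, a simplification.
    """
    rng = random.Random(seed)
    genes = sorted(scores)
    members = [g for g in gene_set if g in scores]
    observed = sum(scores[g] for g in members) / len(members)
    hits = 0
    for _ in range(n_iter):
        # Sample a random gene set of the same size as the observed set.
        sample = rng.sample(genes, len(members))
        if sum(scores[g] for g in sample) / len(sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_iter + 1)


# Hypothetical data: 100 genes with scores 0..99; the set is the top 10 genes.
example_scores = {f"GENE_{i:03d}": float(i) for i in range(100)}
top_set = [f"GENE_{i:03d}" for i in range(90, 100)]
p_top = permutation_p_value(example_scores, top_set)
print(p_top)
```

With 1000 iterations, the smallest P value this scheme can report is 1/1001, so the iteration count bounds the resolution of the test.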

CUT&Tag data processing

CUT&Tag data were processed using the ENCODE ChIP-seq pipeline (https://github.com/ENCODE-DCC/chip-seq-pipeline2). The trimmed reads were aligned to the hg38 human reference genome using Bowtie2 (v.2.4.2) (68). PCR-duplicated reads were filtered out using Picard MarkDuplicates (v.2.6.0). Peak calling for histone marks and TFs was performed with MACS2 (v.2.1.1) (69). Peaks located in the ENCODE hg38 blacklist regions were excluded, and only peaks with a MACS2-reported q value < 10⁻⁵ and Poisson P value < 0.01 were kept for further analysis. For reproducibility, peak files and signal files passing an Irreproducible Discovery Rate (IDR) threshold of 0.05 between at least two biological replicates were pooled and used in further analyses. Peak annotation was conducted using the HOMER (v4.11) (70) annotatePeaks function. For visualization purposes, the bigwig files with log-transformed P values [−log10(P value)] were used in the Integrative Genomics Viewer (IGV).

The annotation of GATA6 binding sites was performed using ChIPseeker (v1.30.3) (71). Promoter regions were defined as the transcription start site (TSS) ± 2500 bp. To avoid overcounting, active enhancers were identified on the basis of H3K27ac peaks, excluding those within promoter regions.
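The promoter and active-enhancer definitions above can be sketched in a few lines. This is an illustration only, not the ChIPseeker-based workflow; the helper names and coordinates are hypothetical, intervals are treated as half-open, and a peak is assumed to be "within" a promoter if it overlaps the promoter window by at least 1 bp.

```python
PROMOTER_FLANK = 2500  # promoter = TSS +/- 2500 bp


def promoter_interval(chrom, tss, flank=PROMOTER_FLANK):
    """Return the promoter window around a TSS, clipped at position 0."""
    return (chrom, max(0, tss - flank), tss + flank)


def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def active_enhancers(h3k27ac_peaks, tss_list):
    """Keep H3K27ac peaks that do not overlap any promoter window."""
    promoters = [promoter_interval(chrom, tss) for chrom, tss in tss_list]
    return [
        peak
        for peak in h3k27ac_peaks
        if not any(overlaps(peak, prom) for prom in promoters)
    ]


# Hypothetical coordinates, for illustration only.
example_tss = [("chr1", 10000)]
example_peaks = [
    ("chr1", 9000, 9500),    # inside the promoter window, excluded
    ("chr1", 20000, 20600),  # distal, kept
    ("chr2", 9000, 9500),    # different chromosome, kept
]
enhancers = active_enhancers(example_peaks, example_tss)
print(enhancers)
```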

Identification of universal/cell type–specific CTCF binding sites

We downloaded 457 ENCODE CTCF ChIP-seq datasets across various tissue types (table S9). CTCF binding levels were obtained from a normalized read count matrix, where ChIP-seq read counts (RPKM) were calculated for CTCF binding sites across all datasets, followed by quantile normalization.

Cell type–specific CTCF binding sites were identified on the basis of the following criteria: (i) regions with no CTCF binding detected in any tissue type other than DLD1/CACO2 cell lines and colon samples, based on MACS2 peak calling (binary), and (ii) regions where the CTCF binding level (quantified as normalized ChIP-seq read counts) in other tissue types was at least twofold lower than in DLD1/CACO2 cell lines and colon samples.

Universal CTCF binding sites were identified as follows: (i) regions where at least half of all tissue samples exhibited CTCF binding, according to MACS2 peak calling (binary) and (ii) regions where CTCF binding levels (quantified as normalized ChIP-seq read counts) were consistently preserved in at least half of all tissue samples.
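The two cell type–specific criteria and the first universal criterion can be expressed as a short sketch. Several details are assumptions not stated in the text: the target group is summarized by its mean binding level, the other tissues by their maximum, and at least one target sample must carry a peak. The second universal criterion ("consistently preserved" binding levels) is not implemented because its numerical definition is not given. Sample names, values, and function names are hypothetical.

```python
def is_cell_type_specific(peak_called, binding_level, target_samples, min_fold=2.0):
    """Apply criteria (i) and (ii) for one candidate CTCF site.

    peak_called: dict sample -> bool (binary MACS2 call).
    binding_level: dict sample -> normalized read count.
    """
    targets = [s for s in peak_called if s in target_samples]
    others = [s for s in peak_called if s not in target_samples]
    if not targets or not others:
        return False
    # Criterion (i): no peak in any non-target tissue.
    if any(peak_called[s] for s in others):
        return False
    # Assumption: at least one target sample carries the peak.
    if not any(peak_called[s] for s in targets):
        return False
    # Criterion (ii): other tissues at least min_fold lower than targets.
    target_mean = sum(binding_level[s] for s in targets) / len(targets)
    other_max = max(binding_level[s] for s in others)
    return other_max * min_fold <= target_mean


def passes_universal_peak_criterion(peak_called, min_fraction=0.5):
    """Universal criterion (i): a peak in at least half of all samples."""
    return sum(peak_called.values()) >= min_fraction * len(peak_called)


# Hypothetical example with three target samples and two other tissues.
target = {"DLD1", "CACO2", "colon"}
site_x_peaks = {"DLD1": True, "CACO2": True, "colon": True, "liver": False, "lung": False}
site_x_levels = {"DLD1": 8.0, "CACO2": 6.0, "colon": 7.0, "liver": 1.0, "lung": 2.0}
site_y_peaks = {s: True for s in site_x_peaks}
site_y_levels = {s: 5.0 for s in site_x_peaks}

x_specific = is_cell_type_specific(site_x_peaks, site_x_levels, target)
y_specific = is_cell_type_specific(site_y_peaks, site_y_levels, target)
y_universal = passes_universal_peak_criterion(site_y_peaks)
print(x_specific, y_specific, y_universal)
```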

Identification of CRC patients recurrent enhancers

We downloaded publicly available H3K27ac ChIP-seq data for 74 patients with CRC tumor/normal (accession GSE156613) and realigned them to hg38 following the ChIP-seq data processing methods described above (32). Recurrent enhancers were identified as common intervals among H3K27ac ChIP-seq peaks across the 74 tumor samples. Specifically, overlapping intervals between samples were reported, and a list was compiled of sample names with at least one overlap within these recurrent peak intervals. The number of samples sharing the same overlapping intervals was also recorded. To ensure a comprehensive list of recurrent enhancer peaks, nearby subintervals (within <50 bp) were merged into single intervals in the final peak list. For downstream analysis, we used the list of recurrent enhancers that appeared in at least three samples of patients with tumor.
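One way to read the recurrent-enhancer procedure is as a coverage sweep: find the subintervals covered by peaks from at least three samples, then merge subintervals separated by less than 50 bp. The sketch below implements that reading for a single chromosome. It is an interpretation under assumptions (half-open coordinates, no overlapping peaks within a single sample), not the script used in the study, and all names and coordinates are hypothetical.

```python
def recurrent_intervals(peaks_by_sample, min_samples=3, merge_gap=50):
    """Return merged intervals covered by peaks from >= min_samples samples.

    peaks_by_sample: dict sample -> list of (start, end) on one chromosome.
    """
    events = []
    for peaks in peaks_by_sample.values():
        for start, end in peaks:
            events.append((start, 1))
            events.append((end, -1))
    # At equal positions, -1 sorts before +1, matching half-open intervals.
    events.sort()

    segments = []
    depth = 0
    seg_start = None
    for pos, delta in events:
        previous = depth
        depth += delta
        if previous < min_samples <= depth:
            seg_start = pos  # coverage just reached the threshold
        elif depth < min_samples <= previous and pos > seg_start:
            segments.append([seg_start, pos])  # coverage dropped below it

    merged = []
    for start, end in segments:
        if merged and start - merged[-1][1] < merge_gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(segment) for segment in merged]


# Hypothetical peaks from four samples on one chromosome.
example_samples = {
    "tumor_1": [(100, 200), (230, 400)],
    "tumor_2": [(150, 220), (240, 380)],
    "tumor_3": [(120, 210), (235, 300)],
    "tumor_4": [(1000, 1100)],
}
recurrent = recurrent_intervals(example_samples)
print(recurrent)
```

In this example, the two shared subintervals (150 to 200 and 240 to 300) lie 40 bp apart and are therefore merged into one interval, while the peak present in a single sample is dropped.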

ATAC-seq data processing

The ATAC-seq data were processed using the ENCODE ATAC-seq pipeline (https://github.com/ENCODE-DCC/atac-seq-pipeline). The trimmed reads were aligned to the hg38 human reference genome using Bowtie2 (v.2.4.2) (68). PCR duplicate reads were then removed with the Picard MarkDuplicates tool (v.2.6.0). Peak calling was carried out using MACS2 (v.2.1.1) (69), and peaks located in the ENCODE hg38 blacklist regions were excluded. Only peaks with a MACS2-reported q value < 10⁻⁵ and a Poisson P value < 0.01 were retained. In addition, peak summits were extended by 250 bp on each side, resulting in a total width of 500 bp for further analysis. To ensure reproducibility, peak files and signal files passing an IDR threshold of 0.05 between at least two biological replicates were combined and used in subsequent analyses. For visualization, the bigwig files with log-transformed P values [−log10(P value)] were used in the IGV.
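The peak filtering and summit extension can be sketched as below. This is an illustration, not the ENCODE pipeline code: the record layout, field names, and values are hypothetical, and the thresholds are applied to raw q and P values, whereas MACS2 output files report them on a −log10 scale.

```python
def summit_window(chrom, summit, half_width=250):
    """Extend a summit by half_width on each side, clipped at position 0."""
    return (chrom, max(0, summit - half_width), summit + half_width)


def retained_windows(peaks, q_cutoff=1e-5, p_cutoff=0.01):
    """Keep peaks passing both thresholds and return their summit windows."""
    return [
        summit_window(peak["chrom"], peak["summit"])
        for peak in peaks
        if peak["q_value"] < q_cutoff and peak["p_value"] < p_cutoff
    ]


# Hypothetical peak records, for illustration only.
example_atac_peaks = [
    {"chrom": "chr1", "summit": 10000, "q_value": 1e-8, "p_value": 1e-10},
    {"chrom": "chr1", "summit": 50000, "q_value": 1e-3, "p_value": 1e-4},
    {"chrom": "chr2", "summit": 100, "q_value": 1e-6, "p_value": 1e-7},
]
windows = retained_windows(example_atac_peaks)
print(windows)
```

Windows near a chromosome start are clipped in this sketch and can therefore be narrower than 500 bp.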

The motif enrichment analysis for open chromatin regions specific to gained and lost ATAC-seq peaks was conducted using the HOMER (v4.11) (70) findMotifsGenome function with the settings “-size 1000 -mask.”

GATA6 motif enrichment in TCGA ATAC-seq

Distal ATAC-seq peaks for TCGA ATAC-seq were defined by removing the promoter regions (TSS ± 2.5 kb), and motif enrichment analysis was performed using the HOMER (v4.11) (70) findMotifsGenome function. Enrichment scores were defined as the −logP value.

Hi-C data processing

The processing of the Hi-C data, including mapping, filtering, and binning, was conducted through our in-house runHiC (72) pipeline (v.0.8.6), as detailed at https://xiaotaowang.github.io/HiC_pipeline/quickstart.html. The trimmed FASTQ files were aligned to the hg38 human reference genome with the Burrows-Wheeler Aligner (BWA-MEM) (v.0.7.17) (73). Low-quality reads, PCR duplicates, and read pairs mapped to identical restriction fragments were then removed. The remaining read pairs were binned at multiple resolutions: 5 kb, 10 kb, 50 kb, 1 Mb, 10 Mb, and 50 Mb. ICE normalization was applied during the binning stage, which produced ICE-normalized matrices in .mcool file format that were then used for visualization and downstream analyses.

HiC data analysis

The loop interactions of Hi-C datasets were detected at a 10-kb resolution using Peakachu (v.2.2.post1) (44). Peakachu distinguishes loops as either CTCF or H3K27ac types based on the use of a pretrained model for either CTCF or H3K27ac. Subsequently, only those loops identified by both the CTCF and H3K27ac models were considered valid, thereby minimizing the occurrence of false-positive loop detections. This tool assigns a probability score to each loop, reflecting the confidence level in the loop’s identification. The probability score also correlates directly with the loop’s intensity, which facilitates differential loop analysis.

The compartment analysis of Hi-C data was conducted using cooltools (DOI:10.1101/2022.10.31.514564). Compartment A/B assignments, based on PC1 values at a 100-kb resolution, were determined using the “eigs-cis” command in cooltools. A scatter plot illustrating the PC1 values before and after GATA6 modification was created with R’s ggplot2 package (v.4.1.3). Regions exhibiting positive PC1 values were classified as compartment A, while those with negative PC1 values were categorized as compartment B.
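The sign-based compartment assignment and a tally of compartment changes between conditions can be sketched as follows. This is not the cooltools or ggplot2 code; the PC1 values and function names are hypothetical, and bins with a PC1 of exactly zero are labeled "NA" as an assumption, since the text does not say how they are handled.

```python
def assign_compartment(pc1):
    """Positive PC1 -> compartment A, negative PC1 -> compartment B."""
    if pc1 > 0:
        return "A"
    if pc1 < 0:
        return "B"
    return "NA"  # assumption: zero-valued bins are left unassigned


def compartment_switches(pc1_control, pc1_ko):
    """Count bins in each before->after compartment category."""
    counts = {}
    for before, after in zip(pc1_control, pc1_ko):
        key = f"{assign_compartment(before)}->{assign_compartment(after)}"
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical PC1 values for four 100-kb bins in control and GATA6 KO.
example_pc1_control = [0.8, 0.3, -0.5, -0.2]
example_pc1_ko = [0.7, -0.1, -0.4, 0.2]
switch_counts = compartment_switches(example_pc1_control, example_pc1_ko)
print(switch_counts)
```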

The TADs and insulation score were called using perl cworld module (74). The Perl script matrix2insulation.pl 21 was used to calculate the insulation score at a 50-kb resolution matrix using parameters “--ss 100000 --im iqrMean --is 600000 --ids 400000”. The Perl script insulation2tads.pl was used to identify the topologically associated domains, with a threshold of 0.3 set for the minimum boundary strength.

Virtual 4-C analysis

The Virtual 4-C analysis was conducted using a custom script. In summary, a gene (anchor) and its surrounding region were selected, and the rows corresponding to the gene’s TSS regions and flanking areas were extracted from the Hi-C matrix. The observed contact counts were graphed with a smoothing window to produce Virtual 4-C profiles. To ensure consistency in interaction frequencies across various libraries, interactions within the respective chromosomes were adjusted to the range of 0 to 1 based on the number of interactions at the anchor regions.
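Because the custom script is not shown, the procedure can only be approximated. The sketch below extracts the anchor row from a contact matrix, applies a moving-average smoothing window, and rescales the profile to the range 0 to 1. Rescaling by the profile maximum and the moving-average form of the smoothing are assumptions, and the matrix values and function name are hypothetical.

```python
def virtual_4c(matrix, anchor_bin, flank_bins, window=3):
    """Return a smoothed, 0-to-1 scaled contact profile around an anchor bin.

    matrix: square list of lists of contact counts (one chromosome).
    """
    n_bins = len(matrix)
    lo = max(0, anchor_bin - flank_bins)
    hi = min(n_bins, anchor_bin + flank_bins + 1)
    row = [float(value) for value in matrix[anchor_bin][lo:hi]]

    # Moving average; the window is truncated at the profile edges.
    half = window // 2
    smoothed = []
    for i in range(len(row)):
        chunk = row[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))

    # Assumption: scale by the maximum so that libraries are comparable.
    peak = max(smoothed)
    return [value / peak if peak > 0 else 0.0 for value in smoothed]


# Hypothetical 5 x 5 contact matrix, for illustration only.
example_matrix = [
    [10, 4, 2, 1, 0],
    [4, 10, 5, 2, 1],
    [2, 5, 10, 6, 2],
    [1, 2, 6, 10, 4],
    [0, 1, 2, 4, 10],
]
profile = virtual_4c(example_matrix, anchor_bin=2, flank_bins=2)
print(profile)
```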

HiChIP data analysis

The H3K27ac HiChIP data from DLD1 WT/GATA6-KO cells were analyzed using MAPS (v.2.0.0) (46). Loop interactions were detected at 5-kb resolution and limited to a maximum distance of 1 Mb. Significant interactions were identified using a positive Poisson model at a false discovery rate of 2%. MACS2 (69) was used to call H3K27ac narrow peaks using the parameter setting “-p narrow.” Loops were further filtered on the basis of H3K27ac binding peaks. The analysis included differential loop analysis and annotation of promoter-promoter (P-P), promoter-enhancer (P-E), and enhancer-enhancer (E-E) loops, as detailed below.

Differential loops analysis

The lost/gain loops before and after GATA6 KO were identified based on the loop probability from Peakachu (v.2.2.post1) (44) calculated based on Gaussian mixture model. Loops were detected by Peakachu for each sample first. Then, all loops detected were pooled and deduplicated. For each loop pair, the fold change of Peakachu probability and its reciprocal fold change served as inputs for a Gaussian mixture model. Comparison was then made between the probability scores of predicted loops in sample A and the scores of loop pairs in sample B. A cutoff of twofold change and a probability score threshold above 0.9 were set. For a detailed algorithm, refer to https://github.com/tariks/peakachu/tree/master/diffPeakachu.
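The final cutoffs can be illustrated with a deliberately simplified sketch that omits the Gaussian mixture model used by diffPeakachu and applies only the twofold change and 0.9 probability thresholds. It assumes that the probability threshold applies to the condition in which the loop is stronger; the function name and probability values are hypothetical.

```python
def classify_loop_change(prob_control, prob_ko, fold_cutoff=2.0,
                         prob_cutoff=0.9, eps=1e-6):
    """Label a loop as 'lost', 'gained', or 'unchanged' after GATA6 KO.

    Simplification: fixed cutoffs only; no Gaussian mixture model is fitted.
    eps avoids division by zero for loops with zero probability.
    """
    if prob_control > prob_cutoff and prob_control / max(prob_ko, eps) >= fold_cutoff:
        return "lost"
    if prob_ko > prob_cutoff and prob_ko / max(prob_control, eps) >= fold_cutoff:
        return "gained"
    return "unchanged"


# Hypothetical Peakachu probabilities: (control, GATA6 KO).
example_loops = {
    "loop_1": (0.95, 0.30),
    "loop_2": (0.20, 0.97),
    "loop_3": (0.95, 0.80),
    "loop_4": (0.60, 0.10),
}
loop_calls = {
    name: classify_loop_change(ctrl, ko) for name, (ctrl, ko) in example_loops.items()
}
print(loop_calls)
```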

Loops annotation and aggregation analysis

The loops were classified into several categories, including GATA6-associated loops and cobinding loops. GATA6-associated loops were identified if at least one loop anchor contained GATA6 CUT&Tag binding peaks. These loops were further annotated on the basis of their characteristics into P-P, P-E, and E-E loops, guided by the presence of H3K27ac marks within the loop anchors. P-P loops were identified when both anchors contained gene promoters but lacked H3K27ac marks. In contrast, P-E loops were characterized by one anchor with a gene promoter and the other anchor marked by H3K27ac. E-E loops were defined as loops where both anchors exhibited H3K27ac peaks. Any loops that did not fit these specified features were labeled as “other.” All loop aggregation analysis and visualization were conducted using HiCPeaks (v.0.3.5) and the R package hictoolsr (v1.1.2).
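The annotation rules can be written as a small decision function. The order in which the rules are checked (P-P, then P-E, then E-E) is an assumption made for this sketch, because the text does not say how an anchor carrying both a promoter and an H3K27ac peak is resolved; the `Anchor` type and function names are hypothetical.

```python
from typing import NamedTuple


class Anchor(NamedTuple):
    """Features overlapping one loop anchor."""
    has_promoter: bool
    has_h3k27ac: bool
    has_gata6: bool


def annotate_loop(a, b):
    """Classify a loop as 'P-P', 'P-E', 'E-E', or 'other'."""
    # P-P: both anchors contain promoters and neither carries H3K27ac.
    if a.has_promoter and b.has_promoter and not (a.has_h3k27ac or b.has_h3k27ac):
        return "P-P"
    # P-E: a promoter at one anchor and H3K27ac at the other.
    if (a.has_promoter and b.has_h3k27ac) or (b.has_promoter and a.has_h3k27ac):
        return "P-E"
    # E-E: H3K27ac at both anchors.
    if a.has_h3k27ac and b.has_h3k27ac:
        return "E-E"
    return "other"


def is_gata6_associated(a, b):
    """A loop is GATA6-associated if at least one anchor has a GATA6 peak."""
    return a.has_gata6 or b.has_gata6


promoter_only = Anchor(has_promoter=True, has_h3k27ac=False, has_gata6=False)
gata6_enhancer = Anchor(has_promoter=False, has_h3k27ac=True, has_gata6=True)
unmarked = Anchor(has_promoter=False, has_h3k27ac=False, has_gata6=False)

print(annotate_loop(promoter_only, gata6_enhancer),
      is_gata6_associated(promoter_only, gata6_enhancer))
```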

Acknowledgments

We thank the Robert H. Lurie Comprehensive Cancer Center of Northwestern University in Chicago, IL, for the use of the Flow Cytometry Core Facility. We thank B. Zheng for providing the IAA7-AtAFB2 donor plasmids.

Funding: F.Y. is supported by NIH grants R35GM124820, 1R01HG009906, and R01HG011207. L.W. is supported by NIH grant R35GM146979. J.H.-y.W. is supported by T32 CA009560.

Author contributions: Conceptualization: F.Y., H.L., and X.C. Methodology: H.L., X.C., and F.Y. Software: X.C. and F.Y. Validation: H.L., Y.C., T.Z., J.H.-y.W., and L.S. Formal analysis: X.C., H.L., and J.W. Investigation: H.L., X.C., Y.C., T.Z., P.W., J.H.-y.W., L.S., L.W., and F.Y. Resources: L.Y.S. and G.Y. Data curation: X.C. and F.Y. Writing— original draft: H.L. and F.Y. Writing—review and editing: F.Y., H.L., J.H.-y.W., and J.W. Visualization: H.L., X.C., and F.Y. Supervision: F.Y., G.Y., and L.W. Project administration: F.Y. and H.L. Funding acquisition: F.Y. and L.H.W.

Competing interests: The authors declare that they have no competing interests.

Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. The processed and raw sequencing data (CUT&Tag, ATAC-seq, Hi-C, HiChIP, and RNA-seq) of cell lines, as well as the processed Hi-C data for primary samples, have been deposited at Gene Expression Omnibus under accession code GSE266750. No custom software or tools were used in this study. All open-source tools used are listed in Materials and Methods.

Supplementary Materials

This PDF file includes:

Figs. S1 to S9

Tables S1 to S10

sciadv.ads4985_sm.pdf (9.5MB, pdf)

REFERENCES AND NOTES

  • 1. Siegel R. L., Wagle N. S., Cercek A., Smith R. A., Jemal A., Colorectal cancer statistics, 2023. CA Cancer J. Clin. 73, 233–254 (2023).
  • 2. Xu H., Liu L., Li W., Zou D., Yu J., Wang L., Wong C. C., Transcription factors in colorectal cancer: Molecular mechanism and therapeutic implications. Oncogene 40, 1555–1569 (2021).
  • 3. Li J., Ma X., Chakravarti D., Shalapour S., DePinho R. A., Genetic and biological hallmarks of colorectal cancer. Genes Dev. 35, 787–820 (2021).
  • 4. Nateri A. S., Spencer-Dene B., Behrens A., Interaction of phosphorylated c-Jun with TCF4 regulates intestinal cancer development. Nature 437, 281–285 (2005).
  • 5. Tenbaum S. P., Ordonez-Moran P., Puig I., Chicote I., Arques O., Landolfi S., Fernandez Y., Herance J. R., Gispert J. D., Mendizabal L., Aguilar S., Raon y Cajal S., Schwartz S. Jr., Vivancos A., Espin E., Rojas S., Baselga J., Tabernero J., Munoz A., Palmer H. G., Beta-catenin confers resistance to PI3K and AKT inhibitors and subverts FOXO3a to promote metastasis in colon cancer. Nat. Med. 18, 892–901 (2012).
  • 6. Kim B. R., Na Y. J., Kim J. L., Jeong Y. A., Park S. H., Jo M. J., Jeong S., Kang S., Oh S. C., Lee D. H., RUNX3 suppresses metastasis and stemness by inhibiting Hedgehog signaling in colorectal cancer. Cell Death Differ. 27, 676–694 (2020).
  • 7. Gormally M. V., Dexheimer T. S., Marsico G., Sanders D. A., Lowe C., Matak-Vinkovic D., Michael S., Jadhav A., Rai G., Maloney D. J., Simeonov A., Balasubramanian S., Suppression of the FOXM1 transcriptional programme via novel small molecule inhibition. Nat. Commun. 5, 5165 (2014).
  • 8. Tran C., Ouk S., Clegg N. J., Chen Y., Watson P. A., Arora V., Wongvipat J., Smith-Jones P. M., Yoo D., Kwon A., Wasielewska T., Welsbie D., Chen C. D., Higano C. S., Beer T. M., Hung D. T., Scher H. I., Jung M. E., Sawyers C. L., Development of a second-generation antiandrogen for treatment of advanced prostate cancer. Science 324, 787–790 (2009).
  • 9. Sun Y., Hu L., Tao Z., Jarugumilli G. K., Erb H., Singh A., Li Q., Cotton J. L., Greninger P., Egan R. K., Tony Ip Y., Benes C. H., Che J., Mao J., Wu X., Pharmacological blockade of TEAD-YAP reveals its therapeutic limitation in cancer cells. Nat. Commun. 13, 6744 (2022).
  • 10. Henley M. J., Koehler A. N., Advances in targeting ‘undruggable’ transcription factors with small molecules. Nat. Rev. Drug Discov. 20, 669–688 (2021).
  • 11. Kim S., Shendure J., Mechanisms of interplay between transcription factors and the 3D genome. Mol. Cell 76, 306–319 (2019).
  • 12. Di Giammartino D. C., Kloetgen A., Polyzos A., Liu Y. Y., Kim D., Murphy D., Abuhashem A., Cavaliere P., Aronson B., Shah V., Dephoure N., Stadtfeld M., Tsirigos A., Apostolou E., KLF4 is involved in the organization and regulation of pluripotency-associated three-dimensional enhancer networks. Nat. Cell Biol. 21, 1179–1190 (2019).
  • 13. Stadhouders R., Filion G. J., Graf T., Transcription factors and 3D genome conformation in cell-fate decisions. Nature 569, 345–354 (2019).
  • 14. Wang R. T., Chen F. L., Chen Q., Wan X., Shi M. L., Chen A. K., Ma Z., Li G. H., Wang M., Ying Y. C., Liu Q. Y., Li H., Zhang X., Ma J. B., Zhong J. Y., Chen M. H., Zhang M. Q., Zhang Y., Chen Y., Zhu D. H., MyoD is a 3D genome structure organizer for muscle cell identity. Nat. Commun. 13, 205 (2022).
  • 15. Ramirez R. N., Chowdhary K., Leon J., Mathis D., Benoist C., FoxP3 associates with enhancer-promoter loops to regulate Treg-specific gene expression. Sci. Immunol. 7, eabj9836 (2022).
  • 16. Hu Y. G., Figueroa D. S., Zhang Z. H., Veselits M., Bhattacharyya S., Kashiwagi M., Clark M. R., Morgan B. A., Ay F., Georgopoulos K., Lineage-specific 3D genome organization is assembled at multiple scales by IKAROS. Cell 186, 5269–5289 (2023).
  • 17. Wang P., Tang Z., Lee B., Zhu J. J., Cai L., Szalaj P., Tian S. Z., Zheng M., Plewczynski D., Ruan X., Liu E. T., Wei C. L., Ruan Y., Chromatin topology reorganization and transcription repression by PML-RARα in acute promyeloid leukemia. Genome Biol. 21, 110 (2020).
  • 18. Kato H., Tateishi K., Iwadate D., Yamamoto K., Fujiwara H., Nakatsuka T., Kudo Y., Hayakawa Y., Ijichi H., Otsuka M., Kishikawa T., Takahashi R., Miyabayashi K., Nakai Y., Hirata Y., Toyoda A., Morishita S., Fujishiro M., HNF1B-driven three-dimensional chromatin structure for molecular classification in pancreatic cancers. Cancer Sci. 114, 1672–1685 (2023).
  • 19. Debruyne D. N., Dries R., Sengupta S., Seruggia D., Gao Y., Sharma B., Huang H., Moreau L., McLane M., Day D. S., Marco E., Chen T., Gray N. S., Wong K. K., Orkin S. H., Yuan G. C., Young R. A., George R. E., BORIS promotes chromatin regulatory interactions in treatment-resistant cancer cells. Nature 572, 676–680 (2019).
  • 20. Sanalkumar R., Dong R., Lee L., Xing Y. H., Iyer S., Letovanec I., La Rosa S., Finzi G., Musolino E., Papait R., Chebib I., Nielsen G. P., Renella R., Cote G. M., Choy E., Aryee M., Stegmaier K., Stamenkovic I., Rivera M. N., Riggi N., Highly connected 3D chromatin networks established by an oncogenic fusion protein shape tumor cell identity. Sci. Adv. 9, eabo3789 (2023).
  • 21. Johnstone S. E., Reyes A., Qi Y., Adriaens C., Hegazi E., Pelka K., Chen J. H., Zou L. S., Drier Y., Hecht V., Shoresh N., Selig M. K., Lareau C. A., Iyer S., Nguyen S. C., Joyce E. F., Hacohen N., Irizarry R. A., Zhang B., Aryee M. J., Bernstein B. E., Large-scale topological changes restrain malignant progression in colorectal cancer. Cell 182, 1474–1489.e23 (2020).
  • 22. Kim K., Kim M., Lee A. J., Song S. H., Kang J. K., Eom J., Kim Y. J., Kang G. H., Bae J. M., Min S., Kim Y., Lim Y., Kim H. S., Kim T. Y., Jung I., Spatial and clonality-resolved 3D cancer genome alterations reveal enhancer-hijacking as a potential prognostic marker for colorectal cancer. Cell Rep. 42, 112778 (2023).
  • 23. Whissell G., Montagni E., Martinelli P., Hernando-Momblona X., Sevillano M., Jung P., Cortina C., Calon A., Abuli A., Castells A., Castellvi-Bel S., Nacht A. S., Sancho E., Stephan-Otto Attolini C., Vicent G. P., Real F. X., Batlle E., The transcription factor GATA6 enables self-renewal of colon adenoma stem cells by repressing BMP gene expression. Nat. Cell Biol. 16, 695–707 (2014).
  • 24. Heslop J. A., Pournasr B., Liu J. T., Duncan S. A., GATA6 defines endoderm fate by controlling chromatin accessibility during differentiation of human-induced pluripotent stem cells. Cell Rep. 35, 109145 (2021).
  • 25. Kawasaki Y., Matsumura K., Miyamoto M., Tsuji S., Okuno M., Suda S., Hiyoshi M., Kitayama J., Akiyama T., REG4 is a transcriptional target of GATA6 and is essential for colorectal tumorigenesis. Sci. Rep. 5, 14291 (2015).
  • 26. Shen F., Li J., Cai W., Zhu G., Gu W., Jia L., Xu B., GATA6 predicts prognosis and hepatic metastasis of colorectal cancer. Oncol. Rep. 30, 1355–1361 (2013).
  • 27. Sharma A., Wasson L. K., Willcox J. A., Morton S. U., Gorham J. M., DeLaughter D. M., Neyazi M., Schmid M., Agarwal R., Jang M. Y., Toepfer C. N., Ward T., Kim Y., Pereira A. C., DePalma S. R., Tai A., Kim S., Conner D., Bernstein D., Gelb B. D., Chung W. K., Goldmuntz E., Porter G., Tristani-Firouzi M., Srivastava D., Seidman J. G., Seidman C. E., Pediatric Cardiac Genomics Consortium, GATA6 mutations in hiPSCs inform mechanisms for maldevelopment of the heart, pancreas, and diaphragm. eLife 9, e53278 (2020).
  • 28. Li Q. V., Dixon G., Verma N., Rosen B. P., Gordillo M., Luo R. H., Xu C. L., Wang Q., Soh C. L., Yang D. P., Crespo M., Shukla A., Xiang Q., Dündar F., Zumbo P., Witkin M., Koche R., Betel D., Chen S. B., Massague J., Garippa R., Evans T., Beer M. A., Huangfu D. W., Genome-scale screens identify JNK-JUN signaling as a barrier for pluripotency exit and endoderm differentiation. Nat. Genet. 51, 999–1010 (2019).
  • 29. Tsuji S., Kawasaki Y., Furukawa S., Taniue K., Hayashi T., Okuno M., Hiyoshi M., Kitayama J., Akiyama T., The miR-363-GATA6-Lgr5 pathway is critical for colorectal tumourigenesis. Nat. Commun. 5, 3150 (2014).
  • 30. Shi Z. D., Lee K., Yang D., Amin S., Verma N., Li Q. V., Zhu Z., Soh C. L., Kumar R., Evans T., Chen S., Huangfu D., Genome editing in hPSCs reveals GATA6 haploinsufficiency and a genetic interaction with GATA4 in human pancreatic development. Cell Stem Cell 20, 675–688.e6 (2017).
  • 31. Corces M. R., Granja J. M., Shams S., Louie B. H., Seoane J. A., Zhou W., Silva T. C., Groeneveld C., Wong C. K., Cho S. W., Satpathy A. T., Mumbach M. R., Hoadley K. A., Robertson A. G., Sheffield N. C., Felau I., Castro M. A. A., Berman B. P., Staudt L. M., Zenklusen J. C., Laird P. W., Curtis C., Cancer Genome Atlas Analysis, Greenleaf W. J., Chang H. Y., The chromatin accessibility landscape of primary human cancers. Science 362, eaav1898 (2018).
  • 32. Li Q. L., Lin X., Yu Y. L., Chen L., Hu Q. X., Chen M., Cao N., Zhao C., Wang C. Y., Huang C. W., Li L. Y., Ye M., Wu M., Genome-wide profiling in colorectal cancer identifies PHF19 and TBC1D16 as oncogenic super enhancers. Nat. Commun. 12, 6407 (2021).
  • 33. Dejosez M., Dall’Agnese A., Ramamoorthy M., Platt J., Yin X., Hogan M., Brosh R., Weintraub A. S., Hnisz D., Abraham B. J., Young R. A., Zwaka T. P., Regulatory architecture of housekeeping genes is driven by promoter assemblies. Cell Rep. 42, 112505 (2023).
  • 34. Rao S. S. P., Huntley M. H., Durand N. C., Stamenova E. K., Bochkov I. D., Robinson J. T., Sanborn A. L., Machol I., Omer A. D., Lander E. S., Aiden E. L., A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
  • 35. Rao S. S. P., Huang S. C., Glenn St Hilaire B., Engreitz J. M., Perez E. M., Kieffer-Kwon K. R., Sanborn A. L., Johnstone S. E., Bascom G. D., Bochkov I. D., Huang X., Shamim M. S., Shin J., Turner D., Ye Z., Omer A. D., Robinson J. T., Schlick T., Bernstein B. E., Casellas R., Lander E. S., Aiden E. L., Cohesin loss eliminates all loop domains. Cell 171, 305–320.e24 (2017).
  • 36. Nora E. P., Goloborodko A., Valton A. L., Gibcus J. H., Uebersohn A., Abdennur N., Dekker J., Mirny L. A., Bruneau B. G., Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell 169, 930–944.e22 (2017).
  • 37. Kubo N., Ishii H., Xiong X., Bianco S., Meitinger F., Hu R., Hocker J. D., Conte M., Gorkin D., Yu M., Li B., Dixon J. R., Hu M., Nicodemi M., Zhao H., Ren B., Promoter-proximal CTCF binding promotes distal enhancer-dependent gene activation. Nat. Struct. Mol. Biol. 28, 152–161 (2021).
  • 38. Kentepozidou E., Aitken S. J., Feig C., Stefflova K., Ibarra-Soria X., Odom D. T., Roller M., Flicek P., Clustered CTCF binding is an evolutionary mechanism to maintain topologically associating domains. Genome Biol. 21, 5 (2020).
  • 39. Fang C., Wang Z., Han C., Safgren S. L., Helmin K. A., Adelman E. R., Serafin V., Basso G., Eagen K. P., Gaspar-Maia A., Figueroa M. E., Singer B. D., Ratan A., Ntziachristos P., Zang C., Cancer-specific CTCF binding facilitates oncogenic transcriptional dysregulation. Genome Biol. 21, 247 (2020).
  • 40. Luo Y., Hitz B. C., Gabdank I., Hilton J. A., Kagda M. S., Lam B., Myers Z., Sud P., Jou J., Lin K., Baymuradov U. K., Graham K., Litton C., Miyasato S. R., Strattan J. S., Jolanki O., Lee J. W., Tanaka F. Y., Adenekan P., O’Neill E., Cherry J. M., New developments on the Encyclopedia of DNA Elements (ENCODE) data portal. Nucleic Acids Res. 48, D882–D889 (2020).
  • 41. ENCODE Project Consortium, An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
  • 42. Kai Y., Andricovich J., Zeng Z., Zhu J., Tzatsos A., Peng W., Predicting CTCF-mediated chromatin interactions by integrating genomic and epigenomic features. Nat. Commun. 9, 4221 (2018).
  • 43. Willi M., Yoo K. H., Reinisch F., Kuhns T. M., Lee H. K., Wang C., Hennighausen L., Facultative CTCF sites moderate mammary super-enhancer activity and regulate juxtaposed gene in non-mammary cells. Nat. Commun. 8, 16069 (2017).
  • 44. Salameh T. J., Wang X., Song F., Zhang B., Wright S. M., Khunsriraksakul C., Ruan Y., Yue F., A supervised learning framework for chromatin loop detection in genome-wide contact maps. Nat. Commun. 11, 3428 (2020).
  • 45. Li S., Prasanna X., Salo V. T., Vattulainen I., Ikonen E., An efficient auxin-inducible degron system with low basal degradation in human cells. Nat. Methods 16, 866–869 (2019).
  • 46. Juric I., Yu M., Abnousi A., Raviram R., Fang R., Zhao Y., Zhang Y., Qiu Y., Yang Y., Li Y., Ren B., Hu M., MAPS: Model-based analysis of long-range chromatin interactions from PLAC-seq and HiChIP experiments. PLOS Comput. Biol. 15, e1006982 (2019).
  • 47. Satoh K., Yachida S., Sugimoto M., Oshima M., Nakagawa T., Akamoto S., Tabata S., Saitoh K., Kato K., Sato S., Igarashi K., Aizawa Y., Kajino-Sakamoto R., Kojima Y., Fujishita T., Enomoto A., Hirayama A., Ishikawa T., Taketo M. M., Kushida Y., Haba R., Okano K., Tomita M., Suzuki Y., Fukuda S., Aoki M., Soga T., Global metabolic reprogramming of colorectal cancer occurs at adenoma stage and is induced by MYC. Proc. Natl. Acad. Sci. U.S.A. 114, E7697–E7706 (2017).
  • 48. Kress T. R., Cannell I. G., Brenkman A. B., Samans B., Gaestel M., Roepman P., Burgering B. M., Bushell M., Rosenwald A., Eilers M., The MK5/PRAK kinase and Myc form a negative feedback loop that is disrupted during colorectal tumorigenesis. Mol. Cell 41, 445–457 (2011).
  • 49. Sansom O. J., Meniel V. S., Muncan V., Phesse T. J., Wilkins J. A., Reed K. R., Vass J. K., Athineos D., Clevers H., Clarke A. R., Myc deletion rescues Apc deficiency in the small intestine. Nature 446, 676–679 (2007).
  • 50. Lin X., Liu Y., Liu S., Zhu X., Wu L., Zhu Y., Zhao D., Xu X., Chemparathy A., Wang H., Cao Y., Nakamura M., Noordermeer J. N., La Russa M., Wong W. H., Zhao K., Qi L. S., Nested epistasis enhancer networks for robust genome regulation. Science 377, 1077–1085 (2022).
  • 51. Zhu X., Qi C., Wang R., Lee J. H., Shao J., Bei L., Xiong F., Nguyen P. T., Li G., Krakowiak J., Koh S. P., Simon L. M., Han L., Moore T. I., Li W., Acute depletion of human core nucleoporin reveals direct roles in transcription control but dispensability for 3D genome organization. Cell Rep. 41, 111576 (2022).
  • 52. Schuijers J., Manteiga J. C., Weintraub A. S., Day D. S., Zamudio A. V., Hnisz D., Lee T. I., Young R. A., Transcriptional dysregulation of MYC reveals common enhancer-docking mechanism. Cell Rep. 23, 349–360 (2018).
  • 53. Zhu R., Wang H., TSPAN8 promotes cancer cell stemness via activation of sonic hedgehog signaling. Pancreas 49, 155–155 (2020).
  • 54. Morgan R. G., Mortensson E., Williams A. C., Targeting LGR5 in colorectal cancer: Therapeutic gold or too plastic? Br. J. Cancer 118, 1410–1418 (2018).
  • 55. Cho W. K., Spille J. H., Hecht M., Lee C., Li C., Grube V., Cisse I. I., Mediator and RNA polymerase II clusters associate in transcription-dependent condensates. Science 361, 412–415 (2018).
  • 56. de Wit E., Bouwman B. A., Zhu Y., Klous P., Splinter E., Verstegen M. J., Krijger P. H., Festuccia N., Nora E. P., Welling M., Heard E., Geijsen N., Poot R. A., Chambers I., de Laat W., The pluripotent genome in three dimensions is shaped around pluripotency factors. Nature 501, 227–231 (2013).
  • 57. Ahn J. H., Davis E. S., Daugird T. A., Zhao S., Quiroga I. Y., Uryu H., Li J., Storey A. J., Tsai Y. H., Keeley D. P., Mackintosh S. G., Edmondson R. D., Byrum S. D., Cai L., Tackett A. J., Zheng D., Legant W. R., Phanstiel D. H., Wang G. G., Phase separation drives aberrant chromatin looping and cancer development. Nature 595, 591–595 (2021).
  • 58. Thompson J. J., Lee D. J., Mitra A., Frail S., Dale R. K., Rocha P. P., Extensive co-binding and rapid redistribution of NANOG and GATA6 during emergence of divergent lineages. Nat. Commun. 13, 4257 (2022).
  • 59. Nomura S., Takahashi H., Suzuki J., Kuwahara M., Yamashita M., Sawasaki T., Pyrrothiogatain acts as an inhibitor of GATA family proteins and inhibits Th2 cell differentiation. Sci. Rep. 9, 17335 (2019).
  • 60. Yang H., Zhang H., Luan Y., Liu T., Yang W., Roberts K. G., Qian M.-x., Zhang B., Yang W., Perez-Andreu V., Xu J., Iyyanki S., Kuang D., Stasiak L. A., Reshmi S. C., Gastier-Foster J., Smith C., Pui C.-H., Evans W. E., Hunger S. P., Platanias L. C., Relling M. V., Mullighan C. G., Loh M. L., Yue F., Yang J. J., Noncoding genetic variation in GATA3 increases acute lymphoblastic leukemia risk through local and global changes in chromatin conformation. Nat. Genet. 54, 170–179 (2022).
  • 61. Nakano Y., Imagawa S., Matsumoto K., Stockmann C., Obara N., Suzuki N., Doi T., Kodama T., Takahashi S., Nagasawa T., Yamamoto M., Oral administration of K-11706 inhibits GATA binding activity, enhances hypoxia-inducible factor 1 binding activity, and restores indicators in an in vivo mouse model of anemia of chronic disease. Blood 104, 4300–4307 (2004).
  • 62. Zheng B., Aoi Y., Shah A. P., Iwanaszko M., Das S., Rendleman E. J., Zha D., Khan N., Smith E. R., Shilatifard A., Acute perturbation strategies in interrogating RNA polymerase II elongation factor function in gene expression. Genes Dev. 35, 273–285 (2021).
  • 63. Kaya-Okur H. S., Wu S. J., Codomo C. A., Pledger E. S., Bryson T. D., Henikoff J. G., Ahmad K., Henikoff S., CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat. Commun. 10, 1930 (2019).
  • 64. Corces M. R., Trevino A. E., Hamilton E. G., Greenside P. G., Sinnott-Armstrong N. A., Vesuna S., Satpathy A. T., Rubin A. J., Montine K. S., Wu B., Kathiria A., Cho S. W., Mumbach M. R., Carter A. C., Kasowski M., Orloff L. A., Risca V. I., Kundaje A., Khavari P. A., Montine T. J., Greenleaf W. J., Chang H. Y., An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).
  • 65. Dobin A., Davis C. A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T. R., STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
  • 66. Li B., Dewey C. N., RSEM: Accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
  • 67. Subramanian A., Tamayo P., Mootha V. K., Mukherjee S., Ebert B. L., Gillette M. A., Paulovich A., Pomeroy S. L., Golub T. R., Lander E. S., Mesirov J. P., Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. U.S.A. 102, 15545–15550 (2005).
  • 68. Langmead B., Salzberg S. L., Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
  • 69. Zhang Y., Liu T., Meyer C. A., Eeckhoute J., Johnson D. S., Bernstein B. E., Nusbaum C., Myers R. M., Brown M., Li W., Liu X. S., Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
  • 70. Heinz S., Benner C., Spann N., Bertolino E., Lin Y. C., Laslo P., Cheng J. X., Murre C., Singh H., Glass C. K., Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
  • 71. Yu G., Wang L. G., He Q. Y., ChIPseeker: An R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics 31, 2382–2383 (2015).
  • 72. Wang X., runHiC: A user-friendly Hi-C data processing software based on hiclib (Zenodo, 2016).
  • 73. Li H., Durbin R., Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
  • 74. Crane E., Bian Q., McCord R. P., Lajoie B. R., Wheeler B. S., Ralston E. J., Uzawa S., Dekker J., Meyer B. J., Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244 (2015).


Articles from Science Advances are provided here courtesy of American Association for the Advancement of Science
