Abstract
Clinical divergence between patients harboring CIC-rearrangements is frequently observed. For example, the prototypical CIC::DUX4 fusion associates with soft tissue tumors while CIC::NUTM1 fusions typically localize to the central nervous system (brain/spinal cord). The basis for these differences is poorly understood due to a lack of molecular tools. To address this need, we generated patient-informed, synthetic coding sequences for CIC::NUTM1, CIC::LEUTX, and ATXN1::DUX4 and validated them in structure-function studies and in genetic zebrafish models. We found that CIC::NUTM1 drives a transcriptional program distinct from that of CIC::DUX4 due to a C-terminal NUTM1 functional domain, CIC::LEUTX weakly activates CIC target genes through LEUTX transactivation sequences, and ATXN1::DUX4 upregulates CIC target genes via the ATXN1 AXH domain. Our findings indicate that the CIC fusion binding partner may alter overall fusion oncoprotein activity. Implications: these first-generation synthetic tools illuminate partner gene-specific mechanistic biology while providing an unprecedented resource to study CIC-family fusions beyond CIC::DUX4 and allow for the dissection of this rare subgroup of cancers.
Introduction
Transcription factor (TF) fusion oncoproteins often possess the ability to rewire cellular states to promote cellular transformation and malignant progression. It has become increasingly clear that TF fusion oncogenes and the diseases they cause are not strictly limited to a 1:1 pairing – highly similar or identical fusions may cause different cancers, and groups of fusions that vary in one or both partner genes may collectively drive a singular family of tumors (termed fusion gene promiscuity, reviewed in (1,2)). Our research focuses on sarcomas, where several types of pediatric and young adult cancers are driven by fusion oncoproteins (e.g. Ewing sarcoma – EWSR1::FLI1, alveolar rhabdomyosarcoma – PAX3/7::FOXO1, synovial sarcoma – SS18::SSX1/2). These tumors are also driven by relatively promiscuous fusions, and there remains debate over whether differences in the partner genes comprising the driver fusion oncogene may have implications for tumor biology and patient outcomes. In alveolar rhabdomyosarcoma, for example, PAX3::FOXO1 vs PAX7::FOXO1 prognostic data are quite mixed (3–5), while recent isogenic modeling work in fibroblasts suggests that these two versions of the fusion may possess unique DNA binding sites and downstream gene activation (6). As molecular diagnostics improve in the clinic, an improved understanding of whether such differences in gene fusion partners translate to divergent mechanisms of tumorigenesis and patient outcomes will be important to tailor treatment strategies.
Our group studies the gene capicua (CIC), which encodes a developmental transcription factor (7) that acts as a default repressor and serves to negatively regulate the expression of cell cycle genes and MAPK pathway regulators (8,9). As a tumor suppressor, both genetic and functional inactivation of wild-type CIC is associated with several types of cancer (10–13). In a subset of undifferentiated round cell sarcoma, the C-terminal end of wild type (WT) CIC is replaced with another TF with transactivating capacity to generate a relatively promiscuous family of gene fusions. Mechanistically, the founding member of this family, CIC::DUX4, is thought to activate transcription through the recruitment of the histone acetyltransferases p300/CBP by the DUX4 transactivating domain to DNA binding sites determined by a largely intact CIC fragment (14–18). Recent studies have expanded the spectrum of CIC rearrangements beyond those involving DUX4, with the most common alternative 3’ partner genes being NUTM1 (19–24), LEUTX (25–28), and FOXO4 (29–31). These additional 3’ partners all possess evidence that they interact with p300/CBP (32–36), lending support to the current consensus that CIC-rearrangements may recruit p300/CBP to activate CIC target genes. Fusions with a 5’ gene partner of ATXN1 or ATXN1L, homologs which interact with CIC and modulate its stability (37,38), have also been reported with 3’ partners of DUX4 (39,40), NUTM1 (41), or NUTM2A (42). Where performed, the methylation profiles of tumors bearing these ATXN1-/ATXN1L-fusions have grouped with those harboring CIC rearrangements (39,40,42), suggesting they together comprise a larger group perhaps more aptly termed “CIC-family” fusion oncogenes.
The presumed mechanism of action of CIC-fusion oncogenes (p300/CBP recruitment to CIC binding sites on DNA) would imply that all CIC-rearranged cancers should drive equivalent molecular programs and would present similarly in the clinic, regardless of 3’ fusion partner. However, the available data suggest that there may be differences between CIC fusions harboring different 3’ partner genes. Transcriptionally, the limited RNA-sequencing data from patients show that while CIC::DUX4, CIC::NUTM1, and CIC::FOXO4 positive tumors cluster away from other fusion-driven tumors, they also somewhat subdivide based on 3’ fusion partner (20). Structurally, we have previously observed that CIC::NUTM1 fusion sequences tend not to retain the CIC C1 domain, which is almost universally preserved in CIC::DUX4 fusions and appears to be required for maximal function of CIC::DUX4 (16). Anatomically, CIC::DUX4 tumors arise largely in the soft tissue, while CIC::NUTM1 tumors appear to predominantly present in the central nervous system (CNS) and spine (19–21), as we have reviewed recently (43). Furthermore, the rare tumors with CIC-family fusions in the brain have recently been divided into two diagnostic entities: sarcomas and high-grade neuroepithelial tumors (HGNET-CIC), which cluster separately by methylation profiling and have distinct morphological features (25,44). Notably, while patient numbers are limited, the majority of HGNET-CIC tumors appear to carry the CIC::LEUTX rearrangement, while the distribution of fusion partners in CNS sarcoma with CIC-family rearrangement is much more varied (44). These comparisons all suffer from small patient numbers and scarce data, but collectively suggest that the manifestation of CIC-family fused tumors may vary based on the fusion partners.
These data are likely explained by at least one of two hypotheses. The first is that different combinations of CIC-family fusion partners may be more likely to arise in various cell types, which could explain their distinct anatomical locations. This is a difficult hypothesis to test as the cell of origin is not known for any CIC-family fusion. A second possibility is that the molecular mechanism of action for CIC-family fusions could vary depending on the partner genes and may not simply be activation of CIC target genes by p300/CBP. In this scenario, the unique functionality imparted by different partner genes could be sufficient to modify the overall behavior of the fusion and enable it to transform different cell types. This too is difficult to investigate, as there are very few tools to study CIC::DUX4 (a handful of plasmids (14,45), patient-derived cell lines (46–48), and transgenic mice (15) have been developed), and none of any kind exist to model the other CIC- or ATXN1/1L-fusions in the lab. Waiting to encounter a patient with a fusion-bearing tumor, cloning that fusion and/or using the tumor to establish xenografts or cell lines, and using those tools for modeling is painfully slow when the tumor type in question is exceedingly rare and understudied, as is the case here. However, if there are existing data describing the breakpoint sequence of the fusion in patients, an alternative approach is to manually clone synthetic but patient-informed fusion oncogenes from fragments of the partner genes. Such an approach is certainly subject to caveats, but is relatively quick and could allow for a head start on studying fusions that lack tools while traditional models are developed.
In this study we employ such an approach to create the first generation of molecular tools for studying non-CIC::DUX4 CIC-family fusions including CIC::NUTM1, CIC::LEUTX, and ATXN1::DUX4. We then employ these tools for structure-function studies to determine how the various partner genes contribute to the activity of their fusions, and how they may converge or diverge from each other. We find that CIC::NUTM1 fusions are tumorigenic in vivo, we show they are capable of driving a slightly distinct transcriptional program from that of CIC::DUX4, and we nominate a novel C-terminal NUTM1 functional domain which may explain this divergent behavior. We additionally show CIC::LEUTX to be a relatively weak transactivator, with dependence on two small C-terminal LEUTX motifs to execute this function. Finally, we show evidence to support the initial hypothesis (39) that ATXN1::DUX4 may function by colocalizing the DUX4 moiety with CIC via the ATXN1 AXH domain, yielding a new mechanism as an “indirect CIC fusion”. Together, these tools enable the first modeling of these ultra-rare fusion oncogenes and yield mechanistic insights that may help to explain why CIC-family fusions can diverge in clinical settings. Looking forward, we aim to lower the barrier to entry into these emerging fields of research by making these tools publicly available in the hopes that we may collectively work to develop better models and new treatments for cancers driven by CIC-family fusion oncogenes.
Materials and Methods
Cell Culture
293T cells (female) were purchased from ATCC (CRL-3216, RRID: CVCL_0063) in September 2021 and were secondarily short tandem repeat verified with the FTA Human (135-XV) Authentication Service (ATCC). Cells were managed as described previously (16). Briefly, cell lines were grown in humidified incubators at 37°C with 5% CO2 in DMEM with high glucose, L-glutamine, and sodium pyruvate (SH30243.02, Cytiva) supplemented with 10% FBS, 100 U/mL penicillin, and 100 μg/mL streptomycin. Cells were routinely screened for Mycoplasma using the e-Myco plus Mycoplasma PCR Detection Kit (Boca Scientific); for any given thawed cell line this typically included testing after use for the final experiment prior to disposal. Thawed cells were allowed to recover for approximately 48 hours prior to use in experiments. Cells were typically grown for no longer than one month before being replaced with a lower-passage vial.
Cloning
The pcDNA3.1-HA-CIC::DUX4 plasmid was a gift from Takuro Nakamura (14). All primers used for cloning are described in Supplementary Dataset S1. The pCMV3 empty plasmid, which served as a backbone for the cloned CIC- and ATXN1- synthetic fusion constructs, was derived from Sino Biological AG13105-CF by replacing the GFP-FLAG insert with a START-STOP motif via deletion PCR with the Q5 Site-Directed Mutagenesis Kit (NEB E0554). NEB 10-beta E. coli cells (NEB C3019) were used for most cloning, as in our hands they gave better efficiencies when cloning large plasmids. Full plasmid sequencing was performed by Plasmidsaurus using Oxford Nanopore Technology with custom analysis and annotation. Sanger sequencing was performed by Quintara Biosciences.
The synthetic fusions were modeled after the following patient reports (identifiers & references given): CIC(ex20)::NUTM1(ex6) – RNA012_16_073 (20); CIC(ex18)::NUTM1(ex3) – SARC084 / RNA012_16_074 (20); CIC::LEUTX – AS1 (26); ATXN1::DUX4 – the patient described in Satomi et al 2022 (39) (Supplementary Figure S1). The synthetic fusions were cloned with N-terminal HA tags into the pCMV3 empty backbone using NEB HiFi (NEB E2621), and an overview of the strategies is described in Supplementary Figure S2. The two CIC::NUTM1 fusions are comprised of CIC sequences from the pcDNA3.1-HA-CIC::DUX4 plasmid and NUTM1 sequences from pcDNA5 frt/to N-bioTAP-C-BRD4-NUT, a gift from Chris French (Addgene #171630, RRID:Addgene_171630) (49). The CIC::LEUTX fusion is comprised of a CIC sequence from Origene RC215209 (contains CIC [NM_015125] C-terminally Myc-DDK tagged in a pCMV6 backbone) and a LEUTX sequence synthesized by Twist Bioscience that was derived from NM_001382345.1. The ATXN1::DUX4 fusion is comprised of an ATXN1 sequence from Flag-ATXN1, a gift from Huda Zoghbi (Addgene #48189, RRID:Addgene_48189) (50) and a DUX4 sequence from pCW57.1-DUX4-WT, a gift from Stephen Tapscott (Addgene #99282, RRID:Addgene_99282) (51).
Tol2 transposon-based expression vectors were constructed for transgenic expression of CIC::NUTM1 in zebrafish as previously described (52). In brief, CIC(ex20)::NUTM1(ex6) and CIC(ex18)::NUTM1(ex3) fusion genes were cloned into a pDONR vector to generate the 3′ entry clones. Using the Multisite Gateway expression system, each 3′ entry clone, combined with a 5′ entry clone carrying the ubiquitin (ubi) promoter (Addgene plasmid #27320; gift from Leonard Zon (53), RRID:Addgene_27320) and a middle entry clone containing mTagBFP-T2A (derived from mTagBFP2-pBAD, a gift from Michael Davidson (54) (Addgene plasmid # 54572; RRID:Addgene_54572)), was assembled into the destination vector pDestTol2pA2 by LR reaction (Thermo Fisher Scientific), yielding the final CIC::NUTM1 expression constructs (Figure 2E). Tol2 transposase mRNA was synthesized in vitro from pCS2FA using the mMESSAGE mMACHINE kit (Applied Biosystems/Ambion).
Figure 2: Functional validation of the synthetic CIC::NUTM1 fusion oncoproteins.

(A) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or the indicated constructs, representative of three independent experiments. The HSP90 blot was aggregated with HA and ETV5 blots from replicate identically loaded, simultaneously processed samples. For this and subsequently applicable immunoblots, see Supplementary Dataset S2 for full details and complete Ponceau S loading controls.
(B) Immunofluorescence microscopy of 293T cells approximately 48 hours after transfection with the indicated constructs and stained with DAPI (DNA), rhodamine-phalloidin (Actin), or an anti-HA antibody. Scale bars indicate approximately 10 μm; representative cells are shown from one experiment.
(C) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV), full-length HA-CIC(ex18)::NUTM1(ex3), or truncated versions of HA-CIC(ex18)::NUTM1(ex3) that are comprised only of the HA tag motif and the fractional coding sequence of the indicated partner, split at the breakpoint. Representative of three independent experiments. The HSP90 blot was aggregated with HA and ETV5 blots from replicate identically loaded, simultaneously processed samples.
(D) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or one of three versions of HA-CIC(ex20)::NUTM1(ex6): full-length (FL), CIC C1 domain deleted (dC1), or CIC C1 domain and HMG box deleted (dC1 + dHMG). Representative of three independent experiments.
(E) Design of the Tol2 transposon-based expression constructs used to generate transgenic CIC::NUTM1-expressing zebrafish. ubi indicates the ubiquitin promoter, and the shaded regions of CIC and NUTM1 represent the HMG box, C1 domain, and TAD domains as shown in Figure 1.
(F) Incidence of tumors in transgenic zebrafish engineered with Tol2 constructs encoding CIC(ex20)::NUTM1(ex6) or CIC(ex18)::NUTM1(ex3). *** = p < 0.001 by Logrank (Mantel-Cox) test, chi square value of 14.62 with 1 degree of freedom. N = 39 and N = 46 BFP+ embryos were raised and monitored for the CIC(ex20)::NUTM1(ex6) and CIC(ex18)::NUTM1(ex3) groups, respectively.
(G) Merged brightfield and BFP fluorescence images of a representative fish bearing a CIC(ex20)::NUTM1(ex6)-driven tumor. Scale bar indicates 1 mm.
(H) H&E staining of the tumor shown in (G). Scale bar represents 100 μm.
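As an illustrative check of the statistic reported in panel F, the following minimal sketch converts a chi-square value with 1 degree of freedom into its upper-tail p-value using only the Python standard library. It relies on the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal variable; the function name `chi2_sf_1df` is hypothetical, and this is not the software used for the original log-rank analysis.

```python
import math


def chi2_sf_1df(statistic: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    With 1 df, chi-square is the square of a standard normal variable, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(statistic / 2.0))


# Chi-square value reported for the log-rank comparison in panel F.
p_value = chi2_sf_1df(14.62)
print(f"p = {p_value:.2e}")
print("below 0.001:", p_value < 0.001)
```

Under these assumptions the computed value falls below 0.001, consistent with the significance level stated in the legend.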
Because the synthetic fusions were cloned from various sources, the sequences of their partner genes do not always exactly match the respective reference transcript sequences. This is also true of the CIC::DUX4 coding sequence, as it was originally cloned from a patient tumor. The amino acid differences are described below as compared to reference sequences for CIC (NM_015125.5), DUX4 (NM_001306068.3), NUTM1 (NM_001284292.2), LEUTX (NM_001382345.1), and ATXN1 (NM_001128164.2):
CIC::DUX4 – CIC fragment has a silent mutation at C270 and a micro deletion of the reference G1443-T1444.
CIC(ex20)::NUTM1(ex6) – CIC fragment has a silent mutation at C270 and a micro deletion of the reference G1443-T1444. NUTM1 fragment has a silent mutation at G867.
CIC(ex18)::NUTM1(ex3) – CIC fragment has a silent mutation at C270. NUTM1 fragment has a P50L mutation, a silent mutation at D103, a series of silent mutations involving A366-K368 and R370-P372, and a silent mutation at G867.
CIC::LEUTX – no differences from reference sequences.
ATXN1::DUX4 – ATXN1 fragment has a silent mutation at V118, only two Q residues in the CAG trinucleotide repeat region (reference Q197-Q225), and a silent mutation at L233.
Since the CIC::LEUTX fusion was cloned with a different CIC donor than that used for CIC::NUTM1 (which derived their CIC sequences from CIC::DUX4), there are minor differences in the CIC sequences between the two, of which the only coding difference is that CIC::LEUTX does not have deletion of G1443-T1444. Please also note that the pcDNA3.1-HA-CIC::DUX4 plasmid backbone has minor differences from the pCMV3 backbone used for the other fusions, specifically that it has AmpR instead of KanR, NeoR instead of HygR, and a slightly different CMV promoter-enhancer element. The pCMV3 backbone also possesses a chimeric intron element between the CMV promoter and the start of the coding sequences which the pcDNA3.1 vector does not. Despite these differences, we routinely observed clear expression of all fusions regardless of their backbone or their mutations, though we cannot rule out plasmid-intrinsic effects. The empty vector control for most experiments was the empty pCMV3 construct; for some experiments involving only CIC::DUX4, it was instead a previously used backbone (16) that is more similar to the pcDNA3.1 backbone.
All deletion mutants were made with the Q5 Site-Directed Mutagenesis Kit (NEB E0554). The HA-CIC::DUX4 +E and +E.2 mutants were made with NEB HiFi (NEB E2621). The HA-CIC::DUX4 +linker+E.2 construct was made with the Q5 Site-Directed Mutagenesis Kit (NEB E0554), with a (GGSG)x3 linker whose sequence was derived from iGEM, part number BBa_K243006.
The amino acid coordinates for specific mutants and functional domains that were mutated are listed below, with the syntax of…
(mutant/domain name): [first_residue]-[last_residue], (specific coding sequence), {notes on the source of domain annotation or mutant identity}
… where the number of the residues refers to their position in the fusion coding sequence. The first residue is the initiating methionine at the start of the 3x HA-linker motif. Fully annotated SnapGene or GenBank files of the coding sequences for all five fusions are available at https://github.com/cuylerluck/CICfamily_models. Where applicable, the source of the domain annotation is given.
HA-CIC (truncated): M1-T1505, pCMV3-HA-CIC(ex18)::NUTM1(ex3), i.e. removes the NUTM1 fragment
HA-NUTM1 (truncated): M1-A63 + A1506-Q2632, pCMV3-HA-CIC(ex18)::NUTM1(ex3), i.e. removes the CIC fragment
CIC C1 domain: R1525-M1580, pCMV3-HA-CIC(ex20)::NUTM1(ex6), sequence originally from Forés et al 2017 (55)
CIC HMG box: I263-K331, pCMV3-HA-CIC(ex20)::NUTM1(ex6), sequence from UniProt (RRID: SCR_002380) entry Q96RK0
NUTM1 TAD region: W1645-R1738, pCMV3-HA-CIC(ex20)::NUTM1(ex6), sequence from Yu et al 2023 (35) (the ΔF1c mutant)
NUTM1 A region: P1625-D1771, pCMV3-HA-CIC(ex20)::NUTM1(ex6)
NUTM1 B region: G1772-K1921, pCMV3-HA-CIC(ex20)::NUTM1(ex6)
NUTM1 C region: Q1922-C2071, pCMV3-HA-CIC(ex20)::NUTM1(ex6)
NUTM1 D region: S2072-Q2221, pCMV3-HA-CIC(ex20)::NUTM1(ex6)
NUTM1 E region: E2222-Q2400, pCMV3-HA-CIC(ex20)::NUTM1(ex6)
NUTM1 E.1 region: E2222-E2281, pCMV3-HA-CIC(ex20)::NUTM1(ex6)
NUTM1 E.2 region: D2282-G2341, pCMV3-HA-CIC(ex20)::NUTM1(ex6)
NUTM1 E.3 region: K2342-Q2400, pCMV3-HA-CIC(ex20)::NUTM1(ex6)
LEUTX TAD1: S1743-P1751, pCMV3-HA-CIC::LEUTX, sequence from Gawriyski et al 2023 (34)
LEUTX TAD2: I1727-L1736, pCMV3-HA-CIC::LEUTX, sequence from Katayama et al 2018 (56)
ATXN1 AXH domain: S598-G729, pCMV3-HA-ATXN1::DUX4, sequence from UniProt entry P54253
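The coordinate syntax above lends itself to simple programmatic checks. The following is a minimal sketch, assuming single-range entries of the form shown (compound entries such as the truncated HA-NUTM1 mutant are not handled), that parses a subset of the listed NUTM1 regions and confirms that regions A through E, and sub-regions E.1 through E.3, are contiguous with no gaps or overlaps. The names `ENTRY_RE`, `parse_entry`, `tiles_without_gaps`, and `region_length` are hypothetical helpers introduced for illustration only.

```python
import re

# A subset of the entries listed above, copied verbatim. Residue numbers refer
# to positions in the HA-tagged fusion coding sequence.
ENTRIES = [
    "NUTM1 A region: P1625-D1771, pCMV3-HA-CIC(ex20)::NUTM1(ex6)",
    "NUTM1 B region: G1772-K1921, pCMV3-HA-CIC(ex20)::NUTM1(ex6)",
    "NUTM1 C region: Q1922-C2071, pCMV3-HA-CIC(ex20)::NUTM1(ex6)",
    "NUTM1 D region: S2072-Q2221, pCMV3-HA-CIC(ex20)::NUTM1(ex6)",
    "NUTM1 E region: E2222-Q2400, pCMV3-HA-CIC(ex20)::NUTM1(ex6)",
    "NUTM1 E.1 region: E2222-E2281, pCMV3-HA-CIC(ex20)::NUTM1(ex6)",
    "NUTM1 E.2 region: D2282-G2341, pCMV3-HA-CIC(ex20)::NUTM1(ex6)",
    "NUTM1 E.3 region: K2342-Q2400, pCMV3-HA-CIC(ex20)::NUTM1(ex6)",
]

# (name): [first_residue]-[last_residue], (coding sequence)
ENTRY_RE = re.compile(
    r"^(?P<name>[^:]+): "
    r"(?P<first_aa>[A-Z])(?P<start>\d+)-(?P<last_aa>[A-Z])(?P<end>\d+), "
    r"(?P<construct>[^,\s]+)"
)


def parse_entry(line: str) -> dict:
    """Parse one single-range coordinate entry into a dictionary."""
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    return {
        "name": match["name"],
        "start": int(match["start"]),
        "end": int(match["end"]),
        "construct": match["construct"],
    }


def region_length(region: dict) -> int:
    """Number of residues in a region (coordinates are inclusive)."""
    return region["end"] - region["start"] + 1


def tiles_without_gaps(regions: list) -> bool:
    """True if each region starts one residue after the previous one ends."""
    ordered = sorted(regions, key=lambda r: r["start"])
    return all(b["start"] == a["end"] + 1 for a, b in zip(ordered, ordered[1:]))


regions = {r["name"]: r for r in map(parse_entry, ENTRIES)}
a_to_e = [regions[f"NUTM1 {letter} region"] for letter in "ABCDE"]
e_subregions = [regions[f"NUTM1 E.{i} region"] for i in (1, 2, 3)]

print("A-E contiguous:", tiles_without_gaps(a_to_e))
print("E.1-E.3 contiguous:", tiles_without_gaps(e_subregions))
print("E region length:", region_length(regions["NUTM1 E region"]))
```

The same parser could be pointed at the remaining single-range entries, although entries with trailing notes would need the notes captured separately if they are of interest.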
Plasmid Transfections
Typically, 1.5 μg of each plasmid was reverse transfected into either 300,000 or 500,000 293T cells (depending on the experiment) in one well of a 6-well plate. Plasmid mixes were prepared with FuGENE 6 (Promega) at a 2:1 FuGENE:DNA ratio in Opti-MEM (Gibco). Transfected cells were usually used for downstream experiments approximately 48 hours after transfection.
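As a worked example of the reagent arithmetic, the sketch below computes the FuGENE 6 volume for a given plasmid mass. It assumes that the 2:1 FuGENE:DNA ratio is expressed as μL of reagent per μg of DNA, which is a common convention for this reagent but is an assumption here rather than something stated above; the function name `fugene_volume_ul` is hypothetical.

```python
def fugene_volume_ul(dna_ug: float, ratio_ul_per_ug: float = 2.0) -> float:
    """Reagent volume (uL) for a given DNA mass (ug).

    Assumes the FuGENE:DNA ratio is given in uL of reagent per ug of DNA.
    """
    if dna_ug < 0 or ratio_ul_per_ug < 0:
        raise ValueError("mass and ratio must be non-negative")
    return dna_ug * ratio_ul_per_ug


# One well of a 6-well plate with the typical 1.5 ug of plasmid.
per_well_ul = fugene_volume_ul(1.5)
print(f"FuGENE 6 per well: {per_well_ul} uL")

# Scaling the same mix to several wells (no pipetting overage included).
n_wells = 6
print(f"FuGENE 6 for {n_wells} wells: {per_well_ul * n_wells} uL")
```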
Western Blot
Western blots were performed in line with previous work (16). At the time of protein harvest, adherent cell-containing plates were placed on ice, media was aspirated, and cells were gently washed 3 times with ice-cold Dulbecco’s Phosphate-Buffered Saline (DPBS). RIPA buffer supplemented with Halt protease and phosphatase inhibitors (Thermo Fisher Scientific) was then added, and cells were mechanically disrupted with cell scrapers. Cell suspensions were then incubated on ice for at least 15 minutes before sonication and centrifugation, after which the supernatant was stored at −20°C or −80°C until analysis.
Protein lysates were quantitated with the Pierce BCA kit (Thermo Fisher Scientific), normalized, boiled, and separated by denaturing electrophoresis on Criterion TGX 4–15% gels (Bio-Rad), with Precision Plus Protein Dual Color Standards (Bio-Rad) used as ladders. Proteins were transferred to nitrocellulose membranes by the Trans-Blot Turbo system (Bio-Rad), evaluated for transfer by Ponceau S staining (Sigma-Aldrich), and blocked in 5% Bovine Serum Albumin (BSA) in Tris Buffered Saline with 0.1% Tween 20 (TBS-T) for at least 1 hour. Membranes were cut horizontally for each target of interest and incubated in primary antibody diluted in blocking buffer on a cold room rotator overnight. Separate blots from identically loaded samples run at the same time were used to visualize targets of similar sizes. Where applicable, this is described in figure legends, and these and other per-blot details and full Ponceau S loading controls are shown in Supplementary Dataset S2. The next day, blots were washed 3x for 5 minutes in TBS-T, incubated in secondary antibody diluted in blocking buffer for 1 hour on a room temperature rotator, and again washed 3x for 5 minutes in TBS-T. Blots were imaged on a Bio-Rad ChemiDoc Touch using ECL Prime reagent (Millipore Sigma) by briefly drying the blot, submerging the blot in ECL Prime mixture, dabbing excess solution off, and imaging. When required, brightness/contrast was adjusted either on the ChemiDoc or using Image Lab (Bio-Rad). Please note that while HSP90 is shown for most blots as a presence/absence loading control, it was not optimized for quantitation. For better analysis of quantitation between lanes, please refer to the matched Ponceau S stains in Supplementary Dataset S2.
Antibodies and dilutions or concentrations used were: anti-HA clone C29F4 (Cell Signaling Technology 3724, 1:1000–1:5000, RRID:AB_1549585), anti-HSP90 (Cell Signaling Technology 4874, 1:1000, RRID:AB_2121214), anti-ETV5 clone E5G9V (Cell Signaling Technology 16274, 1:1000, RRID:AB_3717441), anti-FOXB1 (Invitrogen PA5–28134, 1:1000, RRID:AB_2545610), anti-SOX2 (Cell Signaling Technology 2748, 1:1000, RRID:AB_823640), anti-FOXC2 (Proteintech 23066–1-AP, 1:1000, RRID:AB_2879204), anti-FOXD3 (Abcam ab64807, 1:1000, RRID:AB_2105420), anti-FOXG1 (Abcam ab18259, 1 μg/mL, RRID:AB_732415), anti-PARP (Cell Signaling Technology 9542, RRID:AB_2160739), and horseradish peroxidase-linked anti-Rabbit IgG (Cell Signaling Technology 7074, 1:3000, RRID:AB_2099233).
Immunofluorescence
Immunofluorescence was performed similarly to previous work (16). 293T cells were transfected as described above with the appropriate constructs. The next day, transfected cells were transferred onto poly-L-lysine (0.01%, Sigma-Aldrich P4707–50mL) coated coverslips and allowed to adhere overnight. Then, coverslips were briefly and gently rinsed once with DPBS, fixed with 4% paraformaldehyde in PBS at 37°C for 10 minutes, washed 1x with DPBS, quenched with 100 mM glycine at RT for 30 minutes, washed 2x with DPBS, and permeabilized with 0.2% Triton X-100 for 15 minutes at RT. Coverslips were then washed 3x for 5 minutes with DPBS and blocked in 2% BSA in DPBS for one hour at RT. Next, coverslips were inverted onto solutions of primary antibody diluted in blocking buffer for 1–2 hours at RT and then washed 3x for 5 minutes with DPBS. Coverslips were then again inverted onto solutions containing secondary antibody, DAPI (Invitrogen D1306, 1–5 mg/mL, diluted 1:1000–1:2000), and Rhodamine-Phalloidin (Invitrogen R415, 1:400) in blocking buffer for one hour in the dark. After a final set of 3× 5 minute DPBS washes, coverslips were mounted onto slides with Prolong Glass mounting media (Invitrogen P36984) and allowed to cure overnight at RT in the dark. Cells from the CIC::NUTM1 experiment were imaged with a Yokogawa CSU-X1 spinning disk confocal on a Nikon Ti-E microscope equipped with a Photometrics cMYO cooled CCD camera using a 60x/1.45NA lens (Nikon) with NIS-Elements AR (v5.21.03, Nikon) software at consistent exposure times and laser powers. Cells from the ATXN1::DUX4 experiment were imaged with a ZEISS Axio Imager 2 with ZEISS ZEN 2 (blue edition, version 2.0.0.0) software and a 40x/1.4 Plan-Apochromat oil objective (ZEISS) at consistent exposure times and laser powers. Images were processed using FIJI/ImageJ (https://github.com/fiji/fiji). Briefly, multiple planes were converted to single images by maximum intensity projection, and brightness/contrast was consistently adjusted between all images.
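For readers unfamiliar with the operation, maximum intensity projection keeps, for each pixel position, the brightest value found across all focal planes. The following is a minimal standard-library sketch of that idea on a toy z-stack; it is purely illustrative and is not the FIJI/ImageJ routine used for the images in this study. The function name and the example values are hypothetical.

```python
def max_intensity_projection(stack):
    """Collapse a z-stack into one 2D image by taking the per-pixel maximum.

    `stack` is a list of planes; each plane is a list of rows of pixel
    intensities, and all planes are assumed to share the same dimensions.
    """
    if not stack:
        raise ValueError("stack must contain at least one plane")
    # zip(*stack) groups the rows at the same y position across planes;
    # zip(*rows) then groups the pixels at the same x position across planes.
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*stack)]


# Toy 2 x 2 image acquired at two focal planes (made-up intensities).
plane_0 = [[1, 5],
           [0, 2]]
plane_1 = [[3, 4],
           [7, 1]]

projection = max_intensity_projection([plane_0, plane_1])
print(projection)  # [[3, 5], [7, 2]]
```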
For the CIC::NUTM1 experiment, the primary antibody and dilution used was: anti-HA-tag clone C29F4 (Cell Signaling Technology 3724, dilution 1:300, RRID:AB_1549585). The accompanying secondary antibody and dilution used was: anti-Rabbit IgG Alexa 647 (Invitrogen A-21245, dilution 1:300, RRID:AB_2535813). For the ATXN1::DUX4 experiments, the primary antibodies used were: anti-CIC (Invitrogen PA5–83721, dilution 1:200, RRID:AB_2790874) and anti-HA-tag clone 6E2 with conjugated Alexa 488 (Cell Signaling Technology 2350, dilution 1:1000, RRID:AB_491023). The accompanying secondary antibody and dilution used was: anti-Rabbit IgG Alexa 647 (Invitrogen A-21245, dilution 1:500, RRID:AB_2535813).
RNA Sequencing and Analysis
The RNA sequencing data in this study were generated from three separate experiments, which are designated by the month and year they were executed. These designations delineate which supplemental files correspond to their respective experiments. The experiments and the plasmid transfections they included were:
April 2023 – EV, HA-CIC::DUX4, and HA-CIC(ex20)::NUTM1(ex6). These data are shown in Figure 3, and the processed edgeR comparisons for each fusion versus empty vector control are in Supplementary Datasets S3–S4.
May 2024 – EV, HA-CIC::DUX4, HA-CIC(ex20)::NUTM1(ex6), HA-CIC(ex20)::NUTM1(ex6) dTAD, HA-CIC(ex20)::NUTM1(ex6) E, HA-CIC(ex18)::NUTM1(ex3), HA-CIC::LEUTX. These data are shown in Figures 5 and 6, and Supplementary Figure S9, and the processed edgeR comparisons for each full-length fusion versus empty vector control are in Supplementary Datasets S5–S8. The lists of E ignorer and E responder genes are in Supplementary Datasets S9–S10.
February 2025 – EV, HA-ATXN1::DUX4, HA-ATXN1::DUX4 delAXH. These data are shown in Figure 7 and Supplementary Figure S11, and the processed edgeR comparisons for each fusion versus empty vector control are in Supplementary Datasets S11–S13.
Figure 3: Differential gene expression analysis identifies FOXB1 as a CIC::NUTM1 fusion specific target.

(A) edgeR multi-dimensional scaling plot of RNA-seq data from 293T cells 48 hours after transfection with empty vector (EV) or the indicated fusion-expressing constructs.
(B) Row-scaled RNA-seq expression heatmap of 464 significantly upregulated genes (log2 fold change > 1.5 and q-value < 0.001) in at least one of the fusion-transfected 293T conditions vs EV.
(C) log2 counts per million plots of the indicated genes in 293T cells 48 hours after transfection with empty vector (EV) or the indicated fusion-expressing constructs. Per edgeR quasi-likelihood F-test results, ETV1, ETV4, ETV5, and FOXC2 were significantly activated (log2 fold change > 1.5 and q-value < 0.001) in both the HA-CIC::DUX4 and HA-CIC(ex20)::NUTM1(ex6) conditions vs EV. However, FOXD3, FOXB1, and SOX2 were only significantly activated in the HA-CIC(ex20)::NUTM1(ex6) condition vs EV. In the HA-CIC(ex20)::NUTM1(ex6) vs EV comparison, FOXG1 met the significance threshold but fell slightly under the log2 fold change threshold (value: 1.17); it was nonetheless included due to its role in brain development. FOXG1 did not meet either significance cutoff in HA-CIC::DUX4 vs EV. q-values from the respective edgeR results are indicated for specific comparisons. Individual data points reflect one of three replicates, and bars indicate mean values.
(D) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or the indicated constructs, representative of the same three independent experiments as in Figure 2A but with a different replicate shown. The following sets of blots were all aggregated from replicate identically loaded, simultaneously processed samples: SOX2 and FOXC2; HA, ETV5, and FOXB1; HSP90 and FOXG1; FOXD3.
(E) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV), full-length HA-CIC(ex18)::NUTM1(ex3), or truncated versions of HA-CIC(ex18)::NUTM1(ex3) that are comprised only of the HA tag motif and the fractional coding sequence of the indicated partner, split at the breakpoint. Representative of the same three independent experiments as in Figure 2C but with a different replicate shown. The HSP90 and FOXB1 blots were aggregated with HA and ETV5 blots from replicate identically loaded, simultaneously processed samples.
(F) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or one of three versions of HA-CIC(ex20)::NUTM1(ex6): full-length (FL), CIC C1 domain deleted (dC1), or CIC C1 domain and HMG box deleted (dC1 + dHMG). Representative of the same three independent experiments as in Figure 2D but with a different replicate shown.
Figure 5: The NUTM1 E region permits activation of a gene program beyond core CIC/CIC::DUX4 target genes.

(A) Row-scaled RNA-seq expression heatmap of 577 genes significantly activated (log2 fold change > 2 and q-value < 0.00001) in the HA-CIC(ex20)::NUTM1(ex6) transfected 293T cells vs EV comparison. Values are shown for cells transfected with EV, full-length HA-CIC(ex20)::NUTM1(ex6) (FL), and HA-CIC(ex20)::NUTM1(ex6) with the E region deleted (E). E ignorers and responders as defined in panel B are shaded in on the left.
(B) Description of how the E ignorer and E responder gene sets were defined. FL and E-deleted refer to the HA-CIC(ex20)::NUTM1(ex6) full-length or E region deletion constructs, respectively. For both gene sets, significant activation in the FL vs empty vector (EV) comparison used cutoffs of log2 fold change > 2 and q-value < 0.00001. For the E ignorer gene set, significant activation in the E-deleted vs EV comparison was also defined as log2 fold change > 2 and q-value < 0.00001. For the E responder gene set, no significant change in the E-deleted vs EV comparison was decided using stricter cutoffs of an absolute value log2 fold change < 1, or a q-value > 0.01, or both.
(C) Selected gene set enrichment analysis output using gProfiler2 on the E responder and E ignorer gene sets.
(D) Depiction of where the E responder and E ignorer gene sets fall on RNA-seq volcano plots of comparisons between 293T transfected with the indicated fusion-expressing constructs vs. empty vector (EV). The percentages indicate what proportion of the specified gene set meets the thresholds for being significantly activated in a given volcano plot (log2 fold change > 2 and q-value < 0.00001).
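The gene set definitions described in panel B can be expressed as a small decision rule. The sketch below is a hedged illustration of that logic using the stated cutoffs; the function names and the example values are hypothetical, and it does not reproduce the original edgeR-based analysis code.

```python
from typing import Optional

# Cutoffs stated in the legend for significant activation vs empty vector (EV).
ACTIVATION_LOG2FC = 2.0
ACTIVATION_Q = 1e-5
# Cutoffs stated in the legend for "no significant change" vs EV.
UNCHANGED_ABS_LOG2FC = 1.0
UNCHANGED_Q = 0.01


def is_activated(log2fc: float, q: float) -> bool:
    """Significantly activated: log2 fold change > 2 and q-value < 0.00001."""
    return log2fc > ACTIVATION_LOG2FC and q < ACTIVATION_Q


def is_unchanged(log2fc: float, q: float) -> bool:
    """No significant change: |log2 fold change| < 1, or q-value > 0.01, or both."""
    return abs(log2fc) < UNCHANGED_ABS_LOG2FC or q > UNCHANGED_Q


def classify_gene(fl_log2fc: float, fl_q: float,
                  del_log2fc: float, del_q: float) -> Optional[str]:
    """Assign a gene to a set from its FL vs EV and E-deleted vs EV statistics.

    Returns None for genes that are not activated by the full-length fusion,
    or that fall between the two definitions in the E-deleted comparison.
    """
    if not is_activated(fl_log2fc, fl_q):
        return None
    if is_activated(del_log2fc, del_q):
        return "E ignorer"
    if is_unchanged(del_log2fc, del_q):
        return "E responder"
    return None


# Made-up statistics for three hypothetical genes.
print(classify_gene(4.0, 1e-8, 3.5, 1e-7))  # activated with or without E
print(classify_gene(4.0, 1e-8, 0.3, 0.5))   # activation lost without E
print(classify_gene(4.0, 1e-8, 1.5, 1e-4))  # intermediate, left unassigned
```

Note that, under these definitions, a gene activated by the full-length fusion can be left unassigned when its behavior in the E-deleted comparison is intermediate between the two sets.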
Figure 6: CIC::LEUTX is a relatively weak activator of CIC target genes in a manner dependent on its two C-terminal transactivation sequences.

(A) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or constructs expressing the indicated fusion oncoproteins. The experiment was performed once but replicated in a similar experiment in Supplementary Figure S10.
(B) Immunoblot of 293T cells 48 hours after transfection with empty vector (EV) or constructs expressing the following HA-CIC::LEUTX sequences: full-length (FL), dTAD1, and dTAD1&2, where the dTAD mutants have deletion of the indicated LEUTX C-terminal amino acids. Representative of three independent experiments. The HA and ETV5 blots were aggregated with an HSP90 blot from replicate identically loaded, simultaneously processed samples.
(C) Row-scaled RNA-seq expression heatmap of 326 genes significantly activated (log2 fold change > 2 and q-value < 0.00001) in at least one of the HA-CIC::DUX4 or HA-CIC::LEUTX transfected 293T cells vs EV comparisons.
(D) Volcano plot of RNA-seq expression data from 293T cells 48 hours after transfection with empty vector (EV) or a construct encoding HA-CIC::LEUTX. Select CIC/CIC::DUX4/CIC::NUTM1 target genes are indicated.
Figure 7: ATXN1::DUX4 activates select CIC target genes in an AXH domain-dependent manner.

(A) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or constructs expressing the full-length (FL) or AXH domain deleted (delAXH) versions of HA-ATXN1::DUX4. One representative replicate of three independent experiments is shown. The HA and ETV5 blots were aggregated with an HSP90 blot from replicate identically loaded, simultaneously processed samples.
(B) Immunofluorescence of 293T cells approximately 48 hours after transfection with empty vector (EV) or the indicated constructs and stained with DAPI (DNA), rhodamine-phalloidin (Actin), an anti-HA antibody, or an anti-CIC antibody. Scale bars indicate approximately 10 μm. Representative cells are shown from one of two experiments, with three fields imaged for each fusion-bearing condition in each experiment.
(C) edgeR multi-dimensional scaling plot of RNA-seq data from 293T cells 48 hours after transfection with empty vector (EV) or the indicated fusion-expressing constructs.
(D) Row-scaled RNA-seq expression heatmap of 57 genes significantly activated (log2 fold change > 1 and q-value < 0.001) in at least one of the indicated comparisons. Values are shown for cells transfected with EV, HA-ATXN1::DUX4, and AXH-deleted HA-ATXN1::DUX4 (delAXH).
(E) Volcano plot of RNA-seq expression data from 293T cells approximately 48 hours after transfection with the indicated constructs.
All experiments were performed in 293T cells, with plasmids separately reverse transfected as described above into either 300,000 (April 2023, May 2024) or 500,000 (February 2025) 293T cells per transfection. For each experiment, three replicate sets of transfections were performed on different days. After approximately 48 hours, RNA was harvested from transfected cells with the RNeasy Mini kit (Qiagen) including an on-column DNase digest (Qiagen). The April 2023 experiment also included an extra, short Buffer RPE wash. The May 2024 and February 2025 experiments were then additionally purified with the Monarch Spin RNA Cleanup Kit (NEB). RNA from all experiments was checked for integrity with a TapeStation (Agilent) by either ourselves, Novogene, or both, with RIN values of 9.5 or higher. RNA was then submitted to Novogene for library preparation and sequencing as described previously (16). Samples were polyA enriched and libraries were generated using the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (NEB). Libraries were PE150 sequenced on a NovaSeq 6000 (April 2023) or NovaSeq X Plus (May 2024, February 2025). At least 6 Gb of data (20 million paired-end 150 bp reads) were generated for all samples.
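As a quick consistency check on the stated sequencing depth, the relationship between read pairs and total bases can be sketched as follows. This is an illustrative Python snippet, not part of the authors' pipeline, and the function name is hypothetical.

```python
def paired_end_bases(read_pairs: int, read_length: int) -> int:
    """Total sequenced bases for a paired-end run (two reads per pair)."""
    return read_pairs * 2 * read_length

# 20 million read pairs sequenced as PE150
total_bases = paired_end_bases(20_000_000, 150)
print(total_bases / 1e9)  # reported in gigabases
```

Under these assumptions, 20 million PE150 read pairs correspond to the 6 Gb minimum quoted above.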
The April 2023 experiment initially included an HA-CIC(ex18)::NUTM1(ex3) condition, which was excluded due to poor QC metrics in one of the replicates and not deposited in the NCBI Gene Expression Omnibus (GEO, RRID: SCR_005012) (57). FASTQ files for all other data showed no concerning quality indications per both the Novogene report and our own FastQC analysis (RRID: SCR_014583) (58). Full code describing the processing and analysis of the data from each experiment is available at https://github.com/cuylerluck/CICfamily_models. Processing and analysis were similar to previous work (16). Briefly, for a given experiment, FASTQ files were aligned to the human GRCh38 genome with STAR (RRID: SCR_004463) (59), including quantitation of gene counts using the option --quantMode GeneCounts. Uniquely mapped read rates were between 87% and 94% for samples across all experiments. Custom R scripts (version 4.2.2) (60–70), also available at the same GitHub repository as above, were used to separately perform differential gene expression analysis for each experiment. For a given experiment, column 4 from the STAR GeneCounts output was merged between all samples. Then, edgeR (version 3.40.2, RRID: SCR_012802) (71–74) was used for differential expression analysis with a GLM method and quasi-likelihood F-tests, with the resulting P-values then FDR corrected. Log2(counts per million) values were extracted from the trimmed mean of M-values (TMM)-normalized datasets using the cpm function. Where applicable, gene list functional profiling was performed using the gProfiler2 (75) package with multiple testing correction using the built-in gSCS method. The third replicate of the HA-ATXN1::DUX4 sample from the February 2025 experiment was found to be an outlier on an MDS plot, so we excluded the whole replicate and analyzed those data as duplicates using replicates 1 and 2. Consequently, the third replicate data were not deposited in GEO.
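The count-merging step described above can be illustrated with a minimal Python sketch. It is not the authors' R code (which is available in the repository cited above). It assumes the standard STAR ReadsPerGene.out.tab layout, in which column 4 holds reverse-stranded counts and the leading summary rows begin with "N_". The function names and the toy gene and sample names are hypothetical.

```python
import io

def read_star_counts(handle, column=4):
    """Parse one STAR ReadsPerGene.out.tab file into {gene_id: count}.

    Columns are 1-indexed: 1 = gene ID, 2 = unstranded counts,
    3 = forward-stranded counts, 4 = reverse-stranded counts.
    Summary rows (N_unmapped, N_multimapping, ...) are skipped.
    """
    counts = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < column or fields[0].startswith("N_"):
            continue
        counts[fields[0]] = int(fields[column - 1])
    return counts

def merge_counts(per_sample):
    """Merge {sample: {gene: count}} into {gene: [counts in sample order]}."""
    samples = list(per_sample)
    genes = sorted(set.intersection(*(set(c) for c in per_sample.values())))
    return samples, {g: [per_sample[s][g] for s in samples] for g in genes}

# Tiny made-up inputs standing in for two ReadsPerGene.out.tab files
ev = io.StringIO("N_unmapped\t10\t10\t10\nGENE_A\t5\t0\t5\nGENE_B\t9\t1\t8\n")
fl = io.StringIO("N_unmapped\t12\t12\t12\nGENE_A\t50\t2\t48\nGENE_B\t9\t0\t9\n")
samples, merged = merge_counts({"EV_rep1": read_star_counts(ev),
                                "FL_rep1": read_star_counts(fl)})
print(samples, merged)
```

The merged table produced this way is the kind of gene-by-sample count matrix that a differential expression package such as edgeR takes as input.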
For all analyzed data, differential expression results and lists of the E-ignorer and E-responder genes are available in Supplementary Datasets S3–S13. The raw FASTQ files, log2(cpm) data, and counts data have been deposited in NCBI GEO (57) and are accessible through GEO Series accession numbers GSE295624, GSE295623, and GSE295625 for the April 2023, May 2024, and February 2025 datasets, respectively.
CIC::DUX4 and CIC::NUTM1 Patient RNA-seq Analysis
The RNA-seq data used for this analysis were described in Watson et al. J. Pathol 2018 (20) and are housed in the European Genome-Phenome Archive as dataset EGAD00001003121. The code we used to process and analyze these data is available at https://github.com/cuylerluck/CICfamily_models. Briefly, FASTQ files were aligned to the GRCh38 human genome and gene counts were simultaneously generated using STAR. Three of the samples were sequenced on a different sequencer model than the other eight and showed markedly lower unique mapping rates (around 76% vs the low 90s on average), so we then batch-corrected raw gene counts by sequencer model using ComBat-seq (76), with 3’ fusion partner (DUX4 or NUTM1) as a preserved group of interest. Batch-corrected counts were then processed with a standard edgeR differential expression analysis, with an exact test used to generate p-values followed by FDR correction. Counts per million values were extracted from the TMM-normalized datasets using the cpm function.
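To make the counts-per-million step concrete, a simplified Python sketch is given below. It is an assumption-laden illustration rather than the edgeR implementation. It scales raw counts by an effective library size (library size multiplied by a TMM-style normalization factor), and it does not compute the TMM factor itself. The function name and example counts are hypothetical.

```python
def counts_per_million(counts, norm_factor=1.0):
    """CPM for one sample using an effective library size.

    counts: {gene: raw count}; norm_factor: a TMM-style scaling factor
    (1.0 means plain library-size scaling).
    """
    effective_lib_size = sum(counts.values()) * norm_factor
    return {g: c / effective_lib_size * 1e6 for g, c in counts.items()}

sample = {"GENE_A": 250, "GENE_B": 750}  # made-up counts
print(counts_per_million(sample))
```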
Zebrafish Husbandry
Zebrafish (Danio rerio) were maintained in adherence to established industry standards within an AAALAC-accredited facility. WIK wild-type fish were sourced from the Zebrafish International Resource Center (ZIRC; https://zebrafish.org). All experiments using zebrafish were done under a protocol approved by the Children’s Hospital Los Angeles Institutional Animal Care and Use Committee.
Zebrafish Microinjections
Zebrafish embryos were injected at the single-cell stage with 2 nL of a mixture containing 50 ng/μL Tol2 transposase mRNA, 30 ng/μL CIC::NUTM1 expression construct, 0.1% phenol red, and 0.3× Danieau’s buffer.
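The per-embryo dose implied by this mixture follows from simple unit arithmetic, sketched here in Python for illustration. The function name is hypothetical, and the calculation assumes the full 2 nL is delivered.

```python
def injected_mass_pg(concentration_ng_per_ul: float, volume_nl: float) -> float:
    """Mass delivered per embryo in picograms (1 ng/uL equals 1 pg/nL)."""
    return concentration_ng_per_ul * volume_nl

print(injected_mass_pg(50, 2))  # Tol2 transposase mRNA
print(injected_mass_pg(30, 2))  # CIC::NUTM1 expression construct
```

Under these assumptions, each embryo receives roughly 100 pg of transposase mRNA and 60 pg of plasmid DNA.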
Tumor Monitoring and Collection
Injected embryos were screened at 3 days post-fertilization (dpf) under a Leica fluorescence stereomicroscope to identify individuals expressing BFP, confirming transgene integration and expression. A total of 39 BFP-positive embryos injected with CIC(ex20)::NUTM1(ex6) expression vector and 46 embryos injected with CIC(ex18)::NUTM1(ex3) were selected and raised for tumor formation monitoring. Tumor incidence was recorded and analyzed using Kaplan–Meier survival curves in GraphPad Prism 10.6.1.
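For readers unfamiliar with the estimator behind these curves, a minimal Kaplan–Meier sketch in Python follows. The analysis in this study was performed in GraphPad Prism, so this is only an illustration of the underlying calculation. The function name and the toy follow-up data are hypothetical and are not values from this work.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the tumor-free fraction.

    times: follow-up time per fish; events: True if a tumor was observed
    at that time, False if the fish was censored.
    Returns [(time, fraction still tumor-free)] at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e)
        n_total = sum(1 for ti in times if ti == t)
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_total  # events and censored fish both leave the risk set
    return curve

# Made-up toy data (days), not values from this study
times = [20, 25, 25, 40, 60, 60]
events = [True, True, False, True, False, False]
print(kaplan_meier(times, events))
```

Fish censored at a given time are still counted as at risk for events occurring at that same time, which is the usual convention.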
Fish developing visible tumors were imaged and humanely euthanized, then fixed in 4% paraformaldehyde (PFA) in 1× phosphate-buffered saline (PBS) for 48 hours at 4°C. Samples were subsequently decalcified in 0.5 M EDTA (pH 8.0) for 5 days and processed for paraffin embedding and sectioning. Hematoxylin and eosin (H&E) staining was performed on deparaffinized sections for histopathological evaluation.
Propidium Iodide Cell Cycle Analysis by Flow Cytometry
293T cells (500,000 or 300,000 per condition, depending on the replicate) were reverse transfected with 1.5 μg of the appropriate plasmid as described above and incubated for approximately 48 hours. Following incubation, transfected cells were trypsinized, centrifuged, and counted, and approximately 1 million cells per condition were aliquoted to a new conical for centrifugation and media aspiration before slow resuspension in ice-cold 70% ethanol under gentle vortexing. Samples were stored at −20°C for several days prior to staining and analysis.
On the day of flow cytometry analysis, samples were removed to ice and cells were pelleted prior to washing with DPBS. Samples were then again pelleted and washed once with flow cytometry buffer (DPBS + 2% FBS + 1 mM EDTA). Samples were then pelleted and resuspended in propidium iodide staining buffer (Cell Signaling Technology 4087) at a volume of 500 μL per 1 million cells. An untransfected, unstained sample was included as a negative control lacking propidium iodide signal. Samples were stained for 15 minutes at room temperature shielded from light, and then were kept on ice for analysis following staining. Flow cytometry analysis was performed on a BD LSRFortessa X-20 SORP, and singlet cells were measured for propidium iodide signal using a 561 nm laser with a 600 nm long-pass filter and a 610/20 nm bandpass filter. Further data analysis was performed using FlowJo v10 (BD, RRID:SCR_008520). The built-in Watson (pragmatic) algorithm was used to determine cell cycle staging. Between 9,700 and 10,200 cells per condition were analyzed for histogram plotting and cell cycle staging.
Data Availability Statement
Plasmids encoding CIC::DUX4, CIC(ex20)::NUTM1(ex6), CIC(ex18)::NUTM1(ex3), HA-CIC::LEUTX, and HA-ATXN1::DUX4 are available on Addgene as plasmids 247361–247365. Other constructs described in this paper are available upon request.
RNA-seq data that we generated have been deposited at GEO: GSE295623, GEO: GSE295624, and GEO: GSE295625 and are publicly available as of the date of publication.
RNA-seq data used for the patient-derived CIC::DUX4 vs CIC::NUTM1 analysis were previously published (20) and are housed in the European Genome-Phenome Archive as dataset EGAD00001003121.
Uncropped western blot images and Ponceau S loading control images are available in Supplementary Dataset S2.
All original code is available at https://github.com/cuylerluck/CICfamily_models and is publicly available as of the time of publication.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
Results
Generation of synthetic, patient-informed CIC-family fusions
The absence of molecular models for CIC-family fusions aside from CIC::DUX4 poses a major roadblock to answering basic mechanistic questions about how different CIC-family fusions function. Direct cloning out of patient samples was not feasible due to the rarity of these tumors, and gene synthesis was not ideal due to the long and complex coding sequences of these fusions. Instead, we chose to leverage the fact that several studies have published trans-breakpoint Sanger sequencing of CIC::NUTM1, CIC::LEUTX, and ATXN1::DUX4 identified in patients. This information allows for cloning fusion coding sequences from existing plasmids that contain the coding sequences for individual partner genes via a modified Gibson assembly approach (NEBuilder HiFi DNA Assembly). Using such a strategy, we chose four reported breakpoints (20,26,39) and used them to generate four synthetic, patient-informed, in-frame N-terminally HA-tagged CIC-family fusions in a CMV-driven transfection backbone: HA-CIC(ex18)::NUTM1(ex3), HA-CIC(ex20)::NUTM1(ex6), HA-CIC::LEUTX, and HA-ATXN1::DUX4 (Figure 1, Supplementary Figures S1–S2, further sequence details are available in the Methods section). These constructs, in concert with an existing HA-CIC::DUX4 plasmid (14), allow for expression of various CIC-family fusions in cells.
Figure 1: Structure of plasmids encoding CIC-family fusions.

(A) To-scale representation of key regulatory and coding sequences in the original HA-CIC::DUX4 plasmid (14) and the four synthetic CIC-family fusions. All plasmids are driven by a CMV enhancer-promoter region (“CMV Enh-Prom”) though the exact sequence varies slightly between the HA-CIC::DUX4 plasmid and the others, owing to slightly different backbones (a more complete description is available in the Methods). Shaded regions for the LEUTX and NUTM1 fragments indicate previously identified or nominated transactivating motifs that interact with p300/CBP. The shaded region for ATXN1 indicates the AXH domain, which mediates interaction with CIC. The CIC HMG-box (dark gray) and C1 (light gray) DNA-binding domains are indicated.
For CIC::NUTM1, we cloned two versions due to our prior interest in the CIC C1 domain, which works with the CIC HMG box to bind DNA (55). We previously found that this functional domain is retained in almost all CIC::DUX4 patient fusions, and is necessary for maximal transcriptional activity of the fusion oncoprotein (16). However, we also noted that in our previous breakpoint dataset (16), the majority of CIC::NUTM1 sequences identified in patients did not retain the C1 domain. This led us to clone two CIC::NUTM1 constructs based on two independent patient breakpoints: one that does retain the C1 domain (CIC(ex20)::NUTM1(ex6)) and one that does not (CIC(ex18)::NUTM1(ex3)) (Figure 1).
Molecular and functional characterization of the CIC::NUTM1 fusion oncoprotein
To characterize the behavior of these fusion oncoproteins we chose to perform cell-based experiments using the 293T cell line. Derivatives of HEK293 cells, including 293T, are a frequent model of choice in structure-function studies of fusion oncogenes (33,35,77–80) and have the advantage of enabling rapid isogenic comparisons while being easy to work with. Additionally, HEK293 derivatives express wild type CIC and consequently do not express high levels of many known CIC target genes, yielding an easily measurable increase in CIC target gene expression when wild type CIC is perturbed or when CIC-fusions are overexpressed (9,16). Indeed, transient transfection of 293T with a construct encoding CIC::DUX4 leads to transcriptional activation of many high-confidence CIC::DUX4 target genes which have been identified through other studies (81,82) (Supplementary Figure S3). When used as described in this manuscript, transient transfection of plasmids expressing CIC-fusions into 293T cells seems not to cause overt short-term toxicity as evaluated by a lack of PARP cleavage, and seems not to markedly change cell cycle stage as measured by propidium iodide staining (Supplementary Figures S4–S5).
When overexpressed in 293T cells, both CIC::NUTM1 isoforms markedly increased ETV5 expression, a known CIC (9,55,83) and CIC::DUX4 (14) target gene, at levels comparable to that seen by CIC::DUX4, and they both demonstrated nuclear localization (Figure 2A and 2B). We tested whether CIC::NUTM1 is a bona-fide fusion that requires both partner genes for activity, as opposed to acting as a truncation mutant of one or both partners, by independently expressing the fractional partner genes and comparing their ability to upregulate ETV5 relative to the full-length (FL) fusion. These studies indicate that while the FL CIC::NUTM1 fusion could induce ETV5 expression, neither truncated partner alone was sufficient, suggesting that CIC::NUTM1 is dependent on both partners to upregulate ETV5 (Figure 2C).
We unexpectedly observed that both CIC::NUTM1 isoforms could strongly activate ETV5 (Figure 2A) despite CIC(ex18)::NUTM1(ex3) lacking the C1 domain. This was surprising since we previously found that when the CIC::DUX4 fusion lacked the C1 domain its ability to transcriptionally upregulate ETV5 was strongly impaired (16), but directly comparing the two CIC::NUTM1 isoforms could be confounded by differences in fusion protein expression via plasmid transfection. Thus, we directly tested the necessity of the C1 domain for CIC::NUTM1 to activate ETV5 by deleting it alone or together with the HMG box from CIC(ex20)::NUTM1(ex6), and found that loss of the C1 domain alone had a mild impact on ETV5 induction, while loss of the C1 domain and HMG box together abrogated expression (Figure 2D).
To directly test whether CIC::NUTM1 is capable of driving tumorigenesis in vivo we took advantage of zebrafish mosaic transgenesis via the Tol2 transposon approach, which has been established for modelling several types of sarcomas including CIC::DUX4-driven sarcoma (81,84,85). Constructs were cloned to express mTagBFP-P2A-CIC::NUTM1 for both the CIC(ex20)::NUTM1(ex6) and CIC(ex18)::NUTM1(ex3) isoforms to allow fluorescent labeling of CIC::NUTM1-expressing cells via an unfused BFP reporter (Figure 2E). Constructs were injected into single cell-stage zebrafish embryos and fish were screened for BFP expression prior to monitoring for tumor development. Fish expressing CIC(ex20)::NUTM1(ex6) rapidly developed BFP positive tumors, with visible lesions within several weeks, while rare tumors developed significantly slower in the CIC(ex18)::NUTM1(ex3) group (Figure 2F–2G). Fish tumors that developed from either isoform demonstrated small blue round cell morphology, consistent with the prior CIC::DUX4 zebrafish model (81) and with the histology observed in patient CIC::NUTM1 tumors (19,86) (Figure 2H, Supplementary Figure S6A–S6D). Together, these results indicate that CIC::NUTM1 is a bona-fide fusion oncogene capable of activating the CIC target gene ETV5 and driving tumorigenesis in vivo.
Differential target gene regulation by CIC::NUTM1 and CIC::DUX4
The observed clinical differences between CIC::NUTM1 and CIC::DUX4 suggested that these fusions may potentially regulate unique gene targets. To evaluate this, we performed bulk RNA-sequencing (RNA-seq) of 293T cells expressing either CIC::DUX4 or CIC(ex20)::NUTM1(ex6). Our data indicated primary separation on a multidimensional scaling plot based on the presence or absence of the fusion oncoprotein, but we additionally observed a clear secondary separation between CIC::DUX4 and CIC(ex20)::NUTM1(ex6) expressing cells (Figure 3A). Indeed, a heatmap of all genes significantly upregulated in at least one of the fusion-transfected conditions vs. control showed that while many genes were significantly increased by both CIC::DUX4 and CIC(ex20)::NUTM1(ex6), there were gene clusters that were differentially regulated by one fusion relative to the other (Figure 3B).
To further investigate which genes were preferentially activated by CIC(ex20)::NUTM1(ex6) vs CIC::DUX4, we first validated that expression of the canonical CIC targets ETV1, ETV4, and ETV5 was strongly upregulated by both fusions, regardless of the 3’ partner gene (Figure 3C). Next, we noted upregulation of multiple forkhead box TF family members by CIC::DUX4 and CIC(ex20)::NUTM1(ex6), which are broadly implicated in development. Of these genes, we were particularly struck by the presence of FOXD3, FOXB1, and FOXG1, which have been previously implicated in the development of various neural structures and cell types (87). Importantly, both FOXD3 and FOXB1 were specifically upregulated in the CIC(ex20)::NUTM1(ex6) condition but not in the CIC::DUX4 expressing cells, and FOXG1 met the significance threshold in the CIC(ex20)::NUTM1(ex6) condition but was slightly under the log2 fold change cutoff (while meeting neither cutoff in the CIC::DUX4 condition) (Figure 3C). These were striking to us because of the predilection for CIC::NUTM1 tumors to present in the CNS and spine of patients, whereas CIC::DUX4 tumors typically arise in the soft tissue. We also noted the shared upregulation of FOXC2 (which is involved in the epithelial to mesenchymal transition (88)) by both fusions, as well as the upregulation of the stemness factor SOX2 specifically in the CIC(ex20)::NUTM1(ex6) condition (Figure 3C).
Of these genes, we validated that FOXB1 protein levels were specifically elevated following CIC::NUTM1 but not CIC::DUX4 expression (Figure 3D). We then revisited our structure-function experiments and noted that upregulation of FOXB1 is observed following expression of FL CIC(ex18)::NUTM1(ex3), but not its individual truncated partners (Figure 3E). We also observed that like ETV5, FOXB1 protein levels were mildly impaired when the C1 domain was deleted from CIC(ex20)::NUTM1(ex6) and lost upon deletion of both CIC DNA binding domains (Figure 3F).
To investigate this finding beyond our 293T system, we then performed our own analysis of the only RNA-seq dataset of patient tumor samples including both CIC::DUX4 and CIC::NUTM1 (20). Raw gene counts from tumor samples indicated heterogeneity of FOXB1 expression in samples with either fusion, with some samples having zero or few reads while others had several thousand (Supplementary Figure S7A). After batch-correction for sequencer model and normalization, CIC::NUTM1 samples on average had over 10 times more FOXB1 expression than CIC::DUX4 samples did, but this difference did not reach statistical significance (q-value 0.126) and actual expression levels varied drastically between patients (Supplementary Figures S7B and S7C). This could not be explained by the CIC::NUTM1 samples generally having higher activation of CIC target genes compared to the CIC::DUX4 samples, as exemplified by the canonical CIC targets ETV1/4/5 (Supplementary Figures S7C and S7D). In summary, these data indicate that FOXB1 transcript and protein levels are preferentially elevated by CIC::NUTM1 compared to CIC::DUX4 expression, potentially including in tumor samples, and provide rationale to further explore the role of FOXB1-mediated neurogenic development of CIC::NUTM1 sarcomas.
Defining the mechanism of CIC::NUTM1-mediated transcriptional activation
Since CIC::DUX4 is thought to gain activating capacity through its interaction with p300 and since NUTM1 is a known p300 interactor (89), we hypothesized that p300 recruitment was mediated by the addition of NUTM1 to CIC in the context of CIC::NUTM1. Prior studies of BRD3/4::NUTM1 fusions have localized the minimal region within NUTM1 that binds to p300 to a pair of transactivating domains (TADs) within a roughly 100 amino acid region (35,36). Importantly, this region is retained in both of the CIC::NUTM1 constructs that we engineered (Figure 1, TAD domain). To test if this region is necessary for the activation of CIC target genes by CIC::NUTM1, we genetically deleted this domain from both of our CIC::NUTM1 constructs (termed dTAD mutants) and evaluated their ability to regulate ETV5 levels. We observed that even following TAD domain deletion, both CIC::NUTM1 versions still displayed moderate activation of ETV5 (Figure 4A). This suggested that while p300 does play a role in the activation of ETV5 by CIC::NUTM1, there may be alternative mechanisms that facilitate a CIC::NUTM1-mediated increase in ETV5 levels.
Figure 4: Genetic dissection of the CIC::NUTM1 fusion oncoprotein reveals a key NUTM1 domain that is necessary and sufficient to drive FOXB1 expression.

(A) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or one of the indicated versions of HA-CIC(ex20)::NUTM1(ex6) or HA-CIC(ex18)::NUTM1(ex3): full-length (FL) or with the p300-interacting domain deleted (dTAD). Representative of three independent experiments. The HSP90 blot was aggregated with HA and ETV5 blots from replicate identically loaded, simultaneously processed samples.
(B) Diagram of the strategy for the CIC(ex20)::NUTM1(ex6) deletion screen. Note that the dTAD mutant has a deletion that is a subset of that for the deletion A mutant.
(C) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or one of the indicated versions of HA-CIC(ex20)::NUTM1(ex6): full-length (FL), dTAD as in panel A, or a deletion of one of the regions depicted in panel B. Representative of two independent experiments. The HSP90, HA, and FOXB1 blots were aggregated with an ETV5 blot from replicate identically loaded, simultaneously processed samples.
(D) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or one of the indicated versions of HA-CIC(ex20)::NUTM1(ex6): full-length (FL), the E deletion from panels B and C, or one of three smaller approximately 60 amino acid sub deletions within the E region, described in the Methods section. Representative of three independent experiments. The HSP90 and FOXB1 blots were aggregated with HA and ETV5 blots from replicate identically loaded, simultaneously processed samples.
(E) Immunoblot of 293T cells approximately 48 hours after transfection with empty vector (EV) or one of the indicated versions of HA-CIC::DUX4: full-length (FL), or with the E region or E.2 region from panels B-D cloned onto the C-terminus of the HA-CIC::DUX4 coding sequence. Representative of three independent experiments. The HSP90, HA, and FOXB1 blots were aggregated with an ETV5 blot from replicate identically loaded, simultaneously processed samples.
A C-terminal domain of NUTM1 is necessary and sufficient to drive FOXB1 activation by CIC(ex20)::NUTM1(ex6) or CIC::DUX4
We considered two explanations for the residual increase in ETV5 levels: first, that the remaining NUTM1 moiety could provide dominant negative activity to de-repress ETV5 in a manner not seen with truncated CIC alone, or second, the possible presence of an unidentified NUTM1 functional domain that contributes to ETV5 activation. We chose to test the second of these hypotheses by undertaking a deletion screen of the NUTM1 fragment of CIC(ex20)::NUTM1(ex6) that deleted five non-overlapping roughly 150 amino acid stretches of the NUTM1 fractional coding sequence (Figure 4B). We expressed these in 293T cells and used expression of both ETV5 and FOXB1 as readouts, where ETV5 serves as a reporter on general CIC-fusion activity and FOXB1 levels indicate CIC::NUTM1-specific activity. While the deletion of region A phenocopied the dTAD mutants (as the TAD region is inside of the A deletion), deletions B, C, and D displayed no change in ETV5 and FOXB1 induction (Figure 4C). However, the deletion of the E region towards the C-terminal end of NUTM1 led to complete abrogation of FOXB1 induction while ETV5 elevation was intact (Figure 4C). Deletion of the E region also led to complete loss of FOXB1 induction and a weaker level of ETV5 activation in the CIC(ex18)::NUTM1(ex3) construct (Supplementary Figure S8A). Co-deletion of the E region with the TAD region of NUTM1 did not completely eliminate ETV5 activation by CIC(ex20)::NUTM1(ex6) (Supplementary Figure S8B), indicating this region is not responsible for the residual ETV5 activation observed without the p300-interacting domain of NUTM1.
A further sub-screen of the E region using smaller ~60 amino acid non-overlapping deletions indicated that the minimal sequence necessary for this phenotype was likely in the middle of the domain (E.2 mutant), possibly with some contribution from the very C-terminal sequence (E.3 mutant) (Figure 4D). To test if this moiety was sufficient to confer FOXB1 activation to CIC::DUX4, which we previously observed not to strongly activate FOXB1, we then cloned the E region or the E.2 subregion onto the C-terminus of CIC::DUX4 and evaluated ETV5 and FOXB1 expression. We observed a moderate increase in FOXB1 activation only by the broader E region (Figure 4E), with the E.2 subregion not sufficient to elevate FOXB1 levels on its own or with a short flexible linker sequence (Figure 4E, Supplementary Figure S8C). These results support the identification of the E region as sufficient to activate FOXB1 by CIC::NUTM1 in a manner potentially uncoupled from the fusion’s ability to activate ETV5 expression.
The NUTM1 E region permits activation of a gene program beyond core CIC/CIC::DUX4 target genes
To determine if this phenomenon might help to explain the differential regulation of genes beyond FOXB1 by CIC::NUTM1 vs CIC::DUX4, we performed another bulk RNA-seq experiment in 293T cells expressing several CIC-family fusions as well as the dTAD and E-deletion mutants of CIC(ex20)::NUTM1(ex6). Multidimensional scaling analysis reproduced our previous observation that CIC::NUTM1 samples clustered separately from CIC::DUX4 samples (Supplementary Figure S9A). Of the two CIC::NUTM1 isoforms, the CIC(ex18)::NUTM1(ex3) samples clustered between the empty vector control and the CIC(ex20)::NUTM1(ex6) samples, suggesting overall weaker activation of target genes by the CIC(ex18)::NUTM1(ex3) isoform compared to the CIC(ex20)::NUTM1(ex6) isoform. This was in line with our zebrafish in vivo data indicating slower tumor formation by CIC(ex18)::NUTM1(ex3) (Figure 2F). We also observed that while the TAD deletion generally seemed to make CIC(ex20)::NUTM1(ex6) samples weaker (i.e. closer to empty vector control), the E-deletion samples (“delE”) still clustered far from control but were now located much closer to the CIC::DUX4 samples (Supplementary Figure S9A).
Heatmap analysis of genes upregulated in the FL CIC(ex20)::NUTM1(ex6) condition revealed two groups of genes: those that were activated regardless of whether or not the E region was present (termed “E ignorers”), and those whose activation was dependent on the presence of the E region (termed “E responders”) (Figures 5A and 5B). We defined lists of these genes (Figure 5B, Supplementary Datasets S9 and S10) and employed gene set enrichment analysis with gProfiler2 (75) (Figure 5C). Among the terms enriched for each group, we noted that those describing general development were significant for both groups, while regulation of ERK1/2 signaling was a hallmark of the E ignorer genes, consistent with prior knowledge that CIC and CIC::DUX4 (which do not have the E region) regulate the MAPK cascade (8,90). We additionally noted several terms related to CNS development that were specifically enriched for the E responder group, suggesting their activation was associated with the presence of the E region.
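The gene set definitions summarized in Figure 5B can be expressed as a short decision rule. The Python sketch below uses the cutoffs stated in the Figure 5 legend. The function name and example values are hypothetical, and the authors' actual implementation is the R code in their repository.

```python
def classify_gene(fl_lfc, fl_q, del_lfc, del_q):
    """Return 'E ignorer', 'E responder', or None for one gene.

    fl_*: log2 fold change and q-value for FL vs EV;
    del_*: the same statistics for E-deleted vs EV.
    """
    def activated(lfc, q):
        return lfc > 2 and q < 0.00001

    if not activated(fl_lfc, fl_q):
        return None  # not activated by the full-length fusion
    if activated(del_lfc, del_q):
        return "E ignorer"
    if abs(del_lfc) < 1 or del_q > 0.01:
        return "E responder"
    return None  # activated by FL but fits neither definition

print(classify_gene(5.0, 1e-20, 4.5, 1e-15))
print(classify_gene(5.0, 1e-20, 0.3, 0.4))
print(classify_gene(5.0, 1e-20, 1.5, 1e-4))
```

Note that under these definitions a gene activated by the full-length fusion can fall into neither set when its response to the E-deleted construct is intermediate, as in the third call.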
We next mapped the expression of these genes to the CIC(ex18)::NUTM1(ex3) samples (Figure 5D), and observed that while in general fewer of them met the significance threshold for activation (consistent with an overall weaker fusion compared to CIC(ex20)::NUTM1(ex6)), a similar proportion of both E ignorers (35%) and E responders (24%) were significantly activated in these samples. However, in the CIC::DUX4 samples, 90% of the E ignorer genes were significantly activated while just 14% of E responder genes met the same criteria (Figure 5D). This suggested that the E ignorer genes are likely a hallmark of both CIC::DUX4 and CIC::NUTM1 gene signatures, while the E responder genes are specifically biased towards CIC::NUTM1 expression due to the NUTM1 E region. Comparison of the dTAD mutant with the E-deletion mutant of CIC(ex20)::NUTM1(ex6) further demonstrated that TAD deletion generally led to a reduction of target gene expression, while loss of the E region retained high expression levels of many genes including those known to be CIC and CIC::DUX4 targets (Supplementary Figures S9B–S9D). Together, these results support the notion that the NUTM1 E region enables CIC::NUTM1 to activate a gene program separate from core CIC and CIC::DUX4 target genes, which may begin to explain the divergent behavior between CIC::NUTM1- and CIC::DUX4-bearing tumors.
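The percentages reported for each gene set amount to counting how many of its members pass both activation cutoffs in a given fusion vs EV comparison. A minimal Python sketch of that calculation follows. The function name, the data layout, and the example values are assumptions made for illustration only.

```python
def fraction_activated(gene_set, results, lfc_cutoff=2.0, q_cutoff=0.00001):
    """Fraction of gene_set passing both cutoffs in one comparison.

    results: {gene: (log2 fold change, q-value)} for a fusion vs EV comparison.
    Genes absent from results are counted as not activated.
    """
    hits = sum(
        1 for g in gene_set
        if g in results and results[g][0] > lfc_cutoff and results[g][1] < q_cutoff
    )
    return hits / len(gene_set)

# Made-up values for illustration only
results = {"G1": (3.2, 1e-9), "G2": (0.4, 0.5), "G3": (2.5, 1e-3), "G4": (6.0, 1e-30)}
print(fraction_activated(["G1", "G2", "G3", "G4"], results))
```

Applying the same function to the E ignorer and E responder lists for each comparison would yield proportions analogous to those shown in Figure 5D.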
CIC::LEUTX is a relatively weak activator of CIC target genes in a manner dependent on its two C-terminal transactivation sequences
We next turned to CIC::LEUTX, which in our synthetic patient-informed construct comprises almost the entire CIC coding sequence with a small LEUTX fragment at the C-terminus, similar to CIC::DUX4 (Figure 1). Expression of this construct led to a milder ETV5 activation compared to the response seen from CIC::DUX4 or CIC::NUTM1, with no accompanying increase in FOXB1 expression (Figure 6A, Supplementary Figure S10A). RNA-seq data similarly showed that CIC::LEUTX was relatively weak compared to the other FL fusions (i.e. clustered closer to the empty vector control samples, Supplementary Figure S9A).
WT LEUTX has been shown to interact with p300/CBP through at least one of two short transactivating motifs towards its C-terminus (34,56), both of which are retained in our construct. We deleted one or both of these motifs and tested the mutants for their ability to activate ETV5, which indicated that loss of the SSLNQYLFP motif alone (previously suggested to abrogate the p300/CBP-LEUTX interaction (34)) is not sufficient to block ETV5 induction (Figure 6B). However, deletion of both motifs almost entirely eliminated ETV5 induction, suggesting that CIC::LEUTX does indeed activate ETV5 in a manner dependent on these motifs and the recruitment of p300/CBP.
We compared the transcriptional activation signature driven by CIC::LEUTX to that generated by CIC::DUX4 using our RNA-seq data and noted that CIC::LEUTX drove the activation of a smaller set of genes than CIC::DUX4, but still significantly increased the expression of several core CIC target genes (Figure 6C–D). Taken together, these data indicate that our CIC::LEUTX fusion operates as an attenuated version of CIC::DUX4 that is capable of activating core CIC target genes through two C-terminal transactivation motifs, likely in a p300/CBP-dependent manner.
ATXN1::DUX4 activates the expression of select CIC target genes in a manner dependent on the AXH domain
Finally, we aimed to investigate ATXN1::DUX4, which is interesting because it is thought to indirectly activate CIC target genes due to the physical interaction between ATXN1 and CIC, which is mediated by the ATXN1 AXH domain (39). If proven true, ATXN1/ATXN1L fusions would be the first example of fusions that drive tumors with characteristics of CIC-rearrangements without directly altering the CIC locus. Notably, the construct that we generated was based on a breakpoint that does retain the ATXN1 AXH domain (Figure 1), making this mechanism testable with our model. We expressed both the FL ATXN1::DUX4 and a mutant lacking the AXH domain and observed that ETV5 levels were increased only in the context of the FL fusion (Figure 7A).
To test if the ATXN1::DUX4 fusion impacts endogenous CIC localization, we performed immunofluorescence microscopy on 293T cells transfected with the full-length or AXH-deleted versions of ATXN1::DUX4. Interestingly, regardless of the presence of the AXH domain we observed some cells with puncta of ATXN1::DUX4 that appeared to co-localize with CIC, whereas in cells not expressing the fusion, CIC was diffuse in the nucleus (Figure 7B). However, we also noted that some transfected cells clearly expressed ATXN1::DUX4 but did not show these puncta, again regardless of the presence or absence of the AXH domain. Across all cells expressing either version of HA-ATXN1::DUX4, about two-thirds displayed punctate HA-tag signal (67.9%, 300 out of 442 HA+ cells quantified). It is difficult to determine if these puncta are biologically relevant or artifacts related to overexpression. However, we have not observed puncta formation when using similar constructs to overexpress CIC::DUX4 or CIC::NUTM1 in our current study (Figure 2B) or in our prior work (16), suggesting this is a phenomenon specific to the ATXN1::DUX4 fusion.
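As a worked check of the puncta quantification, the reported counts (300 punctate cells out of 442 HA+ cells) can be converted to a proportion with a few lines of Python. The Wilson score interval shown alongside it is our illustrative addition, assuming cells can be treated as independent observations; it is not a statistic reported in this study, and the function and variable names are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the text: punctate HA+ cells out of all HA+ cells quantified.
punctate, ha_positive = 300, 442
fraction = punctate / ha_positive
low, high = wilson_interval(punctate, ha_positive)
print(f"{fraction:.1%} punctate (approx. 95% Wilson interval: {low:.1%} to {high:.1%})")
```

Because the interval assumes independent cells, it would likely be too narrow if cells from the same field or transfection are correlated; per-replicate proportions would be a more conservative summary.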
Regardless of its impact on co-localization with CIC, deletion of the AXH domain clearly impacted ETV5 activation (Figure 7A), which led us to perform RNA-seq on 293T cells expressing the full-length and AXH-deleted ATXN1::DUX4 constructs. We observed that samples generally segregated by whether ATXN1::DUX4 was present or not, with separation on the second dimension based on the presence or absence of the AXH domain (Figure 7C). Heatmap analysis showed surprisingly few genes whose activation was significantly lost when the AXH domain was deleted (Figure 7D), and a volcano plot revealed that all five of these genes are known or likely CIC target genes (9,91) (Figure 7E). We also noticed activation of ZSCAN4, a known DUX4 target gene (92), following expression of either form of ATXN1::DUX4. This is likely because in this particular fusion the DUX4 breakpoint is positioned early in the coding sequence, allowing for retention of the second of two DUX4 DNA-binding homeobox domains (Supplementary Figure S11A–S11C). These data strongly suggest that the ATXN1::DUX4 fusion does indeed activate select CIC target genes in a manner at least partially dependent on the AXH domain.
Discussion
Of the plethora of fusion oncogene-driven cancers that arise in patients, the majority are rare entities that will only affect a small number of individuals each year. Within these families of rare tumors, it can be even more difficult to identify patients harboring variants of promiscuous fusions with varying partner genes which may alter their activity. While learning about the minute differences between fusions with differing partner gene composition can be biologically and clinically informative, the scarcity of patient samples makes them difficult to study through the traditional approaches of cloning from tumor tissue. In this study, we used an alternative approach of cloning synthetic, patient-informed fusions designed from real breakpoint sequences to study convergent and divergent biology among CIC-family fusion oncogenes.
While limited clinical and transcriptional data have suggested that CIC::NUTM1 may somewhat diverge in its functional capabilities from CIC::DUX4, we provide here the first evidence that they differ in functional domain requirements and transcriptional programs. We validated that CIC::NUTM1 fusions activate known CIC target genes and are capable of driving tumorigenesis in vivo. We further followed up on our prior observation that CIC::NUTM1 breakpoints often exclude the C1 domain (16) to show that these fusions indeed seem less reliant on the C1 domain, differing from CIC::DUX4. We then localized a poorly defined 150 amino acid C-terminal domain of NUTM1 that, through an as-yet-unknown mechanism, confers the ability of CIC::NUTM1 to activate a gene program including FOXB1. This program is largely not driven by CIC::DUX4 and appears to be enriched for genes involved in neural development, suggesting the possibility that this neomorphic behavior may in part explain why CIC::NUTM1 tumors have preferential tropism for the CNS and spine, though this is beyond the scope of our study. We anticipate that future studies will be able to use these newly engineered molecular tools to further understand how CIC::NUTM1 may leverage FOXB1 or its broader gene program to mediate tumor development. In particular, dissection of where CIC::DUX4, CIC::NUTM1, and other CIC-family fusions bind to DNA and how they alter its accessibility will be invaluable to identify mechanistic differences between these fusion oncoproteins.
The E region falls on a part of the NUTM1 protein where few annotated functional domains exist. It appears to contain both putative nuclear localization and nuclear export sequences (93), and closely abuts a defined but unstudied “TAD2” region (94). It also overlaps a Uniprot annotation for a long disordered region at the C-terminus of NUTM1 (Uniprot Q86Y26). We noted that in addition to its effects on the transcriptional program exerted by CIC::NUTM1, the deletion of the E region appeared to consistently increase fusion protein expression (Figure 4C–D, Supplementary Figure S8A–S8B), and its addition to CIC::DUX4 decreased expression of those fusions (Figure 4E, Supplementary Figure S8C). We tend to be conservative about making conclusions on protein stability using plasmid models because our system does not precisely control expression levels, but the reproducibility of this phenomenon across several mutants and fusion constructs suggests the E region may impact protein stability or expression.
We think at least one of three mechanisms is likely to explain how the E region allows for the activation of new target genes in CIC::NUTM1. The first is that it may alter the DNA binding properties of CIC::NUTM1 compared to those of CIC::DUX4, allowing fusion binding at new sites and activation of nearby genes. The second is that the E region may amplify overall transactivation by CIC::NUTM1, increasing the expression of genes that are normally only very weakly regulated by CIC::DUX4. The third is that it may recruit different protein complexes to the fusion and drive additional chromatin remodeling separate from p300/CBP activity, allowing for the induction of new genes not activated by CIC::DUX4. Plausibly, these alternative protein complexes may be specifically functional in the nervous system, shaping CIC::NUTM1 tropism. Regardless of the mechanism, perhaps the most interesting future questions will be whether or not the gene program driven by the E region is reproducible in more patient-relevant models and is sufficient to explain the CNS and spine tropism observed in CIC::NUTM1 tumors. Answering these questions will require improvement and refinement of our current models and the discovery of the cell-of-origin for these tumors.
Given the prevailing hypothesis for how CIC-fusions mechanistically drive transcription, we also took the opportunity to study the role of p300/CBP in activation of CIC target genes using our CIC::NUTM1 and CIC::LEUTX constructs. While we found that CIC::LEUTX does appear to be dependent on its two C-terminal LEUTX transactivation domains to activate ETV5, likely through recruitment of p300/CBP, we observed a substantial increase in known CIC target genes by CIC::NUTM1 even when its putative p300-recruiting domain was deleted. The mechanism explaining this behavior in CIC::NUTM1 is not immediately clear, as it is not explained by the E region and a truncating CIC mutation did not result in the same level of ETV5 activation. One potential explanation is that the addition of a bulky NUTM1 moiety, even if incapable of recruiting p300/CBP, is sufficient to generate a stronger dominant-negative mutant and increase CIC target gene expression. Deciphering this mechanism is worth pursuing because recent work has nominated using p300/CBP inhibitors and degraders in CIC::DUX4-bearing tumors to block target gene activation (18,95). While our data suggest this approach may also be useful in blocking CIC::LEUTX-mediated signaling, it might not prove efficacious in CIC::NUTM1-bearing tumors, which may partially operate in a p300/CBP-independent manner to activate CIC targets.
The recent rise in reports of tumors bearing fusions of ATXN1 or ATXN1L with 3’ partner genes including DUX4, NUTM1, and NUTM2A has been of special interest because ATXN1 and ATXN1L are known to functionally interact with CIC. The observation that ATXN1/ATXN1L-fused tumors tend to cluster with CIC-rearranged tumors by methylation profiling has raised the idea that these fusions likely activate CIC target genes by colocalizing a p300-interacting domain to CIC target sites (39). Our data suggest ATXN1::DUX4 is indeed capable of activating certain CIC target genes in a manner at least partially dependent on the AXH domain, consistent with a model of indirect CIC-target gene activation. Additionally, our results indicate that ATXN1::DUX4 may be able to form nuclear puncta which include CIC, but the mechanism and biological relevance remain unclear. Future studies are necessary to confirm that activation of CIC target genes is required to drive oncogenesis in ATXN1-rearranged tumors, since it remains possible that other as-yet-undefined functions of ATXN1::DUX4 underlie tumorigenesis.
The primary limitation of this work is that our experiments were largely conducted in 293T cells, which are likely not entirely biologically relevant to patient-derived tumors. However, we consistently observed the proper regulation of key transcriptional responses matching those in tumors. Moreover, we largely performed structure-function experiments, which are unlikely to be cell type dependent. The main area where caution should be applied is in extending our results to explain tumorigenesis, for example whether FOXB1 and the E region may impact the anatomical location of CIC::NUTM1 tumors. While we can hypothesize about such a relationship, our data do not permit any conclusions of this nature and answering such questions will require more robust models.
While our constructs are based on breakpoints derived from actual patients, they are not cloned directly out of patient tumors and thus are potentially subject to caveats based on the sequences used to clone them. To be transparent, we have included complete details about the sequences used for cloning and what (if any) mutations they have compared to reference partner gene sequences in the Materials and Methods section.
Finally, our approach of transiently transfecting plasmids into 293T cells and evaluating response after a short period of time (typically 48 hours) enables rapid experimentation but has several limitations. First, despite being driven by similar promoter-enhancer elements, our constructs do not tightly regulate expression levels and thus there may be some impacts on readouts like target gene activation that could be dependent on transgene expression levels. Where relevant, we have tried to only draw conclusions that seem not to be affected by this (e.g. CIC::LEUTX expresses at higher protein levels than other CIC-fusions, but yields lower activation of ETV5) or we designed experiments such that mutants were compared to FL constructs of the same fusion. However, caution should be taken when directly comparing readouts from different fusions to each other, or when comparing mutants which have clear expression differences. Second, the short timeframe at which we typically assayed for results could mean that some of our outputs (particularly gene programs) could be relevant at short timescales but not longer ones. For example, while we found some patient-derived data potentially suggesting FOXB1 upregulation in CIC::NUTM1 vs CIC::DUX4-bearing tumors, expression of FOXD3 and FOXG1 was limited in the same samples. This could imply a limitation of using 293T cells, or it could suggest that these genes play a transient role in tumorigenesis. Addressing this caveat will require the development of stable, longer-term models that sustain expression of these fusions over long time periods and permit time-based study of tumor development. Third, a related issue is that we often use a strong promoter-enhancer element to drive expression of these fusions, which may not directly relate to endogenous expression levels observed in tumors. Thus, future studies are required to refine our initial interpretations and overcome these limitations.
We chose to generate synthetic, patient-informed constructs of understudied CIC-family fusions to enable rapid structure-function studies and to overcome a barrier for studying these ultra-rare tumor types. While their synthetic nature and our choice to largely use them in 293T cells make them far from perfect models, they faithfully recapitulate known transcriptional signatures and are easily mutated to investigate functional domain requirements. In this study, we used these tools to identify a new functional domain of NUTM1 that influences CIC::NUTM1 activity, characterize CIC::LEUTX as a weak p300-dependent activator of CIC target genes, and provide evidence for the indirect-fusion theory of CIC target gene activation by ATXN1::DUX4. We view these tools as first-generation constructs that allow us as a community to start exploring fusion promiscuity in CIC-rearranged tumors. We thus fully intend to distribute these constructs as a key resource (Addgene) and invite the scientific community to partake in future studies that utilize this CIC fusion molecular tool kit. We hope that these constructs will stimulate active investigation into the basic mechanisms that underlie these understudied CIC-family fusion members and provide a foundation for the development of future CIC-rearranged model systems.
Supplementary Material
Acknowledgements
C. Luck acknowledges funding from the University of California, San Francisco BMS Graduate Program Training Grant T32GM136547-01 (National Institute of General Medical Sciences), the University of California, San Francisco Discovery Fellows Program, and an NCI F31 (CA287493). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. K.A. Jacobs acknowledges funding from Tobacco-Related Disease Research Program Predoctoral Fellowship T33DT6442. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under grant no. 2038436 (to K.A. Jacobs). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. R.A. Okimoto acknowledges funding from an NCI grant (R37 CA255453), the Children’s Cancer Research Fund, and Cookies for Kids’ Cancer. J.F. Amatruda was supported by grants P30CA014089 and U54CA231649 from NIH. We want to thank Takuro Nakamura, Chris French, Huda Zoghbi, Stephen Tapscott, Leonard Zon, and Michael Davidson for sharing their plasmids either directly or through Addgene. We also want to thank Sarah Watson, Franck Tirode, and colleagues for depositing and allowing access to the data used for the patient-derived CIC::DUX4 vs CIC::NUTM1 RNA-seq analysis. We thank Hillary Mahon and ACF staff for zebrafish care, and the Children’s Hospital Los Angeles (CHLA) Translational Pathology Core, supported by the USC Norris Comprehensive Cancer Center grant P30CA014089 from the National Institutes of Health (NIH) for histology services. We also thank the UCSF Parnassus Flow CoLab (RRID:SCR_018206) for assistance generating flow cytometry data, supported in part by the DRC Center Grant NIH P30 DK063720. C. Luck wishes to extend special thanks to Kevin Shannon, Alejandro Sweet-Cordero, and Dave Toczyski for their support and input during the development of this project. We are particularly grateful to the patients in the original studies that guided design of our constructs, who chose to donate their samples for scientific research, and the authors of those studies for sharing their data via their publications.
Footnotes
Conflict of Interest Disclosure Statement
C.L. has been recently employed with Genentech, who played no role in this study. The remaining authors declare no potential conflicts of interest.
References
- 1.Antonescu CR, Dal Cin P. Promiscuous genes involved in recurrent chromosomal translocations in soft tissue tumours. Pathology. 2014;46:105–12.
- 2.Marcelis L, Folpe AL. “Putting the cart before the horse”: an update on promiscuous gene fusions in soft tissue tumors. Virchows Arch. Springer Berlin Heidelberg; 2025;
- 3.Stegmaier S, Poremba C, Schaefer K, Leuschner I, Kazanowska B, Békássy AN, et al. Prognostic value of PAX–FKHR fusion status in alveolar rhabdomyosarcoma: A report from the cooperative soft tissue sarcoma study group (CWS). Pediatr Blood Cancer. 2011;57:406–14.
- 4.Sorensen PHB, Lynch JC, Qualman SJ, Tirabosco R, Lim JF, Maurer HM, et al. PAX3-FKHR and PAX7-FKHR gene fusions are prognostic indicators in alveolar rhabdomyosarcoma: A report from the Children’s Oncology Group. J Clin Oncol. 2002;20:2672–9.
- 5.Raze T, Lapouble E, Lacour B, Guissou S, Defachelles A, Gaspar N, et al. PAX–FOXO1 fusion status in children and adolescents with alveolar rhabdomyosarcoma: Impact on clinical, pathological, and survival features. Pediatr Blood Cancer. 2023;70.
- 6.Manceau L, Albert JR, Lollini PL, Greenberg MVC, Gilardi-Hebenstreit P, Ribes V. Divergent transcriptional and transforming properties of PAX3-FOXO1 and PAX7-FOXO1 paralogs. PLoS Genet. 2022;18:1–23.
- 7.Jiménez G, Guichet A, Ephrussi A, Casanova J. Relief of gene repression by Torso RTK signaling: Role of capicua in Drosophila terminal and dorsoventral patterning. Genes Dev. 2000;14:224–31.
- 8.Weissmann S, Cloos PA, Sidoli S, Jensen ON, Pollard S, Helin K. The tumor suppressor CIC directly regulates MAPK pathway genes via histone deacetylation. Cancer Res. 2018;78:4114–25.
- 9.LeBlanc VG, Firme M, Song J, Chan SY, Lee MH, Yip S, et al. Comparative transcriptome analysis of isogenic cell line models and primary cancers links capicua (CIC) loss to activation of the MAPK signalling cascade. J Pathol. 2017;242:206–20.
- 10.Simón-Carrasco L, Graña O, Salmón M, Jacob HKC, Gutierrez A, Jiménez G, et al. Inactivation of Capicua in adult mice causes T-cell lymphoblastic lymphoma. Genes Dev. 2017;31:1456–68.
- 11.Okimoto RA, Breitenbuecher F, Olivas VR, Wu W, Gini B, Hofree M, et al. Inactivation of Capicua drives cancer metastasis. Nat Genet. 2017;49:87–96.
- 12.Bunda S, Heir P, Metcalf J, Li ASC, Agnihotri S, Pusch S, et al. CIC protein instability contributes to tumorigenesis in glioblastoma. Nat Commun. Springer US; 2019;10.
- 13.Yang R, Chen LH, Hansen LJ, Carpenter AB, Moure CJ, Liu H, et al. Cic loss promotes gliomagenesis via aberrant neural stem cell proliferation and differentiation. Cancer Res. 2017;77:6097–108.
- 14.Kawamura-Saito M, Yamazaki Y, Kaneko K, Kawaguchi N, Kanda H, Mukai H, et al. Fusion between CIC and DUX4 up-regulates PEA3 family genes in Ewing-like sarcomas with t(4;19)(q35;q13) translocation. Hum Mol Genet. 2006;15:2125–37.
- 15.Hendrickson PG, Oristian KM, Browne MR, Luo L, Ma Y, Cardona DM, et al. Spontaneous expression of the CIC::DUX4 fusion oncoprotein from a conditional allele potently drives sarcoma formation in genetically engineered mice. Oncogene. Springer US; 2024;
- 16.Luck C, Jacobs KA, Okimoto RA. The Capicua C1 Domain Is Required for Full Activity of the CIC::DUX4 Fusion Oncoprotein. Cancer Res Commun. 2024;4:3099–113.
- 17.Okimoto RA, Wu W, Nanjo S, Olivas V, Lin YK, Ponce RK, et al. CIC-DUX4 oncoprotein drives sarcoma metastasis and tumorigenesis via distinct regulatory programs. J Clin Invest. 2019;129:3401–6.
- 18.Bosnakovski D, Ener ET, Cooper MS, Gearhart MD, Knights KA, Xu NC, et al. Inactivation of the CIC-DUX4 oncogene through P300/CBP inhibition, a therapeutic approach for CIC-DUX4 sarcoma. Oncogenesis. Springer US; 2021;10:1–11.
- 19.Sturm D, Orr BA, Toprak UH, Hovestadt V, Jones DTW, Capper D, et al. New Brain Tumor Entities Emerge from Molecular Classification of CNS-PNETs. Cell. Cell Press; 2016;164:1060–72.
- 20.Watson S, Perrin V, Guillemot D, Reynaud S, Coindre JM, Karanian M, et al. Transcriptomic definition of molecular subgroups of small round cell sarcomas. J Pathol. John Wiley and Sons Ltd; 2018;245:29–40.
- 21.Yang S, Liu LL, Yan Y, Jiang L, Han S, Shen D, et al. CIC-NUTM1 Sarcomas Affecting the Spine: A Subset of CIC-Rearranged Sarcomas Commonly Present in the Axial Skeleton. Arch Pathol Lab Med. College of American Pathologists; 2022;146:735–41.
- 22.Ma Y, Feng J, Ding D, Tian F, Zhao J. CIC–NUTM1 sarcoma in an 8-year-old female patient with a new fusion: A case report. Pediatr Blood Cancer. 2023;10–2.
- 23.Sun J, Wang L. CIC::NUTM1 sarcoma misdiagnosed as NUT carcinoma: A case report and literature review. Oral Oncol. 2024;156:1–3.
- 24.Schaefer IM, Dal Cin P, Landry LM, Fletcher CDM, Hanna GJ, French CA. CIC-NUTM1 fusion: A case which expands the spectrum of NUT-rearranged epithelioid malignancies. Genes Chromosom Cancer. Blackwell Publishing Inc.; 2018;57:446–51.
- 25.Sievers P, Sill M, Schrimpf D, Abdullaev Z, Donson AM, Lake JA, et al. Pediatric-type high-grade neuroepithelial tumors with CIC gene fusion share a common DNA methylation signature. npj Precis Oncol. 2023;7:30.
- 26.Huang SC, Zhang L, Sung YS, Chen CL, Kao YC, Agaram NP, et al. Recurrent CIC gene abnormalities in angiosarcomas: A molecular study of 120 cases with concurrent investigation of PLCG1, KDR, MYC, and FLT4 gene alterations. Am J Surg Pathol. 2016;40:645–55.
- 27.Linos K, Dermawan JK, Bale T, Rosenblum MK, Singer S, Tap W, et al. Expanding the Molecular Diversity of CIC-Rearranged Sarcomas With Novel and Very Rare Partners. Mod Pathol. United States & Canadian Academy of Pathology; 2023;36:100103.
- 28.Lake JA, Donson AM, Prince E, Davies KD, Nellan A, Green AL, et al. Targeted fusion analysis can aid in the classification and treatment of pediatric glioma, ependymoma, and glioneuronal tumors. Pediatr Blood Cancer. 2020;67:1–13.
- 29.Sugita S, Arai Y, Tonooka A, Hama N, Totoki Y, Fujii T, et al. A Novel CIC-FOXO4 Gene Fusion in Undifferentiated Small Round Cell Sarcoma. Am J Surg Pathol. 2014;38:1571–6.
- 30.Solomon DA, Brohl AS, Khan J, Miettinen M. Clinicopathologic features of a second patient with ewing-like sarcoma harboring CIC-FOXO4 gene fusion. Am J Surg Pathol. 2014;38:1724–5.
- 31.Brohl AS, Solomon DA, Chang W, Wang J, Song Y, Sindiri S, et al. The Genomic Landscape of the Ewing Sarcoma Family of Tumors Reveals Recurrent STAG2 Mutation. PLoS Genet. 2014;10.
- 32.Dansen TB, Smits LMM, Van Triest MH, De Keizer PLJ, Van Leenen D, Koerkamp MG, et al. Redox-sensitive cysteines bridge p300/CBP-mediated acetylation and FoxO4 activity. Nat Chem Biol. 2009;5:664–72.
- 33.Asante Y, Benischke K, Osman I, Ngo QA, Wurth J, Laubscher D, et al. PAX3-FOXO1 uses its activation domain to recruit CBP/P300 and shape RNA Pol2 cluster distribution. Nat Commun. Springer US; 2023;14.
- 34.Gawriyski L, Jouhilahti E-M, Yoshihara M, Fei L, Weltner J, Airenne TT, et al. Comprehensive characterization of the embryonic factor LEUTX. iScience. 2023;26:106172.
- 35.Yu D, Liang Y, Kim C, Jaganathan A, Ji D, Han X, et al. Structural mechanism of BRD4-NUT and p300 bipartite interaction in propagating aberrant gene transcription in chromatin in NUT carcinoma. Nat Commun. Nature Research; 2023;14.
- 36.Reynoird N, Schwartz BE, Delvecchio M, Sadoul K, Meyers D, Mukherjee C, et al. Oncogenesis by sequestration of CBP/p300 in transcriptionally inactive hyperacetylated chromatin domains. EMBO J. 2010;29:2943–52.
- 37.Wong D, Sogerer L, Lee SS, Wong V, Lum A, Levine AB, et al. TRIM25 promotes Capicua degradation independently of ERK in the absence of ATXN1L. BMC Biol. BMC Biology; 2020;18:1–17.
- 38.Lam YC, Bowman AB, Jafar-Nejad P, Lim J, Richman R, Fryer JD, et al. ATAXIN-1 Interacts with the Repressor Capicua in Its Native Complex to Cause SCA1 Neuropathology. Cell. 2006;127:1335–47.
- 39.Satomi K, Ohno M, Kubo T, Honda-Kitahara M, Matsushita Y, Ichimura K, et al. Central nervous system sarcoma with ATXN1::DUX4 fusion expands the concept of CIC-rearranged sarcoma. Genes Chromosom Cancer. 2022;61:683–8.
- 40.Pratt D, Kumar-Sinha C, Cieślik M, Mehra R, Xiao H, Shao L, et al. A novel ATXN1-DUX4 fusion expands the spectrum of ‘CIC-rearranged sarcoma’ of the CNS to include non-CIC alterations. Acta Neuropathol. Springer Berlin Heidelberg; 2021;141:619–22.
- 41.Siegfried A, Masliah-Planchon J, Roux FE, Larrieu-Ciron D, Pierron G, Nicaise Y, et al. Brain tumor with an ATXN1-NUTM1 fusion gene expands the histologic spectrum of NUTM1-rearranged neoplasia. Acta Neuropathol Commun. Acta Neuropathologica Communications; 2019;7:5–7.
- 42.Xu F, Viaene AN, Ruiz J, Schubert J, Wu J, Chen J, et al. Novel ATXN1/ATXN1L::NUTM2A fusions identified in aggressive infant sarcomas with gene expression and methylation patterns similar to CIC-rearranged sarcoma. Acta Neuropathol Commun. BioMed Central; 2022;10:1–6.
- 43.Ponce RKM, Luck C, Okimoto RA. Molecular and therapeutic advancements in Capicua (CIC)-rearranged sarcoma. Front Cell Dev Biol. 2024;12:1–11.
- 44.Tauziède-Espariat A, Ebrahimi A, Boddaert N, Pietsch T, Grajkowska W, Blau T, et al. CIC/ATXN1-rearranged tumors in the central nervous system are mainly represented by sarcomas: A comprehensive clinicopathological and epigenetic series. Brain Pathol. 2024;1–11.
- 45.Yoshimoto T, Tanaka M, Homme M, Yamazaki Y, Takazawa Y, Antonescu CR, et al. CIC-DUX4 induces small round cell sarcomas distinct from ewing sarcoma. Cancer Res. American Association for Cancer Research Inc.; 2017;77:2927–37.
- 46.Yoshimatsu Y, Noguchi R, Tsuchiya R, Kito F, Sei A, Sugaya J, et al. Establishment and characterization of NCC-CDS2-C1: a novel patient-derived cell line of CIC-DUX4 sarcoma. Hum Cell. Springer Japan; 2020;33:427–36.
- 47.Oyama R, Takahashi M, Yoshida A, Sakumoto M, Takai Y, Kito F, et al. Generation of novel patient-derived CIC-DUX4 sarcoma xenografts and cell lines. Sci Rep. Springer US; 2017;7:1–13.
- 48.Nakai S, Yamada S, Outani H, Nakai T, Yasuda N, Mae H, et al. Establishment of a novel human CIC-DUX4 sarcoma cell line, Kitra-SRS, with autocrine IGF-1R activation and metastatic potential to the lungs. Sci Rep. 2019;9:1–6.
- 49.Alekseyenko AA, Walsh EM, Zee BM, Pakozdi T, Hsi P, Lemieux ME, et al. Ectopic protein interactions within BRD4-chromatin complexes drive oncogenic megadomain formation in NUT midline carcinoma. Proc Natl Acad Sci U S A. 2017;114:E4184–92.
- 50.Kim E, Lu HC, Zoghbi HY, Song JJ. Structural basis of protein complex formation and reconfiguration by polyglutamine disease protein ataxin-1 and Capicua. Genes Dev. 2013;27:590–5.
- 51.Jagannathan S, Shadle SC, Resnick R, Snider L, Tawil RN, van der Maarel SM, et al. Model systems of DUX4 expression recapitulate the transcriptional profile of FSHD cells. Hum Mol Genet. 2016;25:4419–31.
- 52.Kendall GC, Amatruda JF. Zebrafish as a Model for the Study of Solid Malignancies. Methods Mol Biol. 2016;1451:121–42.
- 53.Mosimann C, Kaufman CK, Li P, Pugach EK, Tamplin OJ, Zon LI. Ubiquitous transgene expression and Cre-based recombination driven by the ubiquitin promoter in zebrafish. Development. 2011;138:169–77.
- 54.Kwan KM, Fujimoto E, Grabher C, Mangum BD, Hardy ME, Campbell DS, et al. The Tol2kit: A multisite gateway-based construction kit for Tol2 transposon transgenesis constructs. Dev Dyn. 2007;236:3088–99.
- 55.Forés M, Simón-Carrasco L, Ajuria L, Samper N, González-Crespo S, Drosten M, et al. A new mode of DNA binding distinguishes Capicua from other HMG-box factors and explains its mutation patterns in cancer. PLoS Genet. 2017;13:1–22.
- 56.Katayama S, Ranga V, Jouhilahti E-M, Airenne TT, Johnson MS, Mukherjee K, et al. Phylogenetic and mutational analyses of human LEUTX, a homeobox gene implicated in embryogenesis. Sci Rep. Nature Publishing Group; 2018;8:17421.
- 57.Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002;30:207–10.
- 58.Andrews S. FastQC [Internet]. 2010. Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
- 59.Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21.
- 60.R Core Team. R: A Language and Environment for Statistical Computing [Internet]. Vienna, Austria; 2022. Available from: https://www.r-project.org/
- 61.Wickham H, François R, Henry L, Müller K, Vaughan D. dplyr: A Grammar of Data Manipulation [Internet]. 2023. Available from: https://cran.r-project.org/package=dplyr
- 62.Dowle M, Srinivasan A. data.table: Extension of `data.frame` [Internet]. 2023. Available from: https://cran.r-project.org/package=data.table
- 63.Wickham H, Vaughan D, Girlich M. tidyr: Tidy Messy Data [Internet]. 2023. Available from: https://cran.r-project.org/package=tidyr
- 64.Wickham H. ggplot2: Elegant Graphics for Data Analysis [Internet]. Springer-Verlag; New York; 2016. Available from: https://ggplot2.tidyverse.org
- 65.Kolde R. pheatmap: Pretty Heatmaps [Internet]. 2019. Available from: https://cran.r-project.org/package=pheatmap
- 66.Slowikowski K. ggrepel: Automatically Position Non-Overlapping Text Labels with “ggplot2” [Internet]. 2023. Available from: https://cran.r-project.org/package=ggrepel
- 67.Pedersen TL. patchwork: The Composer of Plots [Internet]. 2022. Available from: https://cran.r-project.org/package=patchwork
- 68.Durinck S, Spellman PT, Birney E, Huber W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc. 2009;4:1184–91.
- 69.Durinck S, Moreau Y, Kasprzyk A, Davis S, De Moor B, Brazma A, et al. BioMart and Bioconductor: A powerful link between biological databases and microarray data analysis. Bioinformatics. 2005;21:3439–40.
- 70.Wilkins D. gggenes: Draw Gene Arrow Maps in “ggplot2.” 2023.
- 71.Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–40.
- 72.Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11:R25.
- 73.McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40:4288–97.
- 74.Lun ATL, Chen Y, Smyth GK. It’s DE-licious: A Recipe for Differential Expression Analyses of RNA-seq Experiments Using Quasi-Likelihood Methods in edgeR. Methods Mol Biol. 2016;1418:391–416.
- 75.Peterson H, Kolberg L, Raudvere U, Kuzmin I, Vilo J. gprofiler2 -- an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler. F1000Research. 2020;9:1–27.
- 76.Leek JT, Johnson WE, Parker HS, Fertig EJ, Jaffe AE, Zhang Y, et al. sva: Surrogate Variable Analysis. 2022.
- 77.Guo S, Hu X, Cotton JL, Ma L, Li Q, Cui J, et al. VGLL2 and TEAD1 fusion proteins identified in human sarcoma drive YAP/TAZ-independent tumorigenesis by engaging EP300. Elife. 2025;13:1–24.
- 78.Seong BKA, Dharia NV, Lin S, Donovan KA, Chong S, Robichaud A, et al. TRIM8 modulates the EWS/FLI oncoprotein to promote survival in Ewing sarcoma. Cancer Cell. 2021;39:1262–1278.e7.
- 79.Saito T, Nagai M, Ladanyi M. SYT-SSX1 and SYT-SSX2 interfere with repression of E-cadherin by snail and slug: A potential mechanism for aberrant mesenchymal to epithelial transition in human synovial sarcoma. Cancer Res. 2006;66:6919–27.
- 80.Alekseyenko AA, Walsh EM, Wang X, Grayson AR, Hsi PT, Kharchenko PV, et al. The oncogenic BRD4-NUT chromatin regulator drives aberrant transcription within large topological domains. Genes Dev. 2015;29:1507–23.
- 81.Watson S, Kendall GC, Rakheja D, McFaul ME, Draper BW, Tirode F, et al. CIC-DUX4 expression drives the development of small round cell sarcoma in transgenic zebrafish: a new model revealing a role for ETV4 in CIC-mediated sarcomagenesis. bioRxiv. 2019.
- 82.Ringnalda FCAS, van Son GJF, Verweij LHG, Kim SY, Amo-Addae V, Flucke UE, et al. Small round cell sarcoma tumoroid biobank reveals CIC::DUX4 sarcoma vulnerability to MCL-1 inhibition. Nat Commun. Springer US; 2025;16.
- 83.Dissanayake K, Toth R, Blakey J, Olsson O, Campbell DG, Prescott AR, et al. ERK/p90RSK/14-3-3 signalling has an impact on expression of PEA3 Ets transcription factors via the transcriptional repressor capicúa. Biochem J. 2011;433:515–25.
- 84.Watson S, LaVigne CA, Xu L, Surdez D, Cyrta J, Calderon D, et al. VGLL2-NCOA2 leverages developmental programs for pediatric sarcomagenesis. Cell Rep. Elsevier B.V.; 2023;42.
- 85.Leacock SW, Basse AN, Chandler GL, Kirk AM, Rakheja D, Amatruda JF. A zebrafish transgenic model of Ewing’s sarcoma reveals conserved mediators of EWS-FLI1 tumorigenesis. Dis Model Mech. 2012;5:95–106.
- 86.Le Loarer F, Pissaloux D, Watson S, Godfraind C, Galmiche-Rolland L, Silva K, et al. Clinicopathologic Features of CIC-NUTM1 Sarcomas, a New Molecular Variant of the Family of CIC-Fused Sarcomas. Am J Surg Pathol. 2018;43:268–76.
- 87.Golson ML, Kaestner KH. Fox transcription factors: From development to disease. Development. 2016;143:4558–70.
- 88.Mani SA, Yang J, Brooks M, Schwaninger G, Zhou A, Miura N, et al. Mesenchyme Forkhead 1 (FOXC2) plays a key role in metastasis and is associated with aggressive basal-like breast cancers. Proc Natl Acad Sci U S A. 2007;104:10069–74.
- 89.Shiota H, Barral S, Buchou T, Tan M, Couté Y, Charbonnier G, et al. Nut Directs p300-Dependent, Genome-Wide H4 Hyperacetylation in Male Germ Cells. Cell Rep. 2018;24:3477–3487.e6.
- 90.Lin YK, Wu W, Ponce RK, Kim JW, Okimoto RA. Negative MAPK-ERK regulation sustains CIC-DUX4 oncoprotein expression in undifferentiated sarcoma. Proc Natl Acad Sci. 2020.
- 91.Hwang I, Pan H, Yao J, Elemento O, Zheng H, Paik J. CIC is a critical regulator of neuronal differentiation. JCI Insight. 2020;5:1–16.
- 92.Geng LN, Yao Z, Snider L, Fong AP, Cech JN, Young JM, et al. DUX4 Activates Germline Genes, Retroelements, and Immune Mediators: Implications for Facioscapulohumeral Dystrophy. Dev Cell. Elsevier Inc.; 2012;22:38–51.
- 93.French CA, Ramirez CL, Kolmakova J, Hickman TT, Cameron MJ, Thyne ME, et al. BRD-NUT oncoproteins: A family of closely related nuclear proteins that block epithelial differentiation and maintain the growth of carcinoma cells. Oncogene. 2008;27:2237–42.
- 94.Eagen KP, French CA. Supercharging BRD4 with NUT in carcinoma. Oncogene. 2021;40:1396–408.
- 95.Marsh GP, Cooper MS, Goggins S, Reynolds SJ, Wheeler DF, Cresser-Brown JO, et al. Development of p300-targeting degraders with enhanced selectivity and onset of degradation. RSC Med Chem. Royal Society of Chemistry; 2025.
Data Availability Statement
Plasmids encoding CIC::DUX4, CIC(ex20)::NUTM1(ex6), CIC(ex18)::NUTM1(ex3), HA-CIC::LEUTX, and HA-ATXN1::DUX4 are available on Addgene as plasmids 247361–247365. Other constructs described in this paper are available upon request.
RNA-seq data generated in this study have been deposited in GEO under accessions GSE295623, GSE295624, and GSE295625 and are publicly available as of the date of publication.
RNA-seq data used for the patient-derived CIC::DUX4 vs CIC::NUTM1 analysis were previously published (20) and are housed in the European Genome-Phenome Archive as dataset EGAD00001003121.
Uncropped western blot images and Ponceau S loading control images are available in Supplementary Dataset S2.
All original code is publicly available at https://github.com/cuylerluck/CICfamily_models as of the time of publication.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
