Summary
Splicing factor mutations are common among cancers, recently emerging as drivers of myeloid malignancies. U2AF1 carries hotspot mutations in its RNA binding motifs; yet how they affect splicing and promote cancer remains unclear. The U2AF1/U2AF2 heterodimer is critical for 3’ splice site (3’SS) definition. To specifically unmask changes in U2AF1 function in vivo, we developed a crosslinking and immunoprecipitation procedure detecting contacts between U2AF1 and the 3’SS AG at single-nucleotide resolution. Our data reveal that U2AF1 S34F and Q157R mutants establish new 3’SS contacts at −3 and +1 nucleotides, respectively. These effects compromise U2AF2-RNA interactions, resulting predominantly in intron retention and exon exclusion. Integrating RNA binding, splicing and turnover data, we predicted that U2AF1 mutations directly affect stress granule components, corroborated by single-cell RNA-seq. Remarkably, U2AF1-mutant cell lines and patient-derived MDS/AML blasts displayed a heightened stress granule response, pointing to a novel role for biomolecular condensates in adaptive oncogenic strategies.
eTOC Blurb
Biancon et al. unmask a stress granule signature in U2AF1 mutant myeloid malignancies, via multi-omics dissection of RNA binding, splicing and turnover. They document novel mutant-specific U2AF1-RNA binding peaks at 3’ splice site positions, determining aberrant splice outcomes. U2AF1 mutant cells display enhanced stress granule formation and stress resistance.
Introduction
Somatic mutations in splicing factor (SF) genes SF3B1, SRSF2, U2AF1 and ZRSR2 function as drivers of myeloid malignancies and solid tumors such as lung adenocarcinoma (Anczuków and Krainer, 2016; Dvinge et al., 2016). They are generally mutually exclusive and determine clinical outcomes (Papaemmanuil et al., 2013; Saez et al., 2017). SF mutations occur in approximately 10% of de novo acute myeloid leukemia (AML) and reach more than 50% in myelodysplastic syndromes (MDS) and secondary AML (Kennedy and Ebert, 2017; Visconte et al., 2019; Yoshida et al., 2011). MDS and AML are clonal hematopoietic stem cell malignancies characterized by bone marrow failure and peripheral blood cytopenias due to the uncontrolled proliferation and dysfunctional maturation of myeloid blasts (Döhner et al., 2015; Steensma, 2015). Despite novel agents that lead to significant improvements in patient outcome (Hellström-Lindberg et al., 2020; Shallis et al., 2019), the 5-year relative survival rate is still only 38.3% in MDS and 28.7% in AML (https://seer.cancer.gov, cited 2021 Apr 2). We need to better understand pathogenic mechanisms in order to develop more efficient targeted therapeutics.
U2AF1 (U2 small nuclear RNA auxiliary factor 1), together with U2AF2, forms the U2AF complex that recognizes the 3’ splice site (3’SS) of U2 introns (Figure 1A) and recruits U2 small nuclear ribonucleoproteins (snRNPs) (Agrawal et al., 2016; Merendino et al., 1999; Wu et al., 1999; Zorio and Blumenthal, 1999). U2AF1 domains include a U2AF homology motif (UHM) that mediates the heterodimerization with U2AF2 (Kielkopf et al., 2001), and two conserved CCCH-type zinc fingers (ZnF1 and ZnF2) that have been shown to cooperatively bind the 3’SS intron-exon boundaries of target RNA in vitro (Yoshida et al., 2015). Heterozygous hotspot mutations affecting the S34 and Q157 residues within the ZnF1 and ZnF2 domains, respectively, result in sequence-dependent mis-splicing of genes critical for hematopoiesis (Graubert et al., 2011; Przychodzen et al., 2013; Shirai et al., 2015). It has been previously reported that the S34 mutation preferentially leads to exclusion of exons bearing a U in position −3 of the 3’SS region and to inclusion of exons bearing a C at the same position. The Q157 mutation preferentially excludes exons starting with A and includes exons starting with G in position +1 of the 3’SS region. Yet the molecular mechanisms leading to these splicing alterations and ultimately to disease are still not fully understood.
Figure 1. Distinct targets of U2AF subunits achieved through fractionated eCLIP-seq.
(A) Schematic representation of the U2AF complex on the intronic 3’SS. The presumed locations of U2AF1 and U2AF2, their relevant domains, the approximate position of U2AF1 pathological mutations and the consensus sequence of the human 3’SS are displayed. RRM, RNA recognition motif; ULM, U2AF ligand motif; UHM, U2AF homology motif; ZnF, zinc finger domain. (B) Schematic overview of eCLIP-seq vs freCLIP-seq protocol. Light Fraction: 37–65 kD, Heavy Fraction: 65–110 kD. (C) Alluvial plot showing different classes of intron-exon junctions bound by U2AF1, U2AF2 or both (color-coded), based on eCLIP-seq. Junction classes are defined by the presence of canonical AG at the intronic 3’SS, constitutive vs non-constitutive exon, and the position within the transcript. (D) Binding metaprofile (y-axis, mean±SEM of the percentage of crosslinking events) and 3’SS sequence logo for U2AF1 (top panel, n=3) and U2AF2 (bottom panel, n=4) eCLIP-seq. N, number of intron-exon junctions. (E) Binding metaprofile (mean±SEM) and 3’SS sequence logo for heavy (top panel, n=4) and light (bottom panel, n=3) fractions of U2AF1 WT freCLIP-seq. N, number of intron-exon junctions. See also Figure S1 and Table S1.
Previous studies, relying on in vitro binding assays and structural predictions, suggest that aberrant splicing may directly result from loss-of-binding, where U2AF1 mutants display lower affinity for splice junctions of excluded exons or retained introns (Fei et al., 2016; Ilagan et al., 2015; Okeyo-Owuor et al., 2015). However, recent structural studies by Yoshida et al. (Yoshida et al., 2020) show that the S34F/Y mutants do not display reduced binding affinities for 3’SS sequences that predict exon exclusion, suggesting a more complex and not yet dissected binding-splicing relationship.
To overcome these ambiguities in splicing factor driver mutations, we took a “multi-omics” approach that integrates the data reflecting the activities of WT U2AF1 alongside two cancer-associated U2AF1 mutants (S34F, Q157R). Central to this aim was our development of (fr)eCLIP-seq – for fractionated eCLIP-seq, a modification of enhanced UV-crosslinking and immunoprecipitation followed by sequencing (Van Nostrand et al., 2016) – that enabled us to dissect U2AF1 from U2AF2 RNA binding signals within the 3’SS region in vivo. Our freCLIP-seq data revealed specific changes in the binding of the two U2AF1 mutants to 3’SSs. Specifically, we observed de novo binding peaks in positions −3 for S34F and +1 for Q157R, the critical positions that determine aberrant splice outcomes. Combining this high-resolution binding data with bulk as well as single cell RNA-seq and RNA turnover data, we could show that aberrant splicing due to the Q157R mutation was predominantly associated with loss-of-binding, while aberrant splicing due to the S34F mutation followed a gain-of-binding pattern. Unexpectedly, the data pointed to the involvement of stress granules (SG) in these cancer models. We validated SG upregulation in cancer at the RNA level via single-cell (sc)RNA-seq and at the functional level via immunofluorescence (IF) imaging in primary samples derived from patients affected by U2AF1-mutant MDS/AML. Enhanced stress granule formation in U2AF1 mutant cells was associated with improved cell fitness under stress, reverted when using a stress granule inhibitor. Thus, our findings span a range of molecular phenotypes to reveal roles for biomolecular condensates and thereby major cellular principles of oncogenesis.
Results
Fractionated eCLIP-seq resolves distinct U2AF1 versus U2AF2 binding in vivo.
To investigate the role of U2AF1 mutants in myeloid malignancies, we generated HEL cell lines with doxycycline (dox)-inducible expression of FLAG-tagged WT, S34F or Q157R U2AF1 via lentiviral transduction (Yoshida et al., 2011) (Figure S1A). We induced FLAG-tagged U2AF1 proteins in GFP sorted cells for 48 hours and validated the induction via Sanger sequencing and western blot, confirming that expression levels of exogenous wild-type and mutant U2AF1 proteins were similar to endogenous levels in HEL cells (Figure S1B).
To characterize transcriptome-wide in vivo U2AF-RNA interactions, we performed U2AF1 and U2AF2 eCLIP-seq in U2AF1 WT cells via immunoprecipitation of U2AF1-RNA complexes with an anti-FLAG antibody (Van Nostrand et al., 2017) and of U2AF2-RNA complexes with an anti-U2AF2 antibody (Shao et al., 2014) (Figure 1B, Table S1). Selecting a window ranging from −30 to +5 nucleotides around the 3’SS, we identified 142,829 intron-exon junctions in 10,821 genes specifically bound by U2AF1 (teal, n=5,670), U2AF2 (pink, n=32,097) or both (grey, n=105,062) (Figure 1C). The detection of U2AF binding signal in intron-exon junctions positively correlated with gene expression levels (Figure S1C). 73.6% of the identified junctions were bound by the heterodimer, confirming the cooperative function of U2AF1 and U2AF2 at both constitutive and non-constitutive exons. The U2AF complex recognized internal and last exons with similar efficiency (54.1% and 36.2%, respectively), and also bound first exons, suggesting possible noncanonical roles of U2AF1 in translation regulation (Palangat et al., 2019). U2AF1 and U2AF2 mainly bound AG-dependent 3’SS (90.8% of the identified junctions) as previously reported (Dvinge et al., 2016; Guth et al., 2001) (Figure 1C).
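The partition of junctions into U2AF1-only, U2AF2-only and shared classes can be illustrated with a minimal sketch. It assumes that each eCLIP-seq experiment has already been reduced to a set of bound junction identifiers; the function names and the identifier format are hypothetical and are not taken from the original pipeline. The counts used at the end are those reported above.

```python
def classify_junctions(u2af1_bound, u2af2_bound):
    """Split junction IDs into U2AF1-only, U2AF2-only and shared sets."""
    u2af1_bound, u2af2_bound = set(u2af1_bound), set(u2af2_bound)
    return {
        "U2AF1": u2af1_bound - u2af2_bound,
        "U2AF2": u2af2_bound - u2af1_bound,
        "both": u2af1_bound & u2af2_bound,
    }


def class_percentages(counts):
    """Convert per-class junction counts into percentages of the total."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Junction counts reported in the text (Figure 1C)
reported = {"U2AF1": 5670, "U2AF2": 32097, "both": 105062}
percentages = class_percentages(reported)
```

With the reported counts, the three classes sum to 142,829 junctions and the shared class accounts for 73.6% of them, consistent with the figures quoted above.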
To dissect RNA binding at single-nucleotide resolution, we developed a computational binding analysis pipeline taking advantage of the fact that protein-RNA crosslinks stop reverse transcription (Van Nostrand et al., 2016). Mapping and counting the end of eCLIP-seq reads, we were able to quantify the binding occupancy at single-nucleotide resolution (Figure S1D). Considering internal and last exons of protein coding genes, we built 3’SS binding metaprofiles for 73,676 intron-exon junctions in U2AF1 eCLIP-seq and 109,990 junctions in U2AF2 eCLIP-seq (Figure 1D). These metaprofiles showed a broad peak over the −15;−5 region, corresponding to the polypyrimidine tract (PPT), and a sharp peak at the −2 nucleotide. The near-identical shapes for U2AF1 and U2AF2 binding metaprofiles demonstrated that U2AF1 and U2AF2 mostly bind their target RNAs as a dimer but also that standard eCLIP-seq could not resolve their individual binding contributions.
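A simplified sketch of how such a metaprofile could be computed is given below. It assumes that crosslinking events have already been converted to positions relative to the 3’SS (negative values in the intron, positive values in the exon, no position 0) and that the −30 to +5 window from the junction selection step is reused; both are assumptions for illustration, and the function names are hypothetical. Each replicate is converted into the percentage of crosslinking events per position, and the replicates are then summarized as mean and SEM.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev

# Assumed window around the 3'SS: -30..-1 (intron) and +1..+5 (exon); no position 0
WINDOW = [p for p in range(-30, 6) if p != 0]
WINDOW_SET = set(WINDOW)


def crosslink_percentages(relative_positions):
    """Percentage of crosslinking events at each window position for one replicate."""
    counts = Counter(p for p in relative_positions if p in WINDOW_SET)
    total = sum(counts.values())
    if total == 0:
        return {p: 0.0 for p in WINDOW}
    return {p: 100.0 * counts[p] / total for p in WINDOW}


def metaprofile(replicates):
    """Mean and SEM of the per-position percentages across replicates."""
    per_replicate = [crosslink_percentages(r) for r in replicates]
    profile = {}
    for p in WINDOW:
        values = [rep[p] for rep in per_replicate]
        sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
        profile[p] = (mean(values), sem)
    return profile


# Toy input: two replicates, each a list of crosslink positions relative to the 3'SS
toy_profile = metaprofile([[-2, -2, -10, -3], [-2, -2, -2, -12]])
```

In this toy example, position −2 collects 50% and 75% of the events in the two replicates, so its metaprofile value is 62.5 ± 12.5 (mean ± SEM).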
To dissect U2AF1- vs U2AF2-specific binding, we developed fractionated eCLIP-seq by size-selection of membrane regions to isolate the “light fraction”, containing the U2AF1 monomer plus bound RNA (U2AF1), and the “heavy fraction”, including U2AF1-U2AF2-RNA complexes (U2AF1+U2AF2) (Figures 1B and S1E, Table S1). For the first time, this allowed us to visualize isolated U2AF1 RNA binding in vivo, with a sharp binding peak at position −2 of the 3’SS (AG) (Figure 1E). The broad peak over the PPT was absent in the U2AF1-specific profile, suggesting that the PPT is occupied by U2AF2 (Figure 1E, Figure S1F).
In summary, we precisely mapped U2AF interactions across >140,000 splice junctions in >10,000 human genes, dissecting for the first time U2AF1 versus U2AF2 binding at the 3’SS with single-nucleotide resolution in vivo.
S34F and Q157R mutations alter splicing and the position of U2AF1 contacts at the 3’SS
U2AF1 mutations result in aberrant splicing in a sequence-specific manner dependent on the −3 position for S34 and the +1 position for Q157 mutants (Brooks et al., 2014; Ilagan et al., 2015; Okeyo-Owuor et al., 2015). We generated RNA-seq libraries from dox-induced and uninduced U2AF1 WT, S34F and Q157R cells. After exclusion of solely dox-dependent events (see Methods; Figure S2A), alternative splicing analysis of S34F and Q157R mutant vs WT cells detected 6,879 and 2,914 differentially spliced events, respectively (Figure 2A; Table S2). Skipped exons (SE) represented the most frequent event type. We observed a clear trend favoring exon exclusion and retained introns (RI) in both mutants (Figure 2A). Selected differentially spliced events were confirmed by splicing-specific RT-PCR (Nguyen et al., 2018) (Figure S2B). While splicing alterations were the predominant events, U2AF1 mutations also caused changes in gene expression: we identified 1,129 differentially expressed genes in S34F mutant cells (Figure S2C) and 98 differentially expressed genes in Q157R mutant cells (Figure S2D).
Figure 2. U2AF1 mutations determine the position of 3’SS contacts.
(A) Number of differentially spliced events in S34F and Q157R compared with WT (absolute delta PSI>10%; FDR<0.05). SE, skipped exons; A5SS, alternative 5’SS; A3SS, alternative 3’SS; MXE, mutually exclusive exons; RI, retained introns. (B) 3’SS sequence logos for differential SE events in U2AF1 S34F (top panel) and Q157R (bottom panel) conditions. Sequence-specific positions between more included and less included exons are highlighted (position −3 for S34F mutant, +1 for Q157R mutant). N, number of differentially spliced events. (C) Binding metaprofiles (y-axis, mean±SEM of the percentage of crosslinking events) and 3’SS sequence logo for U2AF1 freCLIP-seq fractions, comparing WT with S34F (n=3) and Q157R (n=2) mutations. Arrows indicate the de novo binding peaks in position −3 for S34F mutant and +1 for Q157R mutant, respectively. N, number of intron-exon junctions. See also Figure S2 and Tables S2–S3–S4.
Next, 16 published datasets (Bamopoulos et al., 2020; Brooks et al., 2014; Esfahani et al., 2019; Fei et al., 2016; Ilagan et al., 2015; Kim et al., 2018; Okeyo-Owuor et al., 2015; Pellagatti et al., 2018; Yip et al., 2017) were queried in a comprehensive meta-analysis for aberrant splicing patterns secondary to pathogenic U2AF1 mutations (Table S3). The analysis of 15 S34F/Y datasets revealed that 3,688 differentially spliced genes were identified in only one dataset, suggesting that splicing changes are context dependent (Brooks et al., 2014; Motta-Mena et al., 2010) (Figures S2E–S2F, Table S4). Splicing alterations in 910 genes were only present in our HEL cell system, but 2,696 genes were identified as aberrantly spliced also in other datasets (Figure S2F, Table S4). Similar conclusions can be drawn from the analysis of Q157R/P datasets, although only 2 out of 16 datasets contained Q157 mutant samples (Ilagan et al., 2015; Pellagatti et al., 2018) (Figures S2G–S2H, Table S4). Importantly, we detected more than half of the genes (54.6%) reported as aberrantly spliced in seven patient-derived sample datasets in our system (Figures S2E–S2G, Table S4).
Previous studies have identified sequence- and position-specific dependencies of alternative splicing events. We confirmed that, in S34F mutant cells, exon inclusion favored a C in position −3 of the 3’SS (71.6%) and exon exclusion favored a U (85.2%) (Figure 2B), while in Q157R mutant cells, exon inclusion favored a G in position +1 of the 3’SS (65.0%) and exon exclusion favored an A (74.7%) (Figure 2B). We identified the same specificity in A3SS, MXE and RI differential splice events (Figures S2I–S2J). Overall, our system nicely recapitulated the expected changes in splicing and junction sequence preference (Figures 2B, S2B, S2E–S2J; Tables S3 and S4).
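The nucleotide preference at a given 3’SS position can be tabulated as in the following sketch. It assumes that each differentially spliced event is represented by a short sequence containing a fixed number of intronic nucleotides followed by exonic nucleotides; this representation, the function names and the toy sequences are hypothetical and serve only to illustrate the calculation behind the percentages quoted above.

```python
from collections import Counter


def index_of(position, n_intron):
    """Map a 3'SS coordinate (..., -2, -1, +1, +2, ...) to a string index."""
    if position == 0:
        raise ValueError("3'SS coordinates have no position 0")
    return n_intron + position if position < 0 else n_intron + position - 1


def base_percentages(sequences, position, n_intron):
    """Percentage of each nucleotide at one 3'SS position across a set of junctions."""
    idx = index_of(position, n_intron)
    counts = Counter(seq[idx].upper().replace("T", "U") for seq in sequences)
    total = sum(counts.values())
    return {base: 100.0 * n / total for base, n in counts.items()}


# Toy junctions spanning positions -3, -2, -1, +1 (three intronic nucleotides)
toy_excluded = ["UAGA", "UAGG", "CAGA", "UAGA"]
minus3 = base_percentages(toy_excluded, -3, n_intron=3)
plus1 = base_percentages(toy_excluded, +1, n_intron=3)
```

In this toy set, U accounts for 75% of nucleotides at position −3 and A for 75% at position +1. Applying the same tabulation separately to more included and less included exons would give percentages of the kind reported in Figure 2B.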
To understand whether sequence and position specificity were the direct consequence of alterations in U2AF1 RNA binding, we applied freCLIP-seq to cells expressing S34F or Q157R mutant U2AF1 (Figure S1E, Table S1). As in U2AF1 WT cells, freCLIP-seq successfully separated U2AF1 monomer (U2AF1) from U2AF heterodimer (U2AF1+U2AF2) binding (Figure 2C). Strikingly, the metaprofiles for both U2AF1 mutants identified distinct de novo peaks in position −3 of the 3’SS for S34F and in position +1 for Q157R regardless of nucleotide sequence (Figure 2C). Moreover, the metaprofiles revealed a relative reduction in the signal over the PPT, suggesting that U2AF1 mutations alter U2AF2-RNA binding (Figure 2C, upper panel).
The novel binding peaks at −3 (for S34F) and +1 (for Q157R) perfectly matched the splicing-predictive positions at the 3’SS.
In summary, analysis of ~40,000 intron-exon junctions uncovers that two oncogenic U2AF1 mutations alter U2AF1 and U2AF2 3’SS contacts with position-specific consequences for splicing outcomes.
Integrative binding-splicing analysis reveals that mutant U2AF1 gain-of-binding can lead to loss-of-splicing.
Differential splicing events represent primary and secondary consequences of altered mutant U2AF1 binding. By contrast, alternative splice events at junctions that are differentially bound are likely direct consequences of U2AF1 mutations. We performed an integrative multi-omics analysis identifying intron-exon junctions affected by both differential binding (freCLIP-seq; Figure S3A, Table S5) and splicing (RNA-seq) in U2AF1 S34F and Q157R cells. Taking into consideration all the possible combinations of increased vs decreased mutant binding and increased vs decreased junction inclusion (Table S6), we analyzed four binding-splicing classes focusing on the most highly represented differential splicing event, skipped exons (Figure 2A). The “<binding;<inclusion” and “>binding;>inclusion” classes, that represent congruent binding-splicing outcomes, match the previously proposed model (Fei et al., 2016; Ilagan et al., 2015; Okeyo-Owuor et al., 2015) where binding or loss of binding directly translate into exon inclusion or exclusion. Conversely, the “>binding;<inclusion” and “<binding;>inclusion” classes represent an “inverse” model, where mutant U2AF1 binding is inversely related to exon inclusion, suggesting functional impairment in splicing progression secondary to mutant U2AF1 binding. In S34F mutant cells, we identified 67 junctions following the “<binding;<inclusion” model and 71 junctions following the “>binding;>inclusion” model (Figure 3A, right panel). On the other hand, 55.3% of the aberrantly bound and spliced junctions followed the “inverse” model with “>binding;<inclusion” as the most enriched class (123 out of 309 junctions). Metaprofiles divided into the four binding-splicing classes showed that increased U2AF1 S34F binding in position −3 of the 3’SS was associated with a reduction in U2AF2 binding, particularly evident for less included exons (Figure 3A, left panel). 
On the contrary, a decrease in U2AF1 S34F binding was accompanied by increased U2AF2 binding, especially for more included exons (Figure S3B). Quantitation of binding differences (delta analysis) also suggested that increased or decreased mutant U2AF1 binding resulted in splicing alterations by affecting U2AF2-RNA interactions in the opposite direction (Figure S3C).
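The assignment of a junction to one of the four binding-splicing classes can be sketched as follows. The sketch assumes that each junction carries a log2 fold change for mutant vs WT binding and a delta PSI for exon inclusion, and that one binding P-value and one splicing P-value are combined per junction by Fisher’s method. Which P-values were actually combined is not specified in this section, so that part is an assumption, and all function names are hypothetical.

```python
from math import log


def fisher_two_pvalues(p_binding, p_splicing):
    """Fisher's method for exactly two p-values.

    The statistic -2*ln(p1*p2) follows a chi-square distribution with 4 degrees
    of freedom, whose survival function has the closed form p1*p2*(1 - ln(p1*p2)).
    """
    product = p_binding * p_splicing
    if product == 0:
        return 0.0
    return product * (1.0 - log(product))


def binding_splicing_class(log2fc_binding, delta_psi):
    """Label a junction by the direction of its binding and inclusion changes."""
    binding = ">binding" if log2fc_binding > 0 else "<binding"
    inclusion = ">inclusion" if delta_psi > 0 else "<inclusion"
    return f"{binding};{inclusion}"


def is_inverse(label):
    """True for the classes in which binding and inclusion change in opposite directions."""
    return label in {">binding;<inclusion", "<binding;>inclusion"}


label = binding_splicing_class(log2fc_binding=1.2, delta_psi=-0.3)
combined_p = fisher_two_pvalues(0.05, 0.05)
```

Here a junction with increased mutant binding and reduced inclusion falls into the “>binding;<inclusion” class of the “inverse” model, and two P-values of 0.05 combine to approximately 0.0175.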
Figure 3. Mutant U2AF1 binding strength and position influence splicing outcome.
(A) Right panel: scatter plot representing intron-exon junctions significantly affected by both differential binding (y-axis, log2 FC based on freCLIP-seq) and differential splicing (x-axis, delta PSI of SE events based on RNA-seq) in S34F vs WT. Significantly affected junctions (P-value<0.05, Fisher’s method) in each binding-splicing class (“<binding;<inclusion”, “>binding;>inclusion”, “>binding;<inclusion”, “<binding;>inclusion”) are color-coded. N, number of significantly affected junctions in each class. Left panel: binding metaprofiles (y-axis, mean±SEM of the number of crosslinking events per million reads) of U2AF subunits, based on freCLIP-seq fractions, considering junctions belonging to the most represented binding-splicing class. Positions characterized by a significant change in mutant vs WT U2AF1 binding are starred (P-value<0.05, one-tailed t-test). (B) Binding-splicing analysis in Q157R vs WT, displayed as in (A). See also Figure S3 and Tables S5–S6.
U2AF1 Q157R mutant affected a smaller number of junctions than S34F mutant (180 vs 309), most of which exhibited a loss-of-binding pattern with 51.1% of junctions in the “<binding;<inclusion” class and 31.7% of junctions in the “<binding;>inclusion” class (Figure 3B, right panel). Metaprofiles showed prominent binding of Q157R mutant at position +1 of the 3’SS (Figure S3B) accompanied by a general reduction in U2AF2 signal (Figure 3B, left panel; Figures S3B–S3C).
Collectively, in-depth integrative analysis of in vivo binding and splicing alterations revealed a complex picture that expands upon previous models inferred from splicing and in vitro binding data. Mutation-induced gain-of-binding often results in exon skipping, possibly mediated by compromised recruitment of U2AF2.
U2AF1 mutations alter stress granule biology.
Hotspot S34 and Q157 mutations occur in the two zinc fingers of U2AF1 that are critical to RNA binding (Yoshida et al., 2011). S34 mutations are more common, but both occur in myeloid malignancies and other cancers. Understanding common and distinct binding and splicing perturbations associated with these two mutations can shed light on their oncogenic driver mechanisms in cancer. Although the overlap between U2AF1 mutants considering either differentially spliced or differentially bound genes was approximately 30% (Figure S4A), the two mutants converged on biological processes crucial to tumorigenesis, such as cell cycle, DNA repair, transcription, translation and RNA processing (Figure S4B, Tables S7–S8).
To identify biological processes directly affected by U2AF1 mutations, we performed functional analysis on all genes characterized by concurrent aberrant binding and splicing (S34F: 496 genes; Q157R: 312 genes; Table S6). For both S34F and Q157R mutants, we observed a significant enrichment in several GO terms related to RNA biology, in particular to RNA granules: RNA transport, ribonucleoprotein complex assembly, RNA helicases, RNA binding, cytoplasmic stress granules (Ivanov et al., 2019; Van Treeck and Parker, 2018) (Figure 4A; Table S9). This observation was further supported by a significant enrichment in genes coding for proteins containing low-complexity domains (Figure S4C). These proteins promote the formation of membraneless organelles, dynamic condensates of RNA and RNA-binding proteins, such as nucleoli and Cajal bodies in the nucleus, P-bodies and stress granules in the cytoplasm (Courchaine et al., 2016; Protter and Parker, 2016).
Figure 4. Mutant U2AF1 binding-splicing alterations affect transcripts enriched in stress granules.
(A) Enrichment in GO terms related to RNA granule biology for genes differentially bound and spliced in U2AF1 S34F vs WT and Q157R vs WT HEL cells. Node size: number of affected genes belonging to a specific term. P-value based on Fisher’s exact test. (B) Enrichment analysis of mutant U2AF1 differentially bound-spliced genes among SG-related experimental datasets. Top panel: SG-enriched proteins; Bottom panel: SG-enriched (white-highlighted) and SG-depleted (grey-highlighted) transcripts. Dot size is proportional to the overlap, measured by odds ratio (Fisher’s exact test). (C) Network visualization of differentially bound-spliced transcripts in U2AF1 mutants that are also enriched in stress granules, according to multiple SG experimental datasets. See also Figure S4 and Tables S7–S8–S9–S10.
To probe the significance of these findings, we analyzed a total of 16 experimental datasets characterizing both proteins and transcripts enriched in stress granules, 10 of which identified SG-enriched proteins by mass-spectrometry and 6 of which identified SG-enriched and SG-depleted transcripts by RNA-seq (Jain et al., 2016; Khong et al., 2017; Markmiller et al., 2018; Marmor-Kollet et al., 2020; Matheny et al., 2021; Youn et al., 2018) (Table S10, Figure S4D). Direct targets of mutant U2AF1 aberrant binding and splicing significantly overlapped with SG-enriched, but not SG-depleted, proteins and transcripts from these 16 datasets (Figure 4B). The “>binding;<inclusion” class was the most represented in SG-enriched transcripts and proteins for both U2AF1 mutants (Figure 4B), highlighting the importance of the dysfunctional splicing process in disease pathology. The 125 SG-enriched and differentially bound and spliced mRNAs (network representation in Figure 4C) are characterized by long coding sequences (CDS) and long untranslated regions (UTR), typical physical characteristics of SG-enriched mRNAs (Khong et al., 2017). In addition, 97 differentially bound and spliced transcripts code for proteins enriched in SGs according to proteome studies (network representation in Figure S4E). For 36 mutant U2AF1 targets, both the mRNA and the corresponding protein were enriched in SGs, such as DEAH-Box Helicase 33 (DHX33) and Kinesin Light Chain 1 (KLC1) for S34F, Ubiquitin-associated Protein 2-like (UBAP2L) and Heterogeneous Nuclear Ribonucleoprotein H3 (HNRNPH3) for Q157R, and Ataxin 2 Like (ATXN2L) and Proline Rich Coiled-Coil 2C (PRRC2C) for both mutants. Collectively, our comprehensive analysis identified stress granule components as novel direct targets of binding and splicing alterations driven by U2AF1 mutations.
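The overlap test underlying this type of enrichment analysis can be illustrated with a minimal sketch of Fisher’s exact test on a 2×2 table (target and SG-enriched; target only; SG-enriched only; neither). A one-sided test for over-representation is assumed here, as is the choice of background gene universe that defines the fourth cell; neither is specified in this section. The function names and the toy counts are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio of a 2x2 table [[a, b], [c, d]]."""
    return float("inf") if b * c == 0 else (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided p-value: probability of an overlap of at least `a` genes
    under the hypergeometric null with fixed table margins."""
    row1, col1, total = a + b, a + c, a + b + c + d
    max_overlap = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, max_overlap + 1)
    )
    return numerator / comb(total, col1)


# Toy table: 3 targets in the SG set, 1 target outside, 1 non-target inside, 3 outside
toy_or = odds_ratio(3, 1, 1, 3)
toy_p = fisher_exact_greater(3, 1, 1, 3)
```

For the toy table the odds ratio is 9 and the one-sided P-value is 17/70, about 0.243. In practice, an established implementation such as `scipy.stats.fisher_exact` would normally be preferred over a hand-written one.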
U2AF1 mutations enhance stress granule formation improving cell fitness under stress.
To probe how aberrant binding and splicing of SG components affected SG biology we assessed SG formation at steady state and after stress induction with sodium arsenite in HEL cells by immunofluorescence staining against the SG-marker protein G3BP1 (Kedersha and Anderson, 2007) (Figure 5A). At steady state and further pronounced after exposure to oxidative stress, IF staining detected a prominent SG signal in U2AF1 S34F and Q157R cells compared to WT cells (Figure 5A, Files S1–S2). Quantitation of SGs, with an IMARIS-based image analysis pipeline developed for accurate SG identification (Figure S5A, Table S11), confirmed significantly higher SG signal in arsenite-treated mutant compared to WT U2AF1 cells. The involvement of increased G3BP1 protein level in causing enhanced SG assembly was ruled out by western blot analysis (Figure S5B).
Figure 5. Mutant U2AF1 cells show increased capability to form stress granules.
(A) Representative IF images (scale bars, 10 μm) and quantification of stress granules in mutant and WT U2AF1 HEL cells. SGs were identified by IMARIS (see Methods). The plot displays the mean±SEM G3BP1 field intensity, normalized to the relative controls (ctrl, uninduced HEL cells without arsenite treatment; dox, doxycycline-induced HEL cells without arsenite treatment; dox+ars, doxycycline-induced HEL cells treated with 500 μM arsenite for 1 hour). G3BP1 field intensity is the mean intensity of all the single SGs identified in the field. For each condition, 6 fields were acquired (3 fields per replicate), containing on average 74 cells each. Differences between S34F or Q157R and WT were tested with two-tailed t-test. (B, D) Scatter plot of gene expression changes (x-axis) and relative stability/degradation contributions (y-axis) in S34F (B) or Q157R (D) vs WT, measured by TL-seq (2 replicates per condition). Transcripts enriched (left panel) or depleted (right panel) in stress granules are highlighted. N, number of transcripts in each TL-seq class (stabilized, induced, destabilized, shutdown) (C, E) Fraction of SG enriched vs depleted transcripts in each TL-seq class in S34F (C) or Q157R (E) vs WT. Differences in fractions within each class were tested with a proportion test. (F) Percentage of 7-AAD negative viable cells detected by flow cytometry under arsenite stress (500 μM for 24 hours). Statistical differences between mutant (S34F, n=3; Q157R, n=3) vs WT (n=3) U2AF1 cells were calculated by t-test. (G) Difference in cell viability upon arsenite stress and ISRIB treatment (20 nM for 24 hours), compared with arsenite stress alone. P-values between mutant (S34F, n=3; Q157R, n=3) vs WT (n=3) U2AF1 cells were calculated by t-test. See also Figure S5, Table S11 and Files S1–S2.
To specifically investigate the effect of binding-splicing alterations at the protein level for SG-enriched genes, we performed label-free quantitative (LFQ) mass spectrometry on U2AF1 WT, S34F and Q157R HEL cells, and we compared peptide levels in U2AF1 mutant vs WT cells. Focusing on stress granule protein components with transcripts differentially bound and differentially spliced in U2AF1 mutant cells (34 genes for S34F, 23 genes for Q157R), we observed a heterogeneous spectrum of variations, with a global shift towards upregulation in S34F cells and with no global shift in Q157R cells (Figure S5C). To understand how mutant U2AF1-induced aberrant binding and splicing would result in enhanced SG formation at the RNA level, we analyzed RNA transcript dynamics by TimeLapse-seq (TL-seq) (Schofield et al., 2018). This technique disentangles the contributions of RNA synthesis vs stability to total RNA levels. Comparing mutant vs WT U2AF1 cells with TL-seq, we sorted transcripts into four classes: upregulated transcripts with increased stability (“stabilized”) vs increased synthesis rate (“induced”), and downregulated transcripts with reduced stability (“destabilized”) vs reduced synthesis rate (“shutdown”) (Figure 5B). Integration of TL-seq variations with the aforementioned experimental datasets characterizing SG-enriched and SG-depleted transcripts (Table S10) yielded significant over-representation of SG-enriched RNAs among transcripts with increased stability (S34F, Figures 5B–5C) and synthesis (Q157R, Figures 5D–5E). Conversely, SG-depleted transcripts were mainly in the destabilized and shutdown classes for both mutants (Figures 5B–5E).
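The logic of the four TL-seq classes can be sketched under a simple steady-state assumption, in which the RNA level equals the synthesis rate divided by the degradation rate constant, so that the log2 fold change in RNA level is the log2 fold change in synthesis minus the log2 fold change in degradation. The rule below, which assigns each transcript to the class of its dominant contribution, is an illustrative assumption; the exact criteria and significance thresholds used in the analysis are not given in this section, and the function name is hypothetical.

```python
def tl_seq_class(log2fc_synthesis, log2fc_degradation):
    """Classify a transcript by the dominant contribution to its expression change.

    Assumes steady state: log2FC(RNA) = log2FC(synthesis) - log2FC(degradation).
    """
    stability = -log2fc_degradation  # slower degradation means higher stability
    total = log2fc_synthesis + stability
    if total == 0:
        return "unchanged"
    if total > 0:
        return "stabilized" if stability > log2fc_synthesis else "induced"
    return "destabilized" if stability < log2fc_synthesis else "shutdown"


examples = {
    "stability-driven increase": tl_seq_class(0.1, -1.0),
    "synthesis-driven increase": tl_seq_class(1.0, 0.1),
    "stability-driven decrease": tl_seq_class(0.0, 1.0),
    "synthesis-driven decrease": tl_seq_class(-1.0, 0.0),
}
```

Under this rule, the four toy transcripts are labeled “stabilized”, “induced”, “destabilized” and “shutdown”, respectively.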
To assess the effect of the increased SG formation on cell fitness, we evaluated the level of cell viability in U2AF1 mutant and WT cells under stress conditions. We stressed HEL WT, S34F and Q157R cells with sodium arsenite for 24h and performed 7-Aminoactinomycin D (7-AAD) staining to distinguish dead cells from viable cells by flow cytometry. U2AF1 S34F and Q157R cells showed significantly higher cell viability in response to stress than WT cells (by 7.5% and 4.1%, respectively) (Figure 5F). Moreover, to determine whether the increased SG formation was specifically the cause of the observed increase in cellular adaptation, U2AF1 cells were treated with the drug-like compound ISRIB (Integrated Stress Response Inhibitor). This small molecule disrupts the activation of the integrated stress response (ISR) by antagonizing the inhibitory effect of phosphorylated eIF2 on eIF2B, with consequent restoration of protein synthesis and reduction in SG assembly (Rabouw et al., 2019; Sidrauski et al., 2013; Zyryanova et al., 2021). After 24h of ISRIB treatment under stress conditions, we evaluated the change in the level of cell viability in comparison to untreated cells. Importantly, the treatment with the inhibitor led to reduced cell viability in U2AF1 mutant cells, especially in S34F cells, but not in U2AF1 WT cells (Figure 5G). ISRIB treatment reduced the survival differences between U2AF1 mutants (S34F and Q157R) and WT cells by 44% and 87.8%, respectively.
The same results were obtained using a second cell viability assay (WST-1 assay, Figures S5D–S5E).
Collectively, these data suggest that U2AF1 mutations increase the availability of RNAs and possibly proteins prone to participate in stress granule formation, supporting a model whereby U2AF1 mutations increase the cell’s potential to respond and adapt to stress providing a clonal advantage.
U2AF1-mutant primary MDS/AML blasts display an enhanced stress response
Cell lines and inducible models may not be representative of primary disease. To validate whether SG perturbations detected in our cell line model are reflective of disease pathology in patients with U2AF1-mutant myeloid cancers, we performed single-cell RNA sequencing (scRNA-seq) on 4271 CD34+ cells, comparing MDS patients with or without U2AF1 S34F mutation (Table S12). By single-cell mutation calling on reads mapping to the U2AF1 locus, we classified each cell as either WT or S34F, and verified that all cells from the control MDS were WT (Figure 6A). We then assigned to each cell a SG-expression score, based on the average expression of 149 SG-enriched genes characterized by differential binding and splicing in mutant U2AF1 cells (Figures 4C and S4E). S34F mutant cells carried a significantly higher SG-expression score compared to WT cells from the same patient and to all cells from the control MDS patient (Figure 6B), directly linking U2AF1 mutant genotype to alterations in stress granules in primary patient cells.
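A per-cell SG-expression score of this kind can be sketched as the plain average of normalized expression values over the SG gene set, grouped by the genotype called for each cell. The data layout (one dictionary of gene expression values per cell), the treatment of undetected genes as zero and the function names are assumptions for illustration. The two genes in the toy gene set are among the targets named above, whereas the real score uses 149 genes.

```python
from statistics import mean


def sg_expression_score(cell_expression, sg_genes):
    """Average expression of the SG genes in one cell; undetected genes count as 0."""
    return mean(cell_expression.get(gene, 0.0) for gene in sg_genes)


def scores_by_genotype(cells, genotypes, sg_genes):
    """Group per-cell SG-expression scores by the genotype called for each cell."""
    grouped = {}
    for cell_id, expression in cells.items():
        score = sg_expression_score(expression, sg_genes)
        grouped.setdefault(genotypes[cell_id], []).append(score)
    return grouped


# Toy data with hypothetical expression values
toy_genes = ["ATXN2L", "PRRC2C"]
toy_cells = {
    "cell_1": {"ATXN2L": 2.0, "PRRC2C": 4.0},
    "cell_2": {"ATXN2L": 1.0},
}
toy_genotypes = {"cell_1": "S34F", "cell_2": "WT"}
toy_scores = scores_by_genotype(toy_cells, toy_genotypes, toy_genes)
```

The resulting score distributions per genotype could then be compared with a rank-based test such as the two-tailed Wilcoxon rank-sum test named in the Figure 6 legend.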
Figure 6. U2AF1-mutant MDS and AML patient cells display increased stress granule response.
(A) UMAP representation of 4271 CD34+ cells isolated from MDS patients, based on single-cell RNA-seq. WT and S34F cells from each patient are color-coded. (B) Distributions of SG-expression scores in WT and S34F cells from each MDS patient. The SG-expression score for each cell is based on the average expression of 149 genes enriched in stress granules and differentially bound and spliced by U2AF1 mutants. Differences in the distributions were tested with two-tailed Wilcoxon rank-sum test. (C, D) Representative IF images (scale bars, 10 μm) and quantification of stress granules in AML patients with WT (n=3) or S34F/Y (n=3) U2AF1. SGs were identified by IMARIS (see Methods). The plot displays the mean±SEM G3BP1 field intensity, normalized to the relative controls (ctrl, primary cells without arsenite treatment; ars, primary cells treated with 500 μM arsenite for 1 hour). G3BP1 field intensity is the mean intensity of all the single SGs identified in the field. For each patient, 6 fields per condition were acquired, containing on average 25 cells each. Differences between S34F/Y and WT patients were tested with two-tailed t-test. (E) Enrichment analysis of genes differentially spliced in 2 published cohorts of U2AF1-S34 AML patients among SG experimental datasets (Fisher’s exact test). See also Figure S6 and Tables S12–S13.
Previous studies demonstrated that mutant KRAS promotes cell fitness in colon and pancreatic cancers through increased stress granule formation (Grabocka and Bar-Sagi, 2016). Since the MDS patient carrying the U2AF1 S34F mutation also presented NRAS and KRAS mutations (NRAS G12D, KRAS G12S; Table S12), we analyzed scRNA-seq data to identify distinct mutated cell populations and to compare their SG-expression scores. We observed that NRAS- and KRAS-mutated cell populations were also associated with a higher SG-expression score in comparison to WT cells from the control MDS patient. Of note, these results suggest a synergistic effect between U2AF1 and NRAS mutations on enhanced SG formation (Figure S6).
Next, we performed SG-IF staining on bone marrow and peripheral blood mononuclear cells from three patients with U2AF1 S34 mutant AML and from three patients without U2AF1 mutations as controls (Table S12). U2AF1 mutations consistently resulted in increased formation of stress granules in U2AF1 mutant vs WT AML blasts upon oxidative stress (Figures 6C–6D, Table S13). These findings are corroborated by two previously published datasets on S34 vs WT AML patients (Table S3) in which differential splicing analysis confirmed the significant over-representation of SG-enriched proteins and SG-enriched transcripts (Figure 6E).
Together, these data show that U2AF1-mutant MDS/AML carry an increased ability to respond to stress by forming stress granules, a possible mechanism by which mutant cells gain clonal dominance over WT cells (Figure 7).
Figure 7. Proposed model connecting cancer-associated U2AF1 mutations to enhanced stress adaptation.
Discussion
In this study, we have defined RNA binding and splicing mechanisms in two cancer-associated U2AF1 neomorphic mutations and identified RNA granule biology and cellular stress response as a unifying oncogenic mechanism between the two hotspot mutations. Applying a multi-omics approach to assess the function of WT U2AF1 compared to two different cancer missense mutations in U2AF1 (S34F and Q157R), we were able to: i) separate in vivo individual RNA binding signals of U2AF1 and U2AF2; ii) identify alterations in mutant U2AF1 binding to the 3’SS, suggesting conformational changes in the U2AF1-U2AF2-RNA complex; iii) reveal the binding-splicing relationship, showing that splicing alterations are driven not only by loss, but also by gain of mutant U2AF1 binding; iv) demonstrate that U2AF1 mutations lead to alterations in RNA granule biology, in particular stress granules, leading to increased survival under stress; v) show that U2AF1-mutant primary MDS/AML cells exhibit an enhanced stress granule response. Taken together, our findings show that the U2AF1 mutations lead to highly localized changes in binding interactions at 3’SSs that amplify throughout the cell by dramatically altering the transcriptome and subcellular compartmentalization of RNA. We propose that these changes underlie the cancerous phenotype of hematopoiesis in MDS and AML.
The application of U2AF1 freCLIP-seq, in which we combined the high-sensitivity eCLIP-seq protocol with the fractionation step, led to a high-resolution positional analysis that in comparison with previous U2AF1 CLIP-seq studies (Esfahani et al., 2019; Palangat et al., 2019) allowed us to determine U2AF1- and U2AF2-specific binding contributions in WT and mutant context. U2AF1 S34F and Q157R freCLIP-seq revealed de novo peaks in position −3 for the S34F mutant and in position +1 for the Q157R mutant, together with a general reduction in U2AF2 signal. Crucially, these peaks perfectly matched sequence-specific positions affected by aberrant splicing in RNA-seq data. We validated our set of differentially spliced genes against 16 published datasets (Bamopoulos et al., 2020; Brooks et al., 2014; Esfahani et al., 2019; Fei et al., 2016; Ilagan et al., 2015; Kim et al., 2018; Okeyo-Owuor et al., 2015; Pellagatti et al., 2018; Yip et al., 2017) and in addition we were able to fully characterize sequence-specific differences also for less frequent events such as alternative 3’ splice sites and intron retention.
The literature to date suggests that aberrant splicing directly results from loss of mutant U2AF1 binding at specific junctions (Fei et al., 2016; Ilagan et al., 2015; Okeyo-Owuor et al., 2015). A recent study, assessing in vitro U2AF1 binding, questioned sequence-specific reductions in binding, because U2AF1 S34F compared to WT showed little discrimination for the −3 nucleotide sequence and displayed similar affinity to any nucleotide except G (Yoshida et al., 2020). Similarly, limited −3 sequence specificity was also observed in studies of the effects of S34F mutation on the open vs closed conformations of U2AF2 (Warnasooriya et al., 2020). Our integrative analysis unveiled an unexpected relationship between mutant U2AF1 binding and splicing outcomes. While Q157R mainly induced a loss-of-binding pattern where mutant binding and splicing were directly proportional, S34F predominantly supported a gain-of-binding process, where increased binding instead led to reduced U2AF2 binding and reduced exon inclusion.
Our in-depth integrative binding-splicing analysis also offered the possibility to discover novel biological processes directly influenced by pathogenic U2AF1 mutations in the context of myeloid malignancies. We specifically noticed a significant enrichment in stress granule-related processes among perturbed genes. SGs are membraneless organelles constituted by multiple RNAs and RNA binding proteins. They are formed in the cytoplasm of eukaryotic cells upon stress, improving cellular adaptation to stress conditions (Van Treeck et al., 2018). Increased SG formation has been linked to tumorigenesis as a strategy exploited by cancer cells to enhance their fitness under stress (Anderson et al., 2015; Grabocka and Bar-Sagi, 2016; Song and Grabocka, 2020). We confirmed the enrichment of differentially-bound, spliced genes in a collection of 16 experimental datasets characterizing SG-enriched proteins and transcripts. By IF imaging, we revealed a marked increase in stress granules in U2AF1 S34F and Q157R over WT cells upon arsenite treatment, indicating an increased capability of mutant cells to aggregate RNA-protein granules when facing stress. Moreover, analysis of RNA dynamics confirmed that U2AF1 mutations increased the stability/synthesis of transcripts enriched in SGs, and conversely promoted the degradation/shutdown of transcripts depleted in SGs, providing a molecular explanation for the increase in SG observed by imaging. Mass spectrometry analysis of protein levels in S34F cells confirmed a global shift towards upregulation for SG protein components with differentially bound and spliced transcripts.
From a therapeutic perspective, current clinical approaches for targeting RNA splicing include antisense oligonucleotides, inducing splice-site switches of specific RNA isoforms, and small-molecule modulators directed against spliceosomal components (Bonnal et al., 2020; Wang and Aifantis, 2020). In a gain-of-binding but not in a loss-of-binding model, anti-sense oligonucleotides may be of utility. Moreover, the gain-of-binding/loss-of-splicing mechanism could be exploited to exacerbate splicing defects in U2AF1 mutant cells. Indeed, Chatrikhi et al. (Chatrikhi et al., 2021) recently described a compound that inhibits in vitro splicing by stalling the spliceosome machinery through increased U2AF2-RNA binding.
Our observations also lay the foundation for a new paradigm where mutations in splicing factors ultimately alter membraneless organelles by acting on the availability of their RNA and protein components (Figure 7). Importantly, the increase in stress granules was translationally validated through scRNA-seq and IF analysis of U2AF1-mutant vs WT primary MDS and AML samples. In particular, analysis at single-cell resolution confirmed linkage of the U2AF1 mutant genotype to an increase in stress granule components.
The enhanced formation of stress granules potentially increases the stress adaptation of U2AF1 mutant cells, contributing to their clonal advantage in MDS/AML. This model is supported by the increased cell viability we observed in U2AF1 mutant cells under stress conditions and by their decreased viability after treatment with a stress response inhibitor.
Splicing factor-mutant hematopoietic cells rely on the wild-type counterpart for their survival, rendering them sensitive to pharmacologic inhibition of splicing catalysis (Fei et al., 2016; Lee et al., 2016). To date, the full efficacy of spliceosome inhibitors in the treatment of U2AF1-mutant myeloid malignancies remains to be shown (Fong et al., 2019; Shirai et al., 2017; Wang et al., 2019). Protein arginine methyltransferase (PRMT) inhibitors, that exhibit a strong anti-leukemic effect on spliceosomal mutant AML in preclinical studies (Fong et al., 2019), induce methylation changes in numerous intrinsically disordered proteins, such as G3BP1, suggesting an impact on membraneless organelle formation. Our results highlight the relevance of future studies to: i) further characterize SG perturbations in the presence of splicing factor mutations; ii) evaluate the synergy between splicing factor mutations and other oncogenic mutations in triggering enhanced stress response; iii) assess the therapeutic potential of drug combinations targeting stress granules and spliceosome. Since SG components contribute to various cancer-related processes such as cell cycle progression, apoptosis inhibition, resistance to stress and therapeutics, the possibility of targeting SGs as an alternative therapeutic approach (Gao et al., 2019; Wang et al., 2020) is tantalizing and could not have been previously anticipated.
Limitations of the Study
In this study, we integrated high-resolution interactome and transcriptome data uncovering that U2AF1 splicing factor mutations result in enhanced stress response in myeloid malignancies. We determined alterations in U2AF1-RNA binding at the 3’SS at single nucleotide resolution, using the number of eCLIP-seq read ends as a measure for RNA binding. Our data suggest that binding of mutant U2AF1 alters binding of U2AF2. Splicing outcomes may be secondary to mutant U2AF1 effects on other factors of the spliceosome. Future structural studies will be needed to fully dissect the mechanism by which mutant U2AF1 alters recruitment and turnover of other spliceosome components at U2AF1 bound intron-exon junctions, and to further untangle these complexities.
Stress granule formation represents a collective mechanism driven by the interaction of numerous RNAs and proteins rather than by the activity of a single main player. Our results open a new avenue of analysis on how enhanced SG formation may affect splicing factor mutant cell biology and on which SG components contribute to the clonal advantage of MDS HSCs. We show that an inhibitor of the stress response affects the survival of mutant cells under stress more strongly than that of wild-type cells, but future functional studies in primary patient cells will be needed to determine whether this mechanism could be therapeutically exploited. Development of novel agents that target various mechanisms in stress granule genesis will further our understanding of these critical processes.
STAR Methods
Resource availability
Lead contact
Further information and requests for reagents and resources may be directed to the lead contact, Stephanie Halene (stephanie.halene@yale.edu).
Materials availability
This study did not generate new unique reagents.
Data and code availability
Sequencing files generated from cell lines (eCLIP-seq, freCLIP-seq, RNA-seq, TL-seq) have been deposited in the GEO database and are available under the accession number GSE195620. Sequencing files generated from patient samples (scRNA-seq) are available upon request. Original blot, gel and autoradiogram images have been deposited in Mendeley Data at https://doi.org/10.17632/f3xhcbyn4b.1
eCLIP-seq analysis code is publicly available on GitHub (https://github.com/TebaldiLab/eCLIP_seq) as of the date of publication.
Any additional information required to reanalyze the data reported in this paper is available upon request from the lead contact.
Experimental models and subject details
Cell lines
All cell lines were cultured under 5% CO2 at 37°C. 293FT cells for lentivirus production were grown in DMEM (ThermoFisher SCIENTIFIC, Cat #11965092) supplemented with 10% FBS (Gemini Bio-Products, Cat #100-106). HEL erythroleukemia cells (ATCC, Cat #TIB-180) were grown in RPMI 1640 (ThermoFisher SCIENTIFIC, Cat #11875093) supplemented with 10% FBS (Gemini Bio-Products, Cat #100-106) before transduction or 9% tetracycline-negative FBS (Gemini Bio-Products, Cat #100-800) after transduction, 1% L-glutamine (ThermoFisher SCIENTIFIC, Cat #25030081) and 1% penicillin-streptomycin (pen-strep; ThermoFisher SCIENTIFIC, Cat #15140122).
Human subjects
Human primary cells were obtained with patients’ written consent after approval by the Yale University Human Investigation Committee. We included in our cohort 8 patients (age: 66–88 y.o., median 72; gender: female n=2, male n=6; Table S12) affected by MDS or AML and characterized by U2AF1 WT (n=4) or S34F/Y (n=4), as determined by next-generation sequencing with the Yale 49-gene myeloid panel (Ion Torrent platform). Mononuclear cells were isolated by density gradient centrifugation from BM or PB samples collected at the time of diagnosis and frozen in FBS+10% dimethyl sulfoxide. After thawing, primary cells for IF imaging were cultured overnight in StemSpan SFEM (STEMCELL Technologies, Cat #09650) supplemented with 1% pen-strep and recombinant human cytokines: FLT-3 ligand (50 ng/ml), SCF (50 ng/ml), TPO (100 ng/ml), IL-3 (10 ng/ml) and IL-6 (25 ng/ml). All cytokines were purchased from Gemini Bio-Products.
Method details
U2AF1 cell line generation and verification
Full length human WT, S34F or Q157R FLAG-tagged U2AF1 in CS-TRE-PRE-Ubc-tTA-I2G plasmids (Figure S1A), encoding for tetracycline-responsive element (TRE) and enhanced green fluorescent protein (EGFP), were a kind gift from Tomoyuki Yamaguchi at Japan Science and Technology Agency (Yamaguchi et al., 2012; Yoshida et al., 2011). Lentivirus production was obtained by co-transfecting 293FT cell line with psPAX2 plasmid (Addgene, Cat #12260), VSV.G plasmid (Addgene, Cat #14888) and U2AF1 (WT, S34F or Q157R)-containing plasmid, followed by spin-concentration. HEL cells were infected with viral supernatants via spinoculation (1000 g for 90 min at 30°C) with addition of 4 μg/ml polybrene (Sigma-Aldrich, Cat #TR-1003-G). 48 hours after transduction, GFP+ cells (mean: WT = 21.2%, S34F = 22.2%, Q157R = 27.4%) were sorted by fluorescence-activated cell sorting (FACSAria II, BD Biosciences, Yale Flow Cytometry Facility). To express FLAG-tagged U2AF1 proteins, HEL cells were induced with 1 μg/ml doxycycline for 48 hours and the expression was verified through PCR followed by Sanger sequencing (3730xL DNA Analyzer, ThermoFisher SCIENTIFIC, Yale Keck DNA Sequencing Facility) and through western blotting (Figure S1B).
RNA extraction, reverse-transcription and cDNA amplification
RNA was isolated using the RNeasy Mini kit (QIAGEN, Cat #74104) following manufacturer’s instructions. One microgram of extracted RNA was reverse transcribed into cDNA using iScript cDNA Synthesis Kit (BIO-RAD, Cat #1708890). Amplification of cDNA for validation of doxycycline induction was performed with U2AF1 primers (forward: 5’-GGCACCGAGAAAGACAAAGT-3’; reverse: 5’-CTCTGGAAATGGGCTTCAAA-3’) and PCR products were purified using QIAquick PCR purification kit (QIAGEN, Cat #28104) before Sanger sequencing. Sanger chromatograms were aligned to U2AF1 reference sequence by SnapGene. Three representative alternative splicing events were validated, in triplicate, with previously reported target specific primers (Nguyen et al., 2018). PCR products were resolved by agarose gel electrophoresis, visualized using Image Lab 3.0 software (BIO-RAD), and quantified in ImageJ.
Western blotting
Cellular lysates were incubated at 95°C for 5 min and separated in 12% Mini-PROTEAN TGX Precast Protein gels (BIO-RAD, Cat #4561044). Proteins were then transferred to 0.45 μm PVDF membrane at 100V for 1 hr. Membranes were blocked with 5% nonfat milk in TBST for 40 min, incubated with primary antibodies overnight at 4°C and incubated with secondary antibodies for 1 h at room temperature. Washing steps were performed using 1X TBST. Membranes were developed with SuperSignal West Femto Maximum Sensitivity Substrate (ThermoFisher SCIENTIFIC, Cat #34095). Antibodies were used at the following dilutions: rabbit polyclonal anti-U2AF1 (Bethyl Laboratories, Cat #A302-079) 1:5000, mouse monoclonal anti-FLAG M2 (Sigma-Aldrich, Cat #F1804) 1:1000, rabbit monoclonal anti-G3BP1 (Abcam, Cat #ab181149) 1:5000, mouse monoclonal anti-HSP90 (StressMarq Biosciences, Cat #SMC-107B) 1:5000, rabbit polyclonal anti-GAPDH (FL-335, Santa Cruz Biotechnology, Cat #sc-25778) 1:5000, goat anti-rabbit IgG HRP-linked (Cell Signaling TECHNOLOGY, Cat #7074) 1:5000, horse anti-mouse IgG HRP-linked (Cell Signaling TECHNOLOGY, Cat #7076) 1:5000. Chemiluminescence was visualized by Image Lab 3.0 software (BIO-RAD) and protein bands’ quantification was performed in ImageJ.
eCLIP-seq
eCLIP-seq experiments (U2AF1 eCLIP-seq, U2AF2 eCLIP-seq, U2AF1 freCLIP-seq) were performed at least in duplicate (Table S1), as per ENCODE guidelines (https://www.encodeproject.org), according to the published protocol (Van Nostrand et al., 2016) with the following modifications: HEL cells containing FLAG-tagged U2AF1 WT, S34F or Q157R were treated with doxycycline (1 μg/ml) for 48 hours before UV-crosslinking (400 mJ/cm², UV Stratalinker 2400, STRATAGENE). Crosslinked cell pellets were lysed in eCLIP lysis buffer and sonicated, then RNA was partially digested with RNase I (ThermoFisher SCIENTIFIC, Cat #AM2295). Immunoprecipitation of RNA-protein complexes was performed with 12 μg anti-FLAG M2 antibody (Sigma-Aldrich, Cat #F1804) or 8 μg anti-U2AF2 antibody (Sigma-Aldrich, Cat #U4758) and Dynabeads Protein G (ThermoFisher SCIENTIFIC, Cat #10004D), followed by RNA linker ligation as per the relevant protocol and P32-labeling (PerkinElmer, Cat #BLU502Z250UC). U2AF-RNA complexes were isolated by SDS-PAGE and transferred to nitrocellulose membranes. For U2AF1 freCLIP-seq, the membrane region between 37–65 kD was excised to obtain the “light fraction” containing U2AF1 monomer plus bound RNA, and the membrane region between 65–110 kD was excised to obtain the “heavy fraction” containing the U2AF heterodimer plus bound RNA (Figure S1E). Light and heavy fractions were obtained for each sample. For both eCLIP-seq and freCLIP-seq, excised membranes were treated with proteinase K and the RNA was isolated using the RNA Clean & Concentrator-5 kit (Zymo Research, Cat #R1016) according to manufacturer’s instructions. Reverse transcription and library preparation were carried out according to the published protocol. Libraries were deep-sequenced on Illumina HiSeq2500 system, paired-end 75 bp, and on Illumina HiSeq4000 or NovaSeq 6000 system, paired-end 100 bp, at the Yale Center for Genome Analysis (YCGA).
RNA-seq
Total RNA from induced U2AF1 WT, S34F and Q157R HEL cells (in duplicate as per the ENCODE guidelines, https://www.encodeproject.org), as well as from uninduced HEL cells (in duplicate, as controls), was extracted using the RNeasy Mini kit (QIAGEN, Cat #74104). Library preparation with rRNA depletion (KAPA RNA HyperPrep Kit with RiboErase (HMR), Kapa Biosystems, Cat #KK8560) and sequencing on Illumina HiSeq4000 system in paired-end 100 bp mode were performed at the YCGA.
TL-seq
Doxycycline-induced U2AF1 WT, S34F and Q157R HEL cells and uninfected HEL cells were labeled in duplicate with 100 μM of the uridine analog 4-thiouridine (s4U, Alfa Aesar, Cat #AAJ60679MC) for 2 hours at 37°C in the dark. Total RNA from labeled samples and from an unlabeled control was isolated using 1 ml TRIzol (ThermoFisher SCIENTIFIC, Cat #15596-018), purified and treated with TL chemistry as reported (Schofield et al., 2018). In brief, RNA isolated from TRIzol was precipitated in 50% isopropanol supplemented with 1 mM DTT. Genomic DNA was depleted by treating with TURBO DNase (ThermoFisher SCIENTIFIC, Cat #AM2239) and RNA was purified with one volume of Agencourt RNAClean XP beads (Beckman Coulter, Cat #A63987) according to manufacturer’s instructions. 5 μg of total RNA was subjected to TL chemistry for one hour at 45°C followed by reducing treatment for 30 minutes at 37°C. 10 ng of RNA was used to prepare cDNA sequencing libraries with the SMARTer Stranded Total RNA-Seq kit v2 - Pico Input Mammalian removing cDNAs derived from rRNA (Takara Bio USA, Cat #634413). Prepared libraries were submitted to paired-end 100 bp sequencing on Illumina NovaSeq 6000 system at the YCGA. The TL chemistry recodes the hydrogen bonds of the uridine analog to match those of cytosine, thereby introducing U-to-C mutations in newly transcribed RNAs during reverse transcription. After alignment of the sequencing data, the level of T-to-C mutations is used to assess RNA synthesis (high T-to-C mutation rates) vs stability (low T-to-C mutation rates).
Immunofluorescence staining and confocal microscopy
Immunofluorescence staining for stress granule analysis was conducted on induced and uninduced U2AF1 WT, S34F and Q157R HEL samples (in duplicate), and on AML primary samples (WT, n=3; S34F/Y, n=3). Samples treated with 500 μM sodium arsenite (Sigma-Aldrich, Cat #S7400) for 1 hour and untreated samples were collected. After PBS wash, primary cells were fixed with 4% paraformaldehyde for 15 min at room temperature. The fixation step was followed by permeabilization with 100% methanol (pre-chilled at −20°C) for 10 min at room temperature as reported in the published SG staining protocol (Kedersha and Anderson, 2007). Cells were then rinsed in PBS, blocked with 8% donkey serum (Abcam, Cat #ab7475) for 45 min at room temperature, rinsed in PBS, and incubated with primary antibody rabbit monoclonal anti-G3BP1 (Abcam, Cat #ab181149) 1:300 for 1 hour at room temperature. After 3x PBS washes, cells were incubated with secondary antibody donkey anti-rabbit IgG Alexa Fluor 555 conjugate (Abcam, Cat #ab150062) 1:500 for 1 hour at room temperature. After 2x PBS washes, cells were spun onto a glass slide and covered with DAPI-containing ProLong Gold Antifade Mountant (ThermoFisher SCIENTIFIC, Cat #P36935) and a coverslip. Each step was followed by centrifugation at 500×g for 5 min. Z-stack images (HEL samples: zoom=1.7X; n steps=10–12, step size=0.38 μm, 3 fields/slide; primary samples: zoom=2.5X; n steps=32–40, step size=0.25 μm, 6 fields/slide; image format=1024×1024 pixels) were acquired on a Leica TCS SP5 confocal microscope with a 63X NA 1.40 oil objective.
Mass spectrometry
Cell pellets of 1×10⁶ U2AF1 WT, S34F or Q157R HEL cells were collected after doxycycline-induction in three independent biological experiments. Sample processing and mass spectrometry data collection were performed by the Keck MS & Proteomics Resource at Yale University according to their standard operating procedures. Specifically, after protein extraction, samples were reduced (DTT), alkylated (iodoacetamide), and digested with LysC followed by trypsin. Data collection was obtained by label-free quantitative (LFQ) liquid chromatography with tandem mass spectrometry (LC-MS/MS) using Q Exactive HF-X hybrid quadrupole-Orbitrap mass spectrometer (ThermoFisher SCIENTIFIC).
Cell viability assays
U2AF1 WT, S34F and Q157R HEL cells were plated at 5×10⁴ cells/ml in 96-well plates and induced with 1 μg/ml doxycycline for 24 hours. Then, cells were spun down at 500×g for 5 min and resuspended in RPMI 1640 containing 1 μg/ml doxycycline, 500 μM sodium arsenite (Sigma-Aldrich, Cat #S7400) and 20 nM ISRIB (Sigma-Aldrich, Cat #SML0843) or vehicle (DMSO). After 24 hours, cell viability measurements were obtained by flow cytometry and by colorimetric WST-1 assay (Abcam, Cat #ab65475). For flow cytometry analysis, cells were stained with 1.5 μl 7-aminoactinomycin D (7-AAD) solution (STEMCELL Technologies, Cat #75001) and data were acquired by LSR Fortessa (BD Biosciences, Yale Flow Cytometry Facility). The percentage of viable (7-AAD−) cells was calculated by FlowJo software (BD Biosciences) after debris removal. For the WST-1 assay, cells were incubated with 10 μl WST-1 reagent at 37°C for 2 hours. The level of cell viability was assessed measuring the absorbance at 450 nm with a plate reader.
Flow sorting and cell suspension preparation for scRNA-seq
Cryovials containing primary mononuclear cells from MDS patients were warmed in a 37°C water bath and thawed cells were spun down at 500×g for 5 min. Cells were then resuspended in 10 ml fluorescence-activated cell sorting (FACS) buffer (PBS, 0.5% BSA, 2 mM EDTA), counted, and spun down at 300×g for 10 min. Cells were stained with Pacific Blue anti-human CD34 antibody (BioLegend, Cat #343512, for CD34+ blast isolation) 1:100 and 7-AAD (STEMCELL Technologies, Cat #75001, for viability evaluation) 1:100 in 500 μl FACS buffer, and incubated at 4°C for 30 min. For each MDS patient sample, one million cells were used as unstained control sample. Cells were then washed with 10 ml FACS buffer, spun down at 300×g for 10 min at 4°C, resuspended in 200 μl FACS buffer per million cells and pipetted through a 70-μm filter into a 5-ml tube for sorting. Cells were sorted by the Yale Flow Cytometry facility on the FACSAria instrument (BD Biosciences) and gated using the FACSDiva software (BD Biosciences). Viable (7-AAD−) CD34+ sorted cells were directly collected into PBS/0.04% BSA cell suspension buffer for scRNA-seq. Cells were then spun down at 500×g for 5 min at 4°C and washed with PBS/0.04% BSA. Single cell suspension containing 20,000 live cells in PBS/0.04% BSA (~500 cells/μl) was subsequently processed for scRNA-seq library preparation by the YCGA using Chromium Next GEM Single Cell 5’ kit v2 (10x Genomics, Cat #PN-1000265). Samples were then sequenced on Illumina NovaSeq 6000 system.
eCLIP-seq analysis
eCLIP-seq reads were processed using FastUniq (Xu et al., 2012) for duplicate removal and Cutadapt (Martin, 2011) for adapter trimming. After quality control (FastQC, https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), reads were aligned to the human genome (GRCh38.p10) with STAR v2.7.0f --quantMode GeneCounts (Dobin et al., 2013), using the GENCODE Release 27 for transcript annotation. BAM files were converted into BED files through bedtools (Quinlan and Hall, 2010) to extract the genomic position of the crosslinked nucleotide right after the end of each sequenced read and were then re-converted into single nucleotide BAM files for post-processing analysis. Intron-exon junctions (interval from −40 to +10 around the 3’SS) were filtered using a coverage threshold of at least 10 reads in at least two samples to remove junctions with low signal. Normalization was performed with the TMM method implemented in the edgeR Bioconductor package (Robinson et al., 2010). Differential analysis of mutant vs WT U2AF1 binding sites was performed comparing “light” vs “heavy” fraction signals in freCLIP-seq data. We specifically considered genomic positions covered by at least 5 reads in all replicates and in an interval from −4 to +2 around the 3’SS in the light fractions (U2AF1 signal) or from −20 to −5 in the heavy fractions (U2AF2 signal). Significant differentially bound sites were identified applying the “glmQLFTest” function in edgeR with the following thresholds: mean normalized counts per million (CPM) > 1 in either mutant or WT U2AF1 samples; absolute log2 FC > 0.75; P-value < 0.05. We defined events as characterized by increased mutant U2AF1 binding when the signal of the mutant over the wild-type was shifted toward the light fraction and, on the opposite, by decreased mutant U2AF1 binding when the signal of the mutant was shifted toward the heavy fraction (Figure S3A).
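The junction coverage filter and the differential-binding thresholds stated above can be sketched as two simple predicates. This is an illustrative sketch, not the deposited analysis code: the function names are hypothetical, and the statistics (CPM, log2 FC, P-value) are assumed to have been computed beforehand, for example by edgeR.

```python
def keep_junction(read_counts, min_reads=10, min_samples=2):
    """Keep a junction covered by at least `min_reads` reads in at least `min_samples` samples."""
    return sum(count >= min_reads for count in read_counts) >= min_samples


def is_differentially_bound(mean_cpm_mut, mean_cpm_wt, log2_fc, p_value):
    """Apply the stated thresholds: CPM > 1 in either genotype, |log2 FC| > 0.75, P < 0.05."""
    return (
        max(mean_cpm_mut, mean_cpm_wt) > 1
        and abs(log2_fc) > 0.75
        and p_value < 0.05
    )


# Toy read counts per sample for two junctions (hypothetical values).
junction_ok = keep_junction([12, 3, 25, 0])     # two samples reach 10 reads
junction_low = keep_junction([12, 3, 9, 0])     # only one sample reaches 10 reads
```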
RNA-seq analysis
RNA-seq reads were processed with FastUniq to remove duplicates and aligned to the human genome (GRCh38.p10) with STAR (v2.7.0f, --quantMode GeneCounts), using the GENCODE Release 27 for transcript annotation. Normalization with the TMM method was performed with the edgeR package. To identify differentially expressed genes, we applied the “glmQLFTest” function in edgeR considering 2 factors: genotype (3 levels: WT, S34F, Q157R) and treatment (2 levels, uninduced and doxycycline-induced). Significant genes were then filtered according to the following thresholds: CPM > 1; absolute log2 FC > 0.75; P-value < 0.05. Alternative splicing analysis was performed with rMATS v4.0.2 (Shen et al., 2014), capable of handling replicates with high processing speed. Alternative splicing events with absolute difference in percent spliced-in (delta PSI) > 10% and FDR < 0.05 were considered significant. Events in the comparison induced vs uninduced U2AF1 WT HEL cells (dox-dependent events) were removed from the final list of differentially spliced events in mutant U2AF1 conditions.
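The splicing-event selection described above can be sketched as follows. The sketch assumes that delta PSI is expressed as a fraction (so that 10% corresponds to 0.10), that events are matched between comparisons by a shared identifier, and that dox-dependent events are called with the same thresholds; the `SplicingEvent` type and function names are hypothetical.

```python
from typing import NamedTuple


class SplicingEvent(NamedTuple):
    event_id: str     # hypothetical identifier shared across comparisons
    delta_psi: float  # difference in percent spliced-in, as a fraction
    fdr: float


def significant(events, min_abs_delta_psi=0.10, max_fdr=0.05):
    """Keep events with |delta PSI| > 10% and FDR < 0.05."""
    return [e for e in events if abs(e.delta_psi) > min_abs_delta_psi and e.fdr < max_fdr]


def mutant_specific(mutant_events, dox_events):
    """Drop significant mutant events that are also dox-dependent in WT cells."""
    dox_ids = {e.event_id for e in significant(dox_events)}
    return [e for e in significant(mutant_events) if e.event_id not in dox_ids]


# Toy input (hypothetical values).
mutant_vs_wt = [
    SplicingEvent("ev1", -0.25, 0.001),
    SplicingEvent("ev2", 0.30, 0.010),
    SplicingEvent("ev3", 0.05, 0.001),
]
induced_vs_uninduced_wt = [SplicingEvent("ev2", 0.20, 0.020)]

kept = mutant_specific(mutant_vs_wt, induced_vs_uninduced_wt)
```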
Integrative binding-splicing analysis
Differential binding and alternative splicing data were integrated by performing Fisher’s method through the metaseqR Bioconductor package (Moulos and Hatzis, 2015). Only events with combined P-value < 0.05 were considered as significantly affected by aberrant binding and aberrant splicing. The four classes describing all the possible relationships between binding and cassette exons in mutant vs WT U2AF1 conditions (“<binding;<inclusion”, “>binding;>inclusion”, “>binding;<inclusion”, “<binding;>inclusion”) were defined considering absolute log2 FC (freCLIP-seq analysis) > 1 and absolute delta PSI (RNA-seq analysis) > 10%.
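For two P-values, Fisher's method has a closed form: the statistic X = −2(ln p1 + ln p2) follows a chi-square distribution with 4 degrees of freedom, whose upper tail is exp(−X/2)(1 + X/2). The sketch below illustrates this combination and the four binding-splicing classes. It is a standalone illustration rather than the metaseqR implementation, and it assumes that log2 FC and delta PSI are both signed as mutant relative to WT, with delta PSI expressed as a fraction.

```python
import math


def fisher_combined_p(p_binding: float, p_splicing: float) -> float:
    """Combine two P-values with Fisher's method (chi-square, 4 degrees of freedom)."""
    if not (0 < p_binding <= 1 and 0 < p_splicing <= 1):
        raise ValueError("P-values must lie in (0, 1]")
    x = -2.0 * (math.log(p_binding) + math.log(p_splicing))
    # Closed-form upper tail of the chi-square distribution with 4 df.
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def binding_splicing_class(log2_fc: float, delta_psi: float):
    """Assign one of the four classes, or None if either threshold is not met."""
    if abs(log2_fc) <= 1 or abs(delta_psi) <= 0.10:
        return None
    binding = ">binding" if log2_fc > 0 else "<binding"
    inclusion = ">inclusion" if delta_psi > 0 else "<inclusion"
    return f"{binding};{inclusion}"


combined = fisher_combined_p(0.05, 0.05)
example_class = binding_splicing_class(1.5, -0.2)
```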
Functional annotation enrichment analysis
Pathway enrichment for junctions affected by differential binding (freCLIP-seq), differential splicing (RNA-seq) and differential binding-splicing (integrative analysis) was evaluated using enrichR package (Kuleshov et al., 2016) considering all the available databases.
TL-seq analysis
Filtering and alignment to the human GRCh38 genome version 26 (Ensembl 88) were performed essentially as previously described (Schofield et al., 2018). Briefly, reads were trimmed of adapter sequences with Cutadapt v1.16 (Martin, 2011) and aligned to the GRCh38 genome with HISAT2 (Kim et al., 2019) with default parameters and --mp 4,2. Reads aligning to annotated transcripts were quantified with HTSeq (Anders et al., 2015) htseq-count. SAMtools v1.5 (Li et al., 2009) was used to collect only uniquely mapped read pairs (SAM flag = 83/163 or 99/147). T-to-C mutations were identified through a customized pipeline (https://bitbucket.org/mattsimon9/timelapse_pipeline/src/master/). Only mutations with base quality score greater than 40 that were at least 3 nucleotides from the read’s end were counted. Sites of likely single-nucleotide polymorphisms (SNPs) and alignment artefacts (identified with bcftools) and sites of high mutation levels in the non-s4U treated controls (binomial likelihood of observation p < 0.05) were not considered in mutation calling. Normalization scale factors were calculated with edgeR (Robinson et al., 2010) using calcNormFactors (method = ‘upperquartile’). Browser tracks were made using STAR v2.5.3a and visualized in IGV (Robinson et al., 2011).
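The mutation-counting filters described above can be illustrated on a single gap-free alignment. This is a simplified sketch and not the cited pipeline: it assumes the read is already oriented to the transcribed strand (so that labeling appears as T-to-C rather than A-to-G), ignores indels and soft-clipping, and interprets "at least 3 nucleotides from the read's end" as excluding the first and last three bases. All names are hypothetical.

```python
# SAM flags of the properly paired, uniquely mapped read pairs retained in the text.
PROPER_PAIR_FLAGS = {83, 163, 99, 147}


def count_t_to_c(ref: str, read: str, quals, min_qual: int = 40, end_margin: int = 3) -> int:
    """Count T-to-C mismatches that pass the base-quality and read-end filters."""
    if not (len(ref) == len(read) == len(quals)):
        raise ValueError("reference, read and qualities must have equal length")
    count = 0
    for i, (ref_base, read_base, qual) in enumerate(zip(ref, read, quals)):
        if i < end_margin or i >= len(read) - end_margin:
            continue  # too close to a read end
        if ref_base == "T" and read_base == "C" and qual > min_qual:
            count += 1
    return count


# Toy alignment (hypothetical): candidate mismatches at positions 0, 3, 5 and 9.
ref_seq = "TTTTTTTTTT"
read_seq = "CTTCTCTTTC"
qualities = [41, 41, 41, 41, 41, 30, 41, 41, 41, 41]

n_mutations = count_t_to_c(ref_seq, read_seq, qualities)
```

In this toy case only position 3 is counted: positions 0 and 9 fall within the end margin and position 5 fails the quality threshold.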
Changes in expression between WT and S34F or Q157R were evaluated with DESeq2 (Love et al., 2014), and genes with absolute log2 FC > 0.75 and P-value < 0.05 were considered significant. RNA kinetic parameters were estimated with a Bayesian hierarchical modeling approach using RStan software v2.19.3 (https://mc-stan.org/rstan) as previously reported (Schofield et al., 2018). We extracted the 80% credible interval of the fraction of change in total RNA attributed to degradation (fracdeg) to identify genes whose change could be attributed primarily to changes in stability (kdeg) or synthesis (ksyn). If the 80% credible interval does not overlap 0.5, the change in the gene is confidently attributed to stability (fracdeg > 0.5) or synthesis (fracdeg < 0.5). TL-seq classes combine expression and kinetic changes and are defined according to the following parameters: stabilized or induced genes, log2 FC > 0.75 and fracdeg > 0.5 or < 0.5, respectively; destabilized or shutdown genes, log2 FC < −0.75 and fracdeg > 0.5 or < 0.5, respectively.
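The class definitions above amount to a small decision rule, sketched here for illustration. The function name, its arguments and the labels "unchanged" and "unresolved" are hypothetical; the thresholds are those stated in the text, and the interval bounds are assumed to come from the posterior of fracdeg.

```python
def tlseq_class(log2_fc, p_value, fracdeg_low, fracdeg_high,
                fc_cutoff=0.75, alpha=0.05):
    """Classify a gene from its expression change and fracdeg interval.

    fracdeg_low, fracdeg_high: bounds of the 80% interval of the fraction
    of the change in total RNA attributed to degradation.
    """
    if p_value >= alpha or abs(log2_fc) <= fc_cutoff:
        return "unchanged"  # hypothetical label: not significantly changed
    if fracdeg_low > 0.5:
        driver = "stability"   # interval lies entirely above 0.5
    elif fracdeg_high < 0.5:
        driver = "synthesis"   # interval lies entirely below 0.5
    else:
        return "unresolved"    # hypothetical label: interval overlaps 0.5
    if log2_fc > 0:
        return "stabilized" if driver == "stability" else "induced"
    return "destabilized" if driver == "stability" else "shutdown"
```

A gene with log2 FC of 1.2, P-value of 0.01 and a fracdeg interval of 0.6 to 0.9 would be called "stabilized" by this rule.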
Mass spectrometry data analysis
Raw MS data were analyzed with MaxQuant software v2.0.3 using default settings and Label Free Quantification (LFQ). Data processing and differential analysis for peptide intensities were performed with the DEqMS Bioconductor package (Zhu et al., 2020).
IF image analysis
Images were analyzed using IMARIS v9.6 (Oxford Instruments). Specifically, ImarisCell was used to identify, segment, measure, and analyze cell, nucleus and vesicles (stress granules) in 3D. Nuclei were identified based on intensity, using automatic thresholding settings. Stress granules from HEL samples were identified by first using an estimated diameter of 1 μm, and then refining the selection with mean intensity, quality settings set to automatic and intensity standard deviation setting to select voxels above 22.0 intensity units. Stress granules from primary samples were identified by first using an estimated diameter of 0.6 μm, and then refining the selection with mean intensity and voxel number settings of 25.0 and 46.0 respectively.
scRNA-seq analysis
Single cell expression measurements (digital counts) were obtained from raw sequence data (FASTQ) with the 10x Genomics Cell Ranger pipeline v4.0.0 (default settings, GRCh38 reference) (Zheng et al., 2017). A total of 4,271 cells were sequenced, with an average of 115,000 reads and 3,700 genes per cell. Cell-variant assignment, based on U2AF1 S34F mutation calling at the 21:43104346–43104346 locus, was performed with VarTrix v1.1.19 (https://github.com/10XGenomics/vartrix). Single cell expression data analysis was performed with Seurat v4.0.1 (Hao et al., 2021) following the “sctransform” analysis workflow, with default parameters. In short, raw counts were normalized with the “SCT” method. Cells from different patients were integrated with the “PrepSCTIntegration”, “FindIntegrationAnchors” and “IntegrateData” functions in Seurat. After Principal Component Analysis, cells were visualized with the UMAP dimension reduction technique. For each cell, the SG-expression score was calculated with the “AddModuleScore” function of Seurat (nbin=10), based on a signature of 149 genes enriched in stress granules and differentially bound and spliced by mutant U2AF1.
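The idea behind a module score of this kind can be sketched as follows: genes are binned by average expression, control genes are drawn from the same bins as the signature genes, and each cell's score is the mean signature expression minus the mean control expression. This is a simplified illustration in Python, not Seurat's code; the function name, the dictionary input format, the small `ctrl` default and the equal-size rank binning are assumptions made for the example.

```python
import random


def module_score(expr, signature, nbin=10, ctrl=5, seed=0):
    """Per-cell signature score relative to expression-matched controls.

    expr: dict mapping gene name -> list of per-cell expression values.
    signature: iterable of gene names; genes absent from expr are ignored.
    """
    rng = random.Random(seed)
    n_cells = len(next(iter(expr.values())))
    # Rank genes by average expression and split them into nbin bins.
    ranked = sorted(expr, key=lambda g: sum(expr[g]) / n_cells)
    bin_of = {g: i * nbin // len(ranked) for i, g in enumerate(ranked)}
    members = {}
    for g in ranked:
        members.setdefault(bin_of[g], []).append(g)
    sig = [g for g in signature if g in expr]
    if not sig:
        raise ValueError("no signature gene found in the expression data")
    sig_set = set(sig)
    # Draw control genes from the same expression bin as each signature gene.
    controls = set()
    for g in sig:
        pool = [c for c in members[bin_of[g]] if c not in sig_set]
        controls.update(rng.sample(pool, min(ctrl, len(pool))))
    controls = sorted(controls)

    def mean_expr(genes, cell):
        return sum(expr[g][cell] for g in genes) / len(genes) if genes else 0.0

    return [mean_expr(sig, j) - mean_expr(controls, j) for j in range(n_cells)]
```

Subtracting expression-matched controls keeps the score from simply tracking overall expression level or sequencing depth in a given cell.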
Quantification and statistical analysis
Quantification of gels and blots was performed with ImageJ. Statistical analyses were performed in R (https://www.r-project.org). The number of replicates and the statistical tests performed are defined in the figure legends. P-values < 0.05 were considered statistically significant and are indicated within figure panels.
Supplementary Material
Figures S1–S6
Table S1 related to Figure 1. eCLIP-seq metrics.
Table S2 related to Figure 2. Alternative splicing events comparing S34F vs WT and Q157R vs WT U2AF1 HEL cells, after exclusion of dox-dependent events.
Table S3 related to Figures 2–S2. Published datasets on U2AF1 mutation-dependent alternative splicing.
Table S4 related to Figures 2–S2. Comparative analysis of differentially spliced genes in U2AF1 S34F/Y and Q157R/P conditions in HEL RNA-seq data and in published datasets.
Table S5 related to Figure 3. Differential binding events comparing S34F vs WT and Q157R vs WT U2AF1 HEL cells.
Table S6 related to Figure 3. Combined analysis of differentially bound and spliced genes comparing S34F vs WT and Q157R vs WT U2AF1 HEL cells.
Table S7 related to Figure 4 and S4. Functional annotation enrichment analysis of differentially spliced genes.
Table S8 related to Figure 4 and S4. Functional annotation enrichment analysis of differentially bound genes.
Table S9 related to Figure 4. Functional annotation enrichment analysis of differentially bound and spliced genes.
Table S10 related to Figure 4. Published datasets on stress granule proteome and transcriptome.
Table S11 related to Figure 5. IF imaging metrics for U2AF1 WT, S34F and Q157R HEL samples.
Table S12 related to Figure 6. Patient characteristics.
Table S13 related to Figure 6. IF imaging metrics for U2AF1 WT and S34F AML primary samples.
File S1 related to Figure 5. Z-stack slice movie for the representative U2AF1 S34F dox+ars image reported in Figure 5A.
File S2 related to Figure 5. Z-stack slice movie for the representative U2AF1 Q157R dox+ars image reported in Figure 5A.
KEY RESOURCES TABLE.
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | ||
| Rabbit polyclonal anti-U2AF1 | Bethyl Laboratories | Cat#A302-079; RRID: AB_1604295 |
| Mouse monoclonal anti-FLAG (clone M2) | Sigma-Aldrich | Cat#F1804; RRID: AB_262044 |
| Rabbit monoclonal anti-G3BP1 | Abcam | Cat#ab181149 |
| Mouse monoclonal anti-HSP90 | StressMarq Biosciences | Cat#SMC-107B; RRID: AB_854214 |
| Rabbit polyclonal anti-GAPDH (FL-335) | Santa Cruz Biotechnology | Cat#sc-25778; RRID: AB_10167668 |
| Mouse monoclonal anti-U2AF2 | Sigma-Aldrich | Cat#U4758; RRID: AB_262122 |
| Donkey anti-rabbit IgG Alexa Fluor 555 conjugate | Abcam | Cat#ab150062; RRID: AB_2801638 |
| Pacific Blue anti-human CD34 | BioLegend | Cat#343512; RRID: AB_1877197 |
| Biological Samples | ||
| MDS/AML patient samples | Yale Hematology Tissue Bank | N/A |
| Chemicals, Peptides, and Recombinant Proteins | ||
| DMEM | ThermoFisher SCIENTIFIC | Cat#11965092 |
| FBS | Gemini Bio-Products | Cat#100-106 |
| RPMI 1640 | ThermoFisher SCIENTIFIC | Cat#11875093 |
| Tetracycline-negative FBS | ThermoFisher SCIENTIFIC | Cat#100-800 |
| L-glutamine | ThermoFisher SCIENTIFIC | Cat#25030081 |
| Penicillin-Streptomycin | ThermoFisher SCIENTIFIC | Cat#15140122 |
| StemSpan SFEM | STEMCELL Technologies | Cat#09650 |
| Human FLT-3 Ligand | Gemini Bio-Products | Cat#300-118P |
| Human Stem Cell Factor (SCF) | Gemini Bio-Products | Cat#300-185P |
| Human Thrombopoietin (TPO) | Gemini Bio-Products | Cat#300-188P |
| Human Interleukin-3 (IL-3) | Gemini Bio-Products | Cat#300-151P |
| Human Interleukin-6 (IL-6) | Gemini Bio-Products | Cat#300-155P |
| Polybrene Infection Reagent | Sigma-Aldrich | Cat#TR-1003-G |
| Doxycycline | Sigma-Aldrich | Cat#D3447; CAS: 10592-13-9 |
| RNase I | ThermoFisher SCIENTIFIC | Cat#AM2295 |
| Dynabeads Protein G | ThermoFisher SCIENTIFIC | Cat#10004D |
| ATP, [γ-32P] | PerkinElmer | Cat#BLU502Z250UC |
| 4-Thiouridine | Alfa Aesar | Cat#AAJ60679MC; CAS: 13957-31-8 |
| TRIzol Reagent | ThermoFisher SCIENTIFIC | Cat#15596-018 |
| TURBO DNase | ThermoFisher SCIENTIFIC | Cat#AM2239 |
| Agencourt RNAClean XP beads | Beckman Coulter | Cat#A63987 |
| Sodium (meta)arsenite | Sigma-Aldrich | Cat#S7400; CAS: 7784-46-5 |
| Donkey serum | Abcam | Cat#ab7475 |
| ProLong Gold Antifade Mountant with DAPI | ThermoFisher SCIENTIFIC | Cat#P36935 |
| ISRIB | Sigma-Aldrich | Cat#SML0843; CAS: 1597403-47-8 |
| 7-AAD | STEMCELL Technologies | Cat#75001; CAS: 7240-37-1 |
| Critical Commercial Assays | ||
| RNeasy Mini kit | QIAGEN | Cat#74104 |
| iScript cDNA Synthesis Kit | BIO-RAD | Cat#1708890 |
| QIAquick PCR purification kit | QIAGEN | Cat#28104 |
| SuperSignal West Femto Maximum Sensitivity Substrate | ThermoFisher SCIENTIFIC | Cat#34095 |
| RNA Clean & Concentrator-5 kit | Zymo Research | Cat#R1016 |
| KAPA RNA HyperPrep Kit with RiboErase (HMR) | Kapa Biosystems | Cat#KK8560 |
| SMARTer Stranded Total RNA-Seq kit v2 - Pico Input Mammalian | Takara Bio USA | Cat#634413 |
| WST-1 assay kit | Abcam | Cat#ab65475 |
| Chromium Next GEM Single Cell 5' kit v2 | 10x Genomics | Cat#PN-1000265 |
| Deposited Data | ||
| Sequencing data | This paper | GEO: GSE195620 |
| eCLIP-seq analysis code | This paper | GitHub: https://github.com/TebaldiLab/eCLIP_seq |
| Original Images | This paper | Mendeley Data: DOI: 10.17632/f3xhcbyn4b.1 |
| Experimental Models: Cell Lines | ||
| Human: 293FT | ThermoFisher SCIENTIFIC | R70007 |
| Human: HEL | ATCC | TIB-180 |
| Oligonucleotides | ||
| U2AF1 cDNA PCR: Forward: GGCACCGAGAAAGACAAAGT | This paper | N/A |
| U2AF1 cDNA PCR: Reverse: CTCTGGAAATGGGCTTCAAA | This paper | N/A |
| MED24, KMT2D and BCOR primers (Figure S2B) | Nguyen et al., 2018 | N/A |
| Recombinant DNA | ||
| CS-TRE-U2AF1-PRE-Ubc-tTA-I2G plasmids | Yoshida et al., 2011; Yamaguchi et al., 2012 | N/A |
| psPAX2 | Addgene | Cat#12260 |
| VSV.G | Addgene | Cat#14888 |
| Software and Algorithms | ||
| SnapGene | Insightful Science | https://www.snapgene.com |
| ImageJ | NIH | https://imagej.nih.gov/ij/ |
| FlowJo | BD Biosciences | https://www.flowjo.com |
| FastUniq | Xu et al., 2012 | https://sourceforge.net/projects/fastuniq/ |
| Cutadapt | Martin, 2011 | https://cutadapt.readthedocs.io/en/stable/ |
| FastQC | N/A | https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ |
| STAR v2.7.0f | Dobin et al., 2013 | https://github.com/alexdobin/STAR/releases |
| bedtools | Quinlan and Hall, 2010 | https://github.com/arq5x/bedtools2 |
| edgeR | Robinson et al., 2010 | https://bioconductor.org/packages/release/bioc/html/edgeR.html |
| rMATS v4.0.2 | Shen et al., 2014 | http://rnaseq-mats.sourceforge.net |
| metaseqR | Moulos and Hatzis, 2015 | https://bioconductor.org/packages/3.8/bioc/html/metaseqR.html |
| HISAT2 | Kim et al., 2019 | https://daehwankimlab.github.io/hisat2/ |
| HTSeq | Anders et al., 2015 | https://github.com/htseq |
| SAMtools v1.5 | Li et al., 2009 | http://samtools.sourceforge.net/ |
| IGV | Robinson et al., 2011 | https://software.broadinstitute.org/software/igv/ |
| DESeq2 | Love et al., 2014 | https://bioconductor.org/packages/DESeq2/ |
| RStan v2.19.3 | N/A | https://mc-stan.org/rstan |
| IMARIS v9.6 | Oxford Instruments | https://imaris.oxinst.com |
| Cell Ranger v4.0.0 | Zheng et al., 2017 | https://github.com/10XGenomics/cellranger |
| VarTrix v1.1.19 | 10x Genomics | https://github.com/10XGenomics/vartrix |
| Seurat v4.0.1 | Hao et al., 2021 | https://satijalab.org/seurat/ |
Highlights.
freCLIP-seq dissects in vivo U2AF1 RNA binding at single-nucleotide resolution
U2AF1 mutations create de novo 3’ splice site contacts that alter RNA splicing
Binding and splicing integration uncovers alterations in stress granule components
U2AF1-mutant MDS/AML cells exhibit enhanced stress granule response
Acknowledgements
We thank all our patients and all clinical staff for their help with patient recruitment. We thank Christopher Castaldi and the Yale Center for Genome Analysis, the Yale Center for Research Computing, the MS and Proteomics Core Facility at Department CIBIO University of Trento, Thomas Ardito for Leica TCS SP5 training, Lesley Devine for flow cytometry guidance, Michael Bronson and Guilin Wang for scRNA-seq guidance, Martina Cusan and Wei Liu for IF advice, and Diane Krause, Clara Kielkopf and Manoj Pillai for helpful suggestions. We also thank the Keck MS & Proteomics Resource at Yale University for mass spectrometry data collected on the Q-Exactive HF-X (funded by the NIH SIG: S10OD02365101A1). This study was funded in part by the DeLuca Center for Innovation in Hematology Research at Yale Cancer Center and The Frederick A. DeLuca Foundation, the Edward P. Evans Foundation, the NIH/NIDDK R01DK102792, the YCCC pilot grant and the State of Connecticut under the Regenerative Medicine Research Fund (to S.H.). T.H. was sponsored by an MD fellowship from the Boehringer Ingelheim Fonds. Y.S. was supported by the General Program of the National Natural Science Foundation of China (grant no. 82170137). G.V. was supported by CNR Short Term Mobility 2018. M.D.S. was supported by NIGMS R01GM137117. T.T. was supported by a pilot grant from the Yale Cooperative Center of Excellence in Hematology (YCCEH) (NIDDK U54DK106857) and by AIRC under MFAG 2020 (ID. 24883 project). G.B. was supported by the NIH/NIDDK Cooperative Centers of Excellence in Hematology Pilot & Feasibility Award U24DK126127.
Footnotes
Declaration of interests
S.H., Consultancy, Forma Therapeutics. M.D.S., inventor on a patent application related to nucleotide recoding. G.V., scientific advisor of IMMAGINA Biotechnology s.r.l.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
References
- Agrawal AA, Salsi E, Chatrikhi R, Henderson S, Jenkins JL, Green MR, Ermolenko DN, and Kielkopf CL (2016). An extended U2AF65–RNA-binding domain recognizes the 3′ splice site signal. Nat. Commun. 7, 10950.
- Anczuków O, and Krainer AR (2016). Splicing-factor alterations in cancers. RNA 22, 1285–1301.
- Anders S, Pyl PT, and Huber W (2015). HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169.
- Anderson P, Kedersha N, and Ivanov P (2015). Stress granules, P-bodies and cancer. Biochim. Biophys. Acta - Gene Regul. Mech. 1849, 861–870.
- Bamopoulos SA, Batcha AMN, Jurinovic V, Rothenberg-Thurley M, Janke H, Ksienzyk B, Philippou-Massier J, Graf A, Krebs S, Blum H, et al. (2020). Clinical presentation and differential splicing of SRSF2, U2AF1 and SF3B1 mutations in patients with acute myeloid leukemia. Leukemia 34, 2621–2634.
- Bonnal SC, López-Oreja I, and Valcárcel J (2020). Roles and mechanisms of alternative splicing in cancer — implications for care. Nat. Rev. Clin. Oncol. 17, 457–474.
- Brooks AN, Choi PS, de Waal L, Sharifnia T, Imielinski M, Saksena G, Pedamallu CS, Sivachenko A, Rosenberg M, Chmielecki J, et al. (2014). A Pan-Cancer Analysis of Transcriptome Changes Associated with Somatic Mutations in U2AF1 Reveals Commonly Altered Splicing Events. PLoS One 9, e87361.
- Chatrikhi R, Feeney CF, Pulvino MJ, Alachouzos G, MacRae AJ, Falls Z, Rai S, Brennessel WW, Jenkins JL, Walter MJ, et al. (2021). A synthetic small molecule stalls pre-mRNA splicing by promoting an early-stage U2AF2-RNA complex. Cell Chem. Biol.
- Courchaine EM, Lu A, and Neugebauer KM (2016). Droplet organelles? EMBO J. 35, 1603–1612.
- Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, and Gingeras TR (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21.
- Döhner H, Weisdorf DJ, and Bloomfield CD (2015). Acute Myeloid Leukemia. N. Engl. J. Med. 373, 1136–1152.
- Dvinge H, Kim E, Abdel-Wahab O, and Bradley RK (2016). RNA splicing factors as oncoproteins and tumour suppressors. Nat. Rev. Cancer 16, 413–430.
- Esfahani MS, Lee LJ, Jeon Y-J, Flynn RA, Stehr H, Hui AB, Ishisoko N, Kildebeck E, Newman AM, Bratman SV, et al. (2019). Functional significance of U2AF1 S34F mutations in lung adenocarcinomas. Nat. Commun. 10, 5712.
- Fei DL, Motowski H, Chatrikhi R, Prasad S, Yu J, Gao S, Kielkopf CL, Bradley RK, and Varmus H (2016). Wild-Type U2AF1 Antagonizes the Splicing Program Characteristic of U2AF1-Mutant Tumors and Is Required for Cell Survival. PLoS Genet. 12, e1006384.
- Fong JY, Pignata L, Goy P-A, Kawabata KC, Lee SC-W, Koh CM, Musiani D, Massignani E, Kotini AG, Penson A, et al. (2019). Therapeutic Targeting of RNA Splicing Catalysis through Inhibition of Protein Arginine Methylation. Cancer Cell 36, 194–209.e9.
- Gao X, Jiang L, Gong Y, Chen X, Ying M, Zhu H, He Q, Yang B, and Cao J (2019). Stress granule: A promising target for cancer treatment. Br. J. Pharmacol. 176, 4421–4433.
- Grabocka E, and Bar-Sagi D (2016). Mutant KRAS Enhances Tumor Cell Fitness by Upregulating Stress Granules. Cell 167, 1803–1813.e12.
- Graubert TA, Shen D, Ding L, Okeyo-Owuor T, Lunn CL, Shao J, Krysiak K, Harris CC, Koboldt DC, Larson DE, et al. (2011). Recurrent mutations in the U2AF1 splicing factor in myelodysplastic syndromes. Nat. Genet. 44, 53–57.
- Guth S, Tange T, Kellenberger E, and Valcarcel J (2001). Dual function for U2AF(35) in AG-dependent pre-mRNA splicing. Mol. Cell Biol. 21, 7673–81.
- Hao Y, Hao S, Andersen-Nissen E, Mauck WM, Zheng S, Butler A, Lee MJ, Wilk AJ, Darby C, Zager M, et al. (2021). Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29.
- Hellström-Lindberg E, Tobiasson M, and Greenberg P (2020). Myelodysplastic syndromes: moving towards personalized management. Haematologica 105, 1765–1779.
- Ilagan JO, Ramakrishnan A, Hayes B, Murphy ME, Zebari AS, Bradley P, and Bradley RK (2015). U2AF1 mutations alter splice site recognition in hematological malignancies. Genome Res. 25, 14–26.
- Ivanov P, Kedersha N, and Anderson P (2019). Stress granules and processing bodies in translational control. Cold Spring Harb. Perspect. Biol. 11.
- Jain S, Wheeler JR, Walters RW, Agrawal A, Barsic A, and Parker R (2016). ATPase-Modulated Stress Granules Contain a Diverse Proteome and Substructure. Cell 164, 487–498.
- Kedersha N, and Anderson P (2007). Mammalian Stress Granules and Processing Bodies. In Translation Initiation: Cell Biology, High-Throughput Methods, and Chemical-Based Approaches (Academic Press), pp. 61–81.
- Kennedy JA, and Ebert BL (2017). Clinical implications of Genetic mutations in Myelodysplastic syndrome. J. Clin. Oncol. 35, 968–974.
- Khong A, Matheny T, Jain S, Mitchell SF, Wheeler JR, and Parker R (2017). The Stress Granule Transcriptome Reveals Principles of mRNA Accumulation in Stress Granules. Mol. Cell 68, 808–820.e5.
- Kielkopf CL, Rodionova NA, Green MR, and Burley SK (2001). A novel peptide recognition mode revealed by the X-ray structure of a core U2AF35/U2AF65 heterodimer. Cell 106, 595–605.
- Kim D, Paggi JM, Park C, Bennett C, and Salzberg SL (2019). Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915.
- Kim S, Park C, Jun Y, Lee S, Jung Y, and Kim J (2018). Integrative Profiling of Alternative Splicing Induced by U2AF1 S34F Mutation in Lung Adenocarcinoma Reveals a Mechanistic Link to Mitotic Stress. Mol. Cells 41, 733–741.
- Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A, et al. (2016). Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97.
- Lee SC-W, Dvinge H, Kim E, Cho H, Micol J-B, Chung YR, Durham BH, Yoshimi A, Kim YJ, Thomas M, et al. (2016). Modulation of splicing catalysis for therapeutic targeting of leukemia with mutations in genes encoding spliceosomal proteins. Nat. Med. 22, 672–678.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, and Durbin R (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079.
- Love MI, Huber W, and Anders S (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550.
- Markmiller S, Soltanieh S, Server KL, Mak R, Jin W, Fang MY, Luo E-C, Krach F, Yang D, Sen A, et al. (2018). Context-Dependent and Disease-Specific Diversity in Protein Interactions within Stress Granules. Cell 172, 590–604.e13.
- Marmor-Kollet H, Siany A, Kedersha N, Knafo N, Rivkin N, Danino YM, Moens TG, Olender T, Sheban D, Cohen N, et al. (2020). Spatiotemporal Proteomic Analysis of Stress Granule Disassembly Using APEX Reveals Regulation by SUMOylation and Links to ALS Pathogenesis. Mol. Cell 80, 876–891.e6.
- Martin M (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.Journal 17, 10.
- Matheny T, Van Treeck B, Huynh TN, and Parker R (2021). RNA partitioning into stress granules is based on the summation of multiple interactions. RNA 27, 174–189.
- Merendino L, Guth S, Bilbao D, Martínez C, and Valcárcel J (1999). Inhibition of msl-2 splicing by Sex-lethal reveals interaction between U2AF35 and the 3′ splice site AG. Nature 402, 838–841.
- Motta-Mena LB, Heyd F, and Lynch KW (2010). Context-Dependent Regulatory Mechanism of the Splicing Factor hnRNP L. Mol. Cell 37, 223–234.
- Moulos P, and Hatzis P (2015). Systematic integration of RNA-Seq statistical algorithms for accurate detection of differential gene expression patterns. Nucleic Acids Res. 43, e25–e25.
- Nguyen HD, Leong WY, Li W, Reddy PNG, Sullivan JD, Walter MJ, Zou L, and Graubert TA (2018). Spliceosome Mutations Induce R Loop-Associated Sensitivity to ATR Inhibition in Myelodysplastic Syndromes. Cancer Res. 78, 5363–5374.
- Van Nostrand EL, Pratt GA, Shishkin AA, Gelboin-Burkhart C, Fang MY, Sundararaman B, Blue SM, Nguyen TB, Surka C, Elkins K, et al. (2016). Robust transcriptome-wide discovery of RNA-binding protein binding sites with enhanced CLIP (eCLIP). Nat. Methods 13, 508–514.
- Van Nostrand EL, Gelboin-Burkhart C, Wang R, Pratt GA, Blue SM, and Yeo GW (2017). CRISPR/Cas9-mediated integration enables TAG-eCLIP of endogenously tagged RNA binding proteins. Methods 118–119, 50–59.
- Okeyo-Owuor T, White BS, Chatrikhi R, Mohan DR, Kim S, Griffith M, Ding L, Ketkar-Kulkarni S, Hundal J, Laird KM, et al. (2015). U2AF1 mutations alter sequence specificity of pre-mRNA binding and splicing. Leukemia 29, 909–917.
- Palangat M, Anastasakis DG, Fei DL, Lindblad KE, Bradley R, Hourigan CS, Hafner M, and Larson DR (2019). The splicing factor U2AF1 contributes to cancer progression through a noncanonical role in translation regulation. Genes Dev. 33, 482–497.
- Papaemmanuil E, Gerstung M, Malcovati L, Tauro S, Gundem G, Van Loo P, Yoon CJ, Ellis P, Wedge DC, Pellagatti A, et al. (2013). Clinical and biological implications of driver mutations in myelodysplastic syndromes. Blood 122, 3616–3627.
- Pellagatti A, Armstrong RN, Steeples V, Sharma E, Repapi E, Singh S, Sanchi A, Radujkovic A, Horn P, Dolatshad H, et al. (2018). Impact of spliceosome mutations on RNA splicing in myelodysplasia: dysregulated genes/pathways and clinical associations. Blood 132, 1225–1240.
- Protter DSW, and Parker R (2016). Principles and Properties of Stress Granules. Trends Cell Biol. 26, 668–679.
- Przychodzen B, Jerez A, Guinta K, Sekeres MA, Padgett R, Maciejewski JP, and Makishima H (2013). Patterns of missplicing due to somatic U2AF1 mutations in myeloid neoplasms. Blood 122, 999–1006.
- Quinlan AR, and Hall IM (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842.
- Rabouw HH, Langereis MA, Anand AA, Visser LJ, de Groot RJ, Walter P, and van Kuppeveld FJM (2019). Small molecule ISRIB suppresses the integrated stress response within a defined window of activation. Proc. Natl. Acad. Sci. 116, 2097–2102.
- Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, and Mesirov JP (2011). Integrative genomics viewer. Nat. Biotechnol. 29, 24–26.
- Robinson MD, McCarthy DJ, and Smyth GK (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140.
- Saez B, Walter MJ, and Graubert TA (2017). Splicing factor gene mutations in hematologic malignancies. Blood 129, 1260–1269.
- Schofield JA, Duffy EE, Kiefer L, Sullivan MC, and Simon MD (2018). TimeLapse-seq: adding a temporal dimension to RNA sequencing through nucleoside recoding. Nat. Methods 15, 221–225.
- Shallis RM, Wang R, Davidoff A, Ma X, and Zeidan AM (2019). Epidemiology of acute myeloid leukemia: Recent progress and enduring challenges. Blood Rev. 36, 70–87.
- Shao C, Yang B, Wu T, Huang J, Tang P, Zhou Y, Zhou J, Qiu J, Jiang L, Li H, et al. (2014). Mechanisms for U2AF to define 3′ splice sites and regulate alternative splicing in the human genome. Nat. Struct. Mol. Biol. 21, 997–1005.
- Shen S, Park JW, Lu Z, Lin L, Henry MD, Wu YN, Zhou Q, and Xing Y (2014). rMATS: Robust and flexible detection of differential alternative splicing from replicate RNA-Seq data. Proc. Natl. Acad. Sci. 111, E5593–E5601.
- Shirai CL, Ley JN, White BS, Kim S, Tibbitts J, Shao J, Ndonwi M, Wadugu B, Duncavage EJ, Okeyo-Owuor T, et al. (2015). Mutant U2AF1 Expression Alters Hematopoiesis and Pre-mRNA Splicing In Vivo. Cancer Cell 27, 631–643.
- Shirai CL, White BS, Tripathi M, Tapia R, Ley JN, Ndonwi M, Kim S, Shao J, Carver A, Saez B, et al. (2017). Mutant U2AF1-expressing cells are sensitive to pharmacological modulation of the spliceosome. Nat. Commun. 8.
- Sidrauski C, Acosta-Alvear D, Khoutorsky A, Vedantham P, Hearn BR, Li H, Gamache K, Gallagher CM, Ang KK-H, Wilson C, et al. (2013). Pharmacological brake-release of mRNA translation enhances cognitive memory. Elife 2, e00498.
- Song M-S, and Grabocka E (2020). Stress Granules in Cancer. (Berlin, Heidelberg: Springer Berlin Heidelberg), pp. 1–28.
- Steensma DP (2015). Myelodysplastic Syndromes: Diagnosis and Treatment. Mayo Clin. Proc. 90, 969–983.
- Van Treeck B, and Parker R (2018). Emerging Roles for Intermolecular RNA-RNA Interactions in RNP Assemblies. Cell 174, 791–802.
- Van Treeck B, Protter DSW, Matheny T, Khong A, Link CD, and Parker R (2018). RNA self-assembly contributes to stress granule formation and defining the stress granule transcriptome. Proc. Natl. Acad. Sci. 115, 2734–2739.
- Visconte V, Nakashima MO, and Rogers HJ (2019). Mutations in Splicing Factor Genes in Myeloid Malignancies: Significance and Impact on Clinical Features. Cancers (Basel). 11, 1844.
- Wang E, and Aifantis I (2020). RNA Splicing and Cancer. Trends in Cancer 6, 631–644.
- Wang E, Lu SX, Pastore A, Chen X, Imig J, Chun-Wei Lee S, Hockemeyer K, Ghebrechristos YE, Yoshimi A, Inoue D, et al. (2019). Targeting an RNA-Binding Protein Network in Acute Myeloid Leukemia. Cancer Cell 35, 369–384.e7.
- Wang F, Li J, Fan S, Jin Z, and Huang C (2020). Targeting stress granules: A novel therapeutic strategy for human diseases. Pharmacol. Res. 161, 105143.
- Warnasooriya C, Feeney CF, Laird KM, Ermolenko DN, and Kielkopf CL (2020). A splice site-sensing conformational switch in U2AF2 is modulated by U2AF1 and its recurrent myelodysplasia-associated mutation. Nucleic Acids Res. 48, 5695–5709.
- Wu S, Romfo CM, Nilsen TW, and Green MR (1999). Functional recognition of the 3’ splice site AG by the splicing factor U2AF35. Nature 402, 832–835.
- Xu H, Luo X, Qian J, Pang X, Song J, Qian G, Chen J, and Chen S (2012). FastUniq: A Fast De Novo Duplicates Removal Tool for Paired Short Reads. PLoS One 7, e52249.
- Yamaguchi T, Hamanaka S, Kamiya A, Okabe M, Kawarai M, Wakiyama Y, Umino A, Hayama T, Sato H, Lee Y-S, et al. (2012). Development of an all-in-one inducible lentiviral vector for gene specific analysis of reprogramming. PLoS One 7, e41007.
- Yip BH, Steeples V, Repapi E, Armstrong RN, Llorian M, Roy S, Shaw J, Dolatshad H, Taylor S, Verma A, et al. (2017). The U2AF1S34F mutation induces lineage-specific splicing alterations in myelodysplastic syndromes. J. Clin. Invest. 127, 2206–2221.
- Yoshida H, Park S-Y, Oda T, Akiyoshi T, Sato M, Shirouzu M, Tsuda K, Kuwasako K, Unzai S, Muto Y, et al. (2015). A novel 3′ splice site recognition by the two zinc fingers in the U2AF small subunit. Genes Dev. 29, 1649–1660.
- Yoshida H, Park S-Y, Sakashita G, Nariai Y, Kuwasako K, Muto Y, Urano T, and Obayashi E (2020). Elucidation of the aberrant 3′ splice site selection by cancer-associated mutations on the U2AF1. Nat. Commun. 11, 4744.
- Yoshida K, Sanada M, Shiraishi Y, Nowak D, Nagata Y, Yamamoto R, Sato Y, Sato-Otsubo A, Kon A, Nagasaki M, et al. (2011). Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478, 64–69.
- Youn J-Y, Dunham WH, Hong SJ, Knight JDR, Bashkurov M, Chen GI, Bagci H, Rathod B, MacLeod G, Eng SWM, et al. (2018). High-Density Proximity Mapping Reveals the Subcellular Organization of mRNA-Associated Granules and Bodies. Mol. Cell 69, 517–532.e11.
- Zheng GXY, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB, Wheeler TD, McDermott GP, Zhu J, et al. (2017). Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049.
- Zhu Y, Orre LM, Zhou Tran Y, Mermelekas G, Johansson HJ, Malyutina A, Anders S, and Lehtiö J (2020). DEqMS: A Method for Accurate Variance Estimation in Differential Protein Expression Analysis. Mol. Cell. Proteomics 19, 1047–1057.
- Zorio DAR, and Blumenthal T (1999). Both subunits of U2AF recognize the 3′ splice site in Caenorhabditis elegans. Nature 402, 835–838.
- Zyryanova AF, Kashiwagi K, Rato C, Harding HP, Crespillo-Casado A, Perera LA, Sakamoto A, Nishimoto M, Yonemochi M, Shirouzu M, et al. (2021). ISRIB Blunts the Integrated Stress Response by Allosterically Antagonising the Inhibitory Effect of Phosphorylated eIF2 on eIF2B. Mol. Cell 81, 88–103.e6.
Associated Data
Supplementary Materials
Figures S1–S6
Table S1 related to Figure 1. eCLIP-seq metrics.
Table S2 related to Figure 2. Alternative splicing events comparing S34F vs WT and Q157R vs WT U2AF1 HEL cells, after exclusion of dox-dependent events.
Table S3 related to Figures 2 and S2. Published datasets on U2AF1 mutation-dependent alternative splicing.
Table S4 related to Figures 2 and S2. Comparative analysis of differentially spliced genes in U2AF1 S34F/Y and Q157R/P conditions in HEL RNA-seq data and in published datasets.
Table S5 related to Figure 3. Differential binding events comparing S34F vs WT and Q157R vs WT U2AF1 HEL cells.
Table S6 related to Figure 3. Combined analysis of differentially bound and spliced genes comparing S34F vs WT and Q157R vs WT U2AF1 HEL cells.
Table S7 related to Figures 4 and S4. Functional annotation enrichment analysis of differentially spliced genes.
Table S8 related to Figures 4 and S4. Functional annotation enrichment analysis of differentially bound genes.
Table S9 related to Figure 4. Functional annotation enrichment analysis of differentially bound and spliced genes.
Table S10 related to Figure 4. Published datasets on stress granule proteome and transcriptome.
Table S11 related to Figure 5. IF imaging metrics for U2AF1 WT, S34F and Q157R HEL samples.
Table S12 related to Figure 6. Patient characteristics.
Table S13 related to Figure 6. IF imaging metrics for U2AF1 WT and S34F AML primary samples.
File S1 related to Figure 5. Z-stack slice movie for the representative U2AF1 S34F dox+ars image reported in Figure 5A.
File S2 related to Figure 5. Z-stack slice movie for the representative U2AF1 Q157R dox+ars image reported in Figure 5A.
Data Availability Statement
Sequencing files generated from cell lines (eCLIP-seq, freCLIP-seq, RNA-seq, TL-seq) have been deposited in the GEO database and are available under the accession number GSE195620. Sequencing files generated from patient samples (scRNA-seq) are available upon request. Original blot, gel and autoradiogram images have been deposited in Mendeley Data at https://doi.org/10.17632/f3xhcbyn4b.1
eCLIP-seq analysis code is publicly available on GitHub (https://github.com/TebaldiLab/eCLIP_seq) as of the date of publication.
Any additional information required to reanalyze the data reported in this paper is available upon request from the lead contact.







