Abstract
During pre-mRNA splicing, the branch helix forms when U2 snRNP engages with introns to initiate spliceosome assembly. The branch helix is mutually exclusive with the U2 snRNA branchpoint-interacting stem loop (BSL). In yeast, BSL alteration affects branchpoint recognition, but its role in human cells, where branchpoint usage is more flexible, is unknown. To examine the impact of perturbing BSL base pairing, we used a self-contained orthogonal splicing system that pairs an engineered U2 snRNA and splicing reporter. Our results show that BSL mutations affect both U2 snRNA accumulation and splicing in human cells. We also examined the relationship between BSL stability and U2 snRNA complementarity to branchpoint sequence. The results indicate that pairing between the branchpoint sequence and BSL loop links branchpoint fidelity and intron-mediated unwinding of the BSL stem, which supports and extends a toehold-mediated strand invasion model of branch helix formation advanced by Pena and coworkers from cryo-EM structures. Finally, we investigated transcriptome-wide effects of expressing U2 snRNA with either a cancer-associated BSL mutation or with an altered branchpoint recognition sequence. Similarities in both splicing and gene expression changes between the mutants suggest a shared cellular response mechanism leading to gene upregulation linked to oncogenic pathways.
Graphical Abstract
Introduction
Pre-mRNA splicing removes introns from RNA Polymerase II gene transcripts, and recognition of short intronic sequences by splicing factors and the spliceosome is critical to the process. The 5′ splice site sequence defines the beginning of an intron, while the combined recognition of the 3′ splice site, polypyrimidine tract, and branchpoint sequence determines its end. The branchpoint sequence also defines the intron branchpoint adenosine required for splicing chemistry [1]. Recognition of the branchpoint sequence is mediated by base pairing with U2 small nuclear RNA (snRNA) [2–4]. In Saccharomyces cerevisiae, full complementarity to the U2 snRNA GUAGUA branchpoint recognition sequence (BPRS) is necessary for splicing, resulting in the strict consensus of UACUAAC flanking the branchpoint adenosine. Although the U2 snRNA sequence is highly conserved across eukaryotes, the requirement for continuous base pairing with the branchpoint sequence in many other species is relaxed. For example, the human branchpoint consensus is yUNAy [5, 6], suggesting that the mechanism of branchpoint sequence recognition differs to some extent from yeast to humans. Understanding the molecular mechanisms that underlie branchpoint sequence recognition is also relevant to human health because cancers frequently acquire somatic mutations in factors involved in the process, including U2 snRNA [7–11]. A detailed model of branchpoint recognition remains incomplete, especially in the context of human splicing, because of challenges in manipulating essential splicing factors and capturing all the relevant interactions involved.
Cryo-EM structures of both yeast and human spliceosomes show that branchpoint recognition culminates with formation of an extended branch helix between an intron and U2 snRNA nucleotides 33–45 (Fig. 1A) [12–19]. The branch helix pairs nucleotides of the U2 snRNA BPRS and intron branchpoint sequence and bulges the branchpoint adenosine, while nucleotides upstream of the branchpoint sequence continue to pair with U2 snRNA in an extended helix. The extended helix does not follow strict Watson–Crick base pairing rules and appears to be stabilized by base stacking and contacts with U2 snRNP proteins. Notably, the extended branch helix is mutually exclusive with an internal U2 snRNA structure termed the branchpoint-interacting stem loop (BSL) (Fig. 1A). First characterized in yeast [20] and later visualized in the cryo-EM structure of human 17S U2 snRNP [16, 19], the BSL must be unwound to enable branch helix formation. Yeast strains with a BSL-stabilizing mutation U44A (nt 43 in humans) are cold sensitive, but show improved splicing of a reporter intron with an imperfect branchpoint consensus sequence [20]. Conversely, mutations reducing BSL base pairing, U42A and C46U (nt 41 and 45 in humans), lowered splicing efficiency for introns with either perfect or imperfect branchpoint consensus but also suppressed a truncation of PRP5 in the absence of Cus2. The studies concluded that BSL dynamics influence the fidelity of branchpoint recognition.
Figure 1.
Reproducing an orthogonal splicing system for U2 snRNA BSL structure/function studies in human cells. (A) Schematic of key U2 snRNA secondary structure rearrangements in spliceosome assembly. Top: The 5′ sequence and nucleotide modifications of U2 snRNA as observed in the 17S U2 snRNP. The BSL stem and loop, toehold +1 nucleotides proposed for strand invasion and C28U cancer mutation are highlighted. The mutually exclusive Stem I extended is also shown. Bottom: In the catalytic spliceosome after branch helix formation, the 5′ end of U2 is completely remodeled to allow for base pairing with the intron and U6 snRNA. (B) Schematic of SV40 T antigen pre-mRNA highlights the different branchpoint sequences that can be used in the alternative splicing of large T and small t isoforms. (C) Comparison of the U2 snRNP interactions with large T and small t branchpoints with different combinations of exogenous U2 snRNA and splicing reporters. Details of base pairing between the U2 BPRS and small t branchpoint sequence with underlined branchpoint adenosine are included. The expected splice products for each combination are shown in the last column. Orthogonal nucleotides in U2 and the splicing reporter are highlighted. U2-WT* indicates the constant presence of endogenous U2 snRNA. (D) Primer extension analysis to quantify orthogonal U2 snRNA expression for the four transfection combinations shown in panel (C). Left: Representative PAGE of radiolabeled U2 snRNA primer extension reactions. Extension stops for endogenous U2 snRNA/U2-WT and U2-Ortho are indicated, along with positions of shark’s-tooth lane divisions. Right: Schematic of the primer extension reaction with the arrow representing the annealed radiolabeled primer and X’s marking the position of extension stops due to incorporation of ddGTP. 
(E) The fraction of U2-Ortho relative to total U2 snRNA quantified from primer extension reactions; n = 3. (F) Levels of small t intron splicing as determined by -ΔΔCt analysis of RT-qPCR for the transfection combinations shown in panel (C). Briefly, -ΔΔCt was determined by first subtracting the Cq value for a small t splice junction probe from the Cq value of an exon 2 probe for each sample to yield ΔCt, and then subtracting the ΔCt value for sample 2, making the -ΔΔCt value equivalent to the log2-fold change in splicing efficiency relative to the transfection with BP-Ortho and U2-WT. In all cases, error bars represent standard deviation. ns = not significant; P-value <.05*, <.01**, <.005*** for Student’s t-test.
The BSL is a relatively weak helix, and in the absence of proteins, the 5′ end of U2 snRNA takes on an extended Stem I structure, which is also mutually exclusive with the BSL and consistent with sequence-based thermodynamic predictions [21, 22]. The BSL therefore requires other factors for its stability, and in the cryo-EM model of the 17S U2 snRNP, the base of the BSL stem appears both supported and constrained by the positions of flanking shortened Stem I and Stem IIa U2 snRNA structures, and by interactions with the protein SF3A3 (Prp9p). The BSL loop region appears stabilized by proteins HTATSF1 (Cus2p) and SF3B1 (Hsh155p), which also occlude three U2 snRNA BPRS nucleotides extending from the loop. The RNA-dependent ATPase DDX46 (Prp5) has been proposed to control BSL unwinding, either as a helicase or by displacing HTATSF1. This raises the question of how BSL unwinding is achieved and linked to branch helix formation. Based on their inhibited spliceosome A complex (prespliceosome) structure, Pena and coworkers noted that the SF3B1 inhibitor SSA, which occludes a pocket into which the branchpoint adenosine normally docks, does not prevent BSL unwinding or pairing in the extended helix [23]. They proposed that intron nucleotides at positions −2 to −4 relative to the branchpoint (UACUAAC) pair with the three extended BSL loop nucleotides to initiate toehold-mediated strand invasion that promotes BSL unwinding by progressive and competing intron pairing [24]. This model predicts that an interplay between BSL stability and branchpoint sequence complementarity has a role in branchpoint sequence recognition and branch helix formation. A more stable BSL would require high complementarity with the branchpoint sequence to maintain a toehold and initiate unwinding, while a weaker BSL would be less dependent on the specific branchpoint sequence.
To investigate the relationship between BSL structure and branchpoint sequence recognition in splicing, we pursued a mutational analysis of human U2 snRNA. We used an orthogonal splicing strategy to bypass the essential function of U2 snRNA in human cells. Orthogonal splicing employs a splicing reporter with a noncanonical branchpoint that is fully dependent on the functionality of a complementarily modified U2 snRNA. This strategy was previously used in human and yeast cells to demonstrate the importance and limits of pairing between the U2 BPRS and intron branchpoint sequence [3, 25, 26], and the impact of U2 snRNA mutation on splicing and snRNP biogenesis [27, 28]. Here, we tested a series of U2 snRNA mutations designed to strengthen or weaken BSL base pairing for the ability to splice reporters with different branchpoint sequences. The results demonstrate an interplay between branchpoint complementarity and BSL structure that supports the toehold-mediated strand invasion model of branch helix formation. Moreover, they suggest that a finely tuned balance of base pairing in the BSL results in something akin to two-factor authentication for branchpoint recognition. The first factor requires formation of the BSL to ensure both proper U2 snRNP assembly and positioning of BPRS nucleotides to test for an intron toehold. The second factor monitors sufficient, but not absolute, complementarity between the intron and U2 snRNA for intron-mediated BSL unwinding. Our study thus provides insight into how BSL dynamics dictate splicing efficiency by facilitating branchpoint sequence recognition.
In the second part of our study, we investigated how a U2 snRNA cancer mutation located in the BSL (C28U) in comparison with a mutant in the BPRS affects splicing and gene expression globally in cells [11]. Surprisingly, both U2 snRNA mutations yielded highly overlapping changes in both the relatively small number of altered splicing events and the numerous differentially expressed genes. Although a branchpoint sequence feature mediating the altered splicing could not be identified, the patterns of differential gene expression offer clues to the cellular response to perturbed U2 snRNA. Upregulated genes with both mutants are enriched for pre-mRNA processing, translation regulation, and protein folding pathways, and as MYC targets, whereas downregulated genes have no enriched function. We speculate that mutant U2 snRNP competition for branchpoints results in a mild general splicing defect that decreases expression through quality control pathways such as nonsense-mediated decay (NMD), and that cells respond to the defect by upregulating genes important for gene expression and cell proliferation, which may explain the relevance of U2 snRNA mutations in cancer.
Materials and methods
Orthogonal splicing reporter and U2 snRNA expression plasmids
Based on the system developed by Wu and Manley [3], the SV40 Large T antigen gene was amplified from SV40 small + large T in pENTR1A (w611-7), a gift from Eric Campeau (Addgene plasmid #22297; http://n2t.net/addgene:22297; RRID:Addgene_22297), and subcloned into the AflII and NotI sites of pCI-M2 (a gift from John Olsen; Addgene plasmid #44170; http://n2t.net/addgene:44170; RRID:Addgene_44170) downstream of a CMV promoter. To create the branchpoint splicing reporters, the small t branchpoint was mutated by around-the-horn site-directed mutagenesis and verified by Sanger sequencing. For the orthogonal U2 snRNA construct, the RNU2-1 locus, including its Pol II promoter and 3′ processing site, was amplified from HeLa genomic DNA [29] and subcloned into the BglII site of Addgene plasmid #44170. The orthogonal ACU branchpoint recognition sequence and BSL mutations were introduced by around-the-horn site-directed mutagenesis and verified by Sanger sequencing. For lentiviral transduction, the wild-type and orthogonal RNU2-1 loci were subcloned into the XbaI and NotI sites of a custom pSico lentiviral transfer vector (a gift from Susan Carpenter) containing an EF1A promoter driving expression of Zeocin resistance-T2A-GFP. The BSL C28U mutation was introduced into the wild-type U2 snRNA sequence by around-the-horn site-directed mutagenesis and verified by Sanger sequencing. Primer sequences for cloning and mutagenesis are included in Supplementary Table S1.
Reporter branchpoint sequence analysis
SVM_BPfinder (https://github.com/comprna/SVM-BPfinder) [30] was used to analyze the branchpoint and SVM scores of branchpoints with different BPRS complementarities. Branchpoint sequences of various complementarity to the orthogonal U2 snRNA were curated, and sequences equivalent to the wild-type branchpoint sequence were evaluated at the same location within the context of the small t intron. Orthogonal branchpoints were also tested to ensure that novel cryptic branchpoints were not produced. Program options for each intron tested included -s Hsap for Homo sapiens, -l 100 for read length of intron, and -d 10 for distance (nt) allowed between branchpoint A and 3′ splice site.
Transfections and RNA isolation for splicing reporter experiments
HEK293 cells (a gift from Doug Kellogg) were cultured at 37°C, 5% CO2 in 24-well tissue culture plates (Falcon, cat. 353047) in Dulbecco’s modified Eagle medium (DMEM) (Gibco, cat. 12100061) supplemented with 10% FBS (Hyclone, cat. SH3054103). Cells at 60%–80% confluency were transiently transfected with a 3:1 v/v ratio of Fugene 6 transfection reagent (Promega, cat. E2693) to DNA containing 150 ng of splicing reporter plasmid and 350 ng of U2 snRNA expression plasmid. After 72 h, total RNA was isolated from the transfected cells by Trizol extraction (Invitrogen, cat. 15596018) and ethanol precipitation. Transfections were performed with a minimum of three biological replicates.
Reverse transcription
Total RNA was treated with RQ1 DNase (Promega, cat. M6101) followed by phenol:chloroform:iso-amyl alcohol (25:24:1 v/v) extraction, chloroform:iso-amyl alcohol (24:1 v/v) extraction, and ethanol precipitation. The DNase-treated RNA was resuspended in distilled water and quantified by UV absorbance with a NanoDrop 2000 (Thermo Scientific, cat. ND-2000). RNA integrity was confirmed before and after DNase treatment by agarose gel electrophoresis. First-strand complementary DNA (cDNA) synthesis was performed by first annealing primers specific to SV40 or U2 snRNA (Supplementary Table S1) in a 14 µl reaction containing 800 ng of DNase-treated RNA, 1 µl of 2 µM primers, and 1 µl of 10 mM dNTPs at 65°C for 5 min, followed by 5 min on ice. Extension buffer containing 4 µl 5X First strand buffer (Invitrogen), 1 µl of 0.1 M DTT, and 1 µl of a modified MMLV-RT was then added, and reactions were incubated at 55°C for 50 min and 85°C for 5 min. The modified MMLV-RT containing ∆1–23, Q84A, H204R, V223H, T306K, F309N, and D524N mutations (a gift from Peter Stoilov, submitted to Addgene and assigned ID #252404) was expressed and purified in-house, and stored in 20 mM Tris-HCl (pH 7.5), 150 mM NaCl, 0.1 mM ethylenediaminetetraacetic acid (EDTA), 1 mM DTT, 0.01% v/v NP-40, and 50% v/v glycerol at a working concentration of 1 mg/ml.
U2 snRNA expression primer extension
To quantify orthogonal U2 snRNA expression relative to endogenous, U2 snRNA cDNA was first amplified from 40 ng of first-strand reverse transcription product with primers complementary to the 5′ and 3′ end of U2 snRNA sequence (Supplementary Table S1) by 25 cycles of PCR with Taq polymerase (gift from David Feldheim), and isolated with the NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, cat. 740609.250). A 5′-end-labeled primer was generated by incubating 10 pmol of a DNA oligonucleotide complementary to nucleotides 39–54 with γ-32P ATP and T4 Polynucleotide Kinase (Thermo Scientific cat. EK0031), followed by size exclusion with a Sephadex G-25 (Sigma–Aldrich, cat. G2580-50G) spin column. For primer extension reactions, 50 ng of U2 snRNA amplified cDNA and 0.1 picomole of labeled primer were incubated in 12 µl with annealing buffer (83.3 mM Tris-HCl pH 7.9, 125 mM KCl) at 95°C for 2 min, 37°C (33°C for P4 and P5 primers) for 10 min, and room temperature for 30 min, and then supplemented with 8 µl of extension mix to yield a final concentration of 10 mM DTT, 3 mM magnesium chloride, 0.1 mM dATP, dTTP, dCTP, and ddGTP and 2 units of AMV reverse transcriptase (NEB, cat. M0277S). The reactions were incubated at 42°C for 70 min followed by addition of 10 volumes of 0.3 M sodium acetate pH 5.2, 0.5 M EDTA, and 0.05% SDS and 20 ng glycogen. As a negative control, the template and extension mix were substituted by water. As a positive control, the amplified U2 cDNA template was replaced with a correlating amplicon from the orthogonal U2 snRNA expression plasmid. Extension products were isolated by ethanol precipitation, resuspended in 95% v/v formamide, 20 mM EDTA, 0.01% bromophenol blue, and 0.01% cyan blues and separated on a 15% (v/v) polyacrylamide 7 M urea 1× TBE gel that was dried onto Whatman paper and visualized by phosphor imaging. 
Fiji (ImageJ; version 1.53t) was used to determine pixel intensities of the ddGTP extension stop bands at C35 and C28, which were adjusted for incomplete ddGTP incorporation based on C28 intensity caused by C35 readthrough in the positive control. The resulting values were divided by the sum of C35 and C28 band intensities to express the fraction of orthogonal U2 snRNA relative to total U2 snRNA. Statistical analysis of differences in expression was performed in Prism 10 (GraphPad) using a paired two-way Student’s t-test to determine P-values with an alpha level of .05. In transduced HEK293T cells, the same strategy was used to assess exogenous U2 snRNA expression, except that the radiolabeled primer for the C28U mutant was complementary to nucleotides 32–47 (primer P6), and ddATP was substituted for ddGTP in the extension mix.
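The band-intensity calculation described above can be sketched as follows. This is a minimal illustration under one reading of the correction, not the exact procedure used: the positive control (U2-Ortho amplicon only) is assumed to define the fraction of extension products that read through C35 and stop at C28, and the C35 signal in each sample is rescaled by that fraction before dividing by the summed band intensities. The function names and the example intensities are hypothetical.

```python
def readthrough_fraction(pos_c35: float, pos_c28: float) -> float:
    """Fraction of products that read through C35 in the positive control.

    Assumption: the positive-control template is 100% U2-Ortho, so any
    signal at C28 reflects incomplete ddGTP incorporation at C35.
    """
    total = pos_c35 + pos_c28
    if total <= 0:
        raise ValueError("positive-control intensities must sum to > 0")
    return pos_c28 / total


def ortho_fraction(c35: float, c28: float, readthrough: float) -> float:
    """Fraction of U2-Ortho relative to total U2 snRNA in one lane."""
    if not 0.0 <= readthrough < 1.0:
        raise ValueError("readthrough must be in [0, 1)")
    total = c35 + c28
    if total <= 0:
        raise ValueError("band intensities must sum to > 0")
    # Rescale the C35 band to recover U2-Ortho signal lost to readthrough.
    corrected_c35 = c35 / (1.0 - readthrough)
    return corrected_c35 / total


if __name__ == "__main__":
    r = readthrough_fraction(pos_c35=90.0, pos_c28=10.0)  # hypothetical values
    print(round(ortho_fraction(c35=9.0, c28=91.0, readthrough=r), 3))
```

Dividing by the uncorrected sum keeps the denominator equal to the total signal in the lane, since readthrough products are already counted in the C28 band.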
Quantitative PCR for the orthogonal splicing reporter
TaqMan hydrolysis probes (Thermo Fisher) specific to the splice junctions of large T and small t antigen isoforms, and to exon 2, were synthesized with 5′ FAM and 3′ NFQ-MGB quencher (Eurofins) (Supplementary Table S1). Amplicon size and primer specificity for the correct spliced product were both confirmed by gel electrophoresis. Serial dilutions of cDNA (1:5 to 1:80) against primer (4.5–18 µM) and probe (2.5–10 µM) concentrations were tested to optimize qPCR efficiency to 110%–120% with R² between 0.9 and 1. Reactions were prepared by combining 1 µl of 1:10 diluted cDNA with 19 µl of qPCR reaction containing 5 µl of 4X TaqMan Fast Virus 1-Step Master Mix (Applied Biosystems, cat. 4444432), 0.5 µl of probe (2.5–10 µM), and 1 µl each of forward and reverse primer (4.5–18 µM) (Supplementary Table S1), and then incubated for 1 cycle of 20 s at 95°C followed by 40 cycles of 15 s at 95°C and 1 min at 60°C with a QuantStudio 6 Pro Real-Time PCR system (Applied Biosystems). Three technical replicates were prepared for each biological sample tested. RT-PCR reactions using DNase-treated RNA as template were used to rule out DNA contamination. Design and Analysis software 2.6.0 (Applied Biosystems) was used to determine Cq values for reactions. Baseline correction was derived from the linear amplification plot, and then the log-based amplification plot was used to set threshold values within the linear phase where the technical and biological replicates exhibited the least variability and above the background signal of a no-template control reaction. Reactions with Cq values that differed by >0.5 relative to other technical replicates were removed as outliers before calculating an average Cq value for each biological replicate.
A modified -ΔΔCt method [31] using exon 2 as the reference control was used to determine the small t and large T reporter splicing efficiency for U2 variants relative to empty vector (EV) control (or no U2-Ortho) for each group of transfections. Specifically, the Cq value for the splice junction probe of a sample was subtracted from the Cq value for the corresponding exon 2 probe to yield ΔCt, from which the ΔCt value for the control was subtracted to yield a -ΔΔCt value representing the log2-fold change in splicing efficiency relative to EV. Statistical analysis of ΔΔCt value comparisons was performed in Prism 10 (GraphPad) using a paired two-way Student’s t-test to determine P-values with an alpha level of .05.
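The technical-replicate filtering step can be illustrated with a short sketch. The text does not specify how the 0.5-cycle difference was measured, so this example assumes a replicate is an outlier when it lies more than 0.5 cycles from the median of its replicate group; the function name and Cq values are hypothetical.

```python
from statistics import mean, median


def mean_cq(technical_cqs, max_dev=0.5):
    """Average Cq for one biological replicate after outlier removal.

    Assumption: a technical replicate is dropped when it differs from the
    group median by more than max_dev cycles.
    """
    if not technical_cqs:
        raise ValueError("at least one Cq value is required")
    center = median(technical_cqs)
    kept = [cq for cq in technical_cqs if abs(cq - center) <= max_dev]
    if not kept:
        raise ValueError("no replicates within max_dev of the median")
    return mean(kept)


if __name__ == "__main__":
    # Hypothetical triplicate: the third well is >0.5 cycles from the median.
    print(mean_cq([24.1, 24.3, 25.6]))
```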
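As a worked illustration of the arithmetic described above, the following sketch computes ΔCt and -ΔΔCt from averaged Cq values. The function names and the example Cq values are hypothetical; interpreting -ΔΔCt as a log2-fold change assumes that product roughly doubles with each cycle.

```python
def delta_ct(cq_exon2: float, cq_junction: float) -> float:
    """ΔCt: exon 2 (reference) Cq minus splice-junction Cq."""
    return cq_exon2 - cq_junction


def neg_delta_delta_ct(sample, control) -> float:
    """-ΔΔCt: sample ΔCt minus control ΔCt.

    Each argument is a (cq_exon2, cq_junction) pair of averaged Cq values.
    """
    return delta_ct(*sample) - delta_ct(*control)


if __name__ == "__main__":
    ev_control = (22.0, 30.0)   # hypothetical EV control: ΔCt = -8
    u2_ortho = (22.0, 27.0)     # hypothetical U2-Ortho sample: ΔCt = -5
    value = neg_delta_delta_ct(u2_ortho, ev_control)
    print(value, 2 ** value)    # log2-fold change and linear fold change
```

In this example the junction probe crosses threshold three cycles earlier in the sample than in the control at equal exon 2 signal, giving a -ΔΔCt of 3, or an eightfold increase in spliced product.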
Lentiviral transduction and selection of HEK293T U2 mutant cell line
To generate lentivirus for exogenous U2 snRNA expression, HEK293T cells (gift from Angela Brooks) were seeded into six-well plates (BioLite, cat. 130184) at 1 × 10⁶ cells/well in DMEM supplemented with 10% FBS. After 24 h, cells were transfected with 0.75 µg psPAX2 and 0.25 µg pMD2.G (lentiviral packaging plasmids, gifts from the Joseph Costello lab), and 1 µg of custom pSico lentiviral transfer vector containing BPRS or BSL mutant U2 snRNA expression locus using X-tremeGENE HP (Sigma–Aldrich, cat. 06366244001) transfection reagent in Opti-MEM (Gibco, cat. 31985062) according to manufacturer directions. After a 24-h incubation, media was replaced with DMEM/10% FBS. Virions were collected 72 h post-transfection by filtering the media through 0.2 µm PES filters (Celltreat, cat. 229746), aliquoted, and stored at −80°C.
Replicate mutant U2 snRNA cell lines were generated by first seeding HEK293T cells in 12-well plates (CELLSTAR, cat. 665180) at 1.5 × 10⁵ cells/well in DMEM/10% FBS. After 24 h, threefold serial dilutions of stock virus and 4 µg/ml polybrene were added for each U2 snRNA mutant construct and EV control line, and the cells were incubated for 24 more hours. The media was then replaced with fresh DMEM/10% FBS and the cells grown for 72 h. Transduced cells were selected by growth and maintained in media supplemented with 300 µg/ml Zeocin (Thermo Fisher, cat. R25001). For each replicate, the polyclonal cell line from viral dilutions yielding the highest GFP fluorescent signal was selected for further culture, characterization, and downstream RNA-seq analysis.
Short-read RNA-seq
For each transduced cell line genotype, total RNA was extracted with Trizol (Invitrogen, cat. 15596018) followed by ethanol precipitation (n = 2 for one replicate polyclonal cell line and n = 1 for the other replicate). RNA concentration and RIN values (ranging from 7.1 to 9.2) were determined with an Agilent 2100 Bioanalyzer (Agilent Technologies). The UC Davis DNA Technologies and Expression Analysis Core Laboratory prepared poly(A) strand-specific libraries and sequenced 58 million paired-end read pairs by NovaSeq S4 (PE150).
RNA-seq data analysis
Fragment quality was initially checked using Salmon (v1.10.3) [32] by quasi-mapping reads to an index built from the human reference genome release 45 primary comprehensive gene annotation (GRCh38) from GENCODE (https://www.gencodegenes.org). The Salmon output was converted into HTML format by MultiQC (v1.27) [33], and fragment lengths were confirmed to distribute around ∼400 bp. Raw FASTQ read quality was assessed using FastQC (v0.12.1) (Andrews, S. 2010. FastQC: A Quality Control Tool for High Throughput Sequence Data. Available online at http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and remaining Illumina adaptors and low-quality reads were detected and removed by trim_galore (v0.6.10) (Krueger, 2023 DOI 10.5281/zenodo.5127898). Trimmed reads were then aligned to GENCODE human reference genome release 47 primary comprehensive gene annotation (GRCh38) using STAR (v2.7.11b) [34] with the following options: --outFilterMultimapNmax 20, --alignIntronMax 1000000, --alignMatesGapMax 1000000, --alignSJDBoverhangMin 1, --limitSjdbInsertNsj 2000000, --outSAMattributes NH HI AS nM jM, --alignIntronMin 20, --outSAMtype BAM SortedByCoordinate, --twopassMode Basic, and --quantMode TranscriptomeSAM GeneCounts. BAM files were indexed using samtools (v1.21 with htslib 1.21) [35], and bigWig files were created for the forward and reverse reads using deepTools (v3.5.6) [36]. Read counts were normalized using the inverse of the DESeq2 size factors.
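For reproducibility, the STAR options listed above can be assembled into a single command line. The sketch below only builds and prints the argument list; it does not run STAR. The alignment options are those given in the text, whereas the index directory, file names, thread count, and the use of --readFilesCommand zcat for gzipped input are assumptions made for illustration.

```python
import shlex

# Alignment options as listed in the text.
STAR_ALIGN_OPTIONS = [
    "--outFilterMultimapNmax", "20",
    "--alignIntronMax", "1000000",
    "--alignMatesGapMax", "1000000",
    "--alignSJDBoverhangMin", "1",
    "--limitSjdbInsertNsj", "2000000",
    "--outSAMattributes", "NH", "HI", "AS", "nM", "jM",
    "--alignIntronMin", "20",
    "--outSAMtype", "BAM", "SortedByCoordinate",
    "--twopassMode", "Basic",
    "--quantMode", "TranscriptomeSAM", "GeneCounts",
]


def build_star_command(genome_dir, read1, read2, out_prefix, threads=8):
    """Return the STAR argument list for one paired-end sample."""
    return [
        "STAR",
        "--runThreadN", str(threads),        # assumed thread count
        "--genomeDir", genome_dir,           # assumed index location
        "--readFilesIn", read1, read2,
        "--readFilesCommand", "zcat",        # assumes gzipped FASTQ input
        "--outFileNamePrefix", out_prefix,
        *STAR_ALIGN_OPTIONS,
    ]


if __name__ == "__main__":
    cmd = build_star_command(
        "star_index/", "sample_R1.fq.gz", "sample_R2.fq.gz", "sample_"
    )
    print(shlex.join(cmd))
```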
Differential expression analysis
Differential gene expression analysis was performed using DESeq2 (v1.42.1) [37] after extracting reverse-stranded read counts from the STAR alignment and removing genes with no read counts across all samples. Reference condition was set to the empty-vector control cell line samples. Results were FDR adjusted with alpha = 0.05. Significantly upregulated or downregulated genes were evaluated for GO term enrichment using Enrichr [38–40].
Differential splicing analysis
Novel splice sites were incorporated into the v47 GTF file using StringTie (v3.0.1) [41] from the BAM files. Splice sites that did not have at least three bases on either side were filtered out with -a 3. GTFs were created for each sample and merged. Splicing analysis was then performed using junctionCounts (v1.0.0) [42] with the merged GTF file using default settings except for DEXSeq, where --min_jc 10, --min_psi 0.1, and --ri_span 0.1 were used. Significant splice events (dpsi ≥ 0.1, Q-value ≤ 0.05 or dpsi ≤ −0.1, Q-value ≤ 0.05) were extracted and analyzed further using BAM coverage tracks on the UCSC Genome Browser and BAM files loaded in IGV (v2.17.4) [43]. Sashimi plots of analyzed events were generated with IGV.
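The significance thresholds stated above amount to a simple filter on each event. The sketch below shows that filter applied to a list of records; the record layout (keys "event_id", "dpsi", and "qvalue") and the example values are hypothetical and do not reflect the junctionCounts output format.

```python
def is_significant(dpsi: float, qvalue: float,
                   min_abs_dpsi: float = 0.1, max_q: float = 0.05) -> bool:
    """True when |dpsi| >= min_abs_dpsi and Q-value <= max_q."""
    return abs(dpsi) >= min_abs_dpsi and qvalue <= max_q


def significant_events(events):
    """Keep only the records that pass both thresholds."""
    return [e for e in events if is_significant(e["dpsi"], e["qvalue"])]


if __name__ == "__main__":
    example = [  # hypothetical events
        {"event_id": "SE_1", "dpsi": 0.25, "qvalue": 0.01},
        {"event_id": "RI_7", "dpsi": -0.12, "qvalue": 0.04},
        {"event_id": "SE_9", "dpsi": 0.30, "qvalue": 0.20},
        {"event_id": "A3_2", "dpsi": 0.05, "qvalue": 0.001},
    ]
    print([e["event_id"] for e in significant_events(example)])
```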
RNA structure analysis
Structure predictions and folding free energy calculations for U2 snRNA BSL sequence alone (nt 23–47) were generated using the Fold function in RNAstructure version 6.5 [44, 45], constraining nucleotides A30 and C40, along with the toehold nucleotides as single-stranded. For full-length U2 snRNA sequence (nt 1–187), structure prediction, folding free energy calculations, and base-pairing/single-stranded probability were generated using RNAstructure Fold and Partition functions with default parameters. To account for the effect of Sm protein binding, nucleotides 97–111 were constrained as single-stranded.
Branchpoint analysis of differential splice events
A table containing the coordinates of branchpoint sequences identified after SF3A3 IP-seq was downloaded from NCBI GEO under accession number GSE240608 [67]. Coordinates for significantly (dpsi ≥ 0.1, Q-value ≤ 0.05 or dpsi ≤ −0.1, Q-value ≤ 0.05) differential and non-significantly (|dpsi| < 0.1, Q-value > 0.05) differential splicing events from comparisons of Control versus C28U and Control versus U2-Ortho were converted from hg38 to hg19 using UCSC Genome Browser LiftOver. Branchpoints were then matched with splicing events when the branchpoint coordinate fell between the start and end coordinates of the splicing event. Branchpoints were then filtered to retain only those that fell within annotated 3′ splice site regions.
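The coordinate-matching step can be sketched as an interval lookup. This example assumes that branchpoints and events share one coordinate system (hg19 after conversion), that interval bounds are inclusive, and that strand is ignored; it does not implement the subsequent 3′ splice site region filter. The Event and Branchpoint types and all coordinates are hypothetical.

```python
from typing import NamedTuple


class Event(NamedTuple):
    event_id: str
    chrom: str
    start: int
    end: int


class Branchpoint(NamedTuple):
    chrom: str
    pos: int


def match_branchpoints(events, branchpoints):
    """Map each event ID to the sorted branchpoint positions inside it."""
    by_chrom = {}
    for bp in branchpoints:
        by_chrom.setdefault(bp.chrom, []).append(bp.pos)

    matches = {}
    for ev in events:
        lo, hi = sorted((ev.start, ev.end))  # tolerate start > end
        matches[ev.event_id] = sorted(
            p for p in by_chrom.get(ev.chrom, []) if lo <= p <= hi
        )
    return matches


if __name__ == "__main__":
    events = [Event("e1", "chr1", 100, 200), Event("e2", "chr2", 50, 60)]
    bps = [Branchpoint("chr1", 150), Branchpoint("chr1", 250),
           Branchpoint("chr2", 50)]
    print(match_branchpoints(events, bps))
```

A linear scan per event is adequate for a few thousand events; an interval tree or sorted bisect lookup would be a reasonable substitute for larger tables.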
Motif enrichment analysis
Sequences for each branchpoint associated with a splicing event were collected by extracting the genomic sequence spanning 10 nucleotides upstream and downstream of the branchpoint using the UCSC Genome Browser Table Browser with the GRCh37/hg19 assembly. Motif analysis was performed on FASTA files of unique branchpoint sequences in STREME version 5.5.9 with the following options: --dna --minw 5 [78]. The control set was either the STREME shuffled-sequence control or a FASTA file of unique branchpoint sequences from non-significantly differential splicing events. Motifs were considered significant when E-value ≤ 0.01.
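Although the sequences in this study were retrieved with the UCSC Table Browser, the same windowing and de-duplication can be illustrated offline. The sketch below assumes a 0-based branchpoint index into a plus-strand sequence held in memory and a 21-nt window (10 nt on each side plus the branchpoint); minus-strand sites would additionally need reverse complementation, which is not shown. Function names and the toy sequence are hypothetical.

```python
def branchpoint_window(chrom_seq: str, bp_index: int, flank: int = 10):
    """Return the (2 * flank + 1)-nt window centered on bp_index.

    Returns None when the window would run off either end of the sequence.
    """
    start = bp_index - flank
    end = bp_index + flank + 1
    if start < 0 or end > len(chrom_seq):
        return None
    return chrom_seq[start:end].upper()


def unique_fasta(windows) -> str:
    """Write unique windows, in first-seen order, as FASTA text."""
    unique = list(dict.fromkeys(w for w in windows if w is not None))
    return "".join(f">bp_{i}\n{seq}\n" for i, seq in enumerate(unique, start=1))


if __name__ == "__main__":
    toy = "a" * 10 + "C" + "g" * 10 + "TTTT"   # toy sequence, not genomic data
    windows = [branchpoint_window(toy, 10), branchpoint_window(toy, 10),
               branchpoint_window(toy, 3)]
    print(unique_fasta(windows), end="")
```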
Results
Mutations within the BSL affect both snRNA expression levels and reporter splicing in an orthogonal-U2 system
To investigate the effects of altering BSL structure in human cells, we adopted the orthogonal system designed by Wu and Manley that uses a modified SV40 T antigen gene to report on splicing by exogenously supplied U2 snRNA mutants [3, 25]. The SV40 T antigen transcript is alternatively spliced into either large T or small t splice products based on usage of competing 5′ splice sites and branchpoints (Fig. 1B) [46]. While large T splicing can be achieved through multiple branchpoints, small t splicing depends on a single branchpoint that, when modified at the core yUR of the human consensus to AGU, can be complemented by an exogenous U2 snRNA with the corresponding substitution in the BPRS (Fig. 1C). To recreate this system, we constructed an RNU2-1 expression plasmid with a BPRS containing an ACU transversion (U2-Ortho) at nucleotides 34–36 and an SV40 T antigen splicing reporter plasmid with a complementary small t branchpoint (BP-Ortho). We also generated control plasmids with the normal U2 BPRS and small t branchpoint (U2-WT and BP-WT). To confirm the dependence and specificity of BP-Ortho small t splicing on the presence of U2-Ortho snRNA, we transfected HEK293 cells with different combinations of these plasmids and assessed both U2-Ortho expression and reporter splicing. Primer extension with amplified cDNA of U2 snRNA as template and ddGTP produced an extension stop mapping to the orthogonal nucleotide C35 only with RNA isolated from U2-Ortho transfected cells. All samples exhibited a stop at nucleotide C28 due to the presence of endogenous U2 snRNA (Fig. 1D and E, compare lanes 3 and 4 with 1 and 2). Based on the ratio of the C35 to C28 band intensities, we estimate that up to 15% of cellular U2 snRNA contained the orthogonal BPRS.
To assess reporter splicing, we used reverse transcription quantitative polymerase chain reaction (RT-qPCR) with probes specific to the small t splice product and observed the expected increase in BP-Ortho reporter splicing with co-transfection of U2-Ortho relative to U2-WT (Fig. 1F, compare lanes 2 and 4). Splicing of the BP-WT reporter occurred regardless of the identity of the U2 expression construct transfected (Fig. 1F, compare lanes 1 and 3), due to the constant presence of endogenous U2 snRNA. We also evaluated large T splicing and observed nearly equivalent levels across all transfections (Supplementary Fig. S1), which is also expected since large T splicing is insensitive to changes in small t splicing [46]. Together, these results demonstrate that in our hands, U2-Ortho expresses at levels sufficient to support orthogonal reporter splicing, and that efficient splicing of the BP-Ortho reporter relies on the presence of U2-Ortho.
With the orthogonal system in hand, we turned to testing the hypothesis that the base pairing potential of the BSL influences U2 snRNA expression and splicing. We introduced a series of mutations in the sequence of U2-Ortho chosen to either stabilize (M1 to M1/2) or destabilize different base pairing interactions within the BSL stem (M3 to M3–7), while trying to avoid altering U2’s other pairing interactions (Fig. 2A and E). RNA structure prediction revealed that of all the mutations generated, only the one stabilizing two base pairs (M1/2) is expected to favor formation of the BSL (Supplementary Fig. S2 and Supplementary Table S2), while the other mutants maintained the wild-type U2 snRNA thermodynamic preference for an extended Stem I, with three exceptions. Mutations M6 and M6/7, designed to disrupt the BSL base pairing near the loop, and M3–7, designed to disrupt the entire BSL, were predicted to form low-confidence alternate stem loops with BSL nucleotides. Importantly, because U2 snRNA structure is demonstrably influenced by both RNA modifications and snRNP proteins, the relevance of structure predictions for RNA alone is difficult to assess.
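The complementarity underlying the orthogonal pair can be checked with a few lines of code. This sketch considers only strict, antiparallel Watson–Crick pairing between short RNA segments written 5′ to 3′, which is a simplification of the pairing in the branch helix. The sequences are those named in the text (GUAGUA, UACUAAC, ACU, and AGU), and the function names are hypothetical.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(WATSON_CRICK[base] for base in reversed(rna.upper()))


def pairs_fully(u2_segment: str, intron_segment: str) -> bool:
    """True when the two segments form a fully Watson-Crick paired duplex."""
    return reverse_complement(u2_segment) == intron_segment.upper()


if __name__ == "__main__":
    # Yeast consensus UACUAAC with the bulged branchpoint adenosine removed.
    yeast_paired = "UACUA" + "C"
    print(pairs_fully("GUAGUA", yeast_paired))  # wild-type BPRS
    print(pairs_fully("ACU", "AGU"))            # U2-Ortho with BP-Ortho core
    print(pairs_fully("GUAGUA", "UACAGU"))      # consensus with AGU core, hypothetical
```

Under this simplification, the ACU substitution in U2-Ortho restores full pairing with the AGU reporter core, while the unmodified BPRS no longer matches a site carrying that core.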
Figure 2.
Mutations within the BSL can affect orthogonal snRNA expression levels and reporter splicing. (A) Schematic of orthogonal human U2 snRNA BSL wild-type and mutant sequences designed to increase BSL base pairing. All U2 snRNA changes are highlighted. The folding free energy (∆G* in kcal/mol) of the isolated BSL sequences is shown below each structure. (B) Representative PAGE showing primer extension analysis of stabilizing U2-Ortho mutants, quantified in panel (C) as the fraction of U2-Ortho relative to total U2 snRNA; n = 4. Due to differences in the primer required for M2 and M1/2, expression levels could not be quantified relative to endogenous U2 snRNA levels. Positive controls represent reactions using a corresponding expression plasmid as template. (D) Orthogonal small t reporter splicing for the same samples in panel (C) determined by RT-qPCR as described in Fig. 1F, except in this set of experiments, -ΔΔCt is derived relative to EV control. (E) Schematic of orthogonal human U2 snRNA BSL wild-type and mutant sequences predicted to decrease BSL base pairing with folding free energies for the structure shown. For M4, M5, M3/4, and M3/4/5 structures, restraint of the stem was required to obtain a folding free energy value. No value could be determined for M3–7 due to the lack of adjacent base pairs. (F) Expression levels of BSL mutations in panel (E); n = 4. (G) Orthogonal small t reporter splicing by RT-qPCR for samples shown in panel (F). -ΔΔCt is derived as in panel (D). Results are grouped by transfection batch for BSL-WT comparisons. Matching closed and open shapes are technical replicates of one biological replicate. In all cases, error bars represent standard deviation. ns = not significant, P-value <.05*, <.01**, <.005*** for Student’s t-test.
To examine the functional effects of the different BSL mutations, we transfected each U2-Ortho BSL mutant expression plasmid with the BP-Ortho splicing reporter into cells and, after 72 h, analyzed total RNA for mutant U2-Ortho expression and reporter splicing as compared to U2-Ortho BSL-WT. For these transfections, U2-Ortho BSL-WT consistently comprised ∼7.5% of cellular U2 snRNA and yielded a log2-fold increase of ≥2 for reporter splicing relative to EV (Fig. 2).
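As a reading aid for the -ΔΔCt values reported here, the following minimal sketch shows how a log2-scale splicing change relative to EV can be derived from RT-qPCR Ct values. It assumes normalization of the small t amplicon to a reference amplicon and roughly 100% amplification efficiency; the function names and all Ct values are hypothetical and are not taken from this study.

```python
def neg_ddct(ct_target_sample, ct_ref_sample, ct_target_ev, ct_ref_ev):
    """Return -ΔΔCt, i.e. the log2 change in target signal relative to EV.

    Assumption: each target Ct is normalized to a reference amplicon
    measured in the same sample (ΔCt = Ct_target - Ct_reference).
    """
    dct_sample = ct_target_sample - ct_ref_sample
    dct_ev = ct_target_ev - ct_ref_ev
    return -(dct_sample - dct_ev)


def fold_change(neg_ddct_value):
    """Convert -ΔΔCt to a linear fold change, assuming perfect doubling per cycle."""
    return 2.0 ** neg_ddct_value


# Hypothetical Ct values: spliced small t product and a reference amplicon.
value = neg_ddct(24.0, 18.0, 26.5, 18.5)
print(value, fold_change(value))
```

Under these assumptions, a -ΔΔCt of 2 corresponds to a fourfold increase in spliced product relative to EV.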
Based on the toehold model, stabilizing mutations in the BSL would be expected to act against its unwinding and thereby reduce reporter splicing efficiency. The stabilizing mutation M1 (A30G), which changes the A–C bulge between nucleotides 30 and 40 to a G–C base pair, reduces reporter splicing slightly, but not significantly relative to BSL-WT (Fig. 2B–D). It also does not significantly change U2-Ortho expression. With M2 (U43A), which converts the U–U interaction with nucleotide 27 to an A–U base pair, as well as the M1/M2 combination, little to no reporter splicing is detected. The mutations appear to greatly reduce U2-Ortho expression, although we cannot directly compare their levels to endogenous U2 due to incompatible primers (Fig. 2B). Although these data are challenging to interpret on their own relative to the toehold model, they demonstrate that stabilizing the BSL with M1 does not greatly impact U2 snRNA function. We note that in S. cerevisiae U2 snRNA, the detrimental effect of stabilizing the same A–C bulge with an A–U base pair is only observed under low temperature, a condition that cannot be replicated in human cells [20]. While stabilizing the BSL at M2 is detrimental, the reduction in splicing cannot be uncoupled from the low steady-state accumulation of M2 U2-Ortho. We suspect that this is the result of increased turnover by quality-control pathways monitoring snRNA assembly or functionality in the snRNP, but our data do not rule out an effect on U2 snRNA production.
From the toehold model, destabilizing the BSL at different positions has the potential to interfere with U2 snRNP biogenesis or U2 snRNA interactions with an intron, but fewer base pairs may also ease BSL unwinding. We positioned three single changes nearer to the base of the BSL stem (M3 G25A, M4 C28U, M5 A29U) and two single changes adjacent to the loop (M6 G31U, M7 U32A), along with several combinations (M3/4, M6/7, M3/4/5, and M3–7) (Fig. 2E). In cells, all the destabilizing changes generally lowered U2-Ortho expression, but to different degrees (Fig. 2F). The single changes to BSL pairing showed the least reduction regardless of their position, except for M3. The effect of M3, which disrupts the first base pair in the stem, appears to dominate because its presence in any combination essentially halved U2-Ortho expression relative to BSL-WT. The combination of M6/M7 had an intermediate impact on U2-Ortho expression. Thus, while the loss of most single base pairing interactions within the BSL does not severely impact steady-state levels of U2 snRNA, disruption of the first base pair in the stem and/or multiple base pairs together may negatively affect incorporation or maintenance of U2 snRNA in the snRNP, leading to its degradation.
The BSL-destabilizing mutations also had differential effects on small t splicing of the BP-Ortho reporter, but no effect on large T splicing (Fig. 2G and Supplementary Fig. S1). While no mutation increased splicing efficiency, small decreases in reporter splicing with M5, M6, M7, and M6/7 were consistent but not statistically significant relative to BSL-WT. M3 and M3–7 severely decreased splicing, essentially to background. Although low U2-Ortho expression levels likely contribute to the reduction, we reason that these mutations may also impair U2 snRNA function in splicing. This conclusion is based on mutants M3/4 and M3/4/5, which have comparably low expression, but detectable reporter splicing. It is also consistent with the negative effect on cell growth and reporter splicing for the equivalent M3 mutation in yeast [20]. In the cryo-EM structures of U2 snRNP, the base pair disrupted by M3 appears to constrain the location of the base of the BSL stem as it stacks on a tryptophan extending from SF3A3 [16, 19, 47]. Loss of the base pair could therefore alter BSL positioning or SF3A3’s assembly into U2 snRNP and interfere with splicing. The other mutation with reduced splicing that is likely not fully explained by decreased U2-Ortho expression is M4 (C28U). Although the equivalent mutation in yeast has no growth defect, it suppresses an N-terminal deletion of PRP5, suggesting that disruption of this base pair is involved in branchpoint selection. We are tempted to speculate on a functional connection to the same C28U change recently identified in U2 snRNA in pancreatic, prostate, and several hematological cancers [11], which we explore by the transcriptome analysis presented later in this study.
BSL base pairing potential influences recognition of branchpoint sequences with differential complementarity to the U2 snRNA BPRS
In S. cerevisiae, a stabilizing mutation equivalent to M2 in this study allowed for increased splicing of branchpoint mutants relative to WT U2 snRNA, supporting a role for the BSL in mediating the yeast stringency for the branchpoint consensus sequence [20]. Therefore, we set out to determine if the effect would hold in human cells, where, despite the flexibility of branchpoint sequence selection, the amount of base pairing complementarity between the human U2 snRNA BPRS and intron branchpoint sequence has been shown to dictate splicing efficiency in minigene assays [48, 49]. To test for this relationship, we created a new collection of orthogonal splicing reporters with nucleotide changes among four of the six nucleotides flanking the branchpoint adenosine in the small t branchpoint reporter predicted to either decrease or increase the base pairing between U2-Ortho BPRS and the small t branchpoint sequence. In all cases, two nucleotides corresponding to the UN of the human yUNAy branchpoint consensus were maintained as GU as required for orthogonality. The branchpoint sequences, along with their predicted branchpoint strength [30], are shown in Fig. 3A. BP-NC1, which makes two Watson–Crick (WC), three GU, and one mismatched base pair with the orthogonal U2 BPRS, had the lowest score while still being identified as a branchpoint. BP-NC2 and BP-NC6 retain a single mismatch but incrementally increase the WC base pairing. The BP-Ortho sequence used for the experiments in Figs 1 and 2 makes four WC and one GU base pairs, and ranks slightly lower than BP-FC, which makes all six WC base pairs. In the context of the toehold strand invasion model, these branchpoints also vary the strength of pairing with the three flipped-out U2 snRNA nucleotides of the BSL stem loop and first base pair formed by the invading intron (Toehold +1).
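The tally of Watson–Crick, GU, and mismatched pairs between a BPRS and a branchpoint sequence can be illustrated with a short sketch. It is only an illustration under stated assumptions: the helper name is hypothetical, the six intron nucleotides are supplied with the bulged branchpoint adenosine already removed, and the example uses the yeast GUAGUA BPRS and UACUAAC consensus rather than the orthogonal reporter sequences of Fig. 3A.

```python
# Pair types are written as (U2 nucleotide, intron nucleotide).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def count_pairs(bprs, intron_pairing):
    """Classify antiparallel pairs between a U2 BPRS and intron nucleotides.

    Both sequences are written 5'->3'. `intron_pairing` holds only the
    nucleotides that pair with the BPRS (assumption: the bulged branchpoint
    adenosine has been removed by the caller).
    """
    if len(bprs) != len(intron_pairing):
        raise ValueError("sequences must have equal length")
    counts = {"WC": 0, "GU": 0, "mismatch": 0}
    # Reverse the BPRS so that position i of the intron faces its partner.
    for u2_nt, intron_nt in zip(reversed(bprs), intron_pairing):
        if (u2_nt, intron_nt) in WATSON_CRICK:
            counts["WC"] += 1
        elif (u2_nt, intron_nt) in WOBBLE:
            counts["GU"] += 1
        else:
            counts["mismatch"] += 1
    return counts


# Yeast consensus UACUAAC with the branchpoint A removed -> UACUAC.
print(count_pairs("GUAGUA", "UACUAC"))
# A hypothetical variant that introduces one G-U wobble pair.
print(count_pairs("GUAGUA", "UGCUAC"))
```

A simple count like this treats every position equally; it does not capture the position-specific effects of toehold and +1 pairing discussed in the text.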
Figure 3.
BSL base pairing potential influences branchpoint sequence recognition. (A) Sequences of orthogonal splicing reporter branchpoints with different complementarity to U2-Ortho, their relative strengths from SVM_BPfinder [30], and number of Watson–Crick (WC) and GU pairs for the predicted toehold (underlined) and first base pair invading the BSL (+1) (italics). Orthogonal nucleotides are highlighted and the branchpoint adenosine is bolded. (B) Top: Representative agarose gel of RT-PCR for spliced small t. Bottom: Band intensity quantified relative to sample 1 (BP-Ortho and U2-WT); n = 2. (C) Representative PAGE of radiolabeled U2 snRNA primer extension analysis for samples in panel (B) performed as described in Fig. 1D. (D) Comparison of small t intron splicing by RT-qPCR for the three reporter branchpoints with the BSL mutants schematized below. -ΔΔCt is derived relative to EV of the same transfection batch as described in Fig. 2D. BP-Ortho data from Fig. 2D and G are included for comparison. (E) Splicing levels from panel (D) displayed as a heatmap of fold-change values relative to EV and indicating for a given branchpoint reporter the significance of the difference between each BSL mutant and BSL-WT. ns = not significant, P-value <.05*, <.01**, <.005*** for Student’s t-test.
When tested in the orthogonal system with U2-Ortho BSL-WT, the splicing efficiency of the different reporters correlated with the branchpoint SVM score, with BP-NC1 yielding the lowest level of orthogonal small t splicing and BP-FC yielding the highest (Fig. 3B, lanes 6–10). Importantly, none of the changes enabled significant small t splicing in the absence of U2-Ortho (Fig. 3B, lanes 1–5), and U2-Ortho expression was constant across transfections (Fig. 3C). We conclude that complementarity between the orthogonal branchpoint and U2-Ortho BPRS contributes to efficient intron recognition and/or splicing.
Next, to determine whether BSL base pairing influences branchpoint recognition and splicing in human cells, we tested our collection of 12 U2-Ortho BSL mutants with the weakest (BP-NC1) and strongest (BP-FC) branchpoint sequence reporters to compare with the intermediate strength BP-Ortho from Fig. 2. Expression of U2-Ortho BSL-WT and BSL mutants transfected with BP-NC1 and BP-FC was similar to previously observed levels (Supplementary Fig. S3), as was large T splicing (Supplementary Fig. S1). With most single BSL mutations, splicing of the different small t branchpoint reporters followed the same trend as U2-Ortho BSL-WT, with splicing for BP-NC1 < BP-Ortho < BP-FC (Fig. 3D and E). Notably, the fully complementary BP-FC provided a larger boost in splicing relative to the other branchpoints with the stabilizing mutation M1 located closer to the BSL stem loop. Along with the slight reduction in splicing of BP-Ortho, these results are consistent with intron pairing helping to drive BSL unwinding. For M2, the higher complementarity of BP-FC did not improve splicing. This result may parallel the notable reduction in branchpoint stringency for the equivalent stabilizing mutation in yeast, but the overall poor splicing and low expression of the M2 U2-Ortho mutant preclude a firm conclusion in the human system.
Destabilizing mutations M6 and M7, which are directly adjacent to the BSL loop, appear to be least sensitive to increased branchpoint complementarity. BP-FC confers only a very small increase in splicing, and the BP-NC1 splicing with these mutants is consistently higher than with BSL-WT, although not quite meeting the cutoff for statistical significance. This result is consistent with BSL unwinding being driven by intron pairing starting at the loop, where the loss of base pairs at M6 and M7 could prime BSL unwinding and take away the disadvantage for low complementarity branchpoint sequences. Consistent with this idea, as the position of disrupted base pairs moved away from the BSL loop with M5 and M4, branchpoints with higher complementarity appeared to progressively regain their advantage. Based on the overall decrease in reporter splicing with the M6/M7, we infer that maintenance of at least some helical structure near the BSL loop is important, likely in positioning loop nucleotides for an initial toehold.
U2-Ortho constructs containing M3, which disrupts the base pair at the start of the stem, did not show a clear pattern with the different branchpoint reporters. BP-FC appeared to rescue some splicing of the single M3 mutation, but it was essentially not spliced with the combinations of M3/4, M3/4/5, and M3–7. Unexpectedly, M3/4 and M3/4/5 promoted BP-Ortho splicing better than that of the other branchpoint reporters, while M3–7 worked best with the weakest BP-NC1. Together these results demonstrate that even when present at only 2% of total U2 snRNA, as was the case for M3, sufficient U2-Ortho is present to allow for a significant amount of reporter splicing. Because M3–7 also disrupts the BSL base pairing near the loop, this result further supports the idea that stability of pairing near the BSL loop controls recognition of the branchpoint sequence.
Compensatory mutations do not reverse the negative effect of some BSL destabilizing mutations
In addition to participating in BSL base pairing, U2 snRNA nucleotides G25–C28 also pair with U6 snRNA in Helix Ia in the final steps of spliceosome catalytic activation [50]. As a result, mutations M3 and M4 have the potential to also interfere with splicing catalysis. The mutations do not fully disrupt U2/U6 pairing, as M3 G25A converts a G–U to an A–U pair and M4 converts a C–G to a U–G pair, and modeling the changes appears compatible with the U2/U6 Helix Ia structure (Supplementary Fig. S4). In an attempt to more definitively determine the primary deficit of the M3 and M4 mutants, we tested whether compensatory U2 mutations that restore BSL pairing could reverse their effects with M3/M8 (G25A, C45U) and M4/M9 (C28U, G42A) and the combined M3/8/4/9 (Fig. 4A). We assessed the mutants with the BP-Ortho reporter and newly performed replicates of M3, M4, and M3/M4 for comparison. While U2-Ortho expression levels remained similar for M4 (partially reduced), M3 and M3/M4 (significantly reduced) relative to BSL-WT, little to no expression of any of the compensatory U2-Ortho mutants was detected (Fig. 4B). The lack of detectable expression of the compensatory U2-Ortho mutants likely explains the absence of reporter splicing relative to EV control (Fig. 4C). On the other hand, the uncompensated BSL mutations maintained splicing levels akin to our previous experiments. Because of the highly deleterious effect of the compensatory mutations on U2-Ortho levels, we cannot rule in or out perturbation of U2/U6 Helix Ia as a contributing factor for the splicing defect associated with M3, M4, and M3/4. The results do suggest, however, that the negative effects of M3 and M4 mutations on U2-Ortho expression and reporter splicing are not as simple as loss of BSL base pairing, and that nucleotide identity at the compensatory positions is likely also important for U2 snRNA expression.
Figure 4.
Compensatory mutations do not reverse the negative effect of some BSL mutations. (A) Schematic of orthogonal human U2 snRNA BSL constructs with indicated mutations. (B) Representative PAGE of radiolabeled U2 snRNA primer extension analysis of cellular RNA for the indicated BSL mutants as described in Fig. 1D. Depending on the mutation, different primers were required for annealing. Again, positive controls represent reactions using the expression plasmid for each mutant as template. Ortho (BSL-WT, M3, M4, M3/4, respective plasmid positive controls, and EV reactions). (C) Comparison of small t intron splicing by RT-qPCR with the indicated U2-Ortho constructs. -ΔΔCt is derived relative to EV of the same transfection batch as described in Fig. 2D.
U2 snRNA BSL mutations have a subtle impact on steady-state splice isoform ratios in human cells
Recently, mutations to the U2 snRNA RNU2-1 and RNU2-2 BSL region were discovered in several tumor types and neurodevelopmental disorders, but their global effects on gene expression and splicing have not been published [11, 51, 52]. Notably, C28U, which was identified as a recurrent hot-spot mutation in B-cell, pancreatic, and prostate cancer, corresponds to our M4 BSL mutant, which had relatively mild effects on orthogonal reporter splicing and its dependency on branchpoint sequence complementarity. To determine whether these effects would carry over to other splicing contexts, we generated a lentivirus-transduced HEK293T with a wild-type RNU2-1 expression locus harboring C28T (U2-C28U) for RNA sequencing analysis. If the function of this mutation in cancer is related to a role in branchpoint sequence recognition, we expected that it may phenocopy the splicing changes observed with other frequent somatic mutations in U2 snRNP proteins and associated factors (e.g. SF3B1, U2AF, SRSF2) [7–10]. Point mutations in these proteins primarily induce changes in alternative 3′ splice site selection, cassette exon, and intron retention [53–62]. Alternatively, if C28U results in a generally less functional U2 snRNP, we reasoned that its splicing changes could resemble the effects of expressing U2-Ortho, which is presumably capable of being recruited to but not functioning at branchpoints in cellular transcripts and may confer a unique effect on cellular splicing. Therefore, we also transduced cells with a TAG36-28ACT RNU2-1 expression locus (U2-Ortho), as well as with an empty lentivirus vector control. After selecting for stable integration and verifying expression of the mutant U2 snRNA by primer extension (Supplementary Fig. S5A), we extracted total RNA from the transduced cell lines and submitted samples for poly(A)-selected RNA sequencing. The resulting mapped reads were analyzed for gene expression and splicing changes with the U2 snRNA mutants relative to EV control.
The splicing analysis did not result in evidence of widespread splicing changes with either construct. Out of the >300 000 splicing events detected for each cell line, only 175 splicing events were significantly different from the EV control (FDR ΔPSI ≤ 0.05) in the U2-C28U mutant and 186 events in the U2-Ortho mutant (Fig. 5A and Supplementary Tables S3 and S4). In both lines, the differential splice events were distributed similarly among event types, with skipped exon as most common. Alternative 3′ splice site changes are not clearly enriched relative to other types of splicing changes, as reported for SF3B1 cancer mutations [61–63]. Notably, among all the significantly altered events across both U2 mutants, 25% were shared, supporting a common impact on splicing (Fig. 5B). In several cases, the shared splicing change involved alternative inclusion of exons containing a premature termination codon (PTC) that targets the transcript for degradation by NMD. For example, we observed higher skipping of a “poison” cassette exon in splicing factor SRSF6 transcripts and the IDX cassette exon of the HRAS proto-oncogene (Fig. 5C and D). Both U2 mutant cell lines also show decreased usage of the alternative 3′ splice site with a downstream PTC in the snRNA 3′-tail processing gene TOE1 transcripts (Fig. 5E). In line with their NMD sensitivity, the expected correlated increase in expression of these genes, and thus higher expression of the functional isoform, was confirmed by the DESeq2 analysis described below (Fig. 5F). Non-NMD-related alternative splicing changes were also identified, including a novel functional isoform of the ubiquitin-activating enzyme gene UBA2 with the in-frame skipping of exon 13 [64], and a multiple skipping event of exons 7–9 in the tryptophan-metabolizing enzyme gene AFMID that is upregulated in several cancer lines [65] (Fig. 5G and H).
Figure 5.
Global splicing changes in cells expressing U2 snRNA BSL or BPRS mutations. (A) Bar graph quantifying different types of differential alternative splicing events for each U2 snRNA mutant relative to EV control cell line. A3 = Alternative 3′ splice site; A5 = Alternative 5′ splice site; AF = Alternative First Exon; AL = Alternative Last Exon; MS = Multiple Skipped Exons; MX = Mutually Exclusive Exons; RI = Retained Intron; SE = Skipped Exon. (B) Venn diagram illustrating differential alternative splicing events overlapping between U2-C28U and U2-Ortho. (C) Sashimi plot of SE event around exon 3 of SRSF6. Coverage is shown as raw read counts from BAM files and PSI is defined as event inclusion counts/total event counts. (D) Sashimi plot of SE event around IDX exon of HRAS. (E) Sashimi plot of A3 event at exon 7 of TOE1. (F) Differential expression values for SRSF6, HRAS, and TOE1 from DESeq2. (G) Sashimi plot of SE event around exon 13 of UBA2. (H) Sashimi plot of MS event around exons 7–9 of AFMID.
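The PSI definition given in panel (C), event inclusion counts divided by total event counts, and the derived ΔPSI between a mutant and the EV control can be written out as a minimal sketch. The function names and read counts below are hypothetical and do not correspond to any event in this study; the sketch also ignores the statistical modeling that a dedicated splicing analysis tool would apply.

```python
def psi(inclusion_counts, total_counts):
    """Percent spliced in, expressed as a fraction: inclusion / total."""
    if total_counts <= 0:
        raise ValueError("total_counts must be positive")
    return inclusion_counts / total_counts


def delta_psi(mutant, control):
    """ΔPSI = PSI(mutant) - PSI(control); each argument is (inclusion, total)."""
    return psi(*mutant) - psi(*control)


# Hypothetical read counts for one skipped-exon event.
ev_control = (60, 120)   # PSI = 0.50
u2_mutant = (30, 120)    # PSI = 0.25
print(delta_psi(u2_mutant, ev_control))
```

In this convention, a negative ΔPSI indicates lower inclusion (more skipping) in the mutant than in the control.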
We attempted to identify a shared feature of the altered splicing events that could confer sensitivity to the mutant U2 snRNAs. Because branchpoint sequence context is the most likely factor, we used a list of previously determined branchpoint sequences verified to interact with U2 snRNP to identify the branchpoints likely used for both altered and unaltered splicing events in our samples [66, 67]. We carried out motif analysis for a 21-nucleotide span of sequence centered on the branchpoint adenosine. In the large collection of unaltered splicing events, the human branchpoint consensus of yUNAy was the only significant motif found relative to a sequence shuffle control (Supplementary Fig. S5B). However, nothing rose to significance (E-value < 0.01) with the branchpoints from the altered splicing events compared to either sequence shuffle controls or the unaltered splicing events. We do not think that this result indicates that the altered splicing events have poor matches to the branchpoint consensus; instead, we attribute it to the small number of events. When we randomly chose a similarly small number of branchpoints from the unaltered events, no significant motif appeared. Another confounding factor for trying to discern the relative strength of branchpoints for different groups of splicing events is that most human introns contain three or more functional branchpoints [66]. Because our data cannot discern which is used for a given splicing event, we included all possibilities in this analysis. As another potential differentiating factor, we also compared the number of potential branchpoints per altered and unaltered splicing event, but their distributions are not distinguishable (Supplementary Fig. S5C). At this point, the factors conferring splicing sensitivity to the U2 snRNA mutants remain elusive. We note that a preprint from the Query group examined the effect of the C28U mutation in the context of the RNU2-2 variant and came to a similar conclusion [68].
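The input preparation for this kind of motif analysis can be sketched as follows: extracting a 21-nucleotide window centered on the branchpoint adenosine and producing a composition-matched shuffled control. This is a sketch under assumptions, not the procedure used here; the helper names and the example sequence are hypothetical, and the simple mononucleotide shuffle shown may differ from the shuffle control applied by the motif discovery software.

```python
import random


def branchpoint_window(intron_seq, bp_index, flank=10):
    """Return the (2 * flank + 1)-nt window centered on the branchpoint.

    `bp_index` is the 0-based position of the branchpoint adenosine.
    """
    start = bp_index - flank
    end = bp_index + flank + 1
    if start < 0 or end > len(intron_seq):
        raise ValueError("window extends past the sequence ends")
    return intron_seq[start:end]


def shuffled_control(seq, rng):
    """Mononucleotide shuffle: same base composition, randomized order."""
    nts = list(seq)
    rng.shuffle(nts)
    return "".join(nts)


# Hypothetical intron fragment with the branchpoint A at index 12.
intron = "G" * 12 + "A" + "U" * 12
window = branchpoint_window(intron, 12)
control = shuffled_control(window, random.Random(0))
print(window, control)
```

Passing an explicit `random.Random` instance keeps the control reproducible across runs.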
Global gene expression patterns change with the expression of U2 snRNA BSL mutations in human cells
We also looked at the effect of the two U2 mutants on gene expression and observed more significant changes. Relative to the EV control, 1823 downregulated and 2669 upregulated genes were identified in the U2-C28U cells, and 1923 downregulated and 2075 upregulated genes in the U2-Ortho cells (Fig. 6A and B). As with the splicing changes, many differentially expressed genes (∼47%) were shared between the two cell lines (Fig. 6C). However, only ∼30% of genes with altered splicing also showed differential expression with both U2 snRNA variants (Fig. 6D and E).
Figure 6.
U2 snRNA BSL and BPRS mutations result in global alterations to gene expression. (A) Plot of log2 fold change versus log10 mean expression for genes from Control versus U2-C28U comparison. Number of significant upregulated and downregulated genes (padj ≤.05) is shown. (B) Same as panel (A) for Control versus U2-Ortho comparison. (C) Venn diagram illustrating differentially expressed genes overlapping between U2-C28U and U2-Ortho. (D) Venn diagram illustrating overlap of genes with differentially expressed and alternative splicing for U2-C28U. (E) Same as panel (D) for U2-Ortho. (F) Bubble plot of GO Biological Process terms enriched in significantly upregulated genes from U2-C28U grouped by similar pathways. Color represents the significance of the enrichment [−log10(adjusted P-value)] and bubble size indicates the number of genes identified. (G) Same as panel (F) for U2-Ortho.
To determine how the differentially spliced or expressed genes were related to each other biologically, we carried out Gene Ontology (GO) Biological Process enrichment analysis. For both U2 snRNA mutations, no significantly enriched terms (adjusted P-value ≤.05) were identified among the genes with altered splicing or that were downregulated. In contrast, several pathways showed enrichment among the upregulated genes, many of them linked to general regulation of gene expression, such as RNA processing, translation, RNA decay, and nuclear pore formation (Fig. 6F and G). The magnitude of the expression changes for many of these genes was relatively modest. For example, the upregulation of genes involved in RNA processing is between 0.3 and 1 log2 fold change. However, these genes are normally well-expressed (>2.5 log10 fold mean expression), and cells may have limited ability to upregulate them much further (Supplementary Tables S5 and S6).
Overall, this analysis indicates that the presence of an altered U2 snRNA can be tolerated in cells with minimal impact on RNA splicing choices. However, there is an effect on gene expression. We postulate that the mutant U2 snRNPs may broadly compete with WT U2 snRNP to decrease splicing efficiency or accuracy, which would increase RNA decay of unspliced or mis-spliced transcripts through pathways like NMD. Cells may also have mechanisms to respond to the decrease in splicing efficiency by upregulating genes involved in RNA processing and gene expression more generally, which could be related to the selection of the C28U U2 snRNA mutation in cancer. Considering that the overexpression of the oncogene MYC is also associated with upregulation of splicing factors [69] and increased skipping of the HRAS IDX cassette exon [70], we looked more closely at its expression. Although the MYC transcript levels in both U2 mutant cell lines were not significantly different, we noted that GO term analysis of upregulated genes in the U2 mutants using the Molecular Signatures Database (MSigDB) Hallmark sets revealed that MYC target genes, mTORC signaling genes, and G2-M checkpoint genes are highly overrepresented (Fig. 7). We speculate that the response to or effect of U2 snRNA mutants may contribute to cancer development or progression by phenocopying MYC overexpression.
Figure 7.
MYC-target genes are upregulated with mutant U2 snRNA expression. Bubble plot of MSigDB Hallmark Enrichment GO terms enriched for significantly upregulated genes with U2-C28U and U2-Ortho. Color represents the significance of the enrichment [−log10(adjusted P-value)] and bubble size indicates the number of genes identified.
Discussion
This study of the function of the human U2 snRNA BSL in splicing was motivated by foundational studies in budding yeast that identified this conserved stem-loop structure and its involvement in branchpoint recognition [20], as well as the subsequent cryo-EM studies demonstrating that the BSL nucleotides form an extension of the branch helix in early prespliceosome (A complex) assembly [14, 19, 23, 71]. These observations set up BSL unwinding as a potential regulatory step for branchpoint engagement during pre-mRNA splicing and prompted the model of toehold-mediated intron strand invasion as a molecular mechanism [23]. We therefore set out to test predictions of the hypothesis that, in the human U2 snRNA, BSL stability is balanced to enable folding of the helix with its role in branch helix formation.
Based on the interplay between different BSL mutations and branchpoint sequence complementarity, our findings are consistent with the toehold model of BSL unwinding and further lead us to propose that the relative stabilities of the toehold and BSL pairing function together as something of a two-factor authentication system for branchpoint recognition (Fig. 8A). The presence and positioning of the BSL serves as the first factor, ensuring that U2 snRNA is in the correct conformation to establish a toehold check in branchpoint pairing. Loss of the G–C base pair (M3) at the base of the stem, and thus the stacking interactions with the SF3A3 protein, would likely have the largest impact on this factor. Indeed, the highly negative impact of the M3 G25A mutants on both splicing and U2 snRNA steady-state levels supports this model. We also acknowledge that some of M3’s negative impact on splicing could be due to changing the G–U pair with U6 snRNA to an A–U pair, as previously discussed (Supplementary Fig. S4). The M3 change would also disrupt a G–C pair in an extension of Stem Loop I that is mutually exclusive of the BSL (Fig. 1A). However, despite being part of the more thermodynamically favored structure and detectable by chemical probing in the absence of SF3 proteins, the extension of Stem Loop I is of unknown relevance to U2 snRNA function and has not been observed in cryo-EM structures. Still, the structure could play a role in either snRNP biogenesis and/or U2 snRNA recycling after release from the intron-lariat spliceosome and contribute to the highly negative impact of BSL-stabilizing mutations M2 and M1/M2 on both U2 snRNA steady-state levels and reporter splicing. Another element to consider is the potential effect of mutations on U2 snRNA modifications and their demonstrated contributions to U2 snRNP assembly and function [72–74].
Of the tested BSL mutations, those directly disrupting BSL nucleotide modification include the ACU orthogonal conversion (U34 pseudouridylation), M2 (pseudouridylation of U43), M1 (m6A and 2′-O-methylation of A30), and M3 (2′-O-methylation of G25). Our data cannot discount the contribution of modifications to the decrease in splicing and U2 snRNA steady-state levels, although we note that no individual BSL modification has been identified as essential, and U2 snRNAs lacking modifications downstream of either the 24th or 27th nucleotides assemble into snRNPs and are functional for splicing in HeLa nuclear extract or frog oocytes, respectively [72, 73].
Figure 8.
Proposed U2 snRNA BSL two-factor authentication for branch helix formation. (A) The first factor requires the BSL to be positioned to establish a toehold interaction, while the second factor depends on the balance between toehold and BSL stability to control intron strand invasion. (B) Model for how altering the balance between toehold and BSL stability impacts branchpoint recognition and branch helix formation. (C) Cryo-EM models illustrating the molecular interactions near the 5′ end of U2 snRNA in the human 17S U2 snRNP (PDB 7EVO) and S. cerevisiae Complex A (PDB 6G90). Proteins near U2 snRNA structures are labeled. DDX46 was removed in the far-left image to better view protein interactions stabilizing the BSL. BPRS and BSL nucleotides that contact the intron are highlighted.
The second authentication factor depends on the balance between toehold and BSL stability to ensure branchpoint sequence fidelity. A stable toehold with the intron enables BSL unwinding by increasing the likelihood of strand invasion and subsequent branch helix formation, while a less stable toehold would decrease the probability of strand invasion and may also allow sampling of other potential branchpoint positions within the intron. The correlation between increased splicing and stronger toehold base pairing across our collection of orthogonal reporters is consistent with this idea. For example, the similar splicing efficiencies of NC2 and NC6 correlate with their identical toehold pairing, rather than with their considerably different predicted branchpoint strengths. In the same vein, while the predicted branchpoint strength of NC1 is close to that of NC2, its toehold pairing and resulting splicing efficiency are both lower. The model is also consistent with the boost in splicing for both BP-Ortho and BP-FC relative to the other branchpoints, as only they offer a Watson–Crick interaction as the first base pair (+1) formed by the invading intron strand (BP-Ortho and BP-FC versus NC reporters). BSL stability also contributes to the influence of toehold and +1 pairing (Fig. 8B). For example, when a BSL base pair is added near the BSL loop (M1), toehold pairing competing for BSL nucleotide pairing explains the slight decrease in BP-Ortho splicing and the increase in the splicing differential for the fully complementary branchpoint BP-FC. Conversely, the influence of complementarity is greatly diminished when the competing BSL base pairs near the toehold are absent (M6 and M7). The diminishment attenuates as the BSL base pairing is perturbed further from the loop and thus the toehold (M4 and M5).
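As a concrete point of reference for what branchpoint complementarity means in this discussion, the following minimal Python sketch counts Watson–Crick and wobble pairs between a candidate branchpoint sequence and a U2 snRNA BPRS, assuming the branchpoint adenosine is bulged and unpaired. The function name, its parameters, and the default GUAGUA BPRS are illustrative choices, not part of our analysis pipeline, and the sketch does not model toehold pairing with the BSL loop, BSL stem stability, or predicted branchpoint strength.

```python
# Illustrative only: score complementarity between a candidate branchpoint
# sequence and a U2 snRNA branchpoint recognition sequence (BPRS).
# Assumption: the branchpoint adenosine is bulged out and does not pair.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def score_branchpoint_pairing(intron_seq, branch_index, bprs="GUAGUA"):
    """Count paired positions between an intron segment and the BPRS.

    intron_seq   -- intron segment written 5'->3' (DNA or RNA alphabet)
    branch_index -- 0-based position of the bulged branchpoint adenosine
    bprs         -- U2 snRNA recognition sequence written 5'->3'
    """
    intron_seq = intron_seq.upper().replace("T", "U")
    if intron_seq[branch_index] != "A":
        raise ValueError("branch_index must point at an adenosine")
    # Drop the bulged adenosine; the remaining nucleotides pair antiparallel.
    paired = intron_seq[:branch_index] + intron_seq[branch_index + 1:]
    if len(paired) != len(bprs):
        raise ValueError("intron segment must be one nucleotide longer than the BPRS")
    # Read the BPRS 3'->5' so that it aligns with the intron read 5'->3'.
    u2_3to5 = bprs.upper().replace("T", "U")[::-1]
    pairs = list(zip(paired, u2_3to5))
    return {
        "watson_crick": sum(p in WATSON_CRICK for p in pairs),
        "wobble": sum(p in WOBBLE for p in pairs),
        "mismatch": sum(p not in WATSON_CRICK and p not in WOBBLE for p in pairs),
    }


if __name__ == "__main__":
    # Yeast consensus UACUAAC; the branchpoint adenosine sits at index 5.
    print(score_branchpoint_pairing("UACUAAC", 5))
```

With the yeast consensus UACUAAC and the wild-type GUAGUA BPRS, all six paired positions are Watson–Crick; passing a different `bprs` string would be one way to represent an orthogonal U2 snRNA with an altered recognition sequence.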
One implication of the toehold strand invasion model is that the nature of base pairing within the extended helix could influence the efficiency of BSL unwinding. Although we did not directly investigate this in the present study, transcriptome-wide analysis of human branchpoint sequences provides little evidence of selection for Watson–Crick pairing [6, 66, 67, 75], suggesting that base stacking is the primary force driving strand exchange. We note that the extended helix at the orthogonal small t branchpoint in our reporter contains four Watson–Crick base pairs (Supplementary Fig. S6A), more than those found at the nearby large T branchpoints (Supplementary Fig. S6B). Whether this difference contributes to the previously observed preferential usage of this branchpoint in SV40 T antigen splicing in HEK293 cells [76] remains unclear, and the influence of extended helix sequence on BSL unwinding warrants further investigation.
Overall, our orthogonal splicing study sheds new light on how an inherently unstable structure in U2 snRNA facilitates branchpoint recognition and adds functional support for toehold-mediated strand invasion as the initiating step of branch helix formation. The two-factor authentication model offers a mechanism by which BSL formation and unwinding together monitor branchpoint sequence pairing for appropriate branch helix formation. We note that the hand-in-hand timing of BSL unwinding and branch helix formation adds constraints to the order and nature of the additional rearrangements required to move U2 snRNP from the 17S form to the spliceosome A complex (prespliceosome). It seems unlikely that toehold formation and BSL unwinding occur in the context of the 17S U2 snRNP as visualized by cryo-EM, because the position of HTATSF1 is incompatible with intron access to the BSL loop [16, 19] (Fig. 8C). Before or during BSL unwinding, interactions between U2 snRNA and SF3 proteins stabilizing the BSL must also be released so that the nucleotides 5′ of the BSL and Stem I are displaced to essentially flip 180 degrees as visualized in the spliceosome A complex [14]. Likely, the rearrangements are regulated by ATP-driven displacement of HTATSF1 through DDX46 (PRP5). We imagine that the intron is somehow “held in waiting” to avoid forming the unproductive Branch Helix–Mimicking Stem Loop observed with HTATSF1 release in the absence of an intron [47]. In this structure, nucleotides from the 5′ end of U2 snRNA pair with the U2 BPRS and some residues within the BSL.
Our transcriptomic studies demonstrate that some alteration of BSL nucleotides in a subpopulation of U2 snRNA is relatively well tolerated. We chose to examine C28U because it was uniquely identified as a recurrent mutation in several cancer types [11]. We also tested U2-Ortho to see whether the presence of an snRNP with an altered BPRS would have a global effect on splicing outcomes. As we observed only a small effect on U2 snRNA steady-state levels and orthogonal reporter splicing from the M4 (C28U) mutation, the limited number of changes detected for global splicing patterns when the same mutation is expressed in a non-orthogonal U2 snRNA (U2-C28U) was perhaps not so surprising. The significant overlap in altered splicing with U2-Ortho was more unexpected. Analysis of the branchpoints of affected introns did not reveal the clear link between specific branchpoint features and the identified splicing changes that our model for branchpoint fidelity and BSL unwinding might predict. The splicing effects of our U2 snRNA mutants also did not resemble those of the reported cancer mutations in U2 snRNP-associated proteins SF3B1, U2AF1, and SRSF2 [53–62]. However, several limitations of the splicing analysis must be acknowledged. We did not inhibit the NMD pathway that would degrade many aberrantly or poorly spliced transcripts, which may account for the general decrease in expression of a wide variety of transcripts with both U2-C28U and U2-Ortho. Indeed, a preprint examining the effect of both C28U and C28A mutations in RNU2-1 and RNU2-2 sequences expressed from the RNU2-2 single locus showed that NMD removed many poorly or aberrantly spliced transcripts. Similar to our analysis, that study also did not identify a shared feature of splicing events altered with their C28U expression constructs [68].
The similar effect of expressing U2-C28U and U2-Ortho extends to changes in gene expression as well. Based on the findings in Chi et al. [68], we speculate that downregulated transcripts are largely due to NMD-mediated decay of poorly or mis-spliced transcripts. The splicing defect may result from mutant snRNPs competing with the endogenous U2 snRNP for recruitment to branchpoint sequences. Although this defect is not sufficient to disrupt their physiology, cells respond by generally upregulating the machinery involved in gene expression and cell proliferation. This upregulation may be favored in cancers, which commonly exhibit hypersensitivity to decreases in either spliceosome abundance or functionality [69, 77]. The pathways mediating the expression changes resulting from U2 snRNA mutation could therefore hold promise as targets for new cancer therapies.
Acknowledgements
We thank Dr Manuel Ares and Dr Hannah Maul-Newby for insightful discussion and comments on the manuscript. We thank our many colleagues at UCSC for assistance, especially Dr Victor Tse and Dr Zach Neeb (RT-qPCR assays), Dr Angela Brooks and Dr Cindy Liang (RNA sequencing), and Kristina Oh (cloning U2 snRNA mutant plasmids). RNA sequencing was carried out at the DNA Technologies and Expression Analysis Cores at the UC Davis Genome Center, supported by NIH Shared Instrumentation Grant 1S10OD010786-01. Lastly, we thank Dr Nicholas Stevers at the University of California San Francisco for lentivirus reagents and protocol, computational resources, and RNA-seq analysis support.
Author contributions: Meredith B. Stevers (Conceptualization [lead], Data curation [lead], Formal analysis [lead], Funding acquisition [supporting], Investigation [lead], Methodology [lead], Project administration [supporting], Validation [lead], Visualization [lead], Writing – original draft [lead], Writing – review & editing [equal]), Sol Katzman (Formal analysis [supporting], Resources [supporting], Software [supporting]), and Melissa S. Jurica (Conceptualization [supporting], Funding acquisition [lead], Methodology [supporting], Project administration [lead], Supervision [lead], Visualization [supporting], Writing – review & editing [equal])
Contributor Information
Meredith B Stevers, Molecular, Cell & Developmental Biology, University of California Santa Cruz, Santa Cruz, CA 95064, United States.
Sol Katzman, Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, United States.
Melissa S Jurica, Molecular, Cell & Developmental Biology, University of California Santa Cruz, Santa Cruz, CA 95064, United States; Center for Molecular Biology of RNA, University of California Santa Cruz, Santa Cruz, CA 95064, United States.
Supplementary data
Supplementary data is available at NAR online.
Conflict of interest
M.S.J. holds the position of Executive Editor for Nucleic Acids Research and has not peer reviewed or made any editorial decisions for this paper.
Funding
This work was funded by National Institutes of Health [R01GM72649 to M.S.J.]; and the National Institutes of Health training grant fellowship [T32GM133391 to M.B.S.].
Data availability
The underlying data of this study are available in the article and supplementary material. The RNA sequencing data are available in the Gene Expression Omnibus (GEO) database at https://www.ncbi.nlm.nih.gov/geo/, under accession number GSE303759.
References
- 1. Reed R, Maniatis T. Intron sequences involved in lariat formation during pre-mRNA splicing. Cell. 1985;41:95–105. 10.1016/0092-8674(85)90064-9.
- 2. Parker R, Siliciano PG, Guthrie C. Recognition of the TACTAAC box during mRNA splicing in yeast involves base pairing to the U2-like snRNA. Cell. 1987;49:229–39. 10.1016/0092-8674(87)90564-2.
- 3. Wu J, Manley JL. Mammalian pre-mRNA branch site selection by U2 snRNP involves base pairing. Genes Dev. 1989;3:1553–61. 10.1101/gad.3.10.1553.
- 4. Zhuang Y, Weiner AM. A compensatory base change in human U2 snRNA can suppress a branch site mutation. Genes Dev. 1989;3:1545–52. 10.1101/gad.3.10.1545.
- 5. Gao K, Masuda A, Matsuura T et al. Human branch point consensus sequence is yUnAy. Nucleic Acids Res. 2008;36:2257–67. 10.1093/nar/gkn073.
- 6. Mercer TR, Clark MB, Andersen SB et al. Genome-wide discovery of human splicing branchpoints. Genome Res. 2015;25:290–303. 10.1101/gr.182899.114.
- 7. Quesada V, Conde L, Villamor N et al. Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia. Nat Genet. 2011;44:47–52. 10.1038/ng.1032.
- 8. Yoshida K, Sanada M, Shiraishi Y et al. Frequent pathway mutations of splicing machinery in myelodysplasia. Nature. 2011;478:64–9. 10.1038/nature10496.
- 9. Graubert TA, Shen D, Ding L et al. Recurrent mutations in the U2AF1 splicing factor in myelodysplastic syndromes. Nat Genet. 2011;44:53–7. 10.1038/ng.1031.
- 10. Papaemmanuil E, Cazzola M, Boultwood J et al. Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N Engl J Med. 2011;365:1384–95. 10.1056/NEJMoa1103283.
- 11. Bousquets-Muñoz P, Díaz-Navarro A, Nadeu F et al. PanCancer analysis of somatic mutations in repetitive regions reveals recurrent mutations in snRNA U2. NPJ Genom Med. 2022;7:19. 10.1038/s41525-022-00292-2.
- 12. Galej WP, Wilkinson ME, Fica SM et al. Cryo-EM structure of the spliceosome immediately after branching. Nature. 2016;537:197–201. 10.1038/nature19316.
- 13. Plaschka C, Lin PC, Nagai K. Structure of a pre-catalytic spliceosome. Nature. 2017;546:617–21. 10.1038/nature22799.
- 14. Plaschka C, Lin PC, Charenton C et al. Prespliceosome structure provides insights into spliceosome assembly and regulation. Nature. 2018;559:419–22. 10.1038/s41586-018-0323-8.
- 15. Rauhut R, Fabrizio P, Dybkov O et al. Molecular architecture of the Saccharomyces cerevisiae activated spliceosome. Science. 2016;353:1399–405. 10.1126/science.aag1906.
- 16. Zhang Z, Will CL, Bertram K et al. Molecular architecture of the human 17S U2 snRNP. Nature. 2020;583:310–3. 10.1038/s41586-020-2344-3.
- 17. Bertram K, Agafonov DE, Dybkov O et al. Cryo-EM structure of a pre-catalytic human spliceosome primed for activation. Cell. 2017;170:701–13.e11. 10.1016/j.cell.2017.07.011.
- 18. Yan C, Wan R, Bai R et al. Structure of a yeast activated spliceosome at 3.5 Å resolution. Science. 2016;353:904–11. 10.1126/science.aag0291.
- 19. Zhang X, Zhan X, Bian T et al. Structural insights into branch site proofreading by human spliceosome. Nat Struct Mol Biol. 2024;31:835–45. 10.1038/s41594-023-01188-0.
- 20. Perriman R, Ares M., Jr. Invariant U2 snRNA nucleotides form a stem loop to recognize the intron early in splicing. Mol Cell. 2010;38:416–27. 10.1016/j.molcel.2010.02.036.
- 21. Behrens SE, Tyc K, Kastner B et al. Small nuclear ribonucleoprotein (RNP) U2 contains numerous additional proteins and has a bipartite RNP structure under splicing conditions. Mol Cell Biol. 1993;13:307–19.
- 22. Urabe VK, Stevers M, Ghosh AK et al. U2 snRNA structure is influenced by SF3A and SF3B proteins but not by SF3B inhibitors. PLoS One. 2021;16:e0258551. 10.1371/journal.pone.0258551.
- 23. Cretu C, Gee P, Liu X et al. Structural basis of intron selection by U2 snRNP in the presence of covalent inhibitors. Nat Commun. 2021;12:4491. 10.1038/s41467-021-24741-1.
- 24. Green SJ, Lubrich D, Turberfield AJ. DNA hairpins: fuel for autonomous DNA devices. Biophys J. 2006;91:2966–75. 10.1529/biophysj.106.084681.
- 25. Wu J, Manley JL. Multiple functional domains of human U2 small nuclear RNA: strengthening conserved stem I can block splicing. Mol Cell Biol. 1992;12:5464–73.
- 26. Smith DJ, Konarska MM, Query CC. Insights into branch nucleophile positioning and activation from an orthogonal pre-mRNA splicing system in yeast. Mol Cell. 2009;34:333–43. 10.1016/j.molcel.2009.03.012.
- 27. Wu JA, Manley JL. Base pairing between U2 and U6 snRNAs is necessary for splicing of a mammalian pre-mRNA. Nature. 1991;352:818–21. 10.1038/352818a0.
- 28. Huang Q, Pederson T. A human U2 RNA mutant stalled in 3′ end processing is impaired in nuclear import. Nucleic Acids Res. 1999;27:1025–31. 10.1093/nar/27.4.1025.
- 29. Ares M, Mangin M, Weiner AM. Orientation-dependent transcriptional activator upstream of a human U2 snRNA gene. Mol Cell Biol. 1985;5:1560–70.
- 30. Corvelo A, Hallegger M, Smith CW et al. Genome-wide association between branch point properties and alternative splicing. PLoS Comput Biol. 2010;6:e1001016. 10.1371/journal.pcbi.1001016.
- 31. Camacho Londoño J, Philipp SE. A reliable method for quantification of splice variants using RT-qPCR. BMC Molecular Biol. 2016;17:8. 10.1186/s12867-016-0060-1.
- 32. Patro R, Duggal G, Love MI et al. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14:417–9. 10.1038/nmeth.4197.
- 33. Ewels P, Magnusson M, Lundin S et al. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016;32:3047–8. 10.1093/bioinformatics/btw354.
- 34. Dobin A, Davis CA, Schlesinger F et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. 10.1093/bioinformatics/bts635.
- 35. Li H, Handsaker B, Wysoker A et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–9. 10.1093/bioinformatics/btp352.
- 36. Ramírez F, Dündar F, Diehl S et al. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 2014;42:W187–91.
- 37. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550.
- 38. Chen EY, Tan CM, Kou Y et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013;14:128.
- 39. Kuleshov MV, Jones MR, Rouillard AD et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90–7. 10.1093/nar/gkw377.
- 40. Xie Z, Bailey A, Kuleshov MV et al. Gene set knowledge discovery with Enrichr. Current Protocols. 2021;1:e90. 10.1002/cpz1.90.
- 41. Pertea M, Pertea GM, Antonescu CM et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33:290–5. 10.1038/nbt.3122.
- 42. Ritter AJ, Wallace A, Ronaghi N et al. junctionCounts: comprehensive alternative splicing analysis and prediction of isoform-level impacts to the coding sequence. NAR Genom Bioinform. 2024;6:lqae093. 10.1093/nargab/lqae093.
- 43. Robinson JT, Thorvaldsdóttir H, Winckler W et al. Integrative genomics viewer. Nat Biotechnol. 2011;29:24–6. 10.1038/nbt.1754.
- 44. Reuter JS, Mathews DH. RNAstructure: software for RNA secondary structure prediction and analysis. BMC Bioinformatics. 2010;11:129. 10.1186/1471-2105-11-129.
- 45. Ali SE, Mittal A, Mathews DH. RNA secondary structure analysis using RNAstructure. Current Protocols. 2023;3:e846. 10.1002/cpz1.846.
- 46. Noble JC, Prives C, Manley JL. Alternative splicing of SV40 early pre-mRNA is determined by branch site selection. Genes Dev. 1988;2:1460–75. 10.1101/gad.2.11.1460.
- 47. Tholen J, Razew M, Weis F et al. Structural basis of branch site recognition by the human spliceosome. Science. 2022;375:50–7. 10.1126/science.abm4245.
- 48. Li M, Pritchard PH. Characterization of the effects of mutations in the putative branchpoint sequence of intron 4 on the splicing within the human lecithin:cholesterol acyltransferase gene. J Biol Chem. 2000;275:18079–84. 10.1074/jbc.M910197199.
- 49. Královicová J, Houngninou-Molango S, Krämer A et al. Branch site haplotypes that control alternative splicing. Hum Mol Genet. 2004;13:3189–202. 10.1093/hmg/ddh334.
- 50. Townsend C, Leelaram MN, Agafonov DE et al. Mechanism of protein-guided folding of the active site U2/U6 RNA during spliceosome activation. Science. 2020;370:eabc3753. 10.1126/science.abc3753.
- 51. Greene D, De Wispelaere K, Lees J et al. Mutations in the small nuclear RNA gene RNU2-2 cause a severe neurodevelopmental disorder with prominent epilepsy. Nat Genet. 2025;57:1367–73. 10.1038/s41588-025-02159-5.
- 52. Jackson A, Blakes AJM, Wall E et al. Biallelic variants in RNU2-2 cause a remarkably frequent developmental epileptic encephalopathy. medRxiv, 10.1101/2025.09.02.25334957, 4 September 2025, preprint; not peer reviewed.
- 53. Brooks AN, Choi PS, de Waal L et al. A pan-cancer analysis of transcriptome changes associated with somatic mutations in U2AF1 reveals commonly altered splicing events. PLoS One. 2014;9:e87361. 10.1371/journal.pone.0087361.
- 54. Okeyo-Owuor T, White BS, Chatrikhi R et al. U2AF1 mutations alter sequence specificity of pre-mRNA binding and splicing. Leukemia. 2015;29:909–17. 10.1038/leu.2014.303.
- 55. Ilagan JO, Ramakrishnan A, Hayes B et al. U2AF1 mutations alter splice site recognition in hematological malignancies. Genome Res. 2015;25:14–26. 10.1101/gr.181016.114.
- 56. Zhang J, Lieu YK, Ali AM et al. Disease-associated mutation in SRSF2 misregulates splicing by altering RNA-binding affinities. Proc Natl Acad Sci USA. 2015;112:E4726–34.
- 57. Kim E, Ilagan JO, Liang Y et al. SRSF2 mutations contribute to myelodysplasia by mutant-specific effects on exon recognition. Cancer Cell. 2015;27:617–30. 10.1016/j.ccell.2015.04.006.
- 58. Shiozawa Y, Malcovati L, Gallì A et al. Aberrant splicing and defective mRNA production induced by somatic spliceosome mutations in myelodysplasia. Nat Commun. 2018;9:3649. 10.1038/s41467-018-06063-x.
- 59. Wang L, Brooks AN, Fan J et al. Transcriptomic characterization of SF3B1 mutation reveals its pleiotropic effects in chronic lymphocytic leukemia. Cancer Cell. 2016;30:750–63. 10.1016/j.ccell.2016.10.005.
- 60. Pacholewska A, Lienhard M, Brüggemann M et al. Long-read transcriptome sequencing of CLL and MDS patients uncovers molecular effects of SF3B1 mutations. Genome Res. 2024;34:1832–48. 10.1101/gr.279327.124.
- 61. Alsafadi S, Houy A, Battistella A et al. Cancer-associated SF3B1 mutations affect alternative splicing by promoting alternative branchpoint usage. Nat Commun. 2016;7:10615. 10.1038/ncomms10615.
- 62. Darman RB, Seiler M, Agrawal AA et al. Cancer-associated SF3B1 hotspot mutations induce cryptic 3′ splice site selection through use of a different branch point. Cell Rep. 2015;13:1033–45. 10.1016/j.celrep.2015.09.053.
- 63. Fernandez MM, Yu L, Jia Q et al. Engineering oncogenic hotspot mutations on SF3B1 via CRISPR-directed PRECIS mutagenesis. Cancer Res Commun. 2024;4:2498–513. 10.1158/2767-9764.CRC-24-0145.
- 64. Chmielarska K. Biochemical and cell biological characterisation of Sumo E1 activating enzyme Aos1/Uba2. Dissertation, LMU München: Faculty of Chemistry and Pharmacy. 2005. 10.5282/edoc.6302.
- 65. Lin KT, Ma WK, Scharner J et al. A human-specific switch of alternatively spliced AFMID isoforms contributes to TP53 mutations and tumor recurrence in hepatocellular carcinoma. Genome Res. 2018;28:275–84. 10.1101/gr.227181.117.
- 66. Pineda JMB, Bradley RK. Most human introns are recognized via multiple and tissue-specific branchpoints. Genes Dev. 2018;32:577–91. 10.1101/gad.312058.118.
- 67. Damianov A, Lin CH, Huang J et al. The splicing regulators RBM5 and RBM10 are subunits of the U2 snRNP engaged with intron branch sites on chromatin. Mol Cell. 2024;84:1496–511. 10.1016/j.molcel.2024.02.039.
- 68. Chi Z, Gupta V, Query C. U2-2 snRNA mutations alter the transcriptome. bioRxiv, 10.1101/2023.06.11.543863, 12 June 2023, preprint; not peer reviewed.
- 69. Koh CM, Bezzi M, Low DH et al. MYC regulates the core pre-mRNA splicing machinery as an essential step in lymphomagenesis. Nature. 2015;523:96–100. 10.1038/nature14351.
- 70. Chen X, Yang HT, Zhang B et al. The RNA-binding proteins hnRNP H and F regulate splicing of a MYC-dependent HRAS exon in prostate cancer cells. Proc Natl Acad Sci USA. 2023;120:e2220190120. 10.1073/pnas.2220190120.
- 71. Zhang Z, Rigo N, Dybkov O et al. Structural insights into how Prp5 proofreads the pre-mRNA branch site. Nature. 2021;596:296–300. 10.1038/s41586-021-03789-5.
- 72. Dönmez G, Hartmuth K, Lührmann R. Modified nucleotides at the 5′ end of human U2 snRNA are required for spliceosomal E-complex formation. RNA. 2004;10:1925–33.
- 73. Yu YT, Shu MD, Steitz JA. Modifications of U2 snRNA are required for snRNP assembly and pre-mRNA splicing. EMBO J. 1998;17:5783–95. 10.1093/emboj/17.19.5783.
- 74. Zhao X, Yu YT. Pseudouridines in and near the branch site recognition region of U2 snRNA are required for snRNP biogenesis and pre-mRNA splicing in Xenopus oocytes. RNA. 2004;10:681–90. 10.1261/rna.5159504.
- 75. Paggi JM, Bejerano G. A sequence-based, deep learning model accurately predicts RNA splicing branchpoints. RNA. 2018;24:1647–58. 10.1261/rna.066290.118.
- 76. Noble JC, Pan ZQ, Prives C et al. Splicing of SV40 early pre-mRNA to large T and small t mRNAs utilizes different patterns of lariat branch sites. Cell. 1987;50:227–36. 10.1016/0092-8674(87)90218-2.
- 77. Hsu TY, Simon LM, Neill NJ et al. The spliceosome is a therapeutic vulnerability in MYC-driven cancer. Nature. 2015;525:384–8. 10.1038/nature14985.
- 78. Bailey TL. STREME: accurate and versatile sequence motif discovery. Bioinformatics. 2021;37:2834–40.