The RNA-binding protein SFPQ preserves long-intron splicing and regulates circRNA biogenesis in mammals

Lotte Victoria Winther Stagsted; Eoghan Thomas O'Leary; Karoline Kragh Ebbesen; Thomas Birkballe Hansen

doi:10.7554/eLife.63088

. 2021 Jan 21;10:e63088. doi: 10.7554/eLife.63088

The RNA-binding protein SFPQ preserves long-intron splicing and regulates circRNA biogenesis in mammals

Lotte Victoria Winther Stagsted ¹, Eoghan Thomas O'Leary ¹, Karoline Kragh Ebbesen ¹, Thomas Birkballe Hansen ^1,^✉

Editors: Juan Valcárcel², James L Manley³

PMCID: PMC7819710 PMID: 33476259

Abstract

Circular RNAs (circRNAs) represent an abundant and conserved entity of non-coding RNAs; however, the principles of biogenesis are currently not fully understood. Here, we identify two factors, splicing factor proline/glutamine rich (SFPQ) and non-POU domain-containing octamer-binding protein (NONO), to be enriched around circRNA loci. We observe a subclass of circRNAs, coined DALI circRNAs, with distal inverted Alu elements and long flanking introns to be highly deregulated upon SFPQ knockdown. Moreover, SFPQ depletion leads to increased intron retention with concomitant induction of cryptic splicing, premature transcription termination, and polyadenylation, particularly prevalent for long introns. Aberrant splicing in the upstream and downstream regions of circRNA producing exons are critical for shaping the circRNAome, and specifically, we identify missplicing in the immediate upstream region to be a conserved driver of circRNA biogenesis. Collectively, our data show that SFPQ plays an important role in maintaining intron integrity by ensuring accurate splicing of long introns, and disclose novel features governing Alu-independent circRNA production.

Research organism: Human, Mouse

Introduction

Gene expression is the output of multiple tightly coupled and controlled steps within the cell, which are highly regulated by a variety of factors and processes. Among these are the physical and functional interactions between the transcriptional and splicing machineries that are of great importance for the generation of both canonical and alternative isoforms of RNA transcripts. This includes a novel class of unique, closed circular RNA (circRNA) molecules.

CircRNAs are evolutionary conserved and display differential expression across cell types, tissues, and developmental stages. The highly stable circular conformation is obtained by covalently joining a downstream splice donor to an upstream splice acceptor, a backsplicing process catalyzed by the spliceosome (Memczak et al., 2013; Jeck et al., 2013; Salzman et al., 2013; Ashwal-Fluss et al., 2014; Hansen et al., 2013). The vast majority of circRNAs derive from coding sequences, making their biogenesis compete with the production of linear isoforms (Ashwal-Fluss et al., 2014; Salzman et al., 2013, Salzman et al., 2012). Complementary sequences in the flanking introns can facilitate the production of circRNAs (Dubin et al., 1995; Ivanov et al., 2015; Jeck et al., 2013; Westholm et al., 2014; Zhang et al., 2014), where the primate-specific Alu repeats are found to be significantly enriched in the flanking introns of circRNAs (Jeck et al., 2013; Ivanov et al., 2015; Venø et al., 2015). In some cases, exon skipping has been shown to stimulate circularization of the skipped exon (Barrett et al., 2015). However, in both human and Drosophila, biogenesis of the most abundant and conserved pool of circRNAs tend to be driven by long flanking introns rather than the presence of proximal inverted repeats in the flanking sequences (Westholm et al., 2014; Stagsted et al., 2019). The biogenesis of circRNAs without inverted repeats is currently not understood in detail, although RNA-binding proteins (RBPs) associating with the flanking introns of circRNAs have been shown to be important (Ashwal-Fluss et al., 2014; Conn et al., 2015; Errichelli et al., 2017).

Here, we aim to identify additional protein factors involved in circRNA biogenesis. To this end, we exploited the enormous eCLIP and RNA sequencing resource available from the ENCODE consortium (ENCODE Project Consortium, 2012). Stratifying eCLIP hits across the genome with circRNA loci coordinates revealed the splicing factor proline/glutamine rich (SFPQ) and non-POU domain-containing octamer-binding protein (NONO) as highly enriched around circRNAs compared to other exons. Both proteins belong to the multifunctional Drosophila behavior/human splicing (DBHS) family with highly conserved RNA recognition motifs (RRMs) (Dong et al., 1993) and they are often found as a heterodimeric complex (Knott et al., 2016; Knott et al., 2015; Lee et al., 2015; Passon et al., 2012). The proteins are predominantly located to the nucleus, in particular to the membrane-less condensates known as paraspeckles (Clemson et al., 2009; Fox et al., 2018), where they play a pivotal role in cellular mechanisms ranging from regulation of transcription by interaction with the C-terminal domain (CTD) of RNA polymerase II (Buxadé et al., 2008; Rosonina et al., 2005; Urban et al., 2000), pre-mRNA splicing (Emili et al., 2002; Ito et al., 2008; Kameoka et al., 2004; Peng et al., 2002) and 3’end processing (Kaneko et al., 2007; Rosonina et al., 2005) to nuclear retention (Zhang and Carmichael, 2001) and nuclear export of RNA (Furukawa et al., 2015). Recently, SFPQ has been implicated in ensuring proper transcription elongation of neuronal genes (Takeuchi et al., 2018) representing an interesting link to circRNAs, as these are highly abundant in neuronal tissues and often derive from neuronal genes (Rybak-Wolf et al., 2015).

Here, we show that SFPQ depletion leads to specific deregulation of circRNAs with long flanking introns devoid of proximal inverted Alu elements. Moreover, we show that long introns in particular are prone to intron retention and alternative splicing with concomitant premature termination. While premature termination is not the main driver of circRNA deregulation, we provide evidence for a complex interplay between upstream (acting positively on circRNA production) and downstream features (acting negatively) that collectively govern the production of individual circRNAs in the absence of SFPQ. This not only elucidates a conserved role for SFPQ in circRNA regulation but also identifies upstream alternative splicing as an approach toward circRNA production.

Results

The DALI circRNAs are defined by long flanking introns and distal inverted Alu elements

To stratify circRNAs by their inverted Alu element dependencies, we characterized the circRNAome in two of the main ENCODE cell lines, HepG2 and K562 (Supplementary file 1). Using the joint prediction of two circRNA detection algorithms, ciri2 and find_circ, we identified 3044 and 7656 circRNAs in HepG2 and K562, respectively. While proximal inverted Alu elements (IAEs) are important for the biogenesis of a subset of circRNAs (Jeck et al., 2013; Ivanov et al., 2015), we and others have shown that long flanking introns associate with circRNA loci, particularly for the conserved and abundant circRNAs (Stagsted et al., 2019; Westholm et al., 2014), and the biogenesis of this group of circRNA species is largely unresolved. To focus our analysis on the non-Alu, long intron fraction of circRNAs, we subgrouped circRNAs based on their IAE distance and flanking intron length using median distance and length as cutoffs (Figure 1A–C). We observed that these two features show interdependent distributions, where approximately 70% of the top1000 expressed circRNAs group as either Distal-Alu-Long-Intron (DALI) circRNAs or Proximal-Alu-Short-Intron (PASI) circRNAs (Figure 1D). Apart from long flanking introns and distal IAEs, DALI circRNAs show higher overall expression compared to PASI circRNAs, longer genomic lengths, but similar distribution of mature lengths (Figure 1—figure supplement 1A–D). Moreover, almost half of a previously characterized subgroup of circRNAs, the AUG circRNAs (Stagsted et al., 2019), derive from DALI circRNAs (Figure 1—figure supplement 1B), and interestingly, when filtering circRNAs for conservation (in mouse and human), 69–72% of conserved circRNAs are DALI circRNAs (Figure 1E). This finding suggests that the IAE-dependent biogenesis pathways may not be relevant for the most conserved and abundant circRNAs and that other factors must be involved.

Figure 1. — (A) Schematics showing the flanking intron length (red) defined by the sum of annotated flanking introns and inverted *Alu* element (IAE) distance (blue) defined by the sum of distance to the most proximal IAE. (**B–C**) Density plot for the distribution of flanking intron lengths (B) and IAE Distance (C) for the top1000 expressed circRNAs in HepG2 (upper facet) and K562 (lower facet). The vertical line represents the median. (D) Contingency table showing the 4-way distribution of circRNAs with long and short flanking introns (in respect to the median) and proximal and distal IAEs (also in respect to the median, see B and C) for HepG2 (left facet) and K562 (right facet). The contingency table is color-coded by circRNA subgroup; DALI (distal *Alu*, long flanking introns, in red), PASI (proximal *Alu*, short flanking introns, in blue) and ‘Other’ (unclassified, in gray) circRNAs. The p-values are Fisher's exact test of independence. (E) As in D, but for the subset of circRNAs with conserved expression in mouse.

Figure 1—figure supplement 1. — (A) Schematics showing the flanking intron length (red) defined by the sum of annotated flanking introns and inverted *Alu* element (IAE) distance (blue) defined by the sum of distance to the most proximal IAE. (**B–C**) Density plot for the distribution of flanking intron lengths (B) and IAE Distance (C) for the top1000 expressed circRNAs in HepG2 (upper facet) and K562 (lower facet). The vertical line represents the median. (D) Contingency table showing the 4-way distribution of circRNAs with long and short flanking introns (in respect to the median) and proximal and distal IAEs (also in respect to the median, see B and C) for HepG2 (left facet) and K562 (right facet). The contingency table is color-coded by circRNA subgroup; DALI (distal *Alu*, long flanking introns, in red), PASI (proximal *Alu*, short flanking introns, in blue) and ‘Other’ (unclassified, in gray) circRNAs. The p-values are Fisher's exact test of independence. (E) As in D, but for the subset of circRNAs with conserved expression in mouse.

SFPQ and NONO are specifically enriched in the introns flanking DALI circRNAs

In order to identify RBPs that could drive circRNA formation, we used the elaborate ENCODE eCLIP data (ENCODE Project Consortium, 2012; Supplementary file 2). We scrutinized the immediate flanking regions of the 1000 most highly expressed circRNAs in HepG2 and K562 with the assumption that factors directly involved in backsplicing are likely to bind in the vicinity of the back-splicing sites. We extracted an eCLIP enrichment score using Wilcoxon rank-sum tests between the number of eCLIP reads aligned to circRNA flanking regions (upstream and downstream) compared to flanking regions of host exons, that is other exons from the circRNAs expressing genes. In HepG2, we found SFPQ to be the protein most highly enriched in the circRNA flanking regions, while NONO – a known interaction partner for SFPQ (Dong et al., 1993) – shows enrichment in K562 cells (Figure 2A–B, to our knowledge eCLIP datasets on SPFQ in K562 and NONO in HepG2 are not available). Comparing DALI and PASI circRNAs shows that SFPQ is DALI circRNA specific, both upstream and downstream of the circularizing exons (Figure 2C, p≤1.2e-16), whereas NONO associates with circRNA loci more generally and with the upstream regions of DALI circRNAs specifically (Figure 2D). SFPQ, like circRNAs, is known to associate with long introns (Iida et al., 2020; Takeuchi et al., 2018). To exclude that the enrichment seen is a mere bias from the flanking intron length, we extracted a subset of annotated splice acceptor (SA) and splice donor (SD) pairs sampled to match the expression level (linear spliced reads) and flanking intron lengths of DALI circRNAs (denoted ‘DALI-like exons’) (Figure 2—figure supplement 1A–H). This analysis shows that both SFPQ and NONO were significantly more enriched around circRNA exons compared to sampled DALI-like exons (Figure 2—figure supplement 1E–H).

Figure 2—figure supplement 1. — (**A–B**) Barplot showing enrichment/depletion of eCLIP signal (see Supplementary file 2) in the vicinity of circRNAs (+/- 2000 nt) compared to host exons (+/- 2000 nt) as determined by Wilcoxon rank-sum tests for HepG2 (A) and K562 (B) eCLIP samples. (**C–D**) Cumulative plots of SFPQ (C) and NONO (D) eCLIP read distribution upstream and downstream of circRNA subgroups and host exons as denoted. (E) Schematic showing localization of primers (+/- 2000 nt) for targeting either upstream (up) or downstream (down) intronic regions of splice sites in respect to circRNA exons or host exon. (F) Western blotting of immunoprecipitated (IP), endogenous SFPQ or NONO from nuclear fractions of HepG2 cells with Histone H3 as a loading control. (**G–H**) RT-qPCR of intronic regions flanking a downstream host gene exon (left facet) or flanking the circRNA producing exon(s) (right facet) of CDYL (G) and ZKSCAN1 (H) upon RNA IP of endogenous SFPQ or NONO from nuclear fractions of HepG2 cells. The relative expression of immunoprecipitate (IP)/input is plotted. Data for three biological replicates are shown.

To validate the binding of SFPQ and NONO on nascent circRNA transcripts, we conducted RNA immunoprecipitation (RIP) qPCR in HepG2 (Figure 2E–H and Figure 2—figure supplement 2A–B) and HEK293T cells (Figure 2—figure supplement 2C–H) and quantified the expression of a panel of representative DALI and PASI circRNAs. Here, the flanking regions of DALI circRNAs, circCDYL and circARHGAP5 (circEYA1 in HEK293T), were significantly enriched for SFPQ binding compared to downstream intronic regions (Figure 2G and Figure 2—figure supplement 2A,E and G). However, we found no enrichment for PASI circRNAs, circZKSCAN1 (Figure 2H and Figure 2—figure supplement 2F) and circNEIL3 (Figure 2—figure supplement 2B and H). Thus, we conclude that SFPQ and NONO associate with the flanking introns of DALI circRNAs, and this may be indicative of a functional role in circRNA biogenesis.

SFPQ depletion represses DALI circRNAs production

To study the impact of SFPQ and NONO on circRNA production, we depleted SFPQ and NONO in HepG2 and HEK293T cells using two different siRNAs for each target (Figure 3—figure supplement 1A, Supplementary files 3 and 4). Western blot and RT-qPCR (Figure 3A, Figure 3—figure supplement 1B–E) showed that expression of both proteins was efficiently reduced upon siRNA treatment, although, unexpectedly, the expression levels of SFPQ mRNA appeared unaffected by SFPQ knockdown and greatly elevated upon NONO depletion (Figure 3—figure supplement 1C and E). This, we speculate, is the result of compensatory effects or autoregulatory mechanisms. We performed total RNA sequencing of the knockdown samples, and conducted gene expression analysis of circRNA and mRNAs. Principal Component Analysis (PCA) of HEK293T and HepG2 samples shows clear grouping of treatments (SFPQ, NONO, and CTRL), both on mRNA and circRNA levels (Figure 3—figure supplement 1F–I), suggesting that most of the variance between samples are explained by the knockdown. Although for HepG2, two samples (siSFPQ1_rep1 and siNONO2_rep2) display outlier signatures and were thus removed in downstream analyses. Overall, the composition and expression of DALI and PASI circRNAs in the HepG2 and HEK293T-derived samples look very similar to the ENCODE-based analysis (Figure 3—figure supplement 1J–M).

Figure 3. — (A) Western blotting of proteins from HEK293T (upper panel) and HepG2 (lower panel) cells transfected with either CTRL siRNAs, siRNAs targeting NONO mRNA, or siRNAs targeting SFPQ mRNA using antibodies against SFPQ, NONO, and β-tubulin (loading control) as denoted. (**B–C**) Volcano plot showing deregulated circRNAs upon NONO (left facet) and SFPQ (right facet) depletion in HEK293T cells (B) or HepG2 cells (C) color-coded by circRNA subgroup; DALI circRNAs (red), PASI circRNAs (blue) and ‘other’ circRNAs (gray). (**D–E**) Boxplot showing overall changes in expression (log2Foldchange) of the three circRNA subgroups upon NONO and SFPQ depletion in HEK293T (D) and HepG2 (E) cells. p-Values are calculated using two-sided Wilcoxon rank-sum tests. (F) Genome screen dump of the circCDYL expressing locus with BSJ-spanning reads visualized as junction-track in the IGV browser (G) RT-qPCR quantification of circCDYL and linear CDYL expression upon SFPQ and NONO-depletion in HepG2 cells relative to *GAPDH* mRNA using two different siRNA designs for each target. Data for four biological replicates are shown. p-Values are calculated using Student’s two-tailed t-test. (**H–I**) as in F and G, but for the PASI circRNA, circZKSCAN1. (J) Boxplot showing eCLIP enrichment for SFPQ either immediately upstream or downstream (within 2000 nucleotides from the circRNA splice sites) of expressed circRNAs stratified either by circRNA subgroup or by deregulation upon SFPQ depletion in HepG2 cells. p-Values are calculated using two-sided Wilcoxon rank-sum tests.

Figure 3—figure supplement 1. — (A) Western blotting of proteins from HEK293T (upper panel) and HepG2 (lower panel) cells transfected with either CTRL siRNAs, siRNAs targeting NONO mRNA, or siRNAs targeting SFPQ mRNA using antibodies against SFPQ, NONO, and β-tubulin (loading control) as denoted. (**B–C**) Volcano plot showing deregulated circRNAs upon NONO (left facet) and SFPQ (right facet) depletion in HEK293T cells (B) or HepG2 cells (C) color-coded by circRNA subgroup; DALI circRNAs (red), PASI circRNAs (blue) and ‘other’ circRNAs (gray). (**D–E**) Boxplot showing overall changes in expression (log2Foldchange) of the three circRNA subgroups upon NONO and SFPQ depletion in HEK293T (D) and HepG2 (E) cells. p-Values are calculated using two-sided Wilcoxon rank-sum tests. (F) Genome screen dump of the circCDYL expressing locus with BSJ-spanning reads visualized as junction-track in the IGV browser (G) RT-qPCR quantification of circCDYL and linear CDYL expression upon SFPQ and NONO-depletion in HepG2 cells relative to *GAPDH* mRNA using two different siRNA designs for each target. Data for four biological replicates are shown. p-Values are calculated using Student’s two-tailed t-test. (**H–I**) as in F and G, but for the PASI circRNA, circZKSCAN1. (J) Boxplot showing eCLIP enrichment for SFPQ either immediately upstream or downstream (within 2000 nucleotides from the circRNA splice sites) of expressed circRNAs stratified either by circRNA subgroup or by deregulation upon SFPQ depletion in HepG2 cells. p-Values are calculated using two-sided Wilcoxon rank-sum tests.

The differential circRNA expression analysis showed that DALI circRNAs are generally reduced upon SFPQ depletion, whereas PASI circRNAs are practically unaffected in both HEK293T (Figure 3B and D) and HepG2 (Figure 3C and E) cells. For NONO, we observed almost no impact on circRNA production in both cell lines (Figure 3B–E). This could either indicate that NONO is less involved in circRNA biogenesis, or that the effect is in part masked by the concomitant upregulation of SFPQ observed upon NONO depletion. Consistently, RT-qPCR analyses of abundant DALI circRNAs, circCDYL (Figure 3F-G, ,Figure 3—figure supplement 2C), circARHGAP5 (Figure 3—figure supplement 2A) and circEYA1 (Figure 3—figure supplement 2E), and PASI circRNAs, circZKSCAN1 (Figure 3H-I , Figure 3—figure supplement 2D) and circNEIL3 (Figure 3—figure supplement 2B and F) confirmed repressed expression of DALI circRNAs and unchanged PASI circRNAs expression relative to host gene levels. Finally, to support a direct role for SFPQ in circRNA formation, we overlaid the results from SFPQ-depleted HepG2 cells with the SFPQ eCLIP data and observed a significant association between SFPQ binding in the flanking regions of DALI circRNAs, as expected, but also a clear association with deregulated circRNAs compared to unchanged circRNAs (Figure 3J). Here, SFPQ appears to associate upstream and downstream of repressed circRNAs, whereas upregulated circRNAs only show significant enrichment in the upstream region. In addition, we examined previously published total RNAseq from SFPQ conditional knock-out (KO) mouse brain (Takeuchi et al., 2018; Supplementary file 5). Here, as in human cell lines, DALI and PASI circRNA are prevalent subclasses (Figure 3—figure supplement 3A) with DALI circRNAs showing higher abundancy compared to PASI circRNAs (Figure 3—figure supplement 3B). SFPQ depletion in mouse brain affects global circRNA expression (Figure 3—figure supplement 3C); however, in contrast to HEK293T and HepG2 cells, we found a more equal distribution of up- and downregulated circRNAs upon SFPQ removal (Figure 3—figure supplement 3D–E), and we detect a clear tendency for DALI circRNAs to be more prone to SFPQ-mediated regulation (25% vs 5% showing significant deregulation, Figure 3—figure supplement 3F, p=8.2e-80, Fisher’s exact test). Consistent with HepG2, eCLIP analysis from mouse brain (Takeuchi et al., 2018) shows a clear association with DALI circRNAs, and a similar tendency toward upstream-only enrichment for circRNAs with increased expression upon SFPQ knockout (Figure 3—figure supplement 3G). Collectively, these findings suggest that SFPQ (and to a lesser degree NONO) regulates DALI circRNA biogenesis in mice and humans.

SFPQ depletion affects alternative splicing and intron retention in long genes

Next, to understand the impact of SFPQ and NONO on transcription and splicing in general, we used the RNAseq data to investigate SFPQ/NONO-sensitive mRNAs. Here, we found that SFPQ-depletion triggers a general repression of long genes (stratified by median gene length, Figure 4A). The read distribution of highly repressed genes showed a peculiar expression profile with unaffected read densities at the genic 5’ends but with dramatic reduction at the 3’end in HepG2 cells (Figure 4—figure supplement 1A–D) indicating that the transcription machinery drops off mid gene. This prompted us to survey genes globally for a ‘drop-off’ phenotype. Thus, we subgrouped genes into their expression profile by slicing each gene into 20 equally sized bins and conducting differential gene expression on all bins. To subgroup genes with of similar profiles in an unsupervised manner, we clustered the log2foldchanges across genes into five categories, denoted kc1-5, using k-means clustering (Figure 4B). Here, kc5 but also kc3 and 4 showed ‘drop-off’ effects but to different degrees, and interestingly, the effect correlates with gene length (Figure 4B–C). We obtain almost identical results from SFPQ-depleted HEK293T cells (Figure 4—figure supplement 2A–C) and mouse brain (Figure 4—figure supplement 2F–H).

Figure 4—figure supplement 1. — (A) Volcano plot depicting differential expression of annotated genes upon NONO or SFPQ KD compared to CTRL in HepG2 cells, stratified by median gene length into ‘long’ and ‘short’ genes as denoted. (B) Boxplot showing binned expression of clustered genes. Each gene is sliced into 20 equally sized bins, and the differential expression of each bin is determined and subgrouped into five k-means clusters (kc) (see Materials and methods). (C) Boxplot showing gene lengths distribution (0.25, 0.5 and 0.75 quantiles) stratified by clusters obtained in B. (D) Schematic representation of alternative splicing, where canonical (gray) denoted the most abundant splicing from the splice donor in question. Inclusion (green) and skipping (red) denotes an alternative splicing event shorter or longer than canonical, respectively. (E) Scatter plot showing alternative splicing in NONO and SFPQ depleted samples as a function of canonical intron length and color-coded by type of splicing either inclusion or skipping, see schematics in D. (F) Barplot with the number of unique alternative splicing events showing significant deregulation upon NONO and SFPQ depletion stratified by inclusion (green) and skipping (red), and whether the alternative SA site is annotated (transparent) or not (opaque). (G) Scatter plot showing effects on intron retention (IR) upon SFPQ and NONO depletion as a function of intron length, color-coded by significance (adjusted p-value<0.05) as denoted. (H) Scatterplot showing for each detectable intron the correlation between changes in exon-inclusion/skipping (red/green) and intron retention upon SFPQ depletion. (I) Boxplot showing the IP/Input enrichment of SFPQ eCLIP reads in introns harboring an exon inclusion or an intron retention event color-coded by whether the event is up or down (red or blue, respectively) or not significant (n/s, gray). (J) Schematic showing coordinates and full genic locus of *DENND1A* (top panel) and exon 8 and 9 with alternative, unannotated exon in-between (green, middle panel). Merged intron-spanning reads (lower panel) from CTRL, NONO-KD, and SFPQ-KD samples (HepG2) are shown and color-coded by splicing type; canonical (gray), inclusion (green), and skipping (red), see D. (**K–M**) RT-qPCR analysis of alternative splicing event (K), upstream expression (L) and downstream expression (M) relative to *GAPDH* mRNA using two different siRNA designs for each target. Data for four biological replicates are shown. p-Values are calculated using student’s two-tailed t-test.

Upon inspection of the downregulated genes in our samples, we found an upregulation of alternative splicing in the SFPQ KD samples (Figure 4—figure supplement 1). We classified all alternative splicing events as either inclusion or skipping relative to their respective canonical isoform (Figure 4D) and performed differential expression analysis using DESeq2. This showed an extensive change (mostly upregulation) of alternative splicing events correlating with intron length in both HepG2, HEK293T and mouse brain (Figure 4E, Figure 4—figure supplement 2D and I). Of the 2106 significantly deregulated inclusion events in HepG2, more than 96% are upregulated and of these, 76% are not annotated by gencode (Figure 4F, in HEK293T: 95% upregulated, 78% unannotated, in mouse: 90% upregulated, 88% unannotated: data not shown), and consequently, we suggest that these events are mostly cryptic or aberrant splicing. Furthermore, analyzing the levels of intron retention by quantifying unspliced intronic reads shows a very similar intron-length-dependent pattern with significant retention of long intron upon SFPQ depletion (Figure 4G, Figure 4—figure supplement 2E and J). Consistently, we find a clear correlation between exon inclusion and intron retention (Figure 4H), and a clear enrichment of SFPQ eCLIP signal in regions subjected to alternative splicing and intron retention (Figure 4I). As an example, for DENND1A, we observe a previously unannotated splicing event joining exon eight to an alternative splice acceptor dinucleotide (AG) residing in intron eight of this gene (Figure 4J), which is only detectable upon SFPQ knockdown (Figure 4K). In DENND1A, this cryptic event marks the transition from unaffected to repressed state, as quantification of the upstream region shows modest to no effect between control and knockdown, whereas the downstream region is highly suppressed (Figure 4K–M). To strengthen the direct effect of SFPQ on cryptic SA inclusion, we co-introduced an siRNA-resistant SFPQ expression vector (Figure 4—figure supplement 3A–B). This showed an almost complete rescue of the DENND1A cryptic splicing otherwise observed upon SFPQ depletion (Figure 4—figure supplement 3C–E).

Collectively, this suggests that intron retention and alternative splicing are conserved effects of SFPQ depletion, and that SFPQ plays a vital role in splicing integrity for long introns in particular.

SFPQ depletion results in premature termination events

In order for alternative splicing to result in premature termination of transcription, the alternative/cryptic-included exons need to harbor a polyA-signal that can serve as a functional terminator of transcription. To investigate the magnitude of polyA-signal appearance in SFPQ knockdown samples, we subjected SFPQ and NONO-depleted HEK293T cells to 3’end quantSeq (Figure 5—figure supplement 1A, Supplementary file 6). Putative polyA-signals were retrieved using the MACS2 callpeak algorithm, and to further increase the signal-to-noise ratio, we characterized each peak by the presence of a bonafide polyA-signal (PAS: AAUAAA or AUUAAA). Furthermore, for each quantseq peak, we also extracted the highest prevalence of A’s in all possible 15-nucleotide windows to reduce non-polyA-tail artefacts in the samples. The fraction of PAS-containing peaks dropped markedly when regions with 14 or 15 nucleotides A-stretches were found (Figure 5—figure supplement 1B), suggesting that these A-rich peaks are likely polyA-tail-independent artefacts and were thus removed from the analysis. The remaining peaks were classified as PAS sites, and for all PASs, the genic origin was annotated, and the differential usage was determined by DESeq2. This showed a clear enrichment on intronic PAS and a repression of exonic PAS usage upon SFPQ knockdown (Figure 5A). As before, NONO-depletion only showed a modest effect.

Figure 5. — (A) Volcano plot showing deregulated PAS usage as measured by quantseq upon NONO and SFPQ depletion in HEK293T cells. PAS signals are color-coded by their genic origin; intronic (dark blue), exonic (light blue), or ambiguous (gray). (B) Plot showing the cumulative fraction of PASs as a function of relative genic position stratified by genic origin (ambiguous, exonic or intronic, vertical facets) and color-coded by whether the PAS is significantly up (red) or downregulated (blue) upon SFPQ knockdown. (C) Schematic representation of the DENND1A exon 8–9 locus with alternative exon (green) and putative PAS element (purple). Below, merged quantseq coverage from each experiment. (D) RT-qPCR on input and oligo-dT purified RNA from control and SFPQ-depleted HEK293T cells using amplicons specific for *GAPDH* mRNA (positive control), circZKSCAN1 (negative control), and the alternative SFPQ-activated exon. Values reflect ratios between oligo-dT purified and input quantities. Data for two biological replicates are shown. (E) Venn diagrams showing the number of unique introns with co-occurring upregulation of PAS and upregulated alternative splicing. The number of expressed introns without any evidence of enriched PASs or alternative splicing is denoted below the diagram. P-values are calculated by Fisher’s exact test. (**F–G**) Schematic showing the outline of the analysis (upper panel): For each circRNA, the locus spanning from the promoter to the circRNA splice donor was interrogated for the presence of quantseq PASs (F) or exon inclusion (G). Barplot (lower panel) showing the fraction of upregulated and downregulated circRNAs upon SFPQ depletion in HEK293T cells with evidence of a concomitant upregulated upstream PAS (F) or an upstream exon inclusion event (G). Numbers indicate the total number of circRNAs in each group. p-Values are calculated by Fisher’s exact test.

Figure 5—figure supplement 1. — (A) Volcano plot showing deregulated PAS usage as measured by quantseq upon NONO and SFPQ depletion in HEK293T cells. PAS signals are color-coded by their genic origin; intronic (dark blue), exonic (light blue), or ambiguous (gray). (B) Plot showing the cumulative fraction of PASs as a function of relative genic position stratified by genic origin (ambiguous, exonic or intronic, vertical facets) and color-coded by whether the PAS is significantly up (red) or downregulated (blue) upon SFPQ knockdown. (C) Schematic representation of the DENND1A exon 8–9 locus with alternative exon (green) and putative PAS element (purple). Below, merged quantseq coverage from each experiment. (D) RT-qPCR on input and oligo-dT purified RNA from control and SFPQ-depleted HEK293T cells using amplicons specific for *GAPDH* mRNA (positive control), circZKSCAN1 (negative control), and the alternative SFPQ-activated exon. Values reflect ratios between oligo-dT purified and input quantities. Data for two biological replicates are shown. (E) Venn diagrams showing the number of unique introns with co-occurring upregulation of PAS and upregulated alternative splicing. The number of expressed introns without any evidence of enriched PASs or alternative splicing is denoted below the diagram. P-values are calculated by Fisher’s exact test. (**F–G**) Schematic showing the outline of the analysis (upper panel): For each circRNA, the locus spanning from the promoter to the circRNA splice donor was interrogated for the presence of quantseq PASs (F) or exon inclusion (G). Barplot (lower panel) showing the fraction of upregulated and downregulated circRNAs upon SFPQ depletion in HEK293T cells with evidence of a concomitant upregulated upstream PAS (F) or an upstream exon inclusion event (G). Numbers indicate the total number of circRNAs in each group. p-Values are calculated by Fisher’s exact test.

As an upstream termination impacts downstream elements, we determined the relative genic position of up- and downregulated PASs. This showed a clear and general 5’region tendency of upregulated vs downregulated intronic PASs (Figure 5B), suggesting that activation of upstream PASs may subsequently repress the usage of downstream PASs. In addition, activation of upstream PASs were particularly pronounced in kc4 and 5 (Figure 5—figure supplement 1C) indicating that the ‘drop-off’-phenotype may be a consequence of intronic PAS activation and premature termination. To investigate how alternative splicing relates to premature termination in a global manner, we assessed for the co-presence of alternative exon inclusion and significantly enriched PASs across all expressed introns (Figure 5E). Overall, this showed a significant overlap (Figure 5E) with kc4 exhibiting the highest degree of overlap with 72 distinct introns harboring both events (Figure 5—figure supplement 1D). For DENND1A, where cryptic splicing marks the transition from unaffected to repressed state, we also observe a clear PAS with a consensus polyA signal (Figure 5C). This was validated using polyA enrichment, where the alternative transcript is oligo-dT purified as effectively as GAPDH mRNA only upon SFPQ knockdown (Figure 5D).

Collectively, this suggests that a notable fraction of genes exhibit alternative splicing and premature termination upon SFPQ knockdown with increased probability for longer introns, underscoring, once again, the importance of SFPQ in gene expression. Similarily, U1 snRNP has been shown to be important for repression of upstream, cryptic PAS usage in introns, known as U1 telescripting (Gunderson et al., 1998; Kaida et al., 2010; Oh et al., 2017). This raises the possibility that SFPQ could act in concert with U1 to protect long genes from premature transcriptional termination and polyadenylation. However, upon SFPQ knockdown U1 levels showed little or no response in HepG2 (Figure 5—figure supplement 2A–B) and HEK293T cells (Figure 5—figure supplement 2C–D), indicating that the SFPQ-dependent premature termination is independent from U1 telescripting.

circRNA deregulation is not explained by premature termination

If SFPQ depletion results in wide-spread increase in premature termination, the observed deregulation of circRNAs in our dataset could simply be explained by incomplete transcription and not as a biogenesis effect per se. This notion is consistent with the fact that circRNAs in general and DALI circRNAs in particular associate with long flanking introns prone to alternative splicing and premature termination. However, not all circRNAs were depleted upon SFPQ knockdown and particularly in mouse brain, the DALI circRNAs were affected in both directions (i.e. up- and downregulated). To test whether the deregulation of circRNAs is driven by premature termination, we stratified circRNAs by their host gene clusters. This showed that while most circRNAs derive from kc1, 2, and 4, roughly the same expression profile is observed across all clusters (Figure 5—figure supplement 3A–F). In addition, comparing backsplicing to linear splicing from the circRNA producing loci, no clear correlation was observed, suggesting that the circRNA deregulation is not a mere consequence of transcription levels (Figure 5—figure supplement 3G–I). Finally, counting the prevalence of upstream (from the SD) significant intronic quantseq PASs (Figure 5F) or alternative splicing events (Figure 5G) there is no significant difference between up- and downregulated circRNAs, and for alternative splicing no difference between DALI and PASI circRNAs, whereas premature termination is more prominent upstream of DALI circRNAs. Collectively, we argue that premature termination is not the main driver of circRNA deregulation.

Extracting features important for circRNA biogenesis

But what is then the underlying explanation for the deregulated expression of DALI circRNAs upon SFPQ depletion? As no single feature captures the circRNA deregulation accurately, we turned to multivariate regression analysis. Here, we collected a number of genic features (up- and downstream intron lengths, IAE distance, annotated distance to promoter and termination, and genomic length of circRNA), and differential expression data upon SFPQ depletion (linear up- and downstream splicing, flanking alternative splicing, upstream alternative splicing, up- and downstream intron retention) (Figure 6A). Pairwise correlation of all features shows modest redundancy but for certain combinations, such as 5’ linear splice (5’S) and 3’ linear splicing (3’S), we find a high level of positive interdependence (Figure 6—figure supplements 1 and 2), whereas intron retention generally correlates negatively with linear splicing (5’IR vs 5’S and 3’IR vs 3’S). In fact, linear splicing correlates negatively with all other features included in both HepG2 cells and mouse brain (Figure 6—figure supplements 1 and 2).

Figure 6. — (A) Schematic representation of features used in analysis. (B) Heatmap showing the feature coefficients from modeling circRNA deregulation (log2FoldChange) upon SFPQ depletion in HepG2 cells. The numbers within the heatmap are the associated p-values. (C) Boxplot showing the centered and scaled feature-values for significant up (red), significant down (blue), and unchanged (gray) circRNAs in HepG2. (**D–E**) as in B and C using mouse brain data. (F) Schematic depicting the SFPQ-mediated regulation of circRNA expression. Upon SFPQ knockdown, usage of cryptic splice acceptor sites (cSA) is induced, particularly within long introns. For upstream cSA inclusion (left scenario), the adjacent circRNA is upregulated possibly due to reduced competition with backsplicing, whereas for downstream cSA inclusion (right scenario), the circRNA is repressed due to increased competition with backsplicing,.

Figure 6—figure supplement 1. — (A) Schematic representation of features used in analysis. (B) Heatmap showing the feature coefficients from modeling circRNA deregulation (log2FoldChange) upon SFPQ depletion in HepG2 cells. The numbers within the heatmap are the associated p-values. (C) Boxplot showing the centered and scaled feature-values for significant up (red), significant down (blue), and unchanged (gray) circRNAs in HepG2. (**D–E**) as in B and C using mouse brain data. (F) Schematic depicting the SFPQ-mediated regulation of circRNA expression. Upon SFPQ knockdown, usage of cryptic splice acceptor sites (cSA) is induced, particularly within long introns. For upstream cSA inclusion (left scenario), the adjacent circRNA is upregulated possibly due to reduced competition with backsplicing, whereas for downstream cSA inclusion (right scenario), the circRNA is repressed due to increased competition with backsplicing,.

Splitting the quantified circRNAs into train and test sets (80:20 ratio), we trained a generalized linear model (GLM) against the observed circRNA log2foldchange. As all features were standardized, the resulting coefficients serve as a proxy for feature importance. Here, ranking features by coefficient, it is evident for both HepG2 and mouse brain that 5’ features generally correlate positively with circRNA production, whereas 3’ features (and IAE distance) correlate negatively (Figure 6B and D). As seen in both HepG2 and mouse brain, certain features, such as 5’IR (upstream intron retention), 5’ intron (upstream intron length), and 5’CSA (upstream cryptic SA usage), are highly distinctive for upregulated circRNA (Figure 6C and E), that is upstream aberrant splicing and intron retention stimulate circRNA production. Interestingly, correlating features show opposing effects on circRNA abundance. Here, according to the model, in HepG2, 5’S and 3’S impose positive and negative impact on circRNA production, respectively (Figure 6B), whereas features showing anti-correlation (5’S and 5’IR) are both ascribed a positive coefficient in both HepG2 and mouse (Figure 6B and D). This was also observed in HEK293T, although less convincingly partly due to low sequencing depth in these samples (Figure 6—figure supplement 3A–D). Also, while the performance of the model on the test-set is modest but significant Figure 6—figure supplement 4A and B, Pearson correlations: 0.41 (HepG2, p=1.8e-15) and 0.43 (mouse brain, p=1.8e-49), we observe convergence between the HepG2 and mouse brain-derived coefficients suggesting that the obtained features are conserved aspects of SFPQ-mediated circRNA regulation. Here, the most notable difference between HepG2 and mouse brain is the estimated intercept term (Figure 6—figure supplement 4C), which we interpret as the difference in cellular context, suggesting that the overall impact of SFPQ on circRNA expression may depend on other unidentified factors. Conclusively, we propose that aberrant SA usage caused by SFPQ depletion, most often observed in long introns, impacts transcription by invoking cryptic PAS usage and affects circRNA biogenesis by introducing competitive SAs. In case of cryptic SA inclusion immediately upstream a circRNA-producing locus, the cryptic SA engages with the upstream SD thus leaving the circRNA-dependent SA unattached and exposed for backsplicing. Alternatively, if the cryptic SA is located downstream, it may compete with backsplicing and thus repress circularization (Figure 6F).

Discussion

The biogenesis of circRNAs is typically ascribed to the presence of proximal inverted repeat elements positioning the two splice sites involved in backsplicing into close proximity. For example, Alu elements are frequently found close to circRNA exons (Ivanov et al., 2015; Jeck et al., 2013), but since they are primate-specific (Lander, 2011), the biogenesis of most highly expressed and conserved circRNAs can not be explained by the presence of such elements (Stagsted et al., 2019). Additionally, by RNA association and dimerization, RBPs have a similar ability to juxtapose splice sites destined for backsplicing, although this currently only seems to apply to a few specific cases (Conn et al., 2015; Errichelli et al., 2017; Ashwal-Fluss et al., 2014; Kramer et al., 2015). Here, we attempted to further disclose the impact of RBPs on circRNAs biogenesis and to reveal features important for backsplicing. First, we subgrouped circRNAs into the Alu-dependent subset characterized by proximal IAEs and short flanking introns, termed PASI circRNAs, and the Alu-independent circRNAs with distal IAEs co-occurring with long flanking introns, the DALI circRNAs. By utilizing the extensive eCLIP resource made available by the ENCODE consortium, we identified SFPQ and NONO as two potentially interesting candidates, both associating significantly with the DALI circRNA producing loci. In both HepG2 and HEK293T cell lines as well as in mouse brain conditional knockouts, we observed a general deregulation of DALI circRNAs upon depletion of SFPQ and only subtle effects upon NONO KD, possibly due to the concomitant upregulation of SFPQ in these samples. Thus, we mostly focused our analysis on SFPQ. Here, apart from dramatic changes in DALI circRNA expression, we observed across all samples (HepG2, HEK293T, and mouse brain), that the absence of SFPQ results in aberrant splicing with extensive induction of cryptic splice acceptor sites, particularly in long introns. This correlates with a similar increase in intron retention, suggesting that these two phenotypes are closely coupled. Consistently, recent studies have shown SFPQ to associate with long introns (Iida et al., 2020; Takeuchi et al., 2018) and to be vital in regulating alternative splicing of long target genes ultimately affecting neural differentiation (Luisier et al., 2018) and axon development (Thomas-Jinu et al., 2017). Interestingly, circRNAs tend to originate from longer genes, especially neuronal genes (Ragan et al., 2019; Rybak-Wolf et al., 2015; Szabo et al., 2015; You et al., 2015), and exons prone to circularize are more frequently flanked by longer introns than non-circularized exons (Jeck et al., 2013), supporting SPFQ-sensitive regulation of circRNAs, in particular DALI circRNAs.

Using quantseq analysis, we found SFPQ-depletion to activate the use of intronic polyA-sites, which in many cases overlap with aberrant cryptic splicing thereby resulting in premature termination. Consistently, we also observe decreasing signal across the gene body in the absence of SFPQ. A model in which SFPQ facilitates the recruitment of CDK9 to the CTD of RNA polymerase II to maintain transcription elongation was recently proposed to explain the ‘drop-off’ effect seen upon SFPQ depletion (Takeuchi et al., 2018; Hosokawa et al., 2019). Instead, we claim that this is partly explained by the induced cryptic splicing and subsequent premature termination, emphasizing that transcription is highly coupled to splicing. SFPQ was initially found to be associated with the polypyrimidine tract, aid in the assembly of the spliceosome and be critical for the second catalytic step in splicing (Ajuh et al., 2000; Makarov et al., 2002; Gozani et al., 1994; Patton et al., 1993), supporting its role in splicing fidelity. Moreover, U1 snRNP abundancy has been described to be important for transcriptional elongation of long, neuronal genes (Kaida et al., 2010; Berg et al., 2012; Oh et al., 2017). However, in many cases, the observed premature termination observed upon SFPQ depletion co-appears with an upstream cryptic SA. In addition, the U1 snRNA levels seem unaffected by SFPQ depletion suggesting that the SFPQ phenotype is not mediated entirely by U1 telescripting.

While DALI circRNAs in HepG2, HEK293T, and mouse brain are generally sensitive to SFPQ depletion, premature transcription termination fails to explain the observed circRNA levels. Instead, upstream intron length and aberrant splicing in the immediate upstream region have stimulating effects on circRNAs biogenesis, while, although less clear, cryptic events downstream show more detrimental impact. We speculate that SFPQ plays an imperative role in splicing fidelity, a role that becomes increasingly important with intron length. With the persistent presence of intronic sequences, cryptic and aberrant splicing become more likely, and cryptic exon inclusions in AU-rich introns will in many cases contain PAS-signal and thus cause premature termination. Consistent with this, RNA polymerase II has been shown to be stalled at AT-rich sequences (Henriques et al., 2013; Palangat et al., 2004), thus allowing a window of opportunity for splicing- and possible cleavage-directed transcription termination.

For circRNAs, the splice-sites involved in backsplicing must be protected from linear splicing as backsplicing occurs less effectively than canonical splicing. In particular, the SA has to remain unspliced until the RNA polymerase reaches the downstream SD. This can be facilitated by a fast polymerase elongation rate (Zhang et al., 2016) or by the lack of spliceosomal components (Liang et al., 2017). While SFPQ-depletion generally induces cryptic splicing imposing additional splice site competition, it also potentially eliminates upstream linear splicing and thus uncouples the SA from any upstream SD. This potentially exposes the upstream SA for backsplicing consistent with the observed positive impact of cryptic events (5’ intron retention and 5’ cryptic SA). In addition, we also observe that the mere length of the 5’ intron has an important predictive value for circRNA formation. This, we hypothesize, is due to the high correlation with aberrant splicing and intron retention; both features are limited to detection in RNAseq. Supposedly, many of the cryptic events are not detectable in a steady state sequencing approach as they are likely unstable and subjected to nuclear quality control or nonsense mediated decay (NMD), and therefore intron length may serve as a useful proxy for aberrant splicing upon SFPQ-depletion. Furthermore, SFPQ is often described in various protein complexes with some comprising FUS and the nuclear resolvase DHX9. Like SFPQ, FUS has been shown to act in various processes within the cell, such as transcription regulation and RNA metabolism (Lagier-Tourenne et al., 2012; Kwiatkowski et al., 2009; Vance et al., 2009) but also to associate with the 5’ss of long introns (Nakaya et al., 2013; Lagier-Tourenne et al., 2012), especially those flanking circularizing exons, and hereby regulate circRNA biogenesis (Errichelli et al., 2017). DHX9 has, on the other hand, been shown to unwind intronic base pairing and thereby reduce the production of Alu-dependent circRNAs (Aktaş et al., 2017; Errichelli et al., 2017). Both proteins show interesting circRNA regulation abilities which could act cooperatively with SFPQ and thus affect the fate of DALI circRNAs upon SFPQ depletion.

Conclusively, we show that SFPQ is a key regulator of DALI circRNAs production by controlling and enforcing accurate long intron splicing. This highlights the complex and intricate relationship between splicing in general and backsplicing in particular. Furthermore, SFPQ has been associated to diverse neurological diseases, such as ALS (Thomas-Jinu et al., 2017; Luisier et al., 2018) and FTLD (Ishigaki et al., 2017), and may prove to be a critical for maintaining the circRNAome in these and other neurodegenerative pathologies. And while in steady state scenarios, cryptic splicing is negligible, it is interesting to speculate whether upstream cryptic splicing is generally involved in DALI circRNA production providing a useful tool for manipulating circRNA production without impacting host gene expression.

Materials and methods

Key resources table.

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Cell line (Homo sapiens)	HEK293T	Invitrogen	N/A
Cell line (Homo sapiens)	HepG2	ATCC	RRID:CVCL_0027
Transfected construct (human)	pcDNA3- Myc-SFPQ	Genscript	This paper	Transfected construct
Transfected construct (human)	pcDNA3-Empty Vector	Invitrogen	N/A	Transfected construct
Antibody	Anti-SFPQ antibody (Polyclonal, Rabbit)	Abcam	RRID:AB_882523	WB (1:5,000)
Antibody	Anti-nmt55/ p54nrb antibody (Polyclonal, Rabbit)	Abcam	RRID:AB_1269576	WB (1:20,000) IP (2.5 µg)
Antibody	Anti-SFPQ antibody (Monoclonal, Mouse)	Sigma	RRID:AB_260995	IP (2.5 µg)
Antibody	Anti-FLAG antibody (Monoclonal, Mouse)	Sigma	RRID:AB_262044	IP (2.5 µg)
Antibody	Anti-beta Tubulin (Monoclonal, Mouse)	Millipore	RRID:AB_309885	WB (1:2,000)
Antibody	Anti-MYC (Polyclonal, Rabbit)	Sigma	RRID:AB_439694	WB (1:5,000)
Antibody	Anti-Histone H3 antibody (Monoclonal, Rabbit)	Abcam	RRID:AB_302613	WB (1:1,000)
Antibody	Anti-Rabbit antibody (Polyclonal, Goat)	Dako	RRID:AB_2617138	WB (1:5,000)
Antibody	Anti-Mouse antibody (Polyclonal, Goat)	Dako	RRID:AB_2617137	WB (1:5,000)
Sequenced-based reagent	GAPDH FW	This paper	qPCR Primer	GTCAGCCGCATCTTCTTTTG
Sequenced-based reagent	GAPDH RE	This paper	qPCR Primer	GCGCCCAATACGACCAAATC
Sequenced-based reagent	SFPQ FW	This paper	qPCR Primer	ACAGGGAAAGGCATTGTTGA
Sequenced-based reagent	SFPQ RE	This paper	qPCR Primer	TCATCTAGTTGTTCAAGTGGTTCC
Sequenced-based reagent	NONO FW	This paper	qPCR Primer	TGATGAAGAGGGACTTCCAGA
Sequenced-based reagent	NONO RE	This paper	qPCR Primer	AGCGCATGGCATATTCATACT
Sequenced-based reagent	CDYL FW	This paper	qPCR Primer	ACCCACTAGTGCCTCAGGTG
Sequenced-based reagent	CDYL lin RE	This paper	qPCR Primer	ATTTCCTTTTGCTGGCAGTC
Sequenced-based reagent	CDYL circ RE	This paper	qPCR Primer	CTCGCTGTCATAGCCTTTCC
Sequenced-based reagent	ZKSCAN1 FW	This paper	qPCR Primer	CCCAGTCCCACTTCAAACAT
Sequenced-based reagent	ZKSCAN1 lin RE	This paper	qPCR Primer	TCCGCTGTGAATAGTGCAGA
Sequenced-based reagent	ZKSCAN1 circ RE	This paper	qPCR Primer	TCATTCAGGCTCCAGGAACT
Sequenced-based reagent	NEIL3 FW	This paper	qPCR Primer	CAGCCCAATACTCATCACCA
Sequenced-based reagent	NEIL3 lin RE	This paper	qPCR Primer	GAGGCGGTTGTGTTTACTGC
Sequenced-based reagent	NEIL3 circ RE	This paper	qPCR Primer	CGGGTACTTCATTAAGTGGCTAA
Sequenced-based reagent	EYA1 FW	This paper	qPCR Primer	CCAATGCCACTTACCAGCTT
Sequenced-based reagent	EYA1 lin RE	This paper	qPCR Primer	TACTGCTCCCAATTGCTGAA
Sequenced-based reagent	EYA1 circ RE	This paper	qPCR Primer	TTTCCCATCTGAACCTCGAC
Sequenced-based reagent	ARHGAP5 lin FW	This paper	qPCR Primer	CGTGTCAGCGGGAATAAAACT
Sequenced-based reagent	ARHGAP5 lin RE	This paper	qPCR Primer	TGGAATTAAAGGATCTGGCAGA
Sequenced-based reagent	ARHGAP5 circ Fw	This paper	qPCR Primer	CCTGCAATCACTTCTGACCA
Sequenced-based reagent	ARHGAP5 circ RE	This paper	qPCR Primer	TTTGGTTCTTTGTATTTCCCTCA
Sequenced-based reagent	DENND1A Upstream FW	This paper	qPCR Primer	CCAAGTTTTGTTTCCCCTTC
Sequenced-based reagent	DENND1A Upstream RE	This paper	qPCR Primer	AGAAGCAGCTCTTCGCTCCT
Sequenced-based reagent	DENND1A Ex8 FW	This paper	qPCR Primer	ACCAGAGAACTTCCCAGCAT
Sequenced-based reagent	DENND1A cSA RE	This paper	qPCR Primer	TGGGAGAGGGGAAATATGTG
Sequenced-based reagent	DENND1A Downstream FW	This paper	qPCR Primer	AAGAGCAGCTGCCAAAGACT
Sequenced-based reagent	DENND1A Downstream RE	This paper	qPCR Primer	GCGATGTTGCTCTTTGGTCT
Sequenced-based reagent	circARHGAP5 intronic Ups Fw	This paper	qPCR Primer	ATGGAATCATTGTGCTTTTC
Sequenced-based reagent	circARHGAP5 intronic Ups Re	This paper	qPCR Primer	AATCTTAATCTGGCCCAACTGA
Sequenced-based reagent	circARHGAP5 intronic Ds Fw	This paper	qPCR Primer	GGCTAAAAGCTGATTATTTGAAAAG
Sequenced-based reagent	circARHGAP5 intronic Ds Re	This paper	qPCR Primer	TACATTTTTCCAGGACTTTGTTCAT
Sequenced-based reagent	ARHGAP5 exon five intronic Ups Fw	This paper	qPCR Primer	TGTGGCTAAAACAGGGTGTG
Sequenced-based reagent	ARHGAP5 exon five intronic Ups Re	This paper	qPCR Primer	AGGCACCTACAACCAACAGC
Sequenced-based reagent	ARHGAP5 exon five intronic Ds Fw	This paper	qPCR Primer	AATGCTGGGTCACTTTGGTC
Sequenced-based reagent	ARHGAP5 exon five intronic Ds Re	This paper	qPCR Primer	CAGCCTGGTTCCTAACAAGC
Sequenced-based reagent	circCDYL intronic Ups Fw	This paper	qPCR Primer	TTTTGTCTTTGTTTAATGCCATTTC
Sequenced-based reagent	circCDYL intronic Ups Re	This paper	qPCR Primer	GGCCAGACTGAGTATACATAAGGAA
Sequenced-based reagent	circCDYL intronic Ds Fw	This paper	qPCR Primer	TGACCTGCAAGCTCAGAATGG
Sequenced-based reagent	circCDYL intronic Ds Re	This paper	qPCR Primer	GGATTGGTGGTGGAAGTAAAT
Sequenced-based reagent	CDYL exon seven intronic Ups Fw	This paper	qPCR Primer	CTGGTTCCTTGTGCCTTGAT
Sequenced-based reagent	CDYL exon seven intronic Ups Re	This paper	qPCR Primer	TTTTCAGGGAATGGAACTG
Sequenced-based reagent	CDYL exon seven intronic Ds Fw	This paper	qPCR Primer	CCTGCTCCTCACCTTCTCAC
Sequenced-based reagent	CDYL exon seven intronic Ds Re	This paper	qPCR Primer	GTGCTGCTTGTTCCTCTCCT
Sequenced-based reagent	circNEIL3 intronic Ups Fw	This paper	qPCR Primer	TTCGAGGCTGCAGTGAACTA
Sequenced-based reagent	circNEIL3 intronic Ups Re	This paper	qPCR Primer	TTGCCTTGTTCTTGTCTGGA
Sequenced-based reagent	circNEIL3 intronic Ds Fw	This paper	qPCR Primer	CGATCCAAGGTTGGTTGAAT
Sequenced-based reagent	circNEIL3 intronic Ds Re	This paper	qPCR Primer	TTTACACCAAATGGTCCCTCA
Sequenced-based reagent	NEIL3 exon five intronic Ups Fw	This paper	qPCR Primer	TACCCAAATCAGTAGGAATGAAGC
Sequenced-based reagent	NEIL3 exon five intronic Ups Re	This paper	qPCR Primer	CATACTGAACTCACGTGTTCCAA
Sequenced-based reagent	NEIL3 exon five intronic Ds Fw	This paper	qPCR Primer	AACCTGAGGGAGCCAAAGAT
Sequenced-based reagent	NEIL3 exon five intronic Ds Re	This paper	qPCR Primer	TGAAGCAGAGACTTTTGAAGG
Sequenced-based reagent	circZKSCAN1 intronic Ups Fw	This paper	qPCR Primer	ATGGCCAAGCTGGTCTTGAACTCC
Sequenced-based reagent	circZKSCAN1 intronic Ups Re	This paper	qPCR Primer	CAGGAACAGCTGTATGAAATGG
Sequenced-based reagent	circZKSCAN1 intronic Ds Fw	This paper	qPCR Primer	TGGAACACTTAACCATGACTGG
Sequenced-based reagent	circZKSCAN1 intronic Ds Re	This paper	qPCR Primer	CCATGCCTGGCTGATTTATTAT
Sequenced-based reagent	ZKSCAN1 exon four intronic Ups Fw	This paper	qPCR Primer	GCAACAGAGGGAGATGCTG
Sequenced-based reagent	ZKSCAN1 exon four intronic Ups Re	This paper	qPCR Primer	GTGTGTGCCAGGATCTTTGA
Sequenced-based reagent	ZKSCAN1 exon four intronic Ds Fw	This paper	qPCR Primer	GAAAACTCACAGAATTGGAGAAA
Sequenced-based reagent	ZKSCAN1 exon four intronic Ds Re	This paper	qPCR Primer	GGAGCCTTCAGAGGTCACAG
Sequenced-based reagent	circEYA1 intronic Ups Fw	This paper	qPCR Primer	CGGTCCATGGTTTTAAGAGTGA
Sequenced-based reagent	circEYA1 intronic Ups Re	This paper	qPCR Primer	TGCAACACAAGAAAGGCTGA
Sequenced-based reagent	circEYA1 intronic Ds Fw	This paper	qPCR Primer	AGCCTTGTTGTGGAGTAGCT
Sequenced-based reagent	circEYA1 intronic Ds Re	This paper	qPCR Primer	TCTTGTTTCCCATGCACACA
Sequenced-based reagent	EYA1 exon 12 intronic Ups Fw	This paper	qPCR Primer	CAGATTCTATTTTTGGCATGAGG
Sequenced-based reagent	EYA1 exon 12 intronic Ups Re	This paper	qPCR Primer	GGGCAAGTAAACAATTTCCAA
Sequenced-based reagent	EYA1 exon 12 intronic Ds Fw	This paper	qPCR Primer	CTCCCATCTCCCACCCTTTC
Sequenced-based reagent	EYA1 exon 12 intronic Ds Re	This paper	qPCR Primer	TCTCATCGAGCCTGGTTTGT
Sequenced-based reagent	U1 Fw	This paper	qPCR Primer	GCTTATCCATTGCACTCCGG
Sequenced-based reagent	U1 Re	This paper	qPCR Primer	CCCCACTACCACAAATTATGCA
Sequenced-based reagent	U1	This paper	Northern blot Probe	ACAAATTATGCAGTCGAGTTTCCCACATTTGGGGAAATCGCAGGGGTCAGCACATCCGGA
Sequenced-based reagent	7SK	This paper	Northern blot Probe	TACTCGTATACCCTTGACCGAAGACCGGTCCTCCTCTATCGGGGATGGTC
Sequenced-based reagent	SFPQ siRNA #1	Merck	SASI_Hs01_00073164	Sense strand: GUACGAAUAUUCUCAGCGA[dT][dT] Antisense strand: UCGCUGAGAAUAUUCGUAC[dT][dT]
Sequenced-based reagent	SFPQ siRNA #2	Merck	SASI_Hs01_00073165	Sense strand: GGAAGAUGCCUAUCAUGAA[dT][dT] Antisense strand: UUCAUGAUAGGCAUCUUCC[dT][dT]
Sequenced-based reagent	NONO siRNA #1	Merck	SASI_Hs02_00343478	Sense strand: GAUGGAAGCUGCACGCCAU[dT][dT] Antisense strand: AUGGCGUGCAGCUUCCAUC[dT][dT]
Sequenced-based reagent	NONO siRNA #2	Merck	SASI_Hs02_00343479	Sense strand: CUCAGUAUGUGUCCAACGA[dT][dT] Antisense strand: UCGUUGGACACAUACUGAG[dT][dT]
Sequenced-based reagent	CTRL siRNA #1	Merck	CAT#SIC001	MISSION-siRNA Universal Negative Control #1
Sequenced-based reagent	CTRL siRNA #2	RiboTask		Targets eGFP Sense strand: GACGUAAACGGCCACAAGUUC Antisense strand: ACUUGUGGCCGUUUACGUCGC
Commercial assay or kit	DNase I, RNase-free (1 U/µL)	Thermo Fisher Scientific	CAT# EN0521	DNase Treatment
Commercial assay or kit	M-MLV Reverse Transcriptase kit	Thermo Fisher Scientific	CAT# 28025013	Reverse trancription
Commercial assay or kit	NEBNext Poly(A) mRNA Magnetic Isolation Module	New England BioLabs Inc	CAT# E7490S	PolyA RNA selection
Commercial assay or kit	RiboCop rRNA Depletion Kit V1.2	Lexogen	CAT# 037.24	RNA ribodepletion
Commercial assay or kit	SENSE Total RNA-Seq Library Prep Kit	Lexogen	CAT# 009.24	HEK293T RNA library preparation for total RNA sequencing
Commercial assay or kit	QuantSeq 3’ mRNA-Seq Library Prep Kit	Lexogen	CAT# 038.24	HEK293T RNA library preparation for 3’end RNA sequencing
Chemical compound, drug	Lipofectamine 2000	Thermo Fisher Scientific	CAT# 12566014	Transfection HEK293T and HepG2
Chemical compound, drug	Lipofectamine RNAiMAX	Thermo Fisher Scientific	CAT# 13778150	Transfection HepG2
Chemical compound, drug	SiLentFect Lipid Reagent	Bio-Rad	CAT# 1703361	Transfection HEK293T
Software, algorithm	GraphPad	Prism7	RRID:SCR_002798
Software, algorithm	R	R Project for statistical computing	RRID:SCR_001905
Software, algorithm	Typhoon FLA 9500	GE Healthcare	V. 1.1.0.187	Northern blot
Software, algorithm	Image Studio	Licor Odyssey Fc	Ver 5.2	Western blot
Software, algorithm	DESeq2, v1.24.0	Love et al., 2014	RRID:SCR_015687
Software, algorithm	STAR, v2.7,	Dobin et al., 2013	N/A
Software, algorithm	featureCounts, v2.0.0	Liao et al., 2014	RRID:SCR_012919
Software, algorithm	ciri2	Gao et al., 2015	N/A
Software, algorithm	find_circ v1.2	https://github.com/marvin-jens/find_circ	N/A
Software, algorithm	annotate_circ.py	This paper, github/ncrnalab/pyutils	N/A
Software, algorithm	RepeatMasker	UCSC Genome Browser	RRID:SCR_012954
Software, algorithm	liftOver tool	UCSC genome browser	RRID:SCR_018160
Software, algorithm	get_flanking spliced_reads.py	This paper, github/ncrnalab/pyutils	N/A
Software, algorithm	get_spliced_reads.py	This paper, github/ncrnalab/pyutils	N/A
Software, algorithm	get_alternative_splicing.py	This paper, github/ncrnalab/pyutils	N/A
Software, algorithm	MACS2 peakcall, v2.2.6	https://github.com/macs3-project/MACS	RRID:SCR_013291
Other	Dynabeads Protein A	Thermo Fisher Scientific	CAT# 10001D	RIP
Other	Dynabeads Protein G	Thermo Fisher Scientific	CAT# 10003D	RIP
Other	TRIzol Reagent	Thermo Fisher Scientific	CAT# 15596018	RNA Extraction
Other	Platinum SYBR Green I Master kit	Thermo Fisher Scientific	CAT# 11733046	qPCR assay
Other	10% Tris-Glycine SDS-PAGE gel	Thermo Fisher Scientific	CAT# XP00102BOX	Western blot
Other	Immobilon-P Transfer Membrane	EMD Millipore	CAT# IPVH85R	Western blot
Other	SuperSignal West Femto Maximum Sensitivity Substrate kit	Thermo Fisher Scientific	CAT# 34095	Western blot
Other	Amersham Hyperfilm ECL	GE Healthcare	N/A	Western blot
Other	Medical film	MG-SR plus, Konica Minolta	N/A	Western blot

Open in a new tab

Cell lines and transfections

HEK293T cells (Invitrogen) and HepG2 (ATCC, HB-8065) were cultured in Dulbecco’s modified Eagle’s media (DMEM) with GlutaMAX (Thermo Fisher Scientific) supplemented with 10% foetal bovine serum (FBS) and 1% penicillin/streptomycin sulphate (P/S). All cells were kept at 37°C in a humidified chamber with 5% CO₂. Knockdown of SFPQ and NONO were carried out using transient transfections of siRNAs accordingly (siRNA sequences in Supplementary file 7): For HEK293T, approximately 250,000 cells were plated in a six-well dish and 24 hr later transfected with a final concentration of 22.5 nM siRNA using siLentFect Lipid Reagent (Bio-Rad) accordingly to manufacturer’s protocol. Forty-eight hr post-transfection, the cells were replenished with new media and re-transfected using Lipofectamine 2000 (Thermo Fisher Scientific). After additional 48 hr, cells were harvested. For HepG2, approximately 400,000 cells were plated in a six-well dish and reverse transfected with a final siRNA concentration of 50 nM using Lipofectamine RNAiMax (Thermo Fisher Scientific). Forty-eight hr post transfection, the cells were trypsinized before reverse transfected for the second hit using Lipofectamine 2000 (Thermo Fisher Scientific). After 48 hr, the cells were harvested (see Figure 3—figure supplement 1A for experimental outline). For the co-expression of EV or SFPQ WT upon CTRL or SFPQ KD, approximately 400,000 HEK293T cells were plated in six-well dishes. After 24 hr, the cells were transfected with a final concentration of 22.5 nM siRNA and 2.5 µg plasmid using Lipofectamine 2000 (Thermo Fisher Scientific) according to manufacturer’s protocol, and the cells were harvested 48 hr post transfection. For all experiments, cells were harvested by washing in 1xPBS and subsequent centrifugation at 1200 rpm at 4°C for 4 min. 66.6% of the harvested cells was used for RNA isolation, which was carried out using TRIzol Reagent (Thermo Fisher Scientific) according to manufacturer’s protocol. Except for RNA used for RNAseq and RIP, 1 μg RNA was subjected to DNase I treatment (Thermo Fisher Scientific #EN0521) prior to subsequent analysis. The remaining cells (33.3%) were used for protein isolation; after centrifugation, the cell pellets were resuspended in 2xSDS loading buffer (125 mM Tris–HCl pH 6.8, 20% glycerol, 5% SDS, and 0.2 M DTT) and boiled at 95°C for 5 min.

RT-PCR and RT-qPCR

One μg of DNase-treated total RNA was reverse transcribed using the M-MLV Reverse Transcriptase kit (Thermo Fisher Scientific) according to manufacturer’s protocol with the use of random hexamers to prime the reaction. In case of RT-PCR, the reaction was conducted with 30 cycles of PCR with or without RT enzyme (Primers listed in Supplementary file 7). The products were visualized by 1% agarose gel electrophoresis and verified using Sanger sequencing. For quantitative PCR, cDNA was mixed with Platinum SYBR Green I Master kit (Invitrogen) and ran on Light cycler 480 II instrument (Roche). The reactions were carried out in technical triplicates. The obtained Ct values for each triplicate were transformed (2-Ct) and averaged (σ). All samples were normalized to GAPDH. The results were visualized using GraphPad (Prism 7) with individual biological replicates are shown and the mean is plotted as a bar. For statistical analysis, Student’s two-tailed t-test was used. p-Values below 0.05 (p<0.05) were considered significant. All statistical analyses were performed in GraphPad Prism.

Poly(A) enrichment

Poly(A) enrichment was performed using NEBNext Poly(A) mRNA Magnetic Isolation Module (New England BioLabs Inc) according to manufacturer’s protocol with five μg total RNA from CTRL-KD or SFPQ-KD in HEK293T cells used as input.

RNA sequencing

For total RNA sequencing of HEK293T, RNA from SFPQ, NONO, and CTRL KD (using two different siRNAs for each condition with biological duplicates) were rRNA depleted using RiboCop rRNA Depletion Kit V1.2 (Lexogen) according to manufacturer’s protocol. Subsequent cDNA libraries were prepared using SENSE Total RNA-Seq Library Prep Kit (Lexogen) following manufacturer’s protocol.

For 3’end sequencing, cDNA libraries from HEK293T were prepared using QuantSeq 3' mRNA-Seq Library Prep Kit (Lexogen). For both methods, RNA quality was determined using the BioAnalyzer RNA nanochip (Agilent) and library concentration was quantified with KAPA Library Quant KIT RT-qPCR (Roche). Total RNAseq was done as 100nt paired-end sequencing and performed using the Illumnia platform (HiSEQ4000, BGI, Copenhagen), while for 3’end sequencing, 75nt single-end sequencing was performed at MOMA (Aarhus University Hospital) on a NextSeq500. For total RNA sequencing of HepG2 cells, library preparation and sequencing was performed at BGI (Copenhagen) using BGIseq.

RNA-immunoprecipitation (RIP)

RIP was performed as previously described (Rinn et al., 2007) with some modifications to immunoprecipitate endogenous SFPQ and NONO. HepG2 and HEK293T cells were grown to confluence in 15 cm² dishes. Cells were harvested by trypsinization and resuspended in 2 ml PBS, 2 ml nuclear isolation buffer (1.28 M sucrose; 40 mM Tris-HCl Ph 7.5; 20 mM MgCl2; 4% Triton X-100) and 6 ml water on ice for 20 min (with frequent mixing). Nuclei were pelleted by centrifugation at 2,500G for 15 min. Nuclear pellet was resuspended in 1 ml RIP buffer (150 mM KCl, 25 mM Tris-HCl, pH 7.4, 5 mM EDTA, 0.5% Triton X-100) and 5 mM dithiothreitol (DTT) supplemented with Ribolock (Thermo Fisher Scientific) and proteinase inhibitor cocktail (Roche). Resuspended nuclei were split into two fractions of 500 μl each (for Mock and IP). Nuclear membrane and debris were pelleted by centrifugation at 13,000 RPM for 20 min. Antibody to SFPQ (P2860 Sigma), NONO (ab70335 Abcam), or FLAG epitope (Mock IP, F1804 Sigma) was added to supernatant (2.5 μg) and incubated for 4 hr at 4°C with gentle rotation. Of protein A/G beads, 20 μl were added and incubated for 1 hr at 4°C with gentle rotation. Beads isolated using magnetic, the supernatant were removed and beads were resuspended in 500 μl RIP buffer and repeated for a total of 5 RIP washes. Beads were divided into two fractions for protein (30%) and RNA (70%). Protein fraction was resuspended in 2xSDS loading buffer (125 mM Tris–HCl pH 6.8, 20% glycerol, 5% SDS, and 0.2 M DTT) and boiled at 95°C for 5 min. RNA fraction was resuspended in 1 ml TRIzol Reagent (Thermo Fisher Scientific).

Western blotting

Cells were harvested in 1xPBS and centrifuged at 1200 rpm at 4°C for 5 min. For cell lysis, the cell pellet was collected and resuspended in 2xSDS loading buffer [125 mM Tris–HCl pH 6.8, 20% glycerol, 5% SDS, and 0.2 M DTT] and briefly boiled at 95°C for 5 min before loading 1% on a 10% Tris-Glycine SDS-PAGE gel (Thermo Fisher Scientific) and run for app. 1.5 hr at 125 V. The proteins were transferred to an Immobilon-P Transfer Membrane (EMD Millipore) by wet-blotting ON at 4°C at 25 V. Subsequently, the membrane was pre-blocked for 1 hr at RT with 10% skim milk, followed by 1 hr incubation with primary antibody (Supplementary file 7) and 1 hr with secondary antibody. After each antibody incubation, the membrane was rinsed 3 × 5 min in 1xPBS+0.05% Tween20 and 1 × 5 min wash with 1xPBS. The protein bands were developed using SuperSignal West Femto Maximum Sensitivity Substrate kit (Thermo Fisher Scientific) and Amersham Hyperfilm ECL (GE Healthcare) or Medical film (MG-SR plus, Konica Minolta).

Mapping, circRNA detection, and quantification

Reads were mapped onto hg19 and mm10 for human and mouse data, respectively, with STAR (v2.7, Dobin et al., 2013) and quantified with featureCounts (v2.0.0, Liao et al., 2014) [featureCounts –p –O –i gene_id –t exon] using gencode annotations (v28lift37 for hg19 and v12 for mm10).

CircRNAs were predicted and quantified using ciri2 (Gao et al., 2015) and find_circ v1.2 (https://github.com/marvin-jens/find_circ) adhering to default settings, and only the shared predictions with ciri2 quantification were kept for analysis. circRNAs were annotated using annotate_circ.py (python scripts used are available at github/ncrnalab/pyutils). Flanking intron lengths were based on the mean total distance to the flanking exons based on gencode annotation (in case of multiple annotated flanking introns, the mean length was used), and IAE distance is the shortest possible Alu-mediated inverted repeat distance based on RepeatMasker (UCSC Genome Browser). For mouse, the IAE-distance is the shortest distance involving B1, B2 or B4 elements possible. DALI and PASI circRNAs were classified based on the median flanking intron lengths and median IAE distance in the sample. If no flanking introns were annotated, the circRNA was classified as ‘other’. Furthermore, circRNAs were classified as conserved if both splice sites coincide exactly with previously detected mouse circRNAs (Stagsted et al., 2019) converted to hg19 coordinates using the liftOver tool (UCSC genome browser). Flanking linear spliced reads from the circRNA producing loci were extracted using get_flanking spliced_reads.py.

Cryptic/alternative splicing and intron retention

First, all spliced reads were extracted from bam-files using get_spliced_reads.py requiring at least an eight nucleotide match on each exon. Then, separately for each splice-donor and –acceptor, all possible conjoining splice sites were extracted and counted using get_alternative_splicing.py. For each splice site, the most abundant splicing event across all samples was denoted as canonical, whereas all other splicing events from that particular splice site were either classified as ‘inclusion’ if shorter or ‘skipping’ if longer than the canonical.

Based on the output from alternative splicing, for each splice-site the intronic region of the shortest alternative event was quantified using featureCounts (as above) but with [–minOverlap 5] to avoid quantification of any overlapping spliced reads.

eCLIP analysis

Pre-mapped eCLIP datasets were downloaded from encodeproject.org (hg19, see Supplementary file 2). Based on the top 1000 expressed circRNAs from HepG2 and K562 (see Supplementary file 1), all reads aligning within 2000 nt upstream of the circRNA splice-acceptors or within 2000 nts downstream the circRNA splice-donors were counted using featureCounts. The same analysis was performed on all other annotated exons from the circRNAs host genes excluding the first and last host gene exons as well as exons involved in backsplicing. Moreover, genome-wide, exon-pairs were subsampled from gencode annotations to match the distributions of flanking intron lengths and linear spliced reads of DALI circRNAs; DALI-like exons. Enrichment was assessed by Wilcoxon rank-sum test between the number of reads flanking circRNAs compared to host exons. In case of SFPQ eCLIP on deregulated circRNAs, the number of eCLIP reads flanking the expressed circRNAs (within 2000 nt upstream and 2000 nt downstream) were retrieved using featureCounts (as above). For each locus, read counts were normalized to library depth (total reads) and to deduce the IP/INPUT enrichment, the mean of the replicate IP samples were divided by the input sample count. For SFPQ eCLIP from mouse brain (GSE96081), IP and INPUT samples were mapped to the mm10 genome using STAR (default setting), and then processed similar to the ENCODE eCLIP.

Quantseq analysis

Quantseq reads were mapped onto hg19 using STAR as described above. The resulting bam-files were merged and divided into plus and minus-strand alignments. Then, using MACS2 peakcall (v2.2.6, https://github.com/macs3-project/MACS) with parameters [--nomodel --shift 0 –g 2.9e9], quantseq peaks were extracted. Each peak was then subsequently quantified using featureCounts and analysed for the presence of polyA signal (A[AU]UAAA) and the presence of polyA-stretches within the locus or in the immediate flanking regions (+/- 50 nts). Based on gencode annotation, each peak was assigned as exonic, intronic, ambiguous, or intergenic.

Differential gene expression

First, in all analyses, low-expressed entries defined by mean counts across all samples < 1 and expressed in less than three samples were discarded. Then, analysis of differential gene expression was performed using DESeq2 (v1.24.0, Love et al., 2014) using formula ~ treatment, where treatment denotes the knockdown/knockout target. For mRNA and circRNA expression, the raw counts were merged and analysed in bulk. For conditional analysis, such as circ vs linear, alternative vs canonical splicing, and intron-retention vs intron-splicing, raw counts for each type was combined in one expression matrix with the associated design formula: ~ treatment * type, where type denotes circular or linear splicing (in case of circ vs linear). The log2FoldChange and p-adjust values from the interaction-term (treatment:type) was used in subsequent analyses. For binned analysis of transcripts, each locus was sliced and re-annotated as 20 equally sized bins irrespective of exon-intron structure, and this was then used in the featureCounts quantification. After differential expression analysis by DESeq2, genes were subgrouped into five k-means clusters based on the DESeq2-derived log2foldchange of all 20 bins.

An adjusted p-value below 0.05 was considered significant. All statistics were conducted in R (v3.6.3) and visualizations were done in R using ggplot2 (v3.3.0) and GraphPad.

PAGE northern blot

PAGE northern was conducted as previously described (Hansen, 2018). Briefly, three µg RNA were dissolved in 40 µL loading buffer (8 M urea, 20 mM EDTA, 1% xylen, 1% bromophenolblue) and separated on a 10% PAGE gel in 1 x TBE buffer. Then, the RNA was transferred to a Hybond N+ membrane (GE Healthcare) by electroblotting overnight (ON) in 0.5 x TBE at 4°C. The membrane was UV crosslinked on both sides using 120 mJ/cm² and then pre-hybridized in Church buffer (0.158 M NaH2PO4, 0.342 M Na2HPO4, 7% SDS, 1 mM EDTA, 0.5% BSA, pH 7.5) at 37°C for 1 hr. For visualization, the membrane was probed with 5’ radioactively labeled DNA oligonucleotides at 37°C ON and washed twice in washing buffer (2x SSC, 0.1% SDS) for 10 min at 25°C prior to exposure on a phosphoimager screen, and finally scanned using a Typhoon imager (Amersham).

Data accessibility

Sequencing data has been deposited on GEO (accession no GSE157622), and scripts for RNAseq data processing are available at github: github.com/ncrnalab/pyutils.

Acknowledgements

We would like to thank Dr. Anne Færch Nielsen and Professor Jørgen Kjems for constructive comments and critical reading of the manuscript. This work was supported by the Novo Nordisk Foundation (NNF16OC0019874 to TBH) and the circRTrain ITN network.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Thomas Birkballe Hansen, Email: tbh@mbg.au.dk.

Juan Valcárcel, Centre de Regulació Genòmica (CRG), Spain.

James L Manley, Columbia University, United States.

Funding Information

This paper was supported by the following grant:

Novo Nordisk Fonden NNF16OC0019874 to Thomas Birkballe Hansen.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing.

Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing.

Investigation, Methodology, Writing - review and editing.

Conceptualization, Data curation, Supervision, Funding acquisition, Validation, Investigation, Visualization, Writing - original draft, Writing - review and editing.

Additional files

Supplementary file 1. ENCODE RNAseq.

List of ENCODE dataset accession numbers used in RNAseq analysis.

elife-63088-supp1.xlsx^{(10.1KB, xlsx)}

Supplementary file 2. ENCODE eCLIP.

List of ENCODE dataset accession numbers used in eCLIP analysis.

elife-63088-supp2.xlsx^{(36.4KB, xlsx)}

Supplementary file 3. HEK293T RNAseq.

Mapping statistics and annotated output from featureCounts, ciri2 and find_circ using RNAseq on HEK293T cells.

elife-63088-supp3.xlsx^{(12.3MB, xlsx)}

Supplementary file 4. HepG2 RNAseq.

Mapping statistics and annotated output from featureCounts, ciri2 and find_circ using RNAseq on HepG2 cells.

elife-63088-supp4.xlsx^{(19.5MB, xlsx)}

Supplementary file 5. Mouse brain RNAseq.

Mapping statistics and annotated output from featureCounts, ciri2 and find_circ using RNAseq on mouse brain (GSE60246).

elife-63088-supp5.xlsx^{(20.5MB, xlsx)}

Supplementary file 6. HEK293T QuantSeq.

Mapping statistics and output from MACS2 analysis using quantseq data on HEK293T cells.

elife-63088-supp6.xlsx^{(23.3MB, xlsx)}

Supplementary file 7. Primers, probes, and antibodies.

elife-63088-supp7.xlsx^{(14.7KB, xlsx)}

Transparent reporting form

elife-63088-transrepform.pdf^{(183.8KB, pdf)}

Data availability

Sequencing data has been deposited on GEO (accession no GSE157622).

The following dataset was generated:

Stagsted L, O'leary E, Hansen T. 2020. The RNA-binding protein SFPQ preserves long-intron splicing and regulates circRNA biogenesis. NCBI Gene Expression Omnibus. GSE157622

The following previously published datasets were used:

Takeuchi A, Iida K, Tsubota T, Hosokawa M. 2018. The RNA-binding protein Sfpq regulates long neuronal genes in transcriptional elongation. NCBI Gene Expression Omnibus. GSE60246

Takeuchi A, Denawa M, Iida K, Hagiwara M. 2018. The RNA-binding protein Sfpq regulates long neuronal genes in transcriptional elongation [SFPQ_CLIP-seq] NCBI Gene Expression Omnibus. GSE96081

References

Ajuh P, Kuster B, Panov K, Zomerdijk JC, Mann M, Lamond AI. Functional analysis of the human CDC5L complex and identification of its components by mass spectrometry. The EMBO Journal. 2000;19:6569–6581. doi: 10.1093/emboj/19.23.6569. [DOI] [PMC free article] [PubMed] [Google Scholar]
Aktaş T, Avşar Ilık İ, Maticzka D, Bhardwaj V, Pessoa Rodrigues C, Mittler G, Manke T, Backofen R, Akhtar A. DHX9 suppresses RNA processing defects originating from the alu invasion of the human genome. Nature. 2017;544:115–119. doi: 10.1038/nature21715. [DOI] [PubMed] [Google Scholar]
Ashwal-Fluss R, Meyer M, Pamudurti NR, Ivanov A, Bartok O, Hanan M, Evantal N, Memczak S, Rajewsky N, Kadener S. circRNA biogenesis competes with pre-mRNA splicing. Molecular Cell. 2014;56:55–66. doi: 10.1016/j.molcel.2014.08.019. [DOI] [PubMed] [Google Scholar]
Barrett SP, Wang PL, Salzman J. Circular RNA biogenesis can proceed through an exon-containing lariat precursor. eLife. 2015;4:e07540. doi: 10.7554/eLife.07540. [DOI] [PMC free article] [PubMed] [Google Scholar]
Berg MG, Singh LN, Younis I, Liu Q, Pinto AM, Kaida D, Zhang Z, Cho S, Sherrill-Mix S, Wan L, Dreyfuss G. U1 snRNP determines mRNA length and regulates isoform expression. Cell. 2012;150:53–64. doi: 10.1016/j.cell.2012.05.029. [DOI] [PMC free article] [PubMed] [Google Scholar]
Buxadé M, Morrice N, Krebs DL, Proud CG. The PSF·p54 nrb Complex Is a Novel Mnk Substrate That Binds the mRNA for Tumor Necrosis Factor α. Journal of Biological Chemistry. 2008;283:57–65. doi: 10.1074/jbc.M705286200. [DOI] [PubMed] [Google Scholar]
Clemson CM, Hutchinson JN, Sara SA, Ensminger AW, Fox AH, Chess A, Lawrence JB. An architectural role for a nuclear noncoding RNA: neat1 RNA is essential for the structure of paraspeckles. Molecular Cell. 2009;33:717–726. doi: 10.1016/j.molcel.2009.01.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
Conn SJ, Pillman KA, Toubia J, Conn VM, Salmanidis M, Phillips CA, Roslan S, Schreiber AW, Gregory PA, Goodall GJ. The RNA binding protein quaking regulates formation of circRNAs. Cell. 2015;160:1125–1134. doi: 10.1016/j.cell.2015.02.014. [DOI] [PubMed] [Google Scholar]
Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635. [DOI] [PMC free article] [PubMed] [Google Scholar]
Dong B, Horowitz DS, Kobayashi R, Krainer AR. Purification and cDNA cloning of HeLa cell p54nrb, a nuclear protein with two RNA recognition motifs and extensive homology to human splicing factor PSF and Drosophila NONA/BJ6. Nucleic Acids Research. 1993;21:4085–4092. doi: 10.1093/nar/21.17.4085. [DOI] [PMC free article] [PubMed] [Google Scholar]
Dubin RA, Kazmi MA, Ostrer H. Inverted repeats are necessary for circularization of the mouse testis sry transcript. Gene. 1995;167:245–248. doi: 10.1016/0378-1119(95)00639-7. [DOI] [PubMed] [Google Scholar]
Emili A, Shales M, McCracken S, Xie W, Tucker PW, Kobayashi R, Blencowe BJ, Ingles CJ. Splicing and transcription-associated proteins PSF and p54nrb/nonO bind to the RNA polymerase II CTD. RNA. 2002;8:1102–1111. doi: 10.1017/S1355838202025037. [DOI] [PMC free article] [PubMed] [Google Scholar]
ENCODE Project Consortium An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247. [DOI] [PMC free article] [PubMed] [Google Scholar]
Errichelli L, Dini Modigliani S, Laneve P, Colantoni A, Legnini I, Capauto D, Rosa A, De Santis R, Scarfò R, Peruzzi G, Lu L, Caffarelli E, Shneider NA, Morlando M, Bozzoni I. FUS affects circular RNA expression in murine embryonic stem cell-derived motor neurons. Nature Communications. 2017;8:14741. doi: 10.1038/ncomms14741. [DOI] [PMC free article] [PubMed] [Google Scholar]
Fox AH, Nakagawa S, Hirose T, Bond CS. Paraspeckles: where long noncoding RNA meets phase separation. Trends in Biochemical Sciences. 2018;43:124–135. doi: 10.1016/j.tibs.2017.12.001. [DOI] [PubMed] [Google Scholar]
Furukawa MT, Sakamoto H, Inoue K. Interaction and colocalization of HERMES/RBPMS with NonO, PSF, and G3BP1 in neuronal cytoplasmic RNP granules in mouse retinal line cells. Genes to Cells. 2015;20:257–266. doi: 10.1111/gtc.12224. [DOI] [PubMed] [Google Scholar]
Gao Y, Wang J, Zhao F. CIRI: an efficient and unbiased algorithm for de novo circular RNA identification. Genome Biology. 2015;16 doi: 10.1186/s13059-014-0571-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
Gozani O, Patton JG, Reed R. A novel set of spliceosome-associated proteins and the essential splicing factor PSF bind stably to pre-mRNA prior to catalytic step II of the splicing reaction. The EMBO Journal. 1994;13:3356–3367. doi: 10.1002/j.1460-2075.1994.tb06638.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
Gunderson SI, Polycarpou-Schwarz M, Mattaj IW. U1 snRNP inhibits pre-mRNA polyadenylation through a direct interaction between U1 70K and poly(A) polymerase. Molecular Cell. 1998;1:255–264. doi: 10.1016/S1097-2765(00)80026-X. [DOI] [PubMed] [Google Scholar]
Hansen TB, Jensen TI, Clausen BH, Bramsen JB, Finsen B, Damgaard CK, Kjems J. Natural RNA circles function as efficient microRNA sponges. Nature. 2013;495:384–388. doi: 10.1038/nature11993. [DOI] [PubMed] [Google Scholar]
Hansen TB. Characterization of circular RNA concatemers. Methods in Molecular Biology. 2018;1724:143–157. doi: 10.1007/978-1-4939-7562-4_12. [DOI] [PubMed] [Google Scholar]
Henriques T, Gilchrist DA, Nechaev S, Bern M, Muse GW, Burkholder A, Fargo DC, Adelman K. Stable pausing by RNA polymerase II provides an opportunity to target and integrate regulatory signals. Molecular Cell. 2013;52:517–528. doi: 10.1016/j.molcel.2013.10.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
Hosokawa M, Takeuchi A, Tanihata J, Iida K, Takeda S, Hagiwara M. Loss of RNA-Binding protein sfpq causes Long-Gene transcriptopathy in skeletal muscle and severe muscle mass reduction with metabolic myopathy. iScience. 2019;13:229–242. doi: 10.1016/j.isci.2019.02.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
Iida K, Hagiwara M, Takeuchi A. Multilateral bioinformatics analyses reveal the Function-Oriented target specificities and recognition of the RNA-Binding protein SFPQ. iScience. 2020;23:101325. doi: 10.1016/j.isci.2020.101325. [DOI] [PMC free article] [PubMed] [Google Scholar]
Ishigaki S, Fujioka Y, Okada Y, Riku Y, Udagawa T, Honda D, Yokoi S, Endo K, Ikenaka K, Takagi S, Iguchi Y, Sahara N, Takashima A, Okano H, Yoshida M, Warita H, Aoki M, Watanabe H, Okado H, Katsuno M, Sobue G. Altered tau isoform ratio caused by loss of FUS and SFPQ function leads to FTLD-like phenotypes. Cell Reports. 2017;18:1118–1131. doi: 10.1016/j.celrep.2017.01.013. [DOI] [PubMed] [Google Scholar]
Ito T, Watanabe H, Yamamichi N, Kondo S, Tando T, Haraguchi T, Mizutani T, Sakurai K, Fujita S, Izumi T, Isobe T, Iba H. Brm transactivates the telomerase reverse transcriptase (TERT) gene and modulates the splicing patterns of its transcripts in concert with p54nrb. Biochemical Journal. 2008;411:201–209. doi: 10.1042/BJ20071075. [DOI] [PubMed] [Google Scholar]
Ivanov A, Memczak S, Wyler E, Torti F, Porath HT, Orejuela MR, Piechotta M, Levanon EY, Landthaler M, Dieterich C, Rajewsky N. Analysis of intron sequences reveals hallmarks of circular RNA biogenesis in animals. Cell Reports. 2015;10:170–177. doi: 10.1016/j.celrep.2014.12.019. [DOI] [PubMed] [Google Scholar]
Jeck WR, Sorrentino JA, Wang K, Slevin MK, Burd CE, Liu J, Marzluff WF, Sharpless NE. Circular RNAs are abundant, conserved, and associated with ALU repeats. RNA. 2013;19:141–157. doi: 10.1261/rna.035667.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kaida D, Berg MG, Younis I, Kasim M, Singh LN, Wan L, Dreyfuss G. U1 snRNP protects pre-mRNAs from premature cleavage and polyadenylation. Nature. 2010;468:664–668. doi: 10.1038/nature09479. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kameoka S, Duque P, Konarska MM. p54(nrb) associates with the 5' splice site within large transcription/splicing complexes. The EMBO Journal. 2004;23:1782–1791. doi: 10.1038/sj.emboj.7600187. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kaneko S, Rozenblatt-Rosen O, Meyerson M, Manley JL. The multifunctional protein p54nrb/PSF recruits the exonuclease XRN2 to facilitate pre-mRNA 3' processing and transcription termination. Genes & Development. 2007;21:1779–1789. doi: 10.1101/gad.1565207. [DOI] [PMC free article] [PubMed] [Google Scholar]
Knott GJ, Lee M, Passon DM, Fox AH, Bond CS. Caenorhabditis elegans NONO-1: insights into DBHS protein structure, architecture, and function. Protein Science. 2015;24:2033–2043. doi: 10.1002/pro.2816. [DOI] [PMC free article] [PubMed] [Google Scholar]
Knott GJ, Bond CS, Fox AH. The DBHS proteins SFPQ, NONO and PSPC1: a multipurpose molecular scaffold. Nucleic Acids Research. 2016;44:3989–4004. doi: 10.1093/nar/gkw271. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kramer MC, Liang D, Tatomer DC, Gold B, March ZM, Cherry S, Wilusz JE. Combinatorial control of Drosophila circular RNA expression by intronic repeats, hnRNPs, and SR proteins. Genes & Development. 2015;29:2168–2182. doi: 10.1101/gad.270421.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kwiatkowski TJ, Bosco DA, Leclerc AL, Tamrazian E, Vanderburg CR, Russ C, Davis A, Gilchrist J, Kasarskis EJ, Munsat T, Valdmanis P, Rouleau GA, Hosler BA, Cortelli P, de Jong PJ, Yoshinaga Y, Haines JL, Pericak-Vance MA, Yan J, Ticozzi N, Siddique T, McKenna-Yasek D, Sapp PC, Horvitz HR, Landers JE, Brown RH. Mutations in the FUS/TLS gene on chromosome 16 cause familial amyotrophic lateral sclerosis. Science. 2009;323:1205–1208. doi: 10.1126/science.1166066. [DOI] [PubMed] [Google Scholar]
Lagier-Tourenne C, Polymenidou M, Hutt KR, Vu AQ, Baughn M, Huelga SC, Clutario KM, Ling SC, Liang TY, Mazur C, Wancewicz E, Kim AS, Watt A, Freier S, Hicks GG, Donohue JP, Shiue L, Bennett CF, Ravits J, Cleveland DW, Yeo GW. Divergent roles of ALS-linked proteins FUS/TLS and TDP-43 intersect in processing long pre-mRNAs. Nature Neuroscience. 2012;15:1488–1497. doi: 10.1038/nn.3230. [DOI] [PMC free article] [PubMed] [Google Scholar]
Lander ES. Initial sequencing and analysis of the human genome. Nature. 2011;409 doi: 10.1038/35057062. [DOI] [PubMed] [Google Scholar]
Lee M, Sadowska A, Bekere I, Ho D, Gully BS, Lu Y, Iyer KS, Trewhella J, Fox AH, Bond CS. The structure of human SFPQ reveals a coiled-coil mediated polymer essential for functional aggregation in gene regulation. Nucleic Acids Research. 2015;43:3826–3840. doi: 10.1093/nar/gkv156. [DOI] [PMC free article] [PubMed] [Google Scholar]
Liang D, Tatomer DC, Luo Z, Wu H, Yang L, Chen LL, Cherry S, Wilusz JE. The output of Protein-Coding genes shifts to circular RNAs when the Pre-mRNA processing machinery is limiting. Molecular Cell. 2017;68:940–954. doi: 10.1016/j.molcel.2017.10.034. [DOI] [PMC free article] [PubMed] [Google Scholar]
Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30:923–930. doi: 10.1093/bioinformatics/btt656. [DOI] [PubMed] [Google Scholar]
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 2014;15:550. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
Luisier R, Tyzack GE, Hall CE, Mitchell JS, Devine H, Taha DM, Malik B, Meyer I, Greensmith L, Newcombe J, Ule J, Luscombe NM, Patani R. Intron retention and nuclear loss of SFPQ are molecular hallmarks of ALS. Nature Communications. 2018;9:2010. doi: 10.1038/s41467-018-04373-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
Makarov EM, Makarova OV, Urlaub H, Gentzel M, Will CL, Wilm M, Lührmann R. Small nuclear ribonucleoprotein remodeling during catalytic activation of the spliceosome. Science. 2002;298:2205–2208. doi: 10.1126/science.1077783. [DOI] [PubMed] [Google Scholar]
Memczak S, Jens M, Elefsinioti A, Torti F, Krueger J, Rybak A, Maier L, Mackowiak SD, Gregersen LH, Munschauer M, Loewer A, Ziebold U, Landthaler M, Kocks C, le Noble F, Rajewsky N. Circular RNAs are a large class of animal RNAs with regulatory potency. Nature. 2013;495:333–338. doi: 10.1038/nature11928. [DOI] [PubMed] [Google Scholar]
Nakaya T, Alexiou P, Maragkakis M, Chang A, Mourelatos Z. FUS regulates genes coding for RNA-binding proteins in neurons by binding to their highly conserved introns. RNA. 2013;19:498–509. doi: 10.1261/rna.037804.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
Oh JM, Di C, Venters CC, Guo J, Arai C, So BR, Pinto AM, Zhang Z, Wan L, Younis I, Dreyfuss G. U1 snRNP telescripting regulates a size-function-stratified human genome. Nature Structural & Molecular Biology. 2017;24:993–999. doi: 10.1038/nsmb.3473. [DOI] [PMC free article] [PubMed] [Google Scholar]
Palangat M, Hittinger CT, Landick R. Downstream DNA selectively affects a paused conformation of human RNA polymerase II. Journal of Molecular Biology. 2004;341:429–442. doi: 10.1016/j.jmb.2004.06.009. [DOI] [PubMed] [Google Scholar]
Passon DM, Lee M, Rackham O, Stanley WA, Sadowska A, Filipovska A, Fox AH, Bond CS. Structure of the heterodimer of human NONO and paraspeckle protein component 1 and analysis of its role in subnuclear body formation. PNAS. 2012;109:4846–4850. doi: 10.1073/pnas.1120792109. [DOI] [PMC free article] [PubMed] [Google Scholar]
Patton JG, Porro EB, Galceran J, Tempst P, Nadal-Ginard B. Cloning and characterization of PSF, a novel pre-mRNA splicing factor. Genes & Development. 1993;7:393–406. doi: 10.1101/gad.7.3.393. [DOI] [PubMed] [Google Scholar]
Peng R, Dye BT, Pérez I, Barnard DC, Thompson AB, Patton JG. PSF and p54nrb bind a conserved stem in U5 snRNA. RNA. 2002;8:1334–1347. doi: 10.1017/S1355838202022070. [DOI] [PMC free article] [PubMed] [Google Scholar]
Ragan C, Goodall GJ, Shirokikh NE, Preiss T. Insights into the biogenesis and potential functions of exonic circular RNA. Scientific Reports. 2019;9:2048. doi: 10.1038/s41598-018-37037-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, Brugmann SA, Goodnough LH, Helms JA, Farnham PJ, Segal E, Chang HY. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell. 2007;129:1311–1323. doi: 10.1016/j.cell.2007.05.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
Rosonina E, Ip JY, Calarco JA, Bakowski MA, Emili A, McCracken S, Tucker P, Ingles CJ, Blencowe BJ. Role for PSF in mediating transcriptional activator-dependent stimulation of pre-mRNA processing in vivo. Molecular and Cellular Biology. 2005;25:6734–6746. doi: 10.1128/MCB.25.15.6734-6746.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
Rybak-Wolf A, Stottmeister C, Glažar P, Jens M, Pino N, Giusti S, Hanan M, Behm M, Bartok O, Ashwal-Fluss R, Herzog M, Schreyer L, Papavasileiou P, Ivanov A, Öhman M, Refojo D, Kadener S, Rajewsky N. Circular RNAs in the mammalian brain are highly abundant, conserved, and dynamically expressed. Molecular Cell. 2015;58:870–885. doi: 10.1016/j.molcel.2015.03.027. [DOI] [PubMed] [Google Scholar]
Salzman J, Gawad C, Wang PL, Lacayo N, Brown PO. Circular RNAs are the predominant transcript isoform from hundreds of human genes in diverse cell types. PLOS ONE. 2012;7:e30733. doi: 10.1371/journal.pone.0030733. [DOI] [PMC free article] [PubMed] [Google Scholar]
Salzman J, Chen RE, Olsen MN, Wang PL, Brown PO. Cell-type specific features of circular RNA expression. PLOS Genetics. 2013;9:e1003777. doi: 10.1371/journal.pgen.1003777. [DOI] [PMC free article] [PubMed] [Google Scholar]
Stagsted LV, Nielsen KM, Daugaard I, Hansen TB. Noncoding AUG circRNAs constitute an abundant and conserved subclass of circles. Life Science Alliance. 2019;2:e201900398. doi: 10.26508/lsa.201900398. [DOI] [PMC free article] [PubMed] [Google Scholar]
Szabo L, Morey R, Palpant NJ, Wang PL, Afari N, Jiang C, Parast MM, Murry CE, Laurent LC, Salzman J. Statistically based splicing detection reveals neural enrichment and tissue-specific induction of circular RNA during human fetal development. Genome Biology. 2015;16:126. doi: 10.1186/s13059-015-0690-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
Takeuchi A, Iida K, Tsubota T, Hosokawa M, Denawa M, Brown JB, Ninomiya K, Ito M, Kimura H, Abe T, Kiyonari H, Ohno K, Hagiwara M. Loss of sfpq causes Long-Gene transcriptopathy in the brain. Cell Reports. 2018;23:1326–1341. doi: 10.1016/j.celrep.2018.03.141. [DOI] [PubMed] [Google Scholar]
Thomas-Jinu S, Gordon PM, Fielding T, Taylor R, Smith BN, Snowden V, Blanc E, Vance C, Topp S, Wong CH, Bielen H, Williams KL, McCann EP, Nicholson GA, Pan-Vazquez A, Fox AH, Bond CS, Talbot WS, Blair IP, Shaw CE, Houart C. Non-nuclear pool of splicing factor SFPQ regulates axonal transcripts required for normal motor development. Neuron. 2017;94:931. doi: 10.1016/j.neuron.2017.04.036. [DOI] [PMC free article] [PubMed] [Google Scholar]
Urban RJ, Bodenburg Y, Kurosky A, Wood TG, Gasic S. Polypyrimidine tract-binding protein-associated splicing factor is a negative regulator of transcriptional activity of the porcine p450scc insulin-like growth factor response element. Molecular Endocrinology. 2000;14:774–782. doi: 10.1210/mend.14.6.0485. [DOI] [PubMed] [Google Scholar]
Vance C, Rogelj B, Hortobágyi T, De Vos KJ, Nishimura AL, Sreedharan J, Hu X, Smith B, Ruddy D, Wright P, Ganesalingam J, Williams KL, Tripathi V, Al-Saraj S, Al-Chalabi A, Leigh PN, Blair IP, Nicholson G, de Belleroche J, Gallo JM, Miller CC, Shaw CE. Mutations in FUS, an RNA processing protein, cause familial amyotrophic lateral sclerosis type 6. Science. 2009;323:1208–1211. doi: 10.1126/science.1165942. [DOI] [PMC free article] [PubMed] [Google Scholar]
Venø MT, Hansen TB, Venø ST, Clausen BH, Grebing M, Finsen B, Holm IE, Kjems J. Spatio-temporal regulation of circular RNA expression during porcine embryonic brain development. Genome Biology. 2015;16:245. doi: 10.1186/s13059-015-0801-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
Westholm JO, Miura P, Olson S, Shenker S, Joseph B, Sanfilippo P, Celniker SE, Graveley BR, Lai EC. Genome-wide analysis of Drosophila circular RNAs reveals their structural and sequence properties and age-dependent neural accumulation. Cell Reports. 2014;9:1966–1980. doi: 10.1016/j.celrep.2014.10.062. [DOI] [PMC free article] [PubMed] [Google Scholar]
You X, Vlatkovic I, Babic A, Will T, Epstein I, Tushev G, Akbalik G, Wang M, Glock C, Quedenau C, Wang X, Hou J, Liu H, Sun W, Sambandan S, Chen T, Schuman EM, Chen W. Neural circular RNAs are derived from synaptic genes and regulated by development and plasticity. Nature Neuroscience. 2015;18:603–610. doi: 10.1038/nn.3975. [DOI] [PMC free article] [PubMed] [Google Scholar]
Zhang XO, Wang HB, Zhang Y, Lu X, Chen LL, Yang L. Complementary sequence-mediated exon circularization. Cell. 2014;159:134–147. doi: 10.1016/j.cell.2014.09.001. [DOI] [PubMed] [Google Scholar]
Zhang Y, Xue W, Li X, Zhang J, Chen S, Zhang JL, Yang L, Chen LL. The biogenesis of nascent circular RNAs. Cell Reports. 2016;15:611–624. doi: 10.1016/j.celrep.2016.03.058. [DOI] [PubMed] [Google Scholar]
Zhang Z, Carmichael GG. The fate of dsRNA in the nucleus: a p54(nrb)-containing complex mediates the nuclear retention of promiscuously A-to-I edited RNAs. Cell. 2001;106:465–475. doi: 10.1016/s0092-8674(01)00466-4. [DOI] [PubMed] [Google Scholar]

eLife. doi: 10.7554/eLife.63088.sa1

Decision letter

Editor: Juan Valcárcel¹

Reviewed by: Juan Valcárcel²

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

We believe that your paper provides relevant mechanistic insights about splicing of long introns, the generation of circular RNAs and the interplay between these two processes.

Decision letter after peer review:

Thank you for submitting your article "The RNA-binding protein SFPQ preserves long-intron splicing and regulates circRNA biogenesis" for consideration by eLife. Your article has been reviewed by three peer reviewers, including Juan Valcárcel as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by James Manley as the Senior Editor.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

As the editors have judged that your manuscript is of interest, but as described below that additional experiments are required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)

Summary:

Some circRNAs are flanked by inverted repeat elements, but many are not and this paper investigates the factors that control this latter class of RNAs. The authors find, using ENCODE circRNA and eCLIP data, as well as their own RIP results, that SFPQ is highly enriched in the introns flanking these circRNAs. siRNA depletion of SFPQ or its interacting partner NONO indeed regulate circRNA levels, concomitant with activation of alternative/cryptic splice sites, upregulation of intron retention and premature polyadenylation. Multivariate regression analyses indicate that upstream intron length and aberrant splicing in the immediate upstream region impact on circRNA biogenesis. The authors propose that a combination of aberrant splicing, premature termination and other elements control the production of circRNAs of the DALI (Distal Alu Long Intron) type.

The manuscript provides new insights into circRNA biosynthesis and efficient splicing of long introns, showcases the complexity of the interplay between various mechanisms for proper RNA processing and provides original leads for the field to pursue. The manuscript however falls short of demonstrating direct effects of SFPQ on circRNA production and of proving the cause-effects of the interplay between RNA processing steps.

Essential revisions:

1) The authors should try to rescue the effects of the knock down by expressing wild type and mutant forms of SFPQ and thus rigorously prove that SFPQ is responsible for these activities and identify critical protein domains involved. They should also use minigenes to show that exon circularization depends on the presence of the SFPQ sites and test the contribution of intronic polyadenylation sites. SFPQ could help avoiding the use of premature cleavage and polyA sites, which are known to alter circRNA formation (Liang et al., 2017). The authors should discuss this in the context of the role of U1 snRNP on usage of intronic PAS (the telescripting mechanism uncovered by Dreyfuss and colleagues). The authors should also determine if U1 snRNA levels are altered as this may be part of the reason for the widespread transcription termination effects that are observed.

2) Removal of SFPQ has very different effects on circRNA levels in HepG2 and HEK293T cells (mostly downregulation of DALI circRNAs) compared to in mouse brain (equal distribution of up- and down-regulated circRNAs). The authors should address this, to start with by looking into the CLIP-seq data from Takeuchi et al., 2018 and explore the binding of SFPQ to the introns flanking the down-regulated vs. up-regulated circRNAs. They should acknowledge the previous work by Takeuchi et al., 2018 on the function of SFPQ in transcription and discuss the drop-off phenotype showing that mice loss of SFPQ impairs transcriptional elongation accompanied by a decrease in RNA Pol II density.

eLife. 2021 Jan 21;10:e63088. doi: 10.7554/eLife.63088.sa2

Author response

Essential revisions:

1) The authors should try to rescue the effects of the knock down by expressing wild type and mutant forms of SFPQ and thus rigorously prove that SFPQ is responsible for these activities and identify critical protein domains involved.

We thank the reviewers for this suggestion. Regarding a SFPQ rescue experiment, we have pursued several approaches. First, we tried targeting SFPQ in the 3’UTR using four different siRNAs followed by co-introduction of a SFPQ WT expression plasmid. However, we were unable to obtain any successful depletion of endogenous SFPQ by this method. Then, we designed a Myc-tagged SFPQ WT expressing plasmid resistant to one of the functional siRNAs (SFPQ siRNA#1) used in this paper. Unfortunately, effective co-transfection of SFPQ WT in conjunction with a 5-day knockdown protocol was challenging due to low transfection efficiency. Instead, we established a short-term 2-day, one-hit rescue setup. Here, western blotting demonstrated successful KD of endogenous SFPQ and Myc-SFPQ WT expression in our two day, one hit KD protocol in HEK293T cells (now included as Figure 4—figure supplement 3). We observed a substantial rescue of our cryptic spliced transcript, DENND1A, supporting a direct role of SFPQ in regulating cryptic splicing. However, the two-day protocol is insufficient to examine the effects on DALI circRNA expression, due to their high level of stability.

Regarding rescue-experiments with SFPQ domain mutants, this is definitely an interesting future avenue of research, however, in the current manuscript, we have not focused on the individual domains of SFPQ and therefore we argue that this is currently beyond the scope of this paper.

They should also use minigenes to show that exon circularization depends on the presence of the SFPQ sites and test the contribution of intronic polyadenylation sites. SFPQ could help avoiding the use of premature cleavage and polyA sites, which are known to alter circRNA formation (Liang et al., 2017). The authors should discuss this in the context of the role of U1 snRNP on usage of intronic PAS (the telescripting mechanism uncovered by Dreyfuss and colleagues). The authors should also determine if U1 snRNA levels are altered as this may be part of the reason for the widespread transcription termination effects that are observed.

We agree that the use of minigenes to show that exon circularization depends on the presence of the SFPQ sites, if successful, would be an important addition to this study. While SFPQ is known to have enriched binding in long introns of target RNAs (Iida et al., 2020; Takeuchi et al., 2018), SFPQ lacks a defined RNA-binding consensus sequence (Iida et al., 2020; 1). This suggests that SFPQ binding specificities may be regulated by a more complex mechanism, where the recruitment mechanism, binding proportion, and co-binding partners may affect SFPQs roles in transcription, splicing or stabilization of pre-mRNAs. As a consequence of SFPQs undefined binding specificity we are unable to generate a mini-gene system where we could confidently establish that exon circularization is dependent on SFPQ binding.

Furthermore, we agree that SFPQ’s ability to preserve long gene transcription through repression of intronic PAS usage is similar to the U1 telescripting mechanism uncovered by Dreyfuss and colleagues.

As suggested by the reviewers, we have conducted northern blotting and qRT-PCR analysis to determine the effect of SFPQ depletion on U1 snRNA levels. Here, U1 snRNP levels were not found to be significantly affected by SFPQ knockdown for both HepG2 and HEK293T cells (new Figure 5—figure supplement 2 ). Based on these observations, we cannot exclude any changes in U1 occupancy or localization upon SFPQ depletion, and the relationship between SFPQ and U1 could prove to be an interesting direction to pursue further. However, our data suggest that the effect of SFPQ depletion on intronic PAS usage cannot solely be explained by U1 shortage. This is now also included in the Discussion:

“Moreover, U1 snRNP abundancy has been described to be important for transcriptional elongation of long, neuronal genes by protecting against aberrant PAS usage (Kaida et al., 2010; Berg et al., 2012; Oh et al., 2017). However, in many cases the observed premature termination upon SFPQ depletion co-appears with an upstream cryptic SA, and the U1 snRNA levels seem unaffected by SFPQ depletion suggesting that the SFPQ phenotype is not mediated entirely by U1 telescripting.”

2) Removal of SFPQ has very different effects on circRNA levels in HepG2 and HEK293T cells (mostly downregulation of DALI circRNAs) compared to in mouse brain (equal distribution of up- and down-regulated circRNAs). The authors should address this, to start with by looking into the CLIP-seq data from Takeuchi et al., 2018 and explore the binding of SFPQ to the introns flanking the down-regulated vs. up-regulated circRNAs. They should acknowledge the previous work by Takeuchi et al., 2018 on the function of SFPQ in transcription and discuss the drop-off phenotype showing that mice loss of SFPQ impairs transcriptional elongation accompanied by a decrease in RNA Pol II density.

We thank the reviewers for this suggestion, and we have now included the eCLIP data from Takeuchi (GSE96081), shown in Figure 3—figure supplement 3G. Consistent with the SFPQ eCLIP from HepG2 cells, DALI circRNAs in the mouse brain data show highly enriched SFPQ binding in the flanking regions. Furthermore, to better grasp the impact of SFPQ on circRNA biogenesis, we now stratify the eCLIP analysis by upstream and downstream binding. Interestingly, this shows that particularly for circRNA with increased expression upon SFPQ repression, SFPQ binds solely to the upstream region, whereas downregulated circRNAs generally show enriched SFPQ binding both up- and downstream (see Figure 3J and Figure 3—figure supplement 3G). Additionally, we also now include the eCLIP data in the final GLM model. Here, most particularly for eCLIP signal in the downstream region, enrichment correlates negatively with circRNA expression upon SFPQ depletion, in mouse and HepG2, emphasizing that SFPQ binding in the flanking regions (mostly downstream) support circRNA expression (see also the schematic model in Figure 6F).

Reference:

1) Van Nostrand EL, Freese P, Pratt GA, Wang X, Wei X, Xiao R, et al. A large-scale binding and functional map of human RNA-binding proteins. Nature. 2020 Jul 29;583(7818):711–9.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Citations

Stagsted L, O'leary E, Hansen T. 2020. The RNA-binding protein SFPQ preserves long-intron splicing and regulates circRNA biogenesis. NCBI Gene Expression Omnibus. GSE157622 [DOI] [PMC free article] [PubMed]
Takeuchi A, Iida K, Tsubota T, Hosokawa M. 2018. The RNA-binding protein Sfpq regulates long neuronal genes in transcriptional elongation. NCBI Gene Expression Omnibus. GSE60246
Takeuchi A, Denawa M, Iida K, Hagiwara M. 2018. The RNA-binding protein Sfpq regulates long neuronal genes in transcriptional elongation [SFPQ_CLIP-seq] NCBI Gene Expression Omnibus. GSE96081

Supplementary Materials

Supplementary file 1. ENCODE RNAseq.

List of ENCODE dataset accession numbers used in RNAseq analysis.

elife-63088-supp1.xlsx^{(10.1KB, xlsx)}

Supplementary file 2. ENCODE eCLIP.

List of ENCODE dataset accession numbers used in eCLIP analysis.

elife-63088-supp2.xlsx^{(36.4KB, xlsx)}

Supplementary file 3. HEK293T RNAseq.

Mapping statistics and annotated output from featureCounts, ciri2 and find_circ using RNAseq on HEK293T cells.

elife-63088-supp3.xlsx^{(12.3MB, xlsx)}

Supplementary file 4. HepG2 RNAseq.

Mapping statistics and annotated output from featureCounts, ciri2 and find_circ using RNAseq on HepG2 cells.

elife-63088-supp4.xlsx^{(19.5MB, xlsx)}

Supplementary file 5. Mouse brain RNAseq.

Mapping statistics and annotated output from featureCounts, ciri2 and find_circ using RNAseq on mouse brain (GSE60246).

elife-63088-supp5.xlsx^{(20.5MB, xlsx)}

Supplementary file 6. HEK293T QuantSeq.

Mapping statistics and output from MACS2 analysis using quantseq data on HEK293T cells.

elife-63088-supp6.xlsx^{(23.3MB, xlsx)}

Supplementary file 7. Primers, probes, and antibodies.

elife-63088-supp7.xlsx^{(14.7KB, xlsx)}

Transparent reporting form

elife-63088-transrepform.pdf^{(183.8KB, pdf)}

Data Availability Statement

Sequencing data has been deposited on GEO (accession no GSE157622), and scripts for RNAseq data processing are available at github: github.com/ncrnalab/pyutils.

Sequencing data has been deposited on GEO (accession no GSE157622).

The following dataset was generated:

Stagsted L, O'leary E, Hansen T. 2020. The RNA-binding protein SFPQ preserves long-intron splicing and regulates circRNA biogenesis. NCBI Gene Expression Omnibus. GSE157622

The following previously published datasets were used:

Takeuchi A, Iida K, Tsubota T, Hosokawa M. 2018. The RNA-binding protein Sfpq regulates long neuronal genes in transcriptional elongation. NCBI Gene Expression Omnibus. GSE60246

Takeuchi A, Denawa M, Iida K, Hagiwara M. 2018. The RNA-binding protein Sfpq regulates long neuronal genes in transcriptional elongation [SFPQ_CLIP-seq] NCBI Gene Expression Omnibus. GSE96081

[bib1] Ajuh P, Kuster B, Panov K, Zomerdijk JC, Mann M, Lamond AI. Functional analysis of the human CDC5L complex and identification of its components by mass spectrometry. The EMBO Journal. 2000;19:6569–6581. doi: 10.1093/emboj/19.23.6569. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib2] Aktaş T, Avşar Ilık İ, Maticzka D, Bhardwaj V, Pessoa Rodrigues C, Mittler G, Manke T, Backofen R, Akhtar A. DHX9 suppresses RNA processing defects originating from the alu invasion of the human genome. Nature. 2017;544:115–119. doi: 10.1038/nature21715. [DOI] [PubMed] [Google Scholar]

[bib3] Ashwal-Fluss R, Meyer M, Pamudurti NR, Ivanov A, Bartok O, Hanan M, Evantal N, Memczak S, Rajewsky N, Kadener S. circRNA biogenesis competes with pre-mRNA splicing. Molecular Cell. 2014;56:55–66. doi: 10.1016/j.molcel.2014.08.019. [DOI] [PubMed] [Google Scholar]

[bib4] Barrett SP, Wang PL, Salzman J. Circular RNA biogenesis can proceed through an exon-containing lariat precursor. eLife. 2015;4:e07540. doi: 10.7554/eLife.07540. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib5] Berg MG, Singh LN, Younis I, Liu Q, Pinto AM, Kaida D, Zhang Z, Cho S, Sherrill-Mix S, Wan L, Dreyfuss G. U1 snRNP determines mRNA length and regulates isoform expression. Cell. 2012;150:53–64. doi: 10.1016/j.cell.2012.05.029. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib6] Buxadé M, Morrice N, Krebs DL, Proud CG. The PSF·p54 nrb Complex Is a Novel Mnk Substrate That Binds the mRNA for Tumor Necrosis Factor α. Journal of Biological Chemistry. 2008;283:57–65. doi: 10.1074/jbc.M705286200. [DOI] [PubMed] [Google Scholar]

[bib7] Clemson CM, Hutchinson JN, Sara SA, Ensminger AW, Fox AH, Chess A, Lawrence JB. An architectural role for a nuclear noncoding RNA: neat1 RNA is essential for the structure of paraspeckles. Molecular Cell. 2009;33:717–726. doi: 10.1016/j.molcel.2009.01.026. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib8] Conn SJ, Pillman KA, Toubia J, Conn VM, Salmanidis M, Phillips CA, Roslan S, Schreiber AW, Gregory PA, Goodall GJ. The RNA binding protein quaking regulates formation of circRNAs. Cell. 2015;160:1125–1134. doi: 10.1016/j.cell.2015.02.014. [DOI] [PubMed] [Google Scholar]

[bib9] Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib10] Dong B, Horowitz DS, Kobayashi R, Krainer AR. Purification and cDNA cloning of HeLa cell p54nrb, a nuclear protein with two RNA recognition motifs and extensive homology to human splicing factor PSF and Drosophila NONA/BJ6. Nucleic Acids Research. 1993;21:4085–4092. doi: 10.1093/nar/21.17.4085. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib11] Dubin RA, Kazmi MA, Ostrer H. Inverted repeats are necessary for circularization of the mouse testis sry transcript. Gene. 1995;167:245–248. doi: 10.1016/0378-1119(95)00639-7. [DOI] [PubMed] [Google Scholar]

[bib12] Emili A, Shales M, McCracken S, Xie W, Tucker PW, Kobayashi R, Blencowe BJ, Ingles CJ. Splicing and transcription-associated proteins PSF and p54nrb/nonO bind to the RNA polymerase II CTD. RNA. 2002;8:1102–1111. doi: 10.1017/S1355838202025037. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib13] ENCODE Project Consortium An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib14] Errichelli L, Dini Modigliani S, Laneve P, Colantoni A, Legnini I, Capauto D, Rosa A, De Santis R, Scarfò R, Peruzzi G, Lu L, Caffarelli E, Shneider NA, Morlando M, Bozzoni I. FUS affects circular RNA expression in murine embryonic stem cell-derived motor neurons. Nature Communications. 2017;8:14741. doi: 10.1038/ncomms14741. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib15] Fox AH, Nakagawa S, Hirose T, Bond CS. Paraspeckles: where long noncoding RNA meets phase separation. Trends in Biochemical Sciences. 2018;43:124–135. doi: 10.1016/j.tibs.2017.12.001. [DOI] [PubMed] [Google Scholar]

[bib16] Furukawa MT, Sakamoto H, Inoue K. Interaction and colocalization of HERMES/RBPMS with NonO, PSF, and G3BP1 in neuronal cytoplasmic RNP granules in mouse retinal line cells. Genes to Cells. 2015;20:257–266. doi: 10.1111/gtc.12224. [DOI] [PubMed] [Google Scholar]

[bib17] Gao Y, Wang J, Zhao F. CIRI: an efficient and unbiased algorithm for de novo circular RNA identification. Genome Biology. 2015;16 doi: 10.1186/s13059-014-0571-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib18] Gozani O, Patton JG, Reed R. A novel set of spliceosome-associated proteins and the essential splicing factor PSF bind stably to pre-mRNA prior to catalytic step II of the splicing reaction. The EMBO Journal. 1994;13:3356–3367. doi: 10.1002/j.1460-2075.1994.tb06638.x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib19] Gunderson SI, Polycarpou-Schwarz M, Mattaj IW. U1 snRNP inhibits pre-mRNA polyadenylation through a direct interaction between U1 70K and poly(A) polymerase. Molecular Cell. 1998;1:255–264. doi: 10.1016/S1097-2765(00)80026-X. [DOI] [PubMed] [Google Scholar]

[bib20] Hansen TB, Jensen TI, Clausen BH, Bramsen JB, Finsen B, Damgaard CK, Kjems J. Natural RNA circles function as efficient microRNA sponges. Nature. 2013;495:384–388. doi: 10.1038/nature11993. [DOI] [PubMed] [Google Scholar]

[bib21] Hansen TB. Characterization of circular RNA concatemers. Methods in Molecular Biology. 2018;1724:143–157. doi: 10.1007/978-1-4939-7562-4_12. [DOI] [PubMed] [Google Scholar]

[bib22] Henriques T, Gilchrist DA, Nechaev S, Bern M, Muse GW, Burkholder A, Fargo DC, Adelman K. Stable pausing by RNA polymerase II provides an opportunity to target and integrate regulatory signals. Molecular Cell. 2013;52:517–528. doi: 10.1016/j.molcel.2013.10.001. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib23] Hosokawa M, Takeuchi A, Tanihata J, Iida K, Takeda S, Hagiwara M. Loss of RNA-Binding protein sfpq causes Long-Gene transcriptopathy in skeletal muscle and severe muscle mass reduction with metabolic myopathy. iScience. 2019;13:229–242. doi: 10.1016/j.isci.2019.02.023. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib24] Iida K, Hagiwara M, Takeuchi A. Multilateral bioinformatics analyses reveal the Function-Oriented target specificities and recognition of the RNA-Binding protein SFPQ. iScience. 2020;23:101325. doi: 10.1016/j.isci.2020.101325. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib25] Ishigaki S, Fujioka Y, Okada Y, Riku Y, Udagawa T, Honda D, Yokoi S, Endo K, Ikenaka K, Takagi S, Iguchi Y, Sahara N, Takashima A, Okano H, Yoshida M, Warita H, Aoki M, Watanabe H, Okado H, Katsuno M, Sobue G. Altered tau isoform ratio caused by loss of FUS and SFPQ function leads to FTLD-like phenotypes. Cell Reports. 2017;18:1118–1131. doi: 10.1016/j.celrep.2017.01.013. [DOI] [PubMed] [Google Scholar]

[bib26] Ito T, Watanabe H, Yamamichi N, Kondo S, Tando T, Haraguchi T, Mizutani T, Sakurai K, Fujita S, Izumi T, Isobe T, Iba H. Brm transactivates the telomerase reverse transcriptase (TERT) gene and modulates the splicing patterns of its transcripts in concert with p54nrb. Biochemical Journal. 2008;411:201–209. doi: 10.1042/BJ20071075. [DOI] [PubMed] [Google Scholar]

[bib27] Ivanov A, Memczak S, Wyler E, Torti F, Porath HT, Orejuela MR, Piechotta M, Levanon EY, Landthaler M, Dieterich C, Rajewsky N. Analysis of intron sequences reveals hallmarks of circular RNA biogenesis in animals. Cell Reports. 2015;10:170–177. doi: 10.1016/j.celrep.2014.12.019. [DOI] [PubMed] [Google Scholar]

[bib28] Jeck WR, Sorrentino JA, Wang K, Slevin MK, Burd CE, Liu J, Marzluff WF, Sharpless NE. Circular RNAs are abundant, conserved, and associated with ALU repeats. RNA. 2013;19:141–157. doi: 10.1261/rna.035667.112. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib29] Kaida D, Berg MG, Younis I, Kasim M, Singh LN, Wan L, Dreyfuss G. U1 snRNP protects pre-mRNAs from premature cleavage and polyadenylation. Nature. 2010;468:664–668. doi: 10.1038/nature09479. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib30] Kameoka S, Duque P, Konarska MM. p54(nrb) associates with the 5' splice site within large transcription/splicing complexes. The EMBO Journal. 2004;23:1782–1791. doi: 10.1038/sj.emboj.7600187. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib31] Kaneko S, Rozenblatt-Rosen O, Meyerson M, Manley JL. The multifunctional protein p54nrb/PSF recruits the exonuclease XRN2 to facilitate pre-mRNA 3' processing and transcription termination. Genes & Development. 2007;21:1779–1789. doi: 10.1101/gad.1565207. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib32] Knott GJ, Lee M, Passon DM, Fox AH, Bond CS. Caenorhabditis elegans NONO-1: insights into DBHS protein structure, architecture, and function. Protein Science. 2015;24:2033–2043. doi: 10.1002/pro.2816. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib33] Knott GJ, Bond CS, Fox AH. The DBHS proteins SFPQ, NONO and PSPC1: a multipurpose molecular scaffold. Nucleic Acids Research. 2016;44:3989–4004. doi: 10.1093/nar/gkw271. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib34] Kramer MC, Liang D, Tatomer DC, Gold B, March ZM, Cherry S, Wilusz JE. Combinatorial control of Drosophila circular RNA expression by intronic repeats, hnRNPs, and SR proteins. Genes & Development. 2015;29:2168–2182. doi: 10.1101/gad.270421.115. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib35] Kwiatkowski TJ, Bosco DA, Leclerc AL, Tamrazian E, Vanderburg CR, Russ C, Davis A, Gilchrist J, Kasarskis EJ, Munsat T, Valdmanis P, Rouleau GA, Hosler BA, Cortelli P, de Jong PJ, Yoshinaga Y, Haines JL, Pericak-Vance MA, Yan J, Ticozzi N, Siddique T, McKenna-Yasek D, Sapp PC, Horvitz HR, Landers JE, Brown RH. Mutations in the FUS/TLS gene on chromosome 16 cause familial amyotrophic lateral sclerosis. Science. 2009;323:1205–1208. doi: 10.1126/science.1166066. [DOI] [PubMed] [Google Scholar]

[bib36] Lagier-Tourenne C, Polymenidou M, Hutt KR, Vu AQ, Baughn M, Huelga SC, Clutario KM, Ling SC, Liang TY, Mazur C, Wancewicz E, Kim AS, Watt A, Freier S, Hicks GG, Donohue JP, Shiue L, Bennett CF, Ravits J, Cleveland DW, Yeo GW. Divergent roles of ALS-linked proteins FUS/TLS and TDP-43 intersect in processing long pre-mRNAs. Nature Neuroscience. 2012;15:1488–1497. doi: 10.1038/nn.3230. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib37] Lander ES. Initial sequencing and analysis of the human genome. Nature. 2011;409 doi: 10.1038/35057062. [DOI] [PubMed] [Google Scholar]

[bib38] Lee M, Sadowska A, Bekere I, Ho D, Gully BS, Lu Y, Iyer KS, Trewhella J, Fox AH, Bond CS. The structure of human SFPQ reveals a coiled-coil mediated polymer essential for functional aggregation in gene regulation. Nucleic Acids Research. 2015;43:3826–3840. doi: 10.1093/nar/gkv156. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib39] Liang D, Tatomer DC, Luo Z, Wu H, Yang L, Chen LL, Cherry S, Wilusz JE. The output of Protein-Coding genes shifts to circular RNAs when the Pre-mRNA processing machinery is limiting. Molecular Cell. 2017;68:940–954. doi: 10.1016/j.molcel.2017.10.034. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib40] Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30:923–930. doi: 10.1093/bioinformatics/btt656. [DOI] [PubMed] [Google Scholar]

[bib41] Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 2014;15:550. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib42] Luisier R, Tyzack GE, Hall CE, Mitchell JS, Devine H, Taha DM, Malik B, Meyer I, Greensmith L, Newcombe J, Ule J, Luscombe NM, Patani R. Intron retention and nuclear loss of SFPQ are molecular hallmarks of ALS. Nature Communications. 2018;9:2010. doi: 10.1038/s41467-018-04373-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib43] Makarov EM, Makarova OV, Urlaub H, Gentzel M, Will CL, Wilm M, Lührmann R. Small nuclear ribonucleoprotein remodeling during catalytic activation of the spliceosome. Science. 2002;298:2205–2208. doi: 10.1126/science.1077783. [DOI] [PubMed] [Google Scholar]

[bib44] Memczak S, Jens M, Elefsinioti A, Torti F, Krueger J, Rybak A, Maier L, Mackowiak SD, Gregersen LH, Munschauer M, Loewer A, Ziebold U, Landthaler M, Kocks C, le Noble F, Rajewsky N. Circular RNAs are a large class of animal RNAs with regulatory potency. Nature. 2013;495:333–338. doi: 10.1038/nature11928. [DOI] [PubMed] [Google Scholar]

[bib45] Nakaya T, Alexiou P, Maragkakis M, Chang A, Mourelatos Z. FUS regulates genes coding for RNA-binding proteins in neurons by binding to their highly conserved introns. RNA. 2013;19:498–509. doi: 10.1261/rna.037804.112. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib46] Oh JM, Di C, Venters CC, Guo J, Arai C, So BR, Pinto AM, Zhang Z, Wan L, Younis I, Dreyfuss G. U1 snRNP telescripting regulates a size-function-stratified human genome. Nature Structural & Molecular Biology. 2017;24:993–999. doi: 10.1038/nsmb.3473. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib47] Palangat M, Hittinger CT, Landick R. Downstream DNA selectively affects a paused conformation of human RNA polymerase II. Journal of Molecular Biology. 2004;341:429–442. doi: 10.1016/j.jmb.2004.06.009. [DOI] [PubMed] [Google Scholar]

[bib48] Passon DM, Lee M, Rackham O, Stanley WA, Sadowska A, Filipovska A, Fox AH, Bond CS. Structure of the heterodimer of human NONO and paraspeckle protein component 1 and analysis of its role in subnuclear body formation. PNAS. 2012;109:4846–4850. doi: 10.1073/pnas.1120792109. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib49] Patton JG, Porro EB, Galceran J, Tempst P, Nadal-Ginard B. Cloning and characterization of PSF, a novel pre-mRNA splicing factor. Genes & Development. 1993;7:393–406. doi: 10.1101/gad.7.3.393. [DOI] [PubMed] [Google Scholar]

[bib50] Peng R, Dye BT, Pérez I, Barnard DC, Thompson AB, Patton JG. PSF and p54nrb bind a conserved stem in U5 snRNA. RNA. 2002;8:1334–1347. doi: 10.1017/S1355838202022070. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib51] Ragan C, Goodall GJ, Shirokikh NE, Preiss T. Insights into the biogenesis and potential functions of exonic circular RNA. Scientific Reports. 2019;9:2048. doi: 10.1038/s41598-018-37037-0. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib52] Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, Brugmann SA, Goodnough LH, Helms JA, Farnham PJ, Segal E, Chang HY. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell. 2007;129:1311–1323. doi: 10.1016/j.cell.2007.05.022. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib53] Rosonina E, Ip JY, Calarco JA, Bakowski MA, Emili A, McCracken S, Tucker P, Ingles CJ, Blencowe BJ. Role for PSF in mediating transcriptional activator-dependent stimulation of pre-mRNA processing in vivo. Molecular and Cellular Biology. 2005;25:6734–6746. doi: 10.1128/MCB.25.15.6734-6746.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib54] Rybak-Wolf A, Stottmeister C, Glažar P, Jens M, Pino N, Giusti S, Hanan M, Behm M, Bartok O, Ashwal-Fluss R, Herzog M, Schreyer L, Papavasileiou P, Ivanov A, Öhman M, Refojo D, Kadener S, Rajewsky N. Circular RNAs in the mammalian brain are highly abundant, conserved, and dynamically expressed. Molecular Cell. 2015;58:870–885. doi: 10.1016/j.molcel.2015.03.027. [DOI] [PubMed] [Google Scholar]

[bib55] Salzman J, Gawad C, Wang PL, Lacayo N, Brown PO. Circular RNAs are the predominant transcript isoform from hundreds of human genes in diverse cell types. PLOS ONE. 2012;7:e30733. doi: 10.1371/journal.pone.0030733. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib56] Salzman J, Chen RE, Olsen MN, Wang PL, Brown PO. Cell-type specific features of circular RNA expression. PLOS Genetics. 2013;9:e1003777. doi: 10.1371/journal.pgen.1003777. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib57] Stagsted LV, Nielsen KM, Daugaard I, Hansen TB. Noncoding AUG circRNAs constitute an abundant and conserved subclass of circles. Life Science Alliance. 2019;2:e201900398. doi: 10.26508/lsa.201900398. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib58] Szabo L, Morey R, Palpant NJ, Wang PL, Afari N, Jiang C, Parast MM, Murry CE, Laurent LC, Salzman J. Statistically based splicing detection reveals neural enrichment and tissue-specific induction of circular RNA during human fetal development. Genome Biology. 2015;16:126. doi: 10.1186/s13059-015-0690-5. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib59] Takeuchi A, Iida K, Tsubota T, Hosokawa M, Denawa M, Brown JB, Ninomiya K, Ito M, Kimura H, Abe T, Kiyonari H, Ohno K, Hagiwara M. Loss of sfpq causes Long-Gene transcriptopathy in the brain. Cell Reports. 2018;23:1326–1341. doi: 10.1016/j.celrep.2018.03.141. [DOI] [PubMed] [Google Scholar]

[bib60] Thomas-Jinu S, Gordon PM, Fielding T, Taylor R, Smith BN, Snowden V, Blanc E, Vance C, Topp S, Wong CH, Bielen H, Williams KL, McCann EP, Nicholson GA, Pan-Vazquez A, Fox AH, Bond CS, Talbot WS, Blair IP, Shaw CE, Houart C. Non-nuclear pool of splicing factor SFPQ regulates axonal transcripts required for normal motor development. Neuron. 2017;94:931. doi: 10.1016/j.neuron.2017.04.036. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib61] Urban RJ, Bodenburg Y, Kurosky A, Wood TG, Gasic S. Polypyrimidine tract-binding protein-associated splicing factor is a negative regulator of transcriptional activity of the porcine p450scc insulin-like growth factor response element. Molecular Endocrinology. 2000;14:774–782. doi: 10.1210/mend.14.6.0485. [DOI] [PubMed] [Google Scholar]

[bib62] Vance C, Rogelj B, Hortobágyi T, De Vos KJ, Nishimura AL, Sreedharan J, Hu X, Smith B, Ruddy D, Wright P, Ganesalingam J, Williams KL, Tripathi V, Al-Saraj S, Al-Chalabi A, Leigh PN, Blair IP, Nicholson G, de Belleroche J, Gallo JM, Miller CC, Shaw CE. Mutations in FUS, an RNA processing protein, cause familial amyotrophic lateral sclerosis type 6. Science. 2009;323:1208–1211. doi: 10.1126/science.1165942. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib63] Venø MT, Hansen TB, Venø ST, Clausen BH, Grebing M, Finsen B, Holm IE, Kjems J. Spatio-temporal regulation of circular RNA expression during porcine embryonic brain development. Genome Biology. 2015;16:245. doi: 10.1186/s13059-015-0801-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib64] Westholm JO, Miura P, Olson S, Shenker S, Joseph B, Sanfilippo P, Celniker SE, Graveley BR, Lai EC. Genome-wide analysis of Drosophila circular RNAs reveals their structural and sequence properties and age-dependent neural accumulation. Cell Reports. 2014;9:1966–1980. doi: 10.1016/j.celrep.2014.10.062. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib65] You X, Vlatkovic I, Babic A, Will T, Epstein I, Tushev G, Akbalik G, Wang M, Glock C, Quedenau C, Wang X, Hou J, Liu H, Sun W, Sambandan S, Chen T, Schuman EM, Chen W. Neural circular RNAs are derived from synaptic genes and regulated by development and plasticity. Nature Neuroscience. 2015;18:603–610. doi: 10.1038/nn.3975. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib66] Zhang XO, Wang HB, Zhang Y, Lu X, Chen LL, Yang L. Complementary sequence-mediated exon circularization. Cell. 2014;159:134–147. doi: 10.1016/j.cell.2014.09.001. [DOI] [PubMed] [Google Scholar]

[bib67] Zhang Y, Xue W, Li X, Zhang J, Chen S, Zhang JL, Yang L, Chen LL. The biogenesis of nascent circular RNAs. Cell Reports. 2016;15:611–624. doi: 10.1016/j.celrep.2016.03.058. [DOI] [PubMed] [Google Scholar]

[bib68] Zhang Z, Carmichael GG. The fate of dsRNA in the nucleus: a p54(nrb)-containing complex mediates the nuclear retention of promiscuously A-to-I edited RNAs. Cell. 2001;106:465–475. doi: 10.1016/s0092-8674(01)00466-4. [DOI] [PubMed] [Google Scholar]

PERMALINK

The RNA-binding protein SFPQ preserves long-intron splicing and regulates circRNA biogenesis in mammals

Lotte Victoria Winther Stagsted

Eoghan Thomas O'Leary

Karoline Kragh Ebbesen

Thomas Birkballe Hansen

Roles

Abstract

Introduction

Results

The DALI circRNAs are defined by long flanking introns and distal inverted Alu elements

Figure 1. Characteristics of DALI-circRNA.

Figure 1—figure supplement 1. circRNAome in HepG2 and K562 from ENCODE.

SFPQ and NONO are specifically enriched in the introns flanking DALI circRNAs

Figure 2. SFPQ and NONO show enriched binding in the flanking regions of DALI circRNAs.

Figure 2—figure supplement 1. SFPQ and NONO enriched on circRNA flanking introns.

Figure 2—figure supplement 2. RNA immunoprecipitation of SFPQ and NONO confirms enrichment.

SFPQ depletion represses DALI circRNAs production

Figure 3. Knockdown of SFPQ affects DALI circRNAs.

Figure 3—figure supplement 1. SFPQ/NONO-depletion in HEK293T and HepG2 cells.

Figure 3—figure supplement 2. Expression profiles for selected circRNAs.

Figure 3—figure supplement 3. CircRNAome analysis of SFPQ knockout mouse brain data (GSE60246).

SFPQ depletion affects alternative splicing and intron retention in long genes

Figure 4. SFPQ ensures long-gene expression and suppresses cryptic splicing.

Figure 4—figure supplement 1. Genic expression profile for selected long genes.

Figure 4—figure supplement 2. SFPQ ensures long-gene expression (HEK293T + MOUSE).

Figure 4—figure supplement 3. SFPQ co-expression rescue cryptic splicing.

SFPQ depletion results in premature termination events

Figure 5. SFPQ depletion activates intronic polyA signal and premature termination.

Figure 5—figure supplement 1. Quantseq analysis.

Figure 5—figure supplement 2. U1 snRNA abundance upon SFPQ knockdown.

Figure 5—figure supplement 3. circRNAs in kmeans clusters.

circRNA deregulation is not explained by premature termination

Extracting features important for circRNA biogenesis

Figure 6. Multiple features contribute to circRNA regulation by SFPQ.

Figure 6—figure supplement 1. HepG2 features.

Figure 6—figure supplement 2. Mouse brain features.

Figure 6—figure supplement 3. HEK293T features and GLM model performance.

Figure 6—figure supplement 4. GLM model performance.

Discussion

Materials and methods

Key resources table.

Cell lines and transfections

RT-PCR and RT-qPCR

Poly(A) enrichment

RNA sequencing

RNA-immunoprecipitation (RIP)

Western blotting

Mapping, circRNA detection, and quantification

Cryptic/alternative splicing and intron retention

eCLIP analysis

Quantseq analysis

Differential gene expression

PAGE northern blot

Data accessibility

Acknowledgements

Funding Statement

Contributor Information

Funding Information

Additional information

Competing interests

Author contributions

Additional files

Data availability

References

Decision letter

Roles

Author response

Associated Data

Data Citations

Supplementary Materials

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases