Proceedings of the National Academy of Sciences of the United States of America. 2015 Mar 2;112(11):E1191–E1200. doi: 10.1073/pnas.1416879112

Dramatically reduced spliceosome in Cyanidioschyzon merolae

Martha R Stark a, Elizabeth A Dunn b, William S C Dunn a, Cameron J Grisdale c, Anthony R Daniele a, Matthew R G Halstead a, Naomi M Fast c, Stephen D Rader a,b,1
PMCID: PMC4371933  PMID: 25733880

Significance

The spliceosome—the molecular particle responsible for removing interrupting sequences from eukaryotic messenger RNA—is one of the most complex cellular machines. Consisting of five snRNAs and over 200 proteins in humans, its numerous changes in composition and shape during splicing have made it difficult to study. We have characterized an algal spliceosome that is much smaller, with only 43 identifiable core proteins, the majority of which are essential for viability in other organisms. We propose that this highly reduced spliceosome has retained only the most critical splicing factors. Cyanidioschyzon merolae therefore provides a powerful system to examine the spliceosome’s catalytic core, enabling future advances in understanding the splicing mechanism and spliceosomal organization that are challenging in more complex systems.

Keywords: pre-mRNA splicing, spliceosome core, U1 snRNP, genome reduction, splicing mechanism

Abstract

The human spliceosome is a large ribonucleoprotein complex that catalyzes pre-mRNA splicing. It consists of five snRNAs and more than 200 proteins. Because of this complexity, much work has focused on the Saccharomyces cerevisiae spliceosome, viewed as a highly simplified system with fewer than half as many splicing factors as humans. Nevertheless, it has been difficult to ascribe a mechanistic function to individual splicing factors or even to discern which are critical for catalyzing the splicing reaction. We have identified and characterized the splicing machinery from the red alga Cyanidioschyzon merolae, which has been reported to harbor only 26 intron-containing genes. The U2, U4, U5, and U6 snRNAs contain expected conserved sequences and have the ability to adopt secondary structures and form intermolecular base-pairing interactions, as in other organisms. C. merolae has a highly reduced set of 43 identifiable core splicing proteins, compared with ∼90 in budding yeast and ∼140 in humans. Strikingly, we have been unable to find a U1 snRNA candidate or any predicted U1-associated proteins, suggesting that splicing in C. merolae may occur without the U1 small nuclear ribonucleoprotein particle. In addition, based on mapping the identified proteins onto the known splicing cycle, we propose that there is far less compositional variability during splicing in C. merolae than in other organisms. The observed reduction in splicing factors is consistent with the elimination of spliceosomal components that play a peripheral or modulatory role in splicing, presumably retaining those with a more central role in organization and catalysis.


Pre-mRNA splicing occurs by two transesterification reactions that are catalyzed by the spliceosome, a large macromolecular assembly of five snRNAs and more than 200 proteins in humans (1). These components are thought to assemble onto each new pre-mRNA transcript in an ordered fashion through the recognition and binding of three highly conserved sequences in the transcript: the 5′ splice site, the branch site, and the 3′ splice site (2, 3). Some of these interactions occur via direct RNA/RNA base pairing between the transcript and snRNAs; for example, both U1 and U6 snRNAs base pair to the 5′ splice site of the pre-mRNA transcript, and, similarly, U2 snRNA base pairs to the branch site (3).

Given the complexity of the human spliceosome, it is of considerable interest to find a more tractable splicing system with fewer components to study the core processes of splicing (assembly, catalysis, and fidelity). The Saccharomyces cerevisiae (yeast) spliceosome has been proposed as a simplified model system, because it contains only about 100 proteins (4). Indeed, substantial progress in understanding the spliceosome has been made by studying yeast splicing (3, 5). Nevertheless, the yeast spliceosome is still a highly complex system in which to investigate the role of individual proteins, let alone attempt to develop a completely defined splicing system for more incisive experiments. The publication of the Cyanidioschyzon merolae genome sequence revealed that it has only 27 introns (6), indicating that it might be a simpler system in which to investigate splicing.

C. merolae is an acidophilic, unicellular red alga that grows at temperatures of up to 56 °C (6). At 16.5 million base pairs, its genome is similar in size to that of S. cerevisiae and contains a comparable number of genes; however, only one-tenth as many introns were annotated in C. merolae: 26 intron-containing genes, or 0.5% of its genes (6). The small number of introns in C. merolae raises the question of whether the full complexity of the canonical splicing machinery has been maintained or whether C. merolae also harbors a reduced set of splicing factors.

We have undertaken a comprehensive bioinformatic survey of the C. merolae splicing machinery, identifying four snRNAs (U2, U4, U5, and U6) and 69 splicing proteins, 43 of which are predicted to be associated with small nuclear ribonucleoproteins (snRNPs) or part of complexes that associate directly with the spliceosome. Surprisingly, we were not able to identify any candidates for the U1 snRNA or U1-associated proteins, leading us to conclude that C. merolae does not contain a U1 snRNP. The profile of splicing proteins retained in C. merolae provides a means of assessing the contribution of specific proteins to the splicing reaction, at least insofar as the missing ones clearly are not always essential for splicing. The U2 snRNP is the most complex particle, with 10 proteins present in addition to the Sm proteins. Many U5-associated proteins also are retained, whereas the Prp19/CDC5L complex (NTC), step-specific proteins, and tri-snRNP–specific proteins are largely predicted to be absent. Indeed, the C. merolae spliceosome is notable for the almost complete absence of proteins that join or leave the spliceosome after B complex formation, suggesting that the initial core contains all the components required to carry out the splicing reaction. The novelty of a splicing system reduced to ∼40 proteins suggests that C. merolae may be a more tractable system for studying the central processes of pre-mRNA splicing.

Results

C. merolae Transcripts Are Spliced.

Twenty-seven introns were identified computationally in the C. merolae genome (figure S2 in ref. 6). To ascertain whether the intron-containing transcripts are in fact spliced, we performed RT-PCR on DNase-treated total RNA. Each primer pair, except for those for gene CMK245C, resulted in two PCR products, a longer intron-containing amplicon and a shorter amplicon corresponding to spliced mRNA (Fig. 1). The CMK245C transcript appears to be completely spliced. Two junctions (CMQ117C and CMO094C) were confirmed by sequencing reverse transcriptase products, and the remaining introns were observed to splice at their expected junctions based on our sequencing of background mRNAs in the RNA immunoprecipitation sequencing (RIP-seq) Illumina library (see below and SI Materials and Methods).

Fig. 1.


Amplification of pre-mRNA and mRNA by RT-PCR for all 26 C. merolae intron-containing genes. Gene names are listed above each lane. “L” indicates the 100-bp DNA ladder, “−” indicates the negative control with no reverse transcriptase. Expected amplicon sizes for pre-mRNA and mRNA spliced products are given below. The 500- and 1,000-bp ladder bands are indicated on the left.

Identification of snRNAs.

Given that intron-containing transcripts are spliced in C. merolae, we sought to identify components of the splicing apparatus, beginning with the snRNAs. BLAST searches were unsuccessful, presumably because of the small size of snRNAs (generally ∼150 nt), so we turned to Infernal (7) to take advantage of conserved secondary structure elements in the search. U2 and U4 snRNA candidates were identified using Infernal single-covariance model searches, and U5 and U6 candidates were identified using multiple covariance models. Unexpectedly, we found no candidates for U1 snRNA despite repeated searches with a variety of parameters.

To test expression of the predicted snRNAs, we performed Northern analysis on total RNA extracted from C. merolae. Fig. 2A demonstrates that each of the identified snRNAs was expressed and that their sizes corresponded to those predicted bioinformatically, except for U5, which was substantially larger than predicted.

Fig. 2.


Analysis of C. merolae snRNA expression and base pairing. (A) Denaturing Northern analysis of snRNA candidates. C. merolae total RNA was probed for each snRNA, as indicated. The leftmost lane contains S. cerevisiae total RNA probed for all five snRNAs as size markers, with sizes indicated on the left. (B) Nondenaturing Northern analysis of U4/U6 base-pairing status. Total RNA from C. merolae and S. cerevisiae was probed for U4 or U6, as labeled, with alternate lanes either heat treated (+) or not (−). Band identity is shown at left.

We determined the snRNA ends by a combination of sequencing and alignment with known snRNAs. The ends were similar to our predictions, confirming the unexpectedly large U5, which had 111- and 168-nt extensions on either side of the conserved central core (see Table S1 for snRNA accession numbers and sequences). Sequence alignments between the C. merolae candidates and snRNAs from other organisms are shown in Fig. S1, demonstrating sequence conservation in those regions that also are conserved among other organisms. For example, the branch site-binding region of U2 is conserved in C. merolae and is complementary to annotated branch sites in the genome (underlined in Fig. S1) (3). Similarly, the U6 ACAGAGA sequence is present—albeit with a change to ACUGAGA—as is U5 loop I (8, 9). In addition, U2, U4, and U5 each have recognizable Sm protein-binding sites, whereas U6 has a binding site for the like Sm (LSm) proteins (underlined in Fig. S1) (3).

Differences between C. merolae snRNAs and those of other species include the extended U5, slightly longer U4 and U6, and a shorter U2 snRNA. U4 has an 18-nt insertion, relative to the human sequence, starting at nucleotide 77 (Fig. S1). This insertion falls in the central domain of U4, in the middle of the hypothetical U4/U6 stem III, which is evolutionarily conserved, but for which there is no experimental support (10, 11). In addition, the 5′ stem-loop of U6 is ∼10 bp longer than that in other species (Fig. S1). Finally, U2 snRNA lacks ∼60 nucleotides from the 3′ end, after the Sm-binding site, a region that generally forms one or more stem-loops in other organisms (12). No C. merolae snRNAs have stem-loops predicted downstream of their Sm site.

U4 and U6 snRNAs in other organisms are known to form an extended base-pairing interaction, allowing them to comigrate on a nondenaturing gel. To test whether this interaction also occurs in C. merolae, we compared cold phenol-extracted total RNA (Fig. 2B, lanes 1 and 3) with the same samples incubated for 5 min at 70 °C to disrupt base pairing (Fig. 2B, lanes 2 and 4). Heat treatment resulted in the disappearance of the low-mobility band detected by both U4 (lane 1) and U6 (lane 3) probes and the appearance of higher-mobility bands detected by U4 (lane 2) and U6 (lane 4) probes. Control experiments with S. cerevisiae snRNAs (lanes 5–8) demonstrate the analogous interaction. Our results are consistent with the identification of these snRNAs as U4 and U6 and with the conservation of their ability to form extensive base-pairing interactions (Fig. 3 and discussed below).

Fig. 3.


Predicted secondary structures of C. merolae snRNAs (A–D) with S. cerevisiae structures depicted schematically for comparison. The alternative U2 toggle structure (13), in which stem IIc replaces stem IIa, is shown in the Inset in A. The conserved core of U5 (C) extends from nucleotides 112–282. S. cerevisiae secondary structure models are based on refs. 12 and 74 for U2 (A), on ref. 75 for U4 (B), on ref. 76 for the conserved portion of U5 (C), and on ref. 17 for U6 (D). The U2 branch site-binding region is underlined, and the Sm- and LSm-binding sites are in gray boxes.

snRNA Secondary Structure.

Conservation of the secondary structure is a useful criterion supporting homology. To determine whether the candidate snRNAs in C. merolae were capable of adopting secondary structures similar to those of their S. cerevisiae counterparts, we manually folded them into the experimentally determined secondary structures of homologous snRNAs. As shown in Fig. 3, the individual snRNAs are capable of forming structures similar to those formed by their homologs. U2 can adopt the two conformations proposed to be involved in progression through the splicing cycle (Fig. 3A) (13). The insertion in U4 has the potential to form an extended stem that interrupts the central domain (Fig. 3B) as well as a 5′ kissing loop present in S. cerevisiae. The central core of the C. merolae U5 sequence also is consistent with proposed secondary structures from other organisms (Fig. 3C) (14). The secondary structures of U5’s 5′ and 3′ extensions are unique to C. merolae and therefore cannot be compared with other U5s, although the 3′ end can adopt a characteristic stem-loop adjacent to the predicted Sm site. The 5′ and 3′ extensions are highly complementary to one another, and we propose that they form long-range stems, based on sfold analysis (15). Finally, U6 snRNA can form the 3′ internal stem loop known to be present in the active spliceosome as well as the conserved 5′ stem-loop (Fig. 3D) (16, 17).

Notably, these putative C. merolae snRNAs have the potential to form the same intermolecular base-pairing interactions as their counterparts in other organisms. For example, as in S. cerevisiae, U4 and U6 in C. merolae have complementary sequences in stems I, II, and III (Fig. S2A) (8). Similarly, U2 and U6 can base pair to each other and to the corresponding regions of the substrate transcript in either the three-helix form (Fig. S2B) (18) or the four-helix form (Fig. S2C) (19). Finally, U5 has the potential to form interactions with C. merolae exons similar to those seen in other organisms (Fig. S2D) (9). Overall, the C. merolae snRNAs appear to be highly conserved in sequence and base-pairing interactions.

snRNA Stability.

C. merolae is a moderate thermophile, raising the possibility that its snRNAs have evolved greater intrinsic stability to avoid denaturing under these conditions, as has been shown in a number of other cases (20). To test this possibility, we compared the calculated stability of each C. merolae snRNA stem with the homologous stems in S. cerevisiae and human snRNAs. Each snRNA has at least one stem that is substantially more stable in C. merolae than in yeast or humans. U2’s stem IIb (Fig. 3A) is 30–50 kJ/mol more stable than the orthologous stems in yeast and humans, and U2 stem IIc is 40–50 kJ/mol more stable than the corresponding structures (Table S2). Similarly, the U6 5′ stem-loop has a predicted ΔG of −128 kJ/mol, compared with only −49 and −51 kJ/mol for human and yeast, respectively. U5’s stem Ia and variable stem loop (VSL) are both substantially more stable than their orthologous stems, and U4 stem II also is more stable than that in yeast and humans. In contrast, many of the other snRNA stems have comparable, and sometimes less, predicted stability in C. merolae than in yeast or humans (Table S2), indicating that greater stability is not a universal feature of C. merolae snRNA. Although the intramolecular stems show a pattern of increased stability in certain cases, intermolecular stems between U4/U6, U2/U6, and U2/intron do not differ substantially in stability between C. merolae and the other organisms (Table S2).
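The size of the stability gap for the U6 5′ stem-loop can be made explicit with a small calculation. The sketch below uses only the three predicted ΔG values quoted above; the function name and data layout are illustrative and not part of the original analysis.

```python
# Predicted folding free energies (kJ/mol) for the U6 5' stem-loop,
# as quoted in the text. More negative means more stable.
u6_5p_stem_loop_dg = {"C. merolae": -128, "human": -49, "yeast": -51}


def extra_stability(dg_by_species, reference="C. merolae"):
    """Return, for each other species, how many kJ/mol more stable
    the reference stem is (other dG minus reference dG)."""
    ref = dg_by_species[reference]
    return {
        species: dg - ref
        for species, dg in dg_by_species.items()
        if species != reference
    }


gap = extra_stability(u6_5p_stem_loop_dg)
for species, kj in gap.items():
    print(f"C. merolae U6 5' stem-loop is {kj} kJ/mol more stable than {species}")
```

On these numbers the U6 5′ stem-loop gap (77–79 kJ/mol) is larger than the 30–50 kJ/mol differences reported for the U2 stems.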

C. merolae Splicing Proteins.

Despite the conservation of the sequence and secondary structure of U2, U4, U5, and U6, we were unable to identify a C. merolae U1 snRNA. In other organisms, U1 snRNA is associated with a number of snRNP-specific proteins in addition to the Sm proteins, which are common to all snRNAs except U6. These proteins include U1-70K (found in all organisms tested), U1-A, U1-C, and a variety of other proteins (1, 4).

To determine whether U1-associated proteins are present in C. merolae, which in turn would support the existence of the U1 snRNA, we performed BLAST searches (21) using Reciprocal Best Hit methodology (22) to identify homologs of all known human and yeast splicing proteins listed in the online Spliceosome Database and in recent papers (4, 23, 24). The candidate homologs were assigned to particles, splicing steps, or other categories according to their human homolog (1). Based on our criteria (Materials and Methods), we were able to identify 10 U2-associated proteins along with the Commitment Complex proteins Msl5, Mud2, and Sub2; four U5-associated proteins; two U4/U6-associated proteins; three from the NTC and related proteins; seven LSm and seven Sm proteins; and a variety of individual proteins found at various steps of splicing (Table 1 and Tables S3–S5). We did not find an LSm8 homolog, suggesting that C. merolae has only the LSm1–7 complex (25). Including tangentially associated splicing proteins, such as Dbr1 and Fal1, we predict a total of 69 splicing proteins in C. merolae, 43 of which we consider to be core proteins associated directly with snRNPs or the spliceosome.
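The Reciprocal Best Hit criterion used here can be sketched in a few lines. This is a minimal illustration, assuming BLAST results have already been reduced to (query, subject, E value) tuples; all sequence identifiers and E values below are hypothetical toy data, and the additional criteria described in Materials and Methods are not modeled.

```python
from typing import Dict, List, Tuple

# Each hit is (query_id, subject_id, e_value). All IDs are hypothetical.
Hit = Tuple[str, str, float]


def best_hits(hits: List[Hit]) -> Dict[str, str]:
    """Map each query to the subject with the lowest E value."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, e_value in hits:
        if query not in best or e_value < best[query][1]:
            best[query] = (subject, e_value)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: List[Hit], reverse: List[Hit]) -> List[Tuple[str, str]]:
    """Return (query, subject) pairs that are each other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Forward search: known splicing proteins against the algal proteome.
forward_hits = [
    ("yeast_Prp8", "alga_geneA", 0.0),
    ("yeast_Prp8", "alga_geneB", 1e-5),
    ("yeast_Prp28", "alga_geneC", 1e-40),
]
# Reverse search: algal candidates back against the yeast proteome.
reverse_hits = [
    ("alga_geneA", "yeast_Prp8", 0.0),
    ("alga_geneC", "yeast_other_ATPase", 1e-90),  # better match than Prp28
    ("alga_geneC", "yeast_Prp28", 1e-40),
]

rbh_pairs = reciprocal_best_hits(forward_hits, reverse_hits)
print(rbh_pairs)
```

In this toy data the Prp8 candidate passes, whereas the Prp28 candidate is rejected because its best reverse hit is a different protein.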

Table 1.

C. merolae splicing proteins

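| Particle or step | Name: S. cerevisiae (H. sapiens) | Accession no. | E value* | Identity, % | DEG | CSRC |
|---|---|---|---|---|---|---|
| Sm | SmB/B′ | CMK022C | 3E-35 | 32 | | |
| | SmD1 | CMF084C | 3E-15 | 32 | | |
| | SmD2 | CMN302C | 1E-17 | 38 | | |
| | SmD3 | CMM065C | 1E-14 | 37 | | |
| | SmE | CMM109C | 2E-15 | 35 | | |
| | | CMH215C | | | | |
| | SmF | CMQ171C | 2E-20 | 50 | | |
| | SmG | CMO342C | 2E-11 | 39 | | |
| U2 | Prp9 (SF3a3) | CMQ406C | 3E-45 | 22 | | |
| | Prp11 (SF3a2) | CMH102C | 5E-11 | 51 | | |
| | | CMN095C | | | | |
| | Prp21 (SF3a1) | CMJ300C | 9E-14 | 34 | | |
| | Hsh155 (SF3b1) | CMB002C | 1E-73 | 25 | | |
| | Cus1 (SF3b2) | CMT357C | 4E-30 | 31 | | |
| | Rse1 (SF3b3) | CML103C | 2E-24 | 27 | | |
| | Hsh49 (SF3b4) | CME063C | 1E-13 | 42 | | |
| | Rds3 (SF3b6) | CMS014C | 3E-37 | 31 | | |
| | Prp5 (hPRP5) | CMR433C | 4E-66 | 32 | | |
| U2 related | Prp43 (hPRP43) | CMM048C | 0 | 48 | | |
| | Mud2 (U2AF65) | CMS438C | 7E-63 | 22 | | |
| U5 | Prp8 (220K) | CMH168C | 0 | 34 | | |
| | Brr2 (200K) | CML192C | 1E-119 | 35 | | |
| | Snu114 (116K) | CMK208C | 1E-42 | 35 | | |
| | Dib1 (15K) | CMN033C | 2E-78 | 34 | | |
| | | CMS018C | | | | |
| U4/U6 | Prp3 (90K) | CMT170C | 7E-11 | 21 | | |
| | Snu13 (15.5K) | CMP335C | 5E-52 | 57 | | |
| U6§ | Lsm1 | CMT394C | 1E-19 | 44 | | |
| | Lsm2 | CMB130C | 6E-23 | 41 | | |
| | Lsm3 | CMT262C | 3E-20 | 44 | | |
| | Lsm4 | CMG061C | 1E-27 | 44 | | |
| | | CMT545C | | | | |
| | Lsm5 | CMP159C | 5E-18 | 38 | | |
| | Lsm6 | CMP138C | 1E-28 | 31 | | |
| | Lsm7 | CMP206C | 1E-11 | 41 | | |
| Cap binding | Sto1 (CBP80) | CMJ189C | 5E-92 | 18 | | |
| | Cbc2 (CBP20) | CMQ282C | 1E-43 | 53 | | |
| Complex A | Msl5/BBP/SF1 (RBM10) | CMI292C | 2E-39 | 34 | | |
| | Sub2 (hUAP56) | CME073C | 0 | 59 | | |
| NTC | Cef1 (CDC5L) | CMR098C | 4E-33 | 34 | | |
| | Prp46 (PRL1) | CMR305C | 2E-38 | 28 | | |
| | Bud31 (G10) | CMG014C | 4E-19 | 35 | | |
| Complex B | Prp38 (hPRP38) | CMJ144C | 1E-57 | 27 | | |
| Complex Bact | Yju2 (CCDC94) | CMN267C | 3E-5 | 25 | | |
| Second step | Prp16 (hPRP16) | CMQ385C | 0 | 35 | | |
| | Prp22 (hPRP22) | CMG044C | 1E-147 | 46 | | |
| EJC/TREX | Fal1 (eIF4A3) | CMK028C | 0 | 78 | | |
| | Yra1 (Aly/THOC4) | CMH135C | 3E-10 | 37 | | |
| | n/a (THOC2) | CMG046C | 1E-41 | 21 | | |
| SR | Rsp31 | CMO009C | 3E-24 | 34 | | |
| | n/a (SRSF2) | CML202C | 3E-12 | 30 | | |
| hnRNP | n/a (hnRNP H3) | CMF163C | 3E-15 | 38 | | |
| Miscellaneous | Dbr1 (hDBR1) | CMK205C | 2E-67 | 41 | | |
| | n/a (SRPK1) | CMK182C | 1E-61 | 41 | | |
| | n/a (DDX3X) | CMT173C | 1E-177 | 59 | | |
| | Dbp2 (p68/DDX5) | CMR479C | 0 | 54 | | |
| | n/a (PABP1) | CMJ286C | 3E-76 | 43 | | |
| | n/a (DHX36/RHAU) | CMC171C | 3E-42 | 34 | | |
| | n/a (PPP1CA) | CME079C | 0 | 75 | | |
| | n/a (RBBP6/PACT) | CMQ079C | 1E-29 | 32 | | |
| | n/a (Quaking) | CMA075C | 9E-41 | 55 | | |
| | n/a (RACK1) | CMI283C | 7E-150 | 27 | | |
| | n/a (TOE1) | CMK240C | 7E-20 | 23 | | |
| | Rts2 (HsKin17) | CMG137C | 9E-17 | 33 | | |
| | n/a (ERH) | CMR260C | 2E-22 | 42 | | |
| | Mtr4 (SKIV2L2) | CMA072C | 0 | 39 | | |
| | Rvb2 (TIP48) | CMT427C | 0 | 56 | | |
| | Tef1 (eEF1A) | CMH226C | 0 | 75 | | |
| | Rpg1 (eIF3A) | CMH060C | 7E-40 | 28 | | |
| | Prt1 (EIF3B) | CMK285C | 5E-111 | 31 | | |
| | n/a (RPSA) | CMT410C | 9E-89 | 56 | | |
| | n/a (TUBA1B) | CMT504C | 0 | 77 | | |
| | Tub2 (TUBB4B) | CMN263C | 0 | 72 | | |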
*Best E value (among all species, forward and reverse BLASTs) and percent identity for best BLAST.

DEG: Essential for viability in S. cerevisiae, Schizosaccharomyces pombe, or Mus musculus according to the Database of Essential Genes (30).

CSRC: C complex proteins detected by mass spectrometry after treatment with 1 M salt (33).

§It is unclear whether the CmLSm proteins associate with CmU6 (Discussion).

Rsp31 is an Arabidopsis thaliana SR protein.

Although many proteins either were clearly present (often already correctly annotated in the C. merolae genome database) or had no hits that were even remotely related, a number were ambiguous, often because of the presence of common motifs [e.g., RNA-recognition motifs (RRMs), tetratricopeptide repeats (TPRs), DEAD-boxes, and others]. We included the marginal candidates Mud2, Prp21, hnRNP H3, and Yju2 on the basis of additional analysis (SI Materials and Methods and Table S4). Conversely, we ruled out candidates for Clf1, hnRNP C, hnRNP M, and Prp2 because of the lack of gene-specific features. We did not find any components of the minor spliceosome (26). Given the difficulties inherent in identifying distant homologs bioinformatically, we acknowledge that this list is unlikely to be complete or final. Therefore it will be important to confirm these predictions experimentally.

Notably, we were unable to identify any U1-associated proteins, even when we extended our search to include homologs from additional organisms (Materials and Methods and Tables S3 and S5). As a further test for the presence of U1, we asked whether other proteins whose function is related to U1’s role in splicing were present in C. merolae. One such protein is Prp28, the DExD/H-box ATP hydrolase that catalyzes the dissociation of U1 from the 5′ splice site, allowing U6 to bind in its place (27). In the absence of U1, Prp28 could be rendered redundant. Consistent with this possibility, all Prp28 candidates were more similar to a different ATPase (Table 1 and Table S5). We were able to identify all of the other splicing-associated ATPases except Prp2, demonstrating that ATPases are not generally difficult to identify when present. The simplest explanation for the predicted absence of U1 snRNA, all its associated proteins, and Prp28 is that splicing in C. merolae proceeds via a U1-independent mechanism.

RNA Immunoprecipitation.

The unexpected absence of U1 snRNA and any U1-associated proteins in our bioinformatic searches suggested that a bioinformatics-independent strategy might be necessary to find U1. We therefore took advantage of the conserved, hypermethylated cap structure—a trimethylguanosine (TMG)—found on snRNAs in other organisms. We immunoprecipitated TMG-containing RNAs from total C. merolae RNA using an anti-TMG antibody and sequenced the resulting pool. Northern analysis of the immunoprecipitated RNAs demonstrated 80–90% depletion of all snRNAs in the supernatant compared with total RNA (Fig. 4, lanes 1, 2, 4, and 5). The snRNAs except for U6 were recovered in 65–90% yield in the eluate (Fig. 4, lanes 3 and 6, and table at right), yielding an enrichment of 500- to 1,000-fold relative to unselected RNAs (the total RNA decreased from 1.1 mg in the input to 1.8 µg in the eluate). Because U6 does not have a TMG cap, it is immunoprecipitated only via its association with U4. Silver staining revealed the presence of at least 10 bands smaller than U5 that appeared enriched in the eluate relative to the supernatant, aside from those comigrating with the snRNAs, as well as a large number of larger bands (Fig. 4, lanes 7 and 8). Bands visible in the supernatant are likely to be rRNAs, some fraction of which bound nonspecifically to the resin and eluted with the snRNAs. The entire pool of immunoprecipitated RNA was used as input for a modified Illumina TruSeq library preparation, from which we obtained paired-end reads of 100 base pairs.

Fig. 4.


Immunoprecipitation of RNA using anti-TMG antibodies. Denaturing Northern analysis of total RNA (T) from C. merolae (lanes 1 and 4), supernatant (S) from immunoprecipitation (lanes 2 and 5), and immunoprecipitated eluates (E, lanes 3 and 6). Northern lanes were probed for C. merolae snRNAs as indicated above. Bands are labeled at left. Band intensities on the Northern blot were measured and normalized to the fraction loaded on the gel, yielding the values shown at right. Lane 7 shows silver stain analysis of immunoprecipitated supernatant, and lane 8 shows immunoprecipitated eluate.

After eliminating rRNAs and mRNAs (i.e., transcripts annotated in the C. merolae genome as protein-coding), we were left with 58 transcripts that were reproducibly immunoprecipitated and had a sequencing coverage of at least 500 reads (Table S6). U2, U4, and U5 were found among these 58 RNAs, as expected from Northern analysis. U6 was present, but at less than 500-fold coverage, because of its lack of a TMG cap. The remaining RNAs were examined for the expected hallmarks of U1 snRNA: complementarity to known 5′ splice sites in C. merolae (GUAAGU) (6), presence of an Sm-binding site, and binding sites for conserved U1 proteins U1-A and U1-70K (28). Eleven RNAs had a potential Sm-binding site and complementarity to the 5′ splice site, and five of these also contained one canonical U1 protein-binding site, but sequence alignments to U1 sequences from the Rfam database (29) failed to reveal significant sequence similarity. Given the enrichment of U2, U4, and U5 in this experiment, it is unlikely that we would have missed U1 if it were present.
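Two of the U1 hallmarks described above, complementarity to the 5′ splice site and a potential Sm-binding site, lend themselves to a simple sequence scan. The sketch below is illustrative only. The Sm-site pattern (purine, A, three to six U, G, purine) is an assumed, commonly cited consensus rather than the criterion used in this study; only perfect Watson-Crick complementarity is checked, so G-U wobble pairs are ignored; and both candidate sequences are invented.

```python
import re

FIVE_PRIME_SS = "GUAAGU"  # C. merolae 5' splice site consensus quoted in the text

# Assumption: a commonly cited Sm-site consensus, not taken from this paper.
SM_SITE = re.compile(r"[AG]AU{3,6}G[AG]")

_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(rna))


def u1_hallmarks(rna: str) -> dict:
    """Report whether an RNA could pair perfectly with the 5' splice site
    and whether it contains a match to the assumed Sm-site pattern."""
    target = reverse_complement(FIVE_PRIME_SS)
    return {
        "pairs_5ss": target in rna,
        "sm_site": SM_SITE.search(rna) is not None,
    }


# Hypothetical candidate RNAs, for illustration only.
candidate_a = "AUACUUACCUGGCAGGGAAUUUUUGGAGC"
candidate_b = "GGCCAUGCGCAUGCCGGA"

print(u1_hallmarks(candidate_a))
print(u1_hallmarks(candidate_b))
```

A scan of this kind only nominates candidates; as noted above, the RNAs that passed such criteria still lacked significant similarity to known U1 sequences.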

Discussion

The pre-mRNA splicing machinery is highly conserved across eukaryotes, with the number of identified splicing proteins ranging from ∼100 in S. cerevisiae to well over 200 in humans (4, 23). Our computational and biochemical analysis of the splicing factors present in the acidophilic alga C. merolae demonstrates a dramatically smaller set of splicing machinery than has been found in other organisms. With only ∼40 predicted core splicing-associated proteins and four snRNAs, C. merolae appears to have been subject to strong selective pressure to reduce its spliceosomal complexity, along with its complement of introns. Strikingly, our multiple, independent search methods have not yielded any evidence for the presence of U1 snRNA or its associated proteins. The best explanation for these observations is that C. merolae splices introns in the absence of U1 snRNP, which, to our knowledge, would make it the first eukaryote known to do so.

Conservation of Splicing Factors.

Although C. merolae lives in an extreme environment, the splicing factors it has retained are not dramatically different from those characterized in other organisms. Its snRNAs, for example, are of similar length and sequence, except for U5, and appear to adopt the same conformations and interactions as in yeast and humans. Many of the retained proteins also are highly conserved. This conservation in the face of strong pressure to eliminate splicing components raises the possibility that the eliminated proteins are not involved in key catalytic or assembly events of the splicing reaction. To test this idea, we asked whether C. merolae’s splicing components are enriched in proteins that are essential for viability in other organisms, reasoning that core splicing components would be more likely to be essential. Of ∼100 splicing-related proteins in yeast, 65 are essential (i.e., a knockout of the gene is inviable in rich medium at 30 °C). Thirty-nine of the 43 core splicing proteins in C. merolae are essential in at least one organism according to the Database of Essential Genes (30), as indicated in Table 1. Of the nonessential proteins, the cap-binding complex (Sto1 and Cbc2) appears to be less important in budding yeast than in humans, where it may be essential (31). Intriguingly, Bud31, although not required for growth at 30 °C, becomes essential in yeast under heat stress (32), suggesting why it may be retained in C. merolae, which grows at up to 56 °C.
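The enrichment argument can be checked with simple proportions. The sketch below uses only the counts quoted above and treats the yeast total as exactly 100 for illustration. The two fractions are not strictly comparable, because the C. merolae count uses "essential in at least one organism" whereas the yeast count refers to yeast alone.

```python
# Counts quoted in the text; the yeast total (~100) is approximate.
yeast_essential, yeast_total = 65, 100
cm_essential, cm_core = 39, 43

yeast_fraction = yeast_essential / yeast_total
cm_fraction = cm_essential / cm_core

print(f"Yeast splicing proteins that are essential: {yeast_fraction:.0%}")
print(f"C. merolae core proteins with an essential homolog: {cm_fraction:.0%}")
```

On these numbers roughly 91% of the C. merolae core proteins have an essential homolog, compared with about 65% of yeast splicing proteins.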

Bessonov et al. (33) have reported a biochemical strategy to identify the core of the human spliceosome, in which they purified the C complex and treated it with 1 M salt to determine which components are most stably associated. Of the 54 proteins detected in this C complex salt-resistant core (CSRC), 23 were found in C. merolae (Table 1), and all had been assigned to the C complex, supporting our classification of these proteins. Indeed, these 23 proteins form nearly the entire complement of the C. merolae C complex (Fig. 5), consistent with the view that the most functionally important splicing proteins are overrepresented in this organism. (Of the three exceptions, Hsh49, Prp11, and SmE, the third may not have been detected because of its small size, and Hsh49 was barely detected in any complex; SI Materials and Methods). The remaining 31 proteins from the CSRC do not have identifiable C. merolae homologs. The 18 core proteins from C. merolae that were not found in the CSRC were primarily early splicing factors, such as Commitment Complex and U4 snRNP proteins, LSms, and components of the A and B complexes (Table 1).

Fig. 5.


Summary of C. merolae splicing proteins mapped onto the splicing reaction. Splicing steps begin with pre-mRNA and proceed clockwise through complexes A, B, Bact, and C and the postspliceosomal complex. The asterisk indicates that the U6-association status of the LSm proteins has not yet been determined. An additional 26 noncore splicing factors are not shown (Table 1).

To understand better the conservation of its spliceosomal components, we assigned C. merolae proteins to particles, steps, or other categories by placing them in the same groupings that have been determined for human splicing factors (Fig. 5) (1, 23). The most striking difference between the predicted complement of splicing factors in C. merolae and other organisms is the near absence of proteins that in yeast and humans enter and leave the spliceosome during the splicing reaction. In other words, the C. merolae spliceosome appears to have much less change in composition during splicing. For example, in the transition from the yeast B complex to Bact, 12 proteins join the spliceosome, and 35 dissociate (4). In C. merolae, however, only Yju2 is predicted to join the spliceosome. The U4 snRNP (U4 snRNA and its associated proteins, seven Sms, Snu13, and Prp3) and four other proteins (Dib1 from the U5 snRNP, Prp38 from the tri-snRNP, and Prp5 and Prp43 from the U2 snRNP) are predicted to dissociate (Fig. 5). In addition, the seven LSm proteins of the U6 snRNP, if present, probably would dissociate upon release of U4 (see discussion below and ref. 34). Similarly, in the transition from Bact to the C complex, nine proteins join and two proteins leave in yeast, whereas in C. merolae the addition of only two proteins—Fal1 and Prp22—is predicted, and only Yju2 is predicted to leave (Fig. 5). In contrast, in humans there is an exchange of more than 50 proteins at both the B-to-Bact and the Bact-to-C transitions (1, 35).
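The compositional changes described in this paragraph can be tallied directly. The sketch below encodes the proteins named in the text and in Table 1 for C. merolae and the counts quoted for yeast; the data layout is illustrative, and the seven LSm proteins are left out because their association with U6 is undetermined.

```python
# Sm protein names follow Table 1; together with Snu13 and Prp3 they make up
# the protein complement of the U4 snRNP as described in the text.
SM_PROTEINS = ["SmB/B'", "SmD1", "SmD2", "SmD3", "SmE", "SmF", "SmG"]
U4_SNRNP = SM_PROTEINS + ["Snu13", "Prp3"]

# Predicted C. merolae changes at each transition (LSm proteins excluded).
c_merolae = {
    "B -> Bact": {
        "join": ["Yju2"],
        "leave": U4_SNRNP + ["Dib1", "Prp38", "Prp5", "Prp43"],
    },
    "Bact -> C": {"join": ["Fal1", "Prp22"], "leave": ["Yju2"]},
}

# Yeast counts as quoted in the text.
yeast_counts = {
    "B -> Bact": {"join": 12, "leave": 35},
    "Bact -> C": {"join": 9, "leave": 2},
}


def exchange_summary():
    """Return (step, cm_join, cm_leave, yeast_join, yeast_leave) per transition."""
    rows = []
    for step, changes in c_merolae.items():
        yeast = yeast_counts[step]
        rows.append(
            (step, len(changes["join"]), len(changes["leave"]),
             yeast["join"], yeast["leave"])
        )
    return rows


summary = exchange_summary()
for step, cm_join, cm_leave, y_join, y_leave in summary:
    print(f"{step}: C. merolae +{cm_join}/-{cm_leave}, yeast +{y_join}/-{y_leave}")
```

Counted this way, 14 proteins are exchanged at the B-to-Bact transition in C. merolae versus 47 in yeast, and 3 versus 11 at the Bact-to-C transition.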

NTC Complex.

The NTC complex is highly conserved in all organisms studied and plays an important role in the transition from the B to Bact complex (24). Surprisingly, of 19 NTC proteins known from other organisms, only three were detected in C. merolae: Cef1/CDC5L, Prp46, and Bud31. The first two form part of the salt-resistant core of this complex in humans (36), whereas Bud31 has been classified as part of Sf3b as well as the NTC complex (37). Bud31 has been shown to be required for efficient progression to the first step of splicing (32). The paucity of NTC proteins in C. merolae is particularly unexpected, given that this complex has been implicated in a wide range of processes outside of splicing, including transcription and mRNA export (38).

DEAD Box Proteins.

Of the eight splicing-associated DExD/H-box ATPases in yeast, two are predicted to be missing in C. merolae, Prp2 and Prp28. Prp2 is required in budding yeast to displace the Sf3 complex from the branch site region of the transcript to allow the first chemical reaction of splicing (39). Prp2 is at the center of a network of interactions, directly contacting the Sf3 protein Ysf3 (39) as well as Cwc22 and Spp2. Cwc22 is essential for Prp2 function: In its absence, hydrolysis of ATP by Prp2 simply results in the release of Prp2 without promoting the first chemical step (40). Spp2 interacts directly with Prp2 and is released upon ATP hydrolysis (41). The correlated absence of all three of these proteins supports our conclusion that Prp2 is missing in C. merolae. With the absence of these proteins, the only first-step protein remaining in C. merolae is Yju2, which interacts directly with U2 snRNA (42) and is required for progression to the first chemical step of splicing (43). However, there is a weak C. merolae candidate for Prp2, CME166C. Although it is annotated as a probable Prp43 ortholog, there is a better match for Prp43 (CMM048C), so CME166C remains an unassigned helicase with features of splicing proteins. If future biochemical work demonstrates that CME166C is, in fact, orthologous to Prp2, it would suggest that Prp2’s function in dissociating proteins is incidental to another role, presumably involving RNA changes, as appears to be the case for Prp22 and Prp43 (see below).

The absence of Prp28, which is required for the exchange of U6 for U1 at the 5′ splice site (27), is not surprising given the apparent absence of U1 snRNA—another example of the correlated absence of functionally related proteins. The six remaining ATPases (Prp5, Prp16, Prp22, Prp43, Brr2, and Sub2) and one GTPase (Snu114) all have C. merolae homologs, consistent with the critical role of NTP-driven conformational changes in splicing as well as their role in increasing splicing fidelity (3).

The C. merolae LSm Complex Lacks LSm8.

It is not yet clear whether the LSm complex that we have identified in C. merolae functions in splicing. In eukaryotes, two separate LSm complexes exist: the cytoplasmic LSm1–7 complex that binds the polyadenylated 3′ end of mRNAs and is involved in mRNA degradation and the LSm2–8 complex that functions in splicing in the nucleus where it binds the 3′ end of U6 snRNA (44). Although these complexes share six of seven proteins, they differ in the presence of either LSm1 or LSm8. Our bioinformatic search for the LSm proteins suggests that we have identified the LSm1–7 complex and that an eighth LSm homolog is not present. It is possible that this complex functions in both mRNA degradation and splicing; however, it is not immediately clear how a single complex could be involved in such disparate functions. Nevertheless, we feel it is likely that these proteins associate with the U6 snRNA; otherwise, U6 would have no associated proteins, because Prp24 appears to be absent in C. merolae.

U1-Independent Splicing in C. merolae.

The apparent absence of U1 in C. merolae raises the question of how the 5′ splice site is recognized. However, there is ample precedent for U1-independent splicing, both in artificial contexts (45) and with naturally occurring transcripts (46, 47). There are several mechanisms by which U1-independent splicing can occur. For example, recent single-molecule experiments demonstrated that the pre-mRNA transcript could be recognized by U2 snRNP before U1 (48). Overexpression of SR proteins could compensate for depletion of U1 in HeLa cells (45); however, we have detected only two candidate SR proteins, CmRsp31 and CmSRSF2, in C. merolae. Extending the base-pairing interaction between U6 and the 5′ splice site also increased the splicing efficiency of U1-independent splicing (49). Notably, C. merolae U6 has extended complementarity to 5′ splice sites, with six of seven positions capable of forming standard Watson–Crick base pairs, making it comparable in strength to the canonical U1 5′ splice site interactions (Fig. S2E). In other organisms U6 forms only three base pairs with the 5′ splice site. Therefore it is plausible that C. merolae can dispense with U1 by relying only on a U6 5′ splice site interaction. Other U1-independent interactions with the 5′ splice site have been reported, for instance by Prp8 (50) and U5 snRNA (51). Intriguingly, the 5′ end of U5 snRNA (5′ GUCUGC) is complementary to all annotated C. merolae 5′ splice sites (Fig. S2E), raising the possibility that initial recognition of introns in C. merolae occurs via U5 snRNA.
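The complementarity argument above comes down to counting Watson–Crick pairs between an snRNA segment and an antiparallel target. The following is a minimal sketch of that count, not the authors' analysis. The function name and the target sequences are hypothetical and chosen only for illustration; only the U5 5′ end (GUCUGC) comes from the text. G–U wobble pairs are deliberately not counted, which is an assumption of this sketch.

```python
# Standard Watson-Crick RNA pairs only; G-U wobble pairs are excluded by design.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_watson_crick_pairs(snrna_5to3: str, target_5to3: str) -> int:
    """Count Watson-Crick pairs between two equal-length RNA segments.

    Both sequences are written 5'->3'. Because base pairing is antiparallel,
    the target is read 3'->5' (reversed) before comparing position by position.
    """
    if len(snrna_5to3) != len(target_5to3):
        raise ValueError("segments must have the same length")
    pairs = zip(snrna_5to3.upper(), reversed(target_5to3.upper()))
    return sum((a, b) in WATSON_CRICK for a, b in pairs)


# U5 5' end as given in the text; the targets below are hypothetical,
# not annotated C. merolae splice sites.
U5_5PRIME_END = "GUCUGC"
print(count_watson_crick_pairs(U5_5PRIME_END, "GCAGAC"))  # fully complementary target
print(count_watson_crick_pairs(U5_5PRIME_END, "GCAGAU"))  # one mismatched position
```

Scoring real splice sites this way would require the annotated sequences shown in Fig. S2E, which are not reproduced here.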

In addition to recognition of the 5′ splice site, the predicted absence of U1 also raises questions about other roles for U1 that could be missing in C. merolae. U1 has been shown to play a role in increasing transcription from intron-containing genes (52) or regulating alternative splicing (53). Furthermore, a number of studies have demonstrated roles for U1 outside of pre-mRNA splicing. These include regulating the use of cleavage and polyadenylation sites (54) and conferring proper polarity to bidirectional transcription start sites (55). Whether any of these processes are affected in C. merolae remains to be determined.

Commitment Complex.

One of the earliest steps in splicing is the formation of the Commitment Complex, in which the Msl5/Mud2/U2AF1 heterotrimer stabilizes U1 snRNP association with the 5′ splice site by binding to intron features and the U1 snRNP protein Prp40 (3). We found apparent C. merolae orthologs of Msl5 and Mud2, but not of U2AF1. Neuvéglise et al. (56) showed that, although U2AF1 does bind the 3′ splice site, which is conserved in C. merolae (6), it is required only for transcripts with branch site-to-3′ splice site distances less than 15 nt. This observation may explain its absence in C. merolae, in which this distance averages 29 nt. In contrast, human Mud2 (U2AF2), which binds the polypyrimidine tract, was found to be critical for excision of all introns tested (57). Because there is no polypyrimidine tract in C. merolae (6), we suggest that CmMud2 could bind the branch site directly, as has been reported in S. cerevisiae (58), which also has less prominent polypyrimidine tracts. In the absence of U1, some other mechanism may be required to ensure splicing of bona fide introns (i.e., those introns containing both a 5′ splice site and a branch site), perhaps involving a bridging interaction mediated by the extended U5 snRNA.

Biogenesis and Disassembly Factors.

Another category of splicing factors that appears to be nearly absent in C. merolae is snRNP biogenesis proteins, the factors involved in initial assembly of snRNPs. The U2 protein Cus2, Aar2 from the cytoplasmic form of U5, Snu40 from the nuclear U5, Sad1 from the tri-snRNP, and the SMN protein required for correct Sm ring assembly in humans were not detected in C. merolae (59–61). SMN may be rendered unnecessary by the absence of stem-loops 3′ of the Sm binding site in C. merolae snRNAs. C. merolae also appears to lack a homolog of Prp24, a protein involved in promoting base-pair formation between U4 and U6 snRNAs in yeast and humans (62). It is possible that the extra domain in U4 compensates in some way for the absence of Prp24, allowing U4 and U6 to form the base-paired di-snRNP without the assistance of a trans-acting factor.

Spliceosome disassembly minimally requires release of the mRNA product and the lariat intron, in which Prp22 and Prp43, respectively, have been implicated, but it also has been shown to involve protein dissociation and snRNP disassembly. Fourmann et al. (63) recently showed that Prp22 activity is associated with the disappearance of the RES complex (Bud13, Pml1, and Ist3), Cwc21, and Cwc22 from the postcatalytic spliceosome. We have not found any of these proteins in C. merolae, suggesting that Prp22-catalyzed protein dissociation is only incidental to its role in mRNA release.

Similarly, the spliceosome disassembly factors Ntr1 and Ntr2 also appear to be missing. This complex, along with the ATPase Prp43, binds to the spliceosome via interactions between Ntr2 and Brr2 and promotes disassembly of the postsplicing particle into its component parts—the U2 snRNP, U5 snRNP, U6 snRNP, and NTC—as it releases the lariat intron (64). The absence of so many assembly and disassembly factors in C. merolae raises the possibility that its spliceosome functions as a preassembled holoenzyme that does not proceed through the stepwise formation and dissociation seen in other organisms (2, 65). There is some precedent for the idea that what appears to be dissociation by one assay may represent weakened interaction, but not complete dissociation, under gentler conditions. For example, oligonucleotide-directed selection of U4 snRNA copurifies excised lariat intron, suggesting that the U4 dissociation before the chemical steps of splicing seen on gels may be caused by the stringency of the assay (66).

Noncore Proteins.

Twenty proteins classified by Agafonov et al. (1) or Hegele et al. (23) as miscellaneous splicing proteins have apparent homologs in C. merolae. A number of these proteins were identified on the basis of physical interactions rather than functional assays, leaving open the possibility that some of them are not actually involved in splicing. For example, Fal1 (eIF4A3) is an ATPase that contacts the mRNA (67) and loads exon junction complex (EJC) components Y14 and Magoh (68). It is recruited to the spliceosome by Cwc22 (69), so, given the predicted absence of the latter, it seems likely that Fal1 has been retained for roles outside of splicing. Others are kinases, phosphatases, or ubiquitin ligases that may be involved in regulating splicing or are simply RNA-binding proteins. Surprisingly, the alternative splicing factor Quaking appears to have an ortholog, even though C. merolae has only one known gene (CMR350C) with more than one intron. Sequencing of CMR350C splice junctions revealed singly spliced transcripts that may be intermediates of the fully spliced form but no exon-skipped variants that would be most indicative of alternative splicing. We have not included these proteins as part of our list of core splicing machinery, and it is likely that some of these have other cellular roles.

RNA Stability.

One prediction concerning RNAs from thermophilic organisms is that they might be more resistant to thermal denaturation, and hence more intrinsically stable, than RNAs from mesophiles. Our results (Table S2) suggest that this prediction is true to a limited extent in C. merolae, in that only about one stem in each snRNA is substantially more stable in C. merolae than the orthologous region from yeast or human snRNAs, whereas the majority of stems have comparable stabilities in these three organisms. One factor that might restrict a stem's ability to evolve greater thermal stability would be a requirement for it to unwind readily as it undergoes conformational rearrangements. Such a requirement would allow us to distinguish “structural” stems, those that are required for snRNP stability, perhaps as protein-binding sites, from “functional” stems, i.e., those that must change conformation or binding interactions during the splicing reaction. Using this framework, we suggest that U2 stems IIb and IIc, U4 stem II, U6 stem I, and U5 stems Ia and VSL are structural, and the remaining stems are predicted to be functional. The prediction of structural stems is consistent with data showing that the U6 5′ stem-loop is strongly protected from hydroxyl radicals in the snRNP particle (17). However, the observation that U2 stems IIb and IIc are more stable suggests that the reality may be more nuanced than this straightforward classification into structural and functional stems allows, because stem II is thought to toggle between two conformations during the splicing reaction.

Our primarily bioinformatic analysis suggests that most peripheral or regulatory splicing proteins have been eliminated in C. merolae, leaving a spliceosome enriched in catalytically essential components. We note that these results remain to be confirmed biochemically. This highly simplified splicing machinery should provide a powerful system in which to study key features of the pre-mRNA splicing mechanism. The apparent absence of biogenesis proteins raises the possibility that the entire complement of splicing proteins and particles might be expressed recombinantly, allowing the generation of a completely defined splicing system and providing assembled particles for biophysical studies.

Materials and Methods

For detailed methods, please refer to SI Materials and Methods.

C. merolae RNA Preparation.

The 10D strain of C. merolae (NIES-1332), obtained from the Microbial Culture Collection at the National Institute for Environmental Studies in Tsukuba, Japan (mcc.nies.go.jp/), was cultured as described (70). C. merolae cultures (50–500 mL) were harvested in log phase at an OD750 of 1.8–2.0 and were lysed by sonication in the presence of 1% SDS. RNA was extracted with acid phenol/chloroform and precipitated with EtOH. Where appropriate, total RNA was denatured by heating for 3 min at 65 °C.

C. merolae Splicing.

All 26 genes predicted to contain introns were tested for splicing via RT-PCR analysis. Total RNA was treated with DNase, and RT-PCR reactions were carried out using the appropriate primers with reverse transcriptase and Taq DNA polymerase. The negative control reaction contained no reverse transcriptase. Reaction products were run on agarose gels and visualized on a Chemi Imager (Alpha Innotec). The exon junctions from CMQ117C and CMO094C were confirmed by sequencing reverse transcriptase products, and the remaining introns were observed to splice at their expected junctions based on our sequencing of background mRNAs in the RIP-seq Illumina library (SI Materials and Methods).

Bioinformatic Analysis.

We used Infernal v1.0 (7) to search for snRNAs in the C. merolae genome. Sequences were aligned with Clustal Omega v1.1.0 (71), and U4 and U5 secondary structures were modeled using sfold v2.2 (15). RNA stem stabilities were calculated with mfold (72). Splicing protein sequences were retrieved from the National Center for Biotechnology Information (NCBI) website and were used in a Reciprocal Best Hit strategy (22) with an E value threshold up to 100 to maximize our chances of finding homologs. A C. merolae gene was considered a clear homolog if, in searches with sequences from two or more species, the top hit in the C. merolae database retrieved the initial search protein and the E value was smaller than 10⁻¹⁰ (Tables S3 and S4).
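The Reciprocal Best Hit criterion described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study. The function names, the in-memory hit tables, and the identifiers in the usage example (`yeast_X`, `cm_A`, and so on) are all hypothetical. The sketch also assumes that search results have already been parsed into (subject, E value) pairs. The permissive threshold (100) and the strict threshold (10⁻¹⁰, in two or more species) are the values stated in the text.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Set, Tuple

Hit = Tuple[str, float]        # (subject_id, e_value)
HitTable = Dict[str, List[Hit]]  # query_id -> hits for that query


def best_hit(hits: HitTable, query: str, max_evalue: float) -> Optional[Hit]:
    """Return the lowest-E-value hit for `query`, or None if none passes."""
    passing = [h for h in hits.get(query, []) if h[1] <= max_evalue]
    return min(passing, key=lambda h: h[1]) if passing else None


def reciprocal_best_hit(forward: HitTable, reverse: HitTable, query: str,
                        max_evalue: float = 100.0) -> Optional[Hit]:
    """Return (candidate, forward E value) if the candidate's own best hit
    in the reverse search is the original query; otherwise None."""
    fwd = best_hit(forward, query, max_evalue)
    if fwd is None:
        return None
    back = best_hit(reverse, fwd[0], max_evalue)
    if back is None or back[0] != query:
        return None
    return fwd


def clear_homologs(rbh_results: Iterable[Optional[Hit]],
                   strict_evalue: float = 1e-10,
                   min_species: int = 2) -> Set[str]:
    """Genes supported by reciprocal best hits below `strict_evalue`
    in at least `min_species` searches (one result per species)."""
    support: Counter = Counter()
    for result in rbh_results:
        if result is not None and result[1] < strict_evalue:
            support[result[0]] += 1
    return {gene for gene, n in support.items() if n >= min_species}


# Hypothetical example: one protein family searched from two species.
fwd_yeast = {"yeast_X": [("cm_A", 1e-40), ("cm_B", 1e-5)]}
rev_yeast = {"cm_A": [("yeast_X", 1e-38)], "cm_B": [("yeast_Z", 1e-4)]}
fwd_human = {"human_X": [("cm_A", 1e-30)]}
rev_human = {"cm_A": [("human_X", 1e-29)]}

results = [
    reciprocal_best_hit(fwd_yeast, rev_yeast, "yeast_X"),
    reciprocal_best_hit(fwd_human, rev_human, "human_X"),
]
print(clear_homologs(results))
```

Separating the permissive search from the strict classification mirrors the two thresholds in the text: weak candidates are retained for inspection, but only well-supported reciprocal hits count as clear homologs.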

snRNA Sequencing.

The 5′ and 3′ ends of three of the snRNAs were sequenced by circularizing total RNA and specifically amplifying the snRNA junctions by RT-PCR, followed by cloning into pUC19 and sequencing. The 5′ end of U6 was determined by primer extension sequencing; the 3′ end was predicted by alignment to U6 snRNAs from other organisms.

Denaturing and Native Northern Analysis.

To determine candidate snRNA expression and length, C. merolae total RNA (and S. cerevisiae total RNA as a control) was electrophoresed through a 6% denaturing polyacrylamide gel, transferred to a nylon membrane, and probed for C. merolae U2, U4, U5, and U6 snRNAs as well as all five of the S. cerevisiae snRNAs. The base-pairing status of U4 and U6 snRNAs extracted from C. merolae was analyzed by native Northern and compared with U4/U6 from S. cerevisiae.

RNA Immunoprecipitation.

Anti-TMG antibodies (200 µg/mL; K121; Santa Cruz Biotechnology) were bound to protein G Sepharose before the addition of total C. merolae RNA. RNA was eluted with proteinase K. The supernatant (flow-through) and eluates were electrophoresed on a 6% denaturing polyacrylamide gel. The gel was cut in two, and one half was transferred to a nylon membrane and probed first for U2 and U4 and then for U5 and U6 snRNAs. The other half of the gel was silver stained. The remainder of the eluted RNA was used for sequencing.

Illumina Sequencing.

Two independent samples of anti-TMG isolated RNA were used as input for two TruSeq RNA library preparations (Illumina). Reads first were mapped to C. merolae rDNA sequences to filter out rDNA contamination and then were mapped to the C. merolae genome. A pileup file was created using the SAMtools 0.1.18 mpileup option (73), and a custom Python script was used to identify contiguous stretches of expressed regions with greater than 500× coverage that did not overlap annotated genes. This process resulted in 82 sequences from the first experiment and 87 from the second, with 58 (∼70%) common to the two experiments. These sequences were analyzed for features of U1 snRNA.
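The custom script itself is not shown in the text. The sketch below is one plausible way to implement the step described, with all function names hypothetical. It assumes the standard tab-separated mpileup columns (chromosome, 1-based position, reference base, read depth, read bases, base qualities), input sorted by chromosome and position, and gene annotations supplied as (chromosome, start, end) tuples with inclusive coordinates.

```python
from typing import Iterable, List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), 1-based inclusive


def parse_pileup_line(line: str) -> Tuple[str, int, int]:
    """Extract (chromosome, position, depth) from one mpileup text line."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), int(fields[3])


def high_coverage_regions(rows: Iterable[Tuple[str, int, int]],
                          min_depth: int = 500) -> List[Region]:
    """Merge adjacent positions with depth > min_depth into regions.

    Positions missing from the input are treated as breaks, so a gap in
    the pileup ends the current region.
    """
    regions: List[Region] = []
    current = None  # [chrom, start, end] of the region being extended
    for chrom, pos, depth in rows:
        if depth > min_depth:
            if current and current[0] == chrom and pos == current[2] + 1:
                current[2] = pos
            else:
                if current:
                    regions.append((current[0], current[1], current[2]))
                current = [chrom, pos, pos]
        elif current:
            regions.append((current[0], current[1], current[2]))
            current = None
    if current:
        regions.append((current[0], current[1], current[2]))
    return regions


def overlaps(a: Region, b: Region) -> bool:
    """True if two inclusive intervals on the same chromosome intersect."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def unannotated(regions: Iterable[Region], genes: List[Region]) -> List[Region]:
    """Keep only regions that do not overlap any annotated gene."""
    return [r for r in regions if not any(overlaps(r, g) for g in genes)]


# Hypothetical toy data: (chromosome, position, depth).
rows = [("chr1", 1, 600), ("chr1", 2, 700), ("chr1", 3, 100),
        ("chr1", 4, 800), ("chr1", 6, 900), ("chr2", 1, 501),
        ("chr2", 2, 500)]
genes = [("chr1", 2, 5)]
print(unannotated(high_coverage_regions(rows), genes))
```

The candidate regions returned by a filter of this kind would then be examined for snRNA features, as described for the U1 search.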

Supplementary Material

Supplementary File
pnas.201416879SI.pdf (734.7KB, pdf)
Supplementary File
pnas.1416879112.st01.pdf (32.9KB, pdf)
Supplementary File
pnas.1416879112.st02.pdf (58.5KB, pdf)
Supplementary File
pnas.1416879112.st04.pdf (69.6KB, pdf)
Supplementary File
pnas.1416879112.st03.pdf (297.7KB, pdf)
Supplementary File
pnas.1416879112.st05.pdf (108.1KB, pdf)
Supplementary File
pnas.1416879112.st06.pdf (82.6KB, pdf)

Acknowledgments

We thank Camellia Presley, Sepehr Masoumi-Alamouti, and Everett Versteeg for early bioinformatic work on this project; Cassandra Fayowski for assistance with C. merolae splicing analysis; Alex Aravind (University of Northern British Columbia, UNBC) for computational resources and assistance; and Andrew MacMillan for encouragement and thoughtful comments on the manuscript. This work was supported by Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant 298521 and UNBC Office of Research awards (to S.D.R.) and by an NSERC Postgraduate Scholarship award (to E.A.D.). N.M.F.’s research is supported by an NSERC Discovery Grant and a grant to the Centre for Microbial Diversity and Evolution (CMDE) from the Tula Foundation. C.J.G. was supported by an NSERC doctoral scholarship and a CMDE postdoctoral fellowship from the Tula Foundation.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1416879112/-/DCSupplemental.

References

  • 1.Agafonov DE, et al. Semiquantitative proteomic analysis of the human spliceosome via a novel two-dimensional gel electrophoresis method. Mol Cell Biol. 2011;31(13):2667–2682. doi: 10.1128/MCB.05266-11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Cheng SC, Abelson J. Spliceosome assembly in yeast. Genes Dev. 1987;1(9):1014–1027. doi: 10.1101/gad.1.9.1014. [DOI] [PubMed] [Google Scholar]
  • 3.Dunn EA, Rader SD. Fungal RNA Biology. Springer, Cham, Switzerland; 2014. Pre-mRNA splicing and the spliceosome: Assembly, catalysis, and fidelity. [Google Scholar]
  • 4.Fabrizio P, et al. The evolutionarily conserved core design of the catalytic activation step of the yeast spliceosome. Mol Cell. 2009;36(4):593–608. doi: 10.1016/j.molcel.2009.09.040. [DOI] [PubMed] [Google Scholar]
  • 5.Hossain MA, Johnson TL. Using yeast genetics to study splicing mechanisms. Methods Mol Biol. 2014;1126:285–298. doi: 10.1007/978-1-62703-980-2_21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Matsuzaki M, et al. Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature. 2004;428(6983):653–657. doi: 10.1038/nature02398. [DOI] [PubMed] [Google Scholar]
  • 7.Nawrocki EP, Kolbe DL, Eddy SR. Infernal 1.0: Inference of RNA alignments. Bioinformatics. 2009;25(10):1335–1337. doi: 10.1093/bioinformatics/btp157. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Brow DA, Guthrie C. Spliceosomal RNA U6 is remarkably conserved from yeast to mammals. Nature. 1988;334(6179):213–218. doi: 10.1038/334213a0. [DOI] [PubMed] [Google Scholar]
  • 9.Newman AJ, Norman C. U5 snRNA interacts with exon sequences at 5′ and 3′ splice sites. Cell. 1992;68(4):743–754. doi: 10.1016/0092-8674(92)90149-7. [DOI] [PubMed] [Google Scholar]
  • 10.Brow DA, Vidaver RM. An element in human U6 RNA destabilizes the U4/U6 spliceosomal RNA complex. RNA. 1995;1(2):122–131. [PMC free article] [PubMed] [Google Scholar]
  • 11.Mougin A, Gottschalk A, Fabrizio P, Lührmann R, Branlant C. Direct probing of RNA structure and RNA-protein interactions in purified HeLa cell’s and yeast spliceosomal U4/U6.U5 tri-snRNP particles. J Mol Biol. 2002;317(5):631–649. doi: 10.1006/jmbi.2002.5451. [DOI] [PubMed] [Google Scholar]
  • 12.Reddy R, Henning D, Epstein P, Busch H. Primary and secondary structure of U2 snRNA. Nucleic Acids Res. 1981;9(21):5645–5658. doi: 10.1093/nar/9.21.5645. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Hilliker AK, Mefford MA, Staley JP. U2 toggles iteratively between the stem IIa and stem IIc conformations to promote pre-mRNA splicing. Genes Dev. 2007;21(7):821–834. doi: 10.1101/gad.1536107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Dix I, Russell CS, O’Keefe RT, Newman AJ, Beggs JD. Protein-RNA interactions in the U5 snRNP of Saccharomyces cerevisiae. RNA. 1998;4(10):1239–1250. doi: 10.1017/s1355838298981109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Ding Y, Chan CY, Lawrence CE. Sfold web server for statistical folding and rational design of nucleic acids. Nucleic Acids Res. 2004;32(web server issue):W135-41. doi: 10.1093/nar/gkh449. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Rhode BM, Hartmuth K, Westhof E, Lührmann R. Proximity of conserved U6 and U2 snRNA elements to the 5′ splice site region in activated spliceosomes. EMBO J. 2006;25(11):2475–2486. doi: 10.1038/sj.emboj.7601134. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Karaduman R, Fabrizio P, Hartmuth K, Urlaub H, Lührmann R. RNA structure and RNA-protein interactions in purified yeast U6 snRNPs. J Mol Biol. 2006;356(5):1248–1262. doi: 10.1016/j.jmb.2005.12.013. [DOI] [PubMed] [Google Scholar]
  • 18.Madhani HD, Guthrie C. A novel base-pairing interaction between U2 and U6 snRNAs suggests a mechanism for the catalytic activation of the spliceosome. Cell. 1992;71(5):803–817. doi: 10.1016/0092-8674(92)90556-r. [DOI] [PubMed] [Google Scholar]
  • 19.Sun JS, Manley JL. A novel U2-U6 snRNA structure is necessary for mammalian mRNA splicing. Genes Dev. 1995;9(7):843–854. doi: 10.1101/gad.9.7.843. [DOI] [PubMed] [Google Scholar]
  • 20.Fang XW, et al. The thermodynamic origin of the stability of a thermophilic ribozyme. Proc Natl Acad Sci USA. 2001;98(8):4355–4360. doi: 10.1073/pnas.071050698. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2. [DOI] [PubMed] [Google Scholar]
  • 22.Ward N, Moreno-Hagelsieb G. Quickly finding orthologs as reciprocal best hits with BLAT, LAST, and UBLAST: How much do we miss? PLoS ONE. 2014;9(7):e101850. doi: 10.1371/journal.pone.0101850. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Hegele A, et al. Dynamic protein-protein interaction wiring of the human spliceosome. Mol Cell. 2012;45(4):567–580. doi: 10.1016/j.molcel.2011.12.034. [DOI] [PubMed] [Google Scholar]
  • 24.Chen H-C, Cheng S-C. Functional roles of protein splicing factors. Biosci Rep. 2012;32(4):345–359. doi: 10.1042/BSR20120007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Tharun S, et al. Yeast Sm-like proteins function in mRNA decapping and decay. Nature. 2000;404(6777):515–518. doi: 10.1038/35006676. [DOI] [PubMed] [Google Scholar]
  • 26.Tarn WY, Steitz JA. A novel spliceosome containing U11, U12, and U5 snRNPs excises a minor class (AT-AC) intron in vitro. Cell. 1996;84(5):801–811. doi: 10.1016/s0092-8674(00)81057-0. [DOI] [PubMed] [Google Scholar]
  • 27.Staley JP, Guthrie C. An RNA switch at the 5′ splice site requires ATP and the DEAD box protein Prp28p. Mol Cell. 1999;3(1):55–64. doi: 10.1016/s1097-2765(00)80174-4. [DOI] [PubMed] [Google Scholar]
  • 28.Hamm J, Kazmaier M, Mattaj IW. In vitro assembly of U1 snRNPs. EMBO J. 1987;6(11):3479–3485. doi: 10.1002/j.1460-2075.1987.tb02672.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Burge SW, et al. Rfam 11.0: 10 years of RNA families. Nucleic Acids Res. 2013;41(Database issue):D226–D232. doi: 10.1093/nar/gks1005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Zhang R, Ou H-Y, Zhang C-T. DEG: A database of essential genes. Nucleic Acids Res. 2004;32(Database issue):D271–D272. doi: 10.1093/nar/gkh024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Gonatopoulos-Pournatzis T, Cowling VH. Cap-binding complex (CBC) Biochem J. 2014;457(2):231–242. doi: 10.1042/BJ20131214. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Saha D, Khandelia P, O’Keefe RT, Vijayraghavan U. Saccharomyces cerevisiae NineTeen complex (NTC)-associated factor Bud31/Ycr063w assembles on precatalytic spliceosomes and improves first and second step pre-mRNA splicing efficiency. J Biol Chem. 2012;287(8):5390–5399. doi: 10.1074/jbc.M111.298547. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Bessonov S, Anokhina M, Will CL, Urlaub H, Lührmann R. Isolation of an active step I spliceosome and composition of its RNP core. Nature. 2008;452(7189):846–850. doi: 10.1038/nature06842. [DOI] [PubMed] [Google Scholar]
  • 34.Chan S-P, Kao D-I, Tsai W-Y, Cheng S-C. The Prp19p-associated complex in spliceosome activation. Science. 2003;302(5643):279–282. doi: 10.1126/science.1086602. [DOI] [PubMed] [Google Scholar]
  • 35.Bessonov S, et al. Characterization of purified human Bact spliceosomal complexes reveals compositional and morphological changes during spliceosome activation and first step catalysis. RNA. 2010;16(12):2384–2403. doi: 10.1261/rna.2456210. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Grote M, et al. Molecular architecture of the human Prp19/CDC5L complex. Mol Cell Biol. 2010;30(9):2105–2119. doi: 10.1128/MCB.01505-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Wang Q, He J, Lynn B, Rymond BC. Interactions of the yeast SF3b splicing factor. Mol Cell Biol. 2005;25(24):10745–10754. doi: 10.1128/MCB.25.24.10745-10754.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Chanarat S, Sträßer K. Splicing and beyond: The many faces of the Prp19 complex. Biochim Biophys Acta. 2013;1833(10):2126–2134. doi: 10.1016/j.bbamcr.2013.05.023. [DOI] [PubMed] [Google Scholar]
  • 39.Lardelli RM, Thompson JX, Yates JR, 3rd, Stevens SW. Release of SF3 from the intron branchpoint activates the first step of pre-mRNA splicing. RNA. 2010;16(3):516–528. doi: 10.1261/rna.2030510. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Yeh T-C, et al. Splicing factor Cwc22 is required for the function of Prp2 and for the spliceosome to escape from a futile pathway. Mol Cell Biol. 2011;31(1):43–53. doi: 10.1128/MCB.00801-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Roy J, Kim K, Maddock JR, Anthony JG, Woolford JL., Jr The final stages of spliceosome maturation require Spp2p that can interact with the DEAH box protein Prp2p and promote step 1 of splicing. RNA. 1995;1(4):375–390. [PMC free article] [PubMed] [Google Scholar]
  • 42.Chiang T-W, Cheng S-C. A weak spliceosome-binding domain of Yju2 functions in the first step and bypasses Prp16 in the second step of splicing. Mol Cell Biol. 2013;33(9):1746–1755. doi: 10.1128/MCB.00035-13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Liu Y-C, Chen H-C, Wu N-Y, Cheng S-C. A novel splicing factor, Yju2, is associated with NTC and acts after Prp2 in promoting the first catalytic reaction of pre-mRNA splicing. Mol Cell Biol. 2007;27(15):5403–5413. doi: 10.1128/MCB.00346-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Achsel T, et al. A doughnut-shaped heteromer of human Sm-like proteins binds to the 3′-end of U6 snRNA, thereby facilitating U4/U6 duplex formation in vitro. EMBO J. 1999;18(20):5789–5802. doi: 10.1093/emboj/18.20.5789. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Crispino JD, Blencowe BJ, Sharp PA. Complementation by SR proteins of pre-mRNA splicing reactions depleted of U1 snRNP. Science. 1994;265(5180):1866–1869. doi: 10.1126/science.8091213. [DOI] [PubMed] [Google Scholar]
  • 46.Crispino JD, Mermoud JE, Lamond AI, Sharp PA. Cis-acting elements distinct from the 5′ splice site promote U1-independent pre-mRNA splicing. RNA. 1996;2(7):664–673. [PMC free article] [PubMed] [Google Scholar]
  • 47.Fukumura K, Taniguchi I, Sakamoto H, Ohno M, Inoue K. U1-independent pre-mRNA splicing contributes to the regulation of alternative splicing. Nucleic Acids Res. 2009;37(6):1907–1914. doi: 10.1093/nar/gkp050. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Shcherbakova I, et al. Alternative spliceosome assembly pathways revealed by single-molecule fluorescence microscopy. Cell Reports. 2013;5(1):151–165. doi: 10.1016/j.celrep.2013.08.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Crispino JD, Sharp PA. A U6 snRNA:pre-mRNA interaction can be rate-limiting for U1-independent splicing. Genes Dev. 1995;9(18):2314–2323. doi: 10.1101/gad.9.18.2314. [DOI] [PubMed] [Google Scholar]
  • 50.Maroney PA, Romfo CM, Nilsen TW. Functional recognition of 5′ splice site by U4/U6.U5 tri-snRNP defines a novel ATP-dependent step in early spliceosome assembly. Mol Cell. 2000;6(2):317–328. doi: 10.1016/s1097-2765(00)00032-0. [DOI] [PubMed] [Google Scholar]
  • 51.Wyatt JR, Sontheimer EJ, Steitz JA. Site-specific cross-linking of mammalian U5 snRNP to the 5′ splice site before the first step of pre-mRNA splicing. Genes Dev. 1992;6(12B):2542–2553. doi: 10.1101/gad.6.12b.2542. [DOI] [PubMed] [Google Scholar]
  • 52.Kwek KY, et al. U1 snRNA associates with TFIIH and regulates transcriptional initiation. Nat Struct Biol. 2002;9(11):800–805. doi: 10.1038/nsb862. [DOI] [PubMed] [Google Scholar]
  • 53.Fukumura K, Inoue K. Role and mechanism of U1-independent pre-mRNA splicing in the regulation of alternative splicing. RNA Biol. 2009;6(4):395–398. doi: 10.4161/rna.6.4.9318. [DOI] [PubMed] [Google Scholar]
  • 54.Kaida D, et al. U1 snRNP protects pre-mRNAs from premature cleavage and polyadenylation. Nature. 2010;468(7324):664–668. doi: 10.1038/nature09479. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Almada AE, Wu X, Kriz AJ, Burge CB, Sharp PA. Promoter directionality is controlled by U1 snRNP and polyadenylation signals. Nature. 2013;499(7458):360–363. doi: 10.1038/nature12349. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Neuvéglise C, Marck C, Gaillardin C. The intronome of budding yeasts. C R Biol. 2011;334(8-9):662–670. doi: 10.1016/j.crvi.2011.05.015. [DOI] [PubMed] [Google Scholar]
  • 57.Guth S, Martínez C, Gaur RK, Valcárcel J. Evidence for substrate-specific requirement of the splicing factor U2AF(35) and for its function after polypyrimidine tract recognition by U2AF(65) Mol Cell Biol. 1999;19(12):8263–8271. doi: 10.1128/mcb.19.12.8263. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Abovich N, Liao XC, Rosbash M. The yeast MUD2 protein: An interaction with PRP11 defines a bridge between commitment complexes and U2 snRNP addition. Genes Dev. 1994;8(7):843–854. doi: 10.1101/gad.8.7.843.
  • 59.Yan D, et al. CUS2, a yeast homolog of human Tat-SF1, rescues function of misfolded U2 through an unusual RNA recognition motif. Mol Cell Biol. 1998;18(9):5000–5009. doi: 10.1128/mcb.18.9.5000.
  • 60.Weber G, et al. Structural basis for dual roles of Aar2p in U5 snRNP assembly. Genes Dev. 2013;27(5):525–540. doi: 10.1101/gad.213207.113.
  • 61.Battle DJ, et al. The Gemin5 protein of the SMN complex identifies snRNAs. Mol Cell. 2006;23(2):273–279. doi: 10.1016/j.molcel.2006.05.036.
  • 62.Raghunathan PL, Guthrie C. A spliceosomal recycling factor that reanneals U4 and U6 small nuclear ribonucleoprotein particles. Science. 1998;279(5352):857–860. doi: 10.1126/science.279.5352.857.
  • 63.Fourmann J-B, et al. Dissection of the factor requirements for spliceosome disassembly and the elucidation of its dissociation products using a purified splicing system. Genes Dev. 2013;27(4):413–428. doi: 10.1101/gad.207779.112.
  • 64.Tsai R-T, et al. Spliceosome disassembly catalyzed by Prp43 and its associated components Ntr1 and Ntr2. Genes Dev. 2005;19(24):2991–3003. doi: 10.1101/gad.1377405.
  • 65.Stevens SW, et al. Composition and functional characterization of the yeast spliceosomal penta-snRNP. Mol Cell. 2002;9(1):31–44. doi: 10.1016/s1097-2765(02)00436-7.
  • 66.Blencowe BJ, Sproat BS, Ryder U, Barabino S, Lamond AI. Antisense probing of the human U4/U6 snRNP with biotinylated 2′-OMe RNA oligonucleotides. Cell. 1989;59(3):531–539. doi: 10.1016/0092-8674(89)90036-6.
  • 67.Shibuya T, Tange TØ, Sonenberg N, Moore MJ. eIF4AIII binds spliced mRNA in the exon junction complex and is essential for nonsense-mediated decay. Nat Struct Mol Biol. 2004;11(4):346–351. doi: 10.1038/nsmb750.
  • 68.Zhang Z, Krainer AR. Splicing remodels messenger ribonucleoprotein architecture via eIF4A3-dependent and -independent recruitment of exon junction complex components. Proc Natl Acad Sci USA. 2007;104(28):11574–11579. doi: 10.1073/pnas.0704946104.
  • 69.Barbosa I, et al. Human CWC22 escorts the helicase eIF4AIII to spliceosomes and promotes exon junction complex assembly. Nat Struct Mol Biol. 2012;19(10):983–990. doi: 10.1038/nsmb.2380.
  • 70.Minoda A, Sakagami R, Yagisawa F, Kuroiwa T, Tanaka K. Improvement of culture conditions and evidence for nuclear transformation by homologous recombination in a red alga, Cyanidioschyzon merolae 10D. Plant Cell Physiol. 2004;45(6):667–671. doi: 10.1093/pcp/pch087.
  • 71.Sievers F, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7:539. doi: 10.1038/msb.2011.75.
  • 72.Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 2003;31(13):3406–3415. doi: 10.1093/nar/gkg595.
  • 73.Li H, et al.; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–2079. doi: 10.1093/bioinformatics/btp352.
  • 74.Brennwald P, Porter G, Wise JA. U2 small nuclear RNA is remarkably conserved between Schizosaccharomyces pombe and mammals. Mol Cell Biol. 1988;8(12):5575–5580. doi: 10.1128/mcb.8.12.5575.
  • 75.Myslinski E, Branlant C. A phylogenetic study of U4 snRNA reveals the existence of an evolutionarily conserved secondary structure corresponding to ‘free’ U4 snRNA. Biochimie. 1991;73(1):17–28. doi: 10.1016/0300-9084(91)90069-d.
  • 76.Frank DN, Roiha H, Guthrie C. Architecture of the U5 small nuclear RNA. Mol Cell Biol. 1994;14(3):2180–2190. doi: 10.1128/mcb.14.3.2180.

Associated Data

Supplementary Materials

Supplementary File
pnas.201416879SI.pdf (734.7KB, pdf)
Supplementary File
pnas.1416879112.st01.pdf (32.9KB, pdf)
Supplementary File
pnas.1416879112.st02.pdf (58.5KB, pdf)
Supplementary File
pnas.1416879112.st04.pdf (69.6KB, pdf)
Supplementary File
pnas.1416879112.st03.pdf (297.7KB, pdf)
Supplementary File
pnas.1416879112.st05.pdf (108.1KB, pdf)
Supplementary File
pnas.1416879112.st06.pdf (82.6KB, pdf)

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
