Molecular and Cellular Biology. 2012 Mar;32(6):1112–1123. doi: 10.1128/MCB.06511-11

snRNA 3′ End Formation Requires Heterodimeric Association of Integrator Subunits

Todd R Albrecht, Eric J Wagner
PMCID: PMC3295014  PMID: 22252320

Abstract

The Integrator Complex is a group of proteins responsible for the endonucleolytic cleavage of primary small nuclear RNA (snRNA) transcripts within the nucleus. Integrator subunits 9 and 11 (IntS9/11) are thought to contain the catalytic activity based on their high sequence similarity to CPSF100 and CPSF73, which have been shown to be components of both the poly(A)+ and histone pre-mRNA cleavage complex. Here we demonstrate that the specific heterodimeric interaction between IntS9 and IntS11 is mediated by a discrete domain present at the extreme C terminus of IntS9 and within the C terminus of IntS11, adjacent to the predicted active site of this endonuclease. This domain is highly conserved within IntS11 but conspicuously absent in CPSF73. Using a cell-based complementation assay that measures Integrator activity, we determined that the IntS9 interaction domain within IntS11 is required for its ability to restore snRNA 3′ end processing after RNA interference (RNAi)-mediated depletion of IntS11. Moreover, overexpression of these interaction domains alone elicits snRNA misprocessing through a dominant-negative titration of endogenous Integrator subunits. These data collectively explain the mechanism by which the IntS11/9 and, by analogy, the CPSF73/100 heterodimeric cleavage factors distinguish themselves from each other and demonstrate that the heterodimeric interaction is functionally required for snRNA 3′ end formation.

INTRODUCTION

The 3′ end formation of RNA polymerase II (RNAPII)-transcribed RNA is an essential component of their biosynthesis (11, 15). These 3′ end formation reactions can be broken down into three categories based on the presence of particular cis regulatory elements: poly(A)+ mRNA, histone mRNA, and small nuclear RNA (snRNA). Significant advances have been made in understanding the endonucleolytic processing of both poly(A)+ mRNA and the replication-dependent histone mRNA (reviewed in references 21 and 23). Despite significant differences in the cis elements governing each of their 3′ end formation reactions, both poly(A)+ and histone pre-mRNA are cleaved by the cleavage and polyadenylation specificity factor of 73 kDa (CPSF73) (8, 22, 30, 34). CPSF73 is an RNA endonuclease and a member of a group of the zinc-dependent hydrolases called the metallo-β-lactamase (MBL) family (reviewed in references 3 and 7). Other members of this family with notably similar activity include RNase Z, which removes several nucleotides from the 3′ end of tRNA prior to the addition of the terminal CCA, and RNase J, which performs an “exosome-like” function in bacteria by degrading targeted mRNA (5, 20).

Members of the MBL family generally contain 5 signature motifs (motifs 1 to 5) involved in the coordination of two zinc ions, which are required for the hydrolytic cleavage of substrates that typically contain an ester linkage with a nearby positive charge (1). The general architecture of MBL enzymes contains an αβ/βα sandwich fold responsible for positioning motifs 1 to 5 in proximity in order to form the active site of the enzyme. CPSF73 and other RNA/DNA endonucleases within the MBL family, however, replace motif 5 with three additional motifs (motifs A to C) and insert a large sequence within the MBL domain called the β-CASP (CPSF, Artemis, SNM1/PSO2) motif (reviewed in reference 3) (Fig. 1). The crystal structure of CPSF73 definitively identified it as an endonuclease and determined that the centrally positioned β-CASP motif forms a separate domain that is positioned above the MBL domain that is itself comprised of both the N- and C-terminal regions of the protein (22). Also evident with respect to this structure is the unresolved observation that the active site resides in a region of the enzyme that appears unlikely to be accessible to its substrate. This has led to the hypothesis that members of the β-CASP family may require conformational changes to activate catalysis (22). Subsequently solved crystal structures of archaeal CPSF proteins have demonstrated a remarkable degree of similarity at the structural level, suggesting evolutionary conservation in this particular mechanism of RNA cleavage (25, 28, 31).

Fig 1.


Similarity between CPSF73 and IntS11. The schematic at the top represents the canonical organization of a β-CASP protein with the β-CASP domain inserted between the N-terminal and C-terminal MBL domains. The seven conserved motifs (motifs 1 to 4 and A to C) that are involved in the coordination of zinc and enzymatic catalysis are highlighted in red (CPSF73) and blue (IntS11). The amino acids that constitute these motifs are represented for CPSF73 and are identical in IntS11. Amino acids in black are identical in the two proteins, blue amino acids represent residues critical for function, and red amino acids distinguish those that are different in IntS11.

CPSF73 interacts with another cleavage and polyadenylation specificity factor, CPSF of 100 kDa (CPSF100), and the two are thought to exist as a heterodimer that is specifically recruited to pre-mRNA substrates to elicit processing (9). CPSF100 is also a member of the β-CASP family, and the crystal structure of the yeast orthologue of CPSF100 (Ydh1) demonstrates a fold that is similar overall to that of human CPSF73 despite a divergent primary sequence (22). An important distinguishing factor, however, is the absence of key catalytic residues within motif 2 of the MBL domain and motif B adjacent to the β-CASP domain. Consistent with these changes is the absence of detectable zinc ions in the CPSF100 crystal structure, which presumably renders CPSF100 catalytically inactive. The functional implication of an inactive CPSF100 is unknown, but others have observed that the overall MBL/β-CASP domain integrity is important in mediating its interaction with CPSF73 in conjunction with the large scaffold protein Symplekin (17, 18). Together, Symplekin, CPSF73, and CPSF100 likely comprise the core cleavage factor that cleaves poly(A)+ pre-mRNA as well as histone pre-mRNA. Despite this commonality, its mode of recruitment must be unique in each circumstance (18, 32). In support of this model, others have shown that the cleavage factor associates with the U7 snRNP involved in histone pre-mRNA processing or with CPSF160, which is responsible for the recognition of the polyadenylation sequence (PAS) present in poly(A)+ pre-mRNA (32).

The Integrator Complex represents a parallel, yet distinct set of proteins that must achieve the same ultimate outcome of 3′ end formation but instead function to cleave the RNAPII-transcribed uridine-rich small nuclear RNA (U snRNA) (reviewed in reference 4). The complex contains at least 12 subunits and was initially identified based on an affinity for the C-terminal domain (CTD) of Rpb1, the largest subunit of RNAPII (2). Integrator subunits have a high affinity for the CTD phosphorylated at the serine 2 and 7 positions (12), and the loss of serine 7 phosphorylation disrupts snRNA 3′ end formation in cells (10). Subsequent systematic functional analysis of the Drosophila Integrator complex using RNA interference (RNAi) revealed that nearly all of its members are required for the 3′ end processing of primary snRNA (pri-snRNA), with four subunits in particular, IntS1, -4, -9, and -11, playing a significant role in this process (13). Intriguingly, nearly all of the subunits of the Integrator complex lack domains that would suggest a role in 3′ end processing of RNA, with the exception of IntS9 and IntS11 (9). These two factors are both members of the β-CASP family and are similar to CPSF100 and CPSF73, respectively (9). IntS11 shows its highest degree of conservation with CPSF73 over much of the N-terminal MBL and β-CASP domains (Fig. 1). IntS9 appears related to CPSF100 in that both proteins are likely inactivated through changes in key catalytic residues; however, it is noteworthy that IntS9 is significantly smaller than CPSF100, with a predicted molecular mass of 74 kDa. Initial analysis of IntS11 performed using a yeast two-hybrid assay found that a region within the C terminus of IntS11 is sufficient for interaction with IntS9 and that, despite its homology, it does not demonstrate any interaction with CPSF100 (9). Beyond this analysis, little is known about how IntS11 specifically associates with IntS9 rather than CPSF100. Moreover, the functional repercussions of this interaction for snRNA 3′ end formation have not been investigated.

Here we perform structural and functional analysis of the two subunits of the Integrator cleavage factor, IntS9 and IntS11. We find that the binding site within IntS11 for IntS9 comprises sequences downstream of the β-CASP domain, including the C motif, through the remainder of the C-terminal MBL domain. The critical amino acids required for binding within this region are highly conserved within IntS11 subunits of various species and are absent within the analogous region of CPSF73. Using a cell-based complementation assay for Integrator activity, we found that the interaction between IntS9 and IntS11 is critical for snRNA 3′ end formation. These data show that the interaction of IntS11 with IntS9 is required for 3′ end formation of snRNAs and suggest that heterodimerization of IntS9 and IntS11 may activate the endonuclease activity of IntS11 and/or is required for the recruitment of IntS11 to the cleavage site within primary snRNA transcripts.

MATERIALS AND METHODS

Cell culture, transfections, and RNAi.

HeLa and HEK293T cells were cultured using Dulbecco's modified Eagle medium (DMEM) (Invitrogen, Carlsbad, CA) supplemented with 10% fetal bovine serum (FBS; Phenix Research, Chandler, NC) and penicillin-streptomycin (pen/strep) (Invitrogen). Transient transfections were performed using HEK293T cells and Lipofectamine 2000 (Invitrogen), essentially following the manufacturer's protocols. Briefly, 5 × 10⁵ HEK293T cells were seeded in a six-well plate (Greiner Bio-One, Germany) 1 day prior to transfection. The following day, 500 ng of each plasmid encoding hemagglutinin (HA)-tagged and myc-tagged proteins was mixed in 100 μl of Opti-MEM (Invitrogen) without serum. A second tube with 100 μl of Opti-MEM was mixed with 2 μl of Lipofectamine 2000 and allowed to incubate at room temperature for 7 min. The contents of the two tubes were mixed and allowed to incubate for 25 min at room temperature. The entire 200-μl mixture was directly added to the cells that had been plated in 1 ml of growth media containing 10% FBS. Media were changed 24 h later depending on cell fitness, and cells were ultimately harvested 48 h after the initial transfection.

RNAi experiments were performed only in HeLa cells by the use of a modified protocol derived from the manufacturer's instructions for Lipofectamine 2000. Briefly, we placed into tube A 3 μl of 20 μM siRNA mixed into 47 μl of Opti-MEM (Invitrogen). Tube B contained 12 μl of Opti-MEM and 3 μl of Lipofectamine 2000. Both tubes were allowed to incubate at room temperature for 7 min, and the contents were mixed and allowed to further incubate at room temperature (RT) for an additional 25 min. Following this incubation, 38 μl of Opti-MEM was added to each tube and 100 μl of the mixture was pipetted on the cells. Cells were initially plated in 24-well plates at a density of 8.5 × 10⁴ cells per well. The day after the RNAi treatment, cells were transferred one-to-one into a 6-well plate containing 2 ml total of standard 10% FBS–DMEM growth media. The following day, cells were transfected again with siRNA under the conditions described above. Two days following the second siRNA transfection, cells were harvested for Western blot analysis.
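The reagent arithmetic in this protocol can be checked with a short calculation. The sketch below is illustrative only: the function and variable names are ours rather than the authors', and it assumes that the 38-μl Opti-MEM addition is made once to the combined 65-μl mixture, a point on which the text is ambiguous.

```python
def picomoles(volume_ul, concentration_um):
    """Return the amount in pmol (1 uM equals 1 pmol per ul)."""
    return volume_ul * concentration_um


sirna_pmol = picomoles(3, 20)        # siRNA placed in tube A
tube_a_ul = 3 + 47                   # siRNA + Opti-MEM
tube_b_ul = 12 + 3                   # Opti-MEM + Lipofectamine 2000
combined_ul = tube_a_ul + tube_b_ul  # after mixing tubes A and B
final_ul = combined_ul + 38          # ASSUMPTION: 38 ul added once to the combined mix
applied_ul = 100                     # volume pipetted onto the cells
delivered_pmol = sirna_pmol * applied_ul / final_ul

print(f"{sirna_pmol} pmol siRNA in {final_ul} ul; "
      f"about {delivered_pmol:.1f} pmol applied per well")
```

Under that assumption, nearly all of the siRNA in tube A is applied to the well. The final siRNA concentration in the culture cannot be derived here because the medium volume in the 24-well plate is not stated.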

For RNAi experiments using the U7-Reporter and RNAi-resistant rescue plasmids, the RNAi protocol described above was followed for the first siRNA transfection and the transfer from 24-well plates into 6-well plates but was then modified. The second siRNA transfection was done in conjunction with a standard plasmid transfection. In tube A, 3 μl of 20 μM siRNA was mixed with 500 ng of U7-green fluorescent protein (U7-GFP) reporter as well as 500 ng of HA-tagged rescue plasmid or HA empty vector, all of which was mixed with 100 μl of Opti-MEM. A 3-μl volume of Lipofectamine 2000 was mixed with Opti-MEM in tube B, and both tubes were allowed to incubate for 7 min at room temperature. The contents of the two tubes were then combined and allowed to further incubate at room temperature, and then 200 μl was pipetted directly into the wells of the 6-well plates containing the cells that had been treated once prior with siRNA. Cells were harvested 2 days after transfection and subjected to Western blot analysis.

Western blot analysis and coimmunoprecipitations.

For all Western blot analyses performed in this study, between 5 and 20 μg of protein lysate was mixed 1:1 with 2× sodium dodecyl sulfate (SDS) loading buffer (100 mM Tris-HCl [pH 6.8], 4% SDS, 10% glycerol, 200 mM dithiothreitol [DTT]) and boiled for 5 min. Samples were resolved using 12.5% SDS-polyacrylamide gel electrophoresis (SDS-PAGE) until the dye front had run off the gel. Gels were then transferred overnight at 40 V onto polyvinylidene fluoride (PVDF) (Millipore, Billerica, MA). PVDF membranes were then blocked for 30 min in blocking buffer (phosphate-buffered saline [PBS], 1% Tween, 5% nonfat dry milk). Primary antibodies were diluted 1:1,000 for myc, 1:1,000 for HA (Covance, Princeton, NJ), 1:2,000 for GFP (BD Biosciences, Sparks, MD), 1:5,000 for tubulin (Abcam, Cambridge, MA), and 1:1,000 for IntS11 (Bethyl, Montgomery, TX). Following the incubation in primary antibody, samples were then incubated with secondary antibodies that were conjugated to horseradish peroxidase at a 1:5,000 dilution, imaged using enhanced chemiluminescent substrate (Pierce ThermoFisher Scientific, Rockford, IL), and detected on hyperfilm (Phenix).
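As a quick check of the sample preparation, the following sketch computes the final buffer composition after the 1:1 mix of lysate with 2× SDS loading buffer. It is a minimal illustration with hypothetical names, and it assumes the lysate contributes none of the listed components, which ignores any Tris or SDS carried over from the lysis buffer.

```python
# Composition of the 2x SDS loading buffer as listed in the text.
TWO_X_BUFFER = {
    "Tris-HCl (mM)": 100,
    "SDS (%)": 4,
    "glycerol (%)": 10,
    "DTT (mM)": 200,
}


def dilute(stock, sample_parts=1, buffer_parts=1):
    """Scale each stock concentration by the buffer's share of the final volume."""
    factor = buffer_parts / (sample_parts + buffer_parts)
    return {name: value * factor for name, value in stock.items()}


final = dilute(TWO_X_BUFFER)  # 1:1 mix, so every component is halved
for name, value in final.items():
    print(f"{name}: {value:g}")
```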

For coimmunoprecipitations, HEK293T cells plated in 6-well plates were transiently cotransfected with 500 ng of plasmids encoding the HA-tagged and myc-tagged proteins. Following 48 h of incubation, cells were lysed using 500 μl of different lysis buffers. The low-salt lysis buffer contained 150 mM NaCl, 1% NP-40, and 100 mM Tris-HCl (pH 8.8); the high-salt lysis buffer was the same, with the exception that 500 mM NaCl was used. The denaturing lysis buffer contained 0.1% SDS, 1% NP-40, 0.5% sodium deoxycholate (SDC), 150 mM NaCl, and 50 mM Tris-HCl (pH 8.8). Cells were lysed for 30 min at 4°C and then transferred to Eppendorf tubes. Approximately 50 μl of the mixture was removed for input samples. A 20-μl volume of a 1:1 slurry of HA-conjugated agarose beads (Sigma, St. Louis, MO) was added to the lysates and allowed to rotate at 4°C for >2 h. Following the rotation, the beads were spun down at 2,000 rpm for 2 min at 4°C, the supernatant was removed, and the beads were then washed in 500 μl of lysis buffer 3 times. The first wash was also used to transfer the beads to a new Eppendorf tube. Following the three washes, the final supernatant was removed, 2× SDS loading buffer was added to the beads, and they were allowed to boil. In all cases, 10% of the input was loaded as the “input” lanes. Representative inputs are used to consolidate multiple input experiments; however, we never observed any effect on input expression with any cotransfection experiments. Bead supernatants were analyzed by Western blot analysis as described above, with the exception that the mouse secondary antibody was light chain specific to reduce heavy-chain contamination (Jackson ImmunoResearch, West Grove, PA).

In vitro translation and pulldown assay.

Full-length, FLAG-tagged Integrator 9 was purchased from Origene Technologies (Rockville, MD). Analysis of purified IntS9 was performed using standard techniques for Coomassie staining of SDS-PAGE gels and the Western blot protocol described above. The in vitro translation of IntS11 was performed using a TNT T7 quick-coupled transcription/translation system (Promega, Madison, WI). Briefly, 500 ng of pcDNA3-HA IntS11, which contains a single T7 promoter site upstream of the IntS11 translation start site, was added to 20 μl of the TNT reticulocyte lysate reaction mixture lacking methionine. To the TNT reaction mixture containing the IntS11 transcription template, 3 μl of [35S]methionine (MP Biomedical, Solon, OH) (specific activity, >1,000 Ci/mmol) was added. Reaction mixtures were then incubated at 30°C per the manufacturer's instructions and assessed using SDS-PAGE analysis for correct translation. The translation product generated was a single band migrating at the expected size of 70 kDa. Pulldown experiments were performed essentially as described previously (35). Briefly, 500 ng of full-length recombinant IntS9 was added to 5 μl of translated IntS11. As a negative control, an equal volume of IntS9 elution buffer was added to 5 μl of translated IntS11. Reaction mixtures were allowed to incubate on ice for 30 min, and then the volume was brought to 100 μl using PBS. Twenty microliters of a 1:1 slurry of anti-FLAG agarose beads (Sigma) was added to the reaction mixture, which was allowed to incubate for a further 30 min, with periodic mixing of the beads. The beads were then washed three times with PBS supplemented with an additional 250 mM NaCl. After the final wash, the beads were boiled in 40 μl of SDS loading buffer and were analyzed using SDS-PAGE. The input lane represents 10% of the initial translated IntS11.

Cloning and plasmid design.

Human Integrator and CPSF cDNAs were purchased from Open Biosystems. The plasmids encoding HA-tagged proteins were homemade derivatives of pcDNA3 vectors in which two HA epitopes were placed at the N terminus. Plasmids encoding myc-tagged proteins were also homemade derivatives of pcDNA3 vectors, and three myc epitopes were placed at the N terminus. All constructs were sequenced to determine identity. The human U7-GFP reporter was derived from the U7 snRNA gene (RNU7) located on chromosome 12. The U7 promoter, coding region, and 3′ box were all cloned upstream of an enhanced GFP (EGFP) open reading frame (ORF) and placed into a pcDNA3 vector in which the promoter was deleted to remove any residual background transcription. RNAi-resistant plasmids were generated through alteration of the IntS11 cDNA to create wobble mutations by the use of site-directed mutagenesis. All oligonucleotides used in this study are available upon request.
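The principle behind the RNAi-resistant constructs can be illustrated in a few lines: codons within the siRNA target site are swapped for synonymous codons that differ at the third (wobble) position, so the mRNA no longer matches the siRNA while the encoded protein is unchanged. The sketch below is not the authors' design procedure. The target sequence is a hypothetical placeholder, not the actual IntS11 siRNA site, and the helper names are ours.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate an in-frame DNA sequence whose length is a multiple of 3."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def wobble_mutate(seq):
    """Replace each codon with a synonymous codon differing at position 3, if one exists."""
    mutated = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        alternatives = [
            codon[:2] + base
            for base in BASES
            if base != codon[2] and CODON_TABLE[codon[:2] + base] == CODON_TABLE[codon]
        ]
        # Codons such as ATG (Met) and TGG (Trp) have no synonym and stay as they are.
        mutated.append(alternatives[0] if alternatives else codon)
    return "".join(mutated)


target = "GCTGAAATGTGGCTGAAA"  # HYPOTHETICAL in-frame site, not the real siRNA target
resistant = wobble_mutate(target)
mismatches = sum(a != b for a, b in zip(target, resistant))

assert translate(resistant) == translate(target)  # protein sequence is preserved
print(target, "->", resistant, f"({mismatches} silent mismatches)")
```

In practice the number and placement of mismatches needed to escape a given siRNA would have to be confirmed experimentally, as the authors did by Western blot analysis.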

RESULTS

Integrator subunits 9 and 11 interact in a strong and specific fashion.

To investigate and characterize the interaction of IntS9 and IntS11 in a cell-based system, we cloned both human cDNAs into expression vectors containing either a myc tag or an HA tag at the N terminus. As controls, we also cloned human CPSF73 and CPSF100 to determine the specificity of the IntS9/11 interaction in this system. Western blot analysis of cell lysates from HEK293T cells individually transfected with each myc-tagged protein demonstrated equal levels of expression of the exogenous proteins (Fig. 2A, upper panel). To analyze pairwise interactions, we cotransfected each of the myc-tagged cDNAs with either HA-IntS9 or HA-IntS11 and then performed immunoprecipitation using resin conjugated to HA antibodies. We observed that HA-tagged IntS9 was able to specifically pull down myc-tagged IntS11 and was not observed to interact with itself or either CPSF subunit (Fig. 2A, lower panels). Likewise, coimmunoprecipitation of HA-IntS11 demonstrated specific binding to myc-IntS9. We did observe weak but reproducible self-interaction of IntS11 (Fig. 2A, lane 2); however, the amount of myc-IntS11 pulled down was significantly lower than that seen with myc-IntS9.

Fig 2.


Integrator 9 and Integrator 11 form a specific and robust heterodimer. (A) Upper panel, Western blot (WB) analysis of cell lysates from HeLa cells transfected with myc-tagged IntS9, IntS11, CPSF100, or CPSF73; middle panel, Western blot analysis of immunoprecipitates (IP) by the use of anti-HA agarose beads from cells transfected with HA-tagged IntS9 along with each of the myc-tagged proteins shown in the upper panel; lower panel, same as the middle panel except that data represent cells cotransfected with HA-tagged IntS11. The asterisks represent the cross-reacting Ig heavy chain present after immunoprecipitation. (B) Western blot analysis of cells cotransfected with HA-IntS11 and myc-IntS9 or with HA-IntS9 and myc-IntS11 (lower panel). Lysates were subjected to immunoprecipitation with anti-HA antibodies under various lysis and wash conditions (LS, 150 mM NaCl; HS, 500 mM NaCl; Denaturing, 0.1% SDS and 0.5% SDC) (lane 4) and then supplemented with either 500 mM NaCl (lane 5) or 1 M NaCl (lane 6). “VA” denotes lysates transfected with myc-tagged proteins with empty HA vector. (C) SDS-PAGE analysis of eluted FLAG-tagged, full-length IntS9 by the use of Coomassie blue staining (left panel) or Western blot analysis with FLAG antibodies (middle panel) or IntS9 antibodies (right panel). “—,” lane loaded with elution buffer only. (D) Results from a pulldown assay using FLAG-tagged IntS9 and [35S]methionine-labeled IntS11. The first lane represents 10% of the input (Inp.) IntS11, the middle lane reflects the amount of IntS11 pulled down with anti-FLAG beads alone, and the last lane represents the pulldown of IntS11 with IntS9.

To determine the relative strength of the IntS9/11 interaction, we repeated the coimmunoprecipitation analysis under conditions of increasing stringency. Using lysis and wash buffers containing either low salt (150 mM NaCl) or high salt (500 mM NaCl), we found no observable differences in the relative amounts of coimmunoprecipitated protein (Fig. 2B, lane 2 versus lane 3). This was true regardless of which Integrator contained the N-terminal HA tag, and we did not observe any residual binding to the HA-agarose when the myc-tagged cDNAs were cotransfected with HA vector alone (“VA” in Fig. 2 and subsequent figures). We then tested the effects of denaturing solvents containing both SDS and sodium deoxycholate and found the efficiency of pulldown was only slightly diminished relative to that seen in the low-salt lysis (Fig. 2B, lane 2 versus lane 4). Finally, we lysed cotransfected cells by the use of the same denaturing conditions; however, we then washed the HA resin using denaturing wash buffers that were also supplemented with increasing salt (Fig. 2B [lane 5 = 500 mM NaCl; lane 6 = 1 M NaCl]). The amount of coimmunoprecipitated protein was reduced slightly but still readily observable even after washes with denaturing solvent containing 1 M NaCl.

To address whether the interaction between IntS9 and IntS11 is direct, we investigated their binding in a cell-free system. Full-length Integrator 9 was expressed and purified taking advantage of two C-terminal FLAG epitope tags. To assess the quality of recombinant IntS9, we analyzed the eluted protein by Coomassie staining as well as by Western blot analysis using both anti-FLAG and anti-IntS9 antibodies (Fig. 2C). In all three cases, we observed only a single band of the predicted 74-kDa size that reacted with either the FLAG or IntS9 antibodies. We incubated FLAG-tagged IntS9 with full-length, in vitro-translated IntS11 in the presence of [35S]methionine and performed a pulldown assay using anti-FLAG agarose. We found a significant amount of IntS11 coming down with anti-FLAG agarose after preincubation with FLAG-IntS9 relative to only background binding observed using anti-FLAG agarose alone (Fig. 2D).

These data show that the interaction of IntS9 and IntS11 is not only specific and direct but remarkably robust. This is in contrast to the interaction of CPSF73 and CPSF100, where we were unable to achieve reproducible coimmunoprecipitations between these two proteins; any weak association that we did observe was present only under low-salt conditions (not shown).

A distinct domain within the IntS11 C-terminal MBL domain binds to IntS9.

In order to determine the IntS9 binding domain within IntS11, we performed deletion mutagenesis combined with coimmunoprecipitation. We created a series of C-terminal and N-terminal truncation mutants of IntS11, each fused to an N-terminal HA tag (Fig. 3A). The deletion mutants as well as the full-length IntS11 were cotransfected with myc-tagged IntS9 and subjected to coimmunoprecipitation analysis with anti-HA antibodies. Five of the six deletion mutants were expressed at levels comparable to that of the full-length IntS11, with the smallest mutant (ΔN3) failing to accumulate to detectable levels (Fig. 3B). We observed significant enrichment of myc-IntS9 by the use of full-length HA-IntS11; however, deletion of the last 97 amino acids (aa) of the C terminus (ΔC1) significantly reduced this interaction. A larger deletion that included the B and C motifs as well as a portion of the β-CASP domain (ΔC2) caused a further reduction in myc-IntS9 pulldown, suggesting that this region is required for the IntS9/11 interaction (Fig. 3B, lane 4 versus lane 5). We observed that deletions of the N terminus of IntS11 did not result in any reduction in the efficiency of coimmunoprecipitation of IntS9 (ΔN1 and ΔN2). These results show that the IntS11 interaction with IntS9 requires amino acids 503 to 600 and that the final 268 residues are sufficient for this association.

Fig 3.


Integrator 11 binds to Integrator 9 through a conserved region at the C terminus. (A) Schematic of Integrator 11 deletion constructs. The β-CASP domain and the core motifs of the MBL domain are labeled. (B) Western blot analysis of cells transfected with HA-tagged full-length or deletion mutants of IntS11 (upper panel). The lower panel shows the results of coimmunoprecipitation using anti-HA antibody agarose beads followed by Western blot analysis of myc-tagged IntS9. (C) Schematic showing a detailed view of amino acids 384 to 458 of IntS11 and specific fragments that were fused to HA-tagged GFP. The shaded regions represent identical IntS11 amino acids of various species. Below the alignment of IntS11 is an alignment of human IntS11 and CPSF73 highlighting both the conserved residues (in gray) and the known secondary structure elements from the crystal structure of CPSF73 (22). (D) Upper panel, Western blot analysis of cell lysates from cells transfected with an HA-tagged GFP protein fused to six different fragments of the C terminus of IntS11 (C1 to C6); lower panel, anti-myc Western blot analysis of immunoprecipitates by the use of anti-HA antibodies from cells cotransfected with the HA-GFP-tagged IntS11 fragments and myc-IntS9. (E) Three-dimensional structure of human CPSF73 visualized using PyMOL (PDB accession no. 2I7V); the purple region is the β-CASP domain, the green is the N-terminal MBL domain, and the red is the C-terminal MBL domain. The two gray spheres represent the two zinc ions present in the catalytic core of CPSF73. Middle panel, alignment of CPSF73 (red) and IntS11 (blue). Right panel, predicted structure of IntS11, highlighting the N-terminal MBL domain (blue), β-CASP domain (red), and IntS9 interacting domain (purple).

Given that we were unable to express the final 102 amino acids (ΔN3), probably due to its instability, further resolution of the IntS9 binding domain within IntS11 was not attained. To address this, we fused amino acids from the C terminus of IntS11 to the C terminus of an HA-tagged GFP, to create a heterologous protein. We fused a series of six descending IntS11 C-terminal sequences to HA-GFP as follows: C1 (aa 333 to 600), C2 (418 to 600), C3 (450 to 600), C4 (499 to 600), C5 (528 to 600), and C6 (562 to 600) (Fig. 3C). The C1 fusion contains the sequences present in the ΔN2 as presented in Fig. 3A. The six HA-GFP fusion plasmids and the HA-GFP control were cotransfected with full-length myc-IntS9 and subjected to anti-HA immunoprecipitation. Each of the fusion proteins was expressed to a level comparable with those of the others as well as with that of the HA-GFP lacking any IntS11 sequences (Fig. 3D, upper panel). A significant amount of myc-IntS9 was coimmunoprecipitated with the HA-GFP-C1 protein versus an undetectable amount of myc-IntS9 as observed using the HA-GFP control protein (Fig. 3D, lane 2 versus lane 3). We observed a progressive decrease in the pulldown efficiency, beginning with slight reductions observed in comparisons of the C2 and C3 mutants and then only marginal binding found in the C5 and C6 mutants. These data, coupled with the data observed with the ΔC1 mutant presented in Fig. 3B (deletion of aa 503 to 600), suggest that the binding site for IntS9 within IntS11 begins in the proximity of motif C (at approximately aa 418) and extends to aa 520.
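The fragment coordinates above lend themselves to a simple bookkeeping check. The sketch below tabulates the length of each HA-GFP fusion fragment and how many residues of the proposed binding region (aa 418 to 520) it retains. The names are hypothetical, and the overlap count is only a convenience for reading the mapping data, not a measure of binding.

```python
# IntS11 residues fused to HA-GFP, as listed in the text (inclusive coordinates).
FRAGMENTS = {
    "C1": (333, 600),
    "C2": (418, 600),
    "C3": (450, 600),
    "C4": (499, 600),
    "C5": (528, 600),
    "C6": (562, 600),
}
PROPOSED_SITE = (418, 520)  # binding region suggested by the mapping data


def span_length(start, end):
    """Number of residues in an inclusive interval."""
    return end - start + 1


def overlap(a, b):
    """Number of residues shared by two inclusive intervals (0 if disjoint)."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return max(0, end - start + 1)


for name, span in FRAGMENTS.items():
    print(f"{name}: {span_length(*span)} aa, "
          f"{overlap(span, PROPOSED_SITE)} aa within the proposed site")
```

The retained portion of the proposed site shrinks from C1 to C6, in line with the progressive loss of IntS9 pulldown described above; C1 spans the same 268 residues reported as sufficient for binding, and C4 matches the 102-residue ΔN3 fragment.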

Importantly, an alignment of IntS11 protein sequences of various species in this region demonstrated a high degree of homology in the regions flanking motif C (Fig. 3C). A comparison of the human CPSF73 and IntS11 amino acid sequences in this area reveals that while motifs B and C are almost totally conserved, the regions flanking motif C are not. The known secondary structures present within CPSF73 (based on its crystal structure and the numerical assignments as presented in reference 22) are represented below the alignment and illustrate that the regions flanking motif C comprise α-helix 6 and β-sheet 14. It is noteworthy that β-sheet 14, which is completely conserved among the aligned IntS11 sequences, represents the longest stretch of nonconserved amino acids in a full alignment of human IntS11 and CPSF73 (not shown).

To attain an estimation of where this region is located on IntS11, we generated a theoretical structure of IntS11 with a web-based structural alignment tool (http://swissmodel.expasy.org/) using the known coordinates from the crystal structure of CPSF73 (Protein Data Bank [PDB] accession no. 2I7V). The structure of amino acids 1 to 460 of human CPSF73 is shown in Fig. 3E; as noted previously, the C-terminal MBL domain folds back to associate with the N-terminal MBL domain to provide additional secondary structures (22). The modeled IntS11 structure aligns well with that of CPSF73 (Fig. 3E, middle panel), which is not surprising given their high degree of similarity (>40% from the N terminus through motif C). Structural information with respect to the C terminus of CPSF73 was not attained for the original structure; thus, we were unable to model the analogous region of IntS11. It is of note that, although the α6 helix of IntS11 is predicted to be slightly shorter than that of CPSF73, the model places it along with β14 on the surface of the MBL domain directly adjacent to the predicted active site. This model predicts that the experimentally defined IntS9 interaction motif within IntS11 is directly adjacent to the predicted entry site for the RNA substrate to access the active site within the core.

The C terminus of IntS9 is necessary and sufficient for binding to IntS11.

It has been shown previously using a yeast two-hybrid assay that full-length IntS9 is capable of interacting with IntS11; however, the region within IntS9 responsible for this binding is not known (9). To address this, we generated a series of N-terminal HA-tagged deletion mutants within IntS9 by the use of a method similar to that used to generate the IntS11 mutants (Fig. 4A). Unlike the IntS11 mutants, all six mutants generated were expressed to levels comparable to that of the full-length protein (Fig. 4B). In a fashion strikingly similar to that seen with IntS11, deletion of the C terminus of IntS9 abolished binding to IntS11 whereas truncation of the N terminus did not disrupt binding. Therefore, the ΔN3 protein, which includes amino acids 497 to 658, is the smallest fragment that is fully sufficient for mediation of interaction with IntS11. To attain a higher-resolution definition of the IntS11 binding site within the IntS9 C terminus, we cloned a collection of fragments of IntS9 downstream of HA-tagged GFP (Fig. 4C). As with the HA-GFP IntS11 fragments, these heterologous proteins were observed to accumulate to comparable levels, facilitating a comparison of their relative pulldown efficiencies. We observed that the C1, C2, and C3 fragments, which all contained amino acids 565 to 658, were able to pull down myc-IntS11, whereas all other mutants tested pulled down only background levels of myc-IntS11 that were similar to the HA-GFP control levels (Fig. 4D). These data demonstrate that the region of IntS11 interaction within IntS9 is between amino acids 565 and 658. Although the binding site is situated in a position similar to that within IntS11, the domain within IntS9 is shifted distinctly more toward the C terminus and apparently does not include motif C or sequences flanking that region. Both the β-CASP and C-terminal MBL domains within IntS9 are predicted to be slightly larger than those of IntS11, potentially explaining the shift in interaction further toward the C terminus of the protein.

Fig 4.

Integrator 9 binds to Integrator 11 through a region at the extreme C terminus. (A) Schematic of IntS9 deletion mutants used to test interaction with IntS11. (B) Western blot analysis of lysates from cells cotransfected with HA-tagged IntS9 fragments and full-length IntS11. Upper panel, relative expression of IntS9 fragments assessed with an anti-HA Western blot; lower panel, relative pulldown efficiency of each HA-tagged IntS9 deletion mutant determined by assessing the amount of myc-IntS11 present in the anti-HA immunoprecipitates. (C) Schematic of C-terminal fragments fused to HA-GFP spanning the 161 amino acids of IntS9. (D) Western blot analysis measuring the relative efficiencies of the HA-GFP IntS9 fragments in pulling down myc-tagged IntS11.

The interaction between IntS9 and IntS11 is required for snRNA 3′ end formation.

The strong and specific interaction of IntS9 and IntS11 suggests that it may be required for snRNA 3′ end formation. Here, we employed a reconstitution assay where we depleted the endogenous IntS11 by the use of siRNA and then reexpressed RNAi-resistant mutants to monitor their ability to rescue misprocessing of snRNA. We screened siRNAs for effective depletion of endogenous IntS11 (see Materials and Methods) and then introduced mutations into the IntS11 cDNA to render it resistant to the siRNA while still encoding the same polypeptide sequence. We cotransfected wild-type HA-tagged IntS11 as well as the RNAi-resistant HA-tagged IntS11 (Fig. 5A, lanes 11*) with control siRNA and investigated expression using Western blot analysis. Importantly, we found no differences in the levels of expression produced by either construct. However, when the experiment was repeated using IntS11 siRNA, we observed significant depletion of the wild-type protein whereas the RNAi-resistant IntS11 accumulated to levels comparable to the control-treated cell level (Fig. 5A). The amount of RNAi-resistant plasmid was then titrated to achieve a level of expression that was consistent with the amount of endogenous IntS11. As seen in Fig. 5B, HA-tagged RNAi-resistant IntS11 was expressed at levels slightly less than that of the endogenous protein when transfected into cells treated with control siRNA. After depletion of the endogenous IntS11 by the use of specific siRNA, we observed that expression of RNAi-resistant tagged IntS11 (lanes 11*) was not affected and was at slightly elevated levels (Fig. 5B, lane 2 versus lane 3). This was reproducibly observed and may have been a consequence of increased stability of the exogenous protein in the absence of its endogenous counterpart. Regardless, these data establish that restoration of IntS11 expression to nearly physiological levels can be achieved in cells where the endogenous IntS11 has been depleted using siRNA.
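
The idea of rendering a cDNA resistant to an siRNA without altering the encoded polypeptide can be illustrated with a minimal sketch. The coding sequence, the target window, and the function names below are hypothetical and chosen only for illustration; they are not the IntS11 sequence, the siRNA target site, or the specific mutations used in this study. The sketch swaps each codon that overlaps the target window for a synonymous codon and then confirms that the translation is unchanged.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def make_sirna_resistant(cds, target_start, target_length):
    """Replace codons overlapping the siRNA target window with synonymous codons.

    Codons with no synonym (ATG for Met, TGG for Trp) are left unchanged.
    """
    first_codon = target_start // 3
    last_codon = (target_start + target_length - 1) // 3
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for idx in range(first_codon, last_codon + 1):
        original = codons[idx]
        synonyms = sorted(
            c for c, aa in CODON_TABLE.items()
            if aa == CODON_TABLE[original] and c != original
        )
        if synonyms:
            codons[idx] = synonyms[0]
    return "".join(codons)


def count_mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical 7-codon open reading frame and a hypothetical 12-nt target window.
wild_type = "ATGGCTCTGAAAGAGTTCTGA"
resistant = make_sirna_resistant(wild_type, target_start=3, target_length=12)

print(translate(wild_type) == translate(resistant))  # same polypeptide
print(count_mismatches(wild_type, resistant))        # mismatches to the siRNA target
```

In practice, the number and placement of silent changes needed to escape a given siRNA must be determined empirically, as was done here by comparing expression of the wild-type and RNAi-resistant constructs (Fig. 5A).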

Fig 5.

The interaction of Integrator 9 and Integrator 11 is required for snRNA 3′ end formation. (A) Western blot analysis of lysates from cells treated with either control siRNA or siRNA targeting IntS11 and then transfected with HA-tagged IntS11 or HA-tagged IntS11 that is resistant to the IntS11 siRNA. (B) Western blot analysis using anti-IntS11 antibodies, demonstrating expression of exogenous RNAi-resistant IntS11 comparable to that of the endogenous IntS11. (C) Schematic of human U7-GFP reporter construct that produces GFP only in response to misprocessing conditions when Integrator subunits are knocked down. Right panel, Western blot analysis of cell lysates from cells treated with either control siRNA or IntS11 siRNA that were transfected with empty vector (VA) or RNAi-resistant IntS11 (lanes 11*) along with U7-GFP reporter. (D) Western blot analysis of cell lysates from cells treated with control siRNA or with IntS11 siRNA followed by transfection with the U7-GFP reporter and either full-length RNAi-resistant IntS11 or RNAi-resistant fragments as described for Fig. 2A.

To monitor the effects on snRNA 3′ end formation as a function of IntS11 depletion, we utilized a human fluorescence analysis-based reporter system. The reporter is similar to the one we developed for Drosophila S2 cells (Fig. 5C schematic) (13). This construct places the human U7 promoter, U7snRNA coding sequence, cleavage site, and 3′ box element upstream of an open reading frame for EGFP. The EGFP ORF contains the only start codon in the reporter and is itself followed by a strong cleavage and polyadenylation site. Under normal conditions, transfection of the reporter into human cells results in only marginal GFP expression because the U7 snRNA is cleaved (processed) prior to transcription of the downstream ORF. However, when the processing is inhibited due to RNAi-mediated depletion of an Integrator subunit, the RNAPII transcribes through the cleavage site and then utilizes the cleavage and polyadenylation sequence downstream of the EGFP ORF. Therefore, snRNA misprocessing results in a gain in GFP expression, which can be monitored using Western blot analysis with anti-GFP antibodies. To demonstrate the effectiveness of this reporter system, we first transfected cells with either control siRNA or the IntS11 siRNA followed by cotransfection of the U7-GFP reporter with the HA vector alone. Control siRNA-treated cells exhibited little to no expression of GFP, as was expected due to the activity of the endogenous Integrator subunits. After depletion of IntS11, we observed a significant increase in the level of GFP due to misprocessing and subsequent transcriptional readthrough (Fig. 5C, lane 1 versus lane 2). Importantly, when the U7-GFP reporter was cotransfected with the RNAi-resistant IntS11 cDNA into cells treated with IntS11 siRNA, we observed a dramatic reduction in the amount of GFP expression from the U7-GFP reporter to levels similar to those observed in control siRNA-treated cells (Fig. 5C, lane 1 versus lane 3). 
This demonstrates that full-length, HA-tagged, RNAi-resistant IntS11 is capable of fully rescuing the misprocessing observed using the U7-GFP reporter.
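
The logic of the U7-GFP readout described above can be summarized as a small truth table. The following is a toy sketch, not a quantitative model: the class and function names are hypothetical, and the outcome is reduced to "low" or "high" GFP, which simplifies the graded signal seen on a Western blot. It encodes only the three conditions described for Fig. 5C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """One transfection condition in the reporter assay (hypothetical names)."""
    sirna_targets_ints11: bool         # IntS11 siRNA (True) or control siRNA (False)
    functional_rescue_expressed: bool  # functional RNAi-resistant IntS11 supplied?


def snrna_processing_active(condition):
    """Processing occurs if endogenous or exogenous functional IntS11 is present."""
    endogenous_present = not condition.sirna_targets_ints11
    return endogenous_present or condition.functional_rescue_expressed


def gfp_readout(condition):
    """GFP is produced only when RNAPII reads through the unprocessed U7 3' end."""
    return "low" if snrna_processing_active(condition) else "high"


# The three conditions described for Fig. 5C.
lanes = {
    "lane 1: control siRNA + empty vector": Condition(False, False),
    "lane 2: IntS11 siRNA + empty vector": Condition(True, False),
    "lane 3: IntS11 siRNA + RNAi-resistant IntS11": Condition(True, True),
}

for label, condition in lanes.items():
    print(label, "->", gfp_readout(condition), "GFP")
```

Because the reporter reports loss of processing as a gain of signal, a rescue construct scores as functional when it suppresses GFP in IntS11-depleted cells.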

Next, we used the U7-GFP reporter assay to determine which mutants of IntS11 tested as described for Fig. 3A are capable of restoring snRNA processing when expressed in cells depleted of IntS11. The ΔC1, ΔC2, and ΔC3 plasmids were rendered resistant to IntS11 siRNA through the introduction of the same wobble mutations used in the full-length cDNA. Both ΔN1 and ΔN2 are naturally resistant to the siRNA, given that its target is near the 5′ end of the mRNA. We observed that cotransfection of the full-length IntS11 cDNA resulted in total rescue of snRNA processing by virtue of the suppression of GFP expression, whereas none of the deletion mutants were able to restore activity after IntS11 depletion (Fig. 5D). While it was expected that the ΔC2, ΔC3, ΔN1, and ΔN2 fragments would not be functional, as each of these mutants is lacking one or more of the conserved motifs that mediate endonuclease activity, the ΔC1 deletion mutant (amino acids 1 to 603) has all of these motifs intact (Fig. 3A schematic). It is likely that this mutant is folded properly, given that it accumulates to levels similar to that of the endogenous protein, and is also likely to be catalytically active, as it has been established that only amino acids 1 to 460 of human CPSF73 are required for catalytic activity in vitro (22). The most parsimonious interpretation of this result is that the ΔC1 mutant is catalytically competent but unable to restore processing of the U7-GFP reporter, because it cannot interact with IntS9 due to the removal of its IntS9 interaction domain.

Expression of either the IntS11 or IntS9 C terminus alone functions in a dominant-negative fashion by interfering with snRNA 3′ end formation.

Given that the C-terminal domains of both IntS9 and IntS11 are sufficient for their interaction, we addressed the effect of overexpressing only the individual C termini on snRNA 3′ end formation. The prediction is that overexpression of the C-terminal fragment of IntS11 or IntS9 alone could disrupt heterodimeric association of endogenous IntS9/11 by titrating away one of these subunits. To test this hypothesis, we fused C-terminal fragments of IntS9 or IntS11 onto an HA-tagged mCherry heterologous protein. The purpose of changing the fusion protein to mCherry is to allow the use of the U7-GFP reporter to monitor the effect on snRNA 3′ end formation. Western blot analysis demonstrated a comparable level of expression of either heterologous protein relative to mCherry expressed alone (Fig. 6A). We observed that overexpression of either Integrator fragment alone was capable of triggering misprocessing of the reporter (Fig. 6A). This demonstrates that expression of the individual binding domains results in their behaving as dominant negatives, likely due to disrupting interactions between endogenous IntS9 and IntS11. In support of this interpretation, we lysed transfected cells and subjected them to immunoprecipitation using anti-HA-conjugated agarose resin. Detectable levels of endogenous IntS9 were observed in immunoprecipitates of HA-mCherry-IntS11 fragments, and we also detected endogenous IntS11 present in HA-mCherry-IntS9 immunoprecipitates (Fig. 6B). Collectively, these data demonstrate that disruption of the heterodimeric interaction between IntS9 and IntS11 inhibits snRNA 3′ end formation and further support the model that it is an event required for processing.

Fig 6.

Expression of only the C terminus of Integrator 9 or 11 results in behavior reminiscent of that of a dominant negative through disruption of the endogenous IntS9/11 heterodimer. (A) Left schematic, diagram of HA-tagged mCherry constructs that contain either amino acids 450 to 600 of IntS11 or amino acids 565 to 658 of IntS9; right panel, Western blot analysis showing expression of the control HA-tagged mCherry and the two heterologous fusions. The lower two panels demonstrate the degree of GFP expression from the U7-GFP reporter after cotransfection with the mCherry fusion proteins. (B) Western blot analysis of immunoprecipitates from cells transfected with the mCherry fusion proteins as described for panel A. Input lysates and immunoprecipitates from transfected cells were probed for either endogenous IntS11 or IntS9. (C) Schematic of chimeric CPSF73/IntS11 constructs. The gray-shaded region represents sequences derived from CPSF73, while the hatched regions represent IntS11 sequences. (D) Upper panel, Western blot analysis of cell lysates cotransfected with HA-tagged chimeric constructs and full-length IntS11; lower panels, Western blot analysis of input full-length myc-IntS9 and coimmunoprecipitates from cotransfected cells. (E) Western blot analysis of cell lysates from cells treated with either control siRNA or IntS11 siRNA followed by cotransfection with HA vector alone or with RNAi-resistant IntS11 or either of the two chimeric constructs. The upper panel represents probing for GFP, and the lower panel represents an antitubulin loading control.

A chimeric IntS11/CPSF73 protein cannot function as a substitute for a component of the Integrator Cleavage Factor.

Given the high degree of similarity between IntS11 and CPSF73 throughout the N-terminal MBL domain and the β-CASP domain, we asked whether these two regions are interchangeable. To address this question, we created two different chimeric proteins consisting of fusions of CPSF73 and IntS11 (Fig. 6C). The first chimera (which we termed “73/11”) contains the N-terminal MBL domain and most of the β-CASP domain of CPSF73 fused to the remainder of the IntS11 β-CASP and C-terminal MBL domain, including the IntS9 binding region. The second chimera, 11/73, is the reciprocal design and lacks any IntS9 interaction domain but still retains all of the motifs required for catalysis. Both constructs were created to be resistant to the IntS11 siRNA either through the presence of silent mutations (11/73) or by the absence of the targeting site altogether (73/11). We took advantage of the known structure of CPSF73 and our modeled IntS11 structure (Fig. 3E) to design a junction point within a linker region between β-sheet E and α-helix F within CPSF73. We reasoned that this would introduce little structural perturbation when joining the proteins. Both chimeric proteins were observed to be expressed to levels slightly above, but still comparable to, that of the HA-tagged RNAi-resistant IntS11. As expected, the HA-IntS11 and HA-73/11 chimeras were capable of pulling down the myc-tagged IntS9 (Fig. 6D). The absence of myc-IntS9 in the 11/73 chimera immunoprecipitates was the predicted result, since it lacks a C-terminal IntS9 domain. We tested for the ability of both chimeras to restore processing of the U7-GFP reporter after depletion of endogenous IntS11. While the RNAi-resistant IntS11 restored processing after depletion of endogenous IntS11, surprisingly, neither chimera was capable of rescue (Fig. 6E). 
Although it is possible that the chimeras we designed are inactive or misfolded, we do not favor this hypothesis, as both chimeras accumulated to levels similar to that of the HA-tagged IntS11 and the 73/11 chimera was able to interact as efficiently with IntS9. Furthermore, the positions of all of the motifs required for CPSF73 catalytic activity have been retained in the chimeric construct and this fragment of CPSF73 has been previously demonstrated to be active in vitro (22). These data suggest either that there is a substrate preference present in the MBL/β-CASP domain of CPSF73 that is not compatible with the snRNA 3′ end cleavage site or that a heterodimer consisting of IntS9 and a chimeric CPSF73/IntS11 is not recognized by the other Integrator subunits and therefore not recruited to the cleavage site.

DISCUSSION

Based on homology, the initial annotations of Integrator 9 and Integrator 11 were “CPSF100-like” and “CPSF73-like,” suggesting that these proteins may perform a specialized function related to the CPSF subunits (9, 26). They represent the two most highly conserved members of the Integrator complex, reflecting their importance in the 3′ end formation of snRNA. Here we determined that Integrator 11 binds to Integrator 9 by the use of a domain within its C terminus that is noticeably not conserved in CPSF73, despite a high level of similarity throughout their sequences. Importantly, we show that this interaction is required for snRNA 3′ end formation by both an RNAi-rescue assay and the creation of dominant-negative proteins. The dimerization or heterodimerization of the endonucleases within the MBL/β-CASP family appears to be a common property and, based on the observations presented here, may be essential for all of its members to carry out cleavage.

Implications of IntS9/11 heterodimerization for snRNA 3′ end formation.

The interaction between Integrator 9 and Integrator 11 requires each of their C-terminal domains, in accordance with previous reports demonstrating that the β-CASP member RNase J homodimerizes utilizing binding motifs within its C terminus (20). However, in contrast, we found that the C-terminal domains of either IntS9 or IntS11 alone are sufficient to mediate interaction whereas mutations within motif 2 reduce RNase J self-interaction (24). Deletion analysis of both IntS9 and IntS11 showed that motif 2 is dispensable for heterodimerization and that removal of this sequence actually increases the amount of protein pulled down using coimmunoprecipitation (Fig. 3B and 4B). The interaction between CPSF73 and CPSF100 is also dependent on the integrity of MBL conserved motifs, as seen with RNase J, suggesting that the Integrator cleavage factor (IntS9/11) has unique properties compared to other β-CASP members (18). The data presented in Fig. 3D show that coordination of the IntS11 motif C into the active site is not required for interaction with IntS9, suggesting that these amino acids form a secondary structure capable of binding IntS9 in the absence of zinc coordination.

The distinction between the CPSF and Integrator cleavage factors is further illustrated by the apparent differences in the strengths of the interaction. We and others have noted the apparent weak interaction of CPSF73/100 in pulldown experiments and in yeast two-hybrid assays (9, 18). The interaction of IntS9/11 is remarkably strong, as evidenced by its resistance to denaturing solvents and high salt concentrations (Fig. 2B). The biological importance of this robust interaction is not clear, but a highly stable Integrator cleavage factor may be important when the relative abundances of these two proteins are low. The relative levels of IntS9/11 protein are not known, and while the amount of cellular snRNA is high, the stability of snRNA is also quite high, suggesting that the levels of these factors may be lower than those of their CPSF paralogues.

In the absence of CPSF100, recombinant CPSF73 has been shown to lack enzymatic activity unless it is first preincubated with high levels of calcium (22). The prevailing explanation for this property is that the addition of calcium causes conformational changes required for catalytic activity. The molecular model presented in Fig. 3E reveals that the binding site for IntS9 is in close proximity to the predicted active site of IntS11. The binding of IntS9 to this position may affect IntS11 in a fashion similar to that of calcium treatment of CPSF73 by causing the conformational changes necessary to allow access of the primary snRNA transcripts to the IntS11 active site. Definitive evidence to support this hypothesis would require having structural information describing these two Integrator subunits in a complex.

Potential models of Integrator cleavage factor action.

Data presented here demonstrate that interaction of Integrator subunits 9 and 11 is a requisite event for snRNA 3′ end formation. This parallels studies where it was found that CPSF73/100 heterodimerization is required for histone mRNA 3′ end formation and, likely by extension, poly(A)+ mRNA 3′ end formation (18). The analogy between the Integrator cleavage factor and the CPSF cleavage factor raises two mechanistic questions pertaining to the Integrator cleavage complex. (i) Is there a Symplekin-like protein behaving as a scaffold for the Integrator cleavage factor? (ii) How does the Integrator cleavage factor specifically cleave primary snRNA transcripts despite the absence of a recognizable RNA binding domain within other Integrator subunits?

The first question is based on reports from several groups implicating Symplekin in the 3′ end processing of poly(A)+ and histone mRNA (17, 32, 34). The exact function of Symplekin in pre-mRNA processing is not known, but it is thought to behave as a protein scaffold bridging multiple members of the processing machinery with the transcriptional apparatus to properly target cleavage (reviewed in reference 21). Reports demonstrating interactions between Symplekin and a large number of poly(A) pre-mRNA processing factors are consistent with this hypothesis (6, 14, 19, 29, 33, 37). Further, recently determined crystal structures of the N terminus of human Symplekin revealed that the HEAT repeat structure of Symplekin binds to Ssu72, which in turn binds to the CTD of Rpb1 (16, 36). Importantly, Symplekin forms a stable complex with CPSF73/CPSF100 and is differentially recruited to the sites of processing in poly(A)+ and histone pre-mRNA (17, 32). Although it has not been definitively proven, codepletion experiments suggest that Symplekin may interact with its CPSF binding partners only when they are heterodimeric, as depletion of any one subunit results in the codepletion of the other two. Here, we observed that a CPSF73/IntS11 chimeric protein is capable of robustly interacting with Integrator 9 and yet is incapable of restoring processing after depletion of the endogenous Integrator 11 (Fig. 6). It is feasible that, while care was taken to fuse the proteins in a way that maintained functionality, approaches to assess the activity of the chimera are not yet available to determine its function. We interpret the failure of the chimera to complement the knockdown as reflecting that recognition of the IntS9/11 heterodimer by other Integrator subunits is not dependent on the simple presence of any active MBL domain capable of interacting with IntS9.
Rather, we posit that interaction of IntS9/11 with other Integrator subunits is highly specific and likely includes binding surfaces that exist only when IntS9 and IntS11 form a heterodimer. This hypothesis predicts that there is an Integrator subunit performing scaffolding functions similar to those performed by Symplekin, which is capable of a highly precise interaction with the IntS9/11 heterodimer. None of the other known Integrator subunits contain significant homology to Symplekin at the primary sequence level; however, this does not rule out an undetectable structural similarity. One possible candidate for a “Symplekin-like” Integrator subunit is IntS4 based on its similar overall size and domain organization. Integrator 4, like Symplekin, has highly conserved N-terminal HEAT repeats with a less conserved C-terminal region. Moreover, we have shown previously that depletion of Drosophila IntS4 leads to an amount of misprocessing of endogenous snRNA similar to that seen with depletion of either IntS9 or IntS11 (13). We have also observed reciprocal codepletion of IntS4/9/11 (A. Sataluri and E. J. Wagner, unpublished results) similar to the observations made for Symplekin, CPSF73, and CPSF100 (32).

The potential answers to the second mechanistic question of how the Integrator cleavage factor achieves specific RNA cleavage are more elusive. The Integrator cleavage factor, like its CPSF counterparts, does not possess a defined RNA binding domain, which raises the question of how specific RNA cleavage is achieved. Three recent crystal structures of archaeal CPSF-like proteins (PH1404 from Pyrococcus horikoshii OT3, Mm KH-CPSF from Methanosarcina mazei, and MTH1203 from Methanothermobacter thermautotrophicus) may offer insight into their metazoan equivalents (25, 28, 31). These proteins are all members of the β-CASP family but, unlike CPSF73/100 and IntS9/11, also possess an N-terminal KH domain capable of binding RNA. Intriguingly, the homodimeric interaction of these β-CASP proteins is mediated through interaction of their C-terminal MBL domains, which is strikingly similar to what has been observed here for IntS9/11 heterodimerization. The model of RNA binding and endonuclease cleavage as proposed by Silva et al. (31) is represented in Fig. 7. This model is based on the speculation that the RNA binding by the KH domain of one subunit facilitates cleavage at a downstream site by the other subunit. Molecular modeling of an RNA substrate onto the MTH1203 homodimer supports such a theory and disfavors the alternative possibility that the same β-CASP subunit both binds and cleaves RNA.

Fig 7.

Model of IntS9/11 binding to the 3′ end of snRNA compared to the known archaeal homodimeric CPSF-KH protein, representing the molecular model proposed by Silva et al. (31) based on their crystal structure of the MTH1203 β-CASP protein. This model is compared with a model for both the CPSF73/100 and IntS9/11 heterodimers. The MBL domain is represented in dark gray, the β-CASP domain in light gray, and the RNA substrate in red; the lightning bolt indicates the cleavage site.

The metazoan CPSF and Integrator cleavage factors lack an N-terminal KH domain and thus are dependent upon protein-protein interactions with other subunits that possess RNA binding domains. In the case of CPSF73/100, this interaction is likely carried out by CPSF160, which is known to utilize RNA recognition motifs (RRMs) that bind to the PAS located upstream of the cleavage site (27). This suggests that the Integrator cleavage factor may also utilize a similar mechanism, but the identity of that RNA binding factor is unknown (Fig. 7). Further, this model may explain why both CPSF100 and Integrator 9 are inactive, as the archaeal CPSF-KH structures predict that the subunit with the KH domain is not the subunit that cleaves the RNA. This would argue that the subunit possessing the RNA binding activity or associating with an RNA binding protein would not necessarily require enzymatic activity. If this mechanism had been retained in the metazoan cleavage factors, selection pressure to maintain an active CPSF100 or IntS9 would have been lost, and the primary function of these proteins would be to position CPSF73 or IntS11 in proximity to its substrates. Studies are under way to address these two mechanistic questions regarding the Integrator cleavage factor.

ACKNOWLEDGMENTS

We thank Phil Carpenter and members of the Wagner laboratory for helpful discussions and critically reading the manuscript.

Work in the Wagner laboratory is funded by the National Institutes of Health (grant 5R00GM080447) and also by a T.C. Hsu Faculty development award for E.J.W.

Footnotes

Published ahead of print 17 January 2012

REFERENCES

  • 1. Aravind L. 1999. An evolutionary classification of the metallo-beta-lactamase fold proteins. In Silico Biol. 1:69–91
  • 2. Baillat D, et al. 2005. Integrator, a multiprotein mediator of small nuclear RNA processing, associates with the C-terminal repeat of RNA polymerase II. Cell 123:265–276
  • 3. Callebaut I, Moshous D, Mornon JP, de Villartay JP. 2002. Metallo-beta-lactamase fold within nucleic acids processing enzymes: the beta-CASP family. Nucleic Acids Res. 30:3592–3601
  • 4. Chen J, Wagner EJ. 2010. snRNA 3′ end formation: the dawn of the Integrator complex. Biochem. Soc. Trans. 38:1082–1087
  • 5. Clouet-D'Orval B, Rinaldi D, Quentin Y, Carpousis AJ. 2010. Euryarchaeal beta-CASP proteins with homology to bacterial RNase J have 5′- to 3′-exoribonuclease activity. J. Biol. Chem. 285:17574–17583
  • 6. Dichtl B, Aasland R, Keller W. 2004. Functions for S. cerevisiae Swd2p in 3′ end formation of specific mRNAs and snoRNAs and global histone 3 lysine 4 methylation. RNA 10:965–977
  • 7. Dominski Z. 2007. Nucleases of the metallo-beta-lactamase family and their role in DNA and RNA metabolism. Crit. Rev. Biochem. Mol. Biol. 42:67–93
  • 8. Dominski Z, Yang XC, Marzluff WF. 2005. The polyadenylation factor CPSF-73 is involved in histone-pre-mRNA processing. Cell 123:37–48
  • 9. Dominski Z, Yang XC, Purdy M, Wagner EJ, Marzluff WF. 2005. A CPSF-73 homologue is required for cell cycle progression but not cell growth and interacts with a protein having features of CPSF-100. Mol. Cell. Biol. 25:1489–1500
  • 10. Egloff S, et al. 2007. Serine-7 of the RNA polymerase II CTD is specifically required for snRNA gene expression. Science 318:1777–1779
  • 11. Egloff S, O'Reilly D, Murphy S. 2008. Expression of human snRNA genes from beginning to end. Biochem. Soc. Trans. 36:590–594
  • 12. Egloff S, et al. 2010. The integrator complex recognizes a new double mark on the RNA polymerase II carboxyl-terminal domain. J. Biol. Chem. 285:20564–20569
  • 13. Ezzeddine N, et al. 2011. A subset of Drosophila integrator proteins is essential for efficient U7 snRNA and spliceosomal snRNA 3′-end formation. Mol. Cell. Biol. 31:328–341
  • 14. He X, et al. 2003. Functional interactions between the transcription and mRNA 3′ end processing machineries mediated by Ssu72 and Sub1. Genes Dev. 17:1030–1042
  • 15. Hernandez N. 2001. Small nuclear RNA genes: a model system to study fundamental mechanisms of transcription. J. Biol. Chem. 276:26733–26736
  • 16. Kennedy SA, et al. 2009. Crystal structure of the HEAT domain from the Pre-mRNA processing factor Symplekin. J. Mol. Biol. 392:115–128
  • 17. Kolev NG, Steitz JA. 2005. Symplekin and multiple other polyadenylation factors participate in 3′-end maturation of histone mRNAs. Genes Dev. 19:2583–2592
  • 18. Kolev NG, Yario TA, Benson E, Steitz JA. 2008. Conserved motifs in both CPSF73 and CPSF100 are required to assemble the active endonuclease for histone mRNA 3′-end maturation. EMBO Rep. 9:1013–1018
  • 19. Kyburz A, Sadowski M, Dichtl B, Keller W. 2003. The role of the yeast cleavage and polyadenylation factor subunit Ydh1p/Cft2p in pre-mRNA 3′-end formation. Nucleic Acids Res. 31:3936–3945
  • 20. Li de la Sierra-Gallay I, Zig L, Jamalli A, Putzer H. 2008. Structural insights into the dual activity of RNase J. Nat. Struct. Mol. Biol. 15:206–212
  • 21. Mandel CR, Bai Y, Tong L. 2008. Protein factors in pre-mRNA 3′-end processing. Cell. Mol. Life Sci. 65:1099–1122
  • 22. Mandel CR, et al. 2006. Polyadenylation factor CPSF-73 is the pre-mRNA 3′-end-processing endonuclease. Nature 444:953–956
  • 23. Marzluff WF, Wagner EJ, Duronio RJ. 2008. Metabolism and regulation of canonical histone mRNAs: life without a poly(A) tail. Nat. Rev. Genet. 9:843–854
  • 24. Mathy N, et al. 2007. 5′-to-3′ exoribonuclease activity in bacteria: role of RNase J1 in rRNA maturation and 5′ stability of mRNA. Cell 129:681–692
  • 25. Mir-Montazeri B, Ammelburg M, Forouzan D, Lupas AN, Hartmann MD. 2011. Crystal structure of a dimeric archaeal cleavage and polyadenylation specificity factor. J. Struct. Biol. 173:191–195
  • 26. Mount SM, Salz HK. 2000. Pre-messenger RNA processing factors in the Drosophila genome. J. Cell Biol. 150:F37–F44
  • 27. Murthy KG, Manley JL. 1995. The 160-kD subunit of human cleavage-polyadenylation specificity factor coordinates pre-mRNA 3′-end formation. Genes Dev. 9:2672–2683
  • 28. Nishida Y, et al. 2010. Crystal structure of an archaeal cleavage and polyadenylation specificity factor subunit from Pyrococcus horikoshii. Proteins 78:2395–2398
  • 29. Ruepp MD, Schweingruber C, Kleinschmidt N, Schumperli D. 2011. Interactions of CstF-64, CstF-77, and symplekin: implications on localisation and function. Mol. Biol. Cell 22:91–104
  • 30. Ryan K, Calvo O, Manley JL. 2004. Evidence that polyadenylation factor CPSF-73 is the mRNA 3′ processing endonuclease. RNA 10:565–573
  • 31. Silva AP, et al. 2011. Structure and activity of a novel archaeal beta-CASP protein with N-terminal KH domains. Structure 19:622–632
  • 32. Sullivan KD, Steiniger M, Marzluff WF. 2009. A core complex of CPSF73, CPSF100, and Symplekin may form two different cleavage factors for processing of poly(A) and histone mRNAs. Mol. Cell 34:322–332
  • 33. Takagaki Y, Manley JL. 2000. Complex protein interactions within the human polyadenylation machinery identify a novel component. Mol. Cell. Biol. 20:1515–1525
  • 34. Wagner EJ, et al. 2007. A genome-wide RNA interference screen reveals that variant histones are necessary for replication-dependent histone pre-mRNA processing. Mol. Cell 28:692–699
  • 35. Wagner EJ, et al. 2006. Conserved zinc fingers mediate multiple functions of ZFP100, a U7snRNP associated protein. RNA 12:1206–1218
  • 36. Xiang K, et al. 2010. Crystal structure of the human symplekin-Ssu72-CTD phosphopeptide complex. Nature 467:729–733
  • 37. Zhelkovsky A, et al. 2006. The role of the Brr5/Ysh1 C-terminal domain and its homolog Syc1 in mRNA 3′-end processing in Saccharomyces cerevisiae. RNA 12:435–445

Articles from Molecular and Cellular Biology are provided here courtesy of Taylor & Francis
