SUMMARY
INTS11 and CPSF73 are metal-dependent endonucleases for Integrator and pre-mRNA 3′-end processing, respectively. Here, we show that the INTS11 binding partner BRAT1/CG7044, a factor important for neuronal fitness, stabilizes INTS11 in the cytoplasm and is required for Integrator function in the nucleus. Loss of BRAT1 in neural organoids leads to transcriptomic disruption and precocious expression of neurogenesis-driving transcription factors. The structures of the human INTS9-INTS11-BRAT1 and Drosophila IntS11-CG7044 complexes reveal that the conserved C-terminus of BRAT1/CG7044 is captured in the active site of INTS11, with a cysteine residue directly coordinating the metal ions. Inspired by these observations, we find that UBE3D is a binding partner for CPSF73; and UBE3D likely also uses a conserved cysteine residue to directly coordinate the active site metal ions. Our studies have revealed binding partners for INTS11 and CPSF73 that behave like cytoplasmic chaperones with a conserved impact on the nuclear functions of these enzymes.
eTOC
Lin et al. reveal BRAT1/CG7044 as a binding partner for the INTS11 endonuclease that behaves like a cytoplasmic chaperone and is required for Integrator function in the nucleus. UBE3D is also identified as a binding partner for the CPSF73 endonuclease in pre-mRNA 3′-end processing, with a similar mechanism of action.
INTRODUCTION
Transcription termination is an obligatory event in the RNA polymerase II (RNAPII) lifecycle (reviewed in1,2). Multiple mechanisms exist to ensure transcription termination occurs robustly at the proper location. Near the 3'-ends of protein-coding genes, 'canonical termination' is carried out by the Cleavage and Polyadenylation (CPA) machinery that simultaneously causes the removal of RNAPII from the DNA template2, while the replication-dependent histone pre-mRNA 3′-end processing machinery (U7 snRNP) produces a non-polyadenylated mRNA3. More recently, at the 5'-ends of protein-coding genes, it has been found that the Integrator Complex (INT) is responsible for executing promoter-proximal termination4-10, attenuating transcription. Despite significant differences in composition between INT and the CPA machinery, common mechanistic themes are apparent. Both complexes incorporate phosphatase activities to reverse the effect of elongation stimulatory kinases (e.g., P-TEFb)6,11-14 and utilize RNA endonucleases to cleave nascent RNA4,7,9,15-19, which generates access points for exonucleases such as XRN220 to promote termination. Impairment of either enzymatic activity results in loss of efficient termination by INT10 or the CPA machinery13.
The role of INT in promoter-proximal termination has only recently been revealed21. INT is a 17-subunit protein complex associated with RNAPII that is crucial for 3′-end processing of snRNAs22 and other noncoding RNAs23-26, as well as premature termination through cleavage of nascent mRNA transcripts on paused, promoter-proximal RNAPII4,5,21,27-29. Structural studies reveal that INT is composed of several modules21, including a backbone that contains INTS1/2/7/127,12,30 and a shoulder comprised of INTS5/831,32. These two modules form a scaffold upon which two enzymatic factors associate: the Integrator Phosphatase Module containing PP2A/C/INTS66,11 and the Integrator Cleavage Module (ICM) containing INTS4/9/1131,33,34. INTS9 and INTS11 are paralogs of CPSF100 and CPSF73 in the CPA machinery and the U7 snRNP35, with CPSF73 catalyzing the cleavage reaction in these machineries16,36,37, and INTS11 being the endonuclease for Integrator22. CPSF73 and INTS11 belong to the β-CASP family of nucleases38, and their active sites are between a metallo-β-lactamase and a β-CASP domain, with two metal ions critical for catalysis. CPSF73 and INTS11 exist in a closed, inactive conformation on their own. They transition to an open conformation for catalysis, with a deep canyon between the metallo-β-lactamase and β-CASP domains to bind the RNA substrate.
Integrator has also been reported to interact with other factors, including a collection of zinc finger proteins39,40, BRAT141, WDR7342, and the SOSS complex43-45. BRAT1 (CG7044 in Drosophila) was initially identified by its association with BRCA1 and ATM46, and mutations of BRAT1 have been linked to several neurological diseases47,48. Loss of BRAT1 disrupts UsnRNA 3′-end processing and premature termination at nascent mRNAs, which indicates that its expression is required for INT function41. The neurological disruptions observed in patients with BRAT1 mutations have parallels with symptoms associated with variants of WDR73 and INT subunits42,49-51, suggesting a common requirement during neuronal differentiation.
We show here that BRAT1 and CG7044 are primarily cytoplasmic proteins distinct from the nuclear INT. Using cryo-electron microscopy (cryo-EM), we determined the structures of Drosophila dIntS11-CG7044 and human INTS9-INTS11-BRAT1 complexes. The structures reveal, unexpectedly, that the conserved C-terminus of BRAT1/CG7044 is captured in the active site of INTS11, with INTS11 in a previously unobserved, semi-open state to accommodate this C-terminus. Moreover, the conserved penultimate Cys residue is directly coordinated to the metal ions in the active site. BRAT1 engagement to the INTS11 active site is critical to INT function as mutations of the BRAT1 C-terminus prevent rescue of INT loss of function in BRAT1 null cell lines. Further, loss of BRAT1 in neural organoids leads to transcriptomic disruption and precocious expression of neurogenesis-driving transcription factors. Our observations with INTS11 inspired us to search for a similar partner for CPSF73. We found that UBE3D can form a stable complex with CPSF73, and a long segment on the surface of UBE3D likely extends into the CPSF73 active site, remarkably also with a conserved Cys residue likely directly coordinating the metal ions. These studies demonstrate the unexpected existence of cytoplasmic partners for INTS11 and CPSF73 that may share conserved features in their binding to the cognate endonucleases.
RESULTS
BRAT1/CG7044 is a conserved, primarily cytoplasmic factor associated with INTS11
Structural and biochemical analyses of Integrator have revealed its modular nature12,15,31,33,34,52, but little is known about the biochemical, structural, and physiological relevance of Integrator-associated factors such as BRAT141. We generated total cell extracts from S2 cell lines stably expressing FLAG-tagged dIntS5, dIntS11, or empty vector as a control. As expected, using affinity purification followed by liquid chromatography-mass spectrometry (LC-MS), we observed strong enrichment of all 17 Drosophila Integrator subunits in both FLAG-dIntS5 (Fig. 1A) and FLAG-dIntS11 relative to control (Fig. 1B). Interestingly, CG7044 was only observed in FLAG-dIntS11 purification (Fig. 1B and Table S1). We extended this analysis to complexes associated with FLAG-dIntS1 or FLAG-dIntS8 and used LC-MS (Fig. S1A and Table S1) and Western blotting (Fig. S1B) to confirm that CG7044 is only associated with dIntS11. Finally, we generated a reciprocal S2 cell line stably expressing FLAG-CG7044 and compared associated complexes to those observed from the purification of FLAG-dIntS1, FLAG-dIntS5, or control. Using Western blot (Fig. 1C) and LC-MS analysis (Fig. S1C and Table S1), we found that CG7044 associates strongly with dIntS11 and, to a lesser extent, dIntS9, but no other Integrator subunits were observed.
Figure 1. BRAT1/CG7044 is a conserved, cytoplasmic factor associated with INTS11.
(A) Volcano plot of proteomics data analyzed from purifications using FLAG-dIntS5 relative to naive control. Integrator subunits are labeled in orange in the expanded view, Integrator associated proteins in red. Proteins in pink are statistically significant, and gray are not significant or unenriched.
(B) Volcano plot of proteomics data analyzed from purifications using FLAG-dIntS11 relative to naive control.
(C) Western blot analysis of input cellular extracts (left) and IP (right) from Drosophila S2 cells stably expressing FLAG-tagged proteins as indicated. IP conducted using anti-FLAG affinity resin was normalized to FLAG signals in each IP.
(D) Western blot of indicated endogenous proteins from cellular fractionation of HEK293T cells. The nuclear fraction is identified with Fibrillarin (Fib) and cytoplasmic fraction with Heat shock protein 90 (Hsp90).
(E) Similar to panel D, except Drosophila S2 cells were used to show conservation of localization. The nuclear fraction is identified with Lamin and cytoplasmic fraction with GAPDH.
(F) Heatmap derived from immunoprecipitation (IP) LC-MS analysis of FLAG IP from HEK293T cells expressing the indicated FLAG-tagged proteins. Intensity bars reflect MS intensity normalized to the immunoprecipitated FLAG protein in each column and then quantified relative to control. The actual BRAT1 level in the nucleus (right column) is approximately 5-fold lower than the INTS11 level (middle column). They have similar colors in the plot due to the normalization. Green: nuclear; purple: cytoplasmic. Control cells lack any exogenously expressed FLAG-tagged protein.
(G) Heatmap derived from IP LC-MS analysis of FLAG IP of Drosophila S2 cells expressing indicated FLAG-tagged proteins.
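The bait normalization described in the legend for panel F can be illustrated with a minimal Python sketch. It shows only why two columns can receive similar colors even when the absolute bait levels differ about 5-fold. The function name `normalize_to_bait` and all intensity values are hypothetical and are not taken from Table S1, and the subsequent "relative to control" step is omitted because its exact form is not specified here.

```python
def normalize_to_bait(intensities, bait):
    """Divide every MS intensity in one IP column by the bait intensity.

    `intensities` maps protein name -> raw MS intensity (arbitrary units).
    `bait` is the FLAG-tagged protein that was immunoprecipitated.
    """
    bait_signal = intensities[bait]
    if bait_signal <= 0:
        raise ValueError("bait intensity must be positive")
    return {protein: value / bait_signal for protein, value in intensities.items()}


# Invented numbers: the BRAT1 bait is 5-fold less abundant than the INTS11 bait.
nuclear_ints11_ip = {"INTS11": 5000.0, "INTS9": 2500.0}
nuclear_brat1_ip = {"BRAT1": 1000.0, "INTS9": 500.0}

ints11_column = normalize_to_bait(nuclear_ints11_ip, "INTS11")
brat1_column = normalize_to_bait(nuclear_brat1_ip, "BRAT1")

# After normalization both baits equal 1.0, so the 5-fold difference
# in absolute abundance is no longer visible in the heatmap colors.
print(ints11_column)
print(brat1_column)
```

Under these assumed values, INTS9 is scored identically in both columns even though far less material was recovered with BRAT1, which is the effect the legend cautions about.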
BRAT1 was initially described as a mostly cytoplasmic protein capable of shuttling into the nucleus53, but this has yet to be assessed in the context of Integrator localization and function, which is thought to be nuclear. To address this, we first fractionated human HEK293T cells into nuclear and cytoplasmic extracts (Fig. 1D). As expected, most Integrator subunits display predominantly nuclear localization except INTS11, which is observed in both compartments, consistent with previous reports54. In contrast, endogenous BRAT1 is primarily cytoplasmic, suggesting its function may be distinct from that of the nuclear Integrator. The predominantly cytoplasmic localization of BRAT1 was confirmed using indirect immunofluorescence (Fig. S1D). Our observations on INTS11 and BRAT1 in human cells are conserved in Drosophila cells (Fig. 1E).
Given the cellular localization of BRAT1/CG7044 and INTS11 within human and fly cells, we used proteomics analysis to identify associated proteins in either compartment. To that end, we affinity purified complexes from cytoplasmic or nuclear extracts from HEK293T and S2 cells stably expressing exogenous FLAG-tagged INTS11, BRAT1/CG7044, or control. In the cytoplasm, human FLAG-INTS11 and FLAG-BRAT1 interact with each other, INTS9, and another Integrator-associated protein, WDR73 (Fig. 1F and Table S1). In Drosophila, we found that FLAG-dIntS11 and FLAG-CG7044 also interact with each other, and with a smaller amount of dIntS9, in the cytoplasm (Fig. 1G and Table S1). Notably, no other protein was significantly enriched in the fly dIntS11-CG7044 complex, and no readily identifiable ortholog of WDR73 exists in Drosophila, suggesting it may be a vertebrate-specific interaction. In the nucleus, both human and fly FLAG-INTS11 are observed to interact with the entire INT, several members of the Pre-Elongation Complex, and BRAT1. In contrast, FLAG-tagged BRAT1/CG7044 in the nucleus was still only associated with INTS9 and INTS11 (Figs. 1F, G). Altogether, these results indicate that a distinct pool of INTS9 and INTS11 is associated with BRAT1/CG7044 primarily in the cytoplasm, raising the intriguing question of how this cytoplasmic complex contributes to the functions of Integrator in the nucleus.
Structure of the dIntS11-CG7044 complex
To gain molecular insight into the Drosophila dIntS11-CG7044 association, we co-expressed and purified their complex (Fig. 2A). We also co-expressed dIntS9-dIntS11-CG7044 (Figs. 1E, G) but found that dIntS9 was present at sub-stoichiometric levels after purification even though its expression level was very high (Fig. 2B), consistent with the immunoprecipitation data.
Figure 2. The Overall Structure of the Drosophila dIntS11-CG7044 complex.
(A) Gel filtration profile of Drosophila dIntS11-CG7044 complex. Inset: SDS-PAGE gel of the complex.
(B) Ion exchange chromatogram of Drosophila dIntS9-dIntS11-CG7044 complex. Inset: SDS-PAGE gel of fractions from peaks 1 and 2 in the chromatogram and eluate from the nickel affinity column (Ni).
(C) Domain organizations of Drosophila dIntS11 and CG7044. Residues observed in the structure are indicated with black lines. MβL: metallo-β-lactamase; β-CASP: metallo-β-lactamase-associated CPSF, Artemis, SNM1/PSO2; CTD: C-terminal domain; CTE: C-terminal extension.
(D) The overall structure of the dIntS11-CG7044 complex. Domains are colored as in panel C. The red arrowhead indicates the C-terminus of CG7044 in the active site of dIntS11. CG7044 residues with >50 Å2 buried surface area in the interface with dIntS11 are shown in stick models and labeled.
(E) The overall structure after 180° rotation around the vertical axis. The back face of dIntS11 and the lasso of CG7044 are visible.
(F-G) The structure of CG7044 alone, in the same view as panels D and E. The structure figures were produced with PyMOL (www.pymol.org) unless indicated otherwise.
We determined the structure of the dIntS11-CG7044 complex at 3.54 Å resolution by cryo-EM (Figs. S2A-D, Table 1). Most CG7044 and dIntS11 residues could be located, with good side chain density (Fig. 2C). The structure reveals that CG7044 primarily forms a large Arm/HEAT domain (Figs. 2D-G), with the pairs of anti-parallel helices arranged in the shape of a horseshoe that wraps around dIntS11 (Figs. 2D, E). The connection between the two helices in the last repeat is exceptionally long, containing residues 830-920 and forming a ‘lasso’ structure (Fig. 2C). The C-terminal extension (CTE) of CG7044 beyond the Arm/HEAT domain traverses the open end of the horseshoe, contacting the N-terminal region of CG7044 (Figs. 2D, F). The IntS11-CG7044 interface is extensive, with ~3,700 Å2 buried surface area for each protein, consistent with the stability of this complex. The Arm/HEAT domain of CG7044 contributes 1,750 Å2 to the buried surface area.
Table 1.
Cryo-EM data collection, structure refinement and validation statistics
| | Drosophila IntS11-CG7044 complex (PDB 8UIC) | Human INTS9-INTS11-BRAT1 complex (PDB 8UIB) |
---|---|---|
Data collection and processing | ||
Magnification | 81,000 | 105,000 |
Voltage (kV) | 300 | 300 |
Electron exposure (e−/Å2) | 51 | 58.2 |
Defocus range (μm) | −1 to −2.5 | −0.8 to −1.3 |
Pixel size (Å) | 1.083 | 0.83 |
Symmetry imposed | C1 | C1 |
Image stacks (no.) | 1,603 | 4,030 |
Initial particle images (no.) | 2,028,174 | 1,112,411 |
Final particle images (no.) | 406,222 | 185,172 |
Map resolution (Å) | 3.54 | 3.21 |
FSC threshold | 0.143 | 0.143 |
Map sharpening B-factor (Å2) | −171.7 | −126.1 |
Refinement | ||
Number of protein residues | 1,227 | 1,534 |
Number of metal ions | 2 | 2 |
Number of atoms | 9,968 | 11,876 |
R.m.s. deviations | ||
Bond lengths (Å) | 0.006 | 0.006 |
Bond angles (°) | 0.865 | 0.863 |
PDB validation | ||
Clash score | 6.78 | 9.84 |
Poor rotamers (%) | 0.00 | 0.00 |
Ramachandran plot | ||
Favored (%) | 91.0 | 93.0 |
Allowed (%) | 9.0 | 7.0 |
Disallowed (%) | 0.0 | 0.0 |
The conserved CG7044 DCY motif is captured in the dIntS11 active site
The structure unexpectedly reveals that the C-terminus of CG7044 is inserted into the active site region of dIntS11 (Fig. 3A). The last three residues of CG7044, Asp-Cys-Tyr974 (DCY), are conserved among all known CG7044/BRAT1 homologs (Fig. 3B), further emphasizing their importance. This DCY motif resides in a deep pocket at the interface between the dIntS11 metallo-β-lactamase and β-CASP domains. The side chain of Asp972 is hydrogen-bonded to His392 of dIntS11 (Fig. 3C), the general acid for the nuclease reaction16,37. The side chain of Cys973 is directly coordinated to both metal ions in the active site, which would not allow the coordination by the scissile phosphate of the RNA substrate37. Finally, the carboxylate group at the C-terminus of CG7044 has ion-pair interactions with Arg244 in the β-CASP domain. There is no space to accommodate additional amino acid residues following Tyr974 (Figs. 3A, C).
Figure 3. The Conserved C-terminus of CG7044 is captured in the active site of dIntS11.
(A) Residues 972-974 at the C-terminus of CG7044 (violet) are located in a pocket at the interface between the metallo-β-lactamase and β-CASP domains of dIntS11.
(B) Alignment of the C-terminal sequences of selected CG7044 and BRAT1 homologs. Conserved residues are highlighted in red. Dm: D. melanogaster; Cq: Culex quinquefasciatus; Aa: Aedes aegypti; Hs: Homo sapiens (human); Mm: Mus musculus (mouse); Xt: Xenopus tropicalis (frog); Dr: Danio rerio (zebrafish).
(C) Detailed interactions between the C-terminus of CG7044 and dIntS11 active site. Hydrogen-bonding interactions are indicated with the dashed lines in red.
(D) Overlay of the binding mode of the C-terminus of CG7044 (violet) with that of the histone pre-mRNA substrate in human CPSF73 (orange).
(E) Overlay of the semi-open state of dIntS11 (in color) and the closed state of dIntS11 in the ICM (gray). The β-CASP domain rotates by 6.3° to create the pocket for binding the C-terminus of CG7044. The asterisks indicate the location of the two helices in the CTE.
Based on these observations, the C-terminal residues of CG7044 would directly compete against the RNA substrate. In support of this, the binding mode of DCY extensively overlaps with that of the RNA substrate observed in the structure of CPSF73 (Fig. 3D)37. Overall, the structural observations demonstrate that CG7044 forms a tight complex with dIntS11 and can inhibit its nuclease activity.
dIntS11 is in a semi-open state in complex with CG7044
We and others have previously established that INTS11 in the ICM alone is in an inactive, closed state12,31,34. When we compared the structure of dIntS11 in the CG7044 complex with that in the Drosophila ICM34, we found that there is a substantial conformational difference for the β-CASP domain, corresponding to a rotation of 6.3° relative to the metallo-β-lactamase domain (Fig. 3E). This change is distinct from the open-closed transitions of CPSF7337 and human INTS117,15,34, which involve a 17° rotation of the β-CASP domain (Fig. S3A), and it does not create a canyon for RNA binding. In the ICM, the binding pocket for the C-terminus of CG7044 does not exist due to the closed state of dIntS11 (Fig. S3B, C). Therefore, dIntS11 assumes a new state in the CG7044 complex, and we will refer to it as a ‘semi-open’ state. In this state, there is no canyon between the metallo-β-lactamase and β-CASP domains that would allow the RNA substrate to reach the active site (Fig. S3D, E).
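A domain rotation such as the 6.3° and 17° values quoted above can be recovered from a rotation matrix using the standard relation θ = arccos((tr R − 1)/2). The sketch below is illustrative only: it assumes that a 3×3 rotation matrix relating the two β-CASP domain orientations is already available (for example, from a superposition of the two structures on the metallo-β-lactamase domain), and it does not describe the software or procedure actually used for the measurements reported here. The helper `rotation_about_z` is hypothetical and only generates test matrices.

```python
import math


def rotation_angle_deg(rotation):
    """Return the rotation angle (degrees) encoded by a 3x3 rotation matrix.

    Uses trace(R) = 1 + 2*cos(theta); the cosine is clamped to [-1, 1]
    to guard against small floating-point overshoot.
    """
    trace = rotation[0][0] + rotation[1][1] + rotation[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(angle_deg):
    """Hypothetical helper: build a rotation matrix about the z axis."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Round-trip check with the two angles mentioned in the text.
for angle in (6.3, 17.0):
    recovered = rotation_angle_deg(rotation_about_z(angle))
    print(f"input {angle:.1f} deg, recovered {recovered:.1f} deg")
```

The angle obtained this way depends on which atoms are used for the initial superposition, so values from different alignment choices are not directly comparable.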
The CG7044 lasso interacts with the β-CASP domain of dIntS11
The extended connection between the two helices in the last helical repeat of CG7044 forms a 'lasso' structure, with two helices at the tip of this lasso being projected more than 30 Å away from the body of CG7044 (Figs. 2E, G). The CG7044 lasso has extensive, predominantly hydrophobic interactions with the β-CASP domain of dIntS11. Residues in the two helices at the tip of the lasso (αL1 and αL2) contact the 'back' face of the β-CASP domain (Figs. 2E, S4A, B). Most residues in the two helices interacting with dIntS11 are highly conserved among insect CG7044 homologs (Fig. S4C). On the other hand, the linkers from these two helices to the Arm/HEAT domain are poorly conserved, suggesting that this structural feature is likely highly dynamic.
There is a significant conformational change for helices αC and αD of the dIntS11 β-CASP domain in the CG7044 complex (Fig. S4D). Compared to the Drosophila ICM structure, the αD helix is disordered in the CG7044 complex, and the αL2 helix of the lasso takes up the position of this helix. The αL1 helix of the lasso clashes with dIntS4 of the ICM, and a segment in the β-CASP domain of dIntS9 in the ICM clashes with CG7044. Therefore, the dIntS11-CG7044 complex is incompatible with the formation of the ICM, indicating that dIntS11 binding to CG7044 is mutually exclusive with its binding to dIntS4/dIntS9.
Structure of the human INTS9-INTS11-BRAT1 complex
Our biochemical studies with the Drosophila IntS11-CG7044 complex suggest that a human INTS11-BRAT1 complex is also likely to exist. While we were able to purify this complex, the expression levels of the two proteins were very low even though they were co-expressed (Fig. 4A). However, when we included INTS9 in the co-expression, it led to a significant improvement in the expression levels of the INTS9-INTS11-BRAT1 complex (Fig. 4B), consistent with pulldown assays (Fig. 1F). We determined the structure of this ternary INTS9-INTS11-BRAT1 complex at 3.2 Å resolution by cryo-EM (Figs. 4C-E, Figs. S2E-H, Table 1).
Figure 4. The Overall Structure of the Human INTS9-INTS11-BRAT1 Complex.
(A) Gel filtration profile of the human INTS11-BRAT1 complex. The expression levels were very low.
(B) Gel filtration profile of human INTS9-INTS11-BRAT1 complex.
(C) Domain organizations of human INTS9, INTS11 and BRAT1. The homologous domains in INTS9 are given slightly darker colors compared to INTS11. The segment in BRAT1 equivalent to the lasso in CG7044 is shown in magenta, but it is disordered.
(D) The overall structure of the human INTS9-INTS11-BRAT1 complex.
(E) The INTS9-INTS11-BRAT1 complex viewed after 180° rotation around the vertical axis. The expected position of helix αD of the β-CASP domain is indicated, but it is disordered in this structure.
(F) Overlay of the binding mode of the C-terminus of BRAT1 (violet) in human INTS11 (cyan and yellow) with that of CG7044 in Drosophila dIntS11 (gray).
(G) Overlay of the structures of the ICM and the INTS9-INTS11-BRAT1 complex, based on the metallo-β-lactamase domain of INTS11. INTS11: cyan; INTS9 in the ICM: dark cyan; INTS9 in the INTS9-INTS11-BRAT1 complex: gray; INTS4: light blue; BRAT1: pink. The red oval indicates clashes between INTS9 in the INTS9-INTS11-BRAT1 complex and INTS4 in the ICM, while the blue arrow indicates large differences in the position and orientation of INTS9 between the two complexes. See also Movie S1.
The overall structure of the human INTS11-BRAT1 portion of the ternary complex is similar to that of the Drosophila dIntS11-CG7044 complex (Fig. S5A). Human INTS11 is also in a semi-open state, and the BRAT1 C-terminus is captured in the active site region of INTS11 with a similar binding mode (Fig. 4F). With INTS11 in superposition, a rotation of 4° is needed to align the structure of BRAT1 with that of CG7044 (Fig. S5A), with substantial differences for some of the Arm/HEAT repeats (Fig. S5B). The lasso is not present in human BRAT1, as that loop is much shorter (residues 733-772) and is disordered (Fig. 4C). Therefore, the lasso may be a feature unique to CG7044 homologs in insects. Nonetheless, helix αD of the INTS11 β-CASP domain is also disordered in this complex (Fig. 4E), suggesting that the disordering of this helix is a consequence of the binding of BRAT1/CG7044 rather than the presence of the lasso.
INTS9 has weaker interactions with BRAT1
INTS9 is bound to the side of BRAT1 and has essentially no direct contact with INTS11 (buried surface area of <60 Å2) in this complex (Figs. 4D, E, Fig. S5C). The β-CASP domain of INTS9 is located in the interface with BRAT1. In contrast, much of its metallo-β-lactamase domain is not in the interface (Fig. S5C). The INTS9-BRAT1 interface is much less extensive than the INTS11-BRAT1 interface, suggesting that the INTS9-BRAT1 complex is less stable. A large, primarily hydrophobic interface patch involves helices αA and αB of the INTS9 β-CASP domain contacting repeats 10 and 11 of BRAT1 (Fig. S5D).
Compared to the structure of INTS9 in the human ICM31, a large conformational change is observed for helix αA and the neighboring strand βI of the β-CASP domain (Fig. S5E). More importantly, the organization of INTS9 and INTS11 in the complex with BRAT1 is incompatible with the observed organization of the ICM. This can be visualized by overlaying the metallo-β-lactamase domains of INTS11 in the ICM and the INTS9-INTS11-BRAT1 complex (Fig. 4G, Movie S1). In this overlay, there are substantial differences in the position and orientation of INTS9 between the two complexes. INTS9 in the INTS9-INTS11-BRAT1 complex has essentially no contacts with INTS11, and it clashes with INTS4 in the ICM. This is supported by our observation that BRAT1 only pulls down INTS11 and INTS9 in both the nucleus and the cytoplasm (Fig. 1F).
We investigated why Drosophila dIntS9 has a weaker association with the dIntS11-CG7044 complex (Fig. 2B) by examining how Drosophila dIntS9 and CG7044 would fit in the observed human INTS9-BRAT1 interface. Residues in this interface are generally weakly conserved among their homologs. In particular, Ser283 in human INTS9 is replaced by Met282 in Drosophila dIntS9 (Fig. S5F), which would clash with CG7044 and destabilize the dIntS9-CG7044 complex. Based on these structural analyses, we generated the S283M single mutant as well as the E280R/S283M/L287A/R290A quadruple mutant of human INTS9 and found that neither disrupted the formation of the ternary complex based on gel filtration (Fig. S5G). However, the CTDs of INTS9 and INTS11 are likely still associated in the mutants31,34, which would give an apparent ternary complex on gel filtration. We then examined the E280R/S283M/L287A/R290A sample by cryo-EM and found that the particles look like dIntS11-CG7044 hetero-dimers rather than INTS9-INTS11-BRAT1 hetero-trimers (Fig. S5H), demonstrating that the mutations have disrupted the INTS9-BRAT1 interaction.
Functional Relevance of CG7044/BRAT1-containing Complexes
To examine the importance of BRAT1/CG7044 in human and Drosophila cells, we eliminated or reduced its expression using CRISPR/Cas9 or RNA interference, respectively. To that end, we generated three independent BRAT1 knockout human HCT116 cell lines and confirmed complete depletion of BRAT1 by Western blot analysis and indirect immunofluorescence (Figs. 5A, B). Interestingly, all three clonal lines displayed no overt growth defects but demonstrated reduced levels of both INTS9 and INTS11 proteins with only a marginal impact on other Integrator subunits (Fig. 5A). This indicates that the INTS9-INTS11-BRAT1 complex is important for the specific stabilization of INTS9/INTS11. Fractionation of the nuclear and cytoplasmic components of the BRAT1 knockout HCT116 cells revealed that the residual INTS11 still localizes to the nucleus, indicating that BRAT1 is not required for transport of INTS11 (Fig. 5C). RNAi of CG7044 in Drosophila S2 cells also caused co-depletion of dIntS11 but did not overtly affect dIntS9 expression (Fig. 5D). These results are consistent with human INTS9 and INTS11 associating with BRAT1 but only dIntS11 forming a stable complex with CG704455.
Figure 5. BRAT1 stabilizes INTS11 in the cytoplasm and is required for Integrator function.
(A) Western blot probing for HCT116 control line (Con.), a pool cell line derived from transfection of HCT116 cells with sgRNA targeting BRAT1 (pool), and 3 independent HCT116 BRAT1 knockout clonal lines (BRAT1 KO-1, -2, -3).
(B) Immunofluorescence of HCT116 cells (WT) and HCT116 BRAT1 knockout line with nuclei labeled with DAPI (blue) and BRAT1 labeled with TX-Red (red). Scale bar indicates 12 μm.
(C) Western blot analysis of cytoplasmic and nuclear fractions of wild-type HCT116 cells (Con.) and the 3 BRAT1 knockout lines (BRAT1 KO-1, -2, -3).
(D) Western blot of lysates from Drosophila S2 cells treated with dsRNA targeting either dIntS11, CG7044, or a non-specific target (Con.).
(E) RT-qPCR specific for misprocessed U1snRNA (U1 misp) or U4snRNA (U4 misp). Data are quantified from triplicate RNA isolations, normalized to 7SK RNA and plotted as fold change relative to control.
(F) Top panel is a schematic of the U7-GFP reporter that produces low levels of GFP protein when Integrator is functional but produces significant GFP expression due to transcriptional readthrough when Integrator function is diminished. Bottom panel, Western blot analysis of cell lysates derived from HCT116 cells (Con.) transfected with the U7-GFP reporter or from BRAT1 KO-1 cells also transfected with the reporter and with BRAT1 cDNA. Note that the BRAT1 cDNA were either wild-type (WT), had the DCY changed to AAA (AAA), lacked the DCY (ΔDCY), or had the CTE deleted (ΔCTE).
(G) Western blot analysis of HCT116 cells (Con.) or BRAT1 KO-1 transfected with the U7-GFP reporter or the U7-GFP reporter and INTS11.
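The legend for panel E describes RT-qPCR signals normalized to 7SK RNA and plotted as fold change relative to control. One common way to compute such a value is the ΔΔCt method, sketched below in Python. This is an assumption for illustration: the text does not state which quantification method was used, the calculation assumes roughly 100% amplification efficiency for both amplicons, and the function name and Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the delta-delta-Ct method.

    ct_target_*: Ct of the amplicon of interest (e.g. misprocessed U1 snRNA).
    ct_ref_*:    Ct of the reference RNA (e.g. 7SK) in the same sample.
    Assumes the amount of product doubles every cycle for both amplicons.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Invented Ct values: the target appears 3 cycles earlier in the knockout
# than in the control, while the reference RNA is unchanged.
fold = fold_change_ddct(ct_target_sample=22.0, ct_ref_sample=15.0,
                        ct_target_control=25.0, ct_ref_control=15.0)
print(fold)  # 8.0
```

With triplicate RNA isolations, the calculation would typically be applied per replicate before averaging, although other conventions exist.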
Next, we examined the functional consequences of BRAT1 knockout and found a striking elevation of misprocessed UsnRNAs (Fig. 5E), a hallmark of the INTS11-depletion phenotype and Integrator dysfunction. This is consistent with the overall reduction in INTS9/INTS11 accumulation (Fig. 5A) and with previous reports41. The BRAT1 knockout lines created an optimal system to re-express BRAT1 or BRAT1 mutant proteins to test for functional rescue. Specifically, we generated BRAT1 constructs that are either wild-type (WT), have the DCY motif changed to alanines (AAA), have the DCY motif removed (ΔDCY), or lack the C-terminal extension including the αC1 and αC2 helices (ΔCTE). To assess the ability of BRAT1 constructs to rescue the Integrator-depletion phenotype, we utilized a previously described fluorescence-based reporter system where the U7snRNA promoter and coding region are placed upstream of a GFP open reading frame (Fig. 5F)52,56,57. As expected, transfection of this U7-GFP reporter into control HCT116 cells yielded little GFP expression, whereas a high amount of GFP was produced in BRAT1-null cells (Fig. 5F, lanes 1 vs 2). When BRAT1-WT was re-expressed in BRAT1-null cells, we observed low levels of GFP expression, comparable to control HCT116 cells, indicating a complete rescue of the INTS11-depletion phenotype (Fig. 5F, lanes 2 vs 3). In contrast, re-expression of the AAA, ΔDCY, or ΔCTE mutant failed to fully restore U7snRNA processing, resulting in GFP production. This indicates that the DCY motif is required for BRAT1 function with INTS11 (Fig. 5F). Finally, we also overexpressed INTS11 in BRAT1 knockout cells and found that it can almost fully rescue Integrator dysfunction, indicating that destabilization of INTS11 is the major driving factor for the observed reduction in nuclear Integrator function (Fig. 5G). These data indicate that BRAT1/CG7044 stabilizes INTS11 in the cytoplasm, and the INTS11-BRAT1 complex is required for nuclear Integrator function.
Loss of BRAT1 leads to precocious expression of neurogenesis-driving factors
BRAT1 mutations have been linked to a spectrum of neurodevelopmental disorders associated with epilepsy and seizures41,58-60. To further study the role of BRAT1 in neural differentiation, we generated syngeneic BRAT1 knockout (ΔBRAT1) and control cell lines from human embryonic H7 (WA07) stem cells. Two independent cell lines were generated for both control and ΔBRAT1 hESC lines. ΔBRAT1 ESCs do not exhibit growth defects and exhibit normal ESC morphology and pluripotency marker expression (Fig. S6A). To examine the potential effect of BRAT1 loss on the generation of interneuron precursors, we further differentiated the ESCs into ventral neural organoids (NOs) (Figs. 6A, S6A-C)61. Similar to the HCT116 cells that lacked BRAT1, the ΔBRAT1 ESCs and NOs both display significant UsnRNA misprocessing, demonstrating that at least at the level of UsnRNA, there is an apparent Integrator deficiency phenotype (Fig. 6B).
Figure 6. BRAT1 is required for transcriptomic integrity during neural differentiation.
(A) Western blot of lysates derived from human embryonic stem cells (hESCs) or neural organoids derived from hESCs. Expression of BRAT1 was analyzed in two independent clones each of syngeneic control (Con.) or BRAT1 knockout (ΔBRAT1) cell lines. Lanes 1-4: undifferentiated embryonic stem cells. Lanes 5-8: Neural organoids derived from control and ΔBRAT1 hESCs.
(B) Results from RT-qPCR from cells shown in panel A for U1 snRNA or U2 snRNA misprocessing. Measurements were conducted in biological replicates (n = 3) with error bars representing standard deviation.
(C) Principal component analysis plot generated from DESeq2 analysis of RNA-seq on ΔBRAT1 versus Control neural organoids, corresponding to samples from lanes 5-8 of panel A. Each cell line was tested as three independent biological replicates consisting of 20 pooled organoids each.
(D) Volcano plot derived from RNA-seq data collected from the neural organoids shown in panel A. Each cell line was tested as three independent biological replicates consisting of 20 pooled organoids each (see also 6C). The significance of differential gene expression (DGE) was set at >4-fold and adjusted p<0.01.
(E) Gene tracks of selected genes from the RNA-seq data in panel D. DLX1 (Distal-less homeobox 1) and SHH (Sonic hedgehog signaling molecule) were upregulated and Lsm10 (U7 small nuclear RNA associated Sm-like protein) was unchanged upon BRAT1 knockout.
(F) Gene Ontology analysis of the differential gene expression in panel D for both Biological Processes and Cellular Components.
To determine the impact of BRAT1 depletion on gene expression in hESCs and NOs, we performed RNA-sequencing (RNA-seq). We found that ΔBRAT1 hESCs did not distinctly cluster from control hESCs using principal component analysis (PCA) of the RNA-seq, and consistently, only a limited number of genes exhibited differential expression (Figs. S6C, D). In contrast, PCA of the NO RNA-seq data revealed tight clustering of the two ΔBRAT1 clonal lines, separate from the control lines (Fig. 6C). Further, unlike hESCs, the ΔBRAT1 neural organoids displayed a much more disrupted transcriptome (Figs. 6D, E). We validated several of the observed changes using RT-qPCR (Fig. S6E). Gene Ontology analysis of the downregulated genes did not identify any processes above the statistical cutoff for significance. In contrast, the upregulated genes were enriched in many pathways involved in neurogenesis, axon formation, and neurodifferentiation (Fig. 6F). Further, many top upregulated genes are considered master-regulator transcription factors (DLX1, DLX2, DLX5, NKX2-1) and are implicated in brain patterning and development62-68. Altogether, these data reveal that loss of BRAT1 disrupts gene expression patterns during neural differentiation.
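As an illustration of the cutoffs stated in the Figure 6D legend (>4-fold change, adjusted p<0.01), the following minimal sketch shows how a gene could be assigned to the upregulated, downregulated, or non-significant class from a DESeq2-style log2 fold change and adjusted p-value. The function name `classify` and the example values are hypothetical and are not taken from the study's data or analysis scripts.

```python
import math

# Cutoffs as stated in the Figure 6D legend.
FOLD_CUTOFF = 4.0      # >4-fold change, i.e. |log2FC| > 2
PADJ_CUTOFF = 0.01     # adjusted p < 0.01


def classify(log2_fold_change, padj):
    """Return 'up', 'down', or 'ns' for one gene (hypothetical helper).

    A missing adjusted p-value (None or NaN) is treated as
    non-significant in this sketch.
    """
    if padj is None or math.isnan(padj) or padj >= PADJ_CUTOFF:
        return "ns"
    log2_cutoff = math.log2(FOLD_CUTOFF)
    if log2_fold_change > log2_cutoff:
        return "up"
    if log2_fold_change < -log2_cutoff:
        return "down"
    return "ns"


# Hypothetical values, for illustration only.
examples = {"geneA": (3.0, 1e-5), "geneB": (-2.5, 1e-3), "geneC": (1.5, 1e-6)}
calls = {gene: classify(lfc, padj) for gene, (lfc, padj) in examples.items()}
```

Both conditions must be met, so a gene with a large fold change but an adjusted p-value at or above 0.01 is not counted as differentially expressed.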
UBE3D may be a ‘BRAT1-like’ Binding Partner for CPSF73
Inspired by our observations on the INTS11-BRAT1 complex, we wondered whether the INTS11 paralog, CPSF73, could have a binding partner with a similar interaction mechanism. To explore this possibility, we first used the BRAT1 structure to search against all the human protein structures predicted by AlphaFold69, but failed to identify a structurally similar protein using Dali70. We then screened through the list of binding partners of human CPSF73 in the BioGRID database71, using AlphaFold-Multimer72 to predict the structure of its complex with the CPSF73-CPSF100 heterodimer. This led to the identification of a complex between CPSF73 and UBE3D (ubiquitin protein ligase 3D). UBE3D primarily interacts with the metallo-β-lactamase and β-CASP domains of CPSF73 and has essentially no contacts with CPSF100. A prediction for the CPSF73-UBE3D binary complex produced a similar model (Fig. 7A, Fig. S7A), as well as another model where there are also some contacts between UBE3D and the CTDs of CPSF73 (Fig. S7B). Most interestingly, both models suggest that Cys144 at the tip of a long segment (residues 129-159) that extends far from the body of UBE3D is positioned in the active site of CPSF73 and can directly coordinate one or both metal ions (Fig. 7B). CPSF73 is in a fully open conformation, with a canyon between the metallo-β-lactamase and β-CASP domains to accommodate this segment from UBE3D (Fig. S7C). Cys144 and the following His145 are conserved among UBE3D homologs, including the yeast homolog Ipa1 (Fig. 7C), suggesting their functional importance.
Figure 7. UBE3D is a binding partner of CPSF73 in the cytoplasm.
(A) Structure of the human CPSF73-UBE3D complex predicted by AlphaFold-Multimer. UBE3D is shown in marine color.
(B) Detailed interactions between UBE3D and the active site region of CPSF73.
(C) Sequence alignment of selected UBE3D/Ipa1 homologs for the segment in contact with CPSF73 active site. Conserved residues are highlighted in red. Sc: S. cerevisiae; Kl: K. lactis.
(D) Fractionation of HEK293T cells and Western blot analysis indicating primarily cytoplasmic localization of endogenous UBE3D.
(E) Western blot analysis of co-immunoprecipitation with anti-FLAG affinity resin from cell lysates derived from HEK293T cells co-transfected with empty vector, or FLAG-tagged UBE3D wild-type and mutant constructs as well as HA-CPSF73.
(F) Gel filtration profiles of co-expressed wild-type K. lactis Ysh1-Ipa1 complex (green), Ysh1-Ipa1 mutant replacing residues 141-144 with Ala (141-144A, blue), and Ysh1-Ipa1 mutant missing the segment in the CPSF73 active site (red). Inset: SDS gels of the samples.
(G) A model for the role of BRAT1 in Integrator function.
To gain cellular evidence that UBE3D binds CPSF73, we first probed the localization of several CPA factors and UBE3D. As was the case for BRAT1, UBE3D is primarily cytoplasmic, whereas CPSF73 and CPSF100 were more enriched in the nucleus but still showed cytoplasmic localization (Fig. 7D). In addition, we verified the importance of the long segment of UBE3D for CPSF73 binding. We generated three mutants, deleting residues 131-156 or 135-154 of the segment or replacing the WCCH sequence (Fig. 7C) at the tip of the segment with alanines (142-145A). We observed that all UBE3D FLAG-tagged proteins were expressed at comparable levels, but all three mutants pulled down significantly less HA-CPSF73 (Fig. 7E). We co-expressed the yeast Kluyveromyces lactis Ysh1 (CPSF73 homolog), Ydh1 (CPSF100 homolog), Pta1 (symplekin homolog) and Ipa1 in insect cells and successfully purified a complex containing Ysh1 and Ipa1 (Fig. S7D). AlphaFold prediction of the Ysh1-Ipa1 complex showed similar overall structures and conserved modes of interaction in the active site region (Fig. S7E, F). We generated equivalent mutations in K. lactis Ipa1, and showed that the mutants could no longer co-purify with Ysh1 after co-expression in insect cells (Fig. 7F). Altogether, these results indicate that UBE3D associates only with CPSF73 in the CPA machinery and that the long segment is important for the binding.
DISCUSSION
Here, we have revealed binding partners that stabilize INTS11 and CPSF73 in the cytoplasm, and these interactions are important for their functions in the nucleus. Therefore, the partners behave like chaperones of these endonucleases in the cytoplasm. Our studies also suggest striking similarities in the mechanisms of these chaperone partners. They both occupy the active site of their target endonuclease and present a conserved cysteine residue to directly coordinate the metal ions. The levels of the endonucleases are greatly reduced in the absence of the partners, leading to RNA misprocessing and other functional defects. This suggests that a chaperone-mediated regulation of endonuclease activity and subcellular localization is a conserved mechanism by which the function of critical RNA processing machinery is controlled.
The location of the BRAT1 C-terminus in the active site of INTS11 indicates that BRAT1 should inhibit the nuclease activity of INTS11 (Fig. 4G). Further, the organization of INTS11 and INTS9, while associated with BRAT1, would prevent their association with INTS4 and the formation of the ICM (Figs. 3,4). These structural insights appear at odds with the INTS11 loss-of-function phenotype observed in BRAT1 knockout cells (Figs. 5, 6). A parsimonious model (Fig. 7G) consistent with both sets of data would be that BRAT1 is a chaperone partner of newly synthesized INTS11, stabilizing this endonuclease in the cytoplasm before its association with the nuclear Integrator complex. While INTS11 alone is expected to exist in a closed, inactive state, its conformations may be more dynamic, and BRAT1 could ensure that no spurious endonucleolytic events occur. These data also suggest that INTS11 exists in two separate cellular complexes: those bound to the full nuclear INT, ready to function in transcriptional regulation, and those in complexes with BRAT1 (and INTS9). This cytoplasmic ‘pool’ of INTS11 likely represents the early stages of an ICM assembly pathway and possibly also forms a reservoir of this enzyme in the cytoplasm that becomes necessary under specific cell states (e.g., neurogenesis).
Both homozygous and compound heterozygous recessive mutations in the BRAT1 gene have been found in patients47,48. Our structure of BRAT1 provides a foundation for understanding the molecular basis of how these mutations could perturb its function (Fig. S5I and Table S2). Many of these mutations are expected to disrupt the structure of BRAT1, which will likely lead to its destabilization. This is supported by the observation that the V62E mutation greatly reduced the levels of BRAT1 protein in cells41. Only two of these BRAT1 missense mutations are located in the interface with INTS11, showing ion-pair or hydrogen-bonding interactions, while none are in the interface with INTS9.
Our data further reveal that BRAT1 is surprisingly not essential in HCT116 or hES cells and that persistent reduction of INT function can be tolerated (Fig. 5). However, upon differentiation of embryonic stem cells to neural organoids, a requirement for BRAT1 is uncovered as these cells demonstrate precocious expression of several key transcription factors and signaling molecules that drive neurogenesis (Fig. 6). These observations are consistent with INT functioning as a transcriptional attenuator and suggest that patients with BRAT1 mutations may have disrupted transcriptome remodeling critical for neuronal development. In particular, using a ventrally fated neural organoid induction model, we found upregulation of SHH, NKX2-1, DLX1, DLX5, and DLX2, which point towards dysregulation of interneuron specification62-68,73. This imbalance may contribute to the pathology observed in patients with BRAT1 mutations, such as those with RMFSL.
UBE3D and Ipa1 are poorly understood, but it was recently reported that they may interact with CPSF73/Ysh1 and stabilize this endonuclease, and reduction in their expression leads to a CPSF73-depletion phenotype74-77. These characteristics are remarkably similar to those described here for BRAT1. Moreover, it is interesting that the binding of UBE3D, annotated as a ubiquitin ligase, to CPSF73 prevents ubiquitination and degradation by the proteasome75,77.
Our work reveals a previously unknown aspect of INTS11 and CPSF73 biogenesis in that chaperones stabilize and engage the active site of these crucial RNA endonucleases. We note that within this same issue, the Jonas laboratory reports similar observations on the role of BRAT1 in Integrator function78. While the two studies are remarkably harmonious, the other report also reveals that IP6 is important for the conversion from the INTS9-INTS11-BRAT1 complex to the ICM. Overall, these structures further show a compelling form of molecular mimicry where amino acids function as uncleavable substrates reminiscent of RNA targets to ultimately deliver these enzymes to their respective complexes.
LIMITATIONS OF THE STUDY
One potential limitation of our work is that it provides little insight into how the transport of ICM subunits into the nucleus is accomplished. Our biochemical approaches, while robust, did not detect any of the well-known import factors. This may indicate that their association is too transient for our relatively rigorous purification schemes. Another limitation of our work relates to how directly applicable our findings are to human diseases associated with BRAT1 mutations, since our neural organoids were derived from hESCs that were null for BRAT1. It is likely that BRAT1 mutations lead to more subtle defects in neuronal development than a complete knockout, and the effects of BRAT1 deficiency on specific neuronal lineages at later stages of development remain to be evaluated. Finally, our molecular insights on the CPSF73-UBE3D complex are based on predictions by AlphaFold, and experimentally determining the structure of this complex will be important for the verification of the model, which may reveal additional insights on this interaction.
STAR METHODS
RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Liang Tong (ltong@columbia.edu).
Materials availability
Cell lines generated and described in this study are available upon request.
Data and code availability
The atomic coordinates have been deposited at the PDB (entry codes 8UIB and 8UIC). GEO Accession numbers: all datasets generated in this study are available for download from GEO: GSE246833. All source data is available at Mendeley Data, DOI: 10.17632/ygprhc3hvt.1. The deposited data will be publicly available upon publication of the paper.
This paper does not report original code.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
Experimental Model and Study Participant Details
Cell lines
HEK 293T and HCT116 cells were grown with 5% CO2 in DMEM (Gibco, #11965-092), supplemented with 10% (v/v) FBS and 1% (v/v) penicillin-streptomycin (Gibco, #15070-063) at 37°C. Drosophila S2 cells were grown in Sf-900 II SFM (Gibco, #10902-088), supplemented with 1% (v/v) Antibiotic-Antimycotic at 27°C. Human embryonic stem cells (H7, WA07, WiCell) and derivatives (Control-1, Control-2, ΔBRAT1-1, ΔBRAT1-2) were maintained in Stemflex medium (Gibco, #A33494-01) on LDEV-free, growth factor reduced Matrigel basement membrane (Corning #356254) at 37°C in a humidified 5% CO2 and 5% O2 atmosphere.
METHOD DETAILS
Plasmid construction and stable cell lines generation
Flag-tagged Drosophila Integrator subunits and CG7044 were cloned into the pMT-3xFLAG-puro vector4 for inducible expression in S2 cells. Plasmids were all sequenced to confirm identity. Drosophila S2 cells stably expressing the indicated Flag-tagged Integrator subunits or CG7044 were generated by plating 2 × 10^6 cells in a 6-well dish in complete media. After 1 hour, 2 μg of plasmids were transfected using Fugene HD (Promega, #E2311). The next day, 2.5 μg/mL puromycin was added to select and maintain the cell population.
Human Flag-tagged Integrator subunits or BRAT1 were generated by cloning cDNAs into the doxycycline inducible pTRE-3xFLAG-Blast4. HEK 293T cell lines expressing the plasmids were generated by plating 4 × 10^5 cells in a 24-well dish in complete media. Cells were transfected with 800 μg plasmid using Lipofectamine 2000 (Invitrogen, #11668027). After 24 hours, 10 μg/mL blasticidin was added to the media to select the cell population, and for maintenance.
Nuclear extract preparation
Nuclear and cytoplasmic extracts were prepared using a variation of the Dignam method79. Briefly, HEK 293T or Drosophila S2 cells were grown to near confluence in 150 mm dishes, collected, and rinsed with cold PBS. Prior to collection, HEK 293T cells were pretreated with doxycycline for 24 hours and Drosophila cells were pretreated with 500 μM CuSO4 for 24 hours to induce protein expression. The cells were resuspended in five times the pellet volume of Buffer A (10mM Tris-HCl pH7.9, 1.5mM MgCl2, 10mM KCl, 0.5mM DTT, and 0.2mM PMSF) and incubated at 4°C with rotation for 15 minutes to allow cells to swell. Cells were pelleted at 800g for 10 minutes at 4°C, then resuspended in two times the pellet volume of Buffer A and homogenized on ice with Dounce pestle B for 30 gentle strokes before centrifugation at 2,000g for 10 minutes at 4°C. At this point, the supernatant containing the cytoplasmic fraction was collected. The nuclear fraction in the pellet was resuspended and washed gently with a 5-minute rotation two times in Buffer D (20mM Na HEPES pH7.4, 100mM KCl, 0.2mM EDTA, and 0.5mM DTT) and one time in Buffer A. The sample was again homogenized with Dounce pestle B for 15 strokes before rotating for 60 minutes at 4°C. The sample was centrifuged at 15,000g for 30 minutes at 4°C. The supernatant was collected and both the nuclear and cytoplasmic fractions were subjected to overnight dialysis at 4°C in 2 liters of Buffer D (20mM Na HEPES, 100mM KCl, 0.2mM EDTA, 0.5mM DTT, and 20% glycerol). To remove any precipitate before further analysis, extracts were centrifuged again at 15,000g for 3 minutes at 4°C.
Western blotting and anti-FLAG affinity purification
Cells were lysed to collect protein by directly adding 2X SDS sample buffer (120mM Tris-HCl pH6.8, 4% SDS, 200mM DTT, 20% Glycerol, and 0.02% Bromophenol blue) to cells in wells. Samples were then incubated at room temperature with periodic swirling before a 10-minute boiling at 95°C and a short sonication. Denatured protein samples were then resolved on a 10% SDS-PAGE gel and transferred to a PVDF membrane (Bio-Rad, #1620177). For both commercial and custom-designed antibodies, blots were probed as previously described4.
Blots were probed by custom-designed Drosophila antibodies as previously described4 diluted in PBS-0.1% Tween supplemented with 5% nonfat milk. To detect proteins from 293T lysate, anti-hIntS11 (Bethyl, #A301-274A), anti-hIMPK (Thermo, #PA5-21629), anti-GFP (Clontech, #632381), anti-alpha Tubulin (abcam, #ab15246), and anti-GAPDH (Thermo, #MA5-15738) were used at the dilution suggested by the manufacturer.
To purify FLAG-tagged Integrator complexes, 500μg of nuclear extract was mixed with 30μl anti-Flag M2 affinity agarose slurry (Sigma, #A2220) equilibrated in binding buffer (20mM Na HEPES pH7.4, 150mM KCl for Drosophila, 100mM KCl for human, 10% Glycerol, 0.1% NP-40) and rotated for 4 hours at 4°C. Next, four washes were performed in binding buffer by rotating at 4°C for 5 minutes followed by a 500g centrifugation at 4°C. After the final wash, the supernatant was removed, 50μl of 2X sample buffer was added to the anti-FLAG resin to elute the proteins, and the samples were boiled at 95°C for ten minutes. To generate Western blot input samples, an equal volume of 2X loading buffer was added to nuclear or cytoplasmic extracts and 1/10 of the immunoprecipitation was loaded as estimated by protein mass.
Antibody generation and purification
Commercial antibodies used to detect proteins are listed in the Key Resources Table. Others were custom-made and were raised against recombinant proteins expressed in E. coli as described previously4,6. Specifically, these proteins were the first 100 amino acids of the N-termini of Drosophila IntS4, Drosophila IntS7, or Drosophila IntS8. These recombinant proteins were then shipped to a commercial vendor (Cocalico Biologicals, PA) and used to inoculate guinea pigs. Sera were isolated and tested for specific reactivity to target proteins using Western blot analysis, initially with nuclear extract to confirm that protein bands matched the predicted size. Sera passing this filter were subsequently tested by probing lysates from DL1 cells treated with dsRNA targeting the protein of interest to confirm loss of specific bands.
KEY RESOURCES TABLE
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
FLAG-HRP | Sigma #A8592 | AB_439702 |
Drosophila IntS1 | Ezzeddine et al., 201157 | N/A |
Drosophila IntS4 | Huang et al., 20206 | N/A |
Drosophila IntS6 | This paper | N/A |
Drosophila IntS7 | Huang et al., 20206 | N/A |
Drosophila IntS8 | Huang et al., 20206 | N/A |
Drosophila IntS9 | Ezzeddine et al., 201157 | N/A |
Drosophila IntS11 | Ezzeddine et al., 201157 | N/A |
Drosophila IntS12 | Chen et al, 201256 | N/A |
Hsp90 | Abclonal #A0365 | Pending |
Fibrillarin | Abclonal #A1136 | Pending |
BRAT1 | AbClonal #A12801 | Pending |
hINTS1 | Bethyl #A300-361A | AB_2127258 |
hINTS4 | Bethyl #A301-296A | AB_937909 |
hINTS5 | Proteintech #14069-1-AP | AB_2296187 |
hINTS9 | Proteintech #11657-1-AP | AB_2127514 |
hINTS11 | Abcam #ab75276 | AB_937779 |
Gapdh | ThermoFisher #MA5-15738 | AB_10977387 |
Lamin A/C | Santa Cruz Biotechnology #sc-7292 | AB_627875 |
CG7044 | This paper | N/A |
hINTS6 | Bethyl #A301-658A | AB_1210927 |
CPSF73 | Abclonal #A2222 | Pending |
tubulin | Abcam #ab15246 | AB_301787 |
GFP | Abclonal #AE100 | Pending |
UBE3D | Abnova #PAB21883 | AB_10966255 |
CPSF1/CPSF160 | Abclonal #A17144 | AB_2769032 |
CPSF4/CPSF30 | Abclonal #A10284 | AB_2757826 |
CPSF100 | Abclonal #A9297 | Pending |
HA-HRP | Abclonal #AEO25 | AB_2769866 |
OCT-4 | Cell Signaling Technologies #2750 | AB_823583 |
SOX2 | Cell Signaling Technologies #3579 | AB_2195767 |
NANOG | Cell Signaling Technologies #4903 | AB_10559205 |
SSEA4 | Abcam #ab16287 | AB_778073 |
Tra-1-81 | EMD Millipore #MAB4381 | AB_177638 |
Tra-1-60 | Invitrogen #MA1-0231000 | AB_2536699 |
Goat anti-rabbit IgG Alexa Fluor 488 | Invitrogen #A-11034 | AB_2576217 |
Goat anti-rabbit IgG Alexa Fluor 568 | Invitrogen #A-11036 | AB_10563566 |
Goat anti-mouse IgM Alexa Fluor 488 | Invitrogen #A-21042 | AB_2535711 |
Goat anti-mouse IgM Alexa Fluor 568 | Invitrogen #A-21043 | AB_2535712 |
Goat anti-mouse IgG(H+L) Alexa Fluor 568 | Invitrogen #A-11031 | AB_144696 |
Goat anti-rabbit IgG (H+L) Texas Red-X | Invitrogen, #T-6391 | AB_2556779 |
Chemicals, Peptides, and Recombinant Proteins | ||
AzNTPs | TriLink Technologies | Cat# K-1005 |
Biotin-11-NTPs | Perkin Elmer | Cat# NEL54(2/3/4/5)001 |
Critical Commercial Assays | ||
NEBNext Poly(A) mRNA Magnetic Isolation Module | NEB | Cat# E7490S |
Deposited Data | ||
Raw and analyzed data | This paper | GEO: GSE246833 |
Raw blots and fluorescence images | This paper | DOI: 10.17632/ygprhc3hvt.1 |
Structure of the human INTS9-INTS11-BRAT1 complex | This paper | PDB: 8UIB |
Structure of the Drosophila dIntS11-CG7044 complex | This paper | PDB: 8UIC |
Experimental Models: Cell Lines | ||
HEK293T | ATCC | N/A |
DL1 | Dr. Sara Cherry, UPenn | N/A |
S2-DGRC clone 6 | DGRC | N/A |
S2 Flag-dIntS5 | Huang et al., 20206 | N/A |
S2 Flag-dIntS11 | Elrod et al., 20194 | N/A |
S2 Flag-dIntS1 | Huang et al., 20206 | N/A |
S2 Flag-CG7044 | This paper | N/A |
S2 Flag-dIntS8 | This paper | N/A |
HEK293T-TRE-FLAG-INTS11 | This paper | N/A |
HEK293T-TRE-FLAG-BRAT1 | This paper | N/A |
HCT116 | ATCC | N/A |
HCT116-BRAT1KO-1 | This paper | N/A |
HCT116-BRAT1KO-2 | This paper | N/A |
HCT116-BRAT1KO-3 | This paper | N/A |
Human H7 Embryonic Stem Cells (H7-hESC) | WiCell Research Institute | NIHhESC-10-0061 |
H7-hESC-Con-1 | This paper | N/A |
H7-hESC-Con-2 | This paper | N/A |
H7-hESC-ΔBRAT1-1 | This paper | N/A |
H7-hESC-ΔBRAT1-2 | This paper | N/A |
HEK293T-TRE-FLAG-UBE3D | This paper | N/A |
Oligonucleotides | ||
Table S3 | This paper | N/A |
Recombinant DNA | ||
pMT-3xFLAG-puro vector | Elrod et al., 20194 | N/A |
pMT-3xFLAG-dIntS5-puro | Huang et al, 20206 | N/A |
pMT-3xFLAG-dIntS11-puro | Huang et al, 20206 | N/A |
pMT-3xFLAG-dIntS1-puro | Huang et al, 20206 | N/A |
pMT-3xFLAG-CG7044-puro | This paper | N/A |
pMT-3xFLAG-dIntS8-puro | Huang et al, 20206 | N/A |
Software and Algorithms | ||
FastP 0.23.1 | Chen et al., 201896 | N/A |
STAR_2.7.9a | Dobin et al., 201398 | N/A |
featureCounts | Liao et al., 201499 | N/A |
DESeq2 1.34.0 | Love et al., 2014100 | N/A |
R v4.0.2 | www.R-project.org | N/A |
pcaExplorer | Marini et al., 2019101 | N/A |
EnrichR | Chen et al., 2013102 | N/A |
ggplot2 | Wickham, 2016106 | N/A |
IGV | Robinson et al., 2011107 | N/A |
ShinyGO 0.77 | Ge et al., 2020105 | N/A |
Prism v10.1.0 | GraphPad | N/A |
Morpheus | Morpheus, https://software.broadinstitute.org/morpheus | N/A |
Mass spectrometry sample digestion
Drosophila samples were prepared at the University of Texas Medical Branch Mass Spectrometry Facility similar to what was described previously80. Briefly, the agarose bead-bound proteins were washed several times with 50mM Triethylammonium bicarbonate (TEAB) pH 7.1, before being solubilized with 40μL of 5% SDS, 50mM TEAB, pH 7.55 followed by a room temperature incubation for 30 minutes. The supernatant containing the proteins of interest was then transferred to a new tube, reduced by making the solution 10mM Tris(2-carboxyethyl)phosphine (TCEP) (Thermo, #77720), and further incubated at 65°C for 10 minutes. The sample was then cooled to room temperature and 3.75 μL of 1M iodoacetamide acid was added and allowed to react for 20 minutes in the dark, after which 0.5μL of 2M DTT was added to quench the reaction. Next, 5 μL of 12% phosphoric acid was added to the 50μL protein solution followed by 350μL of binding buffer (90% Methanol, 100mM TEAB final; pH 7.1). The resulting solution was loaded onto an S-Trap spin column (Protifi, Farmingdale NY) and passed through the column using a bench top centrifuge (30 second spin at 4,000g). The spin column was then washed three times with 400μL of binding buffer and centrifuged (1200rpm, 1min). Trypsin (Promega, #V5280, Madison, WI) was then added to the protein mixture in a ratio of 1:25 in 50mM TEAB, pH 8, and incubated at 37°C for 4 hours. Peptides were eluted with 80 μL of 50mM TEAB, followed by 80μL of 0.2% formic acid, and finally 80 μL of 50% acetonitrile, 0.2% formic acid. The combined peptide solution was then dried in a speed vacuum (room temperature, 1.5 hours) and resuspended in 2% acetonitrile, 0.1% formic acid, 97.9% water and aliquoted into an autosampler vial.
Human samples were prepared at the University of Rochester Mass Spectrometry Resource Lab. To allow total protein to be evaluated in a single digest, FLAG immunoprecipitated samples were run on a 4-12% SDS-PAGE gel to remove contaminants and create a region about 10mm long. Each region was excised from the gel and cut into smaller cubes, about 1mm in size. Samples were de-stained, reduced with DTT, alkylated with IAA (Sigma), and dehydrated with acetonitrile. Dehydrated gel pieces were incubated at room temperature for half an hour after the addition of trypsin at 10 ng/μL in 50 mM ammonium bicarbonate to just cover the pieces. After the room temperature incubation, the pieces were completely submerged by the addition of more ammonium bicarbonate, and incubated at 37°C overnight. The next day, 50% acetonitrile and 0.1% TFA were used to extract the peptides, which were then dried down in a CentriVap concentrator (Labconco). Finally, the peptides were desalted with homemade C18 spin columns, dried once more, and reconstituted in 0.1% TFA.
Mass Spectrometry Analysis
Drosophila peptide mixtures were analyzed by nanoflow liquid chromatography-tandem mass spectrometry (nanoLC-MS/MS) using a nano-LC chromatography system (UltiMate 3000 RSLCnano, Dionex), coupled on-line to a Thermo Orbitrap Eclipse mass spectrometer (Thermo Fisher Scientific, San Jose, CA) through a nanospray ion source. Samples were injected directly onto an Aurora analytical column (75 μm × 25 cm, 1.6 μm; IonOpticks). After equilibrating the column in 98% solvent A (0.1% formic acid in water) and 2% solvent B (0.1% formic acid in acetonitrile (ACN)), the samples (2 μL in solvent A) were injected and eluted from the C18 column at 300 nL/min with the following gradient: isocratic at 2% B, 0-10 min; 2% to 27% B, 10-98 min; 27% to 45% B, 98-102 min; 45% to 90% B, 102-103 min; isocratic at 90% B, 103-104 min; 90% to 15% B, 104-106 min; 15% to 90% B, 106-108 min; isocratic at 90% B, 108-110 min; 90% to 2% B, 110-112 min; and isocratic at 2% B until 120 min.
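The Drosophila gradient program described above can be written as a set of time/composition breakpoints with linear ramps between them. The sketch below is only an illustration of that reading: the names `GRADIENT` and `percent_b` are hypothetical, and the two-minute isocratic hold is assumed to be at 90% B from 108 to 110 min, as implied by the neighboring steps.

```python
# Breakpoints (time in min, % solvent B) read from the gradient description.
# Assumption: the two-minute isocratic hold is at 90% B, 108-110 min.
GRADIENT = [
    (0, 2), (10, 2),      # isocratic at 2% B
    (98, 27),             # 2% to 27% B
    (102, 45),            # 27% to 45% B
    (103, 90), (104, 90), # ramp to 90% B, then isocratic
    (106, 15),            # 90% to 15% B
    (108, 90), (110, 90), # 15% to 90% B, then isocratic
    (112, 2), (120, 2),   # 90% to 2% B, then isocratic until 120 min
]


def percent_b(t_min):
    """Linearly interpolate % solvent B at time t_min (hypothetical helper)."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-120 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (t_min - t0) / (t1 - t0) * (b1 - b0)


# Composition at the midpoint of the main 2% to 27% B ramp.
midpoint = percent_b(54)
```

Writing the program as breakpoints makes it easy to check that each step starts where the previous one ends.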
All Drosophila LC-MS/MS data were acquired using an Orbitrap Eclipse in positive ion mode using a top speed data-dependent acquisition (DDA) method with a 3 second cycle time. The survey scans (m/z 375-1500) were acquired in the Orbitrap at 50,000 resolution (at m/z = 400) in profile mode, with a maximum injection time of 100 msec and an AGC target of 400,000 ions. The S-lens RF level was set to 30. Isolation was performed in the quadrupole with a 1.6 Da isolation window, and HCD MS/MS acquisition was performed in profile mode using the orbitrap at a resolution of 7500 using the following settings: parent threshold = 5,000; collision energy = 30%; using the default settings. Monoisotopic precursor selection (MIPS) and charge state filtering were on, with charge states 2-6 included. Dynamic exclusion was used to remove selected precursor ions, with a +/− 10 ppm mass tolerance, for 60 seconds after acquisition of one MS/MS spectrum.
Prepared human peptides were injected onto a homemade 30 cm C18 column with 1.8 μm beads (Sepax) using an Easy nLC-1200 HPLC (Thermo Fisher) coupled with a Fusion Lumos Tribrid mass spectrometer (Thermo Fisher). Solvent A consisted of 0.1% formic acid in water, and solvent B consisted of 0.1% formic acid in 80% acetonitrile. A Nanospray Flex source operating at 2 kV was used to introduce ions to the mass spectrometer. The gradient eluted as follows: isocratic at 3% B, 0-2 min; 3% to 10% B in 3-9 min; 10% to 38% B in 10-45 min; 38% to 90% B, 46-51 min; isocratic at 90%, 52-55 min; 90% to 3% B, 56-58 min; and isocratic at 3% B, 59-65 min. The Fusion Lumos was operated in data-independent acquisition mode. The MS1 scan was done over a 390-1010 m/z range with a 60,000 resolution at 200 m/z. The AGC target was 4e5, and a 50 ms maximum injection time was used. A staggered windowing scheme of 20 m/z with 10 m/z overlaps was used to measure precursor ions. The first cycle fragmented ions between 400-420 m/z, 420-440 m/z, and so on, until the final window of 980-1000 m/z. The second cycle fragmented 390-410 m/z, 410-430 m/z, and so on, until the final window of 990-1010 m/z. There were 30 MS2 scans per cycle, and fragment ions were collected between 200-2000 m/z. Higher energy C-trap dissociation (HCD) was used to fragment precursor ions with a 33% collision energy. A resolution of 15,000, AGC target of 4e5, and a maximum injection time of 23 ms were used for MS2 scans collected in the orbitrap.
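The staggered windowing scheme can be illustrated by enumerating the isolation windows of the two cycles from the boundaries given above. This is a minimal sketch, not the instrument method itself; the helper names `isolation_windows` and `windows_covering` are hypothetical.

```python
WINDOW_WIDTH = 20  # m/z, as stated in the text


def isolation_windows(first_low, last_high, width=WINDOW_WIDTH):
    """List (low, high) m/z windows tiling first_low..last_high (hypothetical helper)."""
    return [(low, low + width) for low in range(first_low, last_high, width)]


# First cycle: 400-420, 420-440, ... 980-1000 m/z.
cycle_1 = isolation_windows(400, 1000)
# Second cycle, offset by 10 m/z: 390-410, 410-430, ... 990-1010 m/z.
cycle_2 = isolation_windows(390, 1010)


def windows_covering(mz):
    """Return the window from each cycle that contains a precursor m/z."""
    hits = []
    for cycle in (cycle_1, cycle_2):
        hits.extend(w for w in cycle if w[0] <= mz < w[1])
    return hits


# A precursor at 415 m/z falls in one window per cycle; the two windows
# overlap by 10 m/z (410-420).
covering_415 = windows_covering(415)
```

Because the second cycle is shifted by half a window, each precursor is sampled by two windows whose intersection is 10 m/z wide, which is the basis for the "10 m/z overlaps" in the description.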
Database searching for mass spectrometry
For Drosophila samples, tandem mass spectra were extracted and charge state deconvoluted using Proteome Discoverer (Thermo Fisher, version 2.2.0388). Deisotoping was not performed. All MS/MS spectra were searched against a Uniprot Drosophila database (version 04-04-2018) using Sequest. Searches were performed with a parent ion tolerance of 5 ppm and a fragment ion tolerance of 0.60 Da. Trypsin was specified as the enzyme, allowing for two missed cleavages. Fixed modification of carbamidomethyl (C) and variable modifications of oxidation (M) and deamidation were specified in Sequest. Representative heatmaps of normalized total spectral counts were generated using Morpheus (https://software.broadinstitute.org/morpheus).
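To make the two search tolerances concrete, the sketch below converts a relative tolerance in ppm to an absolute mass window and checks whether an observed value matches a theoretical one. The function names and the example values are hypothetical, and the sketch does not reproduce how Sequest applies its tolerances internally.

```python
PARENT_TOLERANCE_PPM = 5.0    # parent ion tolerance from the text
FRAGMENT_TOLERANCE_DA = 0.60  # fragment ion tolerance from the text


def ppm_to_da(value, ppm):
    """Convert a ppm tolerance to an absolute window at a given mass or m/z."""
    return value * ppm / 1e6


def within_ppm(observed, theoretical, ppm=PARENT_TOLERANCE_PPM):
    """True if the observed value lies within +/- ppm of the theoretical one."""
    return abs(observed - theoretical) <= ppm_to_da(theoretical, ppm)


def within_da(observed, theoretical, tol=FRAGMENT_TOLERANCE_DA):
    """True if the observed value lies within +/- tol Da of the theoretical one."""
    return abs(observed - theoretical) <= tol


# At a value of 1000, a 5 ppm tolerance corresponds to about 0.005.
window_at_1000 = ppm_to_da(1000.0, PARENT_TOLERANCE_PPM)
```

The ppm window scales with mass, whereas the 0.60 Da fragment window is fixed.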
For human samples, raw data was processed with DIA-NN version 1.8.1 (https://github.com/vdemichev/DIA-NN) in library-free analysis mode4. The human UniProt ‘one protein sequence per gene’ database with ‘deep learning-based spectra and RT prediction’ was used to annotate the library (UP000005640_9606, downloaded 4/7/2021). For precursor ion generation, the maximum number of missed cleavages was set at 1, maximum number of variable modifications at 1 for Ox(M), peptide length range at 7-30, precursor charge range at 2-3, precursor m/z range at 400-1000, and fragment m/z range at 200-2000. To quantify, the ‘Robust LC (high precision)’ mode was used with RT-dependent median-based cross-run normalization and MBR enabled. Protein inference was set to ‘Genes’ and ‘Heuristic protein inference’ was off. The software automatically set MS1 and MS2 mass tolerances and window size. Library precursor q-value (1%), library protein group q-value (1%), and posterior error probability (50%) were used to filter precursors. The MaxLFQ algorithm was used to carry out protein quantification as a part of the DIA-NN R package (https://github.com/vdemichev/diann-rpackage) and the DiannReport Generator Package was used to count the number of peptides quantified in each protein group (https://github.com/kswovick/DIANN-Report-Generator)81. Representative heatmaps of normalized total spectral counts were generated using Morpheus (https://software.broadinstitute.org/morpheus). Relative abundance was first normalized to the naïve control and was then normalized to the abundance of the sample’s bait protein, resulting in a percentage of relative intensity as compared to the bait that was immunoprecipitated.
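The two-step normalization can be sketched as follows. The text does not specify how abundances were normalized to the naïve control, so this example assumes, purely for illustration, that the control intensity is subtracted as background (floored at zero) before each protein is expressed as a percentage of the bait. The function name and all intensity values are hypothetical.

```python
def percent_of_bait(sample, control, bait):
    """Express each protein as a percentage of the bait (hypothetical helper).

    Assumption (not stated in the text): normalization to the naive
    control is a background subtraction floored at zero.
    """
    corrected = {
        protein: max(intensity - control.get(protein, 0.0), 0.0)
        for protein, intensity in sample.items()
    }
    bait_signal = corrected[bait]
    if bait_signal <= 0:
        raise ValueError("bait protein has no signal above control")
    return {p: 100.0 * v / bait_signal for p, v in corrected.items()}


# Hypothetical intensities for a FLAG-BRAT1 immunoprecipitation.
sample = {"BRAT1": 1000.0, "INTS11": 600.0, "INTS9": 300.0}
control = {"BRAT1": 0.0, "INTS11": 100.0, "INTS9": 50.0}
relative = percent_of_bait(sample, control, bait="BRAT1")
```

By construction the bait itself is 100%, so values for co-purifying proteins can be compared across immunoprecipitations with different baits.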
Immunofluorescence staining
Wildtype HCT116 cells or BRAT1 knockout lines were plated on coverslips in a 6-well plate. The next day, cells were gently washed with PBS and coverslips were moved to a 24-well plate. Cells were fixed by incubating in ice-cold methanol for 5 minutes. After washing with PBS, samples were blocked for 30 minutes in 1% bovine serum albumin (Sigma-Aldrich, #A7906-50G) in PBS and 0.1% Tween 20 (Fisher Scientific, #BP337-500) at 37°C with humidity. Primary antibody probing was performed overnight at 1:100 in blocking buffer with rabbit anti-BRAT1 (Abclonal, #A12801). The next morning, samples were washed three times with PBS for 5 minutes. Secondary antibody probing was performed for 1 hour in the dark with Texas Red-X goat anti-rabbit IgG (H+L) (Invitrogen, #T-6391). After another three 5-minute washes with PBS, coverslips were mounted on slides with Fluoroshield Mounting medium with DAPI (Abcam, #ab104139).
Image Acquisition for BRAT1 Expression and Localization Immunofluorescence
Mounted coverslips of HCT116 wild-type (WT) and BRAT1 knockout (KO) cells labeled for BRAT1 and nuclei were imaged in a “grid” confocal regime using a structured illumination element (OptiGrid) inserted into an Olympus BX51 microscope. The samples were illuminated with a Prior Lumen 200 Hg lamp source (Prior; #LM200B1-A) coupled into the microscope body with a liquid light coupler (Prior; #LM587). The broadband light source was filtered for 377 nm and 561 nm, respectively, with the following excitation filters (Semrock; #FF01-377/50-25, #FF01-561/14-25). The filtered light was focused onto the sample using either 20x (Olympus; cat: UPlanApo 0.70 NA air) or 60x (Olympus; cat: PlanApo 1.4 NA oil) objectives. The resultant emissions from DAPI-stained nuclei and Texas Red-labeled BRAT1 were filtered through corresponding emission filters for DAPI (Semrock; cat: FF02-447/60-25) and Texas Red (FF01-609/54-25), respectively, and imaged onto a Hamamatsu ORCA-ER detector. Image acquisition was controlled using Volocity Image Acquisition software (Quorum Technologies). The collected images were z-stacks acquired at either 1 μm or 0.5 μm axial steps for illumination with a 20x or 60x objective, respectively.
Image Analysis for BRAT1 Expression and Localization Immunofluorescence
Collected images were pre-processed in Volocity for noise filtering and flat-field illumination correction. Then, the acquired z-stacks were analyzed for DAPI- and Texas Red-positive objects using a standard deviation-dependent thresholding algorithm. DAPI-positive objects were segmented to identify individual nuclei, which were used as a proxy for the number of cells in a field-of-view and as a normalization scheme for expression levels on a “per nucleus” or “per cell” basis. Texas Red-positive objects were also segmented and determined to be nuclear or cytoplasmic based on spatial overlap of detected fluorescence with DAPI objects. BRAT1 localization was then determined as the ratio of cytoplasmic (non-overlapping with DAPI) BRAT1 volume to nuclear (overlapping with DAPI) BRAT1 volume. BRAT1 expression level was defined as the total intensity of BRAT1 objects normalized to the total BRAT1 volume and then normalized to a “per cell” quantal value based on the number of nuclei in a field-of-view. These values were then imported into GraphPad Prism for statistical analysis using a Student’s t-test.
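The two per-field quantities defined above can be written out as a short calculation. The sketch below is only an illustration of those definitions, not the Volocity analysis itself; the function name `brat1_metrics`, the argument names, and the example numbers (arbitrary units) are hypothetical.

```python
def brat1_metrics(cyto_volume, nuclear_volume, total_intensity, n_nuclei):
    """Return (localization ratio, per-cell expression) for one field-of-view.

    cyto_volume     -- BRAT1 object volume not overlapping DAPI
    nuclear_volume  -- BRAT1 object volume overlapping DAPI
    total_intensity -- summed intensity of all BRAT1 objects
    n_nuclei        -- number of segmented DAPI objects (proxy for cell number)
    """
    if nuclear_volume <= 0 or n_nuclei <= 0:
        raise ValueError("nuclear volume and nucleus count must be positive")
    total_volume = cyto_volume + nuclear_volume
    # Localization: cytoplasmic volume relative to nuclear volume.
    localization = cyto_volume / nuclear_volume
    # Expression: intensity per unit BRAT1 volume, then per nucleus.
    expression_per_cell = (total_intensity / total_volume) / n_nuclei
    return localization, expression_per_cell


# Hypothetical field-of-view in arbitrary units.
print(brat1_metrics(cyto_volume=300.0, nuclear_volume=100.0,
                    total_intensity=8000.0, n_nuclei=10))
```

A ratio above 1 in this sketch would indicate that more BRAT1 signal volume lies outside the DAPI-defined nuclei than inside them.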
Protein expression and purification
Drosophila IntS11 and CG7044 were cloned into the same pFL vector and co-expressed in insect cells using Multibac technology (Geneva Biotech)82. A 6xHis tag was added to the N terminus of IntS11. Bacmids expressing IntS11 and CG7044 were generated in DH10EMBacY competent cells (Geneva Biotech) by transformation. Tni insect cells were grown in ESF 921 medium (Expression Systems) by shaking at 120 rpm at 27 °C until the density reached 2 × 10⁶ cells·ml−1. Cells were infected with 16 ml IntS11-CG7044 P2 virus and harvested after 48 h.
For purification, the cell pellet was resuspended and lysed by sonication in 100 ml of buffer containing 20 mM Tris HCl (pH 8.0), 250 mM NaCl, 2 mM β-mercaptoethanol (βME), 5% (v/v) glycerol, and one tablet of protease inhibitor mixture (Sigma). The cell lysate was then centrifuged at 13,000 rpm for 40 min at 4 °C. The protein complex was purified from the supernatant via nickel affinity chromatography. The protein complex was further purified by a HiTrap Q column (GE Healthcare) and a Hiload 16/60 Superdex 200 column (GE Healthcare). The IntS11-CG7044 complex was concentrated to 1 mg·ml−1 in a buffer containing 20 mM Tris HCl (pH 8.0), 100 mM NaCl, and 2 mM dithiothreitol (DTT), and stored at −80 °C. The protein concentration of the freshly purified samples was measured with a NanoDrop spectrophotometer (Thermo Fisher Scientific).
N-terminal 6xHis-tagged Drosophila IntS9 was cloned into the pSPL donor vector. Drosophila IntS11 and CG7044 were cloned into the pFL acceptor vector. These two vectors were fused together by Cre recombinase. One liter of Tni insect cells (2 × 10⁶ cells·ml−1) was infected with 15 ml of IntS9-IntS11-CG7044 P2 virus. For purification, the cell pellet was resuspended and lysed by sonication in 100 ml of buffer containing 20 mM Tris HCl (pH 8.0), 250 mM NaCl, 2 mM βME, 5% (v/v) glycerol, and one tablet of protease inhibitor mixture (Sigma). The cell lysate was then centrifuged at 13,000 rpm for 40 min at 4 °C. The protein complex was purified from the supernatant via nickel affinity chromatography. The protein complex was further purified by a HiTrap Q column (GE Healthcare).
Human INTS9, INTS11 and BRAT1 were co-expressed in insect cells. N-terminal 6xHis-tagged BRAT1 and INTS11 were cloned into the same pFL vector. INTS9 was cloned into another pFL vector. One liter of Tni insect cells (2 × 10⁶ cells·ml−1) was co-infected with 16 ml of BRAT1-INTS11 P2 virus and 16 ml of INTS9 P2 virus. The cells were harvested after 48 h. For purification, the cell pellet was resuspended and lysed by sonication in 100 ml of buffer containing 20 mM Tris HCl (pH 8.0), 250 mM NaCl, 2 mM βME, 5% (v/v) glycerol, and one tablet of protease inhibitor mixture (Sigma). The cell lysate was then centrifuged at 13,000 rpm for 40 min at 4 °C. The protein complex was purified from the supernatant via nickel affinity chromatography. The protein complex was further purified by a Hiload 16/60 Superdex 200 column. The INTS9-INTS11-BRAT1 complex was concentrated to 1 mg·ml−1 in a buffer containing 20 mM Tris HCl (pH 8.0), 200 mM NaCl, and 2 mM DTT, and stored at −80 °C.
The human INTS9 S283M and E280R/S283M/L287A/R290A mutations were introduced using site-directed mutagenesis PCR and verified by DNA sequencing. The mutant INTS9-INTS11-BRAT1 complexes were expressed and purified following the protocols for the wild-type complex.
K. lactis Ysh1 and Ipa1 were cloned into the 438B vector83 (Addgene, #55219). K. lactis Ydh1 was cloned into the 438A vector (Addgene, #55218), and Pta1 was cloned into the 438C vector (Addgene, #P55220). Ysh1, Ydh1, and Pta1 were combined into a single vector using the MacroBac technology83. One liter of Tni insect cells (2 × 10⁶ cells·ml−1) was co-infected with 16 ml of Ysh1-Ydh1-Pta1 P2 virus and 16 ml of Ipa1 P2 virus. The cells were harvested after 48 h. For purification, the cell pellet was resuspended and lysed by sonication in 100 ml of buffer containing 20 mM Tris HCl (pH 8.0), 250 mM NaCl, 2 mM βME, 5% (v/v) glycerol, and one tablet of protease inhibitor mixture (Sigma). The cell lysate was then centrifuged at 13,000 rpm for 40 min at 4 °C. The protein complex was purified from the supernatant via nickel affinity chromatography. The protein complex was further purified by a Hiload 16/60 Superdex 200 column. The Ysh1-Ipa1 complex was concentrated to 2 mg·ml−1 in a buffer containing 20 mM Na HEPES (pH 7.4), 200 mM NaCl, and 2 mM DTT, and stored at −80 °C.
EM specimen preparation and data collection
Cryo-EM grids for the Drosophila IntS11-CG7044 complex were prepared by applying 3.5 μL of protein sample at a concentration of 0.18 mg·ml−1 to one side of a Quantifoil 400 mesh 1.2/1.3 gold grid with graphene oxide support film (Quantifoil). After 30 s, the grid was blotted for 1.5 s on the other side under 99% humidity and 20 °C using EM GP2 plunge freezer (Leica) and immediately plunged into liquid ethane. 1,603 image stacks were collected on a Titan Krios electron microscope at the New York Structural Biology Center, equipped with a K3 direct electron detector (Gatan) at 300 kV with a total dose of 51 e− Å−2 subdivided into 40 frames in 2 s exposure using Leginon84. The images were recorded at a nominal magnification of 81,000× and a calibrated pixel size of 1.083 Å, with a defocus range from −1 to −2.5 μm.
For the INTS9-INTS11-BRAT1 complex, a 3.5 μL aliquot at 0.3 mg·ml−1 was applied to a plasma-cleaned Quantifoil 300 mesh 1.2/1.3 grid (Quantifoil) with a locally-applied gold coating (30 nm thickness setting, Safematic GmbH, Switzerland). After 5 s, the grid was blotted for 1.5 s and plunged into liquid ethane with a Vitrobot Mark IV (FEI) set at 20 °C and 100% humidity. 4,030 image stacks were collected on a Titan Krios electron microscope at the Columbia University Cryogenic Electron Microscopy Center, equipped with a K3 direct electron detector (Gatan) at 300 kV with a total dose of 58.2 e− Å−2 subdivided into 50 frames in 2.5 s exposure using Leginon84. The images were recorded at a nominal magnification of 105,000× and a calibrated pixel size of 0.83 Å, with a defocus range from −0.8 to −1.3 μm.
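The exposure settings reported for the two datasets imply a dose per frame, a dose rate, and a Nyquist limit that can be derived by simple arithmetic. The sketch below shows that arithmetic using only the numbers stated above; the function name `exposure_summary` and the dictionary keys are hypothetical, and the derived values are not quantities reported by the authors.

```python
def exposure_summary(total_dose, n_frames, exposure_s, pixel_size):
    """Derive per-frame dose, dose rates, and Nyquist limit.

    total_dose -- total dose in e-/A^2
    n_frames   -- number of frames per image stack
    exposure_s -- exposure time in seconds
    pixel_size -- calibrated pixel size in A
    """
    dose_rate_area = total_dose / exposure_s  # e-/A^2/s
    return {
        "dose_per_frame": total_dose / n_frames,                  # e-/A^2/frame
        "dose_rate_per_area": dose_rate_area,                     # e-/A^2/s
        "dose_rate_per_pixel": dose_rate_area * pixel_size ** 2,  # e-/pixel/s
        "nyquist_A": 2 * pixel_size,                              # A
    }


datasets = {
    "IntS11-CG7044": exposure_summary(51.0, 40, 2.0, 1.083),
    "INTS9-INTS11-BRAT1": exposure_summary(58.2, 50, 2.5, 0.83),
}
for name, summary in datasets.items():
    print(name, {k: round(v, 3) for k, v in summary.items()})
```

Under these settings both datasets receive a little over 1 e− Å−2 per frame, and the Nyquist limits (2.166 Å and 1.66 Å) lie well beyond the reported map resolutions.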
Image processing
For both cryo-EM datasets, image stacks were motion-corrected and dose-weighted using RELION 3.185. For the IntS11-CG7044 dataset, the CTF parameters were determined with CTFFIND486 in cryoSPARC87. 2,028,174 particles were auto-picked and subjected to 2D classification and ab initio reconstruction in cryoSPARC to generate eight initial 3D models. These models were then used in heterogeneous refinement against all the particles in cryoSPARC, and the good particles were then used in another round of heterogeneous refinement. 406,222 particles were then selected and imported to RELION for CTF refinement and Bayesian polishing. The polished particles were then imported back to cryoSPARC for homogeneous refinement, yielding a final map at 3.54 Å resolution.
For the INTS9-INTS11-BRAT1 dataset, the patch CTF parameters were determined with cryoSPARC. 1,805,449 particles were auto-picked and subjected to 2D classification in cryoSPARC. 323,946 particles in classes with recognizable features by visual inspection were used for ab initio reconstruction in cryoSPARC to generate two initial models. One of the initial models with 185,172 particles was selected and subjected to non-uniform refinement together with per-particle defocus refinement and global CTF refinement in cryoSPARC, yielding a final map at 3.21 Å resolution.
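To put the particle counts above in perspective, the fraction of auto-picked particles that contributed to each final map can be computed directly. This is a minimal sketch using only the counts stated in the text; it assumes that the final IntS11-CG7044 reconstruction used the 406,222 selected particles, and the function name `retained_fraction` is hypothetical.

```python
def retained_fraction(selected, picked):
    """Fraction of auto-picked particles kept for the final reconstruction."""
    if picked <= 0:
        raise ValueError("picked must be positive")
    return selected / picked


# (auto-picked, used for the final map) as stated in the text.
datasets = {
    "IntS11-CG7044": (2_028_174, 406_222),
    "INTS9-INTS11-BRAT1": (1_805_449, 185_172),
}
for name, (picked, selected) in datasets.items():
    print(f"{name}: {100 * retained_fraction(selected, picked):.1f}% retained")
```

In both cases only a minority of the auto-picked particles (roughly one fifth and one tenth, respectively) survives classification.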
Model Building
Atomic models for CG7044 and Drosophila IntS11 were built manually into the cryo-EM density with Coot88. A homology model for Drosophila IntS11 was generated with I-TASSER89, based on the structures of human CPSF7390. The starting models for human INTS9 and INTS11 were from the PDB entry 7BFP31, and the starting model for human BRAT1 was from AlphaFold69. These models were fitted as a rigid body into the cryo-EM density map and manually revised using Coot. The atomic models were improved by real-space refinement with the program PHENIX91.
AlphaFold Prediction
AlphaFold-Multimer72 was used to predict models of protein complexes. The models were manually examined with PyMOL and Coot. The PAE plot was produced with PAE viewer92. It should be noted that currently there are no structures in the Protein Data Bank that have a protein segment bound in the active site of CPSF73 or its homologs, and therefore the prediction of the UBE3D complex is unlikely to be biased by prior related structures.
CRISPR genomic editing
CRISPR/Cas9 was used to generate HCT116 BRAT1 knockout cell lines. Briefly, a 100 base pair genomic sequence flanking the BRAT1 start codon was input into CRISPOR to predict gRNAs (http://crispor.tefor.net/crispor.py)93. gRNAs were selected based on the specificity score, which reflects the number of predicted off-targets, and on minimal distance between the BRAT1 start codon and the Cas9 cut site. The selected gRNAs (Table S3) were cloned into a modified pSpCas9(BB)-2A-Puro (pX459), originally from Feng Zhang (Addgene, #62988), using an annealed oligonucleotide strategy. Our group modified the Flag tag in the vector to a myc tag using site-directed mutagenesis prior to gRNA cloning. To transfect, 4 × 10⁵ HCT116 cells were plated in a 24-well dish in complete media and 800 μg plasmid with Lipofectamine 2000 (Invitrogen, #11668027) was added to the cells according to the manufacturer’s instructions. After 24 hours, 1 μg/mL puromycin was added to the media for 48 hours to select the cell population. Cell populations were expanded in normal growth medium and protein lysates were collected by adding 2X SDS sample buffer (120 mM Tris HCl pH 6.8, 4% SDS, 200 mM DTT, 20% Glycerol, and 0.02% Bromophenol blue) to cells in wells, incubating at room temperature with periodic swirling, boiling for 10 minutes at 95°C, and sonicating briefly. Western blotting was performed to verify a decrease in BRAT1 expression across the heterogeneous cell population. Clonal lines were then selected, followed by additional western blotting and genotyping to confirm knockout. To genotype, genomic DNA was extracted from candidate clonal lines, and PCR was performed to amplify the genomic region containing the BRAT1 start codon. PCR products were resolved on an agarose gel to verify disruption at the beginning of the gene on each allele. Finally, PCR products were cloned and sequenced to confirm identity.
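The two gRNA selection criteria named above (specificity score and distance between the start codon and the Cas9 cut site) can be combined in more than one way, and the text does not state the exact rule. The sketch below shows one possible ordering, assuming a minimum specificity cutoff followed by ranking on distance; the class `GuideCandidate`, the cutoff of 50, and the candidate values are hypothetical and are not CRISPOR output.

```python
from dataclasses import dataclass


@dataclass
class GuideCandidate:
    name: str
    specificity: float    # higher means fewer predicted off-targets
    cut_distance_bp: int  # Cas9 cut site to start codon, in bp


def rank_guides(candidates, min_specificity=50.0):
    """Drop low-specificity guides, then prefer cuts closest to the start codon.

    Ties on distance are broken in favor of the higher specificity score.
    The cutoff and the ordering are assumptions made for illustration.
    """
    eligible = [g for g in candidates if g.specificity >= min_specificity]
    return sorted(eligible, key=lambda g: (abs(g.cut_distance_bp), -g.specificity))


# Hypothetical candidates, not taken from Table S3.
candidates = [
    GuideCandidate("gA", 90.0, 12),
    GuideCandidate("gB", 95.0, 3),
    GuideCandidate("gC", 40.0, 1),
    GuideCandidate("gD", 80.0, 3),
]
print([g.name for g in rank_guides(candidates)])
```

Here the guide closest to the start codon (`gC`) is excluded because it falls below the assumed specificity cutoff, which illustrates why the two criteria need to be considered together.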
RNA Interference (Drosophila cells)
Double-stranded RNAs targeting the 5’ and 3’ UTRs of Drosophila IntS11 and the CDS of CG7044 were generated by in vitro transcription of PCR templates with the T7 promoter sequence on each end using the MEGAscript kit (Thermo, #AMB13345). Primer sequences are in Table S3. For RNA interference experiments, S2 cells were washed into serum-free medium and seeded at 1.5 × 10⁶ cells/ml into a 6-well plate with 10 μg of dsRNA. After a 1-hour incubation, 2 mL of complete growth medium was added. Samples were harvested for protein lysates 60 hours after the start of the incubation.
RT-qPCR
Total RNA was isolated using Trizol (Invitrogen, #15596026) following the manufacturer’s instructions. cDNA was then synthesized with random hexamers using M-MLV Reverse Transcriptase (Invitrogen, #28025013). RT-qPCR was carried out in triplicate using Bio-Rad iTaq Universal SYBR Green Supermix (Bio-Rad, #1725120) on an AriaMx Real-Time PCR (qPCR) Instrument (Agilent, G8830A). All RT-qPCR primers are provided in Table S3.
Reporter cell line establishment
Reporter lines were generated as described previously33. Briefly, a reporter plasmid was generated by cutting the pAAVS1-TLR targeting vector (Addgene, #64215) with Cal1 and PspXI and cloning in the coding region of U7 small nuclear RNA, together with the 500 bp upstream of the transcription start site and the 50 bp downstream of the coding region, followed by the GFP coding region and SV40 poly(A) signal. This plasmid was co-transfected into HEK 293T cells with pU6-(BbsI)CBh-Cas9-T2A-mCherry (Addgene, #64324) containing the gRNA targeting the AAVS1 locus. Equal amounts of each plasmid were transfected using Lipofectamine 2000 for 24 hours, followed by a 2-day selection with 800 ng/ml puromycin. Cells were grown in growth medium without selection for a week before clonal selection.
Rescue experiments in Fig. 5F were performed by transfecting in pSH-EFIRES-Puro Empty Vector, pSH-EFIRES-P-BRAT1-WT, pSH-EFIRES-P-BRAT1-AAA, pSH-EFIRES-P-BRAT1-ΔDCY, or pSH-EFIRES-P-BRAT1-ΔCTE. Plasmids were generated by replacing the AtAFB2 coding sequence in pSH-EFIRES-P-AtAFB2 (Addgene, #129715) with a multicloning site using oligonucleotides GGCCACGCGTTCTAGACAGCCAAACGGGTCAAACTTG and GGCCACGCGTGAATTCAGCGCTAGCCTATAGTGAGT, and then cloning in the respective cDNAs. Mutations were introduced in oligonucleotide sequences (Table S3). Lysates were collected by adding 2X SDS loading buffer to cells in wells, incubating at room temperature with periodic swirling, boiling for 10 minutes at 95°C, and a brief sonication before loading into 10% SDS-PAGE gels for analysis.
Embryonic stem cell line establishment and organoid differentiation
The embryonic stem cell line H7 (WA07, WiCell) was used to generate syngeneic BRAT1-knockout cell lines by CRISPR-targeted disruption of the BRAT1 gene using the gRNA and Cas9m plasmids described under CRISPR genomic editing. gRNAs (Table S3) were cloned into a modified pSpCas9(BB)-2A-Puro (pX459) vector (Addgene, #62988), and then transfected into H7 cells using Lipofectamine Stem transfection reagent (Invitrogen #STEM00003) according to the manufacturer’s instructions. The next day, stem cell cultures were rinsed with PBS and passaged as single cells using Accutase (Stem Cell Technologies #07922). To enable clonal selection, cells were plated on Matrigel (Corning #356254)-coated 100 mm plates at 50 cells/cm² in Stemflex (Gibco, #A33494-01) supplemented with CloneR (Stem Cell Technologies, #05888). One day later, the medium was replaced with Stemflex supplemented with 1 μg/ml puromycin to select for transfected cells. At the same time, colonies containing more than one cell were removed manually by pick-to-remove using an inverted microscope (Nikon TS-100) equipped with a 4x Apochromat objective. After 4 days, puromycin was removed from the medium. After an additional 4 days, surviving clones were manually harvested by pick-to-keep and expanded as separate clones in 24-well and then 6-well plates, at which point cell pellets were harvested in parallel and analyzed by Western blot for expression of BRAT1. Two clonal cell lines each were selected from control (no sgRNA) and ΔBRAT1 cultures for further analysis.
To examine early effects of BRAT1 loss on interneuron lineages in human ESC-derived cells, we adapted the established dual Smad inhibition protocol for the generation of neural organoids, which were subsequently induced to acquire a ventral fate61,94. Briefly, on day 0 ESCs were passaged into a single cell suspension using Accutase, and then plated at 9000 cells per well into ultra-low adhesion, v-bottom, 96-well cell culture plates (SBio #MS-9096VZ) in Stemflex medium supplemented with 10 μM ROCK inhibitor Y-27632 (HelloBio #HB2297). Cells were allowed to settle and form aggregates. Neural induction was initiated on day 1 by addition of 10 μM SB431542 (MedChemExpress #HY-10431) and 250 nM LDN193189 (Tocris #4602) in the presence of the Wnt inhibitor XAV939 (2 μM, Tocris #3748) in DMEM/F12 medium (Gibco #11330-057) supplemented with 15% Knock-out serum replacement (Gibco #10828010), Glutamax (Gibco #35050061), non-essential amino acids (Gibco #11140050), and 100 μM β-mercaptoethanol (Sigma #M3148). From then on, the medium was exchanged every 2 days. On days 5-8, organoids were switched to a 50:50 mix of DMEM/F12 and Neurobasal (Gibco #21103049) medium supplemented with N2 (Gibco #17520248), Glutamax, penicillin-streptomycin antibiotic (Gibco #15140122), 10 μM SB431542, 250 nM LDN193189 and 2 μM XAV939. On day 9, SB431542, LDN193189 and XAV939 were omitted and a ventral fate was induced by addition of the Sonic Hedgehog agonist SAG (0.5 μM, Cayman Chemicals #11915) and 1 μM purmorphamine (Tocris, #45-511-0). Ventral organoids were then harvested on day 14.
Stem cell staining
Cell colonies of control and ΔBRAT1 embryonic stem cells were grown in low-profile, 4-well dishes (Nunc, #179830) in stem cell conditions prior to harvesting for immunofluorescent analysis. For live staining with Tra-1-81 (1:200, mouse IgM, EMD Millipore #MAB4381), Tra-1-60 (1:200, mouse IgM, Invitrogen #MA1-023) and SSEA4 (1:100, mouse IgG3, Abcam #ab16287), antibodies were added directly to the cell culture medium and incubated for 30 minutes prior to washing with PBS and fixation in 4% paraformaldehyde (PFA)/PBS for 10 min at room temperature. For labeling with OCT-4, SOX2 and NANOG, cultures were fixed in 4% PFA/PBS for 10 minutes at room temperature, and rinsed in Wash Medium (PBS, 5% (v/v) goat serum), then permeabilized for 10 minutes in 0.5% (v/v) Triton X-100 in Wash Medium, and blocked for one hour in Staining Buffer (1% (v/v) bovine serum albumin, 5% (v/v) goat serum and 0.2% (v/v) Triton X-100). OCT-4 (1:200, rabbit host, Cell Signaling Technologies #2750), SOX2 (1:500, rabbit monoclonal, Cell Signaling Technologies #3579) and NANOG (1:200, rabbit monoclonal, Cell Signaling Technologies #4903) antibody labeling was performed overnight at 4°C in Staining Buffer. Following primary antibody staining, cells were washed with Wash Medium and incubated with appropriate secondary antibodies diluted in Staining Buffer for 1 hour: Goat anti-rabbit IgG Alexa Fluor 488 (Invitrogen, #A-11034), goat anti-rabbit IgG Alexa Fluor 568 (Invitrogen #A-11036), goat anti-mouse IgM Alexa Fluor 488 (Invitrogen #A-21042), goat anti-mouse IgM Alexa Fluor 568 (Invitrogen #A-21043), goat anti-mouse IgG(H+L) Alexa Fluor 568 (Invitrogen #A11031). Samples were then washed three times in Wash Medium and then stored in PBS with 0.05% sodium azide prior to fluorescent imaging on a Evos FL AMF4300 (Invitrogen #12-563-460) inverted epi-fluorescent microscope equipped for imaging ex470/em525nm (GFP) and ex531/em593nm (RFP) fluorophores.
Small scale extract and co-immunoprecipitation
Small scale nuclear extract and co-immunoprecipitations in Fig. 7F were performed as described above, with the following modifications. HEK 293T cells were transfected with equal amounts of pcDNA-HA-CPSF73 and pcDNA-FLAG empty vector, pcDNA-FLAG-UBE3D-WT, pcDNA-FLAG-UBE3D-Δloop-1, pcDNA-FLAG-UBE3D-Δloop-2, or pcDNA-FLAG-UBE3D-AAAA using Lipofectamine 2000 according to the manufacturer’s directions (Invitrogen, #11668027). Cells were harvested from a 6-well plate by washing with 1x PBS, then lysing with low salt lysis buffer (150 mM NaCl, 1% NP-40, 100 mM Tris-HCl pH 8.8) and incubating on ice for 30 minutes. Samples were centrifuged at 20,000g for 5 minutes at 4°C. The supernatant was removed for co-immunoprecipitation and a small aliquot was saved for input blots. As above, the extract was incubated with anti-Flag M2 affinity agarose slurry (Sigma, #A2220) equilibrated in IP wash buffer (20 mM Na HEPES pH 7.4, 100 mM KCl, 10% Glycerol, 0.1% NP-40) and rotated for 4 hours at 4°C. After a 10-minute incubation on ice, samples were centrifuged at 500g for 3 minutes at 4°C. Four washes were performed by removing the supernatant, adding 1 mL IP wash buffer, rotating for 5 minutes at 4°C, and spinning at 500g for 3 minutes at 4°C. After the four washes, the supernatant was removed, and 2x SDS loading buffer was added to the beads before a 10-minute boil at 95°C and loading onto 10% SDS-PAGE gels for analysis.
QUANTIFICATION AND STATISTICAL ANALYSIS
RT-qPCR quantification and analysis
Data were analyzed using the ΔΔCt method with 7sk as the reference gene and wildtype HCT116 or Day 0 H7 ESC cells as the control, as described previously57. Results are shown from biologically independent replicates, depicting averages and standard deviations (mean +/− SD, N=3).
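For reference, the ΔΔCt calculation can be sketched in a few lines. The example below assumes 100% amplification efficiency (a factor of 2 per cycle) and uses the sample standard deviation; the Ct values are invented for illustration and the function name `fold_change` is hypothetical.

```python
from statistics import mean, stdev


def fold_change(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the delta-delta-Ct method (assumes 2-fold per cycle)."""
    delta_sample = ct_target - ct_ref             # target vs reference gene, sample
    delta_control = ct_target_ctrl - ct_ref_ctrl  # target vs reference gene, control
    return 2 ** -(delta_sample - delta_control)


# Invented Ct values for three biological replicates (N=3).
sample_target = [24.0, 24.5, 23.5]
sample_ref = [18.0, 18.0, 18.0]   # reference gene (7sk in the text)
control_target, control_ref = 25.0, 18.0

folds = [fold_change(t, r, control_target, control_ref)
         for t, r in zip(sample_target, sample_ref)]
print(f"mean = {mean(folds):.2f}, SD = {stdev(folds):.2f}")
```

A sample whose ΔCt is one cycle lower than the control yields a fold change of 2, which is the basic relationship the method relies on.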
RNA-seq library generation and mapping
Embryonic stem cells and neural organoids were washed once in PBS prior to harvest, and up to 2 × 10⁶ cells or 20 organoids were lysed in 500 μl RNA lysis buffer (Zymo Research #R1060) using 2 mm Bashing Beads (Zymo Research #S6003) in a bead mill homogenizer (VWR #19-2141T) with a 10 second pulse. RNA was extracted using affinity column purification and on-column DNase treatment (Zymo Research, #R1054). RNA quality and concentration were confirmed with a NanoDrop OneC (ThermoFisher, #ND-ONE-W). Input RNA was subjected to polyA selection using the NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB, #E7490S). 2 μg of RNA from each sample was used for the kit purification and was eluted in 10 mM Tris HCl pH 7.5 before proceeding to library generation. RNA-seq libraries were generated (3 independent biological replicates per condition) using the Click-seq library preparation method with a 1:35 azido-nucleotide ratio95. Libraries were sequenced using a single-end 150 bp cycle run on an Illumina NovaSeq 6000.
Raw reads generated from the Illumina basecalls were demultiplexed using bcl2fastq version 2.20.0. Quality filtering and adapter removal were performed using FastP version 0.23.196 with the following parameters: "--length_required 35 --cut_front_window_size 1 --cut_front_mean_quality 13 --cut_front --cut_tail_window_size 1 --cut_tail_mean_quality 13 --cut_tail -y -r". Processed/cleaned reads were then mapped to the GRCh38/gencode38 reference97 using STAR_2.7.9a98 with the following parameters: "--twopassMode Basic --runMode alignReads --outSAMtype BAM Unsorted --outSAMstrandField intronMotif --outFilterIntronMotifs RemoveNoncanonical --outReadsUnmapped Fastx". Gene level read quantification was derived using the subread-2.0.1 package (featureCounts)99 with a GTF annotation file (GRCh38/gencode42) and the following parameters for stranded RNA libraries: "-s 2 -t exon -g gene_name". Differential expression analysis was performed using DESeq2-1.34.0 with a P-value threshold of 0.05 within R version 4.0.2 (https://www.R-project.org/)100. PCA plots were created within R using pcaExplorer to assess sample expression variance101. Gene ontology analyses were performed using the EnrichR package102-104 and ShinyGO 0.77105. Volcano plots and dot plots were created using ggplot2106. Gene tracks were generated using Integrative Genomics Viewer107.
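The trimming, mapping, and counting steps above can be assembled into explicit command lines. The sketch below only builds and prints the commands and does not run them. The option values are those listed in the text, written with the standard double-hyphen option spellings; the input and output file names, the `--genomeDir`, `--readFilesCommand`, and `--outFileNamePrefix` settings, and the function name `build_commands` are assumptions added so that each command is complete.

```python
import shlex


def build_commands(sample, genome_dir, gtf):
    """Return fastp, STAR, and featureCounts argument lists for one sample."""
    cleaned = f"{sample}.clean.fastq.gz"  # hypothetical file naming
    fastp = [
        "fastp", "-i", f"{sample}.fastq.gz", "-o", cleaned,
        "--length_required", "35",
        "--cut_front_window_size", "1", "--cut_front_mean_quality", "13", "--cut_front",
        "--cut_tail_window_size", "1", "--cut_tail_mean_quality", "13", "--cut_tail",
        "-y", "-r",
    ]
    star = [
        "STAR", "--runMode", "alignReads", "--twopassMode", "Basic",
        "--genomeDir", genome_dir,              # assumed pre-built index
        "--readFilesIn", cleaned,
        "--readFilesCommand", "zcat",           # assumed gzip-compressed input
        "--outSAMtype", "BAM", "Unsorted",
        "--outSAMstrandField", "intronMotif",
        "--outFilterIntronMotifs", "RemoveNoncanonical",
        "--outReadsUnmapped", "Fastx",
        "--outFileNamePrefix", f"{sample}.",
    ]
    counts = [
        "featureCounts", "-s", "2", "-t", "exon", "-g", "gene_name",
        "-a", gtf, "-o", f"{sample}.counts.txt",
        f"{sample}.Aligned.out.bam",            # STAR's unsorted BAM output name
    ]
    return [fastp, star, counts]


for cmd in build_commands("sample1", "star_index", "annotation.gtf"):
    print(shlex.join(cmd))
```

Keeping the commands as argument lists avoids quoting problems and makes it straightforward to pass them to `subprocess.run` if execution is desired.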
Genomic statistical tests
For RNA-seq, statistical significance was determined using Mann-Whitney pairwise tests unless otherwise noted. The details of violin plots, statistical tests and error bars are explained in their respective figure legends.
Supplementary Material
Movie S1. Different positions of INTS9 in ICM and the INTS9-INTS11-BRAT1 complex. Related to Figure 4. A movie morphing the position of INTS9 from that in the INTS9-INTS11-BRAT1 complex to that in the ICM. The structures of the two complexes are overlaid based on INTS11 (cyan). BRAT1 is in pink, and INTS4 in light blue.
Table S1. Mass-Spectrometry datasets as Excel spreadsheets. Related to Figure 1.
Table S2. Disease-causing BRAT1 missense mutations. Related to Figure 4.
Table S3. Oligonucleotides used in the study. Related to STAR Methods.
Highlights.
BRAT1/CG7044 is a binding partner that stabilizes INTS11 in the cytoplasm.
BRAT1/CG7044 is required for Integrator function in the nucleus.
The conserved C-terminus of BRAT1/CG7044 is captured in the INTS11 active site.
UBE3D is a binding partner of CPSF73 with a similar mechanism of action.
ACKNOWLEDGEMENTS
We thank David Baillat and Todd Albrecht for help in generating Drosophila Integrator antibodies, the Genomics Research Core at The University of Rochester Medical Center, Kevin Welle and other staff at The University of Rochester Mass Spectrometry Resource Lab, and other members of the Wagner lab for helpful discussions. We thank Oliver Clarke for helpful discussions, Zhening Zhang, Bob Grassucci, and the staff at the Columbia University Cryo-Electron Microscopy Center for help with cryo-EM data collection, Huihui Kuang and the staff at the New York Structural Biology Center for help with cryo-EM data collection. This work was supported by National Institutes of Health grants R01NS135070 (C.P., L.T., E.J.W.), R01GM134539 (E.J.W.), R35GM118093 (L.T.), T32GM135134 (M.J., M.H.), F31CA284555 (M.J.), and the Elon Huntington Hooker Fellowship (M.J.). The UTMB Mass Spectrometry Facility is supported in part by The Cancer Prevention Research Institute of Texas (CPRIT) grant number RP190682 (W.K.R).
Footnotes
DECLARATION OF INTERESTS
The authors declare no competing interests.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
REFERENCES
- 1.Eaton JD, and West S (2020). Termination of Transcription by RNA Polymerase II: BOOM! Trends Genet 36, 664–675. 10.1016/j.tig.2020.05.008. [DOI] [PubMed] [Google Scholar]
- 2.Rodriguez-Molina JB, West S, and Passmore LA (2023). Knowing when to stop: Transcription termination on protein-coding genes by eukaryotic RNAPII. Mol Cell 83, 404–415. 10.1016/j.molcel.2022.12.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Marzluff WF, Wagner EJ, and Duronio RJ (2008). Metabolism and regulation of canonical histone mRNAs: life without a poly(A) tail. Nat Rev Genet 9, 843–854. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Elrod ND, Henriques T, Huang KL, Tatomer DC, Wilusz JE, Wagner EJ, and Adelman K (2019). The Integrator Complex Attenuates Promoter-Proximal Transcription at Protein-Coding Genes. Mol Cell 76, 738–752 e737. 10.1016/j.molcel.2019.10.034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Tatomer DC, Elrod ND, Liang D, Xiao MS, Jiang JZ, Jonathan M, Huang KL, Wagner EJ, Cherry S, and Wilusz JE (2019). The Integrator complex cleaves nascent mRNAs to attenuate transcription. Genes Dev 33, 1525–1538. 10.1101/gad.330167.119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6. Huang KL, Jee D, Stein CB, Elrod ND, Henriques T, Mascibroda LG, Baillat D, Russell WK, Adelman K, and Wagner EJ (2020). Integrator Recruits Protein Phosphatase 2A to Prevent Pause Release and Facilitate Transcription Termination. Mol Cell 80, 345–358 e349. 10.1016/j.molcel.2020.08.016.
- 7. Fianu I, Chen Y, Dienemann C, Dybkov O, Linden A, Urlaub H, and Cramer P (2021). Structural basis of Integrator-mediated transcription regulation. Science 374, 883–887. 10.1126/science.abk0154.
- 8. Lykke-Andersen S, Zumer K, Molska ES, Rouviere JO, Wu G, Demel C, Schwalb B, Schmid M, Cramer P, and Jensen TH (2021). Integrator is a genome-wide attenuator of non-productive transcription. Mol Cell 81, 514–529 e516. 10.1016/j.molcel.2020.12.014.
- 9. Stein CB, Field AR, Mimoso CA, Zhao C, Huang KL, Wagner EJ, and Adelman K (2022). Integrator endonuclease drives promoter-proximal termination at all RNA polymerase II-transcribed loci. Mol Cell. 10.1016/j.molcel.2022.10.004.
- 10. Hu S, Peng L, Song A, Ji YX, Cheng J, Wang M, and Chen FX (2023). INTAC endonuclease and phosphatase modules differentially regulate transcription by RNA polymerase II. Mol Cell 83, 1588–1604 e1585. 10.1016/j.molcel.2023.03.022.
- 11. Vervoort SJ, Welsh SA, Devlin JR, Barbieri E, Knight DA, Offley S, Bjelosevic S, Costacurta M, Todorovski I, Kearney CJ, et al. (2021). The PP2A-Integrator-CDK9 axis fine-tunes transcription and can be targeted therapeutically in cancer. Cell 184, 3143–3162 e3132. 10.1016/j.cell.2021.04.022.
- 12. Zheng H, Qi Y, Hu S, Cao X, Xu C, Yin Z, Chen X, Li Y, Liu W, Li J, et al. (2020). Identification of Integrator-PP2A complex (INTAC), an RNA polymerase II phosphatase. Science 370. 10.1126/science.abb5872.
- 13. Cortazar MA, Sheridan RM, Erickson B, Fong N, Glover-Cutter K, Brannan K, and Bentley DL (2019). Control of RNA Pol II Speed by PNUTS-PP1 and Spt5 Dephosphorylation Facilitates Termination by a "Sitting Duck Torpedo" Mechanism. Mol Cell 76, 896–908 e894. 10.1016/j.molcel.2019.09.031.
- 14. Schreieck A, Easter AD, Etzold S, Wiederhold K, Lidschreiber M, Cramer P, and Passmore LA (2014). RNA polymerase II termination involves C-terminal-domain tyrosine dephosphorylation by CPF subunit Glc7. Nat Struct Mol Biol 21, 175–179. 10.1038/nsmb.2753.
- 15. Zheng H, Jin Q, Wang X, Qi Y, Liu W, Ren Y, Zhao D, Xavier Chen F, Cheng J, Chen X, and Xu Y (2023). Structural basis of INTAC-regulated transcription. Protein Cell 14, 698–702. 10.1093/procel/pwad010.
- 16. Mandel CR, Kaneko S, Zhang H, Gebauer D, Vethantham V, Manley JL, and Tong L (2006). Polyadenylation factor CPSF-73 is the pre-mRNA 3'-end-processing endonuclease. Nature 444, 953–956.
- 17. Gutierrez PA, Wei J, Sun Y, and Tong L (2022). Molecular basis for the recognition of the AUUAAA polyadenylation signal by mPSF. RNA 28, 1534–1541. 10.1261/rna.079322.122.
- 18. Clerici M, Faini M, Muckenfuss LM, Aebersold R, and Jinek M (2018). Structural basis of AAUAAA polyadenylation signal recognition by the human CPSF complex. Nat Struct Mol Biol 25, 135–138. 10.1038/s41594-017-0020-6.
- 19. Sun Y, Zhang Y, Hamilton K, Manley JL, Shi Y, Walz T, and Tong L (2018). Molecular basis for the recognition of the human AAUAAA polyadenylation signal. Proc Natl Acad Sci U S A 115, E1419–E1428. 10.1073/pnas.1718723115.
- 20. Eaton JD, Davidson L, Bauer DLV, Natsume T, Kanemaki MT, and West S (2018). Xrn2 accelerates termination by RNA polymerase II, which is underpinned by CPSF73 activity. Genes Dev 32, 127–139. 10.1101/gad.308528.117.
- 21. Wagner EJ, Tong L, and Adelman K (2023). Integrator is a global promoter-proximal termination complex. Mol Cell. 10.1016/j.molcel.2022.11.012.
- 22. Baillat D, Hakimi M-A, Naar AM, Shilatifard A, Cooch N, and Shiekhattar R (2005). Integrator, a multiprotein mediator of small nuclear RNA processing, associates with the C-terminal repeat of RNA polymerase II. Cell 123, 265–276.
- 23. Baillat D, and Wagner EJ (2015). Integrator: surprisingly diverse functions in gene expression. Trends Biochem. Sci 40, 257–264.
- 24. Mendoza-Figueroa MS, Tatomer DC, and Wilusz JE (2020). The Integrator Complex in Transcription and Development. Trends Biochem Sci 45, 923–934. 10.1016/j.tibs.2020.07.004.
- 25. Kirstein N, Gomes Dos Santos H, Blumenthal E, and Shiekhattar R (2020). The Integrator complex at the crossroad of coding and noncoding RNA. Curr Opin Cell Biol 70, 37–43. 10.1016/j.ceb.2020.11.003.
- 26. Beltran T, Pahita E, Ghosh S, Lenhard B, and Sarkies P (2020). Integrator is recruited to promoter-proximally paused RNA Pol II to generate Caenorhabditis elegans piRNA precursors. EMBO J 40, e105564. 10.15252/embj.2020105564.
- 27. Beckedorff F, Blumenthal E, daSilva LF, Aoi Y, Cingaram PR, Yue J, Zhang A, Dokaneheifard S, Valencia MG, Gaidosh G, et al. (2020). The Human Integrator Complex Facilitates Transcriptional Elongation by Endonucleolytic Cleavage of Nascent Transcripts. Cell Rep 32, 107917. 10.1016/j.celrep.2020.107917.
- 28. Rosa-Mercado NA, Zimmer JT, Apostolidi M, Rinehart J, Simon MD, and Steitz JA (2021). Hyperosmotic stress alters the RNA polymerase II interactome and induces readthrough transcription despite widespread transcriptional repression. Mol Cell 81, 502–513 e504. 10.1016/j.molcel.2020.12.002.
- 29. Thomas QA, Ard R, Liu J, Li B, Wang J, Pelechano V, and Marquardt S (2020). Transcript isoform sequencing reveals widespread promoter-proximal transcriptional termination in Arabidopsis. Nat Commun 11, 2589. 10.1038/s41467-020-16390-7.
- 30. Chen J, Waltenspiel B, Warren WD, and Wagner EJ (2013). Functional analysis of the integrator subunit 12 identifies a microdomain that mediates activation of the Drosophila integrator complex. J Biol Chem 288, 4867–4877. 10.1074/jbc.M112.425892.
- 31. Pfleiderer MM, and Galej WP (2021). Structure of the catalytic core of the Integrator complex. Mol Cell 81, 1246–1259 e1248. 10.1016/j.molcel.2021.01.005.
- 32. Offley SR, Pfleiderer MM, Zucco A, Fraudeau A, Welsh SA, Razew M, Galej WP, and Gardini A (2023). A combinatorial approach to uncover an additional Integrator subunit. Cell Rep 42, 112244. 10.1016/j.celrep.2023.112244.
- 33. Albrecht TR, Shevtsov SP, Wu Y, Mascibroda LG, Peart NJ, Huang KL, Sawyer IA, Tong L, Dundr M, and Wagner EJ (2018). Integrator subunit 4 is a 'Symplekin-like' scaffold that associates with INTS9/11 to form the Integrator cleavage module. Nucleic Acids Res 46, 4241–4255. 10.1093/nar/gky100.
- 34. Lin MH, Jensen MK, Elrod ND, Huang KL, Welle KA, Wagner EJ, and Tong L (2022). Inositol hexakisphosphate is required for Integrator function. Nat Commun 13, 5742. 10.1038/s41467-022-33506-3.
- 35. Sun Y, Hamilton K, and Tong L (2020). Recent molecular insights into canonical pre-mRNA 3'-end processing. Transcription 11, 83–96. 10.1080/21541264.2020.1777047.
- 36. Dominski Z, Yang X-C, and Marzluff WF (2005). The polyadenylation factor CPSF-73 is involved in histone-pre-mRNA processing. Cell 123, 37–48.
- 37. Sun Y, Zhang Y, Aik WS, Yang XC, Marzluff WF, Walz T, Dominski Z, and Tong L (2020). Structure of an active human histone pre-mRNA 3'-end processing machinery. Science 367, 700–703. 10.1126/science.aaz7758.
- 38. Callebaut I, Moshous D, Mornon J-P, and de Villartay J-P (2002). Metallo-β-lactamase fold within nucleic acids processing enzymes: the β-CASP family. Nucleic Acids Res 30, 3592–3601.
- 39. Malovannaya A, Lanz RB, Jung SY, Bulynko Y, Le NT, Chan DW, Ding C, Shi Y, Yucer N, Krenciute G, et al. (2011). Analysis of the human endogenous coregulator complexome. Cell 145, 787–799. 10.1016/j.cell.2011.05.006.
- 40. Malovannaya A, Li Y, Bulynko Y, Jung SY, Wang Y, Lanz RB, O'Malley BW, and Qin J (2010). Streamlined analysis schema for high-throughput identification of endogenous protein complexes. Proc Natl Acad Sci U S A 107, 2431–2436.
- 41. Cihlarova Z, Kubovciak J, Sobol M, Krejcikova K, Sachova J, Kolar M, Stanek D, Barinka C, Yoon G, Caldecott KW, and Hanzlikova H (2022). BRAT1 links Integrator and defective RNA processing with neurodegeneration. Nat Commun 13, 5026. 10.1038/s41467-022-32763-6.
- 42. Tilley FC, Arrondel C, Chhuon C, Boisson M, Cagnard N, Parisot M, Menara G, Lefort N, Guerrera IC, Bole-Feysot C, et al. (2021). Disruption of pathways regulated by Integrator complex in Galloway-Mowat syndrome due to WDR73 mutations. Sci Rep 11, 5388. 10.1038/s41598-021-84472-7.
- 43. Xu C, Li C, Chen J, Xiong Y, Qiao Z, Fan P, Li C, Ma S, Liu J, Song A, et al. (2023). R-loop-dependent promoter-proximal termination ensures genome stability. Nature 621, 610–619. 10.1038/s41586-023-06515-5.
- 44. Huang J, Gong Z, Ghosal G, and Chen J (2009). SOSS complexes participate in the maintenance of genomic stability. Mol Cell 35, 384–393. 10.1016/j.molcel.2009.06.011.
- 45. Jia Y, Cheng Z, Bharath SR, Sun Q, Su N, Huang J, and Song H (2021). Crystal structure of the INTS3/INTS6 complex reveals the functional importance of INTS3 dimerization in DSB repair. Cell Discov 7, 66. 10.1038/s41421-021-00283-0.
- 46. Aglipay JA, Martin SA, Tawara H, Lee SW, and Ouchi T (2006). ATM activation by ionizing radiation requires BRCA1-associated BAAT1. J Biol Chem 281, 9710–9718. 10.1074/jbc.M510332200.
- 47. Van Ommeren RH, Gao AF, Blaser SI, Chitayat DA, and Hazrati LN (2018). BRAT1 Mutation: The First Reported Case of Chinese Origin and Review of the Literature. J Neuropathol Exp Neurol 77, 1071–1078. 10.1093/jnen/nly093.
- 48. Fowkes R, Elwan M, Akay E, Mitchell CJ, Thomas RH, and Lewis-Smith D (2022). A review of the clinical spectrum of BRAT1 disorders and case of developmental and epileptic encephalopathy surviving into adulthood. Epilepsy Behav Rep 19, 100549. 10.1016/j.ebr.2022.100549.
- 49. Oegema R, Baillat D, Schot R, van Unen LM, Brooks A, Kia SK, Hoogeboom AJM, Xia Z, Li W, Cesaroni M, et al. (2017). Human mutations in integrator complex subunits link transcriptome integrity to brain development. PLoS Genet 13, e1006809. 10.1371/journal.pgen.1006809.
- 50. Krall M, Htun S, Schnur RE, Brooks AS, Baker L, de Alba Campomanes A, Lamont RE, Gripp KW, Care4Rare Canada Consortium, Schneidman-Duhovny D, et al. (2019). Biallelic sequence variants in INTS1 in patients with developmental delays, cataracts, and craniofacial anomalies. Eur J Hum Genet 27, 582–593. 10.1038/s41431-018-0298-9.
- 51. Tepe B, Macke EL, Niceta M, Weisz Hubshman M, Kanca O, Schultz-Rogers L, Zarate YA, Schaefer GB, Granadillo De Luque JL, Wegner DJ, et al. (2023). Biallelic variants in INTS11 are associated with a complex neurological disorder. Am J Hum Genet 110, 774–789. 10.1016/j.ajhg.2023.03.012.
- 52. Wu Y, Albrecht TR, Baillat D, Wagner EJ, and Tong L (2017). Molecular basis for the interaction between Integrator subunits IntS9 and IntS11 and its functional importance. Proc Natl Acad Sci U S A 114, 4394–4399. 10.1073/pnas.1616605114.
- 53. Low LH, Chow YL, Li Y, Goh CP, Putz U, Silke J, Ouchi T, Howitt J, and Tan SS (2015). Nedd4 family interacting protein 1 (Ndfip1) is required for ubiquitination and nuclear trafficking of BRCA1-associated ATM activator 1 (BRAT1) during the DNA damage response. J Biol Chem 290, 7141–7150. 10.1074/jbc.M114.613687.
- 54. Dominski Z, Yang XC, Purdy M, Wagner EJ, and Marzluff WF (2005). A CPSF-73 homologue is required for cell cycle progression but not cell growth and interacts with a protein having features of CPSF-100. Mol Cell Biol 25, 1489–1500.
- 55. Albrecht TR, and Wagner EJ (2012). snRNA 3' end formation requires heterodimeric association of integrator subunits. Mol Cell Biol 32, 1112–1123. 10.1128/MCB.06511-11.
- 56. Chen J, Ezzeddine N, Waltenspiel B, Albrecht TR, Warren WD, Marzluff WF, and Wagner EJ (2012). An RNAi screen identifies additional members of the Drosophila Integrator complex and a requirement for cyclin C/Cdk8 in snRNA 3'-end formation. RNA 18, 2148–2156. 10.1261/rna.035725.112.
- 57. Ezzeddine N, Chen J, Waltenspiel B, Burch B, Albrecht T, Zhuo M, Warren WD, Marzluff WF, and Wagner EJ (2011). A subset of Drosophila integrator proteins is essential for efficient U7 snRNA and spliceosomal snRNA 3'-end formation. Mol Cell Biol 31, 328–341. 10.1128/MCB.00943-10.
- 58. Horn D, Weschke B, Knierim E, Fischer-Zirnsak B, Stenzel W, Schuelke M, and Zemojtel T (2016). BRAT1 mutations are associated with infantile epileptic encephalopathy, mitochondrial dysfunction, and survival into childhood. Am J Med Genet A 170, 2274–2281. 10.1002/ajmg.a.37798.
- 59. Mahjoub A, Cihlarova Z, Tetreault M, MacNeil L, Sondheimer N, Caldecott KW, Hanzlikova H, Yoon G, and Care4Rare Canada Consortium (2019). Homozygous pathogenic variant in BRAT1 associated with nonprogressive cerebellar ataxia. Neurol Genet 5, e359. 10.1212/NXG.0000000000000359.
- 60. Srivastava S, and Naidu S (2016). Epileptic Encephalopathy Due to BRAT1 Pathogenic Variants. Pediatr Neurol Briefs 30, 45. 10.15844/pedneurbriefs-30-12-1.
- 61. Xiang Y, Tanaka Y, Patterson B, Kang YJ, Govindaiah G, Roselaar N, Cakir B, Kim KY, Lombroso AP, Hwang SM, et al. (2017). Fusion of Regionally Specified hPSC-Derived Organoids Models Human Brain Development and Interneuron Migration. Cell Stem Cell 21, 383–398 e387. 10.1016/j.stem.2017.07.007.
- 62. Anderson SA, Eisenstat DD, Shi L, and Rubenstein JL (1997). Interneuron migration from basal forebrain to neocortex: dependence on Dlx genes. Science 278, 474–476. 10.1126/science.278.5337.474.
- 63. Cobos I, Calcagnotto ME, Vilaythong AJ, Thwin MT, Noebels JL, Baraban SC, and Rubenstein JL (2005). Mice lacking Dlx1 show subtype-specific loss of interneurons, reduced inhibition and epilepsy. Nat Neurosci 8, 1059–1068. 10.1038/nn1499.
- 64. Eisenstat DD, Liu JK, Mione M, Zhong W, Yu G, Anderson SA, Ghattas I, Puelles L, and Rubenstein JL (1999). DLX-1, DLX-2, and DLX-5 expression define distinct stages of basal forebrain differentiation. J Comp Neurol 414, 217–237.
- 65. Le TN, Du G, Fonseca M, Zhou QP, Wigle JT, and Eisenstat DD (2007). Dlx homeobox genes promote cortical interneuron migration from the basal forebrain by direct repression of the semaphorin receptor neuropilin-2. J Biol Chem 282, 19071–19081. 10.1074/jbc.M607486200.
- 66. Liu X, Novosedlik N, Wang A, Hudson ML, Cohen IL, Chudley AE, Forster-Gibson CJ, Lewis SM, and Holden JJ (2009). The DLX1 and DLX2 genes and susceptibility to autism spectrum disorders. Eur J Hum Genet 17, 228–235. 10.1038/ejhg.2008.148.
- 67. Perera M, Merlo GR, Verardo S, Paleari L, Corte G, and Levi G (2004). Defective neuronogenesis in the absence of Dlx5. Mol Cell Neurosci 25, 153–161. 10.1016/j.mcn.2003.10.004.
- 68. Wonders C, and Anderson S (2005). Beyond migration: Dlx1 regulates interneuron differentiation. Nat Neurosci 8, 979–981. 10.1038/nn0805-979.
- 69. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589. 10.1038/s41586-021-03819-2.
- 70. Holm L, Kaariainen S, Rosenstrom P, and Schenkel A (2008). Searching protein structure databases with DaliLite v.3. Bioinformatics 24, 2780–2781.
- 71. Oughtred R, Rust J, Chang C, Breitkreutz BJ, Stark C, Willems A, Boucher L, Leung G, Kolas N, Zhang F, et al. (2021). The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci 30, 187–200. 10.1002/pro.3978.
- 72. Evans R, O’Neill M, Pritzel A, Antropova N, Senior AW, Green T, Zidek A, Bates R, Blackwell S, Yim J, et al. (2022). Protein complex prediction with AlphaFold-Multimer. bioRxiv.
- 73. Briscoe J (2009). Making a grade: Sonic Hedgehog signalling and the control of neural cell fate. EMBO J 28, 457–465. 10.1038/emboj.2009.12.
- 74. Pearson EL, Graber JH, Lee SD, Naggert KS, and Moore CL (2019). Ipa1 Is an RNA Polymerase II Elongation Factor that Facilitates Termination by Maintaining Levels of the Poly(A) Site Endonuclease Ysh1. Cell Rep 26, 1919–1933.e1915. 10.1016/j.celrep.2019.01.051.
- 75. Lee SD, Liu HY, Graber JH, Heller-Trulli D, Kaczmarek Michaels K, Cerezo JF, and Moore CL (2020). Regulation of the Ysh1 endonuclease of the mRNA cleavage/polyadenylation complex by ubiquitin-mediated degradation. RNA Biol 17, 689–702. 10.1080/15476286.2020.1724717.
- 76. Heller-Trulli D, Liu H, Mukherjee S, and Moore CL (2022). UBE3D Regulates mRNA 3'-End Processing and Maintains Adipogenic Potential in 3T3-L1 Cells. Mol Cell Biol 42, e0017422. 10.1128/mcb.00174-22.
- 77. Liu H, Heller-Trulli D, and Moore CL (2022). Targeting the mRNA endonuclease CPSF73 inhibits breast cancer cell migration, invasion, and self-renewal. iScience 25, 104804. 10.1016/j.isci.2022.104804.
- 78. Sabath K, Qiu C, and Jonas S (2024). Assembly mechanism of Integrator’s RNA cleavage module. Mol Cell NA.
- 79. Dignam JD, Lebovitz RM, and Roeder RG (1983). Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res 11, 1475–1489. 10.1093/nar/11.5.1475.
- 80. Anderson AP, Luo X, Russell W, and Yin YW (2020). Oxidative damage diminishes mitochondrial DNA polymerase replication fidelity. Nucleic Acids Res 48, 817–829. 10.1093/nar/gkz1018.
- 81. Cox J, Hein MY, Luber CA, Paron I, Nagaraj N, and Mann M (2014). Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol Cell Proteomics 13, 2513–2526. 10.1074/mcp.M113.031591.
- 82. Sari D, Gupta K, Thimiri Govinda Raj DB, Aubert A, Drncova P, Garzoni F, Fitzgerald D, and Berger I (2016). The MultiBac baculovirus/insect cell expression vector system for producing complex protein biologics. Adv. Exp. Med. Biol 896, 199–215.
- 83. Gradia SD, Ishida JP, Tsai MS, Jeans C, Tainer JA, and Fuss JO (2017). MacroBac: New Technologies for Robust and Efficient Large-Scale Production of Recombinant Multiprotein Complexes. Methods Enzymol 592, 1–26. 10.1016/bs.mie.2017.03.008.
- 84. Suloway C, Pulokas J, Fellmann D, Cheng A, Guerra F, Quispe J, Stagg S, Potter CS, and Carragher B (2005). Automated molecular microscopy: the new Leginon system. J. Struct. Biol 151, 41–60.
- 85. Zivanov J, Nakane T, Forsberg BO, Kimanius D, Hagen WJH, Lindahl E, and Scheres SH (2018). New tools for automated high-resolution cryo-EM structure determination in RELION-3. eLife 7, e42166.
- 86. Rohou A, and Grigorieff N (2015). CTFFIND4: fast and accurate defocus estimation from electron micrographs. J. Struct. Biol 192, 216–221.
- 87. Punjani A, Rubinstein JL, Fleet DJ, and Brubaker MA (2017). cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat Methods 14, 290–296.
- 88. Emsley P, and Cowtan KD (2004). Coot: model-building tools for molecular graphics. Acta Cryst. D 60, 2126–2132.
- 89. Roy A, Kucukural A, and Zhang Y (2010). I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protoc 5, 725–738.
- 90. Zhang Y, Sun Y, Shi Y, Walz T, and Tong L (2020). Structural insights into the human pre-mRNA 3’-end processing machinery. Mol. Cell 77, 800–809.
- 91. Liebschner D, Afonine PV, Baker ML, Bunkóczi G, Chen VB, Croll TI, Hintze B, Hung LW, Jain S, McCoy AJ, et al. (2019). Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr D Struct Biol 75, 861–877. 10.1107/s2059798319011471.
- 92. Elfmann C, and Stulke J (2023). PAE viewer: a Webserver for the interactive visualization of the predicted aligned error for multimer structure predictions and crosslinks. Nucleic Acids Res 51, W404–W410. 10.1093/nar/gkad350.
- 93. Haeussler M, Schonig K, Eckert H, Eschstruth A, Mianne J, Renaud JB, Schneider-Maunoury S, Shkumatava A, Teboul L, Kent J, et al. (2016). Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol 17, 148. 10.1186/s13059-016-1012-2.
- 94. Chambers SM, Fasano CA, Papapetrou EP, Tomishima M, Sadelain M, and Studer L (2009). Highly efficient neural conversion of human ES and iPS cells by dual inhibition of SMAD signaling. Nat Biotechnol 27, 275–280. 10.1038/nbt.1529.
- 95. Jaworski E, and Routh A (2018). ClickSeq: Replacing Fragmentation and Enzymatic Ligation with Click-Chemistry to Prevent Sequence Chimeras. Methods Mol Biol 1712, 71–85. 10.1007/978-1-4939-7514-3_6.
- 96. Chen S, Zhou Y, Chen Y, and Gu J (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890. 10.1093/bioinformatics/bty560.
- 97. Frankish A, Diekhans M, Jungreis I, Lagarde J, Loveland JE, Mudge JM, Sisu C, Wright JC, Armstrong J, Barnes I, et al. (2021). Gencode 2021. Nucleic Acids Res 49, D916–D923. 10.1093/nar/gkaa1087.
- 98. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, and Gingeras TR (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21. 10.1093/bioinformatics/bts635.
- 99. Liao Y, Smyth GK, and Shi W (2019). The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res 47, e47. 10.1093/nar/gkz114.
- 100. Love MI, Huber W, and Anders S (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550. 10.1186/s13059-014-0550-8.
- 101. Marini F, and Binder H (2019). pcaExplorer: an R/Bioconductor package for interacting with RNA-seq principal components. BMC Bioinformatics 20, 331. 10.1186/s12859-019-2879-1.
- 102. Chen EY, Tan CM, Kou Y, Duan Q, Wang Z, Meirelles GV, Clark NR, and Ma'ayan A (2013). Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics 14, 128. 10.1186/1471-2105-14-128.
- 103. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A, et al. (2016). Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res 44, W90–97. 10.1093/nar/gkw377.
- 104. Xie Z, Bailey A, Kuleshov MV, Clarke DJB, Evangelista JE, Jenkins SL, Lachmann A, Wojciechowicz ML, Kropiwnicki E, Jagodnik KM, et al. (2021). Gene Set Knowledge Discovery with Enrichr. Curr Protoc 1, e90. 10.1002/cpz1.90.
- 105. Ge SX, Jung D, and Yao R (2020). ShinyGO: a graphical gene-set enrichment tool for animals and plants. Bioinformatics 36, 2628–2629. 10.1093/bioinformatics/btz931.
- 106. Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Use R!, 2nd ed. Springer International Publishing.
- 107. Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, and Mesirov JP (2011). Integrative genomics viewer. Nat Biotechnol 29, 24–26. 10.1038/nbt.1754.
Associated Data
Supplementary Materials
Movie S1. Different positions of INTS9 in ICM and the INTS9-INTS11-BRAT1 complex. Related to Figure 4. A movie morphing the position of INTS9 from that in the INTS9-INTS11-BRAT1 complex to that in the ICM. The structures of the two complexes are overlaid based on INTS11 (cyan). BRAT1 is in pink, and INTS4 in light blue.
Table S1. Mass spectrometry datasets as Excel spreadsheets. Related to Figure 1.
Table S2. Disease-causing BRAT1 missense mutations. Related to Figure 4.
Table S3. Oligonucleotides used in the study. Related to STAR Methods.
Data Availability Statement
The atomic coordinates have been deposited in the PDB (entry codes 8UIB and 8UIC). All datasets generated in this study are available for download from GEO (accession number GSE246833). All source data are available at Mendeley Data (DOI: 10.17632/ygprhc3hvt.1). The deposited data will be publicly available upon publication of the paper.
This paper does not report original code.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.