Abstract
CSTF2 encodes an RNA-binding protein that is essential for mRNA cleavage and polyadenylation (C/P). No disease-associated mutations have been described for this gene. Here, we report a mutation in the RNA recognition motif (RRM) of CSTF2 that changes an aspartic acid at position 50 to alanine (p.D50A), resulting in intellectual disability in male patients. In mice, this mutation was sufficient to alter polyadenylation sites in over 1300 genes critical for brain development. Using a reporter gene assay, we demonstrated that C/P efficiency of CSTF2D50A was lower than wild type. To account for this, we determined that p.D50A changed locations of amino acid side chains altering RNA binding sites in the RRM. The changes modified the electrostatic potential of the RRM leading to a greater affinity for RNA. These results highlight the significance of 3′ end mRNA processing in expression of genes important for brain plasticity and neuronal development.
INTRODUCTION
Learning, memory, and intelligence require synaptic plasticity involving persistent changes in neural gene expression. Abnormalities in brain development can result in neurodevelopmental disorders that impact intellectual functioning, behavioral deficits involving adaptive behaviors and autism spectrum disorders. Most neuronal developmental changes are accomplished by post-transcriptional processes that alter the regulation and protein coding capacity of specific mRNAs (1,2). Similarly, neurodegenerative conditions are associated with altered mRNA metabolism. Frequently, mRNAs in the brain have extremely long 3′ untranslated regions (UTRs), and altered 3′ UTRs in MECP2 (Rett syndrome), APP (Alzheimer disease) and HTT (Huntington disease) are associated with improper metabolism of each of these mRNAs leading to disease states. Though rarer, monogenic forms of intellectual deficiencies offer specific insight into neuronal plasticity and development. The most common monogenic forms of intellectual deficiency are X-linked (XLID (3,4)). For example, Fragile X Syndrome is caused by expansion of CGG repeats in the 5′ UTR of FMR1, resulting in loss-of-function of FMRP, an RNA-binding protein that promotes transport of mRNAs to dendrites for protein synthesis at synaptic sites (5,6). Similarly, mutations in mRNA processing genes encoding decapping enzymes (7,8), the polyglutamine binding protein 1 (PQBP1 (9)), spliceosomal proteins (10), hnRNA-binding proteins (11), mRNA surveillance proteins (12) and cleavage and polyadenylation factors (13,14) cause intellectual disabilities. These disorders are associated with changes in the mRNA processing landscape, especially 3′ end cleavage and polyadenylation (C/P), highlighting the importance of RNA processing in controlling neuronal function (15–18).
More than eighty different proteins are involved in mRNA C/P (19). Two core C/P factors are the cleavage and polyadenylation specificity factor (CPSF) and the cleavage stimulation factor (CstF). CPSF has six subunits; it recognizes the polyadenylation signal (AAUAAA and closely related sequences), cleaves the nascent pre-mRNA, then recruits the poly(A) polymerase, which adds up to 250 non-template adenosines to the upstream product of the cleavage reaction (20,21). CstF is the regulatory factor in C/P, consisting of three subunits, CstF-50 (gene symbol CSTF1), CstF-64 (CSTF2) and CstF-77 (CSTF3). CSTF2, the RNA-binding component of CstF, binds to U- or GU-rich sequences downstream of the cleavage site through its RNA recognition motif (RRM (22)). As such, CSTF2 regulates gene expression in immune cells (23–25), spermatogenesis (26–29), and embryonic stem cell development (30,31). Furthermore, the CSTF2 paralog, Cstf2t, has been shown to affect learning and memory in mice (32).
Here, we describe members of a family in whom a single nucleotide mutation in the RRM of CSTF2 changes an aspartic acid at amino acid 50 to an alanine (D50A). The probands presented with non-syndromic intellectual disability. The mutation, which is X-linked, co-segregated with the clinical phenotype. In a reporter gene assay, CSTF2 containing the D50A mutation (CSTF2D50A) reduced C/P by 15%. However, the CSTF2D50A RRM showed an almost 2-fold increase in its affinity for RNA. Solution state nuclear magnetic resonance (NMR) studies showed that the overall backbone structure of the CSTF2D50A RRM was similar to wild type. However, repositioning of side chains in RNA binding sites and the α4-helix resulted in changes in the electrostatic potential and fast timescale protein dynamics of the RNA-bound state. Together, these changes led to an increased affinity of the CSTF2D50A RRM for RNA due to a faster kon rate. Differential gene expression analysis of RNA isolated from brains of male mice harboring the D50A mutation (Cstf2D50A) identified fourteen genes important for synapse formation and neuronal development. Genome-wide 3′ end sequencing of polyadenylated mRNAs identified >1300 genes with altered C/P sites, leading to alternative polyadenylation of genes involved in neurogenesis, neuronal differentiation and development, and neuronal projection development. Our results indicate that in affected individuals, different affinity and rate of binding to the nascent mRNAs of the mutant CSTF2 RRM could misregulate expression of genes critical for brain plasticity and development.
MATERIALS AND METHODS
Human subjects
All cases had a normal karyotype, were negative for FMR1 repeat expansion, and large insertions or deletions were excluded using array Comparative Genomic Hybridization (CGH). The study was approved by all institutional review boards of the participating institutions collecting the samples, and written informed consent was obtained from all participants or their legal guardians.
DNA isolation and X-chromosome exome sequencing and segregation analysis
DNAs from the family members were isolated from peripheral blood using standard techniques. X-chromosome exome enrichment using DNA from the index patient, sequencing and analysis was performed as previously described (4). Segregation analysis of variants of uncertain clinical significance was performed by PCR using gene-specific primers flanking the respective variant identified followed by Sanger sequencing.
Animal use and generation of Cstf2D50A mice
All animal treatments and tissues obtained in the study were performed according to protocols approved by the Institutional Animal Care and Use Committee at the Texas Tech University Health Sciences Center in accordance with the National Institutes of Health animal welfare guidelines. TTUHSC’s vivarium is AAALAC-certified and has a 12-/12-h light/dark cycle with temperature and relative humidity of 20–22°C and 30–50%, respectively.
B6;Cstf2em1Ccma-D50A founder mice (herein Cstf2D50A; hemizygous males are designated Cstf2D50A/Y) were generated by Cyagen US Inc. (Santa Clara, CA, USA). To create C57BL/6 mice with a point mutation (D50A) in the Cstf2 locus, exon 3 in the mouse Cstf2 gene (GenBank accession number: NM_133196.6; Ensembl: ENSMUSG00000031256) located on mouse chromosome X was selected as the target site. The D50A (GAT to GCT, see Figure 1 and Supplementary Figure S1) mutation site in the donor oligo was introduced into exon 3 by homology-directed repair. A gRNA targeting vector and donor oligo (with targeting sequence, flanked by 120 bp homologous sequences combined on both sides) was designed. Cas9 mRNA, sgRNA and donor oligo were co-injected into zygotes for knock-in mouse production. The pups were genotyped by PCR followed by sequence analysis. Positive founders were bred to the next generation (F1) and subsequently genotyped by PCR and DNA sequencing analysis. Mutants were maintained as a congenic strain by backcrossing four generations to C57BL/6NCrl (Charles River) and subsequently breeding exclusively within the colony. At the time of this study, mice were bred to ∼10 generations.
Figure 1.
Family P167 carries a mutation in the X-linked CSTF2 gene that affects intelligence in males. (A) Pedigree of Family P167 showing the index proband (III:1) who was 6 years old at the time of the study, his affected brother (III:2), and two affected uncles (II:1 and II:2). Females (I:2 and II:4) were carriers for the trait, consistent with an X-linked recessive trait. Individuals labeled Mut (II:1, II:2, III:1, III:2) or Mut/WT (II:4) were tested for co-segregation of the mutation with the clinical phenotype by PCR and Sanger sequencing of the specific products. (B) The missense mutation in exon 3 of CSTF2 substitutes an alanine codon for an aspartic acid at position 50 (p.D50A) within the RNA recognition motif (RRM). (C) CLUSTAL alignment of CSTF2 orthologs in twenty-eight species indicates that the aspartic acid (amino acid D) at position 50 (red arrow) is highly conserved. Amino acids comprising site-I (Y25 and R85) and site-II (N19 and N91) are indicated in green and blue, respectively. An “*” (asterisk) indicates amino acids that are conserved; a “:” (colon) indicates amino acids with a score greater than 0.5 in the Clustal Omega matrix; and a “.” (period) indicates amino acids with a score smaller than or equal to 0.5 in the Clustal Omega matrix. (D) Domains of human CSTF2 (577 amino acids). Indicated are the RRM, the Hinge, the proline/glycine-rich region, the 12× MEARA repeats, and the C-terminal domain. Numbers at top indicate the amino acid positions of interest.
Genotyping of Cstf2D50A mice by PCR and restriction enzyme digestion
Genomic DNA was extracted from tail snips of Cstf2D50A mice by proteinase K digestion followed by isopropanol precipitation (26). PCRs were performed using specific primers surrounding the D50A mutation site. The presence of the mutation converted a CGATAGG sequence to CGCTAGG, thus introducing a BfaI restriction site (CTAG). Digestion of the PCR products with BfaI revealed the presence of the mutation (Supplementary Figure S1, PCR-RFLP). Forward primer: 5′-GAAAGCAGTCAGAGTGGGCT-3′; reverse primer: 5′-CTGCGAAGGAAAGTGAGGGT-3′.
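The logic of the PCR-RFLP genotyping can be illustrated with a minimal sketch. It assumes the BfaI recognition sequence 5′-CTAG-3′ (general knowledge about the enzyme, not stated in the text) and checks only the seven-nucleotide window quoted above; the function name `find_sites` is hypothetical, and the sketch does not model the full amplicon or fragment sizes.

```python
# Assumption: BfaI recognizes 5'-CTAG-3' (not stated in the Methods text).
BFAI_SITE = "CTAG"

WILD_TYPE = "CGATAGG"  # window around codon 50, as quoted in the text
MUTANT = "CGCTAGG"     # same window carrying the D50A (GAT to GCT) change


def find_sites(sequence, site=BFAI_SITE):
    """Return the 0-based start position of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


if __name__ == "__main__":
    # The wild type window has no site; the mutant window gains one.
    print("wild type:", find_sites(WILD_TYPE))
    print("mutant:   ", find_sites(MUTANT))
```

Only the mutant window contains the site, which is why digestion distinguishes the two alleles.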
Cell culture, transfection, stem-loop assay for polyadenylation, immunohistochemistry, tissue protein isolation and western blots
Culturing of HeLa cells was performed as described (22). Transfection was carried out in 24-well plates using Lipofectamine LTX (ThermoFisher Scientific) and 250 ng of a mixture of plasmid DNAs. Luciferase measurements were performed between 36 and 48 h after the transfection (22,33,34). Western blots to assess the abundance of the proteins were performed on the same volumes of lysates obtained from the luciferase measurements using the passive lysis buffer supplied with the Dual-Luciferase Reporter Assay System (Promega). Antibodies used for western blots were previously described (22,35). Immunohistochemistry and microscopic imaging were performed as described (22).
Protein tissue samples (wild type and Cstf2D50A/Y) were obtained from four brothers from the same litter. The entire brain and pieces representing all lobes of the liver were collected from the animals and homogenized in 3 ml RIPA buffer (50 mM Tris–HCl pH 7.4, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% deoxycholate) supplemented with PMSF and Halt™ Protease Inhibitor Cocktail (ThermoFisher). Samples were homogenized with several strokes of a Brinkmann polytron homogenizer and a third of the volume was spun at 16 000 × g for 10 min at 4°C. The supernatant was frozen at −80°C until use. 20 μg total protein from the brain and the liver was separated using 8% SDS-PAGE, then transferred to an Immobilon-P PVDF membrane, 0.45 μm (Sigma-Aldrich). Membranes were incubated overnight at 4°C with mouse anti-CSTF2 antibody clone 3A7 (35), rabbit anti-CSTF3 antibody (Bethyl laboratories) and the mouse anti-β-tubulin clone E7 (Developmental Studies Hybridoma Bank, University of Iowa). After the addition of appropriate HRP-conjugated secondary antibody, the immunoreactivity was developed by SuperSignal West Pico Chemiluminescent Substrate (ThermoFisher) and the signal captured using an ImageQuant™ LAS 4000 imager. Images were adjusted for brightness and cropped using ImageJ software.
Plasmids and site-directed mutagenesis
The pGL3 plasmid (Promega) and the Renilla-luciferase construct (SL-Luc) containing a C/P site modified by the addition of two MS2 stem-loop downstream sequences were previously described (33,36). The D50A mutant was created through site-directed mutagenesis. The RRM (amino acids 1–107) and D50A mutant RRM were cloned in bacterial expression vectors as fusions with a His-tag followed by a TEV site at the amino terminal end of the RRM, as previously described (22). All plasmids were verified by sequencing before use.
Bacterial protein expression and purification
Expression and purification of the proteins over metal affinity resin and His-tag removal were performed as described previously (22). For the NMR experiments, transformed Rosetta (DE3) pLysS cells were grown in 2× minimal M9 media using 15NH4Cl (1 g/l) and unlabeled or uniformly 13C-labeled d-glucose (3 g/l) as sole nitrogen and carbon sources, respectively (37). Induction of the transformed cells with 0.5 mM isopropyl-β-d-thiogalactopyranoside (IPTG), harvest and purification of the labeled proteins were also carried out as previously described (22).
Circular dichroism and stability assays
Circular dichroism experiments were performed on a JASCO J-815 instrument in 10 mM sodium phosphate, pH 7.25 with 10 μM of either wild type or D50A RRM protein. The spectra were scanned from 185 to 260 nm with 0.1 nm resolution. The average spectrum was obtained from three technical replicates.
Guanidine–HCl experiments were performed in 5 mM HEPES pH 7.4, 50 mM NaCl with 5 μM protein samples. Guanidine–HCl concentrations ranged from 0.5 to 3 M. Protein samples without guanidine–HCl were used as reference. Spectra were collected between 205 and 230 nm with 0.1 nm resolution. Values for 216 nm and 222 nm were plotted to represent the denaturation of the secondary structure of the proteins for β-sheets and α-helices, respectively. The average spectra were obtained from three technical replicates.
3′-End fluorescent RNA labeling and fluorescence polarization/anisotropy
SVL (5′-AUUUUAUGUUUCAGGU-3′) and (GU)8 (5′-GUGUGUGUGUGUGUGU-3′) RNAs were commercially synthesized (Sigma-Aldrich). 3′-end fluorescent labeling of the RNAs with fluorescein-5-thiosemicarbazide (ThermoFisher) was done as previously reported (38). The labeled RNAs were used at a final concentration of 3.2 nM in polarization assays.
Fluorescence polarization experiments were performed in binding buffer (16 mM HEPES pH 7.4, 40 mM NaCl, 0.008% (vol/vol) IGEPAL CA630, 5 μg/ml heparin, and 8 μg/ml yeast tRNA). Purified wild type and D50A RRMs were diluted in 20 mM HEPES pH 7.4, 50 mM NaCl, 0.01% NaN3 and 0.0001% IGEPAL CA630 to 32 μM concentration and used for 2-fold serial dilutions. The highest final concentration of the protein in the polarization assay was 16 μM. Fluorescence polarization samples were equilibrated for 2 h at room temperature in 96-well black plates (Greiner Bio) and measured on Infinite M1000 PRO instrument (Tecan Inc) with excitation set at 470 nm (5 nm bandwidth) and emission 520 nm (5 nm bandwidth). At least three technical replicates were performed. The apparent dissociation constants were calculated by fitting the data to a modified version of the Hill equation (39) using GraphPad Prism version 5.2 software for Windows (GraphPad Software). Unpaired t-test was performed using Microsoft Excel (Microsoft Corp).
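The titration design described above can be sketched as follows. This is an illustration only: the exact "modified version of the Hill equation" (39) is not given in the text, so the sketch assumes a generic Hill binding model, and the function names, the number of dilution points, and the polarization baselines (`p_free`, `p_bound`) are hypothetical placeholders rather than measured values.

```python
def serial_dilution(top_um, n_points, factor=2.0):
    """Protein concentrations (in uM) for a serial dilution starting at top_um."""
    return [top_um / factor ** i for i in range(n_points)]


def hill_polarization(protein_um, kd_um, p_free, p_bound, hill_n=1.0):
    """Generic Hill model (assumed form): polarization vs. protein concentration.

    p_free  : polarization of unbound labeled RNA (hypothetical baseline)
    p_bound : polarization of fully bound labeled RNA (hypothetical plateau)
    """
    bound_fraction = protein_um ** hill_n / (kd_um ** hill_n + protein_um ** hill_n)
    return p_free + (p_bound - p_free) * bound_fraction


if __name__ == "__main__":
    # 16 uM is the highest final protein concentration stated in the text;
    # twelve points is an arbitrary choice for this sketch.
    concentrations = serial_dilution(16.0, 12)
    for conc in concentrations:
        signal = hill_polarization(conc, kd_um=1.45, p_free=50.0, p_bound=150.0)
        print(f"{conc:10.4f} uM -> {signal:7.2f} mP")
```

With a Hill coefficient of 1, the signal is halfway between the two baselines when the protein concentration equals the apparent Kd, which is the quantity a fit to real data would estimate.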
Isothermal titration calorimetry (ITC)
The wild type and D50A mutant RRM proteins and the SVL RNA were dialyzed overnight into 10 mM sodium phosphate, 1 mM tris(2-carboxyethyl)phosphine (TCEP), 0.05% (v/v) sodium azide, pH 6 using a 2 kDa MWCO dialysis unit (ThermoFisher). Heats of binding were measured using a MicroCal iTC200 calorimeter (GE Healthcare) with a stirring rate of 1000 rpm at 27°C. For all titrations, isotherms were corrected by subtracting the heats of RNA dilution. The concentrations of proteins and RNA were calculated by absorbance spectroscopy with extinction coefficients of ϵ280 = 5960 M−1 cm−1 and ϵ260 = 165.7 M−1 cm−1, respectively. Data were analyzed by Origin 7, and all measurements were performed at least three times.
NMR techniques
NMR experiments were recorded in 10 mM phosphate buffer pH 6.0 with 1 mM TCEP, 0.05% (w/v) sodium azide, 0.1 mg/ml 4-(2-aminoethyl)benzenesulfonyl fluoride (AEBSF), and 10% D2O using an Agilent 600 MHz (14.1 T) DD2 NMR spectrometer equipped with a room temperature HCN z-axis gradient probe. Data were processed with NMRPipe/NMRDraw (40) and analyzed with CCPN Analysis (41). Backbone 13Cα, 13Cβ, 13C′, 15N and 1HN resonance assignments of RRM wild type and D50A mutant, both in the apo and SVL-bound states were obtained from standard gradient-selected, triple-resonance HNCACB, HN(CO)CACB, HNCO, HN(CA)CO (42) and CHH-TOCSY (43) at 27°C. Assignment data were collected with a random nonuniform sampling scheme and processed by Sparse Multidimensional Iterative Lineshape-Enhanced (SMILE) algorithm (44). Amide group residual dipolar couplings (RDCs; 1DNH) were measured from 2D 15N, 1H In-Phase/Anti-Phase (IPAP) experiments recorded on wild type and D50A mutant protein samples in the absence (1JNH) and presence (1JNH + 1DNH) of filamentous Pf1 bacteriophage (45) and bicelles alignment media (46). For samples in Pf1 bacteriophage, a concentrated stock of protein was mixed with 50 mg/ml bacteriophage stock solutions (Asla) to a final concentration of 15 mg/ml. The bicelle samples contained 5% (w/v) total DLPC and CHAPSO with a molar ratio of 4.2:1 in a 50 mM phosphate buffer, pH 6.8, 50 mM KCl, 1 mM TCEP, 0.05% (w/v) sodium azide, 0.1 mg/ml AEBSF and 10% D2O. Bicelles were prepared by dissolving DLPC and CHAPSO in phosphate buffer and vortexing for 1 min. Lipids were temperature cycled three times (30 min at 4°C followed by 30 min at 40°C), then diluted 1:1 with wild type or D50A RRM. Samples were equilibrated at 27°C for 1 h before recording the NMR spectra. The DLPC/CHAPSO bicelle samples were stable at 27°C for 3 days.
Wild type (BMRB id: 30652) and D50A (BMRB id: 30653) RRM backbone assignments and amide RDC data from Pf1 bacteriophage were submitted to the CS-ROSETTA webserver (https://csrosetta.bmrb.wisc.edu/csrosetta/submit) for structure calculation (47). The 10 lowest energy structures from CS-ROSETTA were fitted to the RDC data collected in DLPC/CHAPSO bicelles with the PALES program (48). The best wild type and D50A mutant structures are available at protein data bank (PDB codes: 6Q2I for wild type and 6TZE for D50A). The surface electrostatic potential for each structure was calculated with the Adaptive Poisson-Boltzmann Solver (APBS), and figures were generated with PyMol (49).
RNA titration experiments were performed by adding unlabeled SVL RNA to the 15N-labeled proteins and monitoring the change in amide chemical shifts in 2D 15N,1H HSQC spectra until complete saturation. Binding affinities were calculated from spectra using a two-state ligand binding model in the TITAN program (50). RNA binding on- and off-rates were calculated for both wild type and D50A RRM from amide peaks in the fast and intermediate exchange regimes by TITAN. Backbone 15N R1 (longitudinal spin–lattice relaxation rate), R2 (transverse spin–spin relaxation rate) and heteronuclear {1H}–15N NOE relaxation data (51,52) for RRM wild type and D50A mutant in the absence and presence of SVL RNA were acquired at 600 MHz and 27°C. The 15N R1, R2 and NOE values were used to calculate the backbone order parameters (S2) and the global correlation time (τc) from the model-free approach (53,54) using ‘model 2’ in modelfree v4.2 (http://nysbc.org/departments/nmr/relaxation-and-dynamics-nysbc/software/modelfree/).
RNA-seq and 3′-seq
Total brain tissues were collected from three wild type and five mutant 50-day-old male animals from the same litter. Brains were immediately treated with TRIzol reagent (ThermoFisher). RNA isolation, Poly(A)+ selection, RNA-seq library preparation (NEBNext® Ultra™ RNA Library Prep Kit) and high-throughput sequencing (paired-end 150-nucleotide sequencing, PE150, 20M unstranded reads per sample), data filtering and quality control, mapping on the mouse reference genome (mm10) using the STAR aligner (55) and differential gene expression analysis (DESeq2) (56) were performed by Novogene Corporation. Over-Representation Enrichment Analysis of the differentially expressed genes was performed using WebGestalt (57) with settings: Mus musculus; Overrepresentation Enrichment Analysis; Geneontology; Biological Process. Reference Gene List was set on genome. Cellular Component and Molecular Function did not return any gene ontology terms with a false discovery rate (FDR) smaller than or equal to 0.05.
3′ End sequencing and library preparation were performed by Admera Health Inc. 3′-seq libraries were prepared using the QuantSeq 3′ mRNA-Seq Library Prep Kit REV for Illumina from Lexogen. Libraries were sequenced with a specific sequencing primer as per the manufacturer recommendations. At least 40M stranded reads were obtained per sample. The sequences were aligned using STAR aligner (55). Aligned reads were normalized and the genes changing the C/P sites were identified. Analysis of APA was performed by Admera Health following the protocol and using the latest polyA_DB database developed by the laboratory of Dr. Bin Tian (58). Polyadenylation sites within 24 nt were clustered together to reflect the heterogeneity of the cleavage and polyadenylation process (59). Over-Representation Enrichment Analysis of the genes changing C/P sites was performed using WebGestalt (57) with settings: Mus musculus; Overrepresentation Enrichment Analysis; Geneontology; Biological Process. Reference Gene List was set on genome; the top 10 gene ontology terms with a false discovery rate (FDR) smaller than or equal to 0.05 were reported.
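The 24-nt clustering step can be illustrated with a minimal sketch. The actual procedure used by the service provider is not detailed in the text; the sketch below assumes a simple chained rule in which a site joins the current cluster when it lies within 24 nt of the previous site on the same strand of the same gene. The function name and the example coordinates are hypothetical.

```python
def cluster_sites(positions, max_gap=24):
    """Group cleavage site coordinates into clusters.

    Assumed rule (not taken from the text): after sorting, a site is added to
    the current cluster if it is within `max_gap` nt of the previous site.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


if __name__ == "__main__":
    # Hypothetical 3' end coordinates along one transcript.
    example = [100, 110, 130, 200, 260, 270]
    for cluster in cluster_sites(example):
        print(cluster)
```

A chained rule lets a cluster span more than 24 nt overall when intermediate sites bridge the gap; anchoring each cluster to a fixed window is an alternative design that would give different groupings for densely spaced sites.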
RESULTS
Males in Family P167 presented with mild intellectual disability and a mutation in the RNA recognition motif in CSTF2
Family P167 is of Algerian origin currently living in France. The family has four affected male individuals (II:1, II:2, III:1, and III:2, Figure 1) in two different generations connected through the maternal germline. This pattern of inheritance suggested an X-linked trait carried by females I:2 and II:4. The index proband, a 6-year-old boy (III:1), was presented to the Division of Pediatrics at Bicêtre Hospital (Kremlin-Bicêtre, France) with developmental delays. There was no parental consanguinity. He has one affected brother (III:2) and two affected maternal uncles (II:1 and II:2). All affected males were born after uneventful pregnancies and delivery. The index proband (III:1) was born at term with normal weight (3,760 g), height (50 cm) and occipitofrontal circumference (34 cm). Motor milestones, such as walking, were reached within normal limits. Developmental delay was recognized in early childhood during the first years at school and speech development was delayed. Differential scales of intellectual efficiency (Échelles Différentielles d’Efficience Intellectuelle, EDEI (60)) showed low verbal and nonverbal communication skills, with no other consistent clinical phenotype. He attended a school for children with additional speech therapy and special assistance. Brain MRI showed no specific abnormalities except for an abnormal hypersignal at the posterior thalamic nuclei due to an episode of intracranial hypertension at the age of 9.5 years (data not shown). His two uncles (aged 44 and 46 years, II:1 and II:2, Figure 1A) attended a special school for children with learning difficulties. They are both married and employed; one uncle has a 15-year-old unaffected girl, and the other a 7-year-old unaffected girl and a 5-year-old unaffected boy.
Initial karyotype analysis, testing for Fragile X, and a mutation search in a few selected XLID genes, including SLC6A8, NLGN4 and JARID1C, gave normal results. X chromosome exome sequencing of the index proband (III:1) as part of a large study of >500 unrelated males from families with likely XLID (4) identified a 3 bp deletion in STARD8, and single nucleotide variants in ALG13, COL4A5 and CSTF2. All variants co-segregated with the phenotype: they were present in all affected males and were transmitted through the maternal germline (Figure 1A and data not shown). The single amino acid deletion identified in STARD8 is present in 180 control males in the Genome Aggregation Database (gnomAD, non-neuro control individuals, https://gnomad.broadinstitute.org/) (61,62), and was therefore interpreted as a benign polymorphism. ALG13 is an established epilepsy-associated gene in females (63), but the ALG13 missense mutation in Family P167 (chrX:110951509G>A, NCBI RefSeq NM_001099922.3:c.638G>A, p.S213N, rs374748006) has been reported in four control males (gnomAD, non-neuro) and was predicted as a polymorphism or benign variant by MutationTaster2 (64) and the ClinVar database (65). Mutations in COL4A5 have been associated with X-linked Alport syndrome, also known as hereditary nephritis. The COL4A5 single nucleotide mutation (chrX:107846262C>A, NM_000495.4:c.2215C>A, NP_203699.1:p.P739H) has not been reported in any publicly available database. However, substitution of the same proline residue by an alanine (p.P739A) is most likely a benign polymorphism (44 hemizygous males in gnomAD).
In contrast, the CSTF2 missense mutation (exon 3, chrX:100077251A>C (GRCh38.p12), RefSeq NM_001306206.1:c.149A>C) was identified as novel and was not reported in >14 000 control individuals from publicly available databases including gnomAD. The mutation was predicted as disease causing or pathogenic by several prediction tools, including MutationTaster2 (disease causing), DANN (pathogenicity score of 0.9956, with a value of 1 given to the variants predicted to be the most damaging (66)) and a high CADD score of 26.9 (67). Furthermore, at the gene level, CSTF2 is highly constrained with zero known loss-of-function mutations and a Z-score of 3.43 (ratio of observed to expected = 0.37) for missense mutations in gnomAD. Conceptual translation of the cDNA demonstrated a mutation of aspartic acid (GAT) to alanine (GCT) within the RNA recognition motif (RRM) of CSTF2 (p.D50A, Figure 1B, D). The aspartic acid at position 50 is conserved in every CSTF2 ortholog surveyed (Figure 1C). We designated this allele CSTF2D50A.
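The correspondence between the coding-sequence change (c.149A>C) and the protein change (p.D50A) follows from simple codon arithmetic, sketched below. The helper names are hypothetical, and the codon table holds only the two codons named in the text (GAT and GCT) rather than the full genetic code.

```python
# Only the two codons mentioned in the text; not a complete genetic code table.
CODON_TO_AMINO_ACID = {"GAT": "D", "GCT": "A"}


def codon_coordinates(cds_position):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    codon_number = (cds_position - 1) // 3 + 1
    position_in_codon = (cds_position - 1) % 3 + 1
    return codon_number, position_in_codon


def apply_substitution(codon, position_in_codon, new_base):
    """Return the codon with the base at the 1-based position replaced."""
    index = position_in_codon - 1
    return codon[:index] + new_base + codon[index + 1:]


if __name__ == "__main__":
    codon_number, position = codon_coordinates(149)
    mutant_codon = apply_substitution("GAT", position, "C")
    print(codon_number, position, mutant_codon,
          CODON_TO_AMINO_ACID["GAT"], "->", CODON_TO_AMINO_ACID[mutant_codon])
```

Nucleotide 149 is the second base of codon 50, so an A to C change there turns GAT into GCT, consistent with the aspartic acid to alanine substitution reported above.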
Polyadenylation efficiency is reduced by the CSTF2D50A mutation
To determine the effectiveness of CSTF2D50A in control of C/P, we performed the Stem-Loop Assay for Polyadenylation (SLAP). SLAP permits determination of C/P efficiency of a luciferase reporter gene that contains two MS2 bacteriophage RNA stem-loops in the position of the U/GU-rich region downstream of the SV40 late polyadenylation signal; the stem-loops bind to an MS2 coat protein domain (MCP) engineered at the N-terminus of CSTF2 in transfected HeLa cells (33). Previously, using this assay we showed that mutations in the RRM of CSTF2 influenced RNA binding properties of the motif, thus altering reporter gene expression (22). Wild type MCP-CSTF2 produced a SLAP value of 4.49 ± 0.16 normalized luciferase units (NLU), whereas MCP-CSTF2D50A consistently achieved 15% lower polyadenylation efficiency (3.92 ± 0.13 NLU) than wild type MCP-CSTF2 (Figure 2A).
Figure 2.
CSTF2D50A is less efficient for C/P because it binds substrate RNA with a higher affinity. (A) SLAP results showing normalized luciferase units (NLU) in HeLa cells without MCP-CSTF2 (–), in cells transfected with MCP-CSTF2 or MCP-CSTF2D50A (black bars) and with CSTF3-Myc (white bars). Western blots to show the expression of FLAG-tagged MCP-CstF-64 (WB: FLAG) or Myc-tagged CSTF3-Myc (WB: Myc). β-tubulin was used as a loading control. Asterisk indicates a statistically significant P value equal to 0.0021, by Student's t-test with unequal distribution of variance. (B) Immunofluorescent images of the described constructs stained with antibodies against the FLAG tag for MCP-CSTF2 (green), Myc tag for CSTF3 (red), and counterstained with DAPI to delineate the nucleus. (C–F) The Kd for RNA binding was determined for the isolated RRM domain of wild type CSTF2 and CSTF2D50A mutant via the change in fluorescence polarization of a 3′-end labeled RNA substrate (C, D) and isothermal titration calorimetry (ITC; E, F). Changes in fluorescence polarization for wild type and D50A RRMs binding to SVL (C) and (GU)8 (D) substrate RNAs. ITC thermograms of wild type (E) and D50A (F) RRM binding to SVL RNA. The Kd and the number of binding sites (N) are indicated on the figures along with the corresponding standard deviation from three replicates. Raw injection heats are shown in the upper panels and the corresponding integrated heat changes are shown in the bottom panels versus the molar ratio of RNA to protein. Kds and thermodynamics, derived from the fits of the ITC data, are provided in Supplementary Table S1.
Mutations in the RNA binding site I and II of CSTF2 affect the synergetic function of CSTF2 and CSTF3 (22). Therefore, we co-expressed CSTF3 with either MCP-CSTF2 or MCP-CSTF2D50A to determine whether the D50A mutation affects cleavage and polyadenylation in the presence of CSTF3. CSTF3 increased SLAP for both wild type and MCP-CSTF2D50A to a similar extent, effectively eliminating the reduction observed in the MCP-CSTF2D50A construct expressed alone. Mutations in the CSTF2 RRM altered nuclear-to-cytoplasmic localization (22). Therefore, we hypothesized that the CSTF2D50A mutation might also affect the ratio between nuclear and cytoplasmic CSTF2. Immunohistochemical staining revealed that CSTF2D50A was localized more in the cytoplasm than wild type CSTF2 (Figure 2B). Co-expression of CSTF3 with CSTF2D50A increased the nuclear localization of CSTF2D50A similar to wild type CSTF2 protein (Figure 2B and (22)).
The CSTF2D50A RRM has a greater affinity for RNA
Because the aspartic acid (D) to alanine (A) mutation changes the charge in the loop connecting the β2 and β3-strands (Figures 3 and 4, (22,68)), we wanted to determine whether RNA binding was altered in the CSTF2D50A RRM mutant. To measure binding via fluorescence polarization/anisotropy, bacterially expressed CSTF2 and CSTF2D50A RRMs (amino acids 1–107 (22)) were incubated with fluorescently-labeled RNA oligonucleotides from either the SV40 late transcription unit (SVL (69)) or (GU)8 (68). The wild type RRM bound to the SVL and (GU)8 RNAs with Kd, app of 1.45 ± 0.07 μM and 0.26 ± 0.02 μM, respectively (Figure 2C, D and Supplementary Table S1). The CSTF2D50A mutant RRM bound the two RNAs with significantly higher affinities (Kd, app 0.93 ± 0.13 μM and 0.17 ± 0.01 μM for the SVL and (GU)8 RNA, respectively, Figure 2C, D, Supplementary Table S1).
Figure 4.
The altered side chain interactions of CSTF2D50A lead to different local electrostatic surface potentials. Panels A–C show the different side chain orientations and interactions present in the wild type CSTF2 RRM (green) and mutant CSTF2D50A RRM (red). (D) Calculated electrostatic potential for wild type (left) and D50A mutant (right) RRM. The red-to-blue surface representation highlights negative-to-positive electrostatic potential from –2 to 2 kbT/e, where kb is Boltzmann's constant, T is the temperature in Kelvin and e is the charge of an electron.
To confirm the RNA-binding affinities, we performed isothermal titration calorimetry (ITC) titrations using the SVL RNA oligonucleotide. The average Kd was 1.52 ± 0.17 μM for wild type RRM and 0.698 ± 0.06 μM for the CSTF2D50A RRM (Figure 2E and F, Supplementary Table S1), which were in the same range as the Kds measured before. In both cases, binding was driven by a favorable enthalpy overcoming an unfavorable entropy. However, the enthalpy and entropy for mutant and wild type binding to SVL RNA were different (Figure 2F, Supplementary Table S1). Thus, the enthalpy-entropy compensation for the D50A mutant binding to SVL RNA was greater than that of the wild type RRM, leading to the higher observed affinity, suggesting that CSTF2D50A would bind to RNAs with a greater affinity during C/P.
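To put the ITC dissociation constants on an energy scale, the following sketch converts them to binding free energies. It assumes the standard thermodynamic relation ΔG° = RT ln(Kd) with a 1 M standard state, which is textbook material and not a formula taken from the text; the function names are hypothetical, and the values it produces are illustrative and may differ from the fitted parameters in Supplementary Table S1.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ mol^-1 K^-1
TEMPERATURE_K = 27.0 + 273.15     # ITC measurements were made at 27 degrees C


def fold_change(kd_wild_type, kd_mutant):
    """Ratio of dissociation constants; values above 1 mean tighter mutant binding."""
    return kd_wild_type / kd_mutant


def binding_free_energy(kd_molar, temperature_k=TEMPERATURE_K):
    """Assumed relation: delta G = RT ln(Kd), with Kd relative to a 1 M standard state."""
    return GAS_CONSTANT_KJ * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    kd_wt = 1.52e-6    # M, wild type RRM with SVL RNA (ITC)
    kd_mut = 0.698e-6  # M, D50A RRM with SVL RNA (ITC)
    ddg = binding_free_energy(kd_mut) - binding_free_energy(kd_wt)
    print(f"fold change in affinity: {fold_change(kd_wt, kd_mut):.2f}")
    print(f"delta-delta G (mutant minus wild type): {ddg:.2f} kJ/mol")
```

Under these assumptions, a roughly 2-fold difference in Kd corresponds to a free energy difference of about 2 kJ/mol, a modest change compared with the total binding free energy.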
The D50A mutation affects the side chain interactions and electrostatic surface of the RRM
To characterize the structural and dynamic changes that occur in the CSTF2D50A RRM upon binding to RNA, we turned to solution state nuclear magnetic resonance (NMR) spectroscopy, initially characterizing the general backbone conformation via 2D 15N,1H heteronuclear single quantum coherence (HSQC) spectra for the wild type and mutant RRMs. Predictably, the overlay of the HSQC spectra showed differences in the peak positions of the residues in the loop where the D50A mutation is located (Figure 3A and C, green bars). Otherwise, the majority of the residues in the CSTF2D50A RRM showed almost identical NMR spectra as the wild type.
Figure 3.
The p.D50A mutation perturbs the environment of the β-sheet and C-terminal α-helix. The 2D 15N,1H HSQC of apo (A) RRM WT (green) and RRM D50A (red) and RNA-bound (B) RRM WT•SVL (black) and RRM D50A•SVL (yellow) complexes. All structures were determined from amino acids 1–107 of human CSTF2 or CSTF2D50A. Data were collected at 600 MHz and 27°C. (C) Bar graph indicating backbone amide chemical shift perturbations (CSP) for the RNA-free (Apo) WT and D50A RRM (green) and SVL RNA-bound WT and D50A mutant RRM (red). The CSPs are calculated from the HSQC spectra from panels A and B. (D) The tertiary folds of the wild type CSTF2 and CSTF2D50A RRM domain are similar. Superimposition of the lowest energy solution structure of the WT and D50A RRM in the apo form, after PALES re-scoring against an independent set of RDC data, showing a view of site I (left panel) and the D50A loop (right panel). β-strands and α-helices are labelled on the structures; the black arrows indicate the position of residue D50.
Next, we determined the three-dimensional structures of the CSTF2 and CSTF2D50A RRMs using CS-Rosetta (47,70). The CSTF2D50A RRM was almost identical to the CSTF2 RRM structure (Figure 3D; 3.081 Å all atom root-mean-square deviation, RMSD), consistent with circular dichroism data (Supplementary Figure S2A). The major difference between the RRMs was observed in the α4-helix, which angled away to make room for the repositioned side chains of the β4-strand to interact with the α4-helix. A small difference was also noted in the relative twist of the β-sheet, which includes RNA binding sites-II and -III.
These differences in secondary structural elements resulted from repositioning of the side chains for the residues in the loop surrounding the D50A mutation and in the amino acids involved in site-I (Tyr25 and Arg85) and -II (Phe19 and Asn91; Figures 1C and 4A–C), which are two of the three sites identified as important for RNA interactions in Rna15, the yeast homolog of CSTF2 (71). Specifically, in wild type CSTF2, the carboxylic acid side chain of Asp50 formed a hydrogen bond with the hydroxyl side chain of Thr53 (Figure 4A, left). Replacing the hydrogen bond acceptor with a methyl group disrupted this interaction in CSTF2D50A (Figure 4A, right). In addition, the Arg51 side chain formed a new ion pair interaction with the side chain carbonyl group of Glu52, instead of the backbone carbonyl of the terminal helix α4 residue Ser103, leading to re-orientation of helix α4 (Figures 3D and 4A). In the CSTF2D50A RRM, the β1-strand had a different twist, which allows the aromatic ring of Phe19 (site-II, Figure 1C) to lift up towards the RNA binding pocket (Figure 4B), contributing to the reorientation of helix α4 in the mutant as the edge of the Phe19 aromatic ring now packs against the aliphatic portions of Asn97, Glu100, and Leu104 of helix α4. In binding site-I of the wild type RRM, the Tyr25 aromatic ring was adjacent to the amino side chain on Lys55, forming a π–cation interaction (Figure 4C). However, in CSTF2D50A, the Tyr25 ring moved away, allowing Glu26 to form a hydrogen bond with Arg85 (site-I residue), increasing the rigidity in binding site-I (Figure 4C, right). Thus, local differences in side chain arrangement, particularly in binding sites-I and -II, contribute to the differences we observed in RNA binding.
We calculated the electrostatic potentials using the Adaptive Poisson–Boltzmann Solver (72). Changing the negatively charged Asp50 to the non-charged Ala altered the charge distribution of the loop from negative to positive in the mutant (Figure 4D). Binding site-I (Site I, Figure 4D), which forms upon RNA binding (71) in the wild type, was within a cavity with low positive electrostatic potential for RNA binding (Figure 4D, left). However, in the D50A mutant, the hydrogen bonding between Arg85 and Glu26 (Figure 4C and D) moved the positively charged side chain of Arg85 toward the inside of the cavity, resulting in a higher positive electrostatic potential distribution in binding site-I (Figure 4D, right). We hypothesize that this larger positive electrostatic potential could be the driving force for the more favorable enthalpy of binding and the tighter RNA binding affinity.
The D50A mutation affects the structure and dynamics of the RNA-bound RRM
To test effects of RNA binding, the CSTF2 and CSTF2D50A RRMs were titrated with SVL RNA to saturation (molar ratio of 2.3:1, RNA to protein) with changes monitored in 2D 15N,1H HSQC spectra (Figures 3B and 5A). Both proteins showed the same binding patterns (i.e. direction, magnitude and exchange regime) for most residues (e.g. Val18, Val20, Ala27, Asp90 and Leu47; Figure 5A). Gly54 located within the D50 loop was perturbed upon addition of RNA to the CSTF2D50A RRM but not the wild type RRM (Figure 5A, left), indicating a potentially new interaction in the mutated domain. Next, we calculated amide chemical shift perturbations (CSPs) between SVL-bound CSTF2 and CSTF2D50A RRMs (Figure 3C, red bars). The largest differences were observed in the loop containing the mutation, similar to unbound RRMs (Figure 3C, green bars), and other significant amide CSPs (i.e. above the mean, 0.032 ppm for apo and 0.048 ppm for RNA-bound) were observed in the RNA-bound domain for the N-terminus (residues 5–7) near binding site-I (residues 25 and 85), site-II (residues 19 and 91; Figures 1C and 3C), and in the β2-strand preceding the mutation (Figure 3B and C, red bars).
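The amide CSPs and the 'above the mean' cut-off described above can be computed along the following lines. This is a minimal sketch and not necessarily the procedure used here: the combined-shift formula with a 15N scaling factor of 0.14 is a common convention that is assumed, the function names are hypothetical, and the shift values are invented for illustration.

```python
import math

N_SCALE = 0.14  # assumed 15N weighting; conventions vary between studies

def combined_csp(d_h_ppm, d_n_ppm, n_scale=N_SCALE):
    """Combined 1H/15N chemical shift perturbation in ppm."""
    return math.sqrt(d_h_ppm ** 2 + (n_scale * d_n_ppm) ** 2)

def residues_above_mean(shift_differences):
    """Return the mean CSP and the residues whose CSP exceeds it.

    shift_differences maps a residue label to (delta 1H, delta 15N) in ppm.
    """
    csps = {res: combined_csp(dh, dn) for res, (dh, dn) in shift_differences.items()}
    mean_csp = sum(csps.values()) / len(csps)
    flagged = sorted(res for res, value in csps.items() if value > mean_csp)
    return mean_csp, flagged

# Hypothetical shift differences (ppm) between two HSQC spectra
example = {
    "G54": (0.12, 0.80),
    "V18": (0.01, 0.05),
    "A27": (0.02, 0.10),
    "R51": (0.09, 0.50),
}
mean_csp, flagged = residues_above_mean(example)
print(f"mean CSP = {mean_csp:.3f} ppm; above mean: {flagged}")
```

Using the mean as the threshold keeps the cut-off relative to each data set, which is why the apo and RNA-bound comparisons have different values (0.032 and 0.048 ppm).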
Figure 5.
The p.D50A mutation affects the RNA-binding kinetics and dynamics of the RRM. (A) Overlays of the 2D 15N,1H HSQC spectra for the titration of CSTF2 (top panels) and CSTF2D50A (bottom panels) RRMs. Red-yellow-blue and red–green–blue gradients represent the 0–100% titration of wild type and D50A RRMs with SVL RNA, respectively. Titration data were acquired at 600 MHz and 27°C. All spectra are shown with the same contour base level relative to noise. The magnitude and direction of each titrated residue is shown with arrows and assignment. (B and C) Backbone amide 15N order parameter (S2) for (B) apo RRM WT (green) and RRM D50A (red) and (C) SVL RNA-bound RRM WT•SVL (black), and D50A•SVL (yellow). The pink lines at an S2 value of 1.0 denote the maximum value for the order parameter.
The on- and off-rates of RNA binding were determined with a simple two state ligand binding model using the 2D NMR line shape analysis program (TITAN, Supplementary Figure S3 (50)). From the RNA-induced perturbations that were in the fast and intermediate exchange regimes, we calculated a Kd of 0.70 ± 0.02 μM for wild type and 0.54 ± 0.02 μM for CSTF2D50A, which followed the trend established by fluorescence polarization and ITC (Supplementary Table S1). The off-rates (koff) were similar for wild type (305.5 ± 1.4 s−1) and CSTF2D50A (312.4 ± 2 s−1) RRMs. Both on-rates were the same as or exceeded the rate of diffusion, a feature seen in other protein-nucleic acid complexes where electrostatic attraction accelerates the on-rate (73,74). However, the on-rate (kon) for CSTF2D50A (5.83 ± 0.21 × 108 M−1 s−1) was faster than the rate for wild type (4.35 ± 0.12 × 108 M−1 s−1). We conclude, therefore, that the lower Kd for CSTF2D50A was primarily due to the faster on-rate for RNA binding to CSTF2D50A compared to the wild type.
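For a two-state binding model, the dissociation constant follows directly from the fitted rates as Kd = koff/kon. The short check below applies that relation to the rates quoted above; the function and variable names are hypothetical, and the calculation ignores the reported uncertainties.

```python
def kd_from_rates(k_off_per_s, k_on_per_molar_s):
    """Dissociation constant (M) for two-state binding: Kd = koff / kon."""
    return k_off_per_s / k_on_per_molar_s

# Rates quoted in the text (TITAN line shape analysis)
rates = {
    "wild type": {"k_off": 305.5, "k_on": 4.35e8},
    "CSTF2D50A": {"k_off": 312.4, "k_on": 5.83e8},
}

# Convert each Kd from M to micromolar for comparison with the text
kd_um = {name: kd_from_rates(r["k_off"], r["k_on"]) * 1e6 for name, r in rates.items()}
for name, value in kd_um.items():
    print(f"{name}: Kd = {value:.2f} uM")
```

The ratios recover the reported values of 0.70 and 0.54 μM. Because the off-rates differ by only about 2% while the on-rate is about 1.3-fold faster for the mutant, nearly all of the difference in Kd comes from kon.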
We measured 15N R1, R2, and nuclear Overhauser effect (NOE) relaxation values at 600 MHz and 27°C to derive backbone model-free order parameters (S2), which report on the amplitude of fast timescale motion and the global correlation time (τc) (53,54). In the absence of RNA, the S2 values for the CSTF2D50A RRM domain differed from wild type only in the D50A loop, which became more rigid (Figure 5B and Supplementary Figure S4A). However, for the RNA-bound structures, the S2 values indicated different amide group flexibilities in the mutant RRM (Figure 5C and Supplementary Figure S4B). SVL RNA binding to the CSTF2 RRM reduced S2 values throughout the β-sheet (e.g. residues 18–22 and 44–48), indicating increased flexibility upon RNA binding, in agreement with earlier studies (68,75). However, SVL RNA binding to the CSTF2D50A RRM increased S2 values for the β-sheet and the α3-helix adjacent to site-II (Phe19 and Asn91, Figure 5C). This indicated greater rigidity in the pico-to-nanosecond time scale in the RNA binding surface of the D50A mutant when bound to RNA. The order parameter is a measure of conformational entropy within a protein (76,77). Indeed, the wild type RRM, which became more flexible upon RNA binding, pays less of an entropic penalty compared to the D50A mutant (Supplementary Table S1), which maintained the same flexibility upon RNA binding.
We also determined whether the point mutation in D50A changed the stability of the secondary structure of the motif using CD spectroscopy (Supplementary Figure S2B and C). Both RRMs were thermally stable with minimal changes in ellipticity at 100°C (data not shown). However, upon chemical denaturation with guanidine, we observed that the inflection point for the wild type protein was reached at 2 M guanidine and decreased to 1.5 M for the CSTF2D50A RRM mutant. This result suggested that the secondary structure of the mutant was less stable than that of the wild type.
The D50A mutation causes differential gene expression in mice
To examine effects of the D50A mutation in vivo, we developed Cstf2D50A mice using CRISPR-Cas9 technology. Hemizygous male (Cstf2D50A/Y) and homozygous female (Cstf2D50A/D50A) mice are viable and fertile. Male Cstf2D50A/Y mice appear to be runted. To determine CSTF2 protein expression levels in Cstf2D50A/Y mice, we isolated total protein from brains and livers of 50-day old wild type and Cstf2D50A/Y littermates. Cstf2D50A/Y mice expressed slightly higher amounts of CSTF2 in both total brain and liver (Figure 6A). We also noted strong expression of the βCstF-64 splice variant (78,79) in brains of both wild type and Cstf2D50A/Y mice (Figure 6A, lanes 1–4, upper band). RNA-seq (see below) did not indicate differential gene expression of the Cstf2 mRNA in the brains of Cstf2D50A/Y mice (Supplementary Table S2), suggesting that protein differences were translational or post-translational.
Figure 6.
Gene expression and 3′ APA analyses in brains of Cstf2D50A/Y mice. (A) Protein immunoblots of CSTF2/CstF-64 (top), CSTF3/CstF-77 (middle) and β-tubulin (bottom) from two wild type (lanes 1, 2, 5, 6) and two Cstf2D50A/Y (3, 4, 7, 8) male mice from indicated tissues (total brain and liver). Bands corresponding to CstF-64, βCstF-64, CstF-77, and β-tubulin are indicated at right. Genotypes of the wild type and Cstf2D50A/Y mice were verified by PCR-RFLP as described in the Materials and Methods. (B) Differential gene expression analysis of RNA-seq from total brain samples isolated from 50-day old male wild type and Cstf2D50A/Y mice (three wild type and five mutants). Left, volcano plot of the differentially expressed genes. Right, list of genes that are differentially expressed, fold change (log2) and adjusted P value (–log10) are indicated. (C) Gene ontology for biological functions of differentially expressed genes in the brains of wild type and Cstf2D50A/Y mice. Number (n) on the right indicates the number of genes found in each functional category. (D) Schematic illustration of APA events analyzed. (Top) Genes in which APA changes the length of the last (3′-most) exon. (Bottom) Genes in which APA in the coding region (CDS) changes the protein coding potential of the gene. Proximal polyadenylation site (PAS), distal PAS, start and stop codons in translation, and regions containing cis-acting RNA elements (e.g. miR and RNA-binding protein binding sites) are indicated. (E) Scatter plot showing expression change of proximal PAS isoform (x-axis) and that of the distal PAS isoform (y-axis) in total RNA. Genes with significantly shortened or lengthened 3′ ends (P < 0.05, Fisher's exact test) in Cstf2D50A/Y total mouse brains (three wild type and five mutant) are highlighted in red and blue, respectively. Color coded numbers indicate the number of genes shortened, lengthened, or showing no change. (F) C/P in the last exon of the Rbm24 gene in wild type (green) or Cstf2D50A/Y (rust) mouse brains switches from the distal to the proximal poly(A) site. Proximal and distal PASs are indicated with the maximum RPM values shown. (G) Top ten gene ontology categories (FDR < 0.05) of enriched biological functions in Cstf2D50A/Y mice showing lengthening of their 3′-most exons. The number of genes (n) detected in each functional category is shown on the right. (H) Scatter plot showing expression change of proximal PAS isoform (x-axis) and that of distal PAS isoform (y-axis) in total RNA. Genes with significantly shortened or lengthened CDSs (P < 0.05, Fisher's exact test) in Cstf2D50A/Y or wild type mouse brains are highlighted in red and blue, respectively. Color coded numbers indicate the number of genes shortened, lengthened, or showing no change. (I) C/P in the Copg1 gene shortens the CDS in Cstf2D50A/Y mouse brains (rust) compared to wild type (green). Maximum RPM values are shown for each animal. Gene structure and different transcripts are shown at the bottom. (J) Top ten gene ontology categories (FDR < 0.05) of enriched biological functions in Cstf2D50A/Y brains showing shortened transcripts in their CDS. The number of genes (n) detected in each functional category is shown on the right.
To further characterize the changes in the brain gene expression, we isolated whole brain RNA from 50-day old male wild type and Cstf2D50A/Y littermates and performed RNA-seq. Only fourteen genes showed significant differential mRNA expression between wild type and Cstf2D50A/Y male mice; eleven were up-regulated and three were down-regulated (Figure 6B and Supplementary Table S2). Functional gene enrichment analysis for biological processes (57) indicated that up-regulated genes were involved in several brain developmental pathways including hindbrain development, neural tube patterning, head and brain development (Figure 6C), demonstrating that the D50A mutation in Cstf2 influenced overall expression of key genes in mice. These key genes included Cbln1 (Figure 6B), which is essential for the formation and maintenance of synaptic integrity and plasticity between granule cells and the Purkinje cells (80).
The D50A mutation affects alternative polyadenylation (APA) of over one thousand genes in mice
To examine specific changes in C/P in the Cstf2D50A/Y mice, we performed genome-wide 3′ end sequencing (81,82) on total RNA isolated from brains from 50-day old male wild type and Cstf2D50A/Y mice. To simplify the analysis, we focused on (i) genes in which APA changed the length of the last (3′-most) exon which usually contains the 3′ UTR and (ii) genes in which APA caused changes in coding regions (CDS, Figure 6D). The site that was farther from the transcription start site of the gene was designated the distal polyadenylation site (dPAS) and the nearer one was designated the proximal polyadenylation site (pPAS) for both types of APA changes. Based on our biochemical and structural analyses showing a higher affinity for RNA associated with the CSTF2D50A RRM and slightly higher protein expression, we hypothesized that on average more genes would be shortened and pPAS would be favored.
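The Figure 6 legend reports Fisher's exact test (P < 0.05) for calling a significant shift between pPAS and dPAS usage. A minimal sketch of such a test for a single gene is given below, assuming a 2 × 2 table of read counts (genotype by poly(A) site). How reads were pooled across animals is not described in this section, so the table layout, the function name and the counts are all hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Small relative tolerance guards against floating-point ties
    p_value = sum(p for p in probs if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)

# Hypothetical read counts for one gene (not data from the study):
# rows = genotype, columns = reads at the pPAS and at the dPAS
wt_ppas, wt_dpas = 40, 160
mut_ppas, mut_dpas = 90, 110

p_value = fisher_exact_two_sided(wt_ppas, wt_dpas, mut_ppas, mut_dpas)
mutant_favors_ppas = mut_ppas / (mut_ppas + mut_dpas) > wt_ppas / (wt_ppas + wt_dpas)
print(f"P = {p_value:.3g}; mutant favors pPAS: {mutant_favors_ppas}")
```

A gene would be scored as shortened when the test is significant and the mutant shifts reads toward the pPAS, and as lengthened when the shift is toward the dPAS.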
In total, 1370 genes changed polyadenylation sites in Cstf2D50A/Y mouse brains. Indeed, the number of the genes showing shortening of the last exon in the D50A mice (i.e. favoring the pPAS) was 1.97-fold greater than those showing the opposite pattern (571 versus 290, Figure 6E and Supplementary Table S3). An example of last exon shortening is the RNA Binding Motif Protein 24 (Rbm24) gene (Figure 6F). Gene ontology (GO) analysis of the genes shortening the last exon showed localization of macromolecules in cells, retrograde transport, cell migration, and more (Supplementary Table S4). GO terms of the genes lengthening the last exon indicated enrichment in neurogenesis, neuron development, and more (Figure 6G).
The number of genes in which APA caused changes in the CDS favoring the pPAS in Cstf2D50A/Y animals was 3.5-fold greater than the number of genes showing the opposite pattern (396 versus 113, Figure 6H and Supplementary Table S5). For example, the mRNA for the shorter version of the Coatomer Protein Complex Subunit Gamma 1 (Copg1) gene was increased in Cstf2D50A/Y mouse brains (Figure 6I). This isoform was previously reported to be enriched in granule cells of the brain (83). GO term analysis for the 113 genes lengthening their CDSs did not reveal ontologies that were enriched (FDR < 0.05, Supplementary Table S3). However, GO terms for genes with shorter CDSs were enriched for neurogenesis, neurodevelopment and cell adhesion (Figure 6J). We propose that in the patients carrying the D50A mutation, the balance of the critical protein isoforms involved in brain development is disrupted, contributing to their cognitive disability.
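The fold differences and the overall gene count quoted above follow from the four category counts. The short check below reproduces that arithmetic; it assumes each affected gene is counted in exactly one of the four categories, and the variable names are hypothetical.

```python
# Gene counts quoted in the text (Figure 6E and 6H)
last_exon = {"shortened": 571, "lengthened": 290}
cds = {"shortened": 396, "lengthened": 113}

def shortening_ratio(counts):
    """Ratio of genes favoring the pPAS to genes favoring the dPAS."""
    return counts["shortened"] / counts["lengthened"]

# Assumes the four categories do not overlap
total_genes = sum(last_exon.values()) + sum(cds.values())

print(f"last exon ratio: {shortening_ratio(last_exon):.2f}")
print(f"CDS ratio:       {shortening_ratio(cds):.2f}")
print(f"total genes:     {total_genes}")
```

The ratios come out at 1.97 and 3.50, and the four categories sum to the 1370 genes reported as changing polyadenylation sites.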
We also evaluated the genes involved in the organization and long-term maintenance of the synapses. In addition to up-regulation of the post-synaptic gene Cbln1 (Figure 6B), our analysis revealed that both the 3′ UTR and the CDS of the mRNA encoding the pre-synaptic cell-surface receptor Nrxn1 are shorter in brains of Cstf2D50A/Y mice (Supplementary Figure S5). Similarly, the 3′ UTR of Cntnap2 is longer in mutant mice (Supplementary Figure S5). On the post-synaptic side, 3′ UTRs of several genes involved in regulation of protein synthesis, scaffolding and signal transmission (Shank2, Fmr1, Cacng4) were affected by APA (Supplementary Tables S3 and S4).
Finally, we examined hexamer frequencies within 150 nt 5′ and 3′ of the pPAS and dPAS of sites that are affected in Cstf2D50A/Y mice (Supplementary Figure S6). The hexamer CACACA was most enriched between 50 and 150 nt upstream of the pPAS, while a similar sequence, CAACCA was enriched upstream of dPAS. Downstream sequences tended to be U-rich (pPAS) or A-rich (dPAS).
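Counting hexamers in a window positioned relative to a poly(A) site can be sketched as follows. This illustrates the counting step only and is not the analysis pipeline used here; the function names, the toy sequence and the cleavage position are hypothetical, and judging enrichment would additionally require comparison against a background set of sites.

```python
from collections import Counter

def hexamer_counts(sequence, k=6):
    """Count overlapping k-mers (hexamers by default) in an RNA/DNA string."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def window_hexamers(sequence, pas_index, start_offset, end_offset):
    """Count hexamers in a window given relative to a poly(A) site.

    Offsets are in nucleotides relative to pas_index; negative values are
    upstream (5'), positive values downstream (3'). The window is clipped
    to the sequence boundaries.
    """
    lo = max(0, pas_index + start_offset)
    hi = min(len(sequence), pas_index + end_offset)
    return hexamer_counts(sequence[lo:hi])

# Hypothetical sequence: a CA-repeat tract upstream of a made-up cleavage site
toy = "CACACACACACA" + "GGAUUCCAAUAAAGCUUGC" + "UUUUUGUUUU"
pas = 31  # hypothetical cleavage position (index into toy)
upstream = window_hexamers(toy, pas, -31, -19)
print(upstream.most_common(2))
```

For the windows described above, the offsets would be (-150, -50) for the region 50 to 150 nt upstream of a site and (0, 150) for the downstream region.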
These changes in 3′ UTRs and protein coding capacity of these key synaptic genes in the excitatory synapses may lead to changes in the long-term potentiation, affecting neuronal plasticity and development, consistent with the speech delays and intellectual disability observed in humans.
DISCUSSION
Many monogenic intellectual deficiencies are X-linked (XLIDs) because of hemizygosity of X-chromosomal genes in males (84–86). The X-linked CSTF2 gene is essential for embryonic growth and development (30,31). Therefore, it was surprising that a single nucleotide mutation in CSTF2 would result in structural and functional changes in 3′ end mRNA processing yet not have lethal effects. Other mutations in the RRM of CSTF2 have been reported in gnomAD, but with no associated disease (62). As reported here (Figure 1), it appears that the CSTF2D50A mutation affects primarily intellectual functions and speech development in the affected males, but effects on other physiological functions were not noted, suggesting that the brain is more sensitive to this particular mutation than other organs. We propose, therefore, that the p.D50A mutation in CSTF2 results in non-syndromic intellectual deficiency in males by altering RNA binding during C/P, thus changing the expression of key genes in neurodevelopment.
Only one mutation involving a core polyadenylation protein, NUDT21 (which encodes the 25 kDa subunit of mammalian cleavage factor I), has been implicated in neuropsychiatric disorders resulting in Rett syndrome-like symptoms and intellectual disability (13,14). We did not observe altered expression of Mecp2 — which is frequently associated with Rett syndrome — in our mouse model (not shown), but observed altered sites of polyadenylation in many other neurodevelopmental genes (Figure 6).
RNA-contact residues in the CSTF2 RRM play specific roles during C/P (22). Introduction of the D50A mutation into CSTF2 resulted in a small but consistent reduction in C/P efficiency (Figure 2A). Such a reduction in activity would likely result in altered C/P in vivo, favoring proximal sites over more distal sites (87,88). Analysis of genome-wide polyadenylation changes in the Cstf2D50A/Y mice indicated exactly that: shorter RNAs (3′ most exons and changes in the last exon) were enriched in brains of Cstf2D50A/Y mice (Figure 6E, H). Residues within the CSTF2 RRM are also important for functional interactions between CSTF2 and CSTF3 during C/P (22). However, we did not observe a reduction in the ratio between MCP-CSTF2D50A alone and MCP-CSTF2D50A co-expressed with CSTF3, which we previously observed with CSTF2 RRM binding site-I and -II mutants (Figure 2). This suggests that the D50A mutation causes the phenotype independent of the interactions with CSTF3. Furthermore, the small decrease in the polyadenylation efficiency in our reporter system might suggest that the mutated protein is retained longer in the cytoplasm (Figure 2B), possibly through increased interaction with cytoplasmic RNAs (22). Indeed, we observed increased levels of CSTF2 protein (Figure 6A) in the brains of Cstf2D50A/Y mice, which could favor proximal sites of C/P. However, we observed no change of CSTF3 protein in the same samples (Figure 6A). This suggests that the functional level of CSTF is unchanged in nuclei of neuronal and hepatic cells (22).
With reduced C/P efficiency, we observed an increased affinity of the CSTF2D50A RRM for RNA (Figures 3 and 5), probably also contributing to the retention of CSTF2D50A in the cytoplasm (Figure 2B). The Kd of the CSTF2D50A RRM for RNA was less than half that of wild type CSTF2 due to the faster kon rate (Supplementary Table S1). Based on the on- and off-rates, the D50A mutant binds to RNA faster than wild type but releases the RNA with the same off-rate. It has been previously shown that the β2–β3 loop (D50A loop here) is important for the shape recognition of RNA in some RRM containing proteins (89,90). Our structures for the CSTF2 and CSTF2D50A RRMs allowed us to model the differences in RNA binding based on the relative orientation of the side chains of the mutant RRM, electrostatic potential, and rigidity of individual backbone atoms of the β-sheet, causing faster RNA binding in the mutant (Figure 4D) by helping to overcome the greater entropic penalty of binding resulting from the enhanced rigidity of the β-sheet in the mutant.
This mutation increases the affinity of CSTF2D50A for RNA (Figure 2C, D), which we might consider a ‘gain-of-function’ mutation at the molecular level. However, the mutation leads to reduced polyadenylation efficiency (Figure 2A), which could be considered a loss-of-function. The phenotype observed is intellectual disability (Figure 1), which many would consider to be a loss-of-function. Thus, the mutation appears to be a molecular gain-of-function that causes a loss-of-function phenotype.
To confirm that the CSTF2D50A mutation had a neurophysiological effect, we created Cstf2D50A mice with the same mutation. Several labs have examined effects of knockdowns of CSTF2 in cells in culture, but noted altered C/P in only a relatively small number of genes (87,91–93). Unlike those studies, we observed C/P site changes in 1370 genes in the brains of hemizygous Cstf2D50A/Y mice (Figure 6). The majority of the changes in our study favored proximal PASs, effectively shortening the CDSs or shortening the length of the last exons (3′ UTRs). Why did the knockdown studies have fewer consequences than the D50A point mutation? Possibly, their results were confounded by the presence of τCstF-64 (gene symbol CSTF2T) compensating for decreased CSTF2 in those cells. In support of that idea, knockout of Cstf2t in mice resulted in few phenotypes in most cells (where Cstf2 was also expressed), but led to severe disruption of spermatogenesis in germ cells (where Cstf2t was expressed in the absence of Cstf2 (26)).
We noted APA in several genes that function in pre-synaptic granule cells (Nrxn1 and Cntnap2) and post-synaptic Purkinje cells (Cbln1, Shank2, Fmr1 and Cacng4) (94). Mutations in these genes have been associated with developmental disorders, delayed speech, and intellectual disability (6,95–97). It seems possible that these same disruptions are affecting intellectual ability in the human patients.
But it is also possible that cells in the brain (either neurons, glia, or both) are more sensitive to mutations in CSTF2. It has long been known that the brain uses alternative polyadenylation extensively to provide transcriptomic diversity and mRNA targeting signals for plasticity and behavioral adaptation (2,98–100). By extending occupancy times on the pre-mRNA, the D50A mutation likely changes the balance of APA to favor shorter mRNAs, thus reducing the effectiveness of over one thousand mRNA transcripts (Figure 6). The brain also expresses a neuron-specific alternatively-spliced isoform of CSTF2 that contains 49 extra amino acids, called βCstF-64 (Figure 6A, (78,79)). Thus, we speculate that the D50A mutation might interact with the additional domain in βCstF-64 in an as-yet undefined manner to give the brain phenotype.
How does an increase in the affinity of CSTF2 for RNA result in altered cleavage and polyadenylation? Previous work showed that formation of the C/P complex takes 10–20 seconds for a weak site; assembly on stronger sites is faster (101). We speculate that the increased affinity of CSTF2D50A alters the rate of formation of the CstF complex on the downstream sequence element of the nascent pre-mRNA during transcription (69). We further speculate that the specificity of binding is reduced in CSTF2D50A. This combination of faster binding and reduced specificity could promote increased C/P of weaker sites, which tend to be more proximal (91), not unlike the effects of slower RNA polymerase II elongation (102,103). These C/P changes subsequently affect post-transcriptional regulation of key mRNAs by changing 3′ UTR regulation by revealing miRNA or RNA-binding protein sites, or by changing targeted localization of mRNAs to neural projections (16,17,83,104–107). Mutations in CSTF2 may have even more striking effects in other target tissues. Studying these mutation-induced changes in gene expression will be important for understanding the mechanisms of polyadenylation as well as the intricacies of neuronal development.
DATA AVAILABILITY
High-throughput sequence data have been deposited in the Gene Expression Omnibus (GEO) database repository at NCBI with the accession numbers GSE152976 (RNA-seq) and GSE152975 (3′-seq). The structures of the wild type CSTF2 (PDB 6Q2I) and CSTF2D50A (PDB 6TZE) RRM domains were deposited in the Research Collaboratory for Structural Bioinformatics Protein Data Bank.
ACKNOWLEDGEMENTS
The authors wish to acknowledge R. Bryan Sutton, Michaela Jansen, Kerri A. White, Charles Faust, Anne Rice, Jeffrey Thomas and the members of the Latham laboratory for technical and intellectual contributions. We thank Admera Health for 3′-seq services and Ruijia Wang, Bin Tian and Yun Zhao for assistance with 3′-seq analysis. Microscopy was performed in the Image Analysis Core Facility supported in part by TTUHSC. We thank Helene Maurey (Bicêtre), Florence Pinton (Bicetre), Brigitte Simon-Bouy (CH Versailles), and Cecile Zordan (Bordeaux), who performed genetic counselling for this family. The authors would also like to thank the Genome Aggregation Database (gnomAD) and the groups who provided exome and genome variant data to this resource. A full list of contributing groups can be found at https://gnomad.broadinstitute.org/about.
Author contributions: Conceptualization, V.M.K., P.N.G. and C.C.M.; Methodology, V.M.K., P.N.G., E.M. and M.P.L.; Formal analysis, V.M.K., P.N.G., E.M., T.B., P.B., M-A.D. and M.P.L.; Writing - original draft, P.N.G.; Writing - review and editing, V.M.K., P.N.G., E.M., M.P.L. and C.C.M.; Supervision, P.N.G. and C.C.M.
Contributor Information
Petar N Grozdanov, Department of Cell Biology & Biochemistry, School of Medicine, Texas Tech University Health Sciences Center, Lubbock, TX 79430-6540, USA.
Elahe Masoumzadeh, Department of Chemistry & Biochemistry, Texas Tech University, Lubbock, TX 79409-1061, USA.
Vera M Kalscheuer, Max Planck Institute for Molecular Genetics, Research Group Development and Disease, Ihnestr. 63-73, D-14195 Berlin, Germany.
Thierry Bienvenu, Institut de Psychiatrie et de Neurosciences de Paris, Inserm U1266, 102 rue de la Santé, 75014 Paris, France.
Pierre Billuart, Institut de Psychiatrie et de Neurosciences de Paris, Inserm U1266, 102 rue de la Santé, 75014 Paris, France.
Marie-Ange Delrue, Département de Génétique Médicale, CHU Sainte Justine, Montréal, Canada.
Michael P Latham, Department of Chemistry & Biochemistry, Texas Tech University, Lubbock, TX 79409-1061, USA.
Clinton C MacDonald, Department of Cell Biology & Biochemistry, School of Medicine, Texas Tech University Health Sciences Center, Lubbock, TX 79430-6540, USA.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
FUNDING
EU FP7 project GENCODYS [241995]; 2017–2018 Presidents’ Collaborative Research Initiative of Texas Tech University System, the Department of Cell Biology & Biochemistry (Texas Tech University Health Sciences Center); South Plains Foundation; National Institute of General Medical Sciences; NIH [1R35GM128906 to M.P.L.]. Funding for open access charge: Internal funding.
Conflict of interest statement. None declared.
REFERENCES
- 1. Miura P., Sanfilippo P., Shenker S., Lai E.C. Alternative polyadenylation in the nervous system: to what lengths will 3′ UTR extensions take us? Bioessays. 2014; 36:766–777.
- 2. Raj B., Blencowe B.J. Alternative splicing in the mammalian nervous system: recent insights into mechanisms and functional roles. Neuron. 2015; 87:14–27.
- 3. Jamra R. Genetics of autosomal recessive intellectual disability. Med. Genet. 2018; 30:323–327.
- 4. Hu H., Haas S.A., Chelly J., Van Esch H., Raynaud M., de Brouwer A.P., Weinert S., Froyen G., Frints S.G., Laumonnier F. et al. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes. Mol. Psychiatry. 2016; 21:133–148.
- 5. Davis J.K., Broadie K. Multifarious functions of the fragile X mental retardation protein. Trends Genet. 2017; 33:703–714.
- 6. Mila M., Alvarez-Mora M.I., Madrigal I., Rodriguez-Revenga L. Fragile X syndrome: an overview and update of the FMR1 gene. Clin. Genet. 2018; 93:197–205.
- 7. Ahmed I., Buchert R., Zhou M., Jiao X., Mittal K., Sheikh T.I., Scheller U., Vasli N., Rafiq M.A., Brohi M.Q. et al. Mutations in DCPS and EDC3 in autosomal recessive intellectual disability indicate a crucial role for mRNA decapping in neurodevelopment. Hum. Mol. Genet. 2015; 24:3172–3180.
- 8. Ng C.K., Shboul M., Taverniti V., Bonnard C., Lee H., Eskin A., Nelson S.F., Al-Raqad M., Altawalbeh S., Seraphin B. et al. Loss of the scavenger mRNA decapping enzyme DCPS causes syndromic intellectual disability with neuromuscular defects. Hum. Mol. Genet. 2015; 24:3163–3171.
- 9. Kalscheuer V.M., Freude K., Musante L., Jensen L.R., Yntema H.G., Gecz J., Sefiani A., Hoffmann K., Moser B., Haas S. et al. Mutations in the polyglutamine binding protein 1 gene cause X-linked mental retardation. Nat. Genet. 2003; 35:313–315.
- 10. Carroll R., Kumar R., Shaw M., Slee J., Kalscheuer V.M., Corbett M.A., Gecz J. Variant in the X-chromosome spliceosomal gene GPKOW causes male-lethal microcephaly with intrauterine growth restriction. Eur. J. Hum. Genet. 2017; 25:1078–1082.
- 11. Bain J.M., Cho M.T., Telegrafi A., Wilson A., Brooks S., Botti C., Gowans G., Autullo L.A., Krishnamurthy V., Willing M.C. et al. Variants in HNRNPH2 on the X chromosome are associated with a neurodevelopmental disorder in females. Am. J. Hum. Genet. 2016; 99:728–734.
- 12. Tarpey P.S., Raymond F.L., Nguyen L.S., Rodriguez J., Hackett A., Vandeleur L., Smith R., Shoubridge C., Edkins S., Stevens C. et al. Mutations in UPF3B, a member of the nonsense-mediated mRNA decay complex, cause syndromic and nonsyndromic mental retardation. Nat. Genet. 2007; 39:1127–1133.
- 13. Gennarino V.A., Alcott C.E., Chen C.A., Chaudhury A., Gillentine M.A., Rosenfeld J.A., Parikh S., Wheless J.W., Roeder E.R., Horovitz D.D. et al. NUDT21-spanning CNVs lead to neuropsychiatric disease and altered MeCP2 abundance via alternative polyadenylation. Elife. 2015; 4:e10782.
- 14. Alcott C.E., Yalamanchili H.K., Ji P., van der Heijden M.E., Saltzman A., Elrod N., Lin A., Leng M., Bhatt B., Hao S. et al. Partial loss of CFIm25 causes learning deficits and aberrant neuronal alternative polyadenylation. Elife. 2020; 9:e50895.
- 15. Szkop K.J., Cooke P.I.C., Humphries J.A., Kalna V., Moss D.S., Schuster E.F., Nobeli I. Dysregulation of alternative poly-adenylation as a potential player in autism spectrum disorder. Front. Mol. Neurosci. 2017; 10:279.
- 16. Fontes M.M., Guvenek A., Kawaguchi R., Zheng D., Huang A., Ho V.M., Chen P.B., Liu X., O’Dell T.J., Coppola G. et al. Activity-dependent regulation of alternative cleavage and polyadenylation during hippocampal long-term potentiation. Sci. Rep. 2017; 7:17377.
- 17. Wanke K.A., Devanna P., Vernes S.C. Understanding neurodevelopmental disorders: the promise of regulatory variation in the 3′UTRome. Biol. Psychiatry. 2018; 83:548–557.
- 18. MacDonald C.C. Tissue-specific mechanisms of alternative polyadenylation: testis, brain, and beyond (2018 update). Wiley Interdiscip. Rev. RNA. 2019; 10:e1526.
- 19. Shi Y., Di Giammartino D.C., Taylor D., Sarkeshik A., Rice W.J., Yates J.R. 3rd, Frank J., Manley J.L. Molecular architecture of the human pre-mRNA 3′ processing complex. Mol. Cell. 2009; 33:365–376.
- 20. Tian B., Manley J.L. Alternative polyadenylation of mRNA precursors. Nat. Rev. Mol. Cell Biol. 2017; 18:18–30.
- 21. Shi Y., Manley J.L. The end of the message: multiple protein-RNA interactions define the mRNA polyadenylation site. Genes Dev. 2015; 29:889–897.
- 22. Grozdanov P.N., Masoumzadeh E., Latham M.P., MacDonald C.C. The structural basis of CstF-77 modulation of cleavage and polyadenylation through stimulation of CstF-64 activity. Nucleic Acids Res. 2018; 46:12022–12039.
- 23. Takagaki Y., Seipelt R.L., Peterson M.L., Manley J.L. The polyadenylation factor CstF-64 regulates alternative processing of IgM heavy chain pre-mRNA during B cell differentiation. Cell. 1996; 87:941–952.
- 24. Edwalds-Gilbert G., Milcarek C. Regulation of poly(A) site use during mouse B-cell development involves a change in the binding of a general polyadenylation factor in a B-cell stage-specific manner. Mol. Cell. Biol. 1995; 15:6420–6429.
- 25. Chuvpilo S., Zimmer M., Kerstan A., Glockner J., Avots A., Escher C., Fischer C., Inashkina I., Jankevics E., Berberich-Siebelt F. et al. Alternative polyadenylation events contribute to the induction of NF-ATc in effector T cells. Immunity. 1999; 10:261–269.
- 26. Dass B., Tardif S., Park J.Y., Tian B., Weitlauf H.M., Hess R.A., Carnes K., Griswold M.D., Small C.L., MacDonald C.C. Loss of polyadenylation protein τCstF-64 causes spermatogenic defects and male infertility. Proc. Natl. Acad. Sci. U.S.A. 2007; 104:20374–20379.
- 27. Hockert K.J., Martincic K., Mendis-Handagama S.M.L.C., Borghesi L.A., Milcarek C., Dass B., MacDonald C.C. Spermatogenetic but not immunological defects in mice lacking the τCstF-64 polyadenylation protein. J. Reprod. Immunol. 2011; 89:26–37.
- 28. Li W., Yeh H.J., Shankarling G.S., Ji Z., Tian B., MacDonald C.C. The τCstF-64 polyadenylation protein controls genome expression in testis. PLoS One. 2012; 7:e48373.
- 29. Grozdanov P.N., Li J., Yu P., Yan W., MacDonald C.C. Cstf2t regulates expression of histones and histone-like proteins in male germ cells. Andrology. 2018; 6:605–615.
- 30. Youngblood B.A., Grozdanov P.N., MacDonald C.C. CstF-64 supports pluripotency and regulates cell cycle progression in embryonic stem cells through histone 3′ end processing. Nucleic Acids Res. 2014; 42:8330–8342.
- 31. Youngblood B.A., MacDonald C.C. CstF-64 is necessary for endoderm differentiation resulting in cardiomyocyte defects. Stem Cell Res. 2014; 13:413–421.
- 32. Harris J.C., Martinez J.M., Grozdanov P.N., Bergeson S.E., Grammas P., MacDonald C.C. The Cstf2t polyadenylation gene plays a sex-specific role in learning behaviors in mice. PLoS One. 2016; 11:e0165976.
- 33. Hockert J.A., Yeh H.J., MacDonald C.C. The hinge domain of the cleavage stimulation factor protein CstF-64 is essential for CstF-77 interaction, nuclear localization, and polyadenylation. J. Biol. Chem. 2010; 285:695–704.
- 34. Hockert J.A., MacDonald C.C. The stem-loop luciferase assay for polyadenylation (SLAP) method for determining CstF-64-dependent polyadenylation activity. Methods Mol. Biol. 2014; 1125:109–117.
- 35. Wallace A.M., Dass B., Ravnik S.E., Tonk V., Jenkins N.A., Gilbert D.J., Copeland N.G., MacDonald C.C. Two distinct forms of the 64,000 Mr protein of the cleavage stimulation factor are expressed in mouse male germ cells. Proc. Natl. Acad. Sci. U.S.A. 1999; 96:6763–6768.
- 36. Maciolek N.L., McNally M.T. Characterization of Rous sarcoma virus polyadenylation site use in vitro. Virology. 2008; 374:468–476.
- 37. Azatian S.B., Kaur N., Latham M.P. Increasing the buffering capacity of minimal media leads to higher protein yield. J. Biomol. NMR. 2019; 73:11–17.
- 38. Grozdanov P.N., Stocco D.M. Short RNA molecules with high binding affinity to the KH motif of A-kinase anchoring protein 1 (AKAP1): implications for the regulation of steroidogenesis. Mol. Endocrinol. 2012; 26:2104–2117.
- 39. Pagano J.M., Clingman C.C., Ryder S.P. Quantitative approaches to monitor protein-nucleic acid interactions using fluorescent probes. RNA. 2011; 17:14–20.
- 40. Delaglio F., Grzesiek S., Vuister G.W., Zhu G., Pfeifer J., Bax A. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR. 1995; 6:277–293.
- 41. Vranken W.F., Boucher W., Stevens T.J., Fogh R.H., Pajon A., Llinas M., Ulrich E.L., Markley J.L., Ionides J., Laue E.D. The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins. 2005; 59:687–696.
- 42. Muhandiram D.R., Kay L.E. Gradient-enhanced triple-resonance three-dimensional NMR experiments with improved sensitivity. J. Magn. Reson. Ser. B. 1994; 103:203–216.
- 43. Bax A., Clore G.M., Gronenborn A.M. 1H-1H correlation via isotropic mixing of 13C magnetization, a new three-dimensional approach for assigning 1H and 13C spectra of 13C-enriched proteins. J. Magn. Reson. 1990; 88:425–431.
- 44. Ying J., Delaglio F., Torchia D.A., Bax A. Sparse multidimensional iterative lineshape-enhanced (SMILE) reconstruction of both non-uniformly sampled and conventional NMR data. J. Biomol. NMR. 2017; 68:101–118.
- 45. Hansen M.R., Mueller L., Pardi A. Tunable alignment of macromolecules by filamentous phage yields dipolar coupling interactions. Nat. Struct. Biol. 1998; 5:1065–1074.
- 46. Wang H., Eberstadt M., Olejniczak E.T., Meadows R.P., Fesik S.W. A liquid crystalline medium for measuring residual dipolar couplings over a wide range of temperatures. J. Biomol. NMR. 1998; 12:443–446.
- 47. Shen Y., Lange O., Delaglio F., Rossi P., Aramini J.M., Liu G., Eletsky A., Wu Y., Singarapu K.K., Lemak A. et al. Consistent blind protein structure generation from NMR chemical shift data. Proc. Natl. Acad. Sci. U.S.A. 2008; 105:4685–4690.
- 48. Zweckstetter M., Bax A. Prediction of sterically induced alignment in a dilute liquid crystalline phase: aid to protein structure determination by NMR. J. Am. Chem. Soc. 2000; 122:3791–3792.
- 49. DeLano W.L. PyMOL: an open-source molecular graphics tool. CCP4 Newsletter on Protein Crystallography. 2002; 40:82–92.
- 50. Waudby C.A., Ramos A., Cabrita L.D., Christodoulou J. Two-dimensional NMR lineshape analysis. Sci. Rep. 2016; 6:24826.
- 51. Kay L.E., Nicholson L.K., Delaglio F., Bax A., Torchia D.A. Pulse sequences for removal of the effects of cross correlation between dipolar and chemical-shift anisotropy relaxation mechanisms on the measurement of heteronuclear T1 and T2 values in proteins. J. Magn. Reson. 1992; 97:359–375.
- 52. Kay L.E., Torchia D.A., Bax A. Backbone dynamics of proteins as studied by 15N inverse detected heteronuclear NMR spectroscopy: application to staphylococcal nuclease. Biochemistry. 1989; 28:8972–8979.
- 53. Lipari G., Szabo A. Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 1. Theory and range of validity. J. Am. Chem. Soc. 1982; 104:4546–4559.
- 54. Lipari G., Szabo A. Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 2. Analysis of experimental results. J. Am. Chem. Soc. 1982; 104:4559–4570.
- 55. Dobin A., Davis C.A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T.R. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013; 29:15–21.
- 56. Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014; 15:550.
- 57. Liao Y., Wang J., Jaehnig E.J., Shi Z., Zhang B. WebGestalt 2019: gene set analysis toolkit with revamped UIs and APIs. Nucleic Acids Res. 2019; 47:W199–W205.
- 58. Wang R., Nambiar R., Zheng D., Tian B. PolyA_DB 3 catalogs cleavage and polyadenylation sites identified by deep sequencing in multiple genomes. Nucleic Acids Res. 2018; 46:D315–D319.
- 59. Tian B., Hu J., Zhang H., Lutz C.S. A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Res. 2005; 33:201–212.
- 60. Fiasse C., Nader-Grosbois N. Perceived social acceptance, theory of mind and social adjustment in children with intellectual disabilities. Res. Dev. Disabil. 2012; 33:1871–1880.
- 61. Lek M., Karczewski K.J., Minikel E.V., Samocha K.E., Banks E., Fennell T., O’Donnell-Luria A.H., Ware J.S., Hill A.J., Cummings B.B. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016; 536:285–291.
- 62. Karczewski K.J., Francioli L.C., Tiao G., Cummings B.B., Alfoldi J., Wang Q., Collins R.L., Laricchia K.M., Ganna A., Birnbaum D.P. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020; 581:434–443.
- 63. Bissar-Tadmouri N., Donahue W.L., Al-Gazali L., Nelson S.F., Bayrak-Toydemir P., Kantarci S. X chromosome exome sequencing reveals a novel ALG13 mutation in a nonsyndromic intellectual disability family with multiple affected male siblings. Am. J. Med. Genet. A. 2014; 164A:164–169.
- 64. Schwarz J.M., Cooper D.N., Schuelke M., Seelow D. MutationTaster2: mutation prediction for the deep-sequencing age. Nat. Methods. 2014; 11:361–362.
- 65. Landrum M.J., Lee J.M., Benson M., Brown G., Chao C., Chitipiralla S., Gu B., Hart J., Hoffman D., Hoover J. et al. ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res. 2016; 44:D862–D868.
- 66. Quang D., Chen Y., Xie X. DANN: a deep learning approach for annotating the pathogenicity of genetic variants. Bioinformatics. 2015; 31:761–763.
- 67. Rentzsch P., Witten D., Cooper G.M., Shendure J., Kircher M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019; 47:D886–D894.
- 68. Deka P., Rajan P.K., Perez-Canadillas J.M., Varani G. Protein and RNA dynamics play key roles in determining the specific recognition of GU-rich polyadenylation regulatory elements by human Cstf-64 protein. J. Mol. Biol. 2005; 347:719–733.
- 69. MacDonald C.C., Wilusz J., Shenk T. The 64-kilodalton subunit of the CstF polyadenylation factor binds to pre-mRNAs downstream of the cleavage site and influences cleavage site location. Mol. Cell. Biol. 1994; 14:6647–6654.
- 70. Ramelot T.A., Raman S., Kuzin A.P., Xiao R., Ma L.C., Acton T.B., Hunt J.F., Montelione G.T., Baker D., Kennedy M.A. Improving NMR protein structure quality by Rosetta refinement: a molecular replacement study. Proteins. 2009; 75:147–167.
- 71. Pancevac C., Goldstone D.C., Ramos A., Taylor I.A. Structure of the Rna15 RRM-RNA complex reveals the molecular basis of GU specificity in transcriptional 3′-end processing factors. Nucleic Acids Res. 2010; 38:3119–3132.
- 72. Baker N.A., Sept D., Joseph S., Holst M.J., McCammon J.A. Electrostatics of nanosystems: application to microtubules and the ribosome. Proc. Natl. Acad. Sci. U.S.A. 2001; 98:10037–10041.
- 73. von Hippel P.H., Berg O.G. Facilitated target location in biological systems. J. Biol. Chem. 1989; 264:675–678.
- 74. Riggs A.D., Bourgeois S., Cohn M. The lac repressor-operator interaction. 3. Kinetic studies. J. Mol. Biol. 1970; 53:401–417.
- 75. Pérez Cañadillas J.M., Varani G. Recognition of GU-rich polyadenylation regulatory elements by human CstF-64 protein. EMBO J. 2003; 22:2821–2830.
- 76. Yang D., Kay L.E. Contributions to conformational entropy arising from bond vector fluctuations measured from NMR-derived order parameters: application to protein folding. J. Mol. Biol. 1996; 263:369–382.
- 77. Wand A.J. The dark energy of proteins comes to light: conformational entropy and its role in protein function revealed by NMR relaxation. Curr. Opin. Struct. Biol. 2013; 23:75–81.
- 78. Shankarling G.S., Coates P.W., Dass B., MacDonald C.C. A family of splice variants of CstF-64 expressed in vertebrate nervous systems. BMC Mol. Biol. 2009; 10:22.
- 79. Shankarling G.S., MacDonald C.C. Polyadenylation site-specific differences in the activity of the neuronal βCstF-64 protein in PC-12 cells. Gene. 2013; 529:220–227.
- 80. Hirai H., Pang Z., Bao D., Miyazaki T., Li L., Miura E., Parris J., Rong Y., Watanabe M., Yuzaki M. et al. Cbln1 is essential for synaptic integrity and plasticity in the cerebellum. Nat. Neurosci. 2005; 8:1534–1541.
- 81. Li W., Park J.Y., Zheng D., Hoque M., Yehia G., Tian B. Alternative cleavage and polyadenylation in spermatogenesis connects chromatin regulation with post-transcriptional control. BMC Biol. 2016; 14:6.
- 82. Li W., You B., Hoque M., Zheng D., Luo W., Ji Z., Park J.Y., Gunderson S.I., Kalsotra A., Manley J.L. et al. Systematic profiling of poly(A)+ transcripts modulated by core 3′ end processing and splicing factors reveals regulatory rules of alternative cleavage and polyadenylation. PLoS Genet. 2015; 11:e1005166.
- 83. Jereb S., Hwang H.W., Van Otterloo E., Govek E.E., Fak J.J., Yuan Y., Hatten M.E., Darnell R.B. Differential 3′ processing of specific transcripts expands regulatory and protein diversity across neuronal cell types. Elife. 2018; 7:e34042.
- 84. Penrose L. A Clinical and Genetic Study of 1280 Cases of Mental Defect. 1938; London: H. M. Stationery Office.
- 85. Raymond F.L. X linked mental retardation: a clinical guide. J. Med. Genet. 2006; 43:193–200.
- 86. Hu H., Kahrizi K., Musante L., Fattahi Z., Herwig R., Hosseini M., Oppitz C., Abedini S.S., Suckow V., Larti F. et al. Genetics of intellectual disability in consanguineous families. Mol. Psychiatry. 2018; 24:1027–1039.
- 87. Yao C., Biesinger J., Wan J., Weng L., Xing Y., Xie X., Shi Y. Transcriptome-wide analyses of CstF64-RNA interactions in global regulation of mRNA alternative polyadenylation. Proc. Natl. Acad. Sci. U.S.A. 2012; 109:18773–18778.
- 88. Hwang H.W., Park C.Y., Goodarzi H., Fak J.J., Mele A., Moore M.J., Saito Y., Darnell R.B. PAPERCLIP identifies microRNA targets and a role of CstF64/64tau in promoting non-canonical poly(A) site usage. Cell Rep. 2016; 15:423–435.
- 89. Stefl R., Skrisovska L., Allain F.H. RNA sequence- and shape-dependent recognition by proteins in the ribonucleoprotein particle. EMBO Rep. 2005; 6:33–38.
- 90. Skrisovska L., Bourgeois C.F., Stefl R., Grellscheid S.N., Kister L., Wenter P., Elliott D.J., Stevenin J., Allain F.H. The testis-specific human protein RBMY recognizes RNA through a novel mode of interaction. EMBO Rep. 2007; 8:372–379.
- 91. Martin G., Gruber A.R., Keller W., Zavolan M. Genome-wide analysis of pre-mRNA 3′ end processing reveals a decisive role of human cleavage factor I in the regulation of 3′ UTR length. Cell Rep. 2012; 1:753–763.
- 92. Gruber A.R., Martin G., Keller W., Zavolan M. Cleavage factor Im is a key regulator of 3′ UTR length. RNA Biol. 2012; 9:1405–1412.
- 93. Kim S., Yamamoto J., Chen Y., Aida M., Wada T., Handa H., Yamaguchi Y. Evidence that cleavage factor Im is a heterotetrameric protein complex controlling alternative polyadenylation. Genes Cells. 2010; 15:1003–1013.
- 94. Yuzaki M. Two classes of secreted synaptic organizers in the central nervous system. Annu. Rev. Physiol. 2018; 80:243–262.
- 95. Camacho-Garcia R.J., Planelles M.I., Margalef M., Pecero M.L., Martinez-Leal R., Aguilera F., Vilella E., Martinez-Mir A., Scholl F.G. Mutations affecting synaptic levels of neurexin-1beta in autism and mental retardation. Neurobiol. Dis. 2012; 47:135–143.
- 96. Yangngam S., Plong-On O., Sripo T., Roongpraiwan R., Hansakunachai T., Wirojanan J., Sombuntham T., Ruangdaraganon N., Limprasert P. Mutation screening of the neurexin 1 gene in Thai patients with intellectual disability and autism spectrum disorder. Genet. Test. Mol. Biomarkers. 2014; 18:510–515.
- 97. Guang S., Pang N., Deng X., Yang L., He F., Wu L., Chen C., Yin F., Peng J. Synaptopathology involved in autism spectrum disorder. Front. Cell. Neurosci. 2018; 12:470.
- 98. Zhang H., Lee J.Y., Tian B. Biased alternative polyadenylation in human tissues. Genome Biol. 2005; 6:R100.
- 99. Miura P., Shenker S., Andreu-Agullo C., Westholm J.O., Lai E.C. Widespread and extensive lengthening of 3′ UTRs in the mammalian brain. Genome Res. 2013; 23:812–825.
- 100. Fasken M.B., Corbett A.H. Links between mRNA splicing, mRNA quality control, and intellectual disability. RNA Dis. 2016; 3:e1448.
- 101. Chao L.C., Jamil A., Kim S.J., Huang L., Martinson H.G. Assembly of the cleavage and polyadenylation apparatus requires about 10 seconds in vivo and is faster for strong than for weak poly(A) sites. Mol. Cell. Biol. 1999; 19:5588–5600.
- 102. Liu X., Freitas J., Zheng D., Oliveira M.S., Hoque M., Martins T., Henriques T., Tian B., Moreira A. Transcription elongation rate has a tissue-specific impact on alternative cleavage and polyadenylation in Drosophila melanogaster. RNA. 2017; 23:1807–1816.
- 103. Pinto P.A., Henriques T., Freitas M.O., Martins T., Domingues R.G., Wyrzykowska P.S., Coelho P.A., Carmo A.M., Sunkel C.E., Proudfoot N.J. et al. RNA polymerase II kinetics in polo polyadenylation signal selection. EMBO J. 2011; 30:2431–2444.
- 104. Ciolli Mattioli C., Rom A., Franke V., Imami K., Arrey G., Terne M., Woehler A., Akalin A., Ulitsky I., Chekulaeva M. Alternative 3′ UTRs direct localization of functionally diverse protein isoforms in neuronal compartments. Nucleic Acids Res. 2019; 47:2560–2573.
- 105. Taliaferro J.M., Vidaki M., Oliveira R., Olson S., Zhan L., Saxena T., Wang E.T., Graveley B.R., Gertler F.B., Swanson M.S. et al. Distal alternative last exons localize mRNAs to neural projections. Mol. Cell. 2016; 61:821–833.
- 106. Nazim M., Masuda A., Rahman M.A., Nasrin F., Takeda J.I., Ohe K., Ohkawara B., Ito M., Ohno K. Competitive regulation of alternative splicing and alternative polyadenylation by hnRNP H and CstF64 determines acetylcholinesterase isoforms. Nucleic Acids Res. 2017; 45:1455–1468.
- 107. Hafner A.S., Donlin-Asp P.G., Leitch B., Herzog E., Schuman E.M. Local protein synthesis is a ubiquitous feature of neuronal pre- and postsynaptic compartments. Science. 2019; 364:eaau3644.
Data Availability Statement
High-throughput sequence data have been deposited in the Gene Expression Omnibus (GEO) database at NCBI under accession numbers GSE152976 (RNA-seq) and GSE152975 (3′-seq). The structures of the wild-type CSTF2 (PDB 6Q2I) and CSTF2D50A (PDB 6TZE) RRM domains were deposited in the Research Collaboratory for Structural Bioinformatics Protein Data Bank.